Transaction Management
A transaction is a logical unit of work that transforms the database from one consistent state to another. Banking transfers, airline reservations, and e-commerce checkouts all depend on the transaction abstraction to ensure correctness in the face of failures and concurrent access.
ACID Properties
Transactions provide four fundamental guarantees, known collectively as ACID: Atomicity (either all operations complete or none do), Consistency (the database moves between valid states), Isolation (concurrent transactions do not interfere), and Durability (committed changes persist through failures). These properties underpin the reliability of relational databases.
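As an illustration of atomicity, the following minimal sketch uses Python's standard sqlite3 module. The accounts table, the account names, and the transfer helper are hypothetical and exist only for this example. The helper credits the destination first so that a failed debit leaves a partial update that the rollback must undo.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one all-or-nothing unit (illustrative helper)."""
    try:
        with conn:  # commits if the body succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated CHECK (balance >= 0); the earlier credit
        # was rolled back together with it.
        return False

def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))

# Hypothetical schema: the CHECK constraint encodes the consistency rule.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts ("
             "id TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)         # within funds: both updates commit
rejected = transfer(conn, "alice", "bob", 500)  # overdraft: neither update survives
print(ok, rejected, balances(conn))
```

The rejected transfer leaves both balances exactly as the successful one left them, even though its credit to the destination had already executed when the debit failed.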
Transaction States
A transaction passes through a sequence of states: active while operations are executing; partially committed after the last operation; committed once its changes are durably stored; failed if an error prevents normal progress; and aborted after rollback. The state diagram is enforced by the DBMS transaction manager.
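The state sequence can be sketched as a small state machine. The transition table below follows the usual textbook diagram and is an assumption of this example: in particular, it allows a partially committed transaction to fail before its changes become durable. The names TxState, ALLOWED, and Transaction are illustrative and do not correspond to any particular DBMS.

```python
from enum import Enum, auto

class TxState(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()

# Assumed legal transitions; COMMITTED and ABORTED are terminal.
ALLOWED = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.COMMITTED: set(),
    TxState.ABORTED: set(),
}

class Transaction:
    """Tracks one transaction's state and rejects illegal transitions."""

    def __init__(self):
        self.state = TxState.ACTIVE

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state

# A normal run: the last operation finishes, then the changes become durable.
t = Transaction()
t.move_to(TxState.PARTIALLY_COMMITTED)
t.move_to(TxState.COMMITTED)

# A failed run: the error is detected, then rollback completes.
u = Transaction()
u.move_to(TxState.FAILED)
u.move_to(TxState.ABORTED)
```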
Schedules and Serializability
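A schedule is an ordering of the operations of concurrent transactions that preserves the order of operations within each transaction. A schedule is serial if its transactions execute one after another without interleaving, and serializable if it is equivalent to some serial schedule. Two common notions of equivalence give rise to conflict serializability (every pair of conflicting operations appears in the same order as in the serial schedule) and view serializability (every read sees the same write as in the serial schedule, including reads of the initial value, and the final write of each data item is performed by the same transaction).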
Conflict and View Equivalence
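Two operations conflict if they belong to different transactions, operate on the same data item, and at least one of them is a write. A schedule is conflict-serializable if and only if its precedence graph is acyclic; the graph has an edge from Ti to Tj whenever an operation of Ti precedes and conflicts with an operation of Tj. View serializability is more permissive: every conflict-serializable schedule is also view-serializable, but not the reverse. It is also much harder to test, since deciding view serializability is NP-complete in general.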
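The precedence-graph test can be sketched in a few lines. The encoding of a schedule as a list of (transaction, operation, item) tuples, with "R" and "W" as operation codes, is an assumption of this example, as are the function names.

```python
def precedence_edges(schedule):
    """Return the edges (Ti, Tj) where an operation of Ti precedes and
    conflicts with an operation of Tj."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and x_i == x_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(nodes, edges):
    """Depth-first search with three colors to detect a directed cycle."""
    graph = {n: [] for n in nodes}
    for a, b in edges:
        graph[a].append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GRAY
        for m in graph[n]:
            if color[m] == GRAY or (color[m] == WHITE and visit(m)):
                return True  # back edge found
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)

def is_conflict_serializable(schedule):
    nodes = {t for t, _, _ in schedule}
    return not has_cycle(nodes, precedence_edges(schedule))

# T1 finishes with A before T2 touches it: the only edge is T1 -> T2.
s_ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]

# T2 writes A between T1's read and write: edges T1 -> T2 and T2 -> T1.
s_bad = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(s_ok), is_conflict_serializable(s_bad))
```

The pairwise scan is quadratic in the length of the schedule, which is adequate for illustration; a production scheduler would not build the graph this way.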
Recoverability
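Beyond serializability, schedules must be recoverable. A schedule is recoverable if no transaction commits until all transactions from which it has read have committed. Cascadeless schedules go further: a transaction may read only values written by committed transactions, so one abort never forces other transactions to abort. Strict schedules go further still, disallowing both reading and overwriting an uncommitted write. Strictness simplifies recovery, because an aborted write can be undone by restoring the value the item held before the write.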
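These three classes can be checked mechanically. The sketch below makes simplifying assumptions: a schedule is a list of (transaction, operation, item) tuples with "R", "W", and "C" (commit, with item None) as operation codes, and no transaction aborts. A transaction that never commits within the schedule is treated as still uncommitted. The function name classify is illustrative.

```python
def classify(schedule):
    """Report whether a schedule is recoverable, cascadeless, and strict.

    Assumes operations are "R", "W", or "C" (commit) and that there are
    no aborts; this is a teaching sketch, not a general recovery checker.
    """
    never = float("inf")
    commit_pos = {t: i for i, (t, op, _) in enumerate(schedule) if op == "C"}
    last_writer = {}  # item -> transaction that most recently wrote it
    result = {"recoverable": True, "cascadeless": True, "strict": True}

    for i, (t, op, x) in enumerate(schedule):
        if op == "C":
            continue
        w = last_writer.get(x)
        # The item is "dirty" if another transaction wrote it last and
        # has not committed by this point in the schedule.
        dirty = w is not None and w != t and commit_pos.get(w, never) > i
        if dirty:
            result["strict"] = False           # read or overwrote a dirty item
            if op == "R":
                result["cascadeless"] = False  # read uncommitted data
                if commit_pos.get(t, never) < commit_pos.get(w, never):
                    result["recoverable"] = False  # reader commits first
        if op == "W":
            last_writer[x] = t
    return result

# T2 reads T1's uncommitted write and commits before T1 does.
s_unrecoverable = [("T1", "W", "A"), ("T2", "R", "A"),
                   ("T2", "C", None), ("T1", "C", None)]

# Same dirty read, but T2 waits for T1 to commit first.
s_recoverable = [("T1", "W", "A"), ("T2", "R", "A"),
                 ("T1", "C", None), ("T2", "C", None)]

# No dirty reads, but T2 overwrites T1's uncommitted write.
s_cascadeless = [("T1", "W", "A"), ("T2", "W", "A"),
                 ("T1", "C", None), ("T2", "C", None)]

# T2 touches A only after T1 has committed.
s_strict = [("T1", "W", "A"), ("T1", "C", None),
            ("T2", "R", "A"), ("T2", "C", None)]
```

The four example schedules are ordered from least to most restrictive, and each one satisfies exactly the properties of its own class and of the weaker classes before it.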
Read and Write Operations
Operations such as READ(X) and WRITE(X) are the primitive steps from which transactions are built. As these operations execute, the transaction manager tracks locks, dependencies, and log entries. Grouping operations into transactions is the programmer's job; enforcing correctness is the DBMS's.
Concurrency vs Performance
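Strong isolation limits concurrency and hence performance. The isolation levels below SERIALIZABLE (READ UNCOMMITTED, READ COMMITTED, and REPEATABLE READ) let more transactions run in parallel at the cost of anomalies. Under the SQL standard's definitions, READ UNCOMMITTED permits dirty reads, READ COMMITTED rules them out but permits non-repeatable reads, and REPEATABLE READ rules out both but still permits phantoms. The appropriate level therefore depends on which anomalies the application can tolerate and how much throughput it needs.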
Implementation Mechanisms
DBMSs implement transactions using locking, timestamp ordering, multiversion concurrency control, and write-ahead logging. These mechanisms are the subject of the next two chapters; they cooperate to provide ACID guarantees even under heavy concurrency and hardware failures.
Distributed Transactions
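When data spans multiple nodes, two-phase commit (2PC) ensures atomicity across sites: a coordinator asks all participants to prepare, then commits if every participant votes yes and aborts otherwise. 2PC guarantees that all sites reach the same outcome, but it can block: a participant that has voted yes must hold its locks and wait if the coordinator fails before announcing the decision. This weakness motivates alternatives such as three-phase commit and consensus protocols.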
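The two phases can be sketched with in-memory objects. This is a deliberately simplified model under stated assumptions: there is no network, no logging, no timeouts, and no coordinator failure, so it shows only the voting rule and not the blocking behavior. The Participant class and the two_phase_commit function are hypothetical names for this example.

```python
class Participant:
    """A site that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumed local outcome, fixed up front
        self.state = "init"

    def prepare(self):
        # Voting yes is a promise to commit if asked; voting no aborts locally.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: broadcast the single global decision.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "commit" if decision else "abort"

# All sites can commit, so the global decision is commit.
sites_ok = [Participant("A"), Participant("B"), Participant("C")]
outcome_ok = two_phase_commit(sites_ok)

# One site votes no, so every site aborts, including those that voted yes.
sites_bad = [Participant("A"), Participant("B", can_commit=False)]
outcome_bad = two_phase_commit(sites_bad)

print(outcome_ok, [p.state for p in sites_ok])
print(outcome_bad, [p.state for p in sites_bad])
```

In a real deployment each participant would force a log record to stable storage before voting, and a participant left in the prepared state by a coordinator crash is exactly the blocked case described above.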
Summary
Transactions hide the complexity of concurrency and failure behind a simple interface. Understanding ACID, serializability, and recoverability is the foundation for studying concurrency control and recovery in detail.