Updated on 21 Mar, 2026 | 16 mins read

What is a Transaction?

A transaction is a group of one or more operations treated as a single unit of work.

Either:

  • All operations succeed: transaction is committed
  • Any operation fails: everything is rolled back

Example:

Bank transfer:

  1. Deduct 1000 from Account A
  2. Add 1000 to Account B

If step 2 fails, step 1 must be undone.
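The transfer above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the `accounts` table, the `transfer` helper, and the starting balances are all assumptions made up for the example. Using the connection as a context manager commits when the block finishes and rolls back when an exception escapes it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one unit of work (hypothetical helper)."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            # Step 1: deduct from the source account.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            # Step 2: add to the destination account.
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                # Step 2 failed, so step 1 must be undone.
                raise ValueError(f"unknown account: {dst}")
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Illustrative setup: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 5000), ("B", 2000)])
conn.commit()
```

Calling `transfer(conn, "A", "B", 1000)` moves the money and commits. Calling `transfer(conn, "A", "missing", 1000)` fails at step 2, and the deduction from step 1 is rolled back, so the balances are unchanged.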

Transaction States (Lifecycle)

A transaction doesn't just go from start -> success/failure. It passes through multiple well-defined states:

1 Active

The transaction has started and is executing operations (reads/writes).

Example:

Updating account balances, inserting rows, etc.

This is the “in progress” state.

2 Partially Committed

The transaction has executed all its operations successfully, but the changes are not yet permanently saved to disk.

Think: “Everything looks good, but not finalized yet”

3 Committed

The transaction is successfully completed

All changes are permanently stored (durable)

After this point, the changes cannot be rolled back. Reversing them requires a new, compensating transaction.

4 Failed

Something went wrong during execution

Could be due to:

  • System crash
  • Constraint violation
  • Deadlock

The transaction cannot proceed further.

5 Aborted (Rolled Back)

The system undoes all changes made by the transaction

Database is restored to the previous consistent state

After abort:

  • It may be restarted or completely discarded.

State Transition Flow

Here's how a transaction typically moves:

Active → Partially Committed → Committed
   ↓
 Failed → Aborted

Or more explicitly:

  1. Start -> Active
  2. If all operations succeed -> Partially Committed
  3. If everything is safely written -> Committed

If something fails:

  • Active -> Failed -> Aborted
  • Partially Committed -> Failed -> Aborted (for example, the final write to disk fails)
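The lifecycle can be modeled as a small state machine. The sketch below is only an illustration: the `State`, `ALLOWED`, and `Transaction` names are made up for this example, and the Partially Committed -> Failed edge is an assumption taken from the common textbook model (a failure while the changes are being made permanent).

```python
from enum import Enum, auto

class State(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()

# Which states may follow each state. COMMITTED and ABORTED are terminal.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    # Assumption: a failure can still occur before the final write completes.
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),
    State.ABORTED: set(),
}

class Transaction:
    """Tracks the lifecycle only; it performs no real reads or writes."""

    def __init__(self):
        self.state = State.ACTIVE
        self.history = [State.ACTIVE]

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)

# Happy path: Active -> Partially Committed -> Committed
ok = Transaction()
ok.move_to(State.PARTIALLY_COMMITTED)
ok.move_to(State.COMMITTED)

# Failure path: Active -> Failed -> Aborted
bad = Transaction()
bad.move_to(State.FAILED)
bad.move_to(State.ABORTED)
```

Keeping the transitions in one table makes illegal moves, such as Aborted -> Committed, fail loudly instead of passing unnoticed.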

ACID Properties (Core of Transactions)

Transactions are defined by the ACID principles:

A – Atomicity

  • “All or nothing”
  • No partial updates

C – Consistency

  • Database remains in a valid state
  • Rules (constraints, invariants) are preserved

I – Isolation

  • Concurrent transactions don't interfere
  • Each transaction behaves as if it's alone

D – Durability

  • Once committed, data is permanently stored
  • Survives crashes
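As a rough illustration of Consistency and Atomicity working together, the sketch below uses sqlite3 with a CHECK constraint as the rule to preserve. The table layout, the "no negative balance" rule, and the `try_transfer` helper are assumptions chosen for the example. Isolation and Durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the invariant
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 0)])
conn.commit()

def try_transfer(amount):
    """Credit B, then debit A. Returns what happened to the transaction."""
    try:
        with conn:  # commit on success, roll back on exception
            # The credit runs first so there is a partial update to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
            # This debit violates the CHECK rule if A cannot cover it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
        return "committed"
    except sqlite3.IntegrityError:
        return "rolled back"

def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

With these starting balances, `try_transfer(1000)` is rejected by the constraint and the earlier credit to B is undone, while `try_transfer(200)` commits both updates.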

Types of Transaction Systems

Single-node (Traditional DB)

All data lives in one database, so a single engine can enforce ACID on its own.

Distributed Transactions

When data spans multiple services/databases.

A distributed transaction is a set of operations on data that is performed across two or more databases. It is typically coordinated across separate nodes connected by a network, but may also span multiple databases on a single server.

Why Do We Need Distributed Transactions?

Unlike an ACID transaction on a single database, a distributed transaction alters data on multiple databases. Distributed transaction processing is therefore more complicated: the participating databases must coordinate so that the changes are committed or rolled back together, as one self-contained unit.

In other words, all the nodes must commit, or all must abort and the entire transaction rolls back. This is why we need distributed transactions.

Problem:

What if one service succeeds and another fails?

For example:

Service A updates Database A

Service B updates Database B

What if:

  • A succeeds
  • B fails

Now the system is inconsistent.
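The problem can be reproduced with two independent sqlite3 databases standing in for Database A and Database B. Everything here is a made-up illustration: the single-row `account` table, the 10000 balance cap that makes B reject the update, and the `naive_transfer` helper. Each database commits on its own, with nothing coordinating them.

```python
import sqlite3

def make_db(balance):
    """Create an independent in-memory database holding one account row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " balance INTEGER NOT NULL CHECK (balance <= 10000))"  # hypothetical cap
    )
    conn.execute("INSERT INTO account VALUES (?)", (balance,))
    conn.commit()
    return conn

db_a = make_db(5000)  # Service A's database
db_b = make_db(9500)  # Service B's database, close to the cap

def naive_transfer(amount):
    # Local transaction 1: succeeds and commits immediately.
    with db_a:
        db_a.execute("UPDATE account SET balance = balance - ?", (amount,))
    # Local transaction 2: may fail, but A's commit can no longer be undone.
    with db_b:
        db_b.execute("UPDATE account SET balance = balance + ?", (amount,))

def balance(conn):
    return conn.execute("SELECT balance FROM account").fetchone()[0]
```

Calling `naive_transfer(1000)` raises `sqlite3.IntegrityError` from Database B. Afterwards A holds 4000 and B still holds 9500, so 1000 has disappeared from the system.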

Two-Phase Commit (2PC)

2PC involves two roles:

Coordinator (Transaction Manager)

  • Controls the transaction
  • Decides commit or rollback

Participants (Cohorts)

  • Individual services/databases
  • Execute the transaction steps

The Two Phases:

Phase 1: Prepare Phase (Voting Phase)

Coordinator asks:

“Can you commit?”

Steps:

  1. Coordinator sends PREPARE requests to all participants
  2. Each participant:
    1. Executes the transaction locally
    2. Writes changes to a log (but does NOT commit yet)
    3. Replies:
      1. Yes (ready to commit)
      2. No (cannot commit)

Phase 2: Commit Phase

Case 1: All vote YES

  • Coordinator sends COMMIT
  • All participants finalize changes

Case 2: Any vote NO

  • Coordinator sends ROLLBACK
  • All participants undo changes

Flow Diagram:

        Coordinator
             |
   +---------+---------+
   |         |         |
  P1        P2        P3

Phase 1:
Coordinator → PREPARE → All
P1 → YES
P2 → YES
P3 → YES

Phase 2:
Coordinator → COMMIT → All

Failure case:

P2 → NO

Coordinator → ROLLBACK → All
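The flow above can be simulated in a single process. This is a simplified sketch, not a real 2PC implementation: there is no network, no durable log, and no crash recovery. The `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names introduced for the illustration.

```python
class Participant:
    """Hypothetical in-memory stand-in for a service or database."""

    def __init__(self, name, balance, can_commit=True):
        self.name = name
        self.balance = balance        # committed value
        self.staged = None            # prepared but not yet committed value
        self.can_commit = can_commit  # lets the example force a NO vote
        self.log = []

    def prepare(self, delta):
        # Phase 1: do the work locally, record it, and vote.
        if not self.can_commit or self.balance + delta < 0:
            self.log.append("VOTE NO")
            return False
        self.staged = self.balance + delta
        self.log.append("VOTE YES")
        return True

    def commit(self):
        # Phase 2, case 1: make the staged change final.
        self.balance, self.staged = self.staged, None
        self.log.append("COMMIT")

    def rollback(self):
        # Phase 2, case 2: discard the staged change.
        self.staged = None
        self.log.append("ROLLBACK")


def two_phase_commit(work):
    """Coordinator. `work` is a list of (participant, delta) pairs."""
    # Phase 1: send PREPARE to every participant and collect the votes.
    votes = [p.prepare(delta) for p, delta in work]
    # Phase 2: commit only if every vote is YES, otherwise roll back all.
    if all(votes):
        for p, _ in work:
            p.commit()
        return "COMMIT"
    for p, _ in work:
        p.rollback()
    return "ROLLBACK"


a = Participant("A", 5000)
b = Participant("B", 2000)
first = two_phase_commit([(a, -1000), (b, 1000)])   # all vote YES

b.can_commit = False                                 # force B to vote NO
second = two_phase_commit([(a, -1000), (b, 1000)])  # one NO vote
```

In the second run A votes YES and stages its change, but because B votes NO the coordinator rolls everyone back and both balances stay at their committed values.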

Guarantees:

Atomicity across systems:

  • Either all commit or all rollback

Consistency:

  • No partial updates

Problems with 2PC

  1. Blocking Problem
    1. If the coordinator crashes after participants have voted YES in the prepare phase:
      1. Participants are stuck waiting
      2. They cannot decide commit/rollback themselves
  2. Single Point of Failure
    1. Coordinator failure can halt the system
  3. Slow Performance
    1. Requires multiple network round trips
    2. Participants must lock resources until final decision
  4. Not Scalable
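    1. Every added participant is one more vote to wait for and one more node that can stall the transaction, so it works poorly in large microservices architectures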
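The blocking problem can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the crash is modeled as a Python exception, locks are plain booleans, and the `Participant`, `CoordinatorCrash`, and `run` names are hypothetical.

```python
class CoordinatorCrash(Exception):
    """Stands in for the coordinator process dying mid-protocol."""


class Participant:
    def __init__(self, name):
        self.name = name
        self.state = "INIT"
        self.locked = False   # True while resources are held for the transaction

    def prepare(self):
        # Voting YES means promising to commit if told to, so locks are kept.
        self.locked = True
        self.state = "PREPARED"
        return True

    def finish(self, decision):
        # Only the coordinator's decision releases the locks.
        self.state = decision
        self.locked = False

    def on_timeout(self):
        # After voting YES a participant cannot decide alone: aborting is
        # unsafe if others committed, committing is unsafe if others aborted.
        if self.state == "PREPARED":
            return "BLOCKED"
        return self.state


def run(participants, crash_after_prepare=False):
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        raise CoordinatorCrash("decision never sent")
    decision = "COMMITTED" if all(votes) else "ABORTED"
    for p in participants:
        p.finish(decision)
    return decision


healthy = [Participant("P1"), Participant("P2")]
outcome = run(healthy)

stuck = [Participant("P1"), Participant("P2")]
try:
    run(stuck, crash_after_prepare=True)
except CoordinatorCrash:
    pass  # participants never hear a decision
```

After the crash every participant in `stuck` is still in the PREPARED state with its lock held, and `on_timeout()` can only report BLOCKED until the coordinator recovers.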

Where 2PC Is Used

  • Traditional distributed databases
  • Banking systems (where consistency is critical)
  • Some enterprise systems