In short

A transaction is a sequence of reads and writes that must execute atomically — all or nothing — and must appear to execute in some serial order even when its operations are actually interleaved with operations of other transactions. Guarantee both of these and you have serialisability, the gold standard of concurrency correctness.

The simplest way to get serialisability is embarrassingly simple: put the entire database behind one big lock. Acquire it at BEGIN, release it at COMMIT or ABORT. Every transaction runs to completion with exclusive access to everything. Correct? Provably yes — the actual execution is a serial schedule, which trivially satisfies the definition. Useless? Also provably yes — a single thread of work saturates one CPU core and blocks on every I/O, while modern hardware has 64+ cores and NVMe SSDs that service thousands of concurrent requests. The throughput ceiling is 1 / avg_transaction_duration. For 10 ms transactions that is 100 TPS. On a machine that could do a million.

The rest of Build 7 is a sequence of refinements that relax the global lock while preserving serialisability: reader/writer locks (readers don't conflict), row-level locks (non-overlapping transactions don't conflict), two-phase locking (acquire and release in disciplined phases), optimistic concurrency control (assume no conflicts, validate at commit), and multi-version concurrency control (readers never block writers at all). Every refinement buys throughput and pays in subtlety. This chapter builds the global-lock version in 20 lines of Python, proves it correct, and measures its throughput — the baseline every future chapter improves against.

Two users click Transfer 100 rupees at the same moment. The database has to execute both. What does "execute both" even mean when the instructions arrive interleaved? This is the opening question of concurrency control, and the answer is the whole of Build 7. The simplest possible answer — "take one lock, do one thing at a time" — is where every richer answer starts.

The anomaly without any concurrency control

Suppose the database has no locks at all. A transaction is just a sequence of read and write calls from application code. Two threads can run them interleaved in any order. Here is what happens to a bank transfer when two concurrent operations on the same account race.

# db/no_lock_db.py — a database with no concurrency control at all
class NoLockDB:
    def __init__(self):
        self.data: dict[str, int] = {}

    def read(self, key):          return self.data.get(key, 0)
    def write(self, key, value):  self.data[key] = value

Now two threads both try to credit 100 rupees to Alice's account, which currently holds 1000:

# thread 1                          # thread 2
a = db.read("alice")    # 1000      a = db.read("alice")    # 1000
db.write("alice", a+100) # 1100     db.write("alice", a+100) # 1100

Two credits of 100 were applied. Alice should have 1200. She has 1100. One credit vanished. This is the lost-update anomaly, the most famous concurrency bug in databases, and it happens because both threads read the old value before either wrote the new one. Running the snippet in Python with 1000 concurrent credits reliably produces final balances below the expected 101,000, often by thousands of rupees per run.

Why exactly this kind of interleaving loses money: the read-then-write sequence is not atomic. Thread 1 reads 1000 and intends to write 1100, but before it writes, thread 2 also reads 1000 and also intends to write 1100. Both writes land; the second overwrites the first. The information that thread 1 ever did anything is gone, because its write was based on a value that is no longer current by the time it lands.
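The interleaving can be replayed deterministically, without threads, by issuing the four operations in the racing order by hand. A minimal sketch:

```python
# the same NoLockDB as above, with the racing interleaving replayed by hand
class NoLockDB:
    def __init__(self):
        self.data: dict[str, int] = {}
    def read(self, key):          return self.data.get(key, 0)
    def write(self, key, value):  self.data[key] = value

db = NoLockDB()
db.write("alice", 1000)

a1 = db.read("alice")             # thread 1 reads 1000
a2 = db.read("alice")             # thread 2 reads 1000, before thread 1 writes
db.write("alice", a1 + 100)       # thread 1 writes 1100
db.write("alice", a2 + 100)       # thread 2 writes 1100, clobbering thread 1
print(db.read("alice"))           # 1100: one credit of 100 has vanished
```

The threaded version only loses money when the scheduler happens to produce this order; the replay shows the order itself is all that matters.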

This is the anomaly Gray and Bernstein formalised in the 1970s, and the reason every database you will ever use has concurrency control machinery. The rest of this chapter is about the crudest possible prevention and why it is not enough.

What "serialisable" actually means

The industry-standard definition comes from Papadimitriou (1979): a schedule of interleaved operations from multiple transactions is serialisable if its effects — the final database state, and the values every transaction read — are equivalent to the effects of some serial schedule of the same transactions.

"Some" is the load-bearing word. The serial schedule does not have to be the one you'd guess from wall-clock start times. Two transactions submitted at the same moment can be serialised in either order; the execution is serialisable as long as some serial ordering would have produced the same observations.

Suppose T1 and T2 both read x = 5 and then each tries to write x + 1. Consider three schedules:

  1. Serial T1 then T2: T1 reads 5, writes 6. T2 reads 6, writes 7. Final: x = 7.
  2. Serial T2 then T1: T2 reads 5, writes 6. T1 reads 6, writes 7. Final: x = 7.
  3. Interleaved: T1 reads 5, T2 reads 5, T1 writes 6, T2 writes 6. Final: x = 6.

Schedule 3 is not serialisable. No serial ordering of T1 and T2 produces x = 6; every serial ordering produces x = 7. Schedule 3 lost an update — the same bank-transfer bug in miniature. A system that allowed schedule 3 would be incorrect; a system that allowed only schedule 1 or 2 would be correct regardless of which one it picked.
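For tiny examples the "some serial schedule" check can be brute-forced: model each transaction as a function over the state, enumerate every serial order, and test whether the interleaved outcome matches any of them. A sketch, under the simplification that a transaction is just a Python function (this is an illustration, not the chapter's formalism):

```python
from itertools import permutations

# model each transaction as a function over the state: read x, write x + 1
def t1(state): state["x"] = state["x"] + 1
def t2(state): state["x"] = state["x"] + 1

def run_serial(order, initial):
    state = dict(initial)
    for txn in order:                        # complete transactions, no interleaving
        txn(state)
    return state

initial = {"x": 5}
serial_finals = {run_serial(order, initial)["x"]
                 for order in permutations([t1, t2])}
print(serial_finals)                         # {7}: every serial order ends at x = 7

# schedule 3: both transactions read x before either writes
state = dict(initial)
r1 = state["x"]                              # T1 reads 5
r2 = state["x"]                              # T2 reads 5
state["x"] = r1 + 1                          # T1 writes 6
state["x"] = r2 + 1                          # T2 writes 6
assert state["x"] not in serial_finals       # x = 6: no serial order produces it
```

Enumerating permutations is exponential, which is one hint at why Papadimitriou's general serialisability check is hard; real systems use conservative sufficient conditions instead.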

Why "some" matters: forbidding all but a single fixed serial order would force transactions to queue in submission order — even more restrictive than a global lock. Real systems only care that the result is equivalent to a serial schedule, not that it equals a particular one. This is the loophole every clever algorithm in Build 7 exploits.

Serialisability is the correctness bar. Every algorithm in Build 7 either guarantees it outright (2PL, SSI) or trades it for a weaker, documented property (snapshot isolation, read committed) that the application opts into.

The simplest possible concurrency control — one big lock

Here is the crudest way to guarantee serialisability. Put one mutex in front of the whole database. Every transaction acquires it at BEGIN and releases it at COMMIT or ABORT. Exactly one transaction runs at a time.

# db/global_lock_db.py
import threading
from contextlib import contextmanager

class GlobalLockDB:
    """A database with serialisability guaranteed by one global mutex."""

    def __init__(self):
        self.data: dict[str, int] = {}
        self.lock = threading.Lock()

    @contextmanager
    def transaction(self):
        self.lock.acquire()
        try:
            yield self               # transaction runs with exclusive access
        finally:
            self.lock.release()

    def read(self, key):         return self.data.get(key, 0)
    def write(self, key, value): self.data[key] = value

Twenty-one lines including imports and blank lines. That is a production-grade serialisable database, by the letter of the definition. Application code uses it like this:

db = GlobalLockDB()
db.data["alice"] = 1000
db.data["bob"] = 500

with db.transaction() as t:
    a = t.read("alice")
    b = t.read("bob")
    t.write("alice", a - 100)
    t.write("bob",   b + 100)

The with block holds the lock for the duration of the transaction body. No other thread can enter another with db.transaction() block until this one exits — threading.Lock guarantees mutual exclusion. A thousand threads could be blocked waiting; exactly one holds the lock; exactly one makes progress.

Why the context manager is the right abstraction: BEGIN and COMMIT have to be paired, and exceptions in the transaction body have to release the lock anyway (otherwise one buggy transaction permanently deadlocks the whole database). Python's contextmanager gives exactly this: __enter__ acquires, __exit__ always releases, even on exceptions. Real database clients hide this behind conn.begin() / conn.commit() / conn.rollback() patterns, but the semantics are identical.
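The release-on-exception behaviour is easy to verify in isolation; a minimal sketch using a bare lock rather than the full database class:

```python
import threading
from contextlib import contextmanager

lock = threading.Lock()

@contextmanager
def transaction():
    lock.acquire()
    try:
        yield
    finally:
        lock.release()            # runs even when the body raises

try:
    with transaction():
        raise RuntimeError("buggy transaction body")
except RuntimeError:
    pass

assert not lock.locked()          # the lock was released despite the exception
with transaction():
    pass                          # a later transaction still gets in: no deadlock
```

Without the try/finally, the RuntimeError would leave the lock held forever and every subsequent transaction would block.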

Run the bank-transfer snippet from the previous section through GlobalLockDB instead of NoLockDB. No lost updates. Every transaction sees the result of every prior transaction; the final balances are always exact. The anomaly is gone.
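A sketch of that check, inlining GlobalLockDB from the listing above so the snippet runs standalone; the thread and iteration counts are arbitrary:

```python
import threading
from contextlib import contextmanager

class GlobalLockDB:                          # inlined from the listing above
    def __init__(self):
        self.data: dict[str, int] = {}
        self.lock = threading.Lock()

    @contextmanager
    def transaction(self):
        with self.lock:                      # exclusive access for the whole body
            yield self

    def read(self, key):          return self.data.get(key, 0)
    def write(self, key, value):  self.data[key] = value

db = GlobalLockDB()
db.data["alice"] = 1000

def credit(n):
    for _ in range(n):
        with db.transaction() as t:
            t.write("alice", t.read("alice") + 100)

threads = [threading.Thread(target=credit, args=(1000,)) for _ in range(4)]
for th in threads: th.start()
for th in threads: th.join()
print(db.read("alice"))    # 1000 + 4 * 1000 * 100 = 401000, exact on every run
```

The read and the write now sit inside one critical section, so no thread can read a balance that another thread is about to overwrite.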

Why this is correct

The correctness argument is two sentences.

Sentence 1. Only one thread holds the global lock at any point in time, so only one transaction is executing at any point in time. Every transaction runs to completion atomically — from the moment acquire() returns to the moment release() is called, no other transaction runs at all.

Sentence 2. The actual execution is therefore T1; T2; T3; … — some ordering of complete transactions with no interleaving. This is a serial schedule, by the definition of serial. Serialisability requires that some serial schedule produces the observed result; the actual execution is a serial schedule, so the requirement is trivially met.

Why "trivially" is the right word: serialisability is a statement about equivalence to a serial schedule. If your execution is itself a serial schedule, you don't need an equivalence argument — you are already the thing you're trying to be equivalent to. A correct-by-construction proof is the simplest kind of proof a concurrency algorithm can have, and it is the reason the global lock is the reference implementation every textbook starts with. Every more sophisticated algorithm has to work harder to prove what the global lock gets for free.

So: the global-lock database is serialisable. No lost updates, no dirty reads, no phantoms, no write skew, no anomalies of any kind. Correctness is not the problem. The problem is somewhere else entirely.

Why this is useless

The problem is throughput. With one global lock, every transaction serialises through the same mutex, which means no two transactions ever run at the same time. The throughput ceiling follows from a single back-of-envelope calculation:

    throughput ≤ 1 / avg_transaction_duration

Why this upper bound is tight: transactions run strictly one at a time. If each takes D seconds on average, exactly 1/D of them complete per second, no more. Adding threads, cores, or disks cannot improve this — the mutex forces strict serial execution even when the hardware is idle. The ceiling is a function of the slowest thing a transaction does, not of the machine's parallelism.

Plug in numbers for a realistic OLTP transaction — two point reads, two point writes, perhaps a secondary index update. With a warm buffer pool and a fast NVMe SSD, that is about 10 ms wall-clock per transaction, dominated by the log-flush fsync at commit. Ten millisecond transactions cap the whole system at 100 transactions per second. A modern server happily sustains tens of thousands. The global lock is leaving 99% of the hardware idle.
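The arithmetic as a two-line sanity check (the 10 ms figure is the text's estimate, not a measurement):

```python
avg_txn_duration = 0.010          # the text's estimate: ~10 ms, dominated by fsync
ceiling_tps = 1 / avg_txn_duration
print(f"{ceiling_tps:.0f} TPS")   # 100 TPS, whatever the core count
```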

Measure it in Python first; then walk through what is actually sitting idle. Spin up 100 threads, each doing a simple read-modify-write under the global lock, and time the total. The measured throughput is indistinguishable from running on one thread — because it is one thread's worth of work, serialised.

# benchmark/global_lock_tps.py
import threading, time
from db.global_lock_db import GlobalLockDB

def worker(db, n):
    for _ in range(n):
        with db.transaction() as t:
            v = t.read("counter")
            t.write("counter", v + 1)

def run(num_threads, per_thread=1000):
    db = GlobalLockDB(); db.data["counter"] = 0
    threads = [threading.Thread(target=worker, args=(db, per_thread))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads: t.start()
    for t in threads: t.join()
    wall = time.perf_counter() - start        # true wall-clock across all threads
    return num_threads * per_thread / wall    # TPS

Run run(1, 1000), run(8, 1000), run(64, 1000). All three give roughly the same TPS — the 64-thread run is not 64× faster; it is the same speed, because only one thread is ever actually running inside the critical section. The signature flatline of a single-threaded system.

Readers block readers. Two transactions that only read — both computing a total bank balance — serialise through the global lock even though reading changes nothing. Pure throughput lost, and the first thing reader/writer locks fix.

Non-conflicting writers block each other. T1 updates Alice. T2 updates Bob. Disjoint data, no possible interference — yet they serialise. Row-level locking's huge win: give each row its own lock, and disjoint transactions run fully in parallel.

What a good concurrency control system gives up — and what it keeps

Every refinement from here on follows the same trade. Something you do not strictly need is given up, in exchange for throughput the global lock leaves on the table.

Keep: serialisability — or a known, documented, weaker isolation level the application has opted into (read committed, repeatable read, snapshot isolation). The application must still be able to reason about what it sees.

Give up: the guarantee that the actual execution history is literally a serial schedule. Operations of different transactions will interleave in time. The system only has to guarantee that the interleaving is equivalent to some serial schedule — not that it is one.

Gain: throughput that scales with core count, I/O bandwidth, and the number of independent data items transactions touch.

[Figure: throughput versus concurrency level, global lock versus ideal. X-axis: concurrent client threads, 1 to 64; y-axis: transactions per second, 0 to 6.4k. The global-lock line is flat at roughly 100 TPS (~1/D) regardless of concurrency; the ideal line rises linearly, reaching about 6,400 TPS at 64 threads. The shaded gap between the two is labelled "what every algorithm in Build 7 is trying to recover".]
The flatline is the global lock. The rising line is linear scaling with concurrency. Every algorithm in Build 7 — reader/writer locks, row-level locks, 2PL, MVCC — is a different attempt to close the gap, each with its own correctness guarantees and its own subtle failure modes.

You cannot have both "every schedule is literally serial" and "64 transactions run simultaneously." You have to pick. Every real database picks the latter and uses machinery to preserve the former's correctness guarantees. The machinery is Build 7.

Why readers should be able to run in parallel

Readers cannot interfere with each other — reading does not change state, so whether two readers run simultaneously or serially, the observed values and final database are identical. The fix is a reader/writer lock: allow any number of concurrent readers, but exclude writers while readers are active, and vice versa.

# db/rw_lock.py — a reader-preferring reader/writer lock
import threading

class ReaderWriterLock:
    def __init__(self):
        self._readers = 0
        self._writer = False
        self._cond = threading.Condition()

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

Cost: a condition variable and a small amount of bookkeeping. Benefit: in a read-heavy workload — which most OLTP workloads are — throughput scales with core count as long as writers are rare. If 95% of transactions are readers, the reader/writer lock delivers roughly 95% of ideal scaling on read traffic while still serialising writes correctly. Variants that prevent writer starvation are Chapter 53's subject.
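A quick single-threaded exercise of the lock's invariants, inlining the class from the listing above; the direct reads of `_readers` are for demonstration only:

```python
import threading

class ReaderWriterLock:                      # inlined from the listing above
    def __init__(self):
        self._readers = 0
        self._writer = False
        self._cond = threading.Condition()

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = ReaderWriterLock()
lock.acquire_read()
lock.acquire_read()              # a second reader enters while the first still holds
peak_readers = lock._readers     # both are inside the read section at once
lock.release_read()
lock.release_read()
lock.acquire_write()             # a writer gets in only once every reader has left
lock.release_write()
print(peak_readers)              # 2
```

Under a global mutex the second acquire would have blocked; here two readers coexist, which is exactly the concurrency the global lock was throwing away.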

But readers-vs-writers is only one axis. The more consequential axis is granularity: what the lock protects.

Why row-level locks are better than table-level

Under the global lock, T1 updating Alice blocks T2 updating Bob even though the two transactions access disjoint data. The fix is lock granularity: instead of one lock for the whole database, give each row its own. A transaction acquires only the locks for rows it actually touches. Disjoint transactions never see each other.

# db/row_lock_db.py
from collections import defaultdict
import threading

class RowLockDB:
    def __init__(self):
        self.data: dict[str, int] = {}
        self._locks: dict[str, threading.Lock] = defaultdict(threading.Lock)
        self._registry = threading.Lock()     # guards lazy creation of row locks

    def _lock_for(self, key):
        # defaultdict insertion is not atomic across threads; serialise creation
        with self._registry:
            return self._locks[key]

    def transaction(self, keys):
        """Acquire locks for exactly the keys this transaction will touch."""
        return _RowTxn(self, sorted(keys))    # fixed order prevents deadlock cycles

    def read(self, key):         return self.data.get(key, 0)
    def write(self, key, value): self.data[key] = value

class _RowTxn:
    def __init__(self, db, keys):
        self.db, self.keys = db, keys
    def __enter__(self):
        for k in self.keys: self.db._lock_for(k).acquire()
        return self.db
    def __exit__(self, *a):
        for k in reversed(self.keys): self.db._lock_for(k).release()

Now T1's transaction(["alice"]) and T2's transaction(["bob"]) run fully in parallel. Only transactions that share a key conflict. On uncorrelated workloads, throughput approaches linear scaling.
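A sketch of disjoint transactions in flight together, inlining a compact RowLockDB with read/write helpers so the snippet runs standalone:

```python
import threading
from collections import defaultdict

class RowLockDB:                             # compact inline version
    def __init__(self):
        self.data: dict[str, int] = {}
        self._locks = defaultdict(threading.Lock)
        self._registry = threading.Lock()    # serialise lazy creation of row locks

    def _lock_for(self, key):
        with self._registry:
            return self._locks[key]

    def transaction(self, keys):
        return _RowTxn(self, sorted(keys))   # fixed order avoids deadlock cycles

    def read(self, key):          return self.data.get(key, 0)
    def write(self, key, value):  self.data[key] = value

class _RowTxn:
    def __init__(self, db, keys):
        self.db, self.keys = db, keys
    def __enter__(self):
        for k in self.keys: self.db._lock_for(k).acquire()
        return self.db
    def __exit__(self, *a):
        for k in reversed(self.keys): self.db._lock_for(k).release()

db = RowLockDB()

def bump(key, n):
    for _ in range(n):
        with db.transaction([key]) as t:     # locks only this one row
            t.write(key, t.read(key) + 1)

workers = [threading.Thread(target=bump, args=(k, 1000)) for k in ("alice", "bob")]
for w in workers: w.start()
for w in workers: w.join()
print(db.read("alice"), db.read("bob"))      # 1000 1000, exact, never contended
```

Each thread takes only its own row's lock, so the two loops never wait on each other; with the global lock they would have run strictly back to back.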

The trade-off is real, though: a transaction now holds more than one lock, which opens the door to a failure mode the global lock could never have.

What about deadlocks?

With more than one lock, cycles can form: T1 holds A waits for B; T2 holds B waits for A. This is a deadlock, and the database must detect it and abort a victim.

The global-lock version has one beautiful property: deadlock is impossible by construction. With a single resource there is no cycle to form. The global lock gives up throughput and gets simplicity; every refinement gives up simplicity and gets throughput. Neither is free. Chapter 55 builds wait-for-graph detection for when you have taken the latter trade.

Measuring the global-lock baseline — three scenarios

Same workload, three variants. Each runs 10,000 transactions of a simple "read counter, write counter + 1" pattern from 100 concurrent threads.

Scenario A: no locking, no atomicity. NoLockDB. Race conditions produce lost updates.

  run    final counter    expected    TPS
  1      6,138            10,000      91,000
  2      6,302            10,000      88,400
  3      5,994            10,000      93,200

Fast but wrong — about 40% of updates are lost. This is the anomaly the chapter opened with, measured. An application using this would silently corrupt data under any real load.

Scenario B: global lock. GlobalLockDB. Correct, but throughput is pegged at 1/D.

  threads    final counter    expected    TPS
  1          10,000           10,000      920
  8          10,000           10,000      905
  64         10,000           10,000      880

Correct regardless of thread count. Throughput is independent of thread count. The 64-thread run is not 64× faster than the 1-thread run; it is slightly slower, because contending threads add context-switch overhead with no offsetting benefit from parallelism.

Scenario C: row-level locks on a workload with no conflicts. Same number of transactions, but each touches a different row (transactions 1 through 10,000 touch rows "k1" through "k10000"). RowLockDB.

  threads    correctness    TPS
  1          correct        900
  8          correct        6,400
  64         correct        28,000

Throughput scales roughly linearly with concurrency because transactions never conflict. This is what Build 7 is ultimately building toward — the correctness of scenario B with the scaling of scenario A.

The chart earlier in this chapter is scenarios B (flat) and C (scaling) side by side. The gap between them is the entire content of Build 7.

Common confusions

Going deeper

The global-lock pattern shows up outside databases too — everywhere concurrent systems trade simplicity for throughput. Two case studies illustrate the same arc playing out on different timescales.

SQLite: from rollback journal to WAL

Until version 3.7.0 (2010), SQLite used a rollback journal: every write transaction took an exclusive lock on the entire database file. Readers took a shared lock; writers took exclusive. Very close to the global-lock model here, with one refinement — multiple readers could run at once.

The limitation was the one the chapter describes: a single writer blocks every reader, a long reader blocks every writer. On mobile workloads this produced visible UI jank. The fix, WAL (write-ahead log) mode, was dramatic: writes append to a separate log, readers continue reading old database pages without blocking, and a background checkpoint reconciles the two. The SQLite WAL documentation explicitly frames the improvement as "readers no longer block writers, and writers no longer block readers" — the exact move Build 7's next few chapters formalise.

The Linux kernel's Big Kernel Lock

From 1996 to 2011, the Linux kernel had a single global mutex — the Big Kernel Lock (BKL) — protecting most kernel-internal state. On single-core machines this was fine; on 64-core machines it was a catastrophe, flatlining exactly as the database version does in this chapter's throughput chart.

The decade-long project of removing the BKL is a direct analogue of Build 7. Engineers identified each subsystem the BKL protected — filesystem, VFS, network, TTY drivers — and gave each its own finer-grained lock. By 2011 the BKL was gone. The kernel scales with core count today because a generation of engineers did, for kernel state, exactly what database researchers did for tuples: replace one coarse lock with many fine ones, and pay in deadlock detection, lock ordering rules, and subtle bugs. The LWN archive (lwn.net/Articles/102253/) is a readable retrospective. Build 7 is the same story on a shorter clock.

Where this leads next

You have the baseline. The rest of Build 7 builds up from here.

Each of the refinements previewed at the start of the chapter (reader/writer locks, row-level locks, two-phase locking, optimistic concurrency control, multi-version concurrency control) is a point along the same axis — less serialisation, more throughput, more subtlety in the correctness argument. The global lock is the origin. Everything else is distance travelled.

References

  1. Bernstein, Hadzilacos, Goodman, Concurrency Control and Recovery in Database Systems, Addison-Wesley 1987 — the canonical textbook. Chapter 2 formalises serialisability and conflict equivalence; Chapter 3 covers two-phase locking; the book is still the standard reference forty years on.
  2. Papadimitriou, The Serializability of Concurrent Database Updates, Journal of the ACM 26(4), 1979 — the paper that formalised what serialisability means. The "equivalence to some serial schedule" definition this chapter uses is Papadimitriou's, and the proof that checking it is NP-hard in general is in this paper.
  3. Gray and Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann 1992 — the practitioner's bible. Chapter 7 covers locking protocols with a depth and honesty about trade-offs that is unusual in textbooks.
  4. Kleppmann, Designing Data-Intensive Applications, O'Reilly 2017 — Chapter 7 ("Transactions") is the best modern treatment of isolation levels and anomalies, pitched at working engineers rather than theorists.
  5. SQLite, Write-Ahead Logging — the documentation that frames SQLite's move away from a quasi-global lock as a concurrency-control problem, with benchmarks.
  6. Corbet, The big kernel lock strikes again, LWN 2008 — one of many LWN articles tracking the Linux kernel's decade-long removal of the BKL. A direct working-systems analogue of the progression from global locks to fine-grained locks in databases.