Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
Read-your-writes, monotonic reads, monotonic writes, writes-follow-reads
It is a Saturday afternoon at MealRush. Riya, an SRE on the platform team, is staring at a Slack thread where customer support has posted seven screenshots of the same complaint: a user adds an address, the app says "Address saved", the user navigates to the checkout screen, and the new address is gone. The address is in the database — Riya can see it on the Mumbai replica. The checkout screen happened to read from the Pune replica. The Pune replica had not received the address-write yet because anti-entropy fires every 800 ms and the user navigated screens in 240 ms. The fix is not to upgrade the entire database to linearizable. The fix is to make sure the same user reads from a replica that has at least seen their own writes. That is read-your-writes — one of the four session guarantees that Bayou's authors named in 1994 and that every modern app needs whether it knows the term or not.
The four session guarantees — read-your-writes (RYW), monotonic reads (MR), monotonic writes (MW), and writes-follow-reads (WFR) — are per-client invariants that turn an eventually-consistent store into something a single user can trust without paying the cost of global consistency. They cost almost nothing on the read path (a sticky session, a version vector, or a "no-staler-than" timestamp) and they kill the most embarrassing class of consistency bug: the user just did the thing and the app already forgot.
What the four guarantees actually promise
Bayou's 1994 paper (Terry, Demers, Petersen, Spreitzer, Theimer, Welch) introduced the four guarantees as the minimum set of properties a session-aware client needs over a weakly-consistent store. Each one fixes a different way the user-visible behaviour diverges from "things happened in some sensible order".
Read-your-writes (RYW): if a session writes value v to key k, every subsequent read of k in the same session returns v or a later version. The user who just typed in their delivery address will see that address on the next screen. Without RYW, MealRush's address-saved-then-gone bug happens.
Monotonic reads (MR): if a session reads value v of key k, every subsequent read of k in the same session returns v or a later version — never an older one. Without MR, a user refreshes the order-status page and sees OUT_FOR_DELIVERY, refreshes again two seconds later and sees PREPARING, refreshes a third time and sees OUT_FOR_DELIVERY again. Time appears to run backwards. Users do not understand why; they just close the app.
Monotonic writes (MW): writes from a single session are applied in the order the session issued them. Without MW, a user who edits their profile (set name to "R", then "Ri", then "Riya") may end up with the database storing "Ri", because the three writes hit three replicas and propagated in different orders. MW promises that the order the session issued the writes is the order every replica applies them.
Writes-follow-reads (WFR): if a session reads v from key k and then writes w to key k', every replica that observes w must also have observed v (or a successor of v). Without WFR, you can read a comment, write a reply that references it, and have another reader see the reply before the comment it replies to.
Why these four and not three or five: each guarantee fixes one specific anomaly that breaks a different mental model the user has about cause and effect. RYW is "I just did it, why don't I see it?" MR is "why is time going backwards?" MW is "why is my last action overwritten by an earlier one?" WFR is "why does the reply exist before the message?" The four are independent — you can have any subset without the others — and Bayou proved they are jointly sufficient for single-session causal sanity. Stronger models like causal consistency cover all clients' interactions; session guarantees cover only one user's view of their own actions. That narrow scope is what makes them cheap.
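To make the read-side pair concrete, here is a minimal sketch (a hypothetical trace format, not any store's API) that checks RYW and MR as predicates over one session's history, where each event is ("r", key, version) or ("w", key, version). MW and WFR constrain the order replicas apply writes, so they cannot be checked from the session's own trace alone.

def read_your_writes(history):
    # RYW: every read of k returns the session's latest write to k, or newer.
    last_write = {}
    for op, key, version in history:
        if op == "w":
            last_write[key] = version
        elif key in last_write and version < last_write[key]:
            return False   # the session's own write was "forgotten"
    return True

def monotonic_reads(history):
    # MR: successive reads of k never return an older version than before.
    floor = {}
    for op, key, version in history:
        if op == "r":
            if version < floor.get(key, 0):
                return False   # time ran backwards for this session
            floor[key] = version
    return True

# The MealRush bug is a one-line RYW violation:
assert not read_your_writes([("w", "addr", 5), ("r", "addr", 4)])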
How a session guarantee is implemented — three production techniques
There are three common ways to deliver session guarantees on top of an eventually-consistent store. They differ in cost, generality, and how badly they break under failover.
Sticky sessions (the cheapest, most fragile). The load balancer routes every request from session S to the same replica R for the session's lifetime. RYW and MR fall out for free — the client only ever talks to one replica, so it sees its own writes and never goes backwards. MW falls out if the replica processes the session's writes serially. WFR is harder: it constrains the order in which the session's write reaches other replicas, which stickiness does not control, so it holds only for readers of the pinned replica itself. Failure mode: when R dies, the session is rerouted to R', which may be behind, and all four guarantees collapse for that session until anti-entropy catches up.
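A minimal sketch of the mechanism (the router and health check are hypothetical, not any particular load balancer's API):

import hashlib

def pick_replica(session_id, replicas):
    # Pin: the same session id always hashes to the same replica.
    h = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
    return replicas[h % len(replicas)]

def route(session_id, replicas, healthy):
    pinned = pick_replica(session_id, replicas)
    if healthy(pinned):
        return pinned
    # Failover path: the fallback replica knows nothing about what this
    # session has seen, which is exactly where all four guarantees collapse.
    return next(r for r in replicas if healthy(r))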
Version vectors carried by the client (the most general). Every write returns a version vector (or a single causality token, like Cosmos DB's session token, MongoDB's clusterTime, or AWS DocumentDB's session continuation). The client sends the most-recent token on every subsequent request. Each replica refuses to serve a read whose token is ahead of the replica's own state — it either waits for replication or forwards to a peer that's caught up. Cost: an extra 16–32 bytes per request, plus occasional waits. This survives replica failover; whichever replica picks up the session can compare tokens and decide whether to wait.
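In sketch form (a hypothetical token shape; real products differ in encoding), the client folds every response's vector into its token, and a replica serves a read only when it has applied everything the token has seen:

def merge_tokens(a, b):
    # Component-wise max over every replica id appearing in either vector.
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def can_serve(replica_state, client_token):
    # Serve only if this replica is at or past everything the client saw;
    # otherwise wait for replication or forward to a caught-up peer.
    return all(replica_state.get(r, 0) >= v for r, v in client_token.items())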
Hybrid logical clocks with a "no-staler-than" timestamp. The client tracks the maximum HLC timestamp it has ever observed (read from any read response, written by any write ack). Every subsequent read includes min_safe_ts = client_max_hlc; the replica refuses to serve until its own HLC has advanced past min_safe_ts. CockroachDB does this for its "follower reads with bounded staleness" mode; Spanner does a richer version with TrueTime. The HLC approach generalises beyond a single session to causality across sessions if you wire it through, but as a session-scoped technique it is just version-vector-with-one-component.
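A minimal HLC sketch under stated assumptions (millisecond wall clock, one clock object per replica); this is the shape of the guard, not CockroachDB's or Spanner's implementation:

import time

class HLC:
    def __init__(self):
        self.l = 0  # highest wall-clock millisecond observed so far
        self.c = 0  # logical counter ordering events within that millisecond

    def now(self):
        # Local or send event: take the wall clock if it moved forward,
        # otherwise bump the logical counter.
        pt = int(time.time() * 1000)
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def observe(self, lm, cm):
        # Receive event carrying timestamp (lm, cm): advance past it.
        pt = int(time.time() * 1000)
        l_prev = self.l
        self.l = max(l_prev, lm, pt)
        if self.l == l_prev == lm:
            self.c = max(self.c, cm) + 1
        elif self.l == l_prev:
            self.c += 1
        elif self.l == lm:
            self.c = cm + 1
        else:
            self.c = 0
        return (self.l, self.c)

def safe_to_serve(replica_ts, min_safe_ts):
    # The "no-staler-than" guard: serve only once the replica's HLC has
    # advanced past the highest timestamp the client has observed.
    return replica_ts >= min_safe_ts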
Why three techniques and not one: sticky sessions are easy to bolt on but die at failover. Version vectors are robust but require server-side bookkeeping for every key (or every causally-related set of keys). HLC tokens are compact and survive failover, but require clocks synchronised to within a few milliseconds across replicas (NTP is usually enough; PTP is tighter). Production systems pick based on what they already have. MongoDB has a global oplog timestamp, so it uses that. Cassandra has no global ordering, so its driver layer talks about stickiness and LOCAL_QUORUM rather than session guarantees. Cosmos DB exposes session consistency as a first-class level and ships a token. The choice flows from the underlying clock and replication architecture.
A simulator for read-your-writes failure and recovery
The Python below sets up two replicas, an asynchronous replication delay, and a client that writes-then-reads. With no session guarantee the client sometimes fails to see its own write. With token-based RYW it always does — but pays a small wait when its read lands on the lagging replica.
# session_guarantees_sim.py — RYW with and without a version token
import random
import time
from threading import Thread, Lock

class Replica:
    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> (value, version)
        self.version = 0
        self.lock = Lock()

    def write(self, k, v):
        with self.lock:
            self.version += 1
            self.store[k] = (v, self.version)
            return self.version

    def read(self, k, min_version=0):
        with self.lock:
            if self.version < min_version:
                return ("WAIT", self.version)
            return (self.store.get(k, (None, 0))[0], self.version)

def replicate(src, dst, delay=0.05):
    """Async one-way replication: src → dst with `delay` seconds lag."""
    while True:
        time.sleep(delay)
        with src.lock:
            snapshot = dict(src.store)
            v = src.version
        with dst.lock:
            for k, (val, ver) in snapshot.items():
                if dst.store.get(k, (None, 0))[1] < ver:
                    dst.store[k] = (val, ver)
            dst.version = max(dst.version, v)

def client_session(write_replica, read_replica, value, use_token):
    token = write_replica.write("addr", value)
    time.sleep(random.uniform(0, 0.08))  # the user navigates between screens
    val, _ = read_replica.read("addr", min_version=(token if use_token else 0))
    return val

if __name__ == "__main__":
    R1, R2 = Replica("R1"), Replica("R2")
    Thread(target=replicate, args=(R1, R2, 0.05), daemon=True).start()
    # No token — the read may land on R2 before replication has delivered
    # this session's write, so the session misses its own address.
    bad = sum(client_session(R1, R2, f"Flat {i}", use_token=False) != f"Flat {i}"
              for i in range(50))
    print(f"no-token RYW failures over 50 sessions: {bad}")
    # With token — a read on R2 returns WAIT until R2 catches up.
    good = 0
    for i in range(50):
        token = R1.write("addr", f"Flat {i}")
        for _ in range(20):
            val, _ = R2.read("addr", min_version=token)
            if val != "WAIT":
                good += 1
                break
            time.sleep(0.005)
    print(f"token RYW successes over 50 sessions: {good}")
Sample output (one run):
no-token RYW failures over 50 sessions: 16
token RYW successes over 50 sessions: 50
Why the no-token version fails 16 out of 50 sessions and not 50: the client's read sometimes lands on R2 after the 50 ms replication tick has fired, in which case R2 has caught up and the read succeeds by luck. The 34 lucky sessions are exactly the bug pattern that hides RYW violations in production — most of the time the system "works", and roughly one session in three randomly breaks. Users notice the broken sessions; engineers see "intermittent issue, can't reproduce". The token version costs at most a few 5 ms waits per read (bounded by the 50 ms replication lag); in exchange, every session sees its own write deterministically.
The simulator's "no-token" mode is exactly what MealRush had in production. The address-saved-then-gone bug fired on roughly 30% of sessions — high enough that customer support saw it constantly, low enough that engineering's local tests almost never reproduced it. The fix took one afternoon: the database driver was upgraded to one that returned a session token on every write and required min_token on every read. Customer-support ticket volume dropped to zero with the next deployment. RYW costs the platform a 4 ms p99 wait on roughly 1.2% of reads — a price the on-call rotation gladly pays.
Why monotonic reads matters more than it looks
RYW is the famous one. MR — monotonic reads — is the quiet sibling that engineers forget until a user complaint forces them back. The complaint always sounds like "the page glitched". What's actually happening: the user refreshes and the load balancer routes the second read to a different replica that's behind the first.
CricStream — a fictional sports streamer — saw this during a cricket final. The match-score page polled /score every 4 seconds. The score moved from RCB 142/3 to RCB 148/4 to RCB 152/4. A user refreshed the page 280 ms after the third update; the new request hit a replica that was 2.4 seconds behind, and the page rendered RCB 148/4. Forty seconds later, with three intervening polls all returning 152/4, another refresh landed on a fourth replica that was also behind, and the page jumped back to 148/4. From the user's perspective: a wicket got unscored. Twitter exploded. CricStream's CDN rules were technically correct (route to the nearest healthy replica); the consistency model was the bug.
The fix is identical to RYW's: track a last_seen_version cookie in the client (an ETag-style response header can carry the version) and require every subsequent read to observe at least that version. CricStream wired it through the CDN layer rather than the database — a bounded-staleness header at the edge — but the underlying technique is the same min-version guard.
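A sketch of that guard with hypothetical header names (the real wiring depends on the CDN):

def serve_score(request_headers, replicas):
    # The client echoes the highest version it has already rendered; the
    # edge refuses to answer from any replica still below that floor.
    floor = int(request_headers.get("X-Min-Version", "0"))
    for replica in replicas:  # nearest-first order
        body, version = replica.get("/score")
        if version >= floor:
            return body, {"X-Score-Version": str(version)}
    raise RuntimeError("no nearby replica has caught up to the client's floor")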
Common confusions
- "Read-your-writes is the same as strong consistency." No — RYW only constrains your own session's reads. A different user reading the same key can still get an older value. Strong consistency (linearizability) constrains all sessions and adds real-time ordering. RYW is single-session and free; linearizability is global and expensive. See linearizability and eventual consistency.
- "Sticky sessions give you all four guarantees." Sticky sessions give you RYW and MR cheaply, MW if writes are processed serially per session, and WFR partially. They collapse on failover — if your sticky replica dies and the session moves to a fresh replica, the new replica has no idea what the session has seen, and any of the four guarantees can be violated for one or two reads.
- "Monotonic reads means reads return the latest value." No — it means reads do not go backwards. The system can return a stale-but-consistent value forever as long as it never returns one staler than the previous reply in the same session. MR is silent about freshness; it is only about non-regression.
- "Writes-follow-reads is just causality." WFR is the single-session shadow of causal ordering. It says my write that follows my read must be observed by every replica only after that read's value. Causal consistency is broader: it preserves happens-before across all sessions and chains of communication. WFR is a per-session approximation; causal consistency is the global property. See causal consistency.
- "Cosmos DB session consistency is the same as MongoDB causal consistency." They look similar (both ship a token) but Cosmos's session level guarantees only the four session properties for one client. MongoDB's
causalConsistency: truemode promises causal consistency across multiple clients in the same causally-consistent cluster. The token format and guarantees differ; do not assume the words "session" and "causal" are interchangeable across vendors. - "You can't violate session guarantees once you've added the token." You can — every time the client loses the token (logout, app crash, switch device, clear cache, fail over to a new region with a fresh session) the new session starts at min-token=0 and any of the four guarantees can be violated for the first few reads. Production code has to handle "I dropped the token" gracefully or RYW becomes RYW-when-the-token-survives.
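A hedged sketch of that degradation path (every name here is hypothetical):

def session_read(store, key, token):
    # Token gone (logout, crash, new device): degrade to an explicitly
    # weaker read instead of pretending RYW still holds.
    if token is None:
        value, token = store.read(key)               # plain eventual read
        return value, token, "degraded"
    value, token = store.read(key, min_token=token)  # session-consistent read
    return value, token, "session-consistent"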
Going deeper
Bayou 1994 — the paper that named the four
Terry, Demers, Petersen, Spreitzer, Theimer and Welch's 1994 PDIS paper Session Guarantees for Weakly Consistent Replicated Data defined and named the four guarantees, and proved their independence (each one is achievable without any of the others). Bayou's setting was disconnected mobile clients in the 1990s — laptops syncing with a server over occasional network connections — and the team noticed that the application often only needed the per-session invariants, not full one-copy serialisability. The session-token mechanism (a write-set and a read-set carried by the client) is the direct ancestor of every modern session-token API. Read the paper for the proofs; the implementation it sketches is roughly what Cosmos DB shipped 25 years later.
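In sketch form, the paper's conditions reduce to set dominance (simplified here to plain sets of write identifiers; the paper orders them per server):

class Session:
    def __init__(self):
        self.read_set = set()    # writes behind the values this session read
        self.write_set = set()   # writes this session issued

def ryw_ok(server_writes, session):
    # RYW holds at a server that already has every write the session made.
    return session.write_set <= server_writes

def mr_ok(server_writes, session):
    # MR holds at a server that has every write behind the session's reads.
    return session.read_set <= server_writes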
Cosmos DB's session level and the long-running session problem
Azure Cosmos DB exposes session consistency as one of its five consistency levels, between consistent prefix and bounded staleness. The session token is a per-region LSN string the client carries; the SDK transparently appends it to every read, and the service's 99.99% consistency-attainment SLA applies to the session level like any other. The interesting failure mode is long-running sessions — a 6-hour-old token from a region that has since failed over can point at an LSN the new region cannot serve; the read fails with a read-session-not-available error and the client has to start a fresh session, losing the guarantees for its first few reads. Engineers building on Cosmos learn to bound session lifetimes to ~minutes for that reason.
MongoDB causal sessions and the cluster-time vector
MongoDB 3.6 introduced causal consistency on top of replica sets via the clusterTime and operationTime timestamps that the driver sends with every operation. The driver tracks afterClusterTime per session; reads block on the secondary until its clusterTime reaches that value. The mechanism implements all four session guarantees plus per-session causal consistency. The cost: an extra 24-byte timestamp per request and occasional 5–50 ms waits when reading from secondaries that are catching up. The migration path MongoDB documented — flip a session option, no schema change — is the cheapest "we got better consistency" upgrade most teams ever ship.
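In PyMongo that looks like the sketch below (connection string and collection names are placeholders; per MongoDB's docs, the full guarantees also need majority read and write concerns):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
with client.start_session(causal_consistency=True) as s:
    orders = client.mealrush.orders
    orders.insert_one({"user": "riya", "addr": "Flat 12, MG Road"}, session=s)
    # Even if this read is routed to a lagging secondary, it waits until the
    # secondary's clusterTime reaches the insert's operationTime, so RYW holds.
    doc = orders.find_one({"user": "riya"}, session=s)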
Why Spanner doesn't talk about session guarantees
Google's Spanner gives you external consistency (linearizability with a real-time clock) by default, which strictly implies all four session guarantees plus much more. Spanner's price for that is TrueTime — atomic clocks plus GPS plus a 7 ms uncertainty wait on every commit. For workloads where session guarantees would suffice, Spanner is overpaying. Spanner's "stale read" mode (ReadOnly with bounded staleness) is the escape hatch for read-mostly workloads that want session-class behaviour at lower latency. The lesson: stronger models subsume weaker ones, but always at a latency cost that the application may not want to pay.
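For illustration, the stale-read escape hatch in the Python client looks roughly like this (instance, database, and table names are placeholders):

import datetime
from google.cloud import spanner

client = spanner.Client()
database = client.instance("demo-instance").database("demo-db")
# A read-only snapshot taken exactly 10 seconds in the past: no commit wait,
# no leader round-trip — session-class freshness at follower-read latency.
with database.snapshot(exact_staleness=datetime.timedelta(seconds=10)) as snap:
    rows = snap.execute_sql(
        "SELECT addr FROM users WHERE id = @id",
        params={"id": "riya"},
        param_types={"id": spanner.param_types.STRING})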
The session-pinning anti-pattern at the load balancer
A common 2018-era mistake: implement RYW by pinning user sessions to a database replica via load-balancer cookies. It works for healthy single-AZ traffic. Then a deploy reboots that replica, the cookie sticks but the connection breaks, the client reconnects to a different replica, and every guarantee is silently violated for the next few requests. PaySetu hit this exact pattern in 2024 — RYW worked perfectly on every Tuesday-morning deploy until anti-entropy lag was longer than the LB's failover window, then it didn't, and the post-mortem was three pages of "we thought we had RYW, we had only LB stickiness". The lesson: implement RYW at the data-tier with tokens, not at the LB tier with cookies.
Where this leads next
Session guarantees are the second rung of Part 12's lattice — above plain eventual consistency, below causal consistency. The four properties are the cheapest meaningful step up from "anything goes" and almost every production AP-side system layers them on, sometimes without naming them. The progression:
- Eventual consistency — the floor; replicas converge in the limit.
- Session guarantees (this chapter) — per-client RYW, MR, MW, WFR.
- Causal consistency — happens-before preserved across all clients.
- Sequential consistency — single global order, no real-time.
- Linearizability — global order plus real-time.
Part 13's CRDT chapters revisit the convergence problem from the data-type side; once you have CRDTs plus session guarantees, you have most of what an AP-side application needs to feel coherent to its users.
References
- Terry, D., Demers, A., Petersen, K., Spreitzer, M., Theimer, M., Welch, B. — "Session Guarantees for Weakly Consistent Replicated Data" (PDIS 1994). The paper that named and proved the four.
- Vogels, W. — "Eventually Consistent" (CACM, 2009). Mentions the session variants in the broader EC framing.
- Bailis, P., Davidson, A., Fekete, A., Ghodsi, A., Hellerstein, J., Stoica, I. — "Highly Available Transactions: Virtues and Limitations" (VLDB 2014). Where session guarantees fit on the achievable-under-partition spectrum.
- Microsoft Azure Cosmos DB — "Consistency levels in Azure Cosmos DB" (docs). The session level's SLA and token mechanics.
- MongoDB — "Causal Consistency and Read and Write Concerns" (docs, 3.6+). The clusterTime/operationTime mechanism.
- Helland, P. — "Building on Quicksand" (CIDR 2009). Why session-scoped guarantees are usually what apps actually need.
- Eventual consistency — the floor that session guarantees sit on top of.
- Causal consistency — the rung above; same techniques generalised across clients.