Where this is all going: the database as a materialized view
Pull up the Razorpay architecture diagram in your head. Postgres holds payments. Elasticsearch serves merchant search. Redis caches dashboard tiles. Snowflake answers analytics. A Flink job computes fraud features. Five databases — and exactly one truth, dripping out of Postgres's WAL into a Kafka topic and fanning out into the other four. They look like five systems. They are one log and four views.
Every database in your stack — the OLTP row store, the search index, the cache, the warehouse, the ML feature store — is a materialized view over the same ordered log of facts. The log is the source of truth; everything else is a derived structure that can be rebuilt by replay. This is Jay Kreps's "turning the database inside out," and it is the thesis Build 23 has been quietly building toward.
The thesis, in one sentence
Throughout Build 23 you have seen the same pattern arrive from four different angles. Kafka is a durable ordered log. CDC turns Postgres's WAL into a Kafka topic. The stream/table duality says a table is just the fold of its changelog. Materialize keeps a SQL query's answer correct as new rows land. Each chapter handed you one piece. This chapter is the picture on the box.
The picture is this: in 2013, Jay Kreps — at the time one of the creators of Kafka at LinkedIn — wrote an essay called The Log: What every software engineer should know about real-time data's unifying abstraction; in 2014, Martin Kleppmann sharpened it into a Strange Loop talk titled Turning the Database Inside Out. The argument is one sentence: a database, viewed from far enough away, is a log of changes plus one or more derived views that index the log differently. Postgres has a WAL underneath and B-trees on top. Elasticsearch has a translog underneath and inverted indexes on top. Cassandra has a commitlog underneath and SSTables on top. Every storage engine's "real" state is its log; everything you query is a derivative.
Kreps's move is to take that observation and turn it inside out at the system level. Instead of one database whose log is a private implementation detail, imagine a single shared log that every system in your company subscribes to. Each system maintains its own derived view — search keeps an inverted index, the cache keeps a hash map, the warehouse keeps columnar files, the ML pipeline keeps feature aggregates. None of them owns the truth. The log does.
Why this reframing matters: in the classical picture, when you want a new view of the data (say, a graph index for fraud analysis), you have to either dual-write into a second system or run a nightly batch ETL. Both are hacks around the fact that the log was private. Once the log is shared, adding a view is a job: subscribe to the topic, fold it into the structure you want. No dual writes. No drift.
How the four Build 23 chapters add up
Walk Build 23 backwards and the thesis assembles itself.
Kafka (chapter 174) gave you the durable ordered log as a service: a partitioned, replicated append-only file that every consumer reads at its own pace, identified by a byte offset. Kafka is the substrate.
The stream/table duality (chapter 175) gave you the equivalence: every stream folds into a table, every table emits a stream. There is no "log database" versus "row database" — there is a log, and there are folds of the log.
CDC with Debezium (chapter 180) gave you the on-ramp: how to turn an existing OLTP database (Postgres, MySQL) into a Kafka producer without dual writes. The WAL was already there. Debezium just teaches your other systems to read it.
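A minimal sketch of what Debezium actually puts on the topic. The envelope fields (before, after, op, ts_ms) follow Debezium's documented change-event format; the payments columns and the row_image helper are this chapter's running example, not Debezium API:

```python
# Sketch of a Debezium-style change event for an INSERT into payments.
# op codes: c=create, u=update, d=delete, r=snapshot read.
event = {
    "before": None,              # no prior row image: this is an insert
    "after": {"payment_id": "p1", "merchant_id": "m42",
              "amount": 50000, "status": "captured"},
    "op": "c",
    "ts_ms": 1735689600000,
}

def row_image(evt):
    """Return the row state this event leaves behind (None for a delete)."""
    return evt["after"] if evt["op"] != "d" else None

print(row_image(event)["status"])   # captured — the fact consumers fold
```

Every downstream view in this chapter consumes exactly this shape: it ignores before on inserts, uses after as the new row state, and treats op == "d" as a removal.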
Materialize and Differential Dataflow (chapter 179) gave you the off-ramp: a SQL query that stays correct as new rows land in the log. The query is the view, the log is the source, and the system maintains the relationship.
Glue them together and you have the inside-out architecture: Postgres feeds Kafka via Debezium; Materialize, Flink, Elasticsearch, and Redis all subscribe; each maintains its own structure; none of them write back to Postgres because none of them are the truth.
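The duality at the centre of that glue is small enough to show directly. A sketch (the fold and diff function names are mine): a changelog folds into a table, and the difference between two table states is itself a changelog, so the round trip closes:

```python
def fold(changelog):
    """stream -> table: replay (key, value) updates into current state."""
    table = {}
    for key, value in changelog:
        table[key] = value
    return table

def diff(old, new):
    """table -> stream: the updates that turn old into new (ignores deletes)."""
    return [(k, v) for k, v in new.items() if old.get(k) != v]

log = [("p1", "created"), ("p1", "captured"), ("p2", "created")]
table = fold(log)   # {'p1': 'captured', 'p2': 'created'}

# Round trip: an old table plus the diff to the new table folds to the new table.
old = fold(log[:1])
assert fold(log[:1] + diff(old, table)) == table
```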
A worked example: Razorpay's payment view fan-out
Make this concrete. Razorpay has one source-of-truth table — payments in Postgres. The company serves 50 million merchants and processes 5,000 transactions per second at peak. When a payment captures, that single source-of-truth row needs to power:
- The merchant dashboard — SELECT SUM(amount) WHERE merchant_id=? AND date=today. Hot, sub-100ms latency requirement.
- The fraud model — features like count of payments by this card in the last 60 seconds. Sliding-window aggregate, sub-second latency.
- The settlement pipeline — at 11pm, sum every captured payment by merchant and produce a settlement file. Daily batch.
- The customer support search — "find me Riya's failed payment from Tuesday." Free-text search across email, transaction ID, last 4 of the card.
- The compliance archive — every payment, immutable, for 7 years. RBI audit requirement.
Five different access patterns. In the inside-out architecture, none of them queries Postgres directly. Postgres handles the OLTP write — INSERT INTO payments (...) — and produces a WAL entry. Debezium turns that WAL entry into a Kafka record on the payments-cdc topic. Five subscribers consume it:
# pseudocode for the five views, each maintained by a different process

# view 1: dashboard cache (Redis)
def consume_for_dashboard(event):
    if event.status == "captured":
        redis.hincrby(f"merchant:{event.merchant_id}:today", "total", event.amount)

# view 2: fraud features (Flink, sliding-window)
def consume_for_fraud(event):
    flink_state.append(event.card_hash, event.timestamp)
    flink_state.expire_older_than(60_seconds)

# view 3: settlement (Snowflake, batch read at 23:00)
# Snowflake just consumes the topic with a CDC source connector;
# at 23:00 a query aggregates the day's partition.

# view 4: search (Elasticsearch)
def consume_for_search(event):
    es.index(doc_id=event.payment_id, body={
        "merchant_id": event.merchant_id,
        "amount": event.amount,
        "email": event.email,
        "card_last4": event.card_last4,
    })

# view 5: archive (S3 + Iceberg)
def consume_for_archive(event):
    iceberg.append("payments_archive", event)
Each consumer holds an offset on the Kafka topic. If the dashboard cache crashes, you redeploy it pointed at offset zero (or the last good checkpoint), and it rebuilds itself by replaying the log. If you decide on Tuesday that the fraud team needs a new feature — count of payments to merchants in Bengaluru in the last hour — you write a new Flink job, point it at the topic from offset zero, and it computes itself in 20 minutes. No backfill SQL. No dual writes. No "but the legacy system doesn't have this column."
The five views are independent. They can have different schemas, different indexes, different retention policies. The only thing they share is the topic — the log of facts that happened.
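The fraud view's pseudocode above compresses into a few lines of real code. A minimal single-process sketch of the sliding-window count (class and method names are mine; Flink manages equivalent per-key state with checkpoints and parallelism):

```python
from collections import defaultdict, deque

class SlidingCount:
    """Count of events per key within the last `window` seconds."""
    def __init__(self, window=60):
        self.window = window
        self.events = defaultdict(deque)   # card_hash -> recent timestamps

    def observe(self, card_hash, ts):
        q = self.events[card_hash]
        q.append(ts)
        while q and ts - q[0] >= self.window:   # expire timestamps >= 60s old
            q.popleft()
        return len(q)                            # the fraud feature

fraud = SlidingCount(window=60)
print(fraud.observe("card_a", ts=0))    # 1
print(fraud.observe("card_a", ts=30))   # 2
print(fraud.observe("card_a", ts=89))   # 2 (the ts=0 event has expired)
```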
Building the simplest possible inside-out system in 40 lines
You can demo this on your laptop. The log is a list. The "database" is anything that subscribes and folds. Type this in.
# inside_out.py — Kreps's vision in one file
from collections import defaultdict

class Log:
    def __init__(self):
        self.records = []  # the source of truth
    def append(self, record):
        self.records.append(record)
    def read_from(self, offset):
        return list(enumerate(self.records[offset:], start=offset))

# --- View 1: a key-value store, just like Postgres' tables ---
class KVView:
    def __init__(self, log): self.log, self.offset, self.kv = log, 0, {}
    def catch_up(self):
        for off, rec in self.log.read_from(self.offset):
            self.kv[rec["payment_id"]] = rec
            self.offset = off + 1
    def get(self, pid): self.catch_up(); return self.kv.get(pid)

# --- View 2: a daily-total aggregate, just like a dashboard cache ---
class DailyTotalView:
    def __init__(self, log): self.log, self.offset, self.totals = log, 0, defaultdict(int)
    def catch_up(self):
        for off, rec in self.log.read_from(self.offset):
            if rec["status"] == "captured":
                self.totals[rec["merchant_id"]] += rec["amount"]
            self.offset = off + 1
    def total_for(self, mid): self.catch_up(); return self.totals[mid]

# --- View 3: a search index by email, just like Elasticsearch ---
class EmailIndexView:
    def __init__(self, log): self.log, self.offset, self.by_email = log, 0, defaultdict(list)
    def catch_up(self):
        for off, rec in self.log.read_from(self.offset):
            self.by_email[rec["email"]].append(rec["payment_id"])
            self.offset = off + 1
    def search(self, email): self.catch_up(); return self.by_email[email]

# --- demo ---
log = Log()
kv, daily, by_email = KVView(log), DailyTotalView(log), EmailIndexView(log)
log.append({"payment_id": "p1", "merchant_id": "m42", "amount": 50000, "status": "captured", "email": "riya@example.in"})
log.append({"payment_id": "p2", "merchant_id": "m42", "amount": 25000, "status": "captured", "email": "rahul@example.in"})
log.append({"payment_id": "p3", "merchant_id": "m99", "amount": 10000, "status": "failed", "email": "asha@example.in"})
print("kv lookup p2:", kv.get("p2")["amount"])             # 25000
print("daily total m42:", daily.total_for("m42"))          # 75000
print("search riya:", by_email.search("riya@example.in"))  # ['p1']
Output, on a 2024 MacBook:
$ python inside_out.py
kv lookup p2: 25000
daily total m42: 75000
search riya: ['p1']
Three "databases" — a row store, a dashboard aggregate, a search index — sitting in 40 lines. Each one has its own data structure (a dict, a counter, an inverted index). Each one is rebuildable from the log alone — delete the dict, recreate the view, call catch_up(), you have it back.
Log.append. The source of truth. Every fact in the system goes here, in order, exactly once. No view ever calls Log.replace or Log.delete. There is no such method.
KVView.catch_up. The contract every view obeys. Read every record from your last offset to the current end, fold each one into your private structure, advance the offset. This is exactly what a Kafka consumer does. The offset is durable; the structure is rebuildable.
DailyTotalView.catch_up. Same contract, different fold. The fold is += instead of =. Same input, different aggregate, no extra write to the log. Why this is the whole point: the read-side schema is decoupled from the write-side schema. Postgres knows nothing about "merchant daily totals." The view computes that fact from the log. Add a new view tomorrow — count of payments per email domain — and it costs 10 lines plus a replay.
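To make that claim literal, here is the hypothetical email-domain view, written against the same Log contract (the Log class is redefined here in miniature so the sketch runs standalone):

```python
from collections import defaultdict

class Log:                                   # same shape as the chapter's Log
    def __init__(self): self.records = []
    def append(self, rec): self.records.append(rec)
    def read_from(self, offset):
        return list(enumerate(self.records[offset:], start=offset))

class DomainCountView:
    """Count of payments per email domain — a brand-new read-side schema."""
    def __init__(self, log): self.log, self.offset, self.counts = log, 0, defaultdict(int)
    def catch_up(self):
        for off, rec in self.log.read_from(self.offset):
            self.counts[rec["email"].split("@")[1]] += 1
            self.offset = off + 1
    def count(self, domain): self.catch_up(); return self.counts[domain]

log = Log()
log.append({"payment_id": "p1", "email": "riya@example.in"})
log.append({"payment_id": "p2", "email": "rahul@example.in"})
print(DomainCountView(log).count("example.in"))   # 2 — built entirely by replay
```

Note that the view starts at offset zero and builds its whole state by replay; nothing upstream had to change to support it.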
Now imagine each View is a separate process running on a separate server, the Log is a Kafka topic, and read_from is a Kafka consumer. You have the Razorpay architecture from the previous section. Three orders of magnitude bigger; same shape.
Why this is the right architecture for the 2020s
Three forces are pushing every serious data team toward this picture, whether they have read Kreps or not.
1. The number of "databases" per company is going up, not down. A 2010 startup had Postgres. A 2025 startup has Postgres, Redis, Elasticsearch, Snowflake, a vector DB, a feature store, a graph DB for fraud, and a time-series DB for metrics. Each is best-in-class for its access pattern. The classical "single database" answer broke when access patterns multiplied. The log-plus-views answer scales — you add a database by adding a consumer.
2. Dual writes are the most-debugged-least-fixable bug in distributed systems. When the application code does db.insert(payment); cache.set(payment); search.index(payment), any one of those three writes can fail while the others succeed. You have now lost transactional consistency across systems and there is no Saturday-morning fix. The CDC + log architecture removes the bug at the source: the application writes once, to Postgres, and Postgres's WAL — the part that is already atomic with the commit — fans out to everything else. Why this is structurally different: dual writes split the durability guarantee across N systems with no shared protocol. CDC inherits Postgres's durability for free, because the log entry that fanned out to Kafka was the same entry that committed the row.
3. ML and analytics need event histories, not snapshots. A fraud model trained on yesterday's snapshot of payments is worse than one trained on every event in the last 90 days, because the snapshot has thrown away the sequence. The log preserves history natively — every event is in there, in order, with timestamps. Once your architecture is log-first, ML and analytics get the data they actually want without anyone building a "history table" in Postgres.
Common confusions
- "This means Kafka replaces Postgres." No. Postgres remains the OLTP front door — your application still does INSERTs and SELECTs and transactions against it. The log behind Postgres's WAL is what fans out. Kafka does not understand foreign keys, does not enforce uniqueness, does not run your ON CONFLICT DO UPDATE. It carries facts that Postgres has already validated.
- "The materialized view will eventually drift from the source." Only if you do dual writes alongside the log. The view is defined as the fold of the log; if you rebuild it from offset zero you get the same answer every time. Drift in real systems comes from someone writing to the view directly (bypassing the log) or from non-deterministic folds (random tie-breaks). Both are bugs in the architecture, not in the idea.
- "My application can't tolerate stale views." Most can — a sub-second lag on a fraud feature or a search index is fine. The ones that can't (read-your-writes after a write) need a separate mechanism: read from the source of truth (Postgres) for that one query path, or use a synchronous read-your-writes wait on the consumer offset. The architecture does not forbid synchronous reads; it just makes them an exception, not the default.
- "This is just CQRS / event sourcing with extra branding." It overlaps heavily, and Kreps's essay credits both. The difference is operational: CQRS is usually framed as a pattern inside one application; the inside-out database picture is the same idea applied across the entire company's systems, with a real distributed log as the substrate. Same primitive, different scale.
- "If the log is the source of truth, deletes are impossible." Soft deletes (a tombstone event) replace hard deletes; the views interpret a tombstone as "remove this key." For GDPR-style hard erasure, you compact the log to drop the offending records — Kafka calls this log compaction, Postgres calls it VACUUM. The log being the source does not mean it can never forget; it means forgetting is itself a fact you log.
- "You still need transactions across views." You don't, if the views are eventually consistent — and most application requirements are. For the rare case where you need atomic visibility across two views (say, reserve seat and charge card together), you write to the log inside a single application transaction and let the views catch up; the application reads from the log offset where its own write was committed before returning success. This is the "exactly-once" pattern from chapter 176, applied at the architecture level.
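The tombstone point is mechanical enough to sketch: a delete is just another fact in the log, and the view's fold interprets it. (The tombstone field name here is my convention; Kafka's compacted topics use a null value for the key as the tombstone.)

```python
def fold_with_tombstones(changelog):
    """Build a key-value view; a tombstone event removes the key."""
    view = {}
    for event in changelog:
        if event.get("tombstone"):        # forgetting is itself a logged fact
            view.pop(event["key"], None)
        else:
            view[event["key"]] = event["value"]
    return view

log = [
    {"key": "p1", "value": {"amount": 500}},
    {"key": "p2", "value": {"amount": 900}},
    {"key": "p1", "tombstone": True},     # GDPR-style erasure request
]
print(fold_with_tombstones(log))   # {'p2': {'amount': 900}}
```

Replaying from offset zero after compaction drops the erased record entirely, because the compacted log no longer contains it.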
Going deeper
Kreps's three-line argument
The 2013 essay compresses to three claims. (a) A database's storage engine is itself a log + indexes. (b) Replication is just exposing the log to a second machine. (c) Once you expose the log to N machines, you have a multi-system architecture, and the "database" boundary stops being meaningful. The boundary was never around the data — it was around the log. Pull the log out and the boundary moves with it. Read the original at engineering.linkedin.com/distributed-systems/log; it is shorter than you think.
The "turning the database inside out" talk
Martin Kleppmann's 2014 Strange Loop talk takes the essay's argument and applies it to one specific question: what if a SQL query were a long-lived subscription instead of a momentary scan? That question is what Materialize and ksqlDB later shipped. The talk is the bridge from "the log is the truth" to "every query is a streaming computation." It is the second half of the thesis Build 23 has been building — the first half is that the log exists; the second half is that queries become views over it.
Where it breaks down: secondary-index updates with strong consistency
The inside-out picture is eventually consistent across views. If a Razorpay merchant captures a payment and the dashboard cache hasn't caught up yet, they see stale numbers for a hundred milliseconds. For most use cases, fine. But when an Elasticsearch index is the only way to look up a record that was just inserted, and the application immediately reads it back, you get the "I just wrote it, why isn't it there" bug. The fix is to either route that read to Postgres directly (the source of truth) or to wait on the consumer's offset before returning success. Both are real; both are footnotes in production systems. Strong consistency across views is possible but expensive.
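The "wait on the consumer's offset" fix from the paragraph above, as a sketch against this chapter's toy classes (wait_for and poll are my names; real systems do the equivalent by comparing a consumer's position with the offset the producer got back from its write):

```python
class Log:
    def __init__(self): self.records = []
    def append(self, rec):
        self.records.append(rec)
        return len(self.records) - 1           # producers learn their offset

class View:
    def __init__(self, log): self.log, self.offset, self.kv = log, 0, {}
    def poll(self):                            # fold one record, if available
        if self.offset < len(self.log.records):
            rec = self.log.records[self.offset]
            self.kv[rec["id"]] = rec
            self.offset += 1

def wait_for(view, written_offset):
    """Read-your-writes: block until the view has folded our write."""
    while view.offset <= written_offset:
        view.poll()                            # real systems sleep/poll here

log = Log(); view = View(log)
off = log.append({"id": "p1", "amount": 500})
wait_for(view, off)                            # returns once the view passes us
print(view.kv["p1"]["amount"])   # 500 — guaranteed visible after the wait
```

The cost is that the write path now blocks on a consumer, which is exactly why the architecture treats this as an exception rather than the default.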
How this connects back to Build 1
You built an append-only log as the simplest database back in chapter 2. Every storage engine has one. What Build 23 has done is pull that log out of the storage engine and make it the protocol between systems. The same primitive — durable, ordered, append-only — that made one database crash-safe now makes a whole company's data architecture coherent. The fractal repeats: log inside a process, log between processes, log between data centres. Same shape at every scale.
Adoption in Indian engineering teams
Razorpay, Flipkart, Swiggy, Meesho, and Zerodha all run versions of this architecture in 2025. Razorpay publishes payment events to Kafka and fans out to seven downstream systems. Flipkart's order graph is a Kafka topic feeding both the fulfilment engine and the recommendation pipeline. Zerodha's tick data is a log; the matching engine, the analytics dashboards, and the regulatory archive are three views. None of these companies started here — they all migrated, painfully, from a tangle of dual writes and nightly ETLs. The pattern of migration is its own playbook (Debezium first, then a couple of consumers, then everyone else).
What this chapter is not claiming
Three things to be honest about. (a) Not every application benefits — a single-server CRUD app on Postgres is fine without any of this; the architecture pays off when N ≥ 3 systems share data. (b) The log is not magic — operating Kafka well is hard, schema evolution is hard, and exactly-once semantics is hard (chapter 176). (c) Strong consistency across views remains expensive; the architecture is a great fit for eventual-consistency workloads and a careful fit for synchronous-read ones.
The "kappa architecture" naming, and why we did not lead with it
Around 2014, Kreps named the architecture this chapter describes the kappa architecture — one log, many views, no separate batch layer — in opposition to the older lambda architecture (a batch path plus a streaming path producing two answers that get merged). The names matter less than the idea: kappa is what falls out naturally when you take Kreps's thesis seriously, because the batch path becomes "replay the log from the beginning" and the streaming path becomes "consume from the end." Same code, different starting offset. Once you see this, the lambda architecture's two-codebase split looks like a workaround for not having a long-enough-retained log. Most modern teams that say "we're streaming-first" mean "we're kappa, with one codebase that runs in batch mode by reading from offset zero."
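"Same code, different starting offset" is literal. A sketch: one fold function serves as the batch job (replay from offset zero) and the streaming job (consume only records after the current end):

```python
def settle(records):
    """One codebase: sum captured amounts per merchant."""
    totals = {}
    for rec in records:
        if rec["status"] == "captured":
            totals[rec["merchant_id"]] = totals.get(rec["merchant_id"], 0) + rec["amount"]
    return totals

log = [
    {"merchant_id": "m42", "amount": 500, "status": "captured"},
    {"merchant_id": "m42", "amount": 250, "status": "failed"},
    {"merchant_id": "m99", "amount": 100, "status": "captured"},
]

batch_answer = settle(log)            # "batch mode": replay from offset zero
tail = len(log)                       # "streaming mode": start at the end...
log.append({"merchant_id": "m42", "amount": 300, "status": "captured"})
streaming_delta = settle(log[tail:])  # ...and fold only the new records
print(batch_answer)      # {'m42': 500, 'm99': 100}
print(streaming_delta)   # {'m42': 300}
```

The lambda architecture would maintain these as two codebases in two frameworks; kappa is the observation that they are one function applied at two offsets.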
Schema evolution is the operational sharp edge
The architectural argument is clean; the operational one is messy. Once a hundred consumers depend on a topic, changing the schema of that topic — adding a column to payments, renaming a field — is a coordinated migration, not an ALTER TABLE. Confluent's Schema Registry, Avro/Protobuf with backward-compatibility rules, and "consumer-driven contracts" exist to make this tractable. The lesson from production teams: invest in schema discipline before you have ten consumers, not after, because retrofitting schema rigour onto a topic that twenty teams already read is a six-month project. This is the day-to-day cost of the architecture's benefits — and the chapter on it is its own future article.
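A sketch of what backward compatibility means at the consumer: old events in the log lack the new field, so the fold must supply a default rather than crash on replay. (The currency field and its default are hypothetical, for illustration.)

```python
def consume(event):
    """A consumer that survives the producer adding a 'currency' column."""
    amount = event["amount"]
    currency = event.get("currency", "INR")   # default for pre-migration events
    return (amount, currency)

old_event = {"payment_id": "p1", "amount": 500}                     # v1 schema
new_event = {"payment_id": "p2", "amount": 900, "currency": "USD"}  # v2 schema
print(consume(old_event))   # (500, 'INR')
print(consume(new_event))   # (900, 'USD')
```

This is the same rule Avro's backward-compatibility check enforces mechanically: new fields must carry defaults, because the log's history never gets rewritten.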
Where this leads next
Build 23 ends here. The thesis — log as source of truth, databases as views — is the lens through which Build 24 (putting it together) and Build 25+ (case studies of real systems) become readable. When you read about Confluent's "stream-data platform," they mean this. When you read about Materialize's "operational data warehouse," they mean this. When LinkedIn talks about "Pinot + Samza + Kafka," they mean this.
- The append-only log: simplest store — chapter 2: the primitive Build 23 has been scaling up.
- Kafka as a distributed log — chapter 174: the substrate.
- The stream/table duality — chapter 175: the equivalence.
- Materialize and Differential Dataflow — chapter 179: the practical engine.
- Polyglot persistence: picking the right DB per workload — chapter 182 (Build 24): how to pick which view to maintain in which engine.
References
- Jay Kreps, The Log: What every software engineer should know about real-time data's unifying abstraction (LinkedIn Engineering, 2013) — the essay that started this entire framing. Required reading once in your career. engineering.linkedin.com.
- Martin Kleppmann, Turning the Database Inside Out with Apache Samza (Strange Loop, 2014) — the talk that moved the idea from "log is a primitive" to "every query is a view." Watch on YouTube. martin.kleppmann.com/2015/03/04/turning-the-database-inside-out.html (Kleppmann's writeup).
- Martin Kleppmann, Designing Data-Intensive Applications (O'Reilly, 2017), Ch. 11 Stream Processing and Ch. 12 The Future of Data Systems — the book-length version of this argument. dataintensive.net.
- Pat Helland, Immutability Changes Everything (CIDR, 2015) — the parallel argument that immutable, append-only data is a structural advantage, not just a storage trick. cidrdb.org.
- Confluent, What is a Streaming Data Platform? — the commercial-side framing of Kreps's thesis, by the company he co-founded. confluent.io/learn/streaming-data-pipelines.
- Frank McSherry et al., Differential Dataflow (CIDR 2013 and subsequent technical reports) — the math that makes "every query is a long-lived view" actually efficient. github.com/timelydataflow/differential-dataflow.
- The Append-Only Log: Simplest Store — chapter 2 of this track, the primitive that the entire Build 23 thesis scales up.