Polyglot persistence: picking the right DB per workload

Open the architecture page of any company you have heard of — Razorpay, Swiggy, Zerodha, Meesho — and count the databases on it. Five is normal. Seven is common. One is a lie. This chapter is about why no single engine survives a real workload, and how to read the trade-off matrix you have been quietly building since Build 1.

A real system runs three to seven different databases at once because workloads have wildly different shapes — and every storage engine you have built in this track makes the opposite trade-off from at least one other engine. You do not pick a database. You pick a portfolio. The skill is reading a workload's access pattern, latency budget, and consistency need, and matching it to the engine whose trade-offs that pattern can afford.

The one-database fantasy

Class assignments have one database. Production has many. The reason is not laziness or hype — it is geometry. Every storage engine in this track sits on a trade-off triangle whose corners cannot all be reached at once.

Recall the walls. B-trees give fast point reads and reasonable range scans, at the cost of write amplification on update. LSM-trees flip that — fast writes, painful read amplification, and tail-latency spikes during compaction. Column stores like Parquet or ClickHouse make a billion-row aggregate fly, but want nothing to do with a single-row update. Inverted indexes answer "find me every doc containing the word refund" in milliseconds and answer "give me row 4892's email" by re-scanning. Hash-based caches serve a hundred thousand reads per second per core, then forget everything when the box reboots. Vector indexes find the ten closest embeddings to a query in 5 ms — and cannot tell you which row was updated yesterday.

Every engine wins at one workload by losing at another. That is the whole game.

[Figure: The trade-off space of database engines. Horizontal axis runs from read-fast (left) to write-fast (right); vertical axis from point-friendly (bottom) to scan-friendly (top). Six engines plotted: Redis cache (RAM hash, no durability) far left; B-tree / Postgres (balanced, journaled) near the point-friendly middle; LSM / Cassandra (write-fast, read-amplified) lower right; column store (ClickHouse / Parquet) upper left; inverted index (Elasticsearch) upper middle; vector index (HNSW / IVF) upper right.]
Six engines, four corners, no winner. Every engine sits where its data structure thrives. Picking a database is picking which corner your workload lives in.

Why a single engine cannot win every corner: the on-disk layout that makes a column store fast (sorted, compressed, per-column files of millions of values) is the same layout that makes a single-row update brutal — you have to rewrite the whole column chunk. The inverted-list layout that makes Elasticsearch fast at "every doc with refund" is the same layout that makes "give me row 4892" expensive. There is no neutral data structure. The shape of the bytes on disk is the trade-off.
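The column-store half of that claim is easy to demonstrate with a toy chunk. The sketch below is illustrative only — not any real engine's on-disk format — and stores a 100,000-value column as a single zlib-compressed blob. Changing one value forces a decompress-patch-recompress of the entire chunk, where a row store would touch a single page.

```python
import zlib
from array import array

# A toy column "chunk": 100k int64 values compressed as one blob, the way a
# column store keeps per-column files. (Illustrative sketch, not a real format.)
column = array("q", range(100_000))
chunk = zlib.compress(column.tobytes())

def update_one_value(chunk: bytes, index: int, new_value: int) -> bytes:
    """Changing one value means rewriting the whole chunk: decompress
    everything, patch 8 bytes, recompress everything."""
    col = array("q", zlib.decompress(chunk))
    col[index] = new_value
    return zlib.compress(col.tobytes())

new_chunk = update_one_value(chunk, 4892, -1)
patched = array("q", zlib.decompress(new_chunk))
print(patched[4892])  # -1: the update landed, at the cost of rewriting ~800 KB
```

The same asymmetry runs the other way: a B-tree page can absorb that 8-byte update in place, but summing the column means visiting every row's full record instead of one tightly packed file.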

Reading a workload before you pick

Before you pick a database, write down four numbers and one shape. Without these, the choice is religion.

  1. Read-to-write ratio. Is this 99% read and 1% write (a product catalogue), 50/50 (a chat app), or 10% read and 90% write (a metrics pipeline)?
  2. Latency budget. What is the p99 your application can tolerate? 1 ms (cache layer), 50 ms (user-facing API), 500 ms (back-office screen), 30 s (analyst dashboard)?
  3. Working-set size. How much hot data does this workload touch in the last hour? 1 GB (fits in RAM), 100 GB (fits in NVMe), 50 TB (one column-store cluster), 10 PB (only object storage scales)?
  4. Consistency need. Does the read have to see the last write? (Bank balance: yes. Trending products: no.) This collapses to the CAP/PACELC decision.
  5. Access shape. Point lookup by primary key? Range scan? Full-text? K-nearest-neighbour vector? Aggregate over millions of rows? Graph traversal of friend-of-friend?

The shape is the most important. If you cannot describe the access shape in one sentence, you do not yet understand the workload, and any database you pick is a coin flip.

Five workloads inside one Razorpay request

A merchant opens the Razorpay dashboard at 11am on the 1st of the month — settlement day. A single page load fans out into five very different queries:

| # | Query the page asks | Read shape | Engine that fits |
|---|---|---|---|
| 1 | "Show me my total volume yesterday" | aggregate of ~50k rows | column store / OLAP |
| 2 | "List my last 20 captured payments" | range scan by merchant_id, sorted by created_at | B-tree on Postgres |
| 3 | "Search for customer 'rahul@gmail.com'" | inverted index on email field | Elasticsearch |
| 4 | "What is my available balance right now?" | single-row lookup, must reflect the most recent write | Postgres OLTP |
| 5 | "Show me three suggested actions you think I'll click" | nearest-neighbour over a feature vector | vector index (HNSW) |

If you tried to answer all five from one Postgres instance: query 1 takes 8 seconds because the row store has to read every payment column to sum one of them; query 3 either does a LIKE '%rahul%' table scan or you bolt on pg_trgm and pray; query 5 needs pgvector, which works but reaches its limit before you finish 1 lakh merchants. By the third such page-load per second, the database is on fire.

The fix is not "buy a bigger Postgres." The fix is to send each query to the engine its shape was designed for, and to keep all of them in sync from the same source of truth.
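What "send each query to the engine its shape was designed for" looks like in application code — a sketch with every engine client stubbed out as an async function. All names, payloads, and numbers here are invented for illustration:

```python
import asyncio

# Hypothetical per-engine clients; each stub stands in for a real driver call.
async def clickhouse_volume(merchant_id):         # shape: big aggregate
    return {"volume_yday": 4_21_00_000}

async def postgres_recent_payments(merchant_id):  # shape: indexed range scan
    return [{"id": f"pay_{i}"} for i in range(20)]

async def es_search_customer(merchant_id, q):     # shape: inverted-index lookup
    return [{"email": q}]

async def postgres_balance(merchant_id):          # shape: strict point read
    return {"available": 1_50_000}

async def vector_suggestions(merchant_id):        # shape: k-NN over embeddings
    return ["enable_upi", "export_report", "add_webhook"]

async def dashboard(merchant_id):
    # One page load fans out to five engines concurrently: the page is as
    # slow as the slowest engine, not the sum of all five.
    return await asyncio.gather(
        clickhouse_volume(merchant_id),
        postgres_recent_payments(merchant_id),
        es_search_customer(merchant_id, "rahul@gmail.com"),
        postgres_balance(merchant_id),
        vector_suggestions(merchant_id),
    )

volume, payments, hits, balance, actions = asyncio.run(dashboard("m_123"))
print(len(payments), balance["available"])  # 20 150000
```

The fan-out is the easy half; the hard half — keeping five stores consistent with one source of truth — is what the rest of this chapter is about.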

The portfolio, not the pick

Once you accept that one engine cannot serve every workload, the architecture you end up with looks remarkably consistent across companies. There is an OLTP engine that owns the truth — usually Postgres or MySQL, sometimes a NewSQL like CockroachDB. Around it sits a halo of derived stores, each indexing the truth differently for one specific access shape.

This is the inside-out architecture from chapter 181, but viewed from the operations side. Build 23 told you how the log fans out to many systems. Build 24 starts with which systems, and why these and not others.

[Figure: A typical Razorpay-shaped polyglot stack. Centre: Postgres (OLTP), source of truth for payments, users, balances. Its WAL feeds a Kafka log via Debezium (payments-cdc topic, partitioned by merchant). Five derived stores subscribe to the log: Redis (dashboard cache), Elasticsearch (free-text search), ClickHouse (live analytics), S3 + Iceberg (lakehouse, seven-year archive), and pgvector / HNSW (recommendations). One log of facts → five views, each indexing it differently.]
The shape that almost every fintech, marketplace, and SaaS company in India converges on by Series B. One source of truth, one log to fan it out, and one engine per access shape.

The discipline is not to add stores randomly — it is to add a store only when an existing store cannot meet the workload's latency or shape. Each store earns its place by answering one question better than any incumbent. Add Elasticsearch only when free-text search becomes a real workload. Add ClickHouse only when analytics queries hit Postgres hard enough to threaten OLTP latency. Add the vector index only when "find similar items" is a product feature.

Worked example: Riya picks the stack for a small wallet app

Riya runs a four-engineer team in Pune building a wallet-style app — call it walletkart. They have 50,000 active users, growing at 20% a month, and a runway pressure that says do not buy a database you do not have to. She runs through the workloads on paper, in order of priority:

# walletkart_workloads.py — workload catalogue, in plain Python dicts
workloads = [
    {
        "name": "wallet balance lookup",
        "ops_per_sec": 800,
        "p99_latency_ms": 50,
        "consistency": "strict — must reflect last write",
        "shape": "point lookup by user_id",
        "fits": "OLTP B-tree (Postgres)",
    },
    {
        "name": "transaction history",
        "ops_per_sec": 200,
        "p99_latency_ms": 100,
        "consistency": "read-your-writes",
        "shape": "range scan: last 30 transactions for user_id, by created_at",
        "fits": "Postgres with index on (user_id, created_at)",
    },
    {
        "name": "merchant search",
        "ops_per_sec": 60,
        "p99_latency_ms": 200,
        "consistency": "stale by 30s OK",
        "shape": "free-text on merchant name + category",
        "fits": "Elasticsearch (or Postgres tsvector for now)",
    },
    {
        "name": "homepage carousel: 'recent merchants like yours'",
        "ops_per_sec": 800,
        "p99_latency_ms": 30,
        "consistency": "stale by hours OK",
        "shape": "point lookup of precomputed list, by user_id",
        "fits": "Redis (cache, recomputed nightly)",
    },
    {
        "name": "monthly settlement aggregates for ops team",
        "ops_per_sec": 0.01,  # 1 query every 100s during business hours
        "p99_latency_ms": 5000,
        "consistency": "yesterday's data is fine",
        "shape": "GROUP BY merchant SUM amount over 30M rows",
        "fits": "Postgres for now, ClickHouse if it gets slow",
    },
]

for w in workloads:
    # Truncate both name and shape so long entries keep the columns aligned.
    print(f"{w['name'][:40]:<40} {w['ops_per_sec']:>6} ops/s  p99 {w['p99_latency_ms']}ms  "
          f"shape={w['shape'][:40]}  →  {w['fits']}")

Output:

wallet balance lookup                      800 ops/s  p99 50ms  shape=point lookup by user_id  →  OLTP B-tree (Postgres)
transaction history                        200 ops/s  p99 100ms  shape=range scan: last 30 transactions for use  →  Postgres with index on (user_id, created_at)
merchant search                                60 ops/s  p99 200ms  shape=free-text on merchant name + category  →  Elasticsearch (or Postgres tsvector for now)
homepage carousel: 'recent merchants lik    800 ops/s  p99 30ms  shape=point lookup of precomputed list, by use  →  Redis (cache, recomputed nightly)
monthly settlement aggregates for ops te   0.01 ops/s  p99 5000ms  shape=GROUP BY merchant SUM amount over 30M ro  →  Postgres for now, ClickHouse if it gets slow

Riya's day-1 stack is Postgres + Redis, with three workloads sitting on Postgres and one on Redis. The merchant search is being answered by a Postgres tsvector column — slower than Elasticsearch but inside the latency budget, and one fewer system to operate. Settlement aggregates run against a read replica because the volume is currently fine.

Why she did not start with five databases: every additional engine costs an on-call rotation, a backup pipeline, a monitoring dashboard, and a CDC connector. A four-engineer team buying its sixth database is a team that has stopped shipping product. The right rule is "the smallest portfolio that answers the workloads inside their latency budgets, plus one engine of headroom for the workload most likely to outgrow its current home."

Six months later, search traffic crosses 500 ops/s and tsvector p99 climbs to 500 ms. Now Elasticsearch earns its place — it is solving a real problem that Postgres has stopped solving. Twelve months later, settlement queries take a minute and start blocking other replica reads. Now ClickHouse earns its place. The portfolio grows by demand, not by aspiration.

Why the order of additions matters: the first additional engine is almost always a cache (Redis), because cache misses on hot read paths are the cheapest workload to peel off. The second is almost always either a search engine or an analytics column store, depending on which workload outgrows Postgres first. The third is whatever the product roadmap demands next — vector index for ML features, graph store for fraud rings, time-series for metrics. Adding them in this order means each addition replaces a clearly visible bottleneck in the system you have, rather than answering a problem you do not yet have.

The picking matrix, made concrete

Builds 1 through 23 have given you the full inventory. This is the one-page lookup table that compresses what each build taught into a routing decision. Treat the rows as access shapes, not as products — the products listed are the canonical engines for that shape, and you should be willing to swap them for an equivalent.

| Access shape | Latency budget | Working set | Canonical engine | Where in this track |
|---|---|---|---|---|
| Point lookup, strict consistency | 5–50 ms | up to 10 TB / shard | Postgres / MySQL B-tree | Build 4 |
| Write-heavy point + range | 1–20 ms | TB to PB, shardable | Cassandra / RocksDB LSM | Build 3 |
| Sub-millisecond hot read | 0.2–2 ms | fits in RAM | Redis / Memcached | Build 21 |
| Free-text search | 10–200 ms | TB | Elasticsearch / OpenSearch | Build 17 |
| K-nearest-neighbour vectors | 5–50 ms | GB to TB | pgvector / Pinecone / Weaviate | Build 18 |
| Graph traversal | 10–500 ms | GB to TB | Neo4j / Memgraph | Build 19 |
| Time-series ingest + rollup | 10–100 ms | TB, append-mostly | Timescale / InfluxDB / VictoriaMetrics | Build 20 |
| Aggregate over millions | 1–10 s | TB to PB | ClickHouse / DuckDB | Build 16 |
| Lakehouse archive + ad-hoc | seconds to minutes | PB | S3 + Iceberg/Delta + Trino | Build 16 |
| Distributed strong-consistency OLTP | 10–100 ms | TB to PB | CockroachDB / Spanner / TiDB | Build 12 |

Two warnings about reading this table. First, the engines have meaningful overlap — Postgres can do point lookup, free-text, vector, and time-series, just worse than the specialist at high scale. The decision rule is "use the specialist when the cheaper general-purpose engine is no longer fast enough at this workload's scale," not "use the specialist on day one because the table says so." Second, latency budgets are p99 targets in steady state; tail latency under load is much worse and is the number you should actually engineer to.
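As a first pass, the routing column of the table can be encoded directly. This is a sketch: the shape keys and the 100 GB generalist threshold are chosen purely for illustration, and the overlap caveats above still apply.

```python
# First-pass router: access shape → canonical engine, mirroring the table.
# Purely illustrative — real decisions also weigh ops cost and team skills.
CANONICAL_ENGINE = {
    "point_strict":     "Postgres / MySQL B-tree",
    "write_heavy":      "Cassandra / RocksDB LSM",
    "hot_read":         "Redis / Memcached",
    "free_text":        "Elasticsearch / OpenSearch",
    "knn_vector":       "pgvector / Pinecone / Weaviate",
    "graph":            "Neo4j / Memgraph",
    "time_series":      "Timescale / InfluxDB / VictoriaMetrics",
    "big_aggregate":    "ClickHouse / DuckDB",
    "lakehouse":        "S3 + Iceberg/Delta + Trino",
    "distributed_oltp": "CockroachDB / Spanner / TiDB",
}

def route(shape: str, working_set_gb: float) -> str:
    # The generalist rule: below a modest working set, Postgres can usually
    # still fake these shapes well enough (threshold is illustrative).
    fakeable = {"free_text", "knn_vector", "time_series", "big_aggregate"}
    if working_set_gb < 100 and shape in fakeable:
        return "Postgres (specialist not yet earned)"
    return CANONICAL_ENGINE[shape]

print(route("free_text", working_set_gb=10))    # Postgres (specialist not yet earned)
print(route("free_text", working_set_gb=2000))  # Elasticsearch / OpenSearch
```

Note what the threshold encodes: the specialist engine is a function of scale, not of shape alone, which is exactly the first warning above.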

What forces a new engine into the stack

There are exactly four signals that justify adding a new database. If you cannot point to at least one of them for a proposed addition, the addition is premature.

  1. An access shape the existing engines do not natively support. Free-text search, k-nearest-neighbour vectors, graph traversals, time-series rollups. You can fake some of these in Postgres for a while; eventually the fake hits a wall and a specialised engine pays its keep.

  2. A latency budget the existing engines cannot meet. A 1 ms p99 read forces a cache. A 5-second analytics query becomes unacceptable when the dashboard is on the customer-facing path. The trade-off matrix says "this engine cannot meet that budget at this scale" — and that is when you reach for the next engine.

  3. A working set that has outgrown the existing engine's capacity. 50 TB of historical events does not belong in Postgres no matter how many indexes you add — it belongs in object storage with a lakehouse table format. Trying to keep it in the OLTP engine wastes IOPS and crowds out hot data.

  4. A workload whose consistency need is genuinely weaker than the OLTP engine, and where the throughput cost of synchronous OLTP is not affordable. This is the cache case (eventual is fine) and the analytics case (yesterday is fine). The point of the polyglot stack is that you spend strong consistency only where you need it.

If none of these four hold, you do not need a new engine. You need a better index on the one you have.
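The four signals make a natural design-review checklist. A sketch, with all field names invented:

```python
from dataclasses import dataclass

@dataclass
class EngineProposal:
    # The four signals from above; field names are illustrative.
    unsupported_shape: bool      # 1. shape no existing engine natively serves
    budget_unmeetable: bool      # 2. latency budget current engines can't hit
    outgrown_capacity: bool      # 3. working set too big for the incumbent
    weaker_consistency_ok: bool  # 4. eventual is fine AND sync OLTP too costly

def justified(p: EngineProposal) -> bool:
    """A new engine needs at least one of the four signals; zero signals
    means 'add a better index to the engine you have' instead."""
    return any([p.unsupported_shape, p.budget_unmeetable,
                p.outgrown_capacity, p.weaker_consistency_ok])

print(justified(EngineProposal(False, False, False, False)))  # False — premature
print(justified(EngineProposal(False, True, False, False)))   # True — the cache case
```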

Common confusions

Going deeper

The Fowler / Sadalage frame and where it came from

The phrase "polyglot persistence" was coined by Pramod Sadalage and Martin Fowler around 2011, in the run-up to the NoSQL Distilled book. Their case study was a fictional retail application broken into eight services, each picking a database from {RDBMS, key-value, document, column-family, graph} based on access shape. At the time it was a counter to two extremes — the classical "everything in Oracle" monolith, and the over-corrective "MongoDB for everything" early NoSQL fashion. The lasting contribution is the frame: stop asking "which database?" and start asking "which access shape, and what does it want?"

The frame survived because the trade-offs survived. Fifteen years later the engines are mostly different — RocksDB instead of Voldemort, ClickHouse instead of HBase, Pinecone instead of nothing — but the corners of the trade-off space have not moved. A scan-friendly engine is still bad at point updates; a write-fast engine still pays at read time; a cache is still volatile. The names change; the geometry does not.

CDC as the connective tissue

Polyglot persistence only works if the derived stores stay in sync with the OLTP engine. The naive approach — application code writes to Postgres, then writes to Elasticsearch, then writes to Redis — is the dual-write problem, and it is a bug factory. If the application crashes between the Postgres commit and the Elasticsearch write, you have permanent drift and no clean way to detect it.

The clean solution is change data capture. The application writes only to Postgres. Postgres's WAL emits a change record. Debezium (or equivalent) reads the WAL and publishes to Kafka. The other engines subscribe. Every derived store is now a function of the same log, and "which engines are in sync" is no longer a per-write question — it is "where is each subscriber's offset?" Drift becomes lag, lag becomes a number you can graph, and you can reset a derived store by deleting its data and replaying the topic from the beginning.

The CDC pattern is what makes polyglot survive at scale. Without it, polyglot is dual-write hell and most teams revert to "everything in one DB" within a year.
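A toy version of the CDC loop makes the "drift becomes lag" point concrete. In the sketch below a Python list stands in for the Kafka topic and plain integers for consumer offsets:

```python
# Toy CDC: one append-only log of change records, each derived store a
# subscriber tracking its own offset. Stands in for Kafka + Debezium.
log = []  # the payments-cdc "topic"

def emit(change):            # what the WAL/Debezium side does
    log.append(change)

class DerivedStore:
    def __init__(self):
        self.state, self.offset = {}, 0

    def catch_up(self):      # consume every record past our offset
        for change in log[self.offset:]:
            self.state[change["id"]] = change["balance"]
        self.offset = len(log)

    @property
    def lag(self):           # drift is now just a graphable number
        return len(log) - self.offset

cache, search = DerivedStore(), DerivedStore()
emit({"id": "u1", "balance": 500})
emit({"id": "u1", "balance": 450})
cache.catch_up()
print(cache.state["u1"], search.lag)   # 450 2 — cache is current, search lags

# Resetting a derived store = wipe it and replay the topic from offset 0.
search.state.clear(); search.offset = 0
search.catch_up()
print(search.state["u1"], search.lag)  # 450 0
```

Every guarantee in the real system maps onto a line here: sync means offset == len(log), drift means lag > 0, and rebuild means replay from zero.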

The reverse anti-pattern: too many engines

The Fowler frame is older than its current critique. Around 2017, several large engineering blogs (Airbnb, Uber, Stitch Fix) wrote post-mortems on polyglot stacks that had grown to fifteen, twenty, or thirty databases — every team picking its own — and the operational cost crushed the company. Each engine needed its own runbook, on-call rotation, backup pipeline, schema-evolution story, and security review. The teams that survived consolidated back down to a portfolio of five to seven engines, with strong defaults that every team had to justify deviating from.

The healthy rule is "right tool for the job, but the smallest set of tools that does all the jobs." Five Postgres clusters with different schemas is fine. Five different kinds of database, each new and unfamiliar, is fragile. Add a new engine when an existing one fails a workload, not when an engineer wants to learn it.

The CAP/PACELC layer underneath

The polyglot decision has a consistency dimension that the workload table does not always make explicit. Different stores in the same stack make different CAP/PACELC choices. The OLTP store is usually CP and PC/EC — strict consistency, accept latency cost. The cache is CP-shaped during normal operation but degrades to AP under partition (you serve stale, you keep going). The search index is AP, near-eventual. The lakehouse is asynchronous, hours-stale, no consistency guarantee at all.

This is fine — and being explicit about it is what keeps the polyglot stack honest. The application has to know which store to ask for which question. Asking the cache for an exact balance is a bug. Asking the OLTP store for a top-1000 aggregate is a different bug. The architecture diagram and the developer documentation must list each store's consistency level, so that engineers do not silently downgrade the system by routing the wrong question to the wrong store.
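One cheap way to enforce that discipline in code is to make every read path declare the consistency it needs, and fail loudly when it is routed to a store that cannot provide it. A sketch — the store names and levels mirror the stack described above, but the API itself is invented:

```python
# Each store declares its consistency level; asking the wrong store for the
# wrong question fails loudly instead of silently serving stale truth.
STORE_CONSISTENCY = {
    "postgres":      "strict",
    "redis_cache":   "eventual",
    "elasticsearch": "eventual",
    "lakehouse":     "hours_stale",
}

def read_from(store: str, needs: str) -> str:
    """Refuse reads that demand stronger consistency than the store gives."""
    rank = {"strict": 2, "eventual": 1, "hours_stale": 0}
    if rank[STORE_CONSISTENCY[store]] < rank[needs]:
        raise ValueError(f"{store} is {STORE_CONSISTENCY[store]}; "
                         f"this question needs {needs}")
    return f"ok: {store}"

print(read_from("postgres", needs="strict"))       # ok: postgres
print(read_from("redis_cache", needs="eventual"))  # ok: redis_cache
# read_from("redis_cache", needs="strict") raises — the balance-from-cache bug, caught
```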

When polyglot is wrong

A useful corrective: polyglot is overkill if your workload has only one shape. A pure event-sourced log analytics tool — say, a tiny startup whose only product is "ingest webhooks, run aggregations, expose a dashboard" — does not need OLTP, cache, or search. ClickHouse alone covers it. A pure transactional ledger app with no search and no analytics can run on Postgres alone for years.

The decision rule is honest: count distinct access shapes in the workload. One shape, one engine. Three shapes, two or three engines. Seven shapes, you are running Razorpay and you have a portfolio.

Where this leads next

References

  1. Pramod Sadalage and Martin Fowler, NoSQL Distilled, ch. 13 Polyglot Persistence (Addison-Wesley, 2012) — the book that introduced the term and its discipline. martinfowler.com/bliki/PolyglotPersistence.html.
  2. Martin Kleppmann, Designing Data-Intensive Applications, ch. 1 Reliable, Scalable, Maintainable, and ch. 11 Stream Processing (O'Reilly, 2017) — the modern treatment of "many specialised stores, one log of facts." dataintensive.net.
  3. Jay Kreps, The Log: What every software engineer should know about real-time data's unifying abstraction (LinkedIn Engineering, 2013) — argues that CDC + a shared log is the architectural backbone of polyglot persistence. engineering.linkedin.com.
  4. Mark Cavage, There's Just No Getting around It: You're Building a Distributed System (ACM Queue, 2013) — on the operational cost of every additional store; the case against premature polyglot. queue.acm.org.
  5. Razorpay Engineering blog, How we handle 5000 TPS at peak (2023) — one public Indian-fintech walk-through of an OLTP + Kafka + ClickHouse + Elasticsearch + Redis stack. engineering.razorpay.com.
  6. Adrian Colyer, The morning paper on polyglot persistence and Lambda Architecture — multiple summaries of the foundational Fowler/Kleppmann arguments. blog.acolyer.org.
  7. padho-wiki: B-tree vs LSM: when to pick which and Lakehouse vs warehouse: the real tradeoffs — the per-engine trade-off chapters this picking framework is built on.