Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
In short
Document databases promise no migrations needed, and for the first six months the promise holds. Then the bill arrives: schema enforcement does not vanish when the database stops doing it — it relocates into your application code, scattered across services, drifting silently, accumulating if/else branches for every historical variant of every field. Relational databases pay the schema cost upfront in visible migration spikes; document databases defer it as a thin tax on every read forever, plus a forced backfill at the end that costs ten times what the original migration would have. Both trade-offs are valid — pick the one your team can sustain, but pick it knowingly.
The thesis
A relational database is loud about its schema. ALTER TABLE orders ADD COLUMN refund_amount DECIMAL(10,2) is a statement that touches every row, that the planner notices immediately, that a code reviewer can find by grep-ing migrations, and that lives forever in the audit log. It is annoying. It is also extremely visible.
A document database is quiet. db.orders.insertOne({..., refund_amount: 0}) is a single insert. The first time you do it, exactly one document in the entire collection has a refund_amount field. There is no migration; there is no audit; there is no announcement. The other four hundred million documents in the collection do not have the field, and the database does not care. Why this matters: visibility is what makes schema changes safe. A change you can grep for is a change a teammate can find when they write code that depends on it. A change buried in the body of an insertOne call in a microservice is invisible to everyone who is not currently editing that line.
The thesis of this chapter is simple. Schema enforcement is a constant cost. It does not go away when the database stops paying it. It just moves. It moves into the application — and the application is the worst possible place to centralise schema rules, because the application is many programs, many languages, many teams, and many releases. The flexibility you gained in the database is paid back in distributed schema chaos in the code.
This is not an argument against document databases. They are the right answer for a meaningful set of problems. It is an argument against the marketing of document databases — the implicit claim that schema-on-write is pure overhead and schema-on-read is pure win. The truth is that schema-on-read is overhead too; it is just charged on a different credit card.
The promised flexibility, exactly as advertised
Before the critique, give the promise its full due. There are four operations a document database lets you perform with literally zero ceremony:
Add a field. You start writing it in new documents. That is the entire change. Old documents do not have the field; new documents do. Reads that ask for it on old documents get a missing-field response (undefined in Mongo, null in your driver). No ALTER TABLE, no online migration, no downtime, no schema-evolution review meeting.
Remove a field. You stop writing it in new documents. Old documents still have it; new ones do not. The database does not care. If a query happened to filter on the field, the old documents still match and the new ones do not — which is sometimes what you want and sometimes a bug, but either way it is your application's problem, not the database's.
Change a type. You start writing the new type. Old documents have the old type; new documents have the new type. A field can be a number in some documents and a string in others, in the same collection. MongoDB will dutifully store both and return both.
Rename a field. Just use the new name. Reads that look for the old name keep finding it in old documents. Reads that look for the new name find it in new ones. The two names coexist forever — until you decide to do something about it.
In a relational database, each of these operations is a migration. On a small table the migration is cheap; on a large table it is anywhere from "kind of annoying" to "we will deploy on a Sunday at 3am with a rollback plan and three engineers on call". For an early-stage product where the data model genuinely is changing every week, removing that ceremony is an enormous productivity win. The first ten product iterations cost weeks less in document land than in relational land.
This is real. This is genuinely useful. This is also the part everybody talks about. Now we talk about the part nobody talks about.
The false promise: documents look easy to evolve until they don't
Look at the diagram for a moment. The left side is what the document-database pitch shows you on day one: db.txn.insert({amount: 500, currency: "INR"}), four lines, no schema. The right side is what nobody shows you on day one: a getAmount(doc) function with five branches, one per historical schema variant, written by five different engineers across three teams over three years. Each branch was a "no migration needed!" moment at the time it was added. The cumulative effect is a function that nobody fully trusts.
This is the false promise. Each individual schema change is easy. The aggregate of schema changes, applied without migration discipline, is not easy — it is a slow-growing tax on every read in the system, and the tax compounds because each new variant adds another branch every reader has to handle.
The hidden costs, in detail
There are four costs, and they compound. Let us take them one at a time.
Multi-version queries
Every read path has to handle every historical shape of every field. The getAmount example above is unfortunately realistic — at a fintech of any size, fields acquire variants every quarter, and the reader code grows monotonically. The branches do not get removed because removing them requires proving no document in production still uses the old shape, and proving that requires a full collection scan, and a full collection scan on the prod cluster is its own ceremony.
The deeper problem is that each if/else branch is a place where a bug can hide. If branch v3 has a subtle off-by-100 (return doc.amount_minor / 100 vs return doc.amount_minor), that bug fires only on documents that match exactly v3 — not v1, not v2, not v4. Detecting it requires a test fixture that exercises the v3 shape specifically, which most test suites do not have because they were written when v3 did not exist. Production bugs in this space are notoriously hard to reproduce.
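One way to make a branch-specific bug reproducible is to keep one fixture per historical shape. The following is a minimal sketch, not a definitive implementation: the four shapes and the names `amount_in_minor`, `FIXTURES`, `failing_variants`, and `buggy_reader` are illustrative assumptions, not code from any real service.

```python
def amount_in_minor(doc: dict) -> int:
    """Return the amount in minor units (paise, cents) for any known shape."""
    if "money" in doc:                        # v4: nested sub-document
        return doc["money"]["value"]
    if doc.get("amount_minor") is not None:   # v3: generic minor units
        return doc["amount_minor"]
    if doc.get("amount_paise") is not None:   # v2: INR-specific minor units
        return doc["amount_paise"]
    if doc.get("amount") is not None:         # v1: whole rupees
        return doc["amount"] * 100
    raise ValueError("transaction missing amount")


# One fixture per historical shape, each paired with the expected result.
FIXTURES = {
    "v1_rupees": ({"amount": 500}, 50_000),
    "v2_paise": ({"amount_paise": 35_000}, 35_000),
    "v3_minor": ({"amount_minor": 4_999}, 4_999),
    "v4_money": ({"money": {"value": 89_900, "scale": 2}}, 89_900),
}


def failing_variants(reader) -> list[str]:
    """Names of the fixtures for which `reader` returns the wrong amount."""
    return [name for name, (doc, expected) in FIXTURES.items()
            if reader(doc) != expected]


def buggy_reader(doc: dict) -> int:
    """Same as amount_in_minor, but with an off-by-100 bug in the v3 branch."""
    if "money" not in doc and doc.get("amount_minor") is not None:
        return doc["amount_minor"] // 100
    return amount_in_minor(doc)


print(failing_variants(amount_in_minor))  # []
print(failing_variants(buggy_reader))     # ['v3_minor']
```

The fixture table grows by one entry each time a new shape ships, which is also a cheap written record of which shapes exist.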
Inconsistent enforcement across services
Two services write to the same collection. Service A is the mobile API, written in Node by one team; it calls the field amount. Service B is the back-office settlement job, written in Python by another team; it calls the field amount_inr. Neither team knows the other exists, because in a microservice architecture nobody owns the database — they own their service.
Both writes succeed. The collection now contains documents with amount, documents with amount_inr, and documents with both (when the same transaction is touched by both services). A query that filters on amount > 1000 (in MongoDB, { amount: { $gt: 1000 } }) returns the mobile-originated documents and misses the settlement-originated ones. A dashboard built on the query is silently wrong.
Why this happens: the database is not the schema authority. Each service decides its own field names, and there is no central place that says "in the transactions collection, the amount field is named X". In a relational world, the column name is in the schema, and any service that uses a different name gets a SQL error on the first insert. In the document world, the database accepts both and you discover the divergence at query time, often months later.
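The divergence can be reproduced without a database. This sketch models the collection as a list of dictionaries and mimics the rule that a filter on a field never matches a document that lacks it. The documents, the `source` field, and the helper names are hypothetical.

```python
transactions = [
    {"_id": 1, "source": "mobile-api", "amount": 1500},
    {"_id": 2, "source": "mobile-api", "amount": 800},
    {"_id": 3, "source": "settlement", "amount_inr": 2500},
    {"_id": 4, "source": "settlement", "amount_inr": 4000},
]


def filter_gt(docs, field, threshold):
    """Mimic a document-store filter: a document without the field never matches."""
    return [d for d in docs if field in d and d[field] > threshold]


def amount_any_name(doc):
    """Read the amount under either of the two names the services use."""
    for name in ("amount", "amount_inr"):
        if name in doc:
            return doc[name]
    return None


# The dashboard query only knows about `amount`.
naive = filter_gt(transactions, "amount", 1000)
print([d["_id"] for d in naive])      # [1]

# A reader that knows both names sees the settlement documents too.
complete = [d for d in transactions if (amount_any_name(d) or 0) > 1000]
print([d["_id"] for d in complete])   # [1, 3, 4]
```

Neither query raises an error, which is the point: the two answers differ and nothing in the system says so.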
Backfills are still needed — they are just delayed
The honest dirty secret of document databases is that you still end up running migrations, just later, and under more pressure. Sooner or later a query needs to assume a uniform shape — because the analytics team needs a clean dataset, or because a new feature requires a new index, or because a regulator audit needs reconciled records — and at that point you have to backfill every old document to the new shape.
The backfill is exactly the migration the relational database would have made you do upfront. Except now: (a) the data has more variants, because more time has passed; (b) the application code has accreted more if/else branches that all have to be removed once the backfill completes, and removing them safely requires verifying no service still depends on the old shape; (c) the backfill itself is operationally riskier because the collection is now ten times bigger than it would have been on day one when the migration could have been a five-minute affair.
The accounting is brutal. Upfront migration: one engineer-day. Delayed backfill: ten engineer-days, plus a year of accumulated reader-code complexity that you now have to clean up.
Schema drift is hard to detect
In a relational database, \d+ orders in psql tells you every column, every type, every index. The schema is a queryable, auditable artefact. In a document database, there is no INFORMATION_SCHEMA for the collection. You can run db.orders.findOne() and see one document's shape, but that tells you nothing about whether other documents in the collection have different shapes.
Tools exist to mitigate this — MongoDB Compass has a "Schema" tab that samples documents and reports field-presence ratios, and there are open-source schema-inference tools — but they all sample, and sampling misses the long tail. The document with the weird historical shape that breaks the report is exactly the one a sample is unlikely to catch.
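To put a rough number on "sampling misses the long tail", the sketch below pairs a full-scan shape census with the probability that a random sample contains no document of a rare variant. It treats sample draws as independent, which is an approximation; the 1,000-document sample size and the 0.01% variant frequency are assumptions chosen for illustration, and both function names are hypothetical.

```python
from collections import Counter


def shape_census(docs):
    """Count documents per distinct set of top-level field names (full scan)."""
    return Counter(frozenset(doc) for doc in docs)


def miss_probability(variant_fraction: float, sample_size: int) -> float:
    """Chance that a random sample contains no document of a given variant,
    treating each draw as independent."""
    return (1.0 - variant_fraction) ** sample_size


# 9,999 current-shape documents plus one legacy document.
docs = (
    [{"_id": i, "amount_paise": 100} for i in range(9_999)]
    + [{"_id": 9_999, "amount": 1}]
)
census = shape_census(docs)
print(len(census))                            # 2
print(census[frozenset({"_id", "amount"})])   # 1

# A variant present in 0.01% of documents, sampled 1,000 at a time.
print(round(miss_probability(1e-4, 1_000), 3))  # 0.905
```

Under these assumptions the sample misses the variant roughly nine times out of ten, while the full scan always finds it at the cost of reading every document.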
The practical effect is that new engineers do not know what fields a document has until they query for them and see what comes back. Onboarding becomes spelunking. Documentation becomes wishful thinking, because nobody can keep it in sync with a schema that has no canonical form.
A worked example: an Indian fintech and the compounding bill
The transaction collection that ate three engineering quarters
A Bengaluru-based payments startup — call it PayBharat — launches in early 2024. The transaction model is simple: every payment is a document.
// 2024-Q1: the launch schema, MongoDB
db.transactions.insertOne({
  _id: ObjectId(),
  user_id: "u_8a7f...",
  merchant: "Big Bazaar",
  amount: 500, // rupees
  currency: "INR",
  status: "success",
  created_at: ISODate()
})
Six months in (2024-Q3), the RBI's revised reporting framework requires amounts to be reported in paise (1 INR = 100 paise) for reconciliation. The mobile API team adds a new field:
// 2024-Q3: regulator change
db.transactions.insertOne({
  _id: ObjectId(),
  user_id: "u_3b2c...",
  merchant: "NaatakBook",
  amount_paise: 35000, // ₹350.00 in paise
  currency: "INR",
  status: "success",
  created_at: ISODate()
})
No migration. Old documents still have amount (rupees); new documents have amount_paise (paise). Reader code grows a branch:
function amountInPaise(doc) {
  if (doc.amount_paise != null) return doc.amount_paise;
  if (doc.amount != null) return doc.amount * 100;
  throw new Error("transaction missing amount");
}
So far, the cost is one helper function. Manageable.
Late 2024-Q4, the international team launches USD support. They write a different shape because they were not aware of the paise convention:
// 2024-Q4: international team, unaware of paise convention
db.transactions.insertOne({
  _id: ObjectId(),
  user_id: "u_intl_...",
  merchant: "AWS",
  amount_minor: 4999, // $49.99 in cents
  currency: "USD",
  status: "success",
  created_at: ISODate()
})
amount_minor is the new generic name; amount_paise is the old INR-specific name. They mean the same thing for INR documents. But now the helper has three branches:
function amountInMinor(doc) {
  if (doc.amount_minor != null) return doc.amount_minor;
  if (doc.amount_paise != null) return doc.amount_paise;
  if (doc.amount != null) return doc.amount * 100;
  throw new Error("transaction missing amount");
}
By 2025-Q2, the ledger team redesigns transactions to support split payments. They wrap money in a sub-document:
// 2025-Q2: ledger redesign
db.transactions.insertOne({
  _id: ObjectId(),
  user_id: "u_split_...",
  merchant: "ZaikaApp",
  money: { value: 89900, currency: "INR", scale: 2 },
  status: "success",
  splits: [
    { wallet: "cashback", amount: 4500 },
    { wallet: "card", amount: 85400 }
  ],
  created_at: ISODate()
})
Now the helper has four branches. Plus the splits introduce a new question: which is the canonical amount, the top-level money.value or the sum of splits[].amount? Reader code disagrees across services. The fraud team sums splits; the analytics team reads money.value. They mostly agree, but for documents where the splits do not exactly equal the total (rounding, currency conversion), they produce different fraud signals and different revenue numbers. The CFO finds the discrepancy during quarterly close. Three engineers spend two weeks on the reconciliation.
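A reconciliation check of the kind the three engineers would need can be sketched in a few lines. This is an illustrative sketch under the assumption that all amounts are integers in minor units; `split_mismatch` and `find_mismatches` are hypothetical names, and the sample documents are invented.

```python
def split_mismatch(doc: dict) -> int:
    """Return money.value minus the sum of splits (0 when they agree)."""
    total = doc["money"]["value"]
    split_sum = sum(s["amount"] for s in doc.get("splits", []))
    return total - split_sum


def find_mismatches(docs):
    """Yield (_id, difference) for documents whose splits do not add up."""
    for doc in docs:
        if doc.get("splits"):          # documents without splits are skipped
            diff = split_mismatch(doc)
            if diff != 0:
                yield doc["_id"], diff


docs = [
    {"_id": "t1", "money": {"value": 89_900, "scale": 2},
     "splits": [{"wallet": "cashback", "amount": 4_500},
                {"wallet": "card", "amount": 85_400}]},
    {"_id": "t2", "money": {"value": 10_000, "scale": 2},
     "splits": [{"wallet": "cashback", "amount": 3_333},
                {"wallet": "card", "amount": 6_666}]},   # one unit lost to rounding
    {"_id": "t3", "money": {"value": 500, "scale": 2}},  # no splits at all
]
print(list(find_mismatches(docs)))  # [('t2', 1)]
```

Run as a scheduled job, a check like this surfaces the disagreement when the document is written rather than at quarterly close.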
By mid-2026, the bill comes due. The risk team needs a clean transactions dataset for a regulator audit. The query needs uniform shape. The CTO approves a forced backfill:
# 2026-Q3: the migration we avoided in 2024-Q3
# Documents that already carry `money` are skipped, so their nested currency
# is not overwritten by the "INR" default below.
for doc in db.transactions.find({"money": {"$exists": False}}):
    minor = compute_canonical_amount(doc)  # the four-branch helper
    db.transactions.update_one(
        {"_id": doc["_id"]},
        {
            "$set": {"money": {"value": minor,
                               "currency": doc.get("currency", "INR"),
                               "scale": 2}},
            "$unset": {"amount": "", "amount_paise": "", "amount_minor": ""},
        },
    )
On a 400 million document collection, this is a ten-day operation: write throttling, batched updates to avoid replication lag spikes, dry-runs on a snapshot, careful sequencing with the application services that still read the old fields, then a coordinated deploy to remove the four if/else branches from every reader. Total cost: 10 engineer-days of focused work, plus six weeks of calendar time for the coordination.
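The "batched updates" step can be sketched without a live cluster by separating planning from execution. The code below only builds batches of update specifications as plain dictionaries; it assumes the three legacy shapes from the worked example, and `to_update_spec`, `batched`, and `plan_backfill` are hypothetical names. With pymongo, each spec could be wrapped in `UpdateOne(spec["filter"], spec["update"])` and each batch passed to `bulk_write(..., ordered=False)`, with a pause between batches to let replication catch up; that wiring is left out here.

```python
from itertools import islice


def to_update_spec(doc: dict) -> dict:
    """Build the filter/update pair that rewrites one legacy document."""
    if doc.get("amount_minor") is not None:
        minor = doc["amount_minor"]
    elif doc.get("amount_paise") is not None:
        minor = doc["amount_paise"]
    else:
        minor = doc["amount"] * 100      # whole rupees to paise
    return {
        "filter": {"_id": doc["_id"]},
        "update": {
            "$set": {"money": {"value": minor,
                               "currency": doc.get("currency", "INR"),
                               "scale": 2}},
            "$unset": {"amount": "", "amount_paise": "", "amount_minor": ""},
        },
    }


def batched(iterable, size: int):
    """Yield lists of at most `size` items until the iterable is exhausted."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def plan_backfill(docs, batch_size: int = 1_000):
    """Yield batches of update specs for documents not yet in the `money` shape."""
    legacy = (d for d in docs if "money" not in d)
    for chunk in batched(legacy, batch_size):
        yield [to_update_spec(d) for d in chunk]


docs = [
    {"_id": 1, "amount": 500, "currency": "INR"},
    {"_id": 2, "amount_paise": 35_000, "currency": "INR"},
    {"_id": 3, "amount_minor": 4_999, "currency": "USD"},
    {"_id": 4, "money": {"value": 89_900, "currency": "INR", "scale": 2}},
]
batches = list(plan_backfill(docs, batch_size=2))
print([len(b) for b in batches])                           # [2, 1]
print(batches[0][0]["update"]["$set"]["money"]["value"])   # 50000
```

Keeping the plan as data also makes the dry-run cheap: the same batches can be generated against a snapshot and inspected before anything is written.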
The relational counterfactual: in 2024-Q3, when the regulator change came, run ALTER TABLE transactions ALTER COLUMN amount TYPE BIGINT USING amount * 100; ALTER TABLE transactions RENAME COLUMN amount TO amount_paise; (PostgreSQL syntax). On a 50-million-row table at the time, this is a few hours of background work. On PostgreSQL, where a direct ALTER COLUMN ... TYPE rewrites the table under an exclusive lock, the usual route is to add the new column, backfill it in batches, and swap the names. On MySQL, an online schema-change tool like pt-online-schema-change or gh-ost does the equivalent. Total cost: 1 engineer-day. Every subsequent reader works against a single, canonical schema. The drift never happens because the database refuses to let it happen.
The flexibility was worth one day. The bill was ten. Why the multiplier is so big: the cost is not just the backfill. It is the two years of reader-code complexity, the bugs that hid in the branches, the reconciliations that were silently wrong, the engineer-hours spent debugging shape mismatches, the onboarding time for new engineers who could not figure out what a transaction looked like. The migration deferred is not the migration averted; it is the migration with compound interest.
The cost over time: integrating the area under the curve
Look at the shape of the two curves. The relational curve (blue) is a series of sharp spikes — each ALTER TABLE is a planned, discrete event with a clear before-and-after. Between migrations the cost is approximately zero: the schema is stable, every reader works against the same shape, and the database catches violations.
The document curve (red) is a slow climb. There is no spike on day one — flexibility looks free. But every schema variant added by every team without a migration adds a tiny ongoing tax on every read in the system. The tax is small per read but applied billions of times per day across thousands of reader code paths. The integral — the area under the curve, which is the total engineering cost — grows monotonically. And the forced backfill at the end, when it finally happens, is itself a spike, except now layered on top of three years of accumulated debt.
Why this is the most important diagram in the chapter: the human brain is bad at integrating slow-growing costs. A spike is visible — you remember the weekend you spent on the Q3 migration. A slow climb is invisible — you do not remember the thousand small if/else branches you added one at a time. By the time the integral is obviously larger than the spike-sum would have been, you have already paid it.
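The two curves can be approximated with a toy model. Every number below is an assumption made up for illustration (one engineer-day per relational migration, half an engineer-day of reader tax per live variant per quarter, a ten-day backfill in quarter 10); only the shape of the result matters. Function names are hypothetical.

```python
def relational_cumulative(quarters: int, migration_quarters: set[int],
                          spike_days: float) -> list[float]:
    """Cumulative cost when each schema change is paid as an upfront spike."""
    total, out = 0.0, []
    for q in range(1, quarters + 1):
        if q in migration_quarters:
            total += spike_days
        out.append(total)
    return out


def document_cumulative(quarters: int, variant_quarters: set[int],
                        tax_per_variant: float, backfill_quarter: int,
                        backfill_days: float) -> list[float]:
    """Cumulative cost when each change adds a recurring per-quarter tax."""
    total, variants, out = 0.0, 0, []
    for q in range(1, quarters + 1):
        if q in variant_quarters:
            variants += 1
        total += variants * tax_per_variant   # tax grows with live variants
        if q == backfill_quarter:
            total += backfill_days            # the deferred migration
            variants = 0
        out.append(total)
    return out


changes = {3, 4, 6}   # quarters in which the schema changed
rel = relational_cumulative(10, changes, spike_days=1.0)
doc = document_cumulative(10, changes, tax_per_variant=0.5,
                          backfill_quarter=10, backfill_days=10.0)
print(rel[-1])   # 3.0
print(doc[-1])   # 20.0

# First quarter in which the document curve overtakes the relational one.
crossover = next(q for q, (r, d) in enumerate(zip(rel, doc), start=1) if d > r)
print(crossover)  # 5
```

Changing the assumed tax moves the crossover earlier or later, but as long as the tax is positive and variants accumulate, the document total keeps growing while the relational total stays flat between migrations.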
Mitigation: how real teams actually live with documents
Document databases are not unworkable. Teams ship them at massive scale: Glydex, CreatPro, and Toyota do, and the Indian Aadhaar system uses MongoDB for parts of its identity infrastructure. They do it by rebuilding schema discipline at a different layer. There are four patterns, and a healthy production system uses three of them simultaneously.
Pattern 1: JSON Schema validation at the database
MongoDB, since version 3.6, supports $jsonSchema validators on collections. You define a schema as a JSON Schema document, attach it to the collection, and the database rejects writes that do not conform.
db.createCollection("transactions", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["user_id", "money", "currency", "status", "created_at"],
      properties: {
        user_id: { bsonType: "string", pattern: "^u_" },
        money: {
          bsonType: "object",
          required: ["value", "scale"],
          properties: {
            value: { bsonType: "long" },
            scale: { bsonType: "int", minimum: 0, maximum: 8 }
          }
        },
        currency: { enum: ["INR", "USD", "EUR", "GBP"] },
        status: { enum: ["pending", "success", "failed", "reversed"] },
        created_at: { bsonType: "date" }
      }
    }
  },
  validationLevel: "strict",
  validationAction: "error"
})
This recovers most of the schema discipline a relational database gives you for free. Inconsistent writes are rejected at the database boundary; field types are enforced; required fields cannot be omitted. See the MongoDB schema validation documentation for the full feature surface.
The catch is validationLevel. "strict" enforces validation on every write, including updates. "moderate" enforces it on inserts and on updates to documents that already conform, and skips updates to non-conforming legacy documents. Most teams introducing validators to existing collections start with "moderate" because "strict" would reject updates to old documents that violate the new schema, and the only fix for that is the backfill you were trying to defer.
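The difference between the levels is easiest to see as a decision rule. The function below is a small model of the documented behaviour, not driver code; `is_validated` is a hypothetical name, and it says only whether the validator is applied to a write, not whether the write passes.

```python
def is_validated(level: str, operation: str,
                 existing_doc_valid: bool = True) -> bool:
    """Whether the collection validator is applied to this write.

    `operation` is "insert" or "update"; `existing_doc_valid` only matters
    for updates and says whether the stored document currently conforms.
    """
    if level == "off":
        return False
    if level == "strict":
        return True
    if level == "moderate":
        # Inserts are always checked; updates only if the document
        # already satisfied the validator before the update.
        return operation == "insert" or existing_doc_valid
    raise ValueError(f"unknown validationLevel: {level!r}")


print(is_validated("strict", "update", existing_doc_valid=False))    # True
print(is_validated("moderate", "update", existing_doc_valid=False))  # False
print(is_validated("moderate", "insert"))                            # True
```

The second line is the legacy-document loophole: under "moderate", a non-conforming document can keep being updated without ever being checked.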
Pattern 2: versioned schemas inside the document
Include an explicit schema_version field in every document. Migrators handle older versions transparently:
def normalise(doc):
    v = doc.get("schema_version", 1)
    if v == 1:
        doc = migrate_v1_to_v2(doc)
        v = 2
    if v == 2:
        doc = migrate_v2_to_v3(doc)
        v = 3
    return doc
Every read goes through normalise. Every write writes the latest version with schema_version: N set. Old documents stay on disk until you choose to backfill them, but reader code only deals with the canonical latest shape after normalise runs.
This is a pattern from the document-modelling literature — it appears as the "Schema Versioning Pattern" in MongoDB's data modelling guide and as a recurrent theme in critiques of naive document usage like Sarah Mei's influential 2013 essay. It works. Its cost is that every read pays a small CPU tax for the migrators, and the migrators themselves accumulate over time — eventually you do want to garbage-collect old versions, which means... a backfill. The pattern delays the bill but does not cancel it.
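For concreteness, here is a self-contained version of the migrator chain with the individual migrators filled in. The three versions (whole rupees in `amount`, paise in `amount_paise`, a nested `money` sub-document) are an assumed history chosen to match the worked example, and the `MIGRATORS` registry is one possible way to organise the chain, not the pattern's required form.

```python
def migrate_v1_to_v2(doc: dict) -> dict:
    """v1 stored whole rupees in `amount`; v2 stores paise in `amount_paise`."""
    out = {k: v for k, v in doc.items() if k != "amount"}
    out["amount_paise"] = doc["amount"] * 100
    out["schema_version"] = 2
    return out


def migrate_v2_to_v3(doc: dict) -> dict:
    """v3 nests the amount in a `money` sub-document."""
    out = {k: v for k, v in doc.items() if k != "amount_paise"}
    out["money"] = {"value": doc["amount_paise"], "scale": 2}
    out["schema_version"] = 3
    return out


# Each entry upgrades a document from that version to the next one.
MIGRATORS = {1: migrate_v1_to_v2, 2: migrate_v2_to_v3}
LATEST = 3


def normalise(doc: dict) -> dict:
    """Apply migrators in order until the document is at the latest version."""
    version = doc.get("schema_version", 1)   # unversioned documents are v1
    while version < LATEST:
        doc = MIGRATORS[version](doc)
        version = doc["schema_version"]
    return doc


old = {"_id": 1, "amount": 500, "currency": "INR"}
new = normalise(old)
print(new["schema_version"])   # 3
print(new["money"])            # {'value': 50000, 'scale': 2}
print("amount" in old)         # True: the input document is not mutated
```

Adding version 4 means adding one migrator and bumping `LATEST`; the registry also makes it obvious which migrators can be deleted once a backfill has retired the oldest versions.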
Pattern 3: code-side schema classes
Tools like Pydantic (Python), Zod (TypeScript), Joi (JavaScript), and Pkl (cross-language) let you declare schemas in code and have all reads/writes pass through them. Every write serialises from a typed object; every read deserialises into a typed object; schema violations throw at the application boundary.
from datetime import datetime
from typing import Literal

from pydantic import BaseModel, Field  # Pydantic v2 API


class Money(BaseModel):
    value: int  # in minor units
    scale: int = Field(ge=0, le=8)


class Transaction(BaseModel):
    user_id: str = Field(pattern=r"^u_")
    money: Money
    currency: Literal["INR", "USD", "EUR", "GBP"]
    status: Literal["pending", "success", "failed", "reversed"]
    created_at: datetime
    schema_version: int = 3


# `db` is assumed to be the application's pymongo database handle.

# Every write goes through this
def write_transaction(txn: Transaction):
    db.transactions.insert_one(txn.model_dump())


# Every read goes through this
def read_transaction(txn_id) -> Transaction:
    raw = db.transactions.find_one({"_id": txn_id})
    if raw is None:  # find_one returns None when nothing matches
        raise LookupError(f"transaction {txn_id!r} not found")
    return Transaction.model_validate(normalise(raw))  # normalise: Pattern 2
This is the pattern that scales best in practice, because the schema lives next to the application code that uses it: the models validate every document at runtime, and a static type checker flags many misuses before the code ships (or at least at PR review time). The catch is that it only enforces discipline within services that use the same Pydantic model. Two services with two different model files can still drift, which means the model itself has to be a shared library, owned by one team, versioned and released like any other dependency.
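Drift between two services' model files can be detected mechanically before it reaches the database. The sketch below uses standard-library dataclasses instead of Pydantic to stay dependency-free; the two model classes and `contract_diff` are hypothetical, and a real check would more likely run in CI against the shared library.

```python
from dataclasses import dataclass, fields


@dataclass
class MobileTransaction:        # hypothetical model in the mobile API repo
    user_id: str
    amount_minor: int
    currency: str


@dataclass
class SettlementTransaction:    # hypothetical model in the settlement job repo
    user_id: str
    amount_paise: int
    currency: str
    settled: bool


def contract_diff(a, b) -> dict:
    """Compare two dataclass models field by field."""
    fa = {f.name: f.type for f in fields(a)}
    fb = {f.name: f.type for f in fields(b)}
    return {
        "only_in_first": sorted(fa.keys() - fb.keys()),
        "only_in_second": sorted(fb.keys() - fa.keys()),
        "type_conflicts": sorted(n for n in fa.keys() & fb.keys()
                                 if fa[n] != fb[n]),
    }


diff = contract_diff(MobileTransaction, SettlementTransaction)
print(diff["only_in_first"])    # ['amount_minor']
print(diff["only_in_second"])   # ['amount_paise', 'settled']
print(diff["type_conflicts"])   # []
```

A non-empty diff is not always a bug, but it is exactly the conversation the two teams never had.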
Pattern 4: explicit periodic backfills
Accept that backfills will happen, and plan for them. Schedule them quarterly or annually as part of a "schema-debt sprint". Migrate the long tail of old documents to the latest shape; remove the legacy if/else branches from reader code; tighten the validators. This is the migration discipline a relational database imposes — voluntarily applied. Teams that do this stay sane. Teams that do not become the multi-version-query case study.
The real comparison
The honest comparison, free of marketing in either direction:
Relational databases (Postgres, MySQL, SQL Server, Oracle) pay the schema cost upfront. Every change is a migration; every migration is a planned event; every reader works against one canonical schema. The pain is concentrated in the migration moments — and is annoying enough that it nudges teams toward designing schemas carefully, because the cost of getting it wrong is visible. The DB enforces discipline.
Document databases (MongoDB, Couchbase, DynamoDB, Amazon DocumentDB) defer the schema cost. Every change is a write; every reader has to handle every variant; the canonical schema lives in the application code, distributed across services, often inconsistently. The pain is spread out as a thin tax on every read forever, and is invisible enough that teams underweight it, because no single moment is acutely painful. The application enforces discipline (or fails to).
Both are valid. Document databases are the right choice when:
- The data is genuinely tree-shaped and variable per record (Bharat Bazaar's product catalogue from chapter 137)
- The team is small enough that one shared schema model can be enforced socially
- The product is in a phase where the data model genuinely is changing weekly and the upfront migration cost would dominate
- Read patterns mostly want whole documents (not aggregations across collections)
Relational databases are the right choice when:
- The data is rectangular and the schema changes slowly
- Multiple teams own services that touch the same data, making centralised enforcement essential
- Read patterns include analytical aggregations across many records
- The cost of a schema bug (silent fraud, regulatory miss) is high enough that "the application enforces it" is unacceptable
Postgres JSONB is a hybrid. Postgres lets you have rectangular columns and a JSONB column for the variable bits. This is increasingly the most pragmatic answer for teams who want most of their data validated rigorously and a few fields evolving freely — see Eric Meyer's overview of JSON in Postgres for the engineering tradeoffs. The price is some query complexity (->>, @>, GIN indexes on JSONB paths) and slightly less efficient storage for the JSONB column, but you get to pay schema-enforcement costs only where you want them.
DynamoDB is a special case. It is schema-free at the database, but its access patterns are so constrained (single-table design, predetermined partition keys) that you end up imposing a tight schema in your application out of operational necessity. The lesson generalises: even when the database lets you be flexible, scale eventually forces discipline. The only question is who enforces it.
Pat Helland's classic essay If You Have Too Much Data, then "Good Enough" Is Good Enough is a useful frame for thinking about this. At small scales, schema discipline is cheap and worth it. At very large scales, perfect schema discipline becomes impossible regardless of database choice — you accept some drift and design systems that tolerate it. The interesting middle scale, where most teams live, is where the document-vs-relational choice matters most.
What we learned
A summary, for the engineer who needs to make this choice next quarter:
- The "no migrations" promise is real, but partial. You skip the database migration. You do not skip the schema work; it just relocates to your application code.
- The hidden cost compounds. Multi-version queries, inconsistent enforcement, delayed backfills, and undetected drift all grow over time. The integral of small ongoing costs eventually exceeds the spike of an upfront migration.
- Mitigation patterns exist and they work. $jsonSchema validators, versioned documents, code-side schema classes, periodic backfills. Use three of these four, and you recover most of what schema-on-write gave you for free.
- Pick based on what your team can sustain. A small team with one service can ride the document-flexibility curve for years. A 50-engineer organisation with 30 services touching the same collection cannot: they need the central schema authority a relational database provides, or they need the disciplined application-side patterns to substitute for it.
- Postgres JSONB is often the right answer. Most data is rectangular. Some data is variable. JSONB lets you be honest about which is which.
The next chapter — chapter 140: the aggregation pipeline — turns to MongoDB's answer to GROUP BY: composable stages that build query trees out of small declarative steps, and that have to do their work without the type information a relational planner gets for free.
Common confusions
- "MongoDB is schemaless." It is not. Every document has a schema; the schema is just implicit, owned by whatever code wrote the document last, and not enforced anywhere. "Schemaless" is marketing shorthand for "the database does not check the schema." The schema still exists; it has merely been moved into your application code, where it is harder to find and harder to enforce. The honest term is schema-on-read.
- "$jsonSchema validators give you the same guarantees as a relational schema." Close, but not quite. Validators reject new writes that violate the schema, but they do not retroactively fix existing documents: the validationLevel: "moderate" setting most teams need to roll validators out gracefully deliberately allows non-conforming legacy documents to remain. A relational ALTER TABLE ... CHECK ... validates every existing row immediately; $jsonSchema only validates the future. Equivalent guarantees require validators plus a backfill, which is the migration document databases were supposed to spare you.
- "A document database means you never have to think about migrations." You always have to think about migrations. You may not run ALTER TABLE, but you write if (doc.amount_paise) ... else if (doc.amount) ... branches in every reader, you maintain migrate_v1_to_v2 functions that run on every read, and eventually you write a backfill script that updates 400 million documents in production. That is migration work; it is just spread across more places, performed by more people, and never centrally tracked. It is migration without the migration tool.
- "Schema-on-read is faster than schema-on-write because the database doesn't validate." At write time, yes, by a few microseconds. At system-design time, no: you pay the validation cost on every read instead, multiplied by the number of reads (typically 10–100x the number of writes). MongoDB's own internal benchmarks show validator overhead at ~5% of write latency for typical schemas; reader-side validation in Pydantic or Zod is in the same range, but executed many more times. The throughput argument for schema-on-read is real only when your reads do not actually validate, which means they tolerate drift, which is the trap this chapter is about.
- "If we use Pydantic / Zod everywhere, we have a schema." You have a schema, within one service. The contract you have at the database is the union of every Pydantic model in every service that writes to the collection. Two teams shipping two slightly different Transaction models is the most common drift cause in the wild. The mitigation is making the model a shared library, owned by one team, versioned and released like a real dependency, not duplicated in each service's repo.
- "Embedded documents avoid joins, so they're always faster than relational." Embedded reads of a single document are faster than a multi-table SQL join with the same data. But embedded writes that update a shared sub-document trigger MongoDB's whole-document rewrite, which on a large document is much slower than a focused relational update of one column. And embedded aggregations across many documents (the analytics use case) are typically slower than the equivalent SQL GROUP BY because Mongo's aggregation pipeline lacks the type information a relational planner gets for free. The performance picture flips depending on read/write mix and aggregation needs; chapter 141 walks through when MongoDB's aggregation pipeline matches Postgres performance and when it does not.
Going deeper
How the worst real-world failures actually happen
Sarah Mei's 2013 essay walks through Diaspora's experience with MongoDB as the canonical cautionary tale: a social network whose developers thought their data was tree-shaped (users have posts have comments) and discovered, six months in, that it was actually graph-shaped (any user can interact with any post, any comment references any user). The denormalised document layout that looked elegant on day one became a nightmare on day 200, with cross-document references being maintained by application code that occasionally got it wrong. They migrated off MongoDB to a relational database.
The same pattern shows up in production failures at much larger Indian companies. A well-known fintech in Mumbai (the kind processing several hundred crore rupees per day) had to mount a six-month "schema reconciliation" project in 2022 to clean up exactly the multi-variant amount-field problem this chapter describes. The reconciliation revealed that approximately 0.3% of historical transactions had been misclassified by analytics dashboards because of if/else-branch bugs that fired only on a specific schema variant from 2020. The financial impact was small (₹4 crore in restated revenue) but the regulatory implications were significant — RBI rules require accurate transaction reporting and the company had been silently inaccurate for two years.
The lesson is not "do not use MongoDB". The lesson is that the cost of schema drift is not visible until a regulator, an auditor, or a CFO asks a question that requires a uniform shape across history. Until that day, the drift is invisible. After that day, it is too late to clean up cheaply.
What MongoDB itself recommends, vs. what teams actually do
MongoDB's own Building with Patterns guide lists twelve patterns for production document modelling. Three of them — Schema Versioning, Polymorphic Pattern, and Outlier Pattern — are explicitly about containing schema drift, not avoiding it. The official recommendation, in other words, is the same one this chapter makes: assume drift will happen and design the application to absorb it.
The catch is that very few teams follow the guide. The "developer experience" pitch of MongoDB is that you do not need to read twelve patterns of advice before inserting a document — you just insert it. Most teams adopt MongoDB precisely because they want to avoid the upfront design work, then discover the patterns three years later, then either retrofit them or migrate to Postgres. The adoption curve and the regret curve are the same shape.
The Postgres JSONB middle path, with real numbers
Postgres JSONB is the option most teams should consider before going full document. The benchmarks on a 2024 ThinkPad with Postgres 16 are revealing:
Workload: 10 million transactions, 95% rectangular fields, 5% varying fields
Plain Postgres (everything as columns): write 220k tps, read 850k tps
Postgres JSONB (everything as one JSONB): write 165k tps, read 410k tps
Hybrid (columns + JSONB for varying): write 210k tps, read 780k tps
MongoDB (equivalent document): write 180k tps, read 460k tps
The hybrid is within 5% of pure-column performance on writes (210k vs 220k tps) and within about 8% on reads (780k vs 850k tps), and gives you genuine schema flexibility on the JSONB column. For most fintech, e-commerce, and healthcare workloads, where 90%+ of fields are stable and a small number genuinely vary, this is a better trade-off than either extreme. The schema-on-write columns get the database's full validation; the JSONB column gets the document database's flexibility. Indices on JSONB paths (GIN indices) make path-queries on the variable fields fast.
Read PostgreSQL's JSONB documentation for the operator surface (->, ->>, @>, ?, ?&, ?|) and the indexing strategy. A jsonb_path_ops GIN index serves containment queries (@>); the key-existence operators (?, ?&, ?|) need the default jsonb_ops operator class, and a filter on a ->> extraction needs a B-tree expression index on that path. Combined, these let you run queries that touch JSONB fields at near-relational speed for cardinalities up to a few hundred million rows. Beyond that scale, the trade-offs shift and a dedicated document store may win, but most workloads never get there.
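The hybrid layout can be sketched roughly as follows. The table name, column names, and the `split_record` helper are all hypothetical, and the SQL is held in strings so the snippet runs without a database. The `%s` placeholder assumes a driver that uses that parameter style (psycopg does). The point is only to show where the line between typed columns and the JSONB column falls, not to prescribe a schema.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical table: stable fields are typed, validated columns;
# everything that genuinely varies goes into one JSONB column.
CREATE_TABLE = """
CREATE TABLE transactions (
    id           BIGINT PRIMARY KEY,
    merchant_id  BIGINT NOT NULL,
    amount_paise BIGINT NOT NULL CHECK (amount_paise >= 0),
    created_at   TIMESTAMPTZ NOT NULL,
    extras       JSONB NOT NULL DEFAULT '{}'::jsonb
);
"""

# A jsonb_path_ops GIN index serves containment (@>) queries on `extras`.
CREATE_GIN_INDEX = """
CREATE INDEX transactions_extras_gin
    ON transactions USING GIN (extras jsonb_path_ops);
"""

# Containment filter on the varying part, with ->> to pull one value out as text.
FIND_BY_EXTRAS = """
SELECT id, amount_paise, extras ->> 'gateway_ref' AS gateway_ref
FROM transactions
WHERE extras @> %s::jsonb;
"""

STABLE_COLUMNS = ("id", "merchant_id", "amount_paise", "created_at")


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], str]:
    """Split an incoming record into typed column values and a JSON string
    for the `extras` column. Missing stable fields are rejected up front."""
    missing = [c for c in STABLE_COLUMNS if c not in record]
    if missing:
        raise KeyError(f"missing stable fields: {missing}")
    columns = {c: record[c] for c in STABLE_COLUMNS}
    extras = {k: v for k, v in record.items() if k not in STABLE_COLUMNS}
    return columns, json.dumps(extras, sort_keys=True)
```

A caller would pass `json.dumps({"gateway": "upi"})` as the parameter to `FIND_BY_EXTRAS`. The stable fields stay under the database's constraints, while new optional fields can appear in `extras` without a migration.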
The Aadhaar lesson
The UIDAI's Aadhaar identity infrastructure famously uses MongoDB for parts of its pipeline (alongside HDFS, Hadoop, and a relational core). This is sometimes cited as proof that MongoDB scales to a billion-user identity system. The honest reading is more nuanced: UIDAI uses MongoDB for the parts of the pipeline where the data genuinely is variable (biometric metadata, audit trails, demographic enrolment forms in 22 official languages with different field sets), and uses relational systems for the parts where the data is rectangular (the canonical identity record itself). They split the workload by data shape. They also have a team large enough to enforce schema discipline socially — which most teams do not.
The general lesson: scale alone does not justify document databases; data shape does. If your data is mostly rectangular, the rectangular database is the right tool regardless of scale. If your data is genuinely tree-shaped and varies per record, the document database is the right tool regardless of scale. The trap is choosing the document database for rectangular data because of perceived flexibility benefits — and then paying the integral.
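One rough way to put a number on "data shape" is to sample records and measure how often each field appears. The sketch below does that for top-level keys only. The 0.95 threshold and the function names are assumptions made for illustration, and a real assessment would also need to look at nested structure and value types.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def field_presence(docs: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Fraction of sampled documents in which each top-level field appears."""
    sample = list(docs)
    if not sample:
        return {}
    counts = Counter(key for doc in sample for key in doc)
    return {key: n / len(sample) for key, n in counts.items()}


def classify_fields(
    docs: Iterable[Dict[str, Any]],
    stable_threshold: float = 0.95,  # arbitrary cut-off, tune per dataset
) -> Tuple[List[str], List[str]]:
    """Split fields into (stable, varying) by how often they are present."""
    presence = field_presence(docs)
    stable = sorted(k for k, p in presence.items() if p >= stable_threshold)
    varying = sorted(k for k, p in presence.items() if p < stable_threshold)
    return stable, varying


if __name__ == "__main__":
    sample = [
        {"id": 1, "amount": 100},
        {"id": 2, "amount": 250, "refund_amount": 50},
        {"id": 3, "amount": 75},
    ]
    stable, varying = classify_fields(sample)
    print("stable:", stable)
    print("varying:", varying)
```

If nearly every field lands in the stable list, the data is rectangular in the sense used here. A long varying list with no dominant set of stable fields points the other way.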
References
- Pat Helland — If You Have Too Much Data, then "Good Enough" Is Good Enough — the canonical essay on schema discipline at scale; lossy formats, eventual consistency, and the limits of perfect data hygiene.
- Sarah Mei — Why You Should Never Use MongoDB — the influential 2013 critique of naive document modelling; the social-network case study that taught a generation what not to do with embedding.
- MongoDB documentation — Schema Validation — the canonical reference for $jsonSchema validators, validationLevel, and validationAction.
- MongoDB documentation — Building with Patterns: The Schema Versioning Pattern — the official treatment of versioned-document migration; one of twelve patterns in MongoDB's data-modelling guide.
- PostgreSQL documentation — JSON Types and JSONB — the canonical reference for JSONB storage, operators, and the GIN indexing strategy that makes Postgres a pragmatic relational+document hybrid.
- Martin Kleppmann — Designing Data-Intensive Applications, Chapter 2: Data Models and Query Languages — the deepest single treatment of the document-vs-relational tradeoff; the schema-on-read vs schema-on-write framing this chapter borrows.