Schema evolution across the CDC boundary
A Razorpay backend engineer ships a migration on Tuesday at 3 p.m. that adds a gst_invoice_id column to public.payments. The migration runs in production in 40 milliseconds — a clean ALTER TABLE. Nothing pages. The engineer goes home. By Thursday morning the analytics team finds that the Iceberg payments table in the lakehouse hasn't received a new row in 38 hours, the Snowflake mirror is showing rows where gst_invoice_id is silently missing, and a downstream Flink job for fraud detection has been crash-looping. None of those three breakages are bugs in the database. They are all the same bug: a schema change crossed the CDC boundary and the pipeline didn't know what to do with it.
A CDC pipeline has three places where schema lives — the source database, the change-event payload, and the sink table — and a schema change in the source races against all three. Every CDC system either freezes the schema (rejects incompatible changes), evolves it forward (adds columns, never removes), or breaks. Production-grade pipelines pin the source-of-truth in a schema registry, classify every change as backward-compatible or breaking, and route incompatible changes through a manual upgrade gate.
The three places schema lives
In a non-CDC world, schema lives in one place — the database that owns the table. The application reads INFORMATION_SCHEMA, the ORM regenerates types, life is fine. CDC adds two more places where the same schema must agree, and those places do not update atomically.
The source database is the only place that updates atomically with the ALTER. Everything downstream is eventually consistent, which on a quiet table can mean eventually is hours away. Why this is fundamental and not solvable by being clever: a CDC connector cannot decode WAL or binlog entries that reference a column the connector hasn't yet learned about. Debezium for Postgres maintains its own in-memory schema cache built from ALTER TABLE events that flow through the WAL alongside INSERT and UPDATE events. If no row is written after the ALTER, the connector never decodes any WAL entry that mentions the new column, so the registry is not updated. The window is not a bug — it's the geometry of replication-log decoding.
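The decode-time nature of that window can be shown with a toy model. This is not Debezium's implementation; it is a minimal sketch under the assumption that a connector learns schema only from entries it actually decodes out of the replication log.

```python
# Toy model: the connector's column cache updates ONLY when it decodes a
# log entry that carries schema information. A quiet table means no decode,
# which means a stale cache -- however long the quiet period lasts.
class ToyConnector:
    def __init__(self, columns):
        self.schema_cache = list(columns)  # learned at snapshot time

    def decode(self, entry):
        if entry["kind"] == "ddl_add_column":
            self.schema_cache.append(entry["column"])
        return list(self.schema_cache)

connector = ToyConnector(["id", "amount", "status", "created_at"])

# Tuesday 3 p.m.: the ALTER lands in the log. The source schema now has five
# columns, but no row write follows, so the connector decodes nothing.
log = [{"kind": "ddl_add_column", "column": "gst_invoice_id"}]
assert "gst_invoice_id" not in connector.schema_cache  # stale for 38 hours

# Thursday: the first write after the ALTER forces the backlog to be decoded,
# and only now does the cache catch up.
for entry in log:
    connector.decode(entry)
assert "gst_invoice_id" in connector.schema_cache
```

The asserts are the point: between the ALTER and the next decoded entry, the source and the connector disagree, and nothing pages.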
The three locations need to agree on three things: the set of columns, the type of each column, and the nullability of each column. The taxonomy of schema changes follows directly: you can change the set, you can change the types, you can change nullability, and any combination of these.
The compatibility taxonomy
Not every schema change is created equal. The schema-registry world (Confluent's, Apicurio's, AWS Glue's — they all converge on the same model) classifies changes into compatibility modes that determine whether a change can flow through a CDC pipeline without breaking consumers.
| Change | Backward | Forward | Full |
|---|---|---|---|
| Add column with a default | OK | OK | OK |
| Add column without default | Break | OK | Break |
| Drop column | OK | Break | Break |
| Rename column | Break | Break | Break |
| Widen type (int32 → int64) | OK | Break | Break |
| Narrow type (int64 → int32) | Break | Break | Break |
| Change NOT NULL to nullable | OK | Break | Break |
| Change nullable to NOT NULL | Break | OK | Break |
Backward compatible means new readers can read old data. This is the default in most schema registries: an Iceberg table whose schema added gst_invoice last week can still read parquet files written before the column existed (those rows just get NULL). Forward compatible means old readers can read new data — a frozen Flink job with last week's schema can still parse a payload that has the new column (it ignores the unknown field). Full compatibility is both — required when a pipeline has heterogeneous consumer versions running simultaneously.
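The table's three columns fall out of one asymmetry in Avro's field-resolution rules: a reader ignores unknown writer fields, but a reader field missing from the writer needs a default. A minimal sketch of the check, covering field-set changes only (real registries implement the full Avro resolution rules, including type promotion):

```python
def can_read(reader, writer):
    """Can data written with `writer` be decoded by `reader`?
    Simplified Avro rule: writer-only fields are ignored; reader-only
    fields must carry a default."""
    writer_names = {f["name"] for f in writer["fields"]}
    return all(
        f["name"] in writer_names or "default" in f
        for f in reader["fields"]
    )

def classify(old, new):
    backward = can_read(new, old)  # new reader, old data
    forward = can_read(old, new)   # old reader, new data
    return {"backward": backward, "forward": forward,
            "full": backward and forward}

v1 = {"fields": [{"name": "id"}, {"name": "status"}]}
add_default = {"fields": v1["fields"] + [{"name": "gst_invoice", "default": None}]}
add_required = {"fields": v1["fields"] + [{"name": "gst_invoice"}]}
drop_status = {"fields": [{"name": "id"}]}

assert classify(v1, add_default) == {"backward": True, "forward": True, "full": True}
assert classify(v1, add_required) == {"backward": False, "forward": True, "full": False}
assert classify(v1, drop_status) == {"backward": True, "forward": False, "full": False}
```

Note what the last assert says: a plain drop passes the backward check on its own, which is why the stricter modes exist.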
The two changes that cause the most damage regardless of mode: drop column and rename column. (Strictly, a BACKWARD-only registry check accepts a drop — the new reader simply ignores the removed field — but every old reader and every dashboard built on the column still breaks; FORWARD and FULL reject it.) Renames are particularly nasty because to the database they are atomic — ALTER TABLE payments RENAME COLUMN status TO state is one DDL statement — but to the CDC system they look like a drop-and-add, and worse, the connector emits the new column with an all-NULL history because it has never seen state before. Why renames are the biggest source of "the pipeline silently lies" incidents: the connector can't tell semantically that the new column's data should come from the old column. It treats them as unrelated. Snowflake, Iceberg, and Delta all silently accept the new schema. Then dashboards built on status quietly stop receiving rows. The bug surfaces when someone notices "wait, why has status been NULL for everyone for the past three days".
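The rename problem is visible in miniature if you diff the before and after column sets, which is all the pipeline ever sees. A sketch (`diff_columns` is a hypothetical helper, not any connector's API):

```python
# A schema diff has no memory of intent: RENAME COLUMN status TO state
# arrives as one drop plus one add, with no link between the two.
def diff_columns(old, new):
    return {"dropped": sorted(set(old) - set(new)),
            "added": sorted(set(new) - set(old))}

before = ["id", "amount", "status", "created_at"]
after = ["id", "amount", "state", "created_at"]  # the rename, post-ALTER

print(diff_columns(before, after))
# → {'dropped': ['status'], 'added': ['state']}
```

Nothing in that output says "backfill `state` from `status`", which is exactly the information the sink would need.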
The four strategies a CDC pipeline picks from
Once you accept that schema changes happen and the boundary is racy, the question becomes: how does the pipeline respond? Every production CDC system in 2026 picks one of four strategies, often a different strategy per sink.
Strategy 1 — Freeze. Reject any schema change at the registry boundary. The pipeline is configured so the connector cannot push new schemas on its own — the subject pinned to the registry's strictest transitive mode, plus auto.register.schemas = false on the producing serialiser for a hard freeze. (Confluent's NONE level, despite the name, does the opposite: it disables compatibility checking entirely.) The first schema change causes Debezium (or the upstream producer) to fail loudly: "schema X cannot be registered, conflicts with Y". The DBA's ALTER TABLE is not blocked at the source — the source is still happily mutating — but the CDC connector enters a failed state. This forces a human to intervene. In a high-trust org with strong DBA-data-engineer collaboration, this is the right strategy: it makes schema change a planned event, not a surprise. Stripe famously runs this way.
Strategy 2 — Auto-evolve forward only. The pipeline accepts add-column changes, refuses everything else. This is what a FULL compatibility setting (BACKWARD alone would also wave through drops) plus a sink writer like Iceberg's merge-schema=true does. New columns appear in the sink as NULL-filled extensions; old data remains queryable. Drops, renames, type narrowings still trigger a hard failure. This is the default for the Razorpay/Flipkart/Swiggy class of org — most schema changes in their experience are additive (new feature ships → new column).
Strategy 3 — Replay through a transformation. Instead of evolving the sink, ship every row through a translation layer that maps the source schema to a stable contract schema. The contract sits in a separate registry; the source can churn but the contract evolves on its own (much slower) cadence. This is what data contracts (chapter 33) and the dbt source layer do. The cost is one more hop and the discipline of maintaining the contract; the benefit is that 90% of schema changes never surface to consumers.
Strategy 4 — Break, alert, fix. No automatic evolution at all. Every schema change requires a manual cutover: stop the connector, stop downstream consumers, update sink schema, restart everything. Used by orgs that run CDC for compliance/audit (every change must be approved by a human) or by teams whose data lake schema is itself a compliance artefact subject to quarterly review. Slow, careful, expensive — but the schema state is always known.
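Each strategy ultimately lands as a per-subject setting on the registry. The sketch below assumes Confluent's REST shape (PUT /config/{subject}); the strategy-to-mode mapping is this chapter's framing rather than anything official, and the hard-freeze case also needs auto.register.schemas = false on the producing serialiser.

```python
# Map each of the four strategies to a per-subject compatibility level.
# Illustrative mapping, not an official taxonomy.
STRATEGY_TO_MODE = {
    "freeze": "FULL_TRANSITIVE",         # strictest check across all versions;
                                         # pair with auto.register.schemas=false
    "auto_evolve_forward_only": "FULL",  # add-with-default passes, drops fail
    "contract_layer": "BACKWARD",        # churn absorbed at the contract hop
    "break_alert_fix": "NONE",           # registry checks nothing; humans
                                         # gate every change out-of-band
}

def compatibility_request(registry_url, subject, strategy):
    """Build the (url, body) pair for Confluent's PUT /config/{subject}."""
    return (f"{registry_url}/config/{subject}",
            {"compatibility": STRATEGY_TO_MODE[strategy]})

url, body = compatibility_request(
    "http://schema-registry:8081", "razorpay.public.payments-value", "freeze")
print(url)   # http://schema-registry:8081/config/razorpay.public.payments-value
print(body)  # {'compatibility': 'FULL_TRANSITIVE'}
```

The point of writing it down as a table: the strategy decision is one PUT per subject, which makes it auditable and diffable in the same repo as the connector config.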
Build it: a Debezium + Schema Registry pipeline that survives an ALTER TABLE
Let's wire the moving parts together. A small Python script that talks to Confluent's Schema Registry, registers a baseline schema, simulates a Debezium-style update with an added column, and shows what the registry does at each compatibility setting.
```python
# schema_evolution_demo.py — exercise the schema-registry boundary that sits
# between a Debezium connector emitting Razorpay payments events and the
# Iceberg sink that consumes them. Uses confluent-kafka-python's
# SchemaRegistryClient.
import json

from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

sr = SchemaRegistryClient({"url": "http://schema-registry:8081"})
subject = "razorpay.public.payments-value"

# Set the compatibility mode for this subject. Most lakehouse pipelines run
# BACKWARD, but BACKWARD alone would *accept* the drop-column change below
# (new readers simply ignore a removed field). FULL checks both directions,
# which is what protects still-running old readers from drops.
sr.set_compatibility(subject_name=subject, level="FULL")

# v1: the schema as it stands on Tuesday morning. Fields match the Postgres
# table the Debezium connector is mirroring.
v1_schema = {
    "type": "record", "name": "Payment", "namespace": "razorpay.public",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "amount", "type": "long"},      # amount in paise
        {"name": "status", "type": "string"},    # captured / failed / refunded
        {"name": "created_at", "type": "long"},  # epoch micros
    ],
}
v1_id = sr.register_schema(subject, Schema(json.dumps(v1_schema), "AVRO"))
print(f"v1 registered as schema id {v1_id}")

# Tuesday 3 p.m. — DBA runs ALTER TABLE payments ADD COLUMN gst_invoice TEXT;
# The next write produces a Debezium event with the new field. The connector
# computes a new Avro schema and asks the registry to register it.

# CASE A — compatible: the new field is nullable AND has a default. Old
# consumers can still read events written under v2 because they ignore
# unknown fields; new consumers reading old events get the default for the
# missing field.
v2_compatible = {
    "type": "record", "name": "Payment", "namespace": "razorpay.public",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "amount", "type": "long"},
        {"name": "status", "type": "string"},
        {"name": "created_at", "type": "long"},
        {"name": "gst_invoice", "type": ["null", "string"], "default": None},
    ],
}
v2_id = sr.register_schema(subject, Schema(json.dumps(v2_compatible), "AVRO"))
print(f"v2 (add-with-default) registered as schema id {v2_id}")

# CASE B — incompatible: drop a column. Old readers still expect status, so
# the forward half of the FULL check fails and the registry rejects.
v3_breaking = {
    "type": "record", "name": "Payment", "namespace": "razorpay.public",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "amount", "type": "long"},
        # status removed — incompatible because old readers expect it
        {"name": "created_at", "type": "long"},
    ],
}
try:
    sr.register_schema(subject, Schema(json.dumps(v3_breaking), "AVRO"))
except Exception as e:
    print(f"v3 (drop-column) REJECTED: {e}")
```
A representative run (schema ids and the exact error wording vary by registry version):

```
v1 registered as schema id 4471
v2 (add-with-default) registered as schema id 4472
v3 (drop-column) REJECTED: Schema being registered is incompatible
with an earlier schema for subject "razorpay.public.payments-value", details:
[{errorType:'READER_FIELD_MISSING_DEFAULT_VALUE', description:'The field
status at path /fields/2 in the old schema has no default value and is
missing in the new schema'}]
```
Walking through the lines that earn their place:

- `sr.set_compatibility(...)` — the policy decision. Setting this once per subject means every subsequent `register_schema` call validates against that rule. `BACKWARD` is what most lakehouse sinks want for reads; `FULL` adds the forward check that protects still-running old readers (and is what rejects the drop), which is also what real-time serving systems with version skew want.
- `{"name": "gst_invoice", "type": ["null", "string"], "default": None}` — Avro's union of `null` and `string` makes the field nullable; the `"default": None` is what makes the change compatible. Without the default, the registry rejects the schema even though the field is nullable. This is a frequent confusion: nullable is not enough, it must have a default.
- `v3_breaking` — drops `status`. The registry's response includes the exact incompatibility reason and the path. Production pipelines log this output to a structured log so the on-call engineer at 3 a.m. sees "drop-column at path /fields/2" rather than "schema-registry returned 409".
- The error contract is HTTP: Confluent's registry is a REST API; all clients (Java, Python, Go) translate `409 Conflict` into a typed exception. Why this matters operationally: Debezium's behaviour on registry failure depends on the value of `errors.tolerance` in the Connect worker config. The default `none` halts the connector — exactly what a Strategy-1 (freeze) deployment wants. Setting it to `all` would silently drop the event and continue, which is never what you want for a CDC source.
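The structured-log point is easy to make concrete. A hedged sketch: `classify_rejection` is a hypothetical helper, and the regexes assume the error text resembles the representative run above.

```python
import re

def classify_rejection(subject, error_text):
    """Reduce a registry rejection message to a structured log record."""
    path = re.search(r"(/fields/\d+)", error_text)
    kind = re.search(r"errorType:'([A-Z_]+)'", error_text)
    return {
        "event": "schema_rejected",
        "subject": subject,
        "error_type": kind.group(1) if kind else None,
        "path": path.group(1) if path else None,
    }

record = classify_rejection(
    "razorpay.public.payments-value",
    "[{errorType:'READER_FIELD_MISSING_DEFAULT_VALUE', description:'The field "
    "status at path /fields/2 ...'}]",
)
print(record["error_type"], record["path"])
# → READER_FIELD_MISSING_DEFAULT_VALUE /fields/2
```

Emit the dict as a JSON log line and the 3 a.m. pager message carries the field path instead of a raw HTTP body.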
The script is ~50 lines. The same structure scales to a real pipeline: replace the in-script schema with the schema Debezium emits, replace the print statements with metrics, and you have the schema-evolution gate of a production CDC pipeline.
What sinks actually do when the schema changes
The registry decision is half the story. The other half is what the sink writer does with a new schema. Different sinks have radically different defaults.
- Apache Iceberg. Native schema evolution. `ALTER TABLE ... ADD COLUMN` is a metadata-only operation; existing parquet files are read with the new column treated as `NULL`. Drops are also metadata-only; the column data still sits in old files but is hidden from reads. This is why Iceberg is the lakehouse format of choice for CDC sinks — it tolerates churn natively.
- Snowflake. `MERGE INTO` against a streamed table fails on schema mismatch unless `MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE` is set. With it on, new columns are added automatically (with explicit grants); without it, the next merge fails. Most Snowflake-on-Debezium pipelines use Snowpipe Streaming with a connector that auto-evolves.
- BigQuery. Auto-detection on by default in the BigQuery streaming insert API. New columns appear as `NULLABLE` in the table. Drops are not supported via streaming insert — you must run an explicit `ALTER TABLE` manually.
- ClickHouse. Strict by default. The Kafka engine table fails on extra fields unless you use `Kafka(... format='Avro')` with `input_format_avro_allow_missing_fields = 1`. This is why ClickHouse-on-Debezium is more brittle than Iceberg-on-Debezium.
- Postgres mirror via Debezium → Kafka → JDBC sink connector. `auto.evolve = true` in the JDBC sink connector adds columns automatically (using `ALTER TABLE`); drops are never automatic. This is the most common "vanilla" mirror configuration.
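That "vanilla" mirror can be written down as a Kafka Connect sink config. The property names below are the Confluent JDBC sink connector's; the hostnames, topic, and database are illustrative.

```python
# Kafka Connect JDBC sink config for a Postgres mirror that auto-evolves
# additively. Submitted as JSON to the Connect REST API in practice.
jdbc_sink_config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "razorpay.public.payments",
    "connection.url": "jdbc:postgresql://mirror-db:5432/analytics",
    "insert.mode": "upsert",        # CDC updates become upserts on the mirror
    "pk.mode": "record_key",        # primary key taken from the Kafka key
    "auto.create": "true",          # create the table on first sight
    "auto.evolve": "true",          # issue ALTER TABLE ... ADD COLUMN for
                                    # new fields; drops/renames never applied
}
```

Note the asymmetry baked into `auto.evolve`: it only ever adds columns, which is exactly Strategy 2's forward-only evolution.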
The mismatch between source and sink schema-change semantics is where most "subtle drift" bugs live. A Postgres source supports RENAME COLUMN. The Iceberg sink supports RENAME COLUMN (it is metadata only). But the path between them — Debezium → Kafka → Iceberg writer — has no concept of "rename"; it sees a drop and an add. So a rename on the source becomes a drop and an add on the sink. Reports built on the old name return zeros.
Common confusions
- "Backward compatibility means new code can read old data." Yes — and that is all it guarantees. It does not promise that old code can keep reading new data; that is forward compatibility, and a v1 consumer doesn't disappear the moment v2 is registered. If lagging consumers exist in production, you need `FORWARD` or `FULL`, not just the headline default. Read the registry's compatibility rules end to end.
- "Adding a nullable column is always safe." Not in Avro without a default. The registry rejects `{"name": "x", "type": ["null", "string"]}` without an explicit `"default": null` even though the field allows null at runtime. This catches teams who used JSON Schema's looser semantics and assumed Avro behaved the same way.
- "Schema Registry guarantees end-to-end safety." It guarantees compatibility between schema versions. It does not guarantee that a sink's storage format can apply the new schema, that a downstream Flink job's UDF still parses correctly, or that a BI dashboard will show the new column. The registry is one gate; you still need the sink's own evolution policy.
- "Debezium auto-evolves the sink schema." Debezium emits change events. The sink connector (Iceberg sink, JDBC sink, Snowflake sink) is what evolves the destination schema. A naive Debezium-only pipeline that writes raw envelopes to Kafka does no sink evolution at all.
- "Renames are just drop + add." Mechanically yes, semantically no. To the database the rename preserves data; to CDC consumers it looks like the column was deleted and a new column appeared with an all-NULL history. Treat renames as breaking changes that require a manual two-phase migration: add new column, dual-write, backfill, then drop old.
- "`FULL` compatibility is always best." It's the most restrictive — changes that work fine under `BACKWARD` (drops included) are rejected under `FULL` because they aren't forward-compatible. Use `FULL` when you have heterogeneous consumer versions in production simultaneously; otherwise `BACKWARD` covers 90% of real lakehouse use cases.
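The two-phase rename from the bullet above, spelled out for the running example. The statements show the pattern, not a drop-in script; each phase should be verified before the next begins, and the dual-write window typically spans days.

```python
# The rename status -> state, done so that no CDC consumer ever sees a
# drop-and-add. Phases 2 and 4 are application/consumer changes, not DDL.
RENAME_MIGRATION = [
    ("1. add",       "ALTER TABLE payments ADD COLUMN state TEXT"),
    ("2. dual-write", "-- application writes BOTH status and state"),
    ("3. backfill",  "UPDATE payments SET state = status WHERE state IS NULL"),
    ("4. cut over",  "-- consumers and dashboards move to state"),
    ("5. drop",      "ALTER TABLE payments DROP COLUMN status"),
]

for phase, step in RENAME_MIGRATION:
    print(phase, "|", step)
```

Only at phase 5 does the CDC stream carry a drop, and by then nothing downstream reads `status` any more.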
Going deeper
The dbhistory topic and decoding old events
Debezium's per-connector schema-history topic stores every ALTER TABLE event keyed by LSN (or binlog position). When a consumer needs to decode a six-month-old event from the change-events topic, it reads the schema id embedded in the message itself (the serialiser prefixes every payload with it), fetches the full Avro schema from the registry, and decodes. This works as long as the registry has retained that schema version — which it does forever by default. The schema-history topic, in contrast, exists for the connector itself to rebuild its in-memory state on restart; consumers never read it. The two are often confused. Read /wiki/debeziums-architecture for where each lives.
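The schema-id lookup is concrete enough to show. Confluent's serialisers use a fixed wire format: one magic byte (0), a 4-byte big-endian schema id, then the raw Avro bytes. The payload below is a placeholder, not real Avro.

```python
# Parse the Confluent wire format: magic byte 0x00, then a 4-byte big-endian
# schema id, then the Avro-encoded record. The id is what the consumer sends
# to GET /schemas/ids/{id} on the registry before decoding.
import struct

def parse_wire_format(message_bytes):
    magic, schema_id = struct.unpack(">bI", message_bytes[:5])
    if magic != 0:
        raise ValueError("not Confluent wire format")
    return schema_id, message_bytes[5:]

# A v2 payment event tagged with schema id 4472, as in the demo's output:
frame = b"\x00" + (4472).to_bytes(4, "big") + b"<avro-encoded-payment>"
schema_id, payload = parse_wire_format(frame)
print(schema_id)  # → 4472
```

This is why a six-month-old event stays decodable: the id travels with every record, and the registry never forgets a registered version unless told to.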
How Razorpay rolls schema changes in production
The pattern most Indian payments-scale orgs converge on, learned from outages: all schema changes are dual-coded. Step 1, the backend team adds the new column with a default; the migration ships during a low-traffic window. Step 2, the data-platform team validates the registry has accepted the v2 schema and the sink has evolved (this is automatic for Iceberg, manual for ClickHouse). Step 3, the backend team starts writing data to the new column. Step 4, downstream consumers are notified via the data-contract registry that v2 is available. The whole rollout takes 7–10 days for a column that the backend could have shipped on Tuesday afternoon. The friction is intentional — it's what prevents the Thursday-morning fire-drill.
Schema evolution under exactly-once with transactional sinks
Build 9's exactly-once story interacts with schema evolution in a subtle way. A transactional Kafka sink commits a batch atomically — but what if half the batch is encoded under schema v1 and half under v2 (because the schema rotated mid-batch)? The Kafka transactional protocol handles this fine because Avro's schema-id is per-record, not per-batch. But a Flink job's checkpoint that covers a state migration across the schema change is harder: the operator state has to know how to upgrade itself when it reads a v2 record using v1's deserialiser. Flink's TypeSerializerSnapshot is the mechanism; it is one of the gnarliest parts of Flink to get right. See /wiki/checkpointing-the-consistent-snapshot-algorithm.
Why most production teams pin the Debezium schema-name
By default, Debezium derives the Avro schema name from the source table — razorpay.public.Payment. If the source table is renamed (say, payments → txns), the schema name changes and the registry treats it as a brand-new subject. Every downstream consumer breaks. The mitigation is transforms.unwrap.add.headers plus an explicit schema.name.adjustment.mode=avro and a fixed topic.naming.strategy. Boring but load-bearing config.
Where this leads next
The next chapter, /wiki/snapshot-cdc-the-bootstrapping-problem, takes the snapshot phase from a different angle: not "how does Debezium do it" but "what does a sink need to receive to safely apply both snapshot rows and streaming rows without double-counting". Schema evolution interacts with bootstrapping in one specific way — the snapshot reads under the current schema, but the streaming events from before the snapshot (if you bootstrap from a position with WAL retention) are encoded under the historical schema. The registry holds both.
After that, Build 12 picks up the lakehouse story: /wiki/iceberg-internals-snapshot-isolation-on-s3 shows what schema evolution looks like inside Iceberg's metadata tree — every schema is a versioned manifest, every snapshot points to a specific schema-id.
For the contract-layer strategy, /wiki/data-contracts-the-producer-consumer-boundary has the full design: how to define a contract that's stable even when the source schema churns daily.
References
- Confluent: Schema Registry compatibility documentation — the canonical reference for `BACKWARD`, `FORWARD`, and `FULL` modes with worked examples for every Avro change type.
- Apache Avro specification — schema resolution — the precise rules for what counts as compatible. Read once, refer back often.
- Debezium documentation: schema history and DDL — how Debezium tracks and replays schema changes per connector, and the failure modes when the history topic is misconfigured.
- Apache Iceberg: schema evolution — Iceberg's table-format-level evolution rules, distinct from but interacting with the Avro registry rules above.
- Mark Needham: "Five things I wish I knew about Debezium and schema evolution" — operational gotchas from running Debezium-on-Postgres at scale.
- /wiki/debeziums-architecture — the connector architecture chapter that this one extends.
- /wiki/data-contracts-the-producer-consumer-boundary — the contract-layer strategy in detail.
- /wiki/postgres-logical-decoding-from-scratch — what a Postgres `ALTER TABLE` looks like in WAL, which is what Debezium decodes to learn about a schema change.