Cursors, updated_at columns, and their lies

Aditi's high-water mark survived its first month. The transactions pipeline runs in 4 minutes a night, the cursor advances cleanly, the CFO has stopped messaging her on Slack at 09:00. Then her counterpart on the merchants team — Karan — copies the same pattern for the merchants table, swaps id for updated_at because merchants get edited (not just inserted), and ships it. Six weeks later a finance reconciliation flags ₹14 lakh of merchant-tier revenue assigned to the wrong tier. The bug is in Karan's cursor. The column was lying to him, and updated_at lies in four distinct ways that every senior data engineer eventually learns to spot.

An updated_at column looks like the obvious cursor for tables with UPDATEs, but it is set by the application — which means out-of-order commits, missing-update bugs, in-place rewrites, and clock skew all silently move rows behind the cursor. The fix is not "use updated_at more carefully"; it is to know which lie applies to your source and pick a different mechanism — commit timestamps, version columns, CDC, or trigger-maintained shadow timestamps.

Why updated_at looks like the right answer

When the source table has only INSERTs, an auto-increment id or a server-side created_at is enough. The previous chapter covered both. The trouble starts when the source has UPDATEs — and almost every business table has UPDATEs. A merchant changes their tier from Silver to Gold. A customer changes their address. An order's status moves from pendingpaidshippeddelivered. The pipeline must see the new value, not just the old.

The naive fix is intuitive: add an updated_at TIMESTAMP DEFAULT now() column to the source, set up an ON UPDATE trigger so it advances on every row change, and use it as the cursor. The pipeline becomes SELECT * FROM merchants WHERE updated_at > :mark. New rows get a fresh updated_at at insert. Edits get a fresh one at update. The cursor advances. The problem appears solved.

The naive updated_at cursorA diagram showing a merchants table on the left with columns id, name, tier, updated_at, an arrow labelled "updated_at > mark" pointing to a pipeline box, and the pipeline writing rows into a warehouse box on the right. The cursor's last_value is shown in a small box above.merchants (source)id, name, tier,updated_attrigger: bumpupdated_at on UPDATEcursorlast_value = '2026-04-24 23:55:00'pipelineupdated_at > markMERGE into destwarehouse_merchantsUPSERT on idlooks consistentuntil it doesn't
The naive shape — and the shape that 90% of teams ship in their second incremental pipeline. The mechanism is correct. The column is the lie.

Why this looks correct: the contract updated_at > mark does describe a monotonic frontier of "rows whose latest mutation happened after my last run", and a MERGE on the destination's primary key absorbs duplicates. On a single-writer source with no clock skew, no long transactions, and a perfectly-honest application that bumps updated_at on every mutation, this works. The four lies in the next four sections are about the gap between "perfectly-honest application" and the application you actually have in production.

The shape Karan shipped on his merchants pipeline:

# merchants_incremental.py — the naive updated_at cursor
import os, datetime as dt
import psycopg2
from psycopg2.extras import execute_batch

PIPELINE = "merchants_to_warehouse"

def get_mark(cur) -> dt.datetime:
    cur.execute("SELECT last_value FROM pipeline_state WHERE pipeline_name = %s",
                (PIPELINE,))
    row = cur.fetchone()
    if row is None:
        raise RuntimeError(f"no cursor for {PIPELINE}; refuse to default to zero")
    return row[0]

def fetch_changes(cur, since: dt.datetime, batch: int = 10_000):
    cur.execute(
        """SELECT id, name, tier, gst_state, updated_at
             FROM merchants
            WHERE updated_at > %s
            ORDER BY updated_at ASC
            LIMIT %s""",
        (since, batch),
    )
    return cur.fetchall()

def main():
    src = psycopg2.connect(os.environ["SOURCE_DSN"])
    dst = psycopg2.connect(os.environ["DEST_DSN"])
    with src.cursor() as scur, dst.cursor() as dcur:
        mark = get_mark(dcur)
        rows = fetch_changes(scur, mark)
        if not rows:
            print(f"no changes since {mark.isoformat()}")
            return
        execute_batch(dcur,
            """INSERT INTO warehouse_merchants
                  (id, name, tier, gst_state, updated_at)
               VALUES (%s, %s, %s, %s, %s)
               ON CONFLICT (id) DO UPDATE SET
                  name        = EXCLUDED.name,
                  tier        = EXCLUDED.tier,
                  gst_state   = EXCLUDED.gst_state,
                  updated_at  = EXCLUDED.updated_at
               WHERE EXCLUDED.updated_at > warehouse_merchants.updated_at""",
            rows, page_size=1000)
        new_mark = max(r[4] for r in rows)
        dcur.execute(
            """INSERT INTO pipeline_state (pipeline_name, last_value, updated_at)
               VALUES (%s, %s, now())
               ON CONFLICT (pipeline_name) DO UPDATE
                  SET last_value = EXCLUDED.last_value,
                      updated_at = EXCLUDED.updated_at""",
            (PIPELINE, new_mark))
        dst.commit()
    print(f"loaded={len(rows)} old={mark.isoformat()} new={new_mark.isoformat()}")

if __name__ == "__main__": main()

A typical week of runs in Karan's logs:

2026-04-19T01:00:08Z loaded=4,219 old=2026-04-18T23:55:01 new=2026-04-19T00:59:48
2026-04-20T01:00:11Z loaded=3,907 old=2026-04-19T00:59:48 new=2026-04-20T00:59:52
2026-04-21T01:00:09Z loaded=4,144 old=2026-04-20T00:59:52 new=2026-04-21T00:59:54
2026-04-22T01:00:10Z loaded=    0 old=2026-04-21T00:59:54 new=2026-04-21T00:59:54
2026-04-23T01:00:12Z loaded=4,488 old=2026-04-21T00:59:54 new=2026-04-23T00:59:51

The Tuesday loaded=0 is the first hint of the first lie. The pipeline is healthy by every metric — the cursor advances, the destination accepts the rows, no errors. But loaded=0 on a working business day with 7,000+ merchant edits is the data engineer's equivalent of a smoke detector going off in the next room. Most teams ignore it for weeks. Karan ignored it for six.

Three lines in this code matter beyond the boilerplate. ORDER BY updated_at ASC ensures that when the batch is truncated by LIMIT, the cursor advances to the largest seen value without leaving a gap inside the batch. WHERE EXCLUDED.updated_at > warehouse_merchants.updated_at in the MERGE protects against an out-of-order replay overwriting a newer row in the warehouse with an older row from the source — a defensive net that becomes critical once Lie #1 starts firing. max(r[4] for r in rows) advances the cursor only to the largest timestamp actually returned, not to now() — the same discipline as the high-water mark chapter.

Lie #1: the application sets updated_at, the database commits

This is the same trap as the previous chapter, sharpened: the updated_at column is set by the application at row-mutation time, but the row becomes visible to the pipeline at commit time. With multiple writers, those two moments differ, and the difference is what the cursor will skip.

Concretely: thread A on the merchants service starts a transaction at 09:14:32.000, sets updated_at = '09:14:32.001234' on merchant M1, but its commit takes 850 ms because of a network blip. Thread B starts at 09:14:32.500, sets updated_at = '09:14:32.501100' on merchant M2, and commits in 50 ms. The merchant pipeline runs at 09:14:33.000, sees only M2 (because M1 is still uncommitted), advances the mark to 09:14:32.501100. At 09:14:33.150, M1 finally commits — its updated_at is 09:14:32.001234, half a second behind the mark. The cursor will never re-fetch it.

The Karan-on-merchants version: a manual UPDATE merchants SET tier = 'Gold' WHERE id IN (...) run by the operations team in a long-held transaction took 18 minutes to commit because the transaction was waiting on a deadlock with another UPDATE. Every merchant in the batch had updated_at set 18 minutes before the row became visible to the pipeline. The pipeline's mark, advancing every minute, was already 17 minutes ahead by the time the batch committed. Six weeks of intermittent merchants commits like that, scattered across maintenance windows and bulk operations, produced the ₹14 lakh tier-mismatch.

The fix is the same as in the high-water-mark chapter: switch the cursor column from an application timestamp to a database-assigned commit timestamp. Postgres calls this pg_xact_commit_timestamp(xmin), set by the database at commit time, monotonic in commit order. The cost is 5% extra WAL storage and a track_commit_timestamp = on flag that requires a server restart. The benefit is that this entire class of bug disappears.

-- the cursor query, post-fix
SELECT id, name, tier, gst_state,
       pg_xact_commit_timestamp(xmin) AS commit_ts
  FROM merchants
 WHERE pg_xact_commit_timestamp(xmin) > %s
 ORDER BY pg_xact_commit_timestamp(xmin) ASC
 LIMIT %s;

Why commit timestamps cost 5% extra storage: every Postgres heap tuple already records xmin (the inserting transaction id), and every WAL commit record already includes the commit's xact id. Enabling track_commit_timestamp adds an 8-byte timestamp to the commit record and maintains an in-memory pg_commit_ts SLRU cache so lookups are O(1). The 5% number is the per-row overhead in the cache, not the WAL increase, which is closer to 0.5% for OLTP workloads. The reason teams put off enabling it is operational, not technical — a DBA-coordinated restart on a production cluster needs a maintenance window, and the migration's payoff (no more silent skips) is invisible until the next reconciliation.

Lie #2: the application forgot to bump updated_at

The cursor's contract is "every mutation that should affect downstream sees a fresh updated_at". The contract is enforced by application code or a trigger. Both leak.

The application-code version: a junior engineer writes a hotfix that updates merchant tier directly via UPDATE merchants SET tier = ? WHERE id = ? — and forgets to also set updated_at = now(). The hotfix ships, fixes the bug, and quietly poisons the cursor. Every merchant updated by that code path keeps its old updated_at. The pipeline never sees them. They diverge from the warehouse silently — sometimes for months, until a reconciliation catches up.

The trigger version: an ON UPDATE trigger that bumps updated_at works for UPDATE, but not for INSERT ... ON CONFLICT DO UPDATE if the trigger is BEFORE UPDATE and the conflict path skips it (depends on the database; Postgres fires the trigger, MySQL's behaviour with ON DUPLICATE KEY UPDATE is more subtle). It also doesn't fire for TRUNCATE, for direct heap modifications via pg_repack reorganisations, or for replication-applied changes on read replicas if the pipeline reads from a replica.

The fix has two layers. The lower layer is a database-enforced invariant: a BEFORE UPDATE trigger that always sets NEW.updated_at = now() regardless of what the application supplied, so the application can't forget. The upper layer is a periodic audit: a daily query that compares count(*) WHERE updated_at > now() - interval '24 hours' between source and destination — if the source's count is materially higher than the pipeline loaded yesterday, something is poisoning the column.

CREATE OR REPLACE FUNCTION bump_updated_at() RETURNS trigger AS $$
BEGIN
  NEW.updated_at := now();   -- ignore whatever the app sent
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS bump_updated_at_t ON merchants;
CREATE TRIGGER bump_updated_at_t
  BEFORE UPDATE ON merchants
  FOR EACH ROW
  EXECUTE FUNCTION bump_updated_at();

The trigger is intentionally aggressive: it overwrites whatever the application supplied. This means a backfill script that wants to preserve original updated_at values has to disable the trigger explicitly. That friction is the feature, not the bug — the cursor's correctness is more valuable than the application's flexibility.

Lie #3: the row was rewritten in place; the cursor saw nothing

UPDATE isn't the only way rows change. Three operations rewrite rows without firing the application's updated_at logic and without firing trigger-based fallbacks the way the pipeline expects.

pg_repack and VACUUM FULL rewrite the heap to reclaim space. The new tuple has a new xmin, a new commit timestamp, but the same logical contents and the same updated_at. A pipeline reading pg_xact_commit_timestamp(xmin) will re-fetch every row in the table — a stealth full reload. A pipeline reading the application's updated_at will miss nothing here, because the column didn't change. The two cursor strategies have opposite failure modes for the same operation.

Logical replication conflict-resolution on a multi-master setup can rewrite rows with the winning replica's data and update xmin without re-running application triggers. The data-engineer-readable form: if your source is a Postgres BDR cluster or a Citus distributed table, the cursor strategy that worked on a single-node Postgres might silently miss reconciled rows.

Direct pg_dump + restore for environment migrations creates a snapshot where every row has xmin from the restore transaction and updated_at from the original row. After a migration, an xmin-based cursor sees every row as "new"; an updated_at-based cursor sees them all as "ancient". Both are wrong; both produce predictable problems.

The defence is to know which operations your DBAs run on the source and to either pause the pipeline during them or to plan the cursor's recovery from each. The pg_repack case is the most common in Indian production systems — every database-of-record team runs it monthly to reclaim space, and every analytics team has at least once been surprised by the resulting full-table re-scan that overwhelmed the warehouse.

The four lies of updated_at, side by sideA four-quadrant grid showing each lie: out-of-order commits, missing updates, in-place rewrites, and clock skew. Each quadrant has a small illustration and a one-line consequence.Lie #1: app-time vs commit-timecommits lastcursor advances past row that hasn't committedLie #2: app forgot to bumpUPDATE merchants SET tier=? WHERE id=?/* updated_at unchanged */row mutated; cursor never sees itLie #3: in-place rewritepg_repack / VACUUM FULL / pg_dump restorexmin moves, updated_at doesn'tcursor strategy chooses your failureLie #4: clock went backwardNTP steppost-step row has earlier ts than mark
The four lies. Three of them silently lose rows. One of them (Lie #3, depending on cursor strategy) silently re-loads the table. None of them throw an error.

Lie #4: the clock moved backward

NTP corrects time. Sometimes it does so by stepping the clock — including stepping it backward. A row inserted at wall-clock 01:23:45.000 (post-correction) has an updated_at lower than a row inserted at 01:23:46.000 (pre-correction, before the clock moved back to 01:23:44.500). The pipeline's mark already advanced to the higher value; the post-step row will never be picked up.

This is rarer than the first two lies. It happens once a quarter on a typical Indian-cloud-region Postgres host, more often if the host has poor NTP discipline. The Cred infra team's 2024 retro mentioned a single such event that lost 312 merchant edits over an 8-minute window — small enough that no individual reconciliation flagged it, large enough that a quarterly trend report did.

The defence overlaps with Lie #1's: use a database-assigned commit timestamp that is monotonic regardless of wall-clock direction. Postgres's pg_xact_commit_timestamp ratchets forward inside the database's WAL, immune to NTP steps on the OS clock. If commit timestamps aren't available (older Postgres, MySQL without ts extensions), the alternative is a logical sequence column — a bigint advanced by a sequence on every row mutation via the same BEFORE UPDATE trigger that handles Lie #2. Sequences are monotonic by construction; clocks aren't.

Choosing the cursor when you have updates

The high-water-mark chapter ranked four columns for INSERT-only tables. With UPDATEs in the mix, the ranking shifts:

Application updated_at. Hits all four lies. Acceptable only for low-throughput, single-writer sources where the team accepts a quarterly reconciliation and a small steady-state error rate. Always paired with a database-enforced trigger and an audit query.

Database commit timestamp (pg_xact_commit_timestamp(xmin)). Immune to lies #1 and #4. Vulnerable to lie #3 (counts in-place rewrites as new mutations). Right choice for high-throughput sources where a periodic full-table re-scan after pg_repack is acceptable cost.

Version column maintained by trigger. A bigint version column advanced by nextval('merchants_version_seq') on every BEFORE INSERT OR UPDATE trigger fire. Immune to lies #1, #2, #4. Vulnerable to lie #3 if the rewrite path re-fires triggers (often does in Postgres). Carries the smallest operational surprise per row and is what Razorpay's payments team migrated to in 2022 after the silent-skip incident — but you have to schema-change every source table to add the column.

Change data capture (CDC) — logical replication / binlog. Reads the database's WAL/binlog directly. Sees every committed mutation in commit order, including rewrites. Immune to all four lies. The cost is a fundamentally different pipeline (streaming, not polling), more infrastructure, and a learning curve. Build 11 covers it.

The decision rule: if your source is high-throughput and has UPDATEs that affect financial reporting, you will eventually move to CDC. The version-column intermediate step is a useful 12–18 month bridge while you plan the migration. Every Indian fintech of meaningful scale has walked this path; the lesson is to plan the bridge rather than discover you need it during a 03:00 incident.

Why a version column is the practical bridge rather than commit timestamps: enabling track_commit_timestamp requires a server restart and a DBA-coordinated maintenance window — and the team that owns the source database is rarely the team that owns the pipeline. A version column is a schema change that the pipeline team can negotiate with a single migration PR, doesn't require a restart, and is implementable on any database with a sequence and a trigger. The trade-off is that you've added a column to every table; the win is that you can ship the fix this quarter rather than waiting for next quarter's database-team maintenance window. Pragmatic engineering rarely picks the most-correct option; it picks the one you can ship.

Common confusions

Going deeper

The Razorpay payments team's version-column migration

In 2022, Razorpay's payments engineering team migrated the payments table's downstream cursor from updated_at to a sequence-backed version column. The migration's published post-mortem (internal, summarised in a 2023 conference talk) describes a 9-week project: a Postgres migration that added a version BIGINT NOT NULL DEFAULT nextval('payments_version_seq') column, a BEFORE INSERT OR UPDATE trigger that advanced it on every mutation, a one-time backfill that assigned monotonic versions to existing rows, and a cut-over week where the analytics pipeline ran both cursors in parallel and reconciled the diffs.

The diff during cut-over week revealed 2,847 payments whose updated_at had not advanced despite a status change — most attributable to a single hotfix from 2021 that updated payment_status directly without touching updated_at. The version column caught all of them on the first run. The reconciliation cost was ₹0; the cost of not having caught them earlier was ₹2.3 crore in mis-classified payment success rates over 18 months.

The lesson the team published: the version-column approach has roughly the same per-row overhead as updated_at and roughly the same operational complexity, but is immune to the bug class that updated_at cannot defend against. The migration cost was ~9 person-weeks of engineering. The savings in eliminated reconciliation work paid back in the first quarter.

The two-cursor pattern for soft-deleted rows

A subtle variation: tables that use a deleted_at column for soft deletes (instead of physical DELETE) need a cursor that catches both edits and deletes. A soft delete is logically a row mutation — tier doesn't change, deleted_at flips from NULL to now(). If updated_at is bumped on the soft-delete path (by trigger or by application), the standard cursor handles it. If it isn't, the deleted rows live forever in the warehouse.

The defensive pattern is two cursors per pipeline: one on updated_at for content edits, one on deleted_at IS NOT NULL AND deleted_at > mark for soft deletes. The pipeline runs both queries, MERGEs the unioned result, and advances both marks. The destination model has a is_deleted boolean computed from deleted_at IS NOT NULL, and downstream queries filter on it. This pattern is what the upsert / merge chapter called the "tombstone" pattern; here it's the cursor's responsibility to surface tombstones into the warehouse.

The audit query every cursor pipeline should run

A daily audit query that compares row counts between source and destination over the last 24 hours catches all four lies before they become a finance reconciliation:

-- run daily at 02:00, alert if mismatch > 1%
SELECT
  (SELECT count(*) FROM merchants
    WHERE updated_at >= now() - interval '24 hours') AS src_24h,
  (SELECT count(*) FROM warehouse_merchants
    WHERE updated_at >= now() - interval '24 hours') AS dst_24h,
  (SELECT count(*) FROM merchants
    WHERE updated_at >= now() - interval '24 hours') -
  (SELECT count(*) FROM warehouse_merchants
    WHERE updated_at >= now() - interval '24 hours') AS gap;

The query is dumb on purpose. A 1% mismatch over 24 hours is the finance team's first hint that the cursor is leaking. The pipeline team's first hint is the same query running daily and trending. Plot the gap on a 30-day rolling chart — sudden jumps mean a deploy poisoned the trigger; slow drifts mean a class of mutations is flowing around the cursor; a flat zero gap is the calibration point that says "the cursor is currently fine".

When updated_at is the right column anyway

Despite the four lies, updated_at remains the right cursor for plenty of pipelines. The criteria are: single-writer source (or coordinator-managed sequence so commits are serial), low write throughput (under ~50 mutations/sec), no in-place rewrite operations on the source's maintenance schedule, and a 24-hour reconciliation tolerance with the source being the authoritative system. Most internal CRUD admin tools at small companies fit. Most reference-data tables (currencies, GST states, ISO country codes) fit.

The wrong fit is anything financial, anything with multiple application instances writing concurrently, anything with a long-running batch operation in its background job mix. The cursor will work most of the time and lie loudly the day finance asks. The discipline is to know which category your pipeline is in and pick accordingly — not to default to updated_at because it's the column the schema already has.

The bitemporal model the next chapter needs

Every fix above treats the cursor as a single value. The richer model — bitemporal, covered in chapter 14 — treats every row as living in two time dimensions: valid_time (when this version of the row was true in the business's view) and transaction_time (when this version was recorded in the database). updated_at collapses both into one column and forces the pipeline to choose which one it cares about. CDC streams expose both; the bitemporal model uses both. The current chapter is the practical bridge between "one cursor on updated_at" and "two timestamps per row, queried independently". Read this chapter for what to ship next quarter; read chapter 14 for what to ship next year.

Where this leads next

Build 11 will revisit cursors as logical replication offsets — a cursor on the database's WAL itself, the elimination of all four lies in this chapter at the cost of a fundamentally different pipeline shape. The current chapter is the polling-pipeline view of the same problem; CDC is the streaming view. Most Indian fintech data teams ship the version-column-on-updated_at design described here for 12–18 months before moving to CDC. The chapter exists to make that 12–18 months survivable.

References

  1. PostgreSQL: track_commit_timestamp — the configuration parameter that exposes commit timestamps for use as a cursor column.
  2. Postgres documentation: pg_xact_commit_timestamp — function reference for the column-form cursor query.
  3. Debezium: snapshot and streaming modes — the production CDC system that replaces updated_at cursors once the four lies become unbearable.
  4. dbt: late-arriving facts and incremental models — dbt's framing of the cursor's contract and the unique-key safety net.
  5. Martin Kleppmann, Designing Data-Intensive Applications, Chapter 11 — the chapter on stream processing introduces the watermark abstraction that generalises the cursor.
  6. Razorpay engineering blog: incremental analytics at payments scale — published account of the version-column migration referenced in §"Going deeper".
  7. High-water marks: tracking what you've seen — the previous chapter, the simpler pattern this chapter extends.
  8. Confluent: change data capture with Debezium and Kafka Connect — a vendor-leaning but technically solid introduction to the CDC primitives that subsume this chapter's mechanism.

The honest summary: updated_at is the column every team picks first, and every team eventually replaces. The replacement is sometimes a commit timestamp, sometimes a sequence-backed version column, sometimes a CDC stream. None of them are free; all of them are cheaper than the reconciliation that the original column will eventually force. The senior data engineer's job is to know which lie applies to which source, and to ship the migration before finance asks. The migration cost is a quarter; the reconciliation cost is everything you didn't catch since the day you shipped the cursor.

A practical exercise: pull up your team's most-trusted incremental pipeline and run the audit query from §"Going deeper" on it for the last 30 days. If the gap is non-zero on any day, you have already met one of the four lies. The chapter exists so the meeting is in staging and not in production at 03:00 with the CFO on Slack.