The headless-BI movement
In late 2023 a senior data engineer at Cred named Aditi pulled up the same number in four places: Looker said weekly active users were 9.42 lakh, the Mixpanel cohort said 8.11 lakh, the React dashboard the growth team had built for the founder said 9.78 lakh, and the Slack bot that posted morning numbers said 9.21 lakh. Four sources, four definitions of "active", four implementations of the same SQL — each one written by a different team, each one drifting at its own pace. The fix was not a fifth dashboard. The fix was to put the metric definition behind a single API and let every consumer — Looker, Mixpanel, the React app, the Slack bot — call that API instead of writing its own SQL. That API has a name. It is called headless BI, and the rest of this chapter is about what it does, what it does not do, and why it became the dominant pattern for data stacks in 2024–2026.
Headless BI separates the metric definition from the chart. The metric lives behind an HTTP/JDBC API; the chart, app, notebook, or LLM agent is just a consumer. The shift matters because BI tools are no longer the only thing reading numbers — apps, bots, and agents are — and a metric that lives inside one BI tool cannot reach the others without being re-implemented.
What "headless" actually decouples
A traditional BI tool is two things stapled together: a metric registry (definitions of gmv, active_users, gross_margin) and a renderer (chart components, dashboard layout, drill-down UI). For twenty years the bet was that this coupling was a feature — buy Looker, get the registry and the charts as one thing. Headless BI breaks the staple. The registry stays; the renderer is whatever you want.
Why the decoupling is more than rearranging boxes: in the coupled stack, the BI tool is on the critical path of every metric query. If you rip out Looker, you rip out the metric definitions with it. In the headless stack, the BI tool is just a renderer — replace it tomorrow and the metric API stays the same. That swappability is what people mean when they say "BI is becoming a commodity".
The metric API — what's actually on the wire
A metric API is a small, opinionated query language. It does not let you SELECT * FROM payments. It lets you ask three things: which metrics, grouped by which dimensions, filtered how. The compiler does the SQL. The dbt Semantic Layer's GraphQL endpoint is the cleanest example of this contract — here is a real query that Aditi at Cred runs every morning to populate the founder dashboard.
```graphql
# Query the metric API for "weekly active users by city, last 8 weeks"
query WeeklyActives {
  query(
    metrics: [{name: "weekly_active_users"}]
    groupBy: [
      {name: "metric_time", grain: WEEK}
      {name: "user__city"}
    ]
    where: [
      {sql: "{{ Dimension('user__city') }} IN ('Bengaluru', 'Mumbai', 'Delhi', 'Pune')"},
      {sql: "{{ TimeDimension('metric_time', 'WEEK') }} >= CURRENT_DATE - INTERVAL '56 days'"}
    ]
    orderBy: [{descending: false, groupBy: {name: "metric_time", grain: WEEK}}]
  ) {
    queryId
    status
    sql
    arrowResult  # base64-encoded Arrow IPC stream
  }
}
```
Sample response (truncated):

```json
{
  "data": {
    "query": {
      "queryId": "01HX7K9QAB8M",
      "status": "SUCCESSFUL",
      "sql": "SELECT DATE_TRUNC('week', subq.metric_time) AS metric_time__week, subq.user__city, COUNT(DISTINCT subq.user_id) AS weekly_active_users FROM (SELECT u.user_id, e.event_time AS metric_time, u.city AS user__city FROM analytics.events e JOIN analytics.users u USING (user_id) WHERE u.city IN ('Bengaluru','Mumbai','Delhi','Pune') AND e.event_time >= CURRENT_DATE - INTERVAL '56 days') subq GROUP BY 1, 2 ORDER BY 1",
      "arrowResult": "<base64 bytes — 32 rows × 3 cols>"
    }
  }
}
```
Walk the call carefully — five things are happening that a raw SQL endpoint does not give you.
- The consumer never wrote a `JOIN`. The query asks for `weekly_active_users` by city. The metric API knows `weekly_active_users` is defined on the `events` semantic model, knows `city` is a dimension on the `users` entity, and knows the join key is `user_id`. Why this matters: the join graph is in the metric registry, not in the consumer. A new consumer (the React app) joins users to events identically to the existing consumer (Looker), because neither writes the join — the compiler does.
- Time granularity is a parameter, not a column. `metric_time, grain: WEEK` is grammar; the compiler picks the right `DATE_TRUNC` flavour for the warehouse (Snowflake's `DATE_TRUNC('week', ...)` differs from BigQuery's `TIMESTAMP_TRUNC(..., WEEK(MONDAY))`). The consumer is portable across warehouses for free.
- The `where` clause uses the same dimension names as `groupBy`. There is no second namespace where the consumer has to know that `user__city` is actually `users.city` in the underlying SQL. The compiler keeps the model and the predicate in the same vocabulary.
- Arrow IPC over the wire. The result is not JSON; it is Apache Arrow's columnar IPC format, base64-encoded for GraphQL transport. Why Arrow and not JSON: a 100k-row result set is 5–10× smaller in Arrow than in JSON, decodes 50× faster in pandas/Polars, and preserves typed columns (a `bigint` does not become a JS number that loses precision past 2⁵³). For a metric API that BI tools, Spark, and Python notebooks all consume, Arrow is the lowest-friction format.
- `queryId` is returned synchronously, even if the query is long-running. The contract is async-by-default — the caller polls `queryId` for status. This matters for the warehouses (BigQuery, Snowflake) that often take several seconds for a fresh metric query.
Wire protocols — JDBC, GraphQL, REST
The metric API is not one protocol. It is whichever protocol the consumer already speaks. Headless BI took off precisely because the metric tier learned to speak the protocols BI tools have spent two decades trusting.
```python
# Flight SQL consumer — the same endpoint Tableau, DBeaver, or any JDBC client reaches
import os

import pyarrow.flight as flight

# Connect to the dbt Semantic Layer over Arrow Flight SQL,
# passing the service token as a gRPC header on each call
client = flight.connect("grpc+tls://semantic-layer.cloud.getdbt.com:443")
auth = flight.FlightCallOptions(
    headers=[(b"authorization", f"bearer {os.environ['DBT_SL_TOKEN']}".encode())]
)

# The query feels like SQL, but the table is a virtual semantic model
sql = """
SELECT
    metric_time__week,
    user__city,
    weekly_active_users
FROM {{ semantic_layer.query(
    metrics=['weekly_active_users'],
    group_by=['metric_time__week', 'user__city'],
    where="{{ Dimension('user__city') }} IN ('Bengaluru','Mumbai')"
) }}
ORDER BY metric_time__week
"""
info = client.get_flight_info(flight.FlightDescriptor.for_command(sql.encode()), auth)
table = client.do_get(info.endpoints[0].ticket, auth).read_all()  # pyarrow.Table
print(table.to_pandas().head())
#   metric_time__week user__city  weekly_active_users
# 0        2026-03-02  Bengaluru               284917
# 1        2026-03-02     Mumbai               197844
# 2        2026-03-09  Bengaluru               291102
# 3        2026-03-09     Mumbai               202336
# 4        2026-03-16  Bengaluru               298815
```
The same metric is reachable three ways from the same registry: Tableau hits the JDBC endpoint and sees a virtual table; the React dashboard hits GraphQL and gets typed rows; the LLM agent hits REST and gets JSON it can drop into a function-calling response. Three wire formats, one definition. Why supporting all three matters operationally: a BI tool that already trusts a Postgres-shaped JDBC connection takes zero engineering work to point at the metric API. If the API only spoke GraphQL, every BI vendor would need a custom integration — and most never would.
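The REST shape is the one an LLM agent emits, so it is worth seeing as JSON. A sketch following the general shape of Cube's REST query format; the member names (`events.weekly_active_users`, `users.city`) and the endpoint in the comment are placeholders, not a real deployment:

```python
import json

# Cube-style REST query: the same three verbs, expressed as JSON.
# measures/dimensions/filters mirror metrics/group_by/where.
query = {
    "measures": ["events.weekly_active_users"],
    "timeDimensions": [{
        "dimension": "events.metric_time",
        "granularity": "week",
        "dateRange": "last 8 weeks",
    }],
    "dimensions": ["users.city"],
    "filters": [{
        "member": "users.city",
        "operator": "equals",
        "values": ["Bengaluru", "Mumbai"],
    }],
}

# A consumer would POST this to the metric tier, e.g. (placeholder URL):
#   requests.post("https://metrics.example.com/cubejs-api/v1/load",
#                 headers={"Authorization": token}, json={"query": query})
payload = json.dumps({"query": query})
print(payload)
```

Nothing in this payload names a table or a join; the compiler behind the endpoint resolves both.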
Why this became the dominant pattern
Three forces converged in 2022–2024.
The consumer surface multiplied. In 2018 the only thing reading metrics was a BI tool. By 2024 a typical Indian D2C company had: a Looker for analysts, a Hex for product managers, a Streamlit for ML engineers, a React dashboard for the founder, a Slack bot for daily standup, a customer-facing dashboard, and an LLM agent answering metric questions in English. "Seven consumers, one definition" is the only viable architecture. Why the surface multiplied: every team that hired its own engineer wanted its own way to look at the data. The metric registry has to be tool-independent or it will be re-implemented per team. Aditi's "WAU is 9.42 / 8.11 / 9.78 / 9.21 lakh" story is what happens when it isn't.
dbt won the transformation layer. Once dbt was the de-facto place where the warehouse models lived, putting the metric definition next to those models — same git repo, same PR review, same CI — was the obvious move. MetricFlow's acquisition by dbt Labs in early 2023 made this official. The metric registry inherits dbt's governance properties (PR-reviewed, version-controlled, tested) for free.
LLMs forced an API. A Slack bot that takes "how many transactions in Bengaluru last Tuesday?" and answers "₹47.2 crore across 8.4 lakh transactions" cannot be built by giving the LLM raw warehouse access — it would hallucinate joins, miss filters, get the metric wrong. It can be built if the LLM is restricted to calling the metric API with a small, typed schema. The headless-BI tier is the ideal substrate for an LLM agent because it constrains the LLM to known metrics and known dimensions. By 2025 every major Indian fintech (Razorpay, Cred, Jupiter) had at least one internal LLM tool sitting on top of a semantic layer.
Common confusions
- "Headless BI is a new BI tool." It is the absence of a BI tool — or more precisely, the part of a BI tool that survives when you take away the chart UI. The metric registry, the SQL compiler, the cache, the wire protocol. You still need a BI tool (Looker, Hex, Lightdash) for analysts to drag dimensions onto charts. Headless BI just means that BI tool is interchangeable.
- "Headless BI replaces dbt." It sits on top of dbt. dbt builds the tables in the warehouse; the metric definitions reference those tables. MetricFlow ships inside dbt, so the line between "transformation layer" and "semantic layer" can blur — but they do different jobs. dbt is `INSERT INTO ... SELECT ...`; the semantic layer is "given a metric + dimensions, emit the SQL".
- "Headless BI is the same as a metrics layer." "Metrics layer" is the older, vaguer term Benn Stancil used in 2021. "Headless BI" is the architecture pattern that emerged once the metric layer started shipping wire protocols. A metrics layer that you can only call from inside one BI tool (e.g., LookML before the JDBC adapter) is not headless. A metrics layer that any consumer can hit (MetricFlow's Arrow Flight SQL endpoint, Cube's Postgres wire) is headless.
- "Headless BI removes the need for a BI tool." It removes the lock-in to a particular BI tool. Analysts still want to drag-and-drop dimensions onto a chart; that experience is what BI tools sell. The change is that you can now buy that experience as a thin client over your metric API — Lightdash and Hex were built explicitly for this consumption model — and switch tools without losing the metric definitions.
- "Headless BI is just caching." The cache is a feature; the value is the registry and the wire protocol. Even with caching disabled, headless BI still solves the divergence problem — every consumer gets the same SQL because every consumer asks the same compiler.
Going deeper
The "metric API" call shape — why these three verbs
Every headless BI implementation converges on roughly the same call shape: metrics=[...], group_by=[...], where=[...], order_by=[...]. There is no select, no from, no join. This is not a coincidence — it falls out of the constraint that the consumer must not write SQL, otherwise the metric definition can be bypassed. The Cube REST API, the dbt Semantic Layer GraphQL API, and AtScale's MDX-derived API all share these verbs because they all share the constraint. Why this constraint is non-negotiable: if the consumer can write SELECT SUM(amount) FROM payments, they have just defined GMV in their consumer code, in violation of the whole architecture. The API has to be expressive enough to ask any reasonable question and restrictive enough that the question must reference a registered metric.
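A toy compiler makes the constraint concrete. Everything here is hypothetical — the registry shape, the function name, the SQL it emits — and not any vendor's implementation; the point is that a consumer cannot reference anything outside the registry:

```python
# Hypothetical metric registry: the only vocabulary a consumer can use.
REGISTRY = {
    "weekly_active_users": {
        "agg": "COUNT(DISTINCT user_id)",
        "model": "analytics.events",
        "dimensions": {"user__city": "city", "metric_time": "event_time"},
    },
}

def compile_metric_query(metric, group_by, where=None):
    """Compile a metrics/group_by/where call to SQL; unregistered names are rejected."""
    spec = REGISTRY.get(metric)
    if spec is None:
        raise ValueError(f"unknown metric: {metric}")
    for dim in list(group_by) + list(where or {}):
        if dim not in spec["dimensions"]:
            raise ValueError(f"unknown dimension: {dim}")
    select = ", ".join(
        [f"{spec['dimensions'][d]} AS {d}" for d in group_by] + [f"{spec['agg']} AS {metric}"]
    )
    sql = f"SELECT {select} FROM {spec['model']}"
    if where:  # where speaks the same vocabulary as group_by, never raw columns
        sql += " WHERE " + " AND ".join(
            f"{spec['dimensions'][d]} = '{v}'" for d, v in where.items()
        )
    return sql + " GROUP BY " + ", ".join(str(i + 1) for i in range(len(group_by)))

print(compile_metric_query("weekly_active_users", ["user__city"], {"user__city": "Pune"}))
# SELECT city AS user__city, COUNT(DISTINCT user_id) AS weekly_active_users FROM analytics.events WHERE city = 'Pune' GROUP BY 1
```

There is no code path that produces `SELECT SUM(amount) FROM payments`: ad-hoc expressions are unrepresentable, which is exactly the restriction the prose describes.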
How LLM agents fit — schema as prompt
The headless-BI tier is the cleanest interface an LLM agent has ever had to a data warehouse. Razorpay's internal "ask Riya" bot, deployed to ~600 internal users in mid-2025, works like this: when a user asks "what was UPI volume in Bengaluru last week?", the bot prompts the LLM with the metric registry's schema (a list of metrics with descriptions, a list of dimensions per metric, allowed filters) and asks the LLM to emit a JSON call to the metric API. The LLM never sees SQL, never sees table names, cannot hallucinate columns that don't exist — the registry's schema is the LLM's universe. The answer is then a deterministic SQL query against Snowflake, not a generated SQL string. The error rate (measured against analyst-validated answers) settled at ~3%, mostly from ambiguous user questions, not from generated-SQL bugs. The same architecture without a semantic layer (LLM writes SQL directly) measured at ~22% — seven times worse, mostly from hallucinated joins.
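A sketch of that restriction. The registry schema and function names here are hypothetical, not Razorpay's implementation; the point is that the LLM's JSON is validated against the registry before any SQL exists:

```python
import json

# Hypothetical registry schema — also what gets serialized into the LLM's
# tool definition, so the model and the validator share one vocabulary.
SCHEMA = {
    "upi_volume": {"dimensions": {"user__city", "metric_time"}},
    "weekly_active_users": {"dimensions": {"user__city", "metric_time"}},
}

def validate_call(raw_json):
    """Reject any LLM-emitted call that references unregistered names."""
    call = json.loads(raw_json)
    metric = call["metric"]
    if metric not in SCHEMA:
        raise ValueError(f"unknown metric: {metric}")
    bad = set(call.get("group_by", [])) - SCHEMA[metric]["dimensions"]
    if bad:
        raise ValueError(f"unknown dimensions: {sorted(bad)}")
    return call  # safe to hand to the metric API's compiler

# What the LLM emits for "what was UPI volume in Bengaluru last week?"
ok = validate_call('{"metric": "upi_volume", "group_by": ["user__city"]}')
print(ok["metric"])  # upi_volume
```

A hallucinated join or column fails here, loudly, instead of producing a plausible-looking wrong number downstream.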
Where the cache lives matters
A fresh metric query against fct_payments (3 billion rows) takes 4–8 seconds on Snowflake even with clustering. A cached one takes 50ms. Where the cache lives determines who pays the cost. dbt Semantic Layer caches per-query in dbt Cloud's tier — fast, but tied to that tenant. Cube caches in Cube Store (a stripped-down columnar engine that runs alongside the API tier), which means the same cache serves Looker, Hex, and the React app. LookML pre-aggregates into the warehouse, which the warehouse caches its own way. Why this matters at scale: a customer-facing dashboard that hits the metric API at 1000 QPS will melt the warehouse if the cache is per-tenant; a Cube-style shared cache or a warehouse-side aggregate table is the only viable answer.
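The shared-cache behaviour hinges on one detail the prose glosses over: two consumers phrasing the same question must map to the same cache entry. A hypothetical sketch of query canonicalization (not Cube Store's actual keying scheme):

```python
import hashlib
import json

def cache_key(metrics, group_by, where):
    """Canonicalize a metric query so ordering differences don't split the cache."""
    canonical = json.dumps(
        {"metrics": sorted(metrics), "group_by": sorted(group_by), "where": sorted(where)},
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

# Looker and the React app ask the same question with dimensions in different order:
k1 = cache_key(["weekly_active_users"], ["user__city", "metric_time"], ["city IN ('Pune')"])
k2 = cache_key(["weekly_active_users"], ["metric_time", "user__city"], ["city IN ('Pune')"])
assert k1 == k2  # one cache entry serves both consumers
```

This is only possible because the query is structured (three verbs) rather than free-form SQL, where semantically identical queries rarely hash to the same string.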
Why "headless" is sticky in 2026 but might not be the final word
The "headless" framing came from the JAMstack / headless CMS movement (Contentful, Strapi). The vocabulary was useful in 2022 because BI vendors had not yet split their products. By 2026 every BI vendor is shipping some form of metric API; calling the architecture "headless" is becoming redundant the way "AJAX" became redundant once every web framework spoke fetch. The thing that will stay is the metric API as a tier in the data stack — same way the warehouse is a tier, the transformation layer is a tier, and now the metric layer is a tier. The name will mature into something like "the metric layer" or "the semantic tier"; the architecture will not change.
The Indian-stack picture, by company size
| Stage | Stack | Why |
|---|---|---|
| Early-stage (Series A, < 50 engineers) | dbt + MetricFlow + Lightdash (single semantic source, OSS BI) | Cheap; one source of truth from day one |
| Growth-stage (Series B–C, 50–200 engineers) | dbt + MetricFlow + Looker (analyst BI) + Hex (PMs) + custom React (founder dashboard) | Multiple consumer surfaces; the metric API is what keeps them aligned |
| Late-stage (Series D+, 200+ engineers) | dbt + Cube (customer-facing) + LookML (internal analyst) + LLM agent (Slack) | Multiple semantic layers, one designated as truth, others as pass-throughs — pragmatic, not pure |
| Public-listed | Custom semantic layer on top of dbt or Atlan-style metadata catalog | At Razorpay/Zerodha scale, building rather than buying becomes viable |
The interesting ones are the late-stage companies that have two semantic layers: a Cube for the customer-facing dashboard (because Cube's pre-aggregations are operationally proven for sub-100ms reads) and MetricFlow for the internal stack. Aligning the two is the new chapter-of-pain — most companies handle it by declaring MetricFlow the source of truth and letting Cube be a thin pass-through that re-emits the metric, with a CI check that fails if the two diverge.
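That CI check can be sketched in a few lines. The two result sets here are stubs standing in for responses from MetricFlow and Cube; the comparison logic is the whole idea:

```python
def check_divergence(truth_rows, passthrough_rows, tolerance=0.0):
    """Return rows where the pass-through layer diverges from the source of truth."""
    mismatches = []
    for key in truth_rows.keys() | passthrough_rows.keys():
        a, b = truth_rows.get(key), passthrough_rows.get(key)
        if a is None or b is None or abs(a - b) > tolerance * max(abs(a), 1):
            mismatches.append((key, a, b))
    return mismatches

# Stub results keyed by (week, city); in CI these would come from the two APIs.
metricflow = {("2026-03-02", "Bengaluru"): 284917, ("2026-03-02", "Mumbai"): 197844}
cube       = {("2026-03-02", "Bengaluru"): 284917, ("2026-03-02", "Mumbai"): 197901}

diffs = check_divergence(metricflow, cube)
assert diffs, "CI should fail: Mumbai diverges"
print(diffs)
```

In a real pipeline this runs on a small fixed slice of dimensions per metric (full-table comparison would be as expensive as the dashboards themselves) and fails the PR that changed either definition.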
Where this leads next
The next chapter (/wiki/semantic-layer-llms-the-new-interface) is the natural sequel — once the metric API exists, an LLM agent on top of it is straightforward, and what looks like "ChatGPT for data" is actually "function calling against a typed metric schema". The build does not end there. Build 14 (/wiki/wall-batch-metrics-arent-fresh-enough) confronts the failure mode of every batch-warehouse-backed semantic layer: the analyst wants the GMV from five minutes ago, and the warehouse-backed metric is from last night.
The thread running through Build 13 stays: the metric is the contract, the renderer is interchangeable. Headless BI is what happens when the industry takes that thread seriously and moves the contract out of the BI tool's database.
References
- dbt Semantic Layer — overview — the GraphQL and JDBC contracts; the canonical "metric API" implementation.
- Benn Stancil — The metrics layer — the 2021 essay that named the category and seeded the round of investment.
- Tristan Handy — How is dbt building toward the future — the dbt founder's argument for why the metric layer belongs in the transform layer, not the BI layer.
- Cube — Headless BI architecture — Cube's case for the architecture, with the wire-protocol focus.
- Apache Arrow Flight SQL — the protocol that makes a metric tier reachable from any JDBC client; the "headless BI talks to Tableau for free" story.
- Lightdash — open-source BI for dbt — calibration of what a thin client over a metric API actually looks like in production.
- /wiki/lookml-cube-metricflow-the-landscape — the previous chapter's three-vendor landscape that this chapter abstracts into a wire-protocol pattern.
- /wiki/metric-definitions-once-queried-many-ways — the seven-fields framing of metric definitions that headless BI exposes over the wire.