The headless-BI movement

In late 2023 a senior data engineer at Cred named Aditi pulled up the same number in four places: Looker said weekly active users were 9.42 lakh, the Mixpanel cohort said 8.11 lakh, the React dashboard the growth team had built for the founder said 9.78 lakh, and the Slack bot that posted morning numbers said 9.21 lakh. Four sources, four definitions of "active", four implementations of the same SQL — each one written by a different team, each one drifting at its own pace. The fix was not a fifth dashboard. The fix was to put the metric definition behind a single API and let every consumer — Looker, Mixpanel, the React app, the Slack bot — call that API instead of writing its own SQL. That API has a name. It is called headless BI, and the rest of this chapter is about what it does, what it does not do, and why it became the dominant pattern for data stacks in 2024–2026.

Headless BI separates the metric definition from the chart. The metric lives behind an HTTP/JDBC API; the chart, app, notebook, or LLM agent is just a consumer. The shift matters because BI tools are no longer the only thing reading numbers — apps, bots, and agents are — and a metric that lives inside one BI tool cannot reach the others without being re-implemented.

What "headless" actually decouples

A traditional BI tool is two things stapled together: a metric registry (definitions of gmv, active_users, gross_margin) and a renderer (chart components, dashboard layout, drill-down UI). For twenty years the bet was that this coupling was a feature — buy Looker, get the registry and the charts as one thing. Headless BI breaks the staple. The registry stays; the renderer is whatever you want.

[Figure: Coupled BI vs headless BI. Left: traditional coupled BI (pre-2022), where the metric registry (LookML, etc.) and the chart renderer live inside the same Looker/Tableau box, and only that box can see the warehouse; every other consumer (React app, Slack bot, Hex, LLM agent) re-implements the metric. Right: headless BI, where the metric API (JDBC/GraphQL/REST) sits between the warehouse (Snowflake/BigQuery/Databricks) and many independent renderers, each consuming the same git-versioned metric definition from one registry (dbt YAML, LookML, Cube): one definition, many wire formats.]
The coupled stack on the left has one privileged consumer (the BI tool that owns the metric) and an open question about every other consumer — they re-author the SQL and drift. The headless stack on the right pushes the metric to its own tier with a wire protocol, and every consumer is a thin client.

Why the decoupling is more than rearranging boxes: in the left stack, the BI tool is on the critical path of every metric query. If you rip out Looker, you rip out the metric definitions with it. In the right stack, the BI tool is a renderer — replace it tomorrow and the metric API stays the same. That swappability is what people mean when they say "BI is becoming a commodity".

The metric API — what's actually on the wire

A metric API is a small, opinionated query language. It does not let you SELECT * FROM payments. It lets you ask three things: which metrics, grouped by which dimensions, filtered how. The compiler does the SQL. The dbt Semantic Layer's GraphQL endpoint is the cleanest example of this contract — here is a real query that Aditi at Cred runs every morning to populate the founder dashboard.

# Query the metric API for "weekly active users by city, last 8 weeks"
query WeeklyActives {
  query(
    metrics: [{name: "weekly_active_users"}]
    groupBy: [
      {name: "metric_time", grain: WEEK}
      {name: "user__city"}
    ]
    where: [
      {sql: "{{ Dimension('user__city') }} IN ('Bengaluru', 'Mumbai', 'Delhi', 'Pune')"},
      {sql: "{{ TimeDimension('metric_time', 'WEEK') }} >= CURRENT_DATE - INTERVAL '56 days'"}
    ]
    orderBy: [{descending: false, groupBy: {name: "metric_time", grain: WEEK}}]
  ) {
    queryId
    status
    sql
    arrowResult  # base64-encoded Arrow IPC stream
  }
}
# Sample response (truncated):
{
  "data": {
    "query": {
      "queryId": "01HX7K9QAB8M",
      "status": "SUCCESSFUL",
      "sql": "SELECT DATE_TRUNC('week', subq.metric_time) AS metric_time__week, subq.user__city, COUNT(DISTINCT subq.user_id) AS weekly_active_users FROM (SELECT u.user_id, e.event_time AS metric_time, u.city AS user__city FROM analytics.events e JOIN analytics.users u USING (user_id) WHERE u.city IN ('Bengaluru','Mumbai','Delhi','Pune') AND e.event_time >= CURRENT_DATE - INTERVAL '56 days') subq GROUP BY 1, 2 ORDER BY 1",
      "arrowResult": "<base64 bytes — 32 rows × 3 cols>"
    }
  }
}

Walk the call carefully — five things are happening that a raw SQL endpoint does not give you. First, the consumer names a metric and dimensions; it never names tables or columns. Second, the compiler emits the SQL: the join to analytics.users, the DATE_TRUNC, the COUNT(DISTINCT) all come from the registry, not the caller. Third, the compiled SQL is echoed back in the response, so every query is auditable after the fact. Fourth, the query is asynchronous: queryId and status let a consumer poll a long-running query instead of holding a connection open. Fifth, the result arrives as an Arrow IPC stream, a columnar wire format every consumer language can read without a lossy JSON round-trip.

Wire protocols — JDBC, GraphQL, REST

The metric API is not one protocol. It is whichever protocol the consumer already speaks. Headless BI took off precisely because the metric tier learned to speak the protocols BI tools have spent two decades trusting.

# Flight SQL consumer in Python — the same endpoint that Tableau,
# DBeaver, or any SQL client reaches through the JDBC driver
import os

import pyarrow.flight as flight

# Connect to the dbt Semantic Layer over Arrow Flight SQL.
# BearerTokenMiddleware is a small ClientMiddlewareFactory (defined
# elsewhere) that attaches the token as an authorization header.
client = flight.connect(
    "grpc+tls://semantic-layer.cloud.getdbt.com:443",
    middleware=[BearerTokenMiddleware(os.environ["DBT_SL_TOKEN"])],
)

# The query feels like SQL but the table is a virtual semantic model
sql = """
SELECT
  metric_time__week,
  user__city,
  weekly_active_users
FROM {{ semantic_layer.query(
        metrics=['weekly_active_users'],
        group_by=['metric_time__week', 'user__city'],
        where="{{ Dimension('user__city') }} IN ('Bengaluru','Mumbai')"
      ) }}
ORDER BY metric_time__week
"""

info = client.get_flight_info(flight.FlightDescriptor.for_command(sql.encode()))
table = client.do_get(info.endpoints[0].ticket).read_all()  # arrow Table
print(table.to_pandas().head())

#   metric_time__week   user__city  weekly_active_users
# 0       2026-03-02   Bengaluru                284917
# 1       2026-03-02      Mumbai                197844
# 2       2026-03-09   Bengaluru                291102
# 3       2026-03-09      Mumbai                202336
# 4       2026-03-16   Bengaluru                298815

The same metric is reachable three ways from the same registry: Tableau hits the JDBC endpoint and sees a virtual table; the React dashboard hits GraphQL and gets typed rows; the LLM agent hits REST and gets JSON it can drop into a function-calling response. Three wire formats, one definition. Why supporting all three matters operationally: a BI tool that already trusts a Postgres-shaped JDBC connection takes zero engineering work to point at the metric API. If the API only spoke GraphQL, every BI vendor would need a custom integration — and most never would.
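The REST shape the React app and the LLM agent consume can be sketched against Cube's /v1/load query format, which takes a JSON body of measures, dimensions, and filters (each filter a member/operator/values triple). The helper function and the member names below are illustrative, not any vendor's client library:

```python
import json

def build_rest_query(metrics, group_by, filters=None):
    """Build a Cube-style REST query body for the /v1/load endpoint.
    Other semantic layers use the same shape under different keys."""
    return {
        "measures": metrics,
        "dimensions": group_by,
        "filters": filters or [],
    }

# "Weekly actives by city, Bengaluru and Mumbai only" as a REST body —
# the same question as the GraphQL and Flight SQL examples above.
body = build_rest_query(
    metrics=["weekly_active_users"],
    group_by=["users.city"],
    filters=[{"member": "users.city", "operator": "equals",
              "values": ["Bengaluru", "Mumbai"]}],
)
print(json.dumps(body, indent=2))
```

Note what is absent: no table names, no SQL, no join logic. The JSON body is pure intent, which is what makes it safe to let an LLM emit it.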

What headless BI is not

Three confusing claims float around the category; it is worth being precise about each.

[Figure: Three things headless BI is not. Card 1, "not chart-free BI": charts still get drawn, just elsewhere; Looker, Hex, React, Mode, and Streamlit remain the renderers, and "headless" means the head is detachable. Card 2, "not a SQL proxy": a proxy passes SQL through, while headless BI compiles intent (the consumer asks for gmv by city; the compiler emits 40 lines of SQL with subqueries and joins); intent → SQL is the value. Card 3, "not a warehouse": no rows are stored here; the SQL still runs in Snowflake, caches exist sometimes but storage never does; the warehouse stays the truth, and the metric tier is a compiler, not a database.]
The category name "headless" is a metaphor — the head (the chart UI) is removable, the body (the metric registry + API + cache) is what stays. Confusion happens because the metaphor lets people imagine "no charts" or "replaces the warehouse"; neither is true.

Why this became the dominant pattern

Three forces converged in 2022–2024.

The consumer surface multiplied. In 2018 the only thing reading metrics was a BI tool. By 2024 a typical Indian D2C company had: a Looker for analysts, a Hex for product managers, a Streamlit for ML engineers, a React dashboard for the founder, a Slack bot for daily standup, a customer-facing dashboard, and an LLM agent answering metric questions in English. With seven consumers, a single shared definition is the only viable architecture. Why the surface multiplied: every team that hired its own engineer wanted its own way to look at the data. The metric registry has to be tool-independent or it will be re-implemented per team. Aditi's "WAU is 9.42 / 8.11 / 9.78 / 9.21 lakh" story is what happens when it isn't.

dbt won the transformation layer. Once dbt was the de-facto place where the warehouse models lived, putting the metric definition next to those models — same git repo, same PR review, same CI — was the obvious move. dbt Labs' acquisition of Transform, MetricFlow's creator, in early 2023 made this official. The metric registry inherits dbt's governance properties (PR-reviewed, version-controlled, tested) for free.

LLMs forced an API. A Slack bot that takes "how many transactions in Bengaluru last Tuesday?" and answers "₹47.2 crore across 8.4 lakh transactions" cannot be built by giving the LLM raw warehouse access — it would hallucinate joins, miss filters, get the metric wrong. It can be built if the LLM is restricted to calling the metric API with a small, typed schema. The headless-BI tier is the ideal substrate for an LLM agent because it constrains the LLM to known metrics and known dimensions. By 2025 every major Indian fintech (Razorpay, Cred, Jupiter) had at least one internal LLM tool sitting on top of a semantic layer.

Going deeper

The "metric API" call shape — why these three verbs

Every headless BI implementation converges on roughly the same call shape: metrics=[...], group_by=[...], where=[...], order_by=[...]. There is no select, no from, no join. This is not a coincidence — it falls out of the constraint that the consumer must not write SQL, otherwise the metric definition can be bypassed. The Cube REST API, the dbt Semantic Layer GraphQL API, and AtScale's MDX-derived API all share these verbs because they all share the constraint. Why this constraint is non-negotiable: if the consumer can write SELECT SUM(amount) FROM payments, they have just defined GMV in their consumer code, in violation of the whole architecture. The API has to be expressive enough to ask any reasonable question and restrictive enough that the question must reference a registered metric.
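That "expressive enough, restrictive enough" constraint is enforceable in a few lines: the API layer rejects any request that names an unregistered metric or a dimension the metric does not have, so defining GMV in consumer code is impossible by construction. A minimal sketch — the registry contents and function names are illustrative:

```python
# Toy metric registry: each metric declares which dimensions it supports.
REGISTRY = {
    "weekly_active_users": {"dimensions": {"metric_time", "user__city"}},
    "gmv": {"dimensions": {"metric_time", "user__city", "payment_method"}},
}

def validate(request):
    """Reject any request that steps outside the registry."""
    for m in request.get("metrics", []):
        if m not in REGISTRY:
            raise ValueError(f"unknown metric: {m}")
        allowed = REGISTRY[m]["dimensions"]
        for d in request.get("group_by", []):
            if d not in allowed:
                raise ValueError(f"{d!r} is not a dimension of {m!r}")
    return True

validate({"metrics": ["gmv"], "group_by": ["user__city"]})   # passes
try:
    validate({"metrics": ["revenue"]})                        # unregistered
except ValueError as e:
    print(e)
```

There is no code path that accepts raw SQL, which is the whole point: the consumer can ask any question the registry can answer, and no question it can't.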

How LLM agents fit — schema as prompt

The headless-BI tier is the cleanest interface an LLM agent has ever had to a data warehouse. Razorpay's internal "ask Riya" bot, deployed to ~600 internal users in mid-2025, works like this: when a user asks "what was UPI volume in Bengaluru last week?", the bot prompts the LLM with the metric registry's schema (a list of metrics with descriptions, a list of dimensions per metric, allowed filters) and asks the LLM to emit a JSON call to the metric API. The LLM never sees SQL, never sees table names, cannot hallucinate columns that don't exist — the registry's schema is the LLM's universe. The answer is then a deterministic SQL query against Snowflake, not a generated SQL string. The error rate (measured against analyst-validated answers) settled at ~3%, mostly from ambiguous user questions, not from generated-SQL bugs. The same architecture without a semantic layer (LLM writes SQL directly) measured at ~22% — seven times worse, mostly from hallucinated joins.
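The "schema as prompt" pattern can be sketched concretely: derive an OpenAI-style function-calling tool schema from the registry, with the metric and dimension names as JSON Schema enums. The enum is what stops the model from hallucinating names — anything outside it fails to parse. Registry contents and the tool name below are illustrative, not Razorpay's actual schema:

```python
# Toy registry: the only names the LLM is allowed to emit.
registry = {
    "metrics": ["weekly_active_users", "gmv", "upi_volume"],
    "dimensions": ["metric_time", "user__city", "payment_method"],
}

# Function-calling tool schema generated from the registry.
tool_schema = {
    "name": "query_metric",
    "description": "Query the semantic layer. Only registered names are valid.",
    "parameters": {
        "type": "object",
        "properties": {
            "metrics": {
                "type": "array",
                "items": {"type": "string", "enum": registry["metrics"]},
            },
            "group_by": {
                "type": "array",
                "items": {"type": "string", "enum": registry["dimensions"]},
            },
        },
        "required": ["metrics"],
    },
}
```

The bot's job then shrinks to: hand the model this schema, parse the JSON it emits, and pass the call to the metric API — which compiles it into deterministic SQL.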

Where the cache lives matters

A fresh metric query against fct_payments (3 billion rows) takes 4–8 seconds on Snowflake even with clustering. A cached one takes 50ms. Where the cache lives determines who pays the cost. dbt Semantic Layer caches per-query in dbt Cloud's tier — fast, but tied to that tenant. Cube caches in Cube Store (a stripped-down columnar engine that runs alongside the API tier), which means the same cache serves Looker, Hex, and the React app. LookML pre-aggregates into the warehouse, which the warehouse caches its own way. Why this matters at scale: a customer-facing dashboard that hits the metric API at 1000 QPS will melt the warehouse if the cache is per-tenant; a Cube-style shared cache or a warehouse-side aggregate table is the only viable answer.
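What makes a Cube-style shared cache work is the key: a canonical hash of the logical query, not of the SQL string or the calling tool, so Looker, Hex, and the React app asking the same question hit the same entry. A minimal sketch, with illustrative names:

```python
import hashlib
import json

def cache_key(metrics, group_by, where=None, grain=None):
    """Canonical hash of the logical query. Sorting makes the key
    insensitive to argument order, so equivalent queries collide."""
    canonical = json.dumps(
        {"metrics": sorted(metrics), "group_by": sorted(group_by),
         "where": sorted(where or []), "grain": grain},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

cache = {}
key = cache_key(["weekly_active_users"], ["user__city"], grain="WEEK")
if key not in cache:
    cache[key] = "rows fetched from the warehouse"  # cold path: seconds
rows = cache[key]                                   # warm path: ~50 ms
```

The per-tenant vs shared distinction in the paragraph above is just a question of where this dict lives: inside one vendor's cloud tier, or in a store every consumer's requests pass through.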

Why "headless" is sticky in 2026 but might not be the final word

The "headless" framing came from the JAMstack / headless CMS movement (Contentful, Strapi). The vocabulary was useful in 2022 because BI vendors had not yet split their products. By 2026 every BI vendor is shipping some form of metric API; calling the architecture "headless" is becoming redundant the way "AJAX" became redundant once every web framework spoke fetch. The thing that will stay is the metric API as a tier in the data stack — same way the warehouse is a tier, the transformation layer is a tier, and now the metric layer is a tier. The name will mature into something like "the metric layer" or "the semantic tier"; the architecture will not change.

The Indian-stack picture, by company size

Early-stage (Series A, < 50 engineers)
  Stack: dbt + MetricFlow + Lightdash (single semantic source, OSS BI)
  Why: cheap; one source of truth from day one
Growth-stage (Series B–C, 50–200 engineers)
  Stack: dbt + MetricFlow + Looker (analyst BI) + Hex (PMs) + custom React (founder dashboard)
  Why: multiple consumer surfaces; the metric API is what keeps them aligned
Late-stage (Series D+, 200+ engineers)
  Stack: dbt + Cube (customer-facing) + LookML (internal analyst) + LLM agent (Slack)
  Why: multiple semantic layers, one designated as truth, others as pass-throughs — pragmatic, not pure
Public-listed
  Stack: custom semantic layer on top of dbt or an Atlan-style metadata catalog
  Why: at Razorpay/Zerodha scale, building rather than buying becomes viable

The interesting ones are the late-stage companies that have two semantic layers: a Cube for the customer-facing dashboard (because Cube's pre-aggregations are operationally proven for sub-100ms reads) and MetricFlow for the internal stack. Aligning the two is the new chapter-of-pain — most companies handle it by declaring MetricFlow the source of truth and letting Cube be a thin pass-through that re-emits the metric, with a CI check that fails if the two diverge.
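That CI check is simple to sketch: extract each shared metric's logical definition from both semantic layers' configs and fail the build on any mismatch. The config shapes below are illustrative stand-ins for the parsed MetricFlow YAML and Cube schema:

```python
# Parsed metric definitions from the two layers (shapes illustrative).
metricflow = {"gmv": {"agg": "sum", "expr": "amount",
                      "filter": "status = 'captured'"}}
cube = {"gmv": {"agg": "sum", "expr": "amount",
                "filter": "status = 'captured'"}}

def diverged(source_of_truth, passthrough):
    """Return the names of metrics whose pass-through definition
    no longer matches the source of truth."""
    return sorted(
        name for name, defn in passthrough.items()
        if source_of_truth.get(name) != defn
    )

bad = diverged(metricflow, cube)
if bad:
    raise SystemExit(f"semantic layers diverged on: {bad}")
print("semantic layers aligned")
```

In CI this runs on every PR to either repo, which is what turns "Cube is a thin pass-through" from a policy statement into an enforced invariant.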

Where this leads next

The next chapter (/wiki/semantic-layer-llms-the-new-interface) is the natural sequel — once the metric API exists, an LLM agent on top of it is straightforward, and what looks like "ChatGPT for data" is actually "function calling against a typed metric schema". The build does not end there. Build 14 (/wiki/wall-batch-metrics-arent-fresh-enough) confronts the failure mode of every batch-warehouse-backed semantic layer: the analyst wants the GMV from five minutes ago, and the warehouse-backed metric is from last night.

The thread running through Build 13 stays: the metric is the contract, the renderer is interchangeable. Headless BI is what happens when the industry takes that thread seriously and moves the contract out of the BI tool's database.
