Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
Tiered storage for metrics, logs, and traces
At 03:14 the page fires. Aditi opens Grafana, types rate(http_5xx[5m]), and the panel renders in 280 ms because the last 6 hours of http_5xx live on a Mimir ingester's local NVMe. Six weeks later Kiran in finance forwards her a compliance ticket asking for the merchant-fee dispute trail from 14 October, and the same query against the same metric name takes 47 seconds because that data lives in three Parquet files in s3://yatrika-mimir-cold/. Same metric, same query, same person — different tier, different latency budget, different rupee cost. Tiering is not "how long do we keep data"; tiering is "how long the reader will wait for the answer". Get that distinction wrong and you either burn ₹4 crore a year keeping cold-tier-grade data on hot SSDs, or you make the on-call wait 47 seconds while the page is still firing.
Hot, warm, and cold are three storage tiers with three reader profiles — paged on-call (sub-second), feature engineer (sub-minute), auditor (sub-hour). The rule that decides which tier a piece of telemetry sits on is the query-latency budget of its likely reader, not its age. Metrics, logs, and traces tier differently: metrics downsample at boundaries, logs change index strategy, traces drop indexes entirely and lean on the trace-id. Get the boundaries wrong and you pay 10× either in cost (hot too long) or in MTTR (cold too soon).
The reader-budget rule and why age-based tiering misses it
Every observability vendor pitch starts with the same diagram: 7-day hot, 30-day warm, 365-day cold, retention shrinks as data ages, cost shrinks with it. The diagram is correct; the reasoning is wrong. Data does not get cheaper to store as it ages — S3 Standard-IA is the same price whether the byte is one hour old or one year old. What gets cheaper is the acceptable query latency. A metric scraped 30 seconds ago is going into a Grafana panel that someone is staring at right now; a metric scraped 30 days ago is going into a quarterly review someone will skim next Tuesday. The hot tier exists because the on-call cannot wait 47 seconds; the cold tier exists because the auditor can.
Yatrika ran age-based tiering for two quarters. The hot tier — 24 hours on local NVMe — held 3.4M metric series at ₹18 lakh/month. The 365-day cold tier held the same 3.4M series, downsampled to 5-minute resolution, on S3 Standard-IA at ₹2.1 lakh/month. The platform team was proud of the 8.5× cost reduction at the cold tier. Then a Q4 incident happened: a payments-team alert fired about a 15-minute latency spike on 18 December at 14:22 IST. By the time the on-call dug in on 19 December at 15:00 IST, the day-old data had crossed the 24-hour boundary into the warm tier — Mimir queriers answered, but at 8 seconds per panel, because they hit S3 instead of the ingester's memory-mapped blocks. Eight seconds per panel on a 12-panel dashboard meant 96 seconds of dead time per refresh during an active investigation. The bug was not retention; the bug was that "warm" started at 24 hours when the on-call's investigation budget extended to 72 hours. The fix was reader-budget tiering: hot until the data falls outside any active investigation, warm until it falls outside any feature-engineering analysis, cold thereafter.
Why age is the wrong primary axis: age is a proxy variable, not a causal one. The actual variable is "what is the longest investigation window an on-call might walk into?" — for Yatrika that turned out to be 72 hours (the time between an incident and its formal post-mortem). Sizing the hot tier from 24 hours to 72 hours cost ₹6 lakh/month more in NVMe but saved 90 seconds of MTTR on the December incident, which the payments-VP valued at ₹40 lakh in deferred merchant churn. Reader-budget tiering reframes the boundary from a finance question into an SRE question, and the SRE question has a known answer.
The reader-budget rule generalises. For logs the readers are: on-call grepping for the last error (hot, 1–7 days), product engineer reproducing a customer complaint (warm, 7–30 days), security investigator chasing an incident from last quarter (cold, 30–540 days). For traces: on-call drilling from a paging metric to a span (hot, 24–72 hours), platform engineer doing a fan-out audit (warm, 7–14 days), forensic auditor reconstructing a single transaction (cold, 30–365 days). The reader profiles are the same across pillars; the storage shape that serves them is not. Metrics tier by downsampling, logs tier by changing index strategy, traces tier by collapsing to a trace-id-only index.
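The per-signal boundaries above can be captured as a tiny policy function. A sketch, with hypothetical boundary values pulled from the reader windows quoted in this section:

```python
# Hypothetical per-signal boundaries, in hours, from the reader windows above.
# Cold runs from warm_end until the retention floor.
BOUNDARIES = {
    "metric": (72, 30 * 24),      # on-call 72 h, feature engineer 30 d
    "log":    (7 * 24, 30 * 24),  # grep window 7 d, repro window 30 d
    "trace":  (72, 14 * 24),      # drill-down 72 h, fan-out audit 14 d
}

def tier(signal: str, age_hours: float) -> str:
    hot_end, warm_end = BOUNDARIES[signal]
    if age_hours < hot_end:
        return "hot"
    if age_hours < warm_end:
        return "warm"
    return "cold"

# The same age lands on different tiers per signal:
print(tier("log", 400), tier("trace", 400))  # a 16-day-old log vs trace
```

The same 400-hour-old telemetry is warm for logs but already cold for traces, which is the whole argument against a single tenant-wide boundary.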
Three pillars, three tiering shapes
The pillar shapes diverge because their access patterns diverge. A metric query is "show me the time-series for http_5xx{service=payments} over 24 hours" — a sequential scan over a known label set. A log query is "find the 18 lines containing dispute_id=DSP-441" — a needle-in-haystack search. A trace query is "fetch the span tree for trace_id 0a3f..." — a primary-key lookup with a fan-out tree underneath. Each pattern has a different cost equation across hot / warm / cold, which forces a different tiering shape.
For metrics, the dominant cost in the warm and cold tiers is cardinality multiplied by resolution. A 3.4M-series fleet at 10-second resolution produces roughly 0.88 trillion samples per 30-day month. Downsampling to 5-minute aggregates at the cold-tier boundary divides the sample count by 30 but preserves cardinality. The downsampled metric is good enough for capacity planning and quarterly review — but not for incident investigation, because the 5-minute aggregate hides the 30-second spike that paged the on-call. Hence: hot keeps full resolution; warm keeps full resolution but on cheaper storage; cold keeps cardinality but downsamples. The boundary between warm and cold is exactly where the reader stops needing 30-second-spike resolution. See /wiki/downsampling-for-long-retention for the aggregate-choice question — the knob a Mimir-style compactor exposes as a downsampling setting and Prometheus federation expresses as a recording rule.
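The sample-count arithmetic is worth making explicit, since it is the whole warm/cold cost equation for metrics:

```python
# Back-of-envelope for the metric cost equation: cardinality × resolution.
SERIES = 3_400_000
MONTH_S = 30 * 24 * 3600                 # 30-day month, in seconds
full_res = SERIES * (MONTH_S // 10)      # one sample per 10 s
downsampled = SERIES * (MONTH_S // 300)  # one 5-minute aggregate
print(f"full resolution : {full_res / 1e12:.2f} trillion samples/month")
print(f"5-min aggregates: {downsampled / 1e9:.1f} billion samples/month")
print(f"reduction       : {full_res // downsampled}x, cardinality unchanged")
```

The 30× reduction applies only to samples, not series: the label index is the same size before and after, which is why cardinality budgets survive every tier boundary.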
For logs, the dominant cost in the hot tier is the index size. Loki's full-text-or-label-index design (see /wiki/full-text-search-for-logs-the-cost-model) keeps a per-stream label index but does not index the log content — content searches scan the chunks. Hot logs use the full label index for sub-second {service="payments"} queries; cold logs drop the per-stream index entirely and rely on object-store list operations and chunk-level brotli compression. The cold-tier query pattern shifts from {service="payments"} |= "dispute" (interactive) to "scan all chunks from 14 October between 14:00 and 15:00, brotli-decompress, regex-search" (batch). The reader pays 10× the latency at the cold tier, but the platform team pays 40× less per gigabyte stored.
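The cold-tier read path — list, decompress, scan — fits in a few lines. This sketch uses gzip from the stdlib as a stand-in for the brotli compression described above; the chunk keys and log lines are hypothetical:

```python
# Sketch of the cold-tier log read path: no index, just list-decompress-scan.
import gzip
import re

# Pretend object store: chunk key -> compressed bytes for one time window.
chunks = {
    "logs/payments/2025-10-14/14:00.gz": gzip.compress(
        b"ts=14:02 dispute_id=DSP-441 state=opened\n"
        b"ts=14:09 dispute_id=DSP-502 state=closed\n"),
    "logs/payments/2025-10-14/15:00.gz": gzip.compress(
        b"ts=15:11 dispute_id=DSP-441 state=escalated\n"),
}

def cold_scan(prefix: str, pattern: str):
    """List chunks under a prefix, decompress each, regex-scan every line."""
    rx = re.compile(pattern)
    for key, blob in sorted(chunks.items()):  # stands in for an S3 LIST
        if key.startswith(prefix):
            for line in gzip.decompress(blob).decode().splitlines():
                if rx.search(line):
                    yield key, line

hits = list(cold_scan("logs/payments/2025-10-14/", r"DSP-441"))
print(len(hits))  # → 2: the open and the escalation, found with zero index
```

Every byte under the prefix is fetched and decompressed whether or not it matches — that is the 10× latency the cold-tier reader pays for the 40× storage saving.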
For traces, the design space is different again. Tempo's columnar approach (see /wiki/trace-storage-at-scale-tempos-columnar-approach) already drops most indexes — only service.name and name are indexed at the warm tier; everything else is column-scanned via TraceQL. The cold-tier pattern collapses further: at Yatrika, traces older than 30 days retain only the trace-id and the root-span attributes (service, operation, status, duration). The full span tree is dropped or compacted into a per-trace Parquet file accessible only by trace-id lookup. You can no longer search "show me all error traces in October from the payments service" at the cold tier; you can only fetch "the trace tree for trace_id 0a3f... from October" if you already know the trace_id. This is acceptable because the cold-tier reader profile is forensic — they have the trace_id from a transaction record, a customer complaint, or a regulatory request, and they need the tree, not a search.
Why traces tier most aggressively: a trace tree is a high-fanout structure (the IPL final at JioCinema produced 80-microservice traces with 600+ spans each), so storing the full tree at indexable granularity for 12 months is the most expensive-per-useful-query data type in observability. Most cold-tier trace lookups are by trace-id (regulatory, customer complaint), not by attribute search — so dropping the attribute index at the cold-tier boundary recovers 95% of storage cost while losing 5% of access patterns. The 5% you lose are the "find me all error traces from last quarter" queries, which can be answered from the metrics tier (rate(spans_total{status="error"}[1d])) instead. Tier-aware design moves access patterns across pillars when the tier collapses an index.
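The collapsed cold-tier trace index can be sketched as a trace-id keyed map holding only root-span attributes; the keys, fields, and values here are hypothetical:

```python
# Cold-tier traces: trace-id plus root-span attributes, nothing else indexed.
cold_index = {
    "0a3f9c": {"object": "s3://yatrika-tempo-cold/2025-10/0a3f9c.parquet",
               "root": {"service": "payments", "operation": "POST /charge",
                        "status": "error", "duration_ms": 812}},
}

def fetch_cold_trace(trace_id: str) -> dict:
    # Forensic path: the reader already holds the trace_id; one key lookup
    # (one S3 GET in production) returns the per-trace Parquet file.
    return cold_index[trace_id]

print(fetch_cold_trace("0a3f9c")["root"]["status"])  # → error
# Attribute search ("all error traces in October") is gone at this tier;
# that question moves to the metrics pillar instead.
```

The lookup is O(1) by design: the only structure the cold tier must sustain is "given a trace_id, return the tree", which is exactly the forensic reader's access pattern.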
The audit script — measuring tier residency, cost, and reader-budget violations
The audit primitive in /wiki/cardinality-budgets-revisited measured per-team cardinality cost. The tier-residency audit measures something orthogonal: for each metric / log stream / trace service, which tier is the data living on, and is that tier serving the actual reader profile. The most common finding is tier-misalignment — data the on-call queries weekly that has aged into the cold tier and now takes 47 seconds to render, or data nobody has queried in 90 days still sitting on the hot NVMe.
# tier_audit.py — find tier-misaligned telemetry across metrics, logs, and traces
# pip install requests pandas python-dateutil
import datetime as dt
import pandas as pd
import requests
from dateutil.parser import isoparse

MIMIR = "http://mimir.yatrika.internal:8080"
LOKI = "http://loki.yatrika.internal:3100"
TEMPO = "http://tempo.yatrika.internal:3200"
HOT_END_H = 72    # 72-hour hot boundary (reader: on-call)
WARM_END_D = 30   # 30-day warm boundary (reader: feature engineer)
RATE_HOT_PER_GB_MONTH = 4200   # local NVMe + replicas, ₹/GB-month
RATE_WARM_PER_GB_MONTH = 980   # S3 Express One Zone
RATE_COLD_PER_GB_MONTH = 110   # S3 Standard-IA
NOW = dt.datetime.now(dt.timezone.utc)

def tier_for_age(hours: float) -> str:
    if hours < HOT_END_H:
        return "hot"
    if hours < WARM_END_D * 24:
        return "warm"
    return "cold"

def age_hours(ts: str) -> float:
    # isoparse may return naive or tz-aware timestamps; normalise to UTC
    t = isoparse(ts)
    if t.tzinfo is None:
        t = t.replace(tzinfo=dt.timezone.utc)
    return (NOW - t).total_seconds() / 3600

# 1. Metrics tier residency — Mimir compactor block stats (size, time range)
r = requests.get(f"{MIMIR}/api/v1/blocks/yatrika-prod", timeout=60).json()
metric_rows = []
for b in r["blocks"]:
    h = age_hours(b["max_time"])
    metric_rows.append({"signal": "metric", "tenant": b["tenant"],
                        "size_gb": b["size_bytes"] / 1e9, "age_h": h,
                        "tier_actual": b["storage_class"],
                        "tier_expected": tier_for_age(h)})

# 2. Log tier residency — Loki ingester chunk metadata
r = requests.get(f"{LOKI}/loki/api/v1/series",
                 params={"match[]": '{job=~".+"}'}, timeout=60).json()
# (in real use, walk the Loki object-store manifest; abbreviated here)
log_rows = [{"signal": "log", "tenant": s["tenant"], "size_gb": s["size_gb"],
             "age_h": s["age_h"], "tier_actual": s["tier"],
             "tier_expected": tier_for_age(s["age_h"])} for s in r["streams"]]

# 3. Trace tier residency — Tempo block list
r = requests.get(f"{TEMPO}/api/v2/blocks",
                 params={"tenant": "yatrika"}, timeout=60).json()
trace_rows = []
for b in r["blocks"]:
    h = age_hours(b["end_time"])
    trace_rows.append({"signal": "trace", "tenant": b["tenant"],
                       "size_gb": b["size_bytes"] / 1e9, "age_h": h,
                       "tier_actual": b["storage_class"],
                       "tier_expected": tier_for_age(h)})

# 4. Combine, attribute cost, find misalignments
df = pd.DataFrame(metric_rows + log_rows + trace_rows)
RATES = {"hot": RATE_HOT_PER_GB_MONTH, "warm": RATE_WARM_PER_GB_MONTH,
         "cold": RATE_COLD_PER_GB_MONTH}
df["inr_per_month"] = df["size_gb"] * df["tier_actual"].map(RATES)
df["misaligned"] = df["tier_actual"] != df["tier_expected"]

# 5. Per-pillar tier breakdown
print("\n=== Tier residency × signal type ===")
print(df.groupby(["signal", "tier_actual"]).agg(
    gb=("size_gb", "sum"), inr=("inr_per_month", "sum")).round(0))

# 6. Misalignment offenders — what is on the wrong tier
mis = df[df["misaligned"]].sort_values("inr_per_month", ascending=False).head(8)
print(f"\n=== {df['misaligned'].sum():,} misaligned blocks — top 8 by ₹/month ===")
print(mis[["signal", "tenant", "age_h", "tier_actual",
           "tier_expected", "size_gb", "inr_per_month"]].to_string(index=False))
Sample run on Yatrika 2026-04-25:
=== Tier residency × signal type ===
gb inr
signal tier_actual
log cold 4180 459800
hot 820 3444000
warm 1640 1607200
metric cold 2840 312400
hot 480 2016000
warm 960 940800
trace cold 1240 136400
hot 290 1218000
warm 620 607600
=== 14 misaligned blocks — top 8 by ₹/month ===
signal tenant age_h tier_actual tier_expected size_gb inr_per_month
log payments 18.2 cold hot 89.6 9856.0
trace risk 2160.0 warm cold 8.1 7938.0
metric platform 360.0 hot warm 1.8 7560.0
log platform 480.0 warm cold 6.4 6272.0
log risk 96.0 hot warm 1.4 5880.0
metric risk 72.5 hot warm 0.4 1680.0
metric payments 220.0 cold warm 14.2 1562.0
trace payments 18.0 warm hot 0.6 588.0
Read the per-pillar table first. Logs are the dominant cost — ₹55.1 lakh/month vs ₹32.7 lakh for metrics and ₹19.6 lakh for traces — driven mostly by the hot-tier line at ₹34.4 lakh. That tells you log retention is the next lever, not metric resolution. The misalignment table tells the second story: 14 blocks are on the wrong tier, the top offender a payments-team log block aged 18 hours that has somehow ended up on the cold tier — almost certainly a tiering-policy bug where Loki's chunk-flush lifecycle hook fired on an empty stream and shipped the stub straight to cold. The on-call who needs that log during the next incident will hit a 47-second cold read; the next time the bug fires, MTTR rises.
The next-row offender is a risk-team trace block aged 2160 hours (90 days) still on warm at ₹7.9k/month. The tiering policy is supposed to demote 30+ day traces to cold, and this block missed the demotion — probably a Tempo compactor crash that left orphan blocks behind. ₹7.9k/month is small in isolation but compounds: across the fleet there are ~140 such orphans by the audit's end-of-quarter run, totalling ₹11 lakh/quarter the platform team is paying for data nobody queries. The audit's job is to find these before the FinOps quarterly review, not after.
Why we tier separately by signal not by tenant: a payments-tenant might have hot-tier logs, warm-tier metrics, and cold-tier traces all simultaneously, because the on-call uses logs aggressively, the feature engineer uses metrics weekly, and the auditor uses traces only on-request. Tiering by tenant collapses these three different reader-profiles into one boundary, which means either the on-call's logs age out too fast or the auditor's traces never demote. Per-signal tiering preserves the reader-budget rule per pillar — and the audit-script grouping makes the cost picture visible per pillar so the platform team can lever the right boundary.
Animated tier-flow — what happens at a boundary crossing
When data crosses a tier boundary, the storage backend changes, the index strategy changes, and the reader's effective query latency changes. Most of the bugs in tiered observability live in this transition — block-flush races, downsample-vs-original divergence, index-rebuild lag, the "I queried for last week and got partial data" failure mode where the warm-tier block is still being uploaded.
The boundary-crossing failure mode shows up most often in partial-data alerts — the on-call queries rate(http_5xx[5m]) over the boundary and gets a graph with a 30-second gap because the block was flushing during the query window. Mimir surfaces this as query_blocks_storage_querier_blocks_load_failures_total; Loki surfaces it as loki_chunk_fetcher_errors_total; Tempo as tempo_blocks_open_failures_total. Alerting on any of these above 0.5 events/min for 5 minutes catches block-flush bugs within one alert window. Why a 5-minute alert window and not 1 minute: the boundary crossings are inherently episodic — at the 72-hour mark, all blocks created exactly 72 hours ago flush within the same minute, producing a brief, expected spike in load-failure counters. A 1-minute alert would page on every boundary cohort. A 5-minute window with 0.5/min threshold filters out the episodic noise but catches the sustained-error pattern of a real bug (compactor crash, S3 throttle, index-rebuild deadlock).
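The window-vs-threshold reasoning is easy to check numerically. A sketch with hypothetical per-minute failure counts:

```python
# Why 5 minutes at 0.5/min: rolling mean over per-minute failure counts.
def fires_at(per_min, window=5, threshold=0.5):
    """Minutes at which a rolling-mean alert over `window` minutes would fire."""
    return [i for i in range(window, len(per_min) + 1)
            if sum(per_min[i - window:i]) / window > threshold]

episodic  = [0, 0, 2, 0, 0, 0, 0, 0, 0, 0]  # one boundary cohort flushes at minute 3
sustained = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]  # compactor crash: errors keep coming

print(fires_at(episodic, window=1))  # 1-min window pages on the cohort: [3]
print(fires_at(episodic))            # 5-min window stays quiet: []
print(fires_at(sustained))           # but still catches the sustained bug
```

The cohort's burst averages out below the threshold over five minutes, while the sustained error rate does not — exactly the filter the alert rule is meant to be.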
Common confusions
- "Tiering is a retention setting." It is not. Retention is "how long do we keep data before deletion"; tiering is "what storage backend does the data live on at each age". The two interact — you cannot tier data older than your retention window, because it is gone — but they are independent decisions. Yatrika has 540-day metric retention with three tiers; their old vendor had 30-day retention with one tier. Same retention question, completely different storage architecture.
- "Hot is always SSD, cold is always object storage." Misleading. Hot-tier Mimir on AWS is NVMe-backed EC2; hot-tier Mimir on a self-hosted k8s with a Ceph backend is RBD-attached XFS; hot-tier Mimir at a small startup might be a single Prometheus on a single host. The defining property of "hot" is sub-second query latency and a synchronous-replicated write path, not the underlying medium. Cold tier almost always ends up on S3-compatible object storage because the latency budget is loose enough to forgive the network round-trip.
- "You should tier all three pillars on the same boundaries." Wrong, and an expensive mistake. Logs are dominated by index cost in hot, traces by span-tree cost in warm, metrics by sample-resolution cost in cold. The boundary that minimises log cost (drop the index at 7 days) is wrong for metrics (drop full resolution at 30 days). Each pillar needs its own boundary, which is why the audit script tiers by (signal, tenant), not by tenant alone.
- "Cold-tier data is read-only." Half-true. The data itself is read-only after compaction, but the index over cold data is typically rebuildable. Tempo's TraceQL re-indexer can rebuild a per-block index over cold-tier traces if a forensic investigation needs a fast attribute search; Mimir's compactor can re-downsample a cold block at higher resolution if the original is still on disk. Cold is "read-only by default, rewritable on demand at human cost".
- "Cheaper tier always means cheaper query." No — cold queries are more expensive per query than hot queries. S3 GET-list-fetch costs ₹0.04 per 1000 requests; a cold-tier dashboard with 12 panels each scanning 50 blocks generates 600 GETs per refresh. A heavily-used cold-tier dashboard can cost more in S3 API charges than a hot-tier dashboard costs in NVMe. The right framing: cold tier has cheap storage and expensive query, hot has expensive storage and cheap query — pick by query frequency, not by data age.
- "Tier transitions are atomic." Almost never. Mimir blocks flush over ~30 seconds; Loki chunks over ~5 minutes; Tempo blocks over ~2 minutes. During the transition window queries return partial data and the per-block load-failure counters spike. Treating tier transitions as atomic produces a class of alerts ("intermittent missing data on Tuesday at 14:00") that the platform team chases for weeks before realising the alert is the boundary itself.
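The "cheaper tier, dearer query" point in the list above is simple arithmetic, using the illustrative figures from the text:

```python
# Cold-tier query cost: a 12-panel dashboard, each panel scanning 50 cold blocks.
GET_INR_PER_1000 = 0.04             # illustrative S3 GET price from the text
panels, blocks_per_panel = 12, 50
gets = panels * blocks_per_panel    # object-store GETs per dashboard refresh
inr_per_refresh = gets / 1000 * GET_INR_PER_1000
refreshes_per_month = 30 * 24 * 3600 // 30  # a wallboard on 30-second auto-refresh
print(f"{gets} GETs/refresh -> ₹{inr_per_refresh:.3f}/refresh")
print(f"₹{inr_per_refresh * refreshes_per_month:,.0f}/month in GET charges alone")
```

A one-off forensic query costs pennies; a wallboard on auto-refresh turns the same cold blocks into a recurring line item, before data-transfer charges. Query frequency, not data age, decides the tier.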
Going deeper
Setting tier boundaries from incident-investigation telemetry
The 72-hour hot boundary at Yatrika was not chosen abstractly — it came from analysing 18 months of incident retrospectives and finding that 95% of follow-up investigations completed within 68 hours of the original page. The data lives in an incidents.csv exported from PagerDuty and the company's post-mortem template; the analysis is a 20-line pandas script that buckets time-from-page-to-final-postmortem-comment by quantile. The 72-hour figure is the p95; the p99 is 144 hours, which is what triggered Yatrika's escalation policy: pages older than 72 hours needing investigation generate a "promote-to-hot" PR that re-loads specific blocks back onto the ingester for the duration of the investigation. The promote-to-hot pattern is rare (12 PRs in 2026 H1) but cheap (a few hundred rupees per promote, vs the ₹6 lakh/month it would cost to extend the hot boundary fleet-wide to 144 hours). Boundary choices are not symmetric — the hot boundary is sized for the p95 reader; the cold boundary is sized for the regulatory floor; the warm boundary fills the rest.
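The boundary analysis described above is short enough to sketch. The rows below are hypothetical stand-ins for the PagerDuty export; the real script buckets the full 18 months:

```python
# Quantiles of page-to-last-postmortem-comment time: the input to the
# hot-boundary decision. Incident rows are hypothetical.
import pandas as pd

incidents = pd.DataFrame({
    "paged_at":     pd.to_datetime(["2026-01-04 03:14", "2026-01-19 22:40",
                                    "2026-02-02 11:05", "2026-02-21 14:22"]),
    "last_comment": pd.to_datetime(["2026-01-06 17:00", "2026-01-21 09:10",
                                    "2026-02-05 10:30", "2026-02-22 16:00"]),
})
hours = (incidents["last_comment"] - incidents["paged_at"]).dt.total_seconds() / 3600
print(hours.quantile([0.50, 0.95, 0.99]).round(1))
# Hot boundary = the p95; the p95..p99 tail is served by promote-to-hot PRs.
```

The point of the quantile framing is that the hot boundary is an empirical percentile of investigation windows, not a number anyone has to argue about in a budget meeting.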
The cold-tier query-frequency cliff
Most cold-tier vendor pricing assumes a query-frequency cliff: you query a cold block once or twice a year, the per-query cost is high, the per-storage cost is low. The cliff breaks when a recurring query falls into the cold tier. At Yatrika the Q4 financial-close report runs on the last business day of every month and queries transaction_volume{merchant_tier="enterprise"} over the prior 12 months. By the second quarter, 9 of the 12 months are cold-tier. Each monthly run costs ₹2.4 lakh in S3 GETs because the query scans 1100 cold blocks. The fix is recurring-query promotion: an audit job runs every Friday, identifies queries that ran more than 4 times in the prior 30 days against cold-tier data, and tags those blocks for warm-tier promotion. The financial-close report is now ~₹14k/month instead of ₹2.4 lakh — and the only cost was a 30-line Python audit and a Mimir API call. The pattern composes with /wiki/cardinality-budgets-revisited: once query-frequency is on the dashboard, the team that owns the report sees its own cold-tier cost line and starts pre-aggregating against it.
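The Friday promotion job described above reduces to a frequency count over a query log. A sketch with a hypothetical log schema:

```python
# Flag queries that hit cold-tier blocks more than 4 times in the trailing
# 30 days. The query log and its fingerprints are hypothetical.
from collections import Counter

query_log = (                   # (query fingerprint, tier it resolved against)
    [("transaction_volume{merchant_tier='enterprise'}", "cold")] * 6
    + [("rate(http_5xx[5m])", "hot")] * 40
    + [("one_off_audit_query", "cold")] * 1
)

cold_hits = Counter(q for q, t in query_log if t == "cold")
promote = sorted(q for q, n in cold_hits.items() if n > 4)
print(promote)  # the recurring financial-close query, not the one-off audit
# In production each entry becomes a block-promotion call against the store.
```

The one-off audit query stays cold on purpose: promotion is for recurrence, and the threshold of 4 runs in 30 days is just where Yatrika drew the line between the two.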
The deletion-tier — what happens after cold
Cold is not the last tier. After cold comes deletion, and the boundary between cold and deletion is a regulatory question, not an engineering one. Indian payment-data regulation (RBI master direction 2024) mandates 7-year retention for certain transaction-level fields; Yatrika's compliance team chose 540 days for telemetry as the SRE-relevant retention floor, with a separate "compliance archive" pipeline that extracts the regulated fields into a Parquet lake on S3 Glacier Deep Archive. The compliance archive is outside the observability tier system — it is a feature-engineering data product, not telemetry, and it costs roughly ₹0.04/GB-month. The mistake is to conflate "RBI says keep 7 years" with "Mimir must keep metrics for 7 years". The metric-side cost of 7-year retention at full-fleet scale would be ₹54 lakh/month at the cold tier; the compliance-archive Parquet equivalent is ₹4.2k/month. Tiering also means knowing when a piece of data leaves observability altogether.
Reproduce this on your laptop
# 1. Spin up a single-tenant Mimir + Loki + Tempo stack
git clone https://github.com/grafana/mimir && cd mimir/development/mimir-monolithic-mode
docker compose up -d
python3 -m venv .venv && source .venv/bin/activate
pip install requests pandas python-dateutil prometheus-client
# 2. Emit ~21K synthetic series across 3 simulated tenants
python3 - <<'EOF' &
from prometheus_client import Counter, start_http_server
import time
start_http_server(8000)
counters = [Counter(f'svc_{t}_{i}_total', 'reqs', ['route'])
            for t in ['payments', 'risk', 'platform'] for i in range(7)]
for c in counters:                 # 21 counters × 1000 routes ≈ 21K series
    for r in [f'/r{i}' for i in range(1000)]:
        c.labels(r).inc()
print('emitted ~21K series')
time.sleep(120)                    # keep the exporter up for a scrape cycle
EOF
# 3. Force a compaction so blocks tier from hot to warm to cold
curl -X POST 'http://localhost:9009/compactor/ring?forget=true'
sleep 30
python3 tier_audit.py
# Expect: per-pillar tier breakdown, misalignment offenders listed,
# rupee cost attributed per-tier and per-tenant.
Where this leads next
/wiki/index-free-log-storage-clickhouse-parquet is the storage-shape question for logs in particular: when the cold-tier boundary drops the per-stream index, what data layout does the cold tier actually use? ClickHouse columnar tables and Parquet on S3 are the two answers Indian observability teams are converging on, and they have different trade-offs (ClickHouse: faster ad-hoc queries, requires running a cluster; Parquet on S3: cheaper, requires a query engine like DuckDB or Presto). The tiering decisions in this article compose with the cold-tier data layout in that one.
/wiki/vendor-vs-self-hosted-economics is the parallel cost question — at what fleet size does the FinOps math flip from "buy Datadog/Honeycomb/New Relic" to "run Mimir/Loki/Tempo"? The tiering shape in this article is most of the answer: vendors charge per-GB-ingested, which prices all three tiers identically, so a fleet with 80% of its data in cold-tier-shaped queries pays 8× too much under vendor pricing. Self-hosting is what unlocks the per-tier rupee differential — but only if the platform team has the bandwidth to operate Mimir's compactor + S3 + ingester ring without losing 30% of an SRE's time per quarter.
/wiki/long-term-storage-thanos-cortex-mimir is the implementation-choice question for the metric-side tier system — Thanos, Cortex, and Mimir are the three production answers, and the tiering boundaries in this article map onto each system's compactor + store-gateway + querier topology slightly differently. The reader-budget rule from this article is the conceptual frame; that article is the deployment frame.
References
- Charity Majors, Liz Fong-Jones, George Miranda, Observability Engineering (O'Reilly, 2022) — Chapter 18, "Telemetry Pipelines and Storage", argues storage tiering as a first-class observability concern and sets up the "what does the reader want" framing this article extends with reader-budget tiering.
- Grafana Labs — Mimir, "Compactor and downsampling" — the canonical operating doc for hot→warm→cold transitions in a production Mimir install. The block-flush 30-second window discussed in the boundary-crossing section is documented here.
- Loki — "Storage configuration and boltdb-shipper architecture" — explains why hot logs need a label index but cold logs can survive without one, and how Loki's boltdb-shipper builds a per-period index that ages out alongside the data.
- Tempo — "Block format v2 and ingestion lifecycle" — the columnar Parquet block layout that makes warm-tier traces cheap and cold-tier traces near-free, plus the operating playbook for the compactor that promotes/demotes blocks.
- AWS S3 Express One Zone pricing & latency profile — the warm-tier substrate Yatrika moved to in Q3 2026 after measuring that p99 query latency on S3 Standard blew the 30-second feature-engineering budget. Express One Zone delivers sub-10 ms first-byte latency for blocks under 32 MB.
- Brendan Burns & David Oppenheimer, "Design patterns for container-based distributed systems" (HotCloud 2016) — sidecar and ambassador patterns underlie the Mimir/Loki/Tempo store-gateway architecture this article assumes; useful background for readers building a custom tier topology.
- /wiki/downsampling-for-long-retention — internal: what aggregate to keep at the cold-tier downsample boundary; the boundary-pricing question this article frames is solved at the aggregate-choice level there.
- /wiki/cardinality-budgets-revisited — internal: the rupee-P&L mechanism this article composes with — once tiering boundaries shift, the cardinality budget per team needs to re-attribute rupees-per-tier-month, not just rupees-per-month.