Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
In short
Redis is best understood not as "a cache" but as an in-memory database whose product is its data structures — nine typed values (STRING, LIST, HASH, SET, SORTED SET, BITMAP, HYPERLOGLOG, STREAM, GEO) sitting under one giant hash table, each with its own atomic operations. Two design choices do all the work: data lives in RAM, so a command takes microseconds, and the server is single-threaded, so every command is atomic without locks. That is why one ZADD runs a Strava leaderboard, one INCR runs a PaisaBridge rate limiter, and most "production caches" are not caches at all.
Most engineers meet Redis the same way: a senior shows them a slow Postgres query, says "stick a Redis cache in front of it", and they wire up cache.get(key) or compute_and_set(key, value, ttl=300). Six months later, the same engineer is using Redis for the user's shopping cart, the homepage leaderboard, the login rate limiter, and the pub/sub bus — none of that is caching, and all of it is the same Redis instance treating each use case as a different choice of data structure on the same key-value substrate. The product is not the cache; the product is the data structures.
This chapter opens Build 22 — caching and in-memory data stores — and the right way to start is by killing the framing that calls Redis "just a cache". Once you see the nine data types as first-class primitives, with their own atomic operations and their own asymptotic guarantees, the rest of the build (persistence, replication, eviction, cache patterns) becomes a series of refinements on a system you already understand.
The thesis: data structures are the product
A typical key-value store (Memcached, DynamoDB, an unordered_map<string, string>) gives you two operations: get(key) returns bytes, set(key, bytes) stores them. The application is responsible for serialising rich data (JSON, protobuf) into bytes on the way in and parsing it back on the way out. If you want to "increment the visit counter", you do current = parse(get('counter')); set('counter', serialise(current + 1)). That is two round trips and a race condition: two clients read the same value, both increment, and one update is lost.
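The lost update is easy to reproduce without a server. The following is a minimal sketch, assuming a plain Python dict stands in for a get/set-only blob store and that the two clients interleave in the worst possible order; the helper names are invented for illustration.

```python
# A plain dict stands in for a get/set-only blob store (an assumption for
# illustration; no Redis server or network is involved).
store = {"counter": "41"}

def read_counter():
    """Step 1 of read-modify-write: fetch and parse the current value."""
    return int(store["counter"])

def write_counter(value):
    """Step 2 of read-modify-write: serialise and store the new value."""
    store["counter"] = str(value)

# Two clients interleave: both read before either one writes.
seen_by_a = read_counter()        # client A reads 41
seen_by_b = read_counter()        # client B reads 41
write_counter(seen_by_a + 1)      # A writes 42
write_counter(seen_by_b + 1)      # B also writes 42, erasing A's increment
lost_update_result = int(store["counter"])

def atomic_incr():
    """Server-side increment: read and write happen with no gap between them."""
    store["counter"] = str(int(store["counter"]) + 1)
    return int(store["counter"])

store["counter"] = "41"
atomic_incr()
atomic_result = atomic_incr()

print(lost_update_result, atomic_result)
```

Two increments were requested in both halves, but only the second half ends up two higher than the starting value.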
Redis breaks that pattern. The value attached to each key is not a blob; it is a typed structure, and the operations on that structure live inside the server. Incrementing the visit counter is one command, INCR counter, that goes over the wire as a few dozen bytes (the 12 characters of the command plus protocol framing), returns the new value in a reply of a few bytes, takes about 50 microseconds end-to-end, and is atomic because the Redis server is single-threaded and processes commands one at a time. Why single-threaded is a feature, not a bug: a multithreaded server has to lock every shared structure or use lock-free algorithms; both add overhead and bugs. Redis runs the entire command-handling loop on one core and uses the saved cycles (no locks, no thread switches, perfect cache locality) to handle 100 K+ ops/second per instance. The cost is that one slow command (a KEYS * on a million-key keyspace) blocks every other client, which is why "do not call O(N) commands on big keyspaces" is the first rule of Redis operations.
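To see what crosses the network for that one command, here is a minimal sketch of the RESP framing that client libraries typically use: the command is sent as an array of bulk strings, and an integer reply comes back prefixed with a colon. The helper name encode_command is invented for illustration, and the reply value 7 is an assumed example.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]          # array header: element count
    for part in parts:
        data = part.encode()
        # Each element is a bulk string: $<length>, CRLF, payload, CRLF.
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)

request = encode_command("INCR", "counter")
reply = b":7\r\n"                  # RESP integer reply, assuming the new value is 7

print(request)
print(int(reply[1:-2]))            # strip the ':' prefix and the trailing CRLF
```

The 12 characters of INCR counter travel inside a small amount of framing, and the reply is only a few bytes.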
Pushing structure into the server unlocks server-side computation. When you want the top 10 of a leaderboard with a million users, you do not pull a million rows over the network and sort them in Python; you call ZRANGE leaderboard 0 9 REV WITHSCORES and Redis walks the head of the skiplist, returns 10 (member, score) pairs, and you are done in one round trip. The work happened in Redis. The same pattern applies to set intersection (SINTERSTORE), sorted-set range scans, geographic radius queries (GEOSEARCH), and stream consumer groups. The data structure is the API.
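The top-N idea can be modelled in a few lines. This is a toy sketch, not how Redis implements sorted sets: it pairs a dict (member to score) with a sorted Python list, where Redis pairs a dict with a skiplist. The class name MiniSortedSet and the sample members are made up.

```python
import bisect

class MiniSortedSet:
    """Toy sorted set: a dict for member -> score plus a sorted list for order."""

    def __init__(self):
        self._scores = {}   # member -> score
        self._order = []    # sorted list of (score, member) tuples

    def zadd(self, member, score):
        old = self._scores.get(member)
        if old is not None:
            # Remove the stale (score, member) entry before re-inserting.
            self._order.pop(bisect.bisect_left(self._order, (old, member)))
        self._scores[member] = score
        # List insertion is O(n); a skiplist would make this O(log n).
        bisect.insort(self._order, (score, member))

    def top(self, n):
        """Highest scores first, in the spirit of ZRANGE key 0 n-1 REV WITHSCORES."""
        if n <= 0:
            return []
        tail = self._order[-n:]
        return [(member, score) for score, member in reversed(tail)]

board = MiniSortedSet()
for member, score in [("asha", 310), ("ravi", 120), ("meera", 450), ("kabir", 275)]:
    board.zadd(member, score)
board.zadd("ravi", 500)            # an update moves ravi to the top

print(board.top(3))
```

The caller receives three pairs, however many members the set holds; the ordering work stays next to the data.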
The data type zoo: nine primitives, every use case
Every Redis key has a type. The type determines which commands are legal on that key (LPUSH works on lists, errors on strings) and what asymptotic complexity each command has. Here is the zoo, with the use cases that show up in production over and over.
The cards above hide one detail that matters in production: most types have multiple internal encodings that Redis swaps automatically based on size. With default settings, a SORTED SET of up to 128 small members uses a listpack (a flat compact array: cache-friendly, good for small N); above that, it switches to a skiplist + dict combo (logarithmic for everything, more memory per entry). A HASH of up to 512 small fields (the default hash-max-listpack-entries) uses a listpack too; above that, it becomes a real hash table. A SET of small integers uses an intset (sorted array, binary search). Why this matters: a HASH with 50 short fields takes maybe 200 bytes total because of listpack packing; the same data in 50 separate STRING keys takes 50 × ~80 bytes of overhead = 4 KB before counting any payload. When you store millions of small objects, picking HASH over many STRINGs cuts memory by an order of magnitude. This is why the canonical "shopping cart per user" pattern is one HASH per user with sku → qty, not one STRING per (user, sku) pair.
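The memory argument is simple arithmetic, and it is worth scaling up. The sketch below uses only the rough per-object figures quoted above (about 80 bytes of overhead per STRING key, about 200 bytes for a packed 50-field HASH); these are estimates, not measurements, and the function names are invented.

```python
PER_KEY_OVERHEAD_BYTES = 80      # rough per-STRING-key overhead quoted in the text
FIELDS_PER_OBJECT = 50           # fields in each small object
PACKED_HASH_BYTES = 200          # rough size of one small listpack-encoded HASH

def overhead_as_strings(n_objects):
    """Key overhead when every field is its own STRING key (payload excluded)."""
    return n_objects * FIELDS_PER_OBJECT * PER_KEY_OVERHEAD_BYTES

def size_as_hashes(n_objects):
    """Total size when every object is one small HASH."""
    return n_objects * PACKED_HASH_BYTES

n = 1_000_000
print(overhead_as_strings(n) / 1e9, "GB as separate STRING keys")
print(size_as_hashes(n) / 1e9, "GB as one HASH per object")
print(overhead_as_strings(n) / size_as_hashes(n), "x difference")
```

Under these assumptions a million small objects differ by a factor of about twenty, which is the order of magnitude the paragraph refers to.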
The architecture: one giant hash table, values are typed structures
Zoom out. A running Redis server is, fundamentally, a single C process running an event loop on top of an epoll (Linux) or kqueue (BSD) reactor. The state of the database is a redisDb struct, which contains — and this is the load-bearing sentence — a single hash table called dict mapping every key in the database to a robj (Redis object). The robj carries a type tag (string / list / hash / set / zset / stream) and a pointer to the actual data structure for that type.
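The shape of that keyspace can be sketched as a toy model: one dict for the whole database, with each value wrapped in an object that carries a type tag. This is an illustration in Python, not the C structures themselves; MiniKeyspace, RObj and WrongTypeError are hypothetical names.

```python
from dataclasses import dataclass

class WrongTypeError(Exception):
    """Raised when a command is used on a key that holds another type."""

@dataclass
class RObj:
    type: str      # type tag, e.g. "string" or "list"
    ptr: object    # the underlying structure for that type

class MiniKeyspace:
    def __init__(self):
        self.dict = {}   # key -> RObj: the single table for the whole database

    def set(self, key, value):
        self.dict[key] = RObj("string", str(value))

    def lpush(self, key, *values):
        obj = self.dict.setdefault(key, RObj("list", []))
        if obj.type != "list":
            raise WrongTypeError(f"{key!r} holds a {obj.type}, not a list")
        for value in values:
            obj.ptr.insert(0, value)   # each value is pushed onto the head in turn
        return len(obj.ptr)

    def type(self, key):
        obj = self.dict.get(key)
        return obj.type if obj else "none"

db = MiniKeyspace()
db.set("greeting", "hello")
db.lpush("queue", "a", "b", "c")
print(db.type("greeting"), db.type("queue"), db.type("missing"))

try:
    db.lpush("greeting", "x")          # a list command on a string key
except WrongTypeError as exc:
    print("rejected:", exc)
```

The type check happens at the value, not at the key: every key lives in the same table, and the tag decides which commands are legal.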
Two consequences of this architecture deserve their own line. First, atomicity is free: any single Redis command, no matter how complex (ZADD, LPUSH, SINTERSTORE, XADD), is observed by every client either as fully done or not done at all, because no other command can interleave with it. This is why INCR is the textbook way to build a distributed counter and why SETNX key value (set if not exists) is the textbook distributed lock primitive — both are race-free for free. Second, slow commands are everyone's problem: a KEYS * against a million-key keyspace, or a SMEMBERS of a million-member set, or a Lua script that runs for 200 ms, blocks every other client for that duration. The whole operational discipline of running Redis is "do not call O(N) commands on big N", and the antidote is to use SCAN-family cursors instead.
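The second consequence is plain queueing arithmetic. A minimal sketch, assuming every command arrives at time zero and runs strictly one after another; the per-command durations are invented for illustration.

```python
from itertools import accumulate

def finish_times_us(queue):
    """Map each command to its completion time when commands run one at a time.

    `queue` is a list of (name, duration_in_microseconds) in arrival order.
    """
    names = [name for name, _ in queue]
    durations = [duration for _, duration in queue]
    return dict(zip(names, accumulate(durations)))

# Assumed durations: fast commands take 5 us, one KEYS * takes 200 ms.
queue = [("INCR a", 5), ("KEYS *", 200_000), ("INCR b", 5), ("GET c", 5)]
finish = finish_times_us(queue)

for name, done_at in finish.items():
    print(f"{name:8s} finishes at {done_at} us")
```

The two cheap commands queued behind KEYS * each wait the full 200 ms, even though their own work takes microseconds.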
Common patterns: five lines of Redis, five production systems
Once you know the data type zoo and the architecture, almost every Redis use case collapses to one of five canonical patterns. Each pattern is a few lines of code in any language. Each pattern, deployed at scale, runs a billion-dollar feature.
A note on rate limiting because it is the pattern engineers most often get wrong. The "fixed window" version (INCR of a key like rl:user:42:minute:1714060860 with a 60-second TTL) is what the diagram shows; it is correct, fast, and good enough for almost all real systems. The subtle bug is the boundary: with a limit of 100 per minute, a user can fire 100 requests at second 59 and another 100 at second 61 and slip 200 requests through inside three seconds. The fix is the sliding-window variant using a SORTED SET (ZADD the timestamp, ZREMRANGEBYSCORE everything older than 60 seconds, ZCARD to count); wrapping those steps in about five lines of Lua makes the whole thing atomic. We will revisit this in chapter 173 on cache patterns; for now, know that both shapes exist and that the SORTED SET version costs more memory but gives you real per-request precision.
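The boundary problem shows up clearly in a small in-memory model. This is a sketch of the two counting rules only, with explicit timestamps instead of a clock and Python containers instead of Redis keys; the class names are made up, and this sliding-window variant records accepted requests only.

```python
from collections import defaultdict, deque

class FixedWindowLimiter:
    """Counts requests per (user, window number), like INCR on a per-minute key."""
    def __init__(self, limit, window_s=60):
        self.limit, self.window_s = limit, window_s
        self.counts = defaultdict(int)

    def allow(self, user, now_s):
        key = (user, int(now_s // self.window_s))
        self.counts[key] += 1
        return self.counts[key] <= self.limit

class SlidingWindowLimiter:
    """Keeps recent timestamps, in the spirit of ZADD / ZREMRANGEBYSCORE / ZCARD."""
    def __init__(self, limit, window_s=60):
        self.limit, self.window_s = limit, window_s
        self.stamps = defaultdict(deque)

    def allow(self, user, now_s):
        q = self.stamps[user]
        while q and q[0] <= now_s - self.window_s:   # drop entries older than the window
            q.popleft()
        if len(q) >= self.limit:                     # count what is left
            return False
        q.append(now_s)                              # record this accepted request
        return True

def burst(limiter, user, now_s, n):
    """Send n requests at the same instant; return how many were allowed."""
    return sum(limiter.allow(user, now_s) for _ in range(n))

fixed = FixedWindowLimiter(limit=100)
sliding = SlidingWindowLimiter(limit=100)
fixed_total = burst(fixed, "u42", 59, 100) + burst(fixed, "u42", 61, 100)
sliding_total = burst(sliding, "u42", 59, 100) + burst(sliding, "u42", 61, 100)

print(fixed_total, sliding_total)
```

The fixed window lets both bursts through because they land in different minute buckets; the sliding window still sees the first burst when the second arrives.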
A worked example: one Redis instance, six features, one Indian e-commerce site
Imagine you are the back-end engineer for an Indian e-commerce site — call it kirana.in — and one Redis instance handles all of the following: caching the product detail page, holding each user's shopping cart, tracking recently-viewed products per user, surfacing the top-selling products on the homepage, counting the number of unique active users in the last hour, and rate-limiting the add-to-cart API so a script kiddie cannot drain inventory on a flash sale. One process. Six features. Six different data types. End-to-end latency: under a millisecond per operation.
One Redis, six e-commerce features
This walks through the Python (redis-py) calls for each of the six features. Run a Redis server locally (docker run -p 6379:6379 redis:7-alpine) and connect:
import redis, json, time
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
(1) Product cache — STRING with TTL. The product detail page is rendered from data scattered across MySQL, the inventory service, and the price service. Caching the assembled JSON for 5 minutes turns a 200 ms render into a 1 ms read.
def get_product(sku):
    cached = r.get(f'product:{sku}')
    if cached:
        return json.loads(cached)                        # cache hit: ~0.5 ms
    product = assemble_from_backends(sku)                # cache miss: ~200 ms
    r.setex(f'product:{sku}', 300, json.dumps(product))  # TTL = 5 min
    return product
(2) Shopping cart — HASH per user. One HASH per user means you can HINCRBY a single SKU's quantity without rewriting the whole cart, and you can HGETALL to render the cart page in one round trip.
def add_to_cart(user_id, sku, qty=1):
    r.hincrby(f'cart:{user_id}', sku, qty)     # atomic increment
    r.expire(f'cart:{user_id}', 86400 * 7)     # cart lives for a week

def view_cart(user_id):
    return r.hgetall(f'cart:{user_id}')        # {sku: qty} dict
(3) Recently viewed — LIST trimmed to N. Push the SKU on view, trim to the last 10. The user's "recently viewed" rail on the homepage is one LRANGE.
def viewed(user_id, sku):
    key = f'recent:{user_id}'
    r.lpush(key, sku)
    r.ltrim(key, 0, 9)                         # keep newest 10 only
    r.expire(key, 86400 * 30)

def recent_views(user_id):
    return r.lrange(f'recent:{user_id}', 0, 9)
(4) Top sellers — SORTED SET, score = units sold. Every time an order completes, bump the SKU's score by quantity. The homepage's "top sellers today" widget is one ZRANGE.
def record_sale(sku, qty):
    today = time.strftime('%Y-%m-%d')
    r.zincrby(f'sales:{today}', qty, sku)
    r.expire(f'sales:{today}', 86400 * 3)      # keep 3 days

def top_sellers(n=10):
    today = time.strftime('%Y-%m-%d')
    return r.zrange(f'sales:{today}', 0, n - 1, desc=True, withscores=True)
(5) Active users in the last hour — HYPERLOGLOG. You do not need exact counts; ±0.81% is fine for a marketing dashboard. PFADD a user ID per request; PFCOUNT to read. The whole structure costs 12 KB regardless of whether you have 1 K or 1 B users.
def saw_user(user_id):
    bucket = f'active:{int(time.time() // 3600)}'   # one HLL per clock hour
    r.pfadd(bucket, str(user_id))
    r.expire(bucket, 7200)

def active_last_hour():
    # Reads the current clock-hour bucket, so early in the hour the count
    # covers less than 60 minutes of traffic.
    bucket = f'active:{int(time.time() // 3600)}'
    return r.pfcount(bucket)                        # approximate, ~12 KB
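The 12 KB and 0.81% figures both follow from the HyperLogLog layout. A quick check, assuming the dense representation with 2^14 six-bit registers and the standard 1.04/sqrt(m) estimate of the standard error:

```python
import math

REGISTERS = 2 ** 14          # 16384 registers in the dense representation
BITS_PER_REGISTER = 6

size_bytes = REGISTERS * BITS_PER_REGISTER // 8   # register array only, no header
std_error = 1.04 / math.sqrt(REGISTERS)           # standard error of the estimate

print(size_bytes, "bytes")
print(f"{std_error:.4%}")
```

The register array comes to 12288 bytes, and the error term works out to a little over 0.81%. Neither number depends on how many distinct users are added.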
(6) Rate limit on add-to-cart — INCR + EXPIRE. Cap each user at 30 add-to-cart calls per minute. The first call sets the TTL; subsequent calls just increment. The if n == 1 check is the standard idiom for "first request in this window".
def allow_add_to_cart(user_id):
    key = f'rl:atc:{user_id}:{int(time.time() // 60)}'
    n = r.incr(key)
    if n == 1:
        r.expire(key, 60)
    return n <= 30                             # True if under the cap
Six features. One Redis. Roughly one network round trip and a handful of microseconds per operation. The product cache cut MySQL load by 95%, the rate limiter survived a Republic Day flash sale that would have melted the API tier, and the leaderboard made the homepage feel real-time without anyone touching the database. Why all six fit on one machine: each operation touches a tiny part of the keyspace and runs in microseconds; even at 50 K req/s — which is enormous for a startup — Redis spends at most 25–50% of one core. The bottleneck on a single instance, when you eventually hit it, is almost always the network card or the per-connection RTT, not Redis itself. Vertical scale (more RAM, faster CPU) carries Redis to about a million ops/second per node before clustering becomes necessary.
The point of the example is not the code; it is the shape of the architecture. Every feature picks a primitive, uses one or two operations, and runs at memory speed. There is no schema migration, no JOIN, no query planner, no surprise full-table scan. The cost is that everything lives in RAM (so your dataset must fit in RAM or you must accept eviction) and that durability is best-effort (covered in chapter 170 on persistence, where RDB snapshots and AOF append-only files give you the trade-offs).
Common confusions
- "Redis is just a cache." This is the framing the chapter is built to kill. Caching with TTL is one of the five canonical patterns, but most production Redis is leaderboards (Strava, Stack Overflow's hot-questions list), session stores (BharatBazaar's logged-in carts), rate limiters (PaisaBridge's per-merchant API quotas), queues (Sidekiq, RQ, Bull), pub/sub buses, and geographic radius lookups (ZaikaApp's "delivery riders within 2 km"). Treating Redis as a cache only is using a BoschCorp power drill to drive a single screw: it will work, but you are paying for 90% of capability you ignore.
- "Single-threaded means slow." A single-threaded Redis server handles 100 K–1 M operations per second per node on commodity hardware because every operation runs on hot CPU caches with zero lock contention and zero thread switches. The model is the opposite of "slow": it is "extremely fast at one thing at a time, on one core". The catch is the inverse: one slow command (a KEYS * on a million-key DB, a SMEMBERS on a giant set, a 200 ms Lua script) blocks every other client. The discipline is "no O(N) commands on big N", with SCAN-family cursors as the antidote.
- "SET key value EX 300 and an in-process LRU cache are interchangeable." They are not. An in-process cache lives inside one application instance: five web servers means five separate, inconsistent caches and no atomic counters across them. Redis is shared across all your processes: one INCR is globally atomic, one SETNX is a distributed lock, one ZADD updates the leaderboard everyone reads from. The moment you have more than one app instance, in-process caches stop being a substitute.
- "Pub/Sub gives me reliable messaging like Kafka." It does not. PUBLISH/SUBSCRIBE is fire-and-forget: if no subscriber is connected at the moment the message is published, the message is dropped on the floor. There is no replay, no consumer offset, no durability. For at-least-once delivery with replay, use Redis STREAM (XADD/XREADGROUP/XACK), which is a proper append-only log with consumer groups. Pub/Sub is for "best-effort fan-out of ephemeral events"; STREAM is for "I cannot lose this message".
- "Rate limiting with INCR + EXPIRE is precise." The fixed-window version is correct in count but imprecise at the boundary: a user can fire 100 requests at second 59 and another 100 at second 61 and slip 200 through inside three seconds. For real precision, use a sliding window with a SORTED SET (ZADD the timestamp, ZREMRANGEBYSCORE everything older than the window, ZCARD to count) wrapped in a Lua script for atomicity. Fixed-window is fine for "30 OTPs per minute"; sliding-window is what you want for "100 API calls per minute, no boundary cheats".
- "Redis is durable because of appendonly yes." It is more durable, not durable. AOF with appendfsync everysec (the default) flushes to disk once per second, so a hard crash can lose up to 1 second of writes. AOF with appendfsync always flushes after every command (durable, but kills throughput by 10–100×). RDB snapshots are point-in-time and lose everything between snapshots. The honest summary: Redis trades some durability for sub-millisecond latency. If you cannot lose any write, put the write in Postgres first and use Redis as the read-side cache. Chapter 170 walks the trade-offs in detail.
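The Pub/Sub versus STREAM distinction in the list above comes down to whether the message is stored anywhere. A toy model of the two delivery semantics, using Python lists in place of connections and logs; the class names are invented, and a simple integer offset stands in for a stream entry ID.

```python
class MiniPubSub:
    """Fire-and-forget: a message reaches only the subscribers attached right now."""
    def __init__(self):
        self.subscribers = []          # each subscriber is a list acting as an inbox

    def subscribe(self):
        inbox = []
        self.subscribers.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self.subscribers:
            inbox.append(message)
        return len(self.subscribers)   # how many subscribers received it

class MiniStream:
    """Append-only log: entries are kept, and each reader tracks its own position."""
    def __init__(self):
        self.entries = []

    def add(self, message):
        self.entries.append(message)
        return len(self.entries) - 1   # offset of the new entry

    def read_from(self, offset):
        return self.entries[offset:]

bus = MiniPubSub()
receivers_before = bus.publish("order-1")   # nobody is listening yet
inbox = bus.subscribe()
bus.publish("order-2")

log = MiniStream()
log.add("order-1")
log.add("order-2")
late_reader = log.read_from(0)              # a reader that shows up afterwards

print(receivers_before, inbox, late_reader)
```

The late subscriber never sees order-1, while the late stream reader can replay from the start.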
Going deeper: Redis, Valkey, KeyDB, DragonflyDB, Memcached
A short tour of the family, because the names get confused.
Redis: the canonical implementation, originally written by Salvatore Sanfilippo (antirez) in 2009. Single-threaded, BSD-licensed until 2024, when it switched to a dual SSPL/RSAL source-available licence. The reference design and the system this whole chapter describes.
Valkey: the Linux Foundation's community fork of Redis 7.2.4, created in March 2024 immediately after the licence change, BSD-licensed and backed by AWS, Google Cloud, Oracle, and others. As of 2025 it is API-compatible with Redis and is what major cloud providers offer when they say "managed Redis-compatible cache". For most users, the difference is invisible: same RESP protocol, same commands, same data structures.
KeyDB: a fork of Redis from 2019 (acquired by Snap Inc. in 2022) that adds multi-threaded I/O and command processing while preserving Redis semantics. The pitch: more throughput per box without sharding. The cost: a more complex codebase that lags the Redis feature set. Used where a single Redis instance cannot keep up but Cluster is too operationally heavy.
DragonflyDB: a 2022 ground-up rewrite in C++ that uses a thread-per-core "shared-nothing" architecture (similar to ScyllaDB or Seastar). Compatible with most Redis commands. The pitch: scales linearly with cores on a single box, with claimed millions of ops/second on a 32-core machine, by partitioning the keyspace internally across threads.
Memcached: the simpler, older ancestor (Brad Fitzpatrick, 2003). Multi-threaded, only does string get/set with LRU eviction, no rich data types, no persistence, no replication. Still excellent for pure HTTP page caching at scale (Sociogram ran a famously huge Memcached fleet for years). If your only need is "cache opaque blobs by key with LRU", Memcached is half the operational complexity. If you need anything else (atomic counters, sets, leaderboards), you want Redis.
For a new system in 2026, the default choice is Valkey (open licence, API-identical to Redis, what your cloud provider runs anyway) unless you have specific reasons to choose otherwise (an existing Redis Enterprise contract, Redis Stack modules like RediSearch and RedisJSON, or scaling requirements that DragonflyDB solves).
Where this leads next
This chapter set up the substrate. The next four chapters drill into the operational story:
- Chapter 170 (Persistence: RDB snapshots and AOF). Redis is in-memory but offers two durability options: periodic binary snapshots (RDB) for fast restarts, and an append-only command log (AOF) for crash recovery with configurable fsync policy. Most production deployments run both.
- Chapter 171 (Replication, Sentinel, and Redis Cluster). Async master-replica replication gives you read scale-out and HA failover (with Sentinel). Redis Cluster shards the keyspace across N masters using 16,384 hash slots and CRC16 hashing.
- Chapter 172 (Eviction policies). When maxmemory is hit, Redis evicts. The choices are noeviction (errors on writes), allkeys-lru, allkeys-lfu, allkeys-random, volatile-lru, volatile-lfu, volatile-ttl, and volatile-random. Picking the wrong one turns a cache into a bug. See also buffer pool design for the corresponding ideas on disk-resident systems.
- Chapter 173 (Cache patterns). Cache-aside (read-through), write-through, write-behind, refresh-ahead, and the cache-stampede problem (with the SETNX-lock and probabilistic-early-recompute solutions). The patterns work on Memcached or Valkey or any in-memory store, not just Redis.
- For the sharding ideas underneath Redis Cluster, see consistent hashing and virtual nodes.
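As a small preview of the Cluster chapter, the slot for a key can be computed in a few lines. This is a sketch assuming the XMODEM variant of CRC16 (which Python's binascii.crc_hqx computes when started from 0) and the usual hash-tag rule, where only the text inside the first non-empty {...} is hashed; the function name key_slot and the sample keys are made up.

```python
import binascii

SLOTS = 16384

def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 hash slots."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:            # only a non-empty tag counts
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOTS

print(key_slot("foo"))
print(key_slot("{user42}:cart"), key_slot("{user42}:recent"))
```

Keys that share a hash tag land in the same slot, which is what lets multi-key commands on one user's data work inside a cluster.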
By the end of Build 22, you will be able to design the Redis layer for a real system end-to-end: pick the data structures, configure persistence and eviction, decide between replication and Cluster, and avoid the three or four cache patterns that look right but cause outages.
References
- Redis documentation: data types — the canonical reference for every command and its complexity.
- "Redis in Action" by Josiah L. Carlson (Manning, 2013, free PDF on redis.io) — chapter 1 covers exactly the data-type-as-product framing this chapter opens with.
- "Redis Essentials" by Maxwell Dayvson Da Silva and Hugo Tavares (Packt, 2015) — practical patterns including leaderboards, queues, and rate limiters.
- antirez's blog: "Why Redis" — Salvatore Sanfilippo's own short essay on what Redis is and why the data-structure framing matters.
- Valkey project home — the Linux Foundation fork created after the 2024 Redis licence change.
- DragonflyDB design overview — the thread-per-core alternative architecture.