In short
Every graph fits in a relational schema: one table for nodes, one for edges, a B-tree index on edges.src_id, done. The query "find the neighbours of node 42" is a single indexed lookup — Postgres does it in roughly 100 microseconds end to end (descend the B-tree, read a leaf page, fetch the edge tuples from the heap). The catch shows up the moment you ask for neighbours of neighbours of neighbours. Each hop is another 100 µs index probe, and the work fans out as the branching factor compounds: a four-hop traversal over a fan-out-ten graph touches 10,000 edges and pays 10,000 separate B-tree lookups — about a full second. Native graph engines reject this entire design. In Neo4j, every node record stores a file pointer to its first relationship; every relationship record stores pointers to its source, its destination, and the next relationship in each endpoints adjacency list (a doubly linked list of edges threaded through the node). Walking from a node to its neighbours is **pointer-chasing**: dereference the pointer, read the relationship record, follow the destination pointer to the next node. No B-tree, no index seek — Neo4j calls this **index-free adjacency**, and it is the entire reason a separate category of database exists. The cost per hop drops from 100 µs to roughly **1 µs** — a 100× speedup, which *compounds* hop by hop. Four-hop fraud-ring detection: ~1 second in Postgres, ~10 milliseconds in Neo4j. The same trick powers JanusGraph (when backed by its native BerkeleyDB store), TigerGraph (which adds aggressive parallelism), and Memgraph (which keeps the whole pointer graph in RAM for sub-microsecond hops). The trade-offs bite where you would expect: writes touch both endpoints adjacency lists (more write amplification than INSERT INTO edges), supernodes — the Bollywood actor with five million followers — become concurrency hotspots, and tabular operations like "average age of all users" run faster on Postgres because there is no users table to scan. Modern native engines paper over the second weakness with secondary indexes layered on top of index-free adjacency, giving you the best of both worlds for the price of a more complex storage engine.
The previous chapter ended with the property-graph data model — nodes with labels and properties, relationships with types and properties, the whole thing modelled as a labelled multigraph. That is a logical model. This chapter is about the physical one: how do you actually lay out a graph on disk so that traversing it is fast? The first answer is the obvious one — store nodes in a table, store edges in a table, index the edge table on the source column. It works, it scales, and Postgres will happily run a graph this way. The second answer is the one Neo4j shipped in 2010 and built an industry on: store the adjacency as direct pointers inside the node record itself, so that walking the graph never touches an index. Both work. They have wildly different performance profiles. Understanding why is the point of this chapter.
The thesis: a graph is just two tables, until you traverse it
The relational answer to "store me a graph" is so natural it barely deserves discussion. Two tables:
CREATE TABLE nodes (
    id         BIGINT PRIMARY KEY,
    label      TEXT NOT NULL,
    properties JSONB
);

CREATE TABLE edges (
    src_id     BIGINT NOT NULL REFERENCES nodes(id),
    dst_id     BIGINT NOT NULL REFERENCES nodes(id),
    type       TEXT NOT NULL,
    properties JSONB,
    PRIMARY KEY (src_id, dst_id, type)
);

CREATE INDEX edges_dst_idx ON edges (dst_id);  -- for reverse traversal
This is the schema every social network started with — Facebook ran on it for years, LinkedIn ran on it, every Indian payments network running on Postgres or MySQL today still runs on a variant of it. The schema is honest, the constraints are checked, the JSONB column lets you stuff arbitrary properties without an ALTER TABLE. Reads against single nodes are cheap. Aggregations across nodes are cheap. The whole machinery of relational query optimisation is at your disposal.
The query "give me the neighbours of node 42" looks like this:
SELECT n.* FROM nodes n
JOIN edges e ON e.dst_id = n.id
WHERE e.src_id = 42;
The Postgres planner descends the primary-key B-tree on edges to find the leaf page where (42, *) lives, scans the matching tuples, then for each dst_id does a primary-key lookup on nodes to fetch the row. Why this costs ~100 µs: a B-tree of, say, a billion edges is roughly 4 levels deep — three internal pages and one leaf. Each page touch is a buffer-pool lookup (cheap, but not free), plus a binary search within the page (a few hundred nanoseconds). The leaf scan reads a handful of consecutive tuples. Then for each neighbour, another B-tree descent on nodes. Add in latch acquisition, MVCC visibility checks, and the cost of returning rows through the executor, and you are at roughly 100 µs per hop on a warm cache. On a cold cache (an SSD page read), make that 200 µs.
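None of this needs to be taken on faith. Postgres will show you the plan, the buffer touches, and the timing for the one-hop query directly; a quick check against the schema above (the exact figures will vary with hardware and cache state):

-- Show the plan, per-node timing, and buffer-pool touches for one hop
EXPLAIN (ANALYZE, BUFFERS)
SELECT n.*
FROM nodes n
JOIN edges e ON e.dst_id = n.id
WHERE e.src_id = 42;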
So one hop is 100 µs. But the work compounds with the branching factor: hop two re-pays the index cost once per neighbour, hop three once per neighbour-of-neighbour, and so on. In a graph where the average node has ten neighbours, a four-hop expansion touches roughly 10 + 100 + 1,000 + 10,000 ≈ 11,000 edges, every one of them answered through the same B-tree-plus-heap machinery — which is how the four-hop traversal from the summary lands in the region of a full second, and that still ignores the cost of de-duplicating nodes you have already seen.
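The compounding is easier to see as arithmetic than as prose. A back-of-envelope sketch in Python, using the chapter's illustrative constants (roughly 100 µs of indexed work per edge fetched relationally, roughly 1 µs per pointer chase natively); the constants are assumptions for illustration, not measurements:

# Back-of-envelope traversal cost under the chapter's illustrative constants
FAN_OUT = 10
COST_PER_EDGE_RELATIONAL = 100e-6  # ~100 µs of indexed work per edge fetched (assumption)
COST_PER_EDGE_NATIVE = 1e-6        # ~1 µs per pointer chase (assumption)

def edges_visited(hops, fan_out=FAN_OUT):
    # Level k of the expansion visits roughly fan_out**k edges
    return sum(fan_out ** level for level in range(1, hops + 1))

for hops in range(1, 5):
    e = edges_visited(hops)
    print(f"{hops} hop(s): {e:>6} edges  "
          f"relational ~{e * COST_PER_EDGE_RELATIONAL * 1e3:8.1f} ms  "
          f"native ~{e * COST_PER_EDGE_NATIVE * 1e3:6.2f} ms")

At four hops and fan-out ten it prints roughly 11,000 edges, about a second relationally and about ten milliseconds natively, which are the headline numbers from the summary.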
Native adjacency: the node knows where its edges live
The native-graph answer rejects the framing. Instead of two heap files plus an index that maps node IDs to edge rows, store the adjacency information inside the node record itself. Specifically, every node record on disk holds a single field: a file offset pointing to the first edge in its adjacency list. Every edge record holds pointers to: its source node, its destination node, the next edge in the source's adjacency list, and the *next* edge in the destination's adjacency list. The result is a doubly linked list of edges threaded through each node — every edge is a member of two such lists, one per endpoint.
This is the layout Neo4j has shipped since version 1.0. The on-disk files are named after their contents: neostore.nodestore.db holds fixed-size node records (15 bytes each in older versions, larger now), neostore.relationshipstore.db holds fixed-size relationship records (34 bytes each), and neostore.propertystore.db holds the variable-length property lists that hang off both. Because every record is fixed-size, the file offset of node n is just n × record_size — there is no need for an index from node ID to file location. Why fixed-size matters: the moment record sizes are variable, you need an index to find a record by ID, and you have just reintroduced the very B-tree probe you were trying to eliminate. Fixed-size records make ID-to-offset translation a single multiply instruction rather than a tree walk.
Walking from a node to a neighbour now looks like this. Read the node record at offset id × 15. Take its first-relationship pointer. Read the relationship record at that offset. The relationship record contains both endpoint IDs and pointers to the next relationship in each endpoint's list. Follow the destination pointer (multiply by node record size, read) to land on the neighbour. No index lookups. Just two file-offset multiplications and two page reads. On a warm buffer cache, both pages are already in RAM, and a hop costs roughly 1 µs — dominated by the cost of the pointer dereference and the L2-cache miss on the page.
The speedup: 100× per hop, compounded
Put the two together and the gap is stark. A single hop in Postgres is roughly 100 µs; the same hop in Neo4j is roughly 1 µs. A factor of one hundred per hop. Multi-hop queries do not just inherit this speedup — they amplify it, because the relational query has to re-pay the index cost at every hop, while the native query continues pointer-chasing at the same rate.
Index-free adjacency, in code
The pseudo-code makes it concrete. Here is what one hop looks like in each world.
# Relational: every hop is an indexed lookup
def neighbours_postgres(conn, node_id):
    cur = conn.cursor()
    # Postgres planner: index seek on the edges primary key, then a nested-loop join into nodes
    cur.execute("""
        SELECT n.id, n.label, n.properties
        FROM nodes n
        JOIN edges e ON e.dst_id = n.id
        WHERE e.src_id = %s
    """, (node_id,))
    return cur.fetchall()  # ~100 µs warm, ~200 µs cold

# Native: every hop is a pointer chase
NODE_REC_SIZE = 15  # bytes (illustrative)
REL_REC_SIZE = 34

def neighbours_neo4j(store, node_id):
    # Compute file offset directly; no index lookup
    node_off = node_id * NODE_REC_SIZE
    node = store.read_node(node_off)  # one page read
    neighbours = []
    rel_off = node.first_rel  # the pointer that lives in the node record
    while rel_off is not None:
        rel = store.read_rel(rel_off)  # one page read per edge
        # Decide which end of the edge is "us" so we walk the correct linked list
        if rel.src == node_id:
            other_id = rel.dst
            rel_off = rel.next_src_edge
        else:
            other_id = rel.src
            rel_off = rel.next_dst_edge
        neighbours.append(other_id)
    return neighbours  # ~1 µs/edge warm
Notice the symmetry. The relational version asks the database "find me the rows where src_id = X", and the database descends a B-tree to answer. The native version already knows where the first edge lives — its file offset is a field in the node record — and walks a linked list from there. The same primitive (read_rel(offset)) handles every subsequent edge. Why this changes the asymptotics: in the relational query, every edge walked pays a log N B-tree descent, so a traversal that visits E edges costs O(E × log N). In the native query the same traversal costs O(E) — proportional to the edges actually walked, not to the size of the database. Whether you have a thousand nodes or a billion, the per-edge cost is the same.
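The single-hop primitive composes directly into multi-hop traversal. A sketch of a breadth-first k-hop expansion built on the neighbours_neo4j function above; the visited set is what keeps the work proportional to the edges actually walked rather than to the raw fan-out product:

def expand_k_hops(store, start_id, k):
    """Breadth-first expansion: every node within k hops of start_id."""
    frontier = {start_id}
    visited = {start_id}
    for _ in range(k):
        next_frontier = set()
        for node_id in frontier:
            for other_id in neighbours_neo4j(store, node_id):  # ~1 µs per edge, warm
                if other_id not in visited:
                    visited.add(other_id)
                    next_frontier.add(other_id)
        frontier = next_frontier
    return visited - {start_id}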
Trade-offs: writes, supernodes, and tabular queries
Native adjacency is not free. Three costs show up in production.
Write amplification. Inserting an edge in the relational schema is a single INSERT INTO edges plus an index update — two B-tree pages dirtied. Inserting an edge in Neo4j requires (1) allocating a new relationship record, (2) updating the source node's adjacency list (find the previous "first" edge, point it back to the new one, point the node to the new one), and (3) updating the destination node's adjacency list the same way. At least three records dirtied, with a strong locality penalty if the source and destination live on different pages — and they almost always do, because nodes are clustered by ID, not by neighbour. Why this matters at scale: a payments graph that ingests a million transactions per second pays this overhead a million times. Neo4j Enterprise added group-commit and parallel transaction logging to amortise the cost, and TigerGraph added a write-back cache layer on top of the same idea — but the fundamental write cost is higher than relational, and that is the price of read speed.
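To see where the extra dirtied records come from, here is a sketch of the edge insert against the same illustrative store API as the pseudo-code above (read_node, write_node, allocate_rel, write_rel are assumed helpers, not Neo4j's real internal interface). The new relationship is prepended to both endpoints' adjacency lists, so the relationship record and both node records are rewritten; in the real doubly linked layout the previous head relationship of each list is rewritten as well.

def insert_edge_native(store, src_id, dst_id, rel_type):
    # 1. Allocate and write the new relationship record (one page dirtied).
    rel_off = store.allocate_rel()
    src = store.read_node(src_id * NODE_REC_SIZE)
    dst = store.read_node(dst_id * NODE_REC_SIZE)
    # The new edge points at the old heads of both adjacency lists.
    store.write_rel(rel_off, src=src_id, dst=dst_id, type=rel_type,
                    next_src_edge=src.first_rel, next_dst_edge=dst.first_rel)
    # 2 & 3. Both endpoint nodes now name the new edge as their first relationship
    # (two more pages dirtied; src and dst are rarely on the same page).
    src.first_rel = rel_off
    dst.first_rel = rel_off
    store.write_node(src_id * NODE_REC_SIZE, src)
    store.write_node(dst_id * NODE_REC_SIZE, dst)
    return rel_off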
Supernodes. The doubly-linked-list-of-edges layout is fine when nodes have a few hundred edges. It becomes a concurrency disaster when one node has five million edges — the canonical Bollywood-actor-with-five-million-followers problem. Adding an edge to that node's adjacency list means traversing or splicing into a linked list of five million entries, with locks held the whole time. Neo4j 3.0 introduced **dense-node optimisations** that switch supernodes from a single linked list to a per-relationship-type list (so traversal can skip irrelevant types) and eventually to a small per-type B-tree-like structure. JanusGraph mitigates differently by sharding each node's edge list across the underlying Cassandra/HBase store so that the linked-list traversal is parallel. Either way, supernodes are the place where index-free adjacency's simplifying assumption — that a degree-N adjacency is a single linked list — breaks, and engineering complexity creeps back in.
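The shape of the dense-node fix is simple to sketch: instead of one list holding all five million edges, the node keeps a small map from relationship type to its own list, so a traversal that only cares about one type never touches the others. A toy in-memory illustration of the idea, not Neo4j's actual relationship-group record format:

from collections import defaultdict

class DenseNode:
    """Toy dense-node adjacency: one edge list per relationship type."""
    def __init__(self, node_id):
        self.node_id = node_id
        self._by_type = defaultdict(list)  # rel_type -> [neighbour node ids]

    def add_edge(self, rel_type, other_id):
        self._by_type[rel_type].append(other_id)

    def neighbours(self, rel_type=None):
        # Traversals that name a type skip every other type's list entirely.
        if rel_type is not None:
            return list(self._by_type.get(rel_type, []))
        return [n for edges in self._by_type.values() for n in edges]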
Tabular queries. "What is the average age of all our users?" is a one-line SQL query that scans the users table sequentially, and Postgres will do it in a few hundred milliseconds for ten million rows. The same query in Neo4j has to scan every node record, decode every property list (which lives in a separate file), filter by label, and aggregate — and the property list is variable-length, so there is no nice row-store SIMD. Native graph databases have steadily added secondary indexes (Neo4j's schema indexes, JanusGraph's composite indexes) to accelerate label-and-property scans, but the sweet spot is genuinely traversal. If your workload is 90 % aggregations and 10 % traversals, you want Postgres. If it is 90 % traversals and 10 % aggregations, you want Neo4j. The interesting world is everywhere in between, which is why production graph stacks usually keep both — Postgres for the system of record, Neo4j or TigerGraph for the traversal index — and CDC-replicate from one to the other.
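Concretely, the aggregation looks like this against the relational schema from the top of the chapter, where the age lives in the JSONB properties column; the 'User' label and 'age' property are illustrative names, not part of the earlier example:

-- Postgres: one sequential scan over a compact row store
SELECT AVG((properties->>'age')::INT) AS avg_age
FROM nodes
WHERE label = 'User';

The equivalent Cypher is just as short to write, but behind it the engine scans every node carrying the User label and decodes each node's property chain before it can aggregate:

MATCH (u:User)
RETURN avg(u.age) AS avg_age;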
The native landscape
Four systems matter today, each occupying a slightly different point in the design space.
Neo4j is the canonical implementation — it shipped index-free adjacency in 2010, popularised the term, and remains the most widely deployed native graph database. Storage is on disk, with a buffer pool (page cache) sized like Postgres's; the engine, query language (Cypher), and operational tooling are mature. The ecosystem includes hot-failover, role-based clustering, and a graph data science library. If you are starting a graph project today and have no other constraints, this is the default.
JanusGraph decouples the graph layer from the storage backend: you can run JanusGraph on top of Cassandra, HBase, BerkeleyDB, or ScyllaDB. The native-adjacency story holds when JanusGraph is configured with BerkeleyDB (single-node, fixed-size records); on Cassandra, the adjacency is "logical" — each node's edge list is one Cassandra row, with edges stored as columns, and traversal is a Cassandra read rather than a pointer chase. This loses some single-hop latency but inherits Cassandra's horizontal scalability, which is the trade Twitter and others made for their early graph workloads.
TigerGraph is the speed extremist. It compresses node and edge records aggressively, runs traversal in massively parallel C++ kernels, and consistently benchmarks an order of magnitude faster than Neo4j on multi-hop queries. The catch is a more proprietary stack and a steeper learning curve for its query language (GSQL). It is the choice when you have a very large graph (tens of billions of edges) and need traversal latency at single-digit milliseconds.
Memgraph keeps the entire graph in RAM and uses a skiplist-of-records layout instead of a fixed-record file. The pointer-chase becomes a memory-pointer dereference (no buffer-pool indirection at all), pushing per-hop cost into the nanosecond range. Crash recovery uses a write-ahead log replayed on startup. The trade-off is RAM cost: a graph that sits comfortably on a cheap 1 TB SSD now needs a machine with 256 GB or more of RAM. For real-time recommendation engines and trading-graph analytics, the cost is justified.
Tracing a fraud ring across an Indian payments graph
The compliance team at an Indian payment processor has flagged a UPI account belonging to one Asha Mehta as suspicious — a sudden cluster of small-value transactions with mismatched KYC details. The investigators want to know who Asha is connected to, three hops out: not just the people she paid directly, but the people those people paid, and the people those people paid. A fraud ring typically shows up as a tight cycle two or three hops away from the seed account.
The graph: 50 million UPI accounts, average degree 12 (mean transactions per account in the lookback window). The query: starting from Asha's node, expand out three hops, return all nodes reached. The expected fan-out is 12 × 12 × 12 ≈ 1,700 edges visited — but with cycle de-duplication, the unique node count typically settles around 800.
Postgres path. The query is a recursive CTE:
WITH RECURSIVE ring AS (
    SELECT 42 AS node_id, 0 AS depth
    UNION ALL
    SELECT e.dst_id, r.depth + 1
    FROM ring r
    JOIN edges e ON e.src_id = r.node_id
    WHERE r.depth < 3
)
SELECT DISTINCT node_id FROM ring;
Each recursive step does a B-tree probe per frontier node. Step 1: 1 probe → 12 results. Step 2: 12 probes → 144 results. Step 3: 144 probes → 1,728 results. That is only 157 descents into the edges B-tree, but each probe fans out: the planner fetches roughly 1,700 individual edge tuples, and each of them still pays for executor materialisation, MVCC visibility checks, the DISTINCT, and — in the full investigation query — the join into nodes for each dst_id. The query lands in the 600 ms to 1.5 s range, depending on cache state. Live measurements on a real payments dataset of this size routinely show 800 ms.
Neo4j path. The same query in Cypher:
MATCH (asha {id: 42})-[*1..3]->(neighbour)
RETURN DISTINCT neighbour
The engine starts at node 42 (one offset multiplication, one page read). It walks the doubly linked edge list (about 12 pointer dereferences). For each neighbour it recursively walks its edge list. Total pointer chases: roughly 1,700, at about 1–5 µs each (hot cache). The query lands in the 5–15 ms range — you can run it on every UPI alert, in real time, while the customer is still inside the payment screen.
The compliance team that ran this on Postgres saw the dashboard refresh in around a second per query and ran the analysis as a nightly batch. After moving the traversal layer to Neo4j (with Postgres remaining the source of truth, replicated via Debezium), the same query landed under fifteen milliseconds and they wired it into the real-time fraud check — every UPI transaction over ₹10,000 now triggers a three-hop expansion before approval. The 100× per-hop speedup turned a batch analysis into a live control plane. Same data, same query semantics, completely different operating mode.
The takeaway
Index-free adjacency is one of those architectural decisions that sounds incremental — "store a pointer in the node record" — and turns out to be the entire reason a category of database exists. The relational schema with nodes and edges is correct, queryable, and extensible; it is also irreducibly slow at multi-hop traversal because the only way to answer "what are the edges of node X" is to look them up. Native engines refuse the lookup. They store the answer where the question gets asked — inside the node record itself — and the cost per hop falls by two orders of magnitude. That falling cost compounds over hops, which is why fraud-ring detection, recommendation cascades, and citation-graph BFS have moved from batch to interactive in the decade since Neo4j shipped its pointer-chasing storage engine.
The trade-offs are real. Writes are heavier because every edge update touches both endpoints. Supernodes are concurrency hotspots that need special-case handling. Tabular aggregations run faster on Postgres because Postgres has a sequential row-store and the graph engine does not. Modern systems mitigate all three — dense-node optimisations, secondary indexes, write-back caches — but the underlying tension never fully goes away. The right answer in production is almost always both: a relational system of record for tabular truth, a native graph alongside it for traversal-heavy queries, replicated through CDC. The next chapter looks at the traversal primitives those native engines expose — out, in, both, step, path — and how they compose into the query languages (Cypher, Gremlin, GSQL) you actually write against them.
References
- Neo4j, Database internals: storage engine — neo4j.com/docs/operations-manual/current/database-internals/. Definitive on-disk layout reference.
- Robinson, Webber, Eifrem, Graph Databases (2nd ed., O'Reilly, 2015), Chapter 6 — graphdatabases.com. The canonical exposition of index-free adjacency, by Neo4j's founders.
- Neo4j blog, Graph databases for beginners: native vs non-native graph technology — neo4j.com/blog/native-vs-non-native-graph-technology/. The original "index-free adjacency" framing.
- JanusGraph documentation, Storage backends overview — docs.janusgraph.org/storage-backend/. How a storage-agnostic graph engine maps adjacency onto Cassandra, HBase, and BerkeleyDB.
- Apache TinkerPop, Why graphs? — tinkerpop.apache.org/docs/current/reference/. The reference Gremlin guide includes a clear treatment of when graph traversal beats relational joins.
- Malewicz et al., Pregel: a system for large-scale graph processing, SIGMOD 2010 — research.google/pubs/pregel-a-system-for-large-scale-graph-processing/. The other influential graph-storage school: vertex-centric, not adjacency-centric.