Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.
In short
Graph databases come in two flavours that draw the same picture but disagree about what the picture means. Property graphs (Neo4j, queried with Cypher) treat nodes and edges as first-class objects that carry their own properties — pragmatic, app-friendly, schema-on-read. RDF (Apache Jena, Wikidata, queried with SPARQL) flattens everything into uniform (subject, predicate, object) triples with global URIs and formal OWL semantics — verbose for one team, indispensable when data crosses organisational boundaries. Pick property graphs to ship a product feature; pick RDF when the world has to query your data.
Ask a relational database for the friends-of-friends-of-friends of user 42 who live in Bengaluru and bought shoes last month, and you get a self-join three times deep that grows with the size of the friendship table. Ask a graph database the same question and you walk three hops out from user 42, paying only for the friends that actually exist. That single change in cost shape is why the last category in this curriculum stores relationships as first-class citizens — but the moment you commit to a graph database, you walk into a fork most tutorials never mention.
There are two fundamentally different graph data models on the market, with two different histories, two different query languages, two different developer ecosystems, and two different mental models. They are called property graphs and RDF. Choosing between them up front, before you write your first query, is more important than choosing between Neo4j and JanusGraph, or between Apache Jena and Stardog — because once you have committed to one model, the other is effectively unreachable without rewriting your application.
This chapter derives both models from first principles, walks the same Indian e-commerce example through each one, and gives you a decision tree.
The property graph model: nodes and edges that carry data
The property graph model is the simpler of the two to grasp, partly because it matches how most engineers already draw graphs on whiteboards and partly because it borrows naturally from object-oriented programming. The model has exactly four primitive concepts.
Nodes. A node is an entity. Riya is a node. Rahul is a node. The product "Nike Pegasus 41" is a node. Nodes have an internal identifier (Neo4j gives them numeric IDs, but this is an implementation detail you rarely touch).
Labels. A node can carry one or more labels that classify it. Riya is labelled :Person; the Nike Pegasus 41 is labelled :Product. A node can have multiple labels (:Person:Customer:PrimeMember), which makes labels feel like a mix of "type" and "tag". Labels are how you say "give me all the Persons" without scanning every node.
Properties. A node carries a map of key-value pairs — its properties. Riya's properties might be {name: "Riya Sharma", age: 27, city: "Bengaluru", joined: 2024-03-15}. Properties are typed (string, integer, boolean, date, list, point) but the schema is flexible — two :Person nodes are not required to have identical property sets. This is the "NoSQL feel" of property graphs.
Edges (relationships). An edge connects two nodes and itself carries a type and properties. Riya [:KNOWS {since: 2019, on: "Picstrand"}] Rahul. The edge has a single type (:KNOWS), is directed (Riya → Rahul, distinct from Rahul → Riya, though the query language can ignore direction when you want), and has its own property map. The fact that edges carry properties is the headline feature — the since of a friendship lives on the friendship itself, not on a separate "friendship metadata" table.
That is the whole model. Four primitives: nodes, labels, properties, edges with types and properties. Everything else — indexes, constraints, schema validation — is operational sugar layered on top.
Why properties on edges matter so much: in a relational world, "Riya KNOWS Rahul since 2019" requires a friendships table with (person_a_id, person_b_id, since_date) columns, and every query about the friendship has to join it. In RDF (which we will see next), it requires either reification or a named graph trick — both verbose. In a property graph, it is one edge with one property. This single design decision is why teams that have done graph modelling in both worlds tend to find property graphs faster to iterate on.
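To make the four primitives concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data; the class names `Node` and `Edge` and their fields are illustrative assumptions chosen only to mirror the model described above.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: int                                  # internal identifier
    labels: set = field(default_factory=set)      # e.g. {"Person", "Customer"}
    props: dict = field(default_factory=dict)     # free-form key-value map

@dataclass
class Edge:
    src: int                                      # id of the start node
    dst: int                                      # id of the end node
    rel_type: str                                 # exactly one type per edge
    props: dict = field(default_factory=dict)     # edges carry properties too

riya = Node(1, {"Person"}, {"name": "Riya Sharma", "age": 27, "city": "Bengaluru"})
rahul = Node(2, {"Person"}, {"name": "Rahul Verma"})  # fewer properties: allowed

# "since" lives on the relationship itself, not in a side table.
knows = Edge(riya.node_id, rahul.node_id, "KNOWS", {"since": 2019, "on": "Picstrand"})

nodes = {n.node_id: n for n in (riya, rahul)}
# Labels let you select "all the Persons" by classification.
persons = [n.props["name"] for n in nodes.values() if "Person" in n.labels]
```

Note that `rahul` has no `age` or `city` key and nothing complains, which is the flexible-schema behaviour described above.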
The query language is Cypher, originally invented at Neo4j and the main basis for ISO/IEC GQL (Graph Query Language, published as ISO/IEC 39075 in 2024). Cypher's syntax draws ASCII pictures of the patterns you want to match. To find friends-of-friends of Riya:
MATCH (riya:Person {name: "Riya"})-[:KNOWS]->(friend)-[:KNOWS]->(fof)
WHERE fof.city = "Bengaluru"
RETURN fof.name, fof.age
Read it left to right: start at a :Person node named Riya, follow a :KNOWS edge to a friend, follow another :KNOWS edge to a friend-of-friend, filter to those in Bengaluru, return the name and age. The pattern in the MATCH clause looks like the picture you would draw on a whiteboard, and that resemblance is a large part of why Cypher took off: a beginner can read the pattern aloud.
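The two-hop walk in that pattern can be sketched in plain Python over an adjacency list. This is only an illustration of the traversal shape, not of how a Cypher engine plans the query; the extra people (Meera, Arjun, Kabir) and their cities and ages are made-up sample data.

```python
# Directed KNOWS edges as an adjacency list (hypothetical sample data).
knows = {
    "Riya": {"Rahul", "Meera"},
    "Rahul": {"Arjun"},
    "Meera": {"Arjun", "Kabir"},
    "Arjun": set(),
    "Kabir": set(),
}
city = {"Riya": "Bengaluru", "Rahul": "Pune", "Meera": "Bengaluru",
        "Arjun": "Bengaluru", "Kabir": "Mumbai"}
age = {"Riya": 27, "Rahul": 29, "Meera": 26, "Arjun": 31, "Kabir": 34}

def friends_of_friends(start, wanted_city):
    """Walk exactly two KNOWS hops out from start, then filter by city."""
    results = set()
    for friend in knows.get(start, ()):        # first hop
        for fof in knows.get(friend, ()):      # second hop
            if city.get(fof) == wanted_city:
                results.add((fof, age[fof]))
    return sorted(results)

fof_rows = friends_of_friends("Riya", "Bengaluru")
```

The work done is proportional to the edges actually followed from Riya, not to the total number of friendships. One difference from the Cypher query: Arjun is reachable through both Rahul and Meera, so Cypher would return him once per path unless you add DISTINCT, whereas the set here deduplicates.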
The other major property-graph query language is Gremlin, part of the Apache TinkerPop project and supported by JanusGraph, Riverone Neptune, OrientDB, and others. Gremlin is a traversal language — you write g.V().has('name','Riya').out('KNOWS').out('KNOWS').has('city','Bengaluru').values('name','age') — which is more imperative and chains better in code, but harder to read for declarative pattern queries. Cypher and Gremlin coexist; chapter 161 walks both in detail.
The RDF model: everything is a triple
RDF (Resource Description Framework) was born in a different world. While the property graph model evolved organically from "let's add types to graph theory and make it useful for apps", RDF was designed top-down by the W3C between 1999 and 2004 as the foundation of the semantic web — Tim Berners-Lee's vision of a web where machines could read and reason about data the way humans read HTML. The design priorities were therefore different: maximum interoperability across organisations, formal logical semantics, and the ability to compose facts from unrelated sources.
The RDF model has exactly one primitive: the triple. A triple is a three-part statement of the form (subject, predicate, object). Read it like a sentence: subject predicate object. "Riya is a Person." "Riya knows Rahul." "Riya has age 27."
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/riya> rdf:type <http://example.org/Person> .
<http://example.org/riya> foaf:name "Riya Sharma" .
<http://example.org/riya> foaf:age 27 .
<http://example.org/riya> <http://example.org/knows> <http://example.org/rahul> .
<http://example.org/rahul> rdf:type <http://example.org/Person> .
<http://example.org/rahul> foaf:name "Rahul Verma" .
Six triples encode what we earlier expressed as two property-graph nodes plus one edge. Everything is uniform. There are no nodes versus edges; there are no labels versus properties. Just triples. The "type" of a resource (rdf:type Person) is a triple. The "name" of a resource (foaf:name "Riya Sharma") is a triple. The "knows" relationship is a triple. The atomic data unit is the triple, full stop.
URIs identify resources. Every subject and predicate is a URI, a globally unique identifier in the same namespace as URLs on the web. <http://example.org/riya> is not necessarily a clickable web page; the URI is just a name with a guaranteed-unique structure. The fact that two organisations can independently mint URIs in their own namespaces (<http://flipkart.com/data/user/12345>, <http://www.wikidata.org/entity/Q42>) without colliding is the foundation of RDF's federation story. Merge two RDF datasets and triples about the same URI line up automatically.
Objects can be URIs or literals. The object of a triple is either another resource (a URI) or a literal value (a string, integer, date, with optional language tag and datatype). (riya, knows, rahul) has a URI object — it is a relationship. (riya, name, "Riya Sharma"@en) has a literal object — it is a property in the property-graph sense.
Why uniformity is both RDF's strength and its weakness: because everything is a triple, you can merge two RDF datasets by dumping their triples into the same store and the meaning is preserved, since facts about the same URI from different sources line up automatically. This is the federation story. The cost is verbosity: a single conceptual entity ("Riya") explodes into half a dozen rows, and adding a property to a relationship (the since on a friendship) requires either reification (a workaround that spends four triples describing the statement itself, plus one more for each property attached to it, to build a "statement about a statement") or a named graph, both of which add cognitive overhead. In a property graph that property is just one key on the edge.
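The "merge by dumping triples together" claim can be illustrated with plain Python sets. This is a toy sketch, assuming each dataset is just a set of (subject, predicate, object) tuples; the variable names and the `describe` helper are hypothetical, and a real triple store adds indexing, datatypes, and blank-node handling.

```python
EX = "http://example.org/"

# Two datasets produced independently (hypothetical sample data).
store_a = {
    (EX + "riya", "rdf:type", EX + "Person"),
    (EX + "riya", "foaf:name", "Riya Sharma"),
}
store_b = {
    (EX + "riya", EX + "knows", EX + "rahul"),
    (EX + "riya", "foaf:name", "Riya Sharma"),   # same fact, stated twice
}

# Merging is set union: identical triples collapse, and facts about the
# same URI end up side by side with no mapping step.
merged = store_a | store_b

def describe(graph, subject):
    """Return every (predicate, object) pair stated about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)

riya_facts = describe(merged, EX + "riya")
```

The merge only lines up because both sources used the same URI for Riya; if one had minted a different identifier, the union would silently hold two unrelated resources.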
The query language for RDF is SPARQL (SPARQL Protocol and RDF Query Language, W3C standard, version 1.1 ratified 2013, 1.2 in late stages). SPARQL's syntax is also pattern-matching, but the patterns are written as triples:
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?fofName ?fofAge WHERE {
  ?riya foaf:name "Riya Sharma" .
  ?riya ex:knows ?friend .
  ?friend ex:knows ?fof .
  ?fof foaf:name ?fofName .
  ?fof foaf:age ?fofAge .
  ?fof ex:city "Bengaluru" .
}
Each line in the WHERE block is a triple pattern with variables (the ? prefix) where you want SPARQL to find bindings. The query asks: find a ?riya whose name is "Riya Sharma", find someone she knows, find someone they know, and return that person's name and age provided they live in Bengaluru. Same query as the Cypher example, expressed via triple patterns instead of an ASCII picture.
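How triple patterns turn into variable bindings can be sketched with a naive matcher. This is a teaching toy under simplifying assumptions (any string starting with `?` is a variable, every pattern is tried against every triple, there is no query planning); the function names `match` and `solve` and the sample data are hypothetical, and a real SPARQL engine works very differently inside.

```python
def match(pattern, triple, binding):
    """Extend binding so pattern equals triple; return None on conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):   # variable position
            if new.setdefault(p, t) != t:              # already bound elsewhere
                return None
        elif p != t:                                   # constant must be equal
            return None
    return new

def solve(patterns, triples):
    """Join the patterns one at a time, keeping every consistent binding."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

triples = [
    ("riya", "name", "Riya Sharma"), ("riya", "knows", "rahul"),
    ("rahul", "knows", "meera"), ("rahul", "knows", "kabir"),
    ("meera", "name", "Meera Iyer"), ("meera", "age", 29), ("meera", "city", "Bengaluru"),
    ("kabir", "name", "Kabir Rao"), ("kabir", "age", 35), ("kabir", "city", "Mumbai"),
]
query = [
    ("?riya", "name", "Riya Sharma"),
    ("?riya", "knows", "?friend"),
    ("?friend", "knows", "?fof"),
    ("?fof", "name", "?fofName"),
    ("?fof", "age", "?fofAge"),
    ("?fof", "city", "Bengaluru"),
]
rows = [(b["?fofName"], b["?fofAge"]) for b in solve(query, triples)]
```

Each pattern narrows or extends the set of candidate bindings, which is why the order of lines in a WHERE block does not change the result, only (in a real engine) the cost.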
Beyond SELECT queries, SPARQL also supports CONSTRUCT (build new triples from query results — handy for inference), ASK (boolean queries), DESCRIBE (return all triples about a resource), and SPARQL Update for INSERT/DELETE.
Vocabularies, ontologies, and inference: the layer above RDF
RDF rarely travels alone. The semantic-web stack adds two layers on top of the bare triple model.
RDFS (RDF Schema) lets you declare classes and subclasses, properties and subproperties, domains and ranges. "ex:Customer rdfs:subClassOf ex:Person" is itself a triple — the schema lives in the same triple store as the data. A reasoner that sees (riya rdf:type Customer) and the subclass triple can infer (riya rdf:type Person) automatically without that triple ever being stored.
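The subclass inference described here can be sketched as a small fixed-point loop. This covers only the single rule "if x has type C and C is a subclass of D, then x has type D"; it is an assumption-laden toy, not a full RDFS reasoner, and the `ex:Agent` class is added purely to show a two-step chain.

```python
def infer_types(triples):
    """Apply the rdfs:subClassOf typing rule repeatedly until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for c, q, parent in list(inferred):
                derived = (s, "rdf:type", parent)
                if q == "rdfs:subClassOf" and c == o and derived not in inferred:
                    inferred.add(derived)
                    changed = True
    return inferred

stored = {
    ("ex:riya", "rdf:type", "ex:Customer"),
    ("ex:Customer", "rdfs:subClassOf", "ex:Person"),   # schema lives with the data
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),      # hypothetical extra level
}
closure = infer_types(stored)
```

The two derived typing triples exist only in `closure`; `stored` is unchanged, which mirrors a reasoner answering from inferred facts that were never written to the store.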
OWL (Web Ontology Language) goes further. You can declare that a property is symmetric (ex:friendOf — if Riya is a friend of Rahul, Rahul is a friend of Riya), transitive (ex:ancestorOf), inverse (ex:parentOf is the inverse of ex:childOf), or functional (only one value allowed per subject). You can express disjoint classes, equivalence, cardinality constraints, and complex class definitions. An OWL reasoner uses these axioms to derive new triples from existing ones — inference becomes a first-class capability.
This is what people mean when they say "RDF has formal semantics". The data model is grounded in description logic; reasoners like Pellet, HermiT, and the one shipped inside Stardog can answer queries that involve derived facts, not just stored facts. Property graphs have nothing equivalent at the data-model level — you can implement inference in application code or with stored procedures, but it is not a native feature of Cypher or Gremlin.
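A rough feel for what such axioms do can be had from a forward-chaining sketch covering three of the property characteristics named above: symmetric, transitive, and inverse. This is a simplified illustration, not an OWL reasoner; the function name `apply_axioms`, its parameters, and the family names are assumptions made for the example.

```python
def apply_axioms(facts, symmetric=(), transitive=(), inverse_of=None):
    """Derive new triples from property axioms until a fixed point is reached."""
    inverse_of = inverse_of or {}
    closed = set(facts)
    while True:
        new = set()
        for s, p, o in closed:
            if p in symmetric:                       # p(s, o) implies p(o, s)
                new.add((o, p, s))
            if p in inverse_of:                      # p(s, o) implies q(o, s)
                new.add((o, inverse_of[p], s))
            if p in transitive:                      # p(s, o), p(o, x) imply p(s, x)
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:                            # nothing new was derived
            return closed
        closed |= new

facts = {
    ("riya", "friendOf", "rahul"),
    ("asha", "ancestorOf", "bina"),
    ("bina", "ancestorOf", "chitra"),
    ("asha", "parentOf", "bina"),
}
derived = apply_axioms(
    facts,
    symmetric={"friendOf"},
    transitive={"ancestorOf"},
    inverse_of={"parentOf": "childOf", "childOf": "parentOf"},
)
```

In a property graph you would write loops like this yourself in application code; in the RDF stack the axioms are data and an off-the-shelf reasoner applies them.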
For most application teams, formal semantics are over-engineering. For pharma researchers integrating SNOMED CT (the standard medical terminology, 350,000 concepts) with the Gene Ontology (47,000 terms) with the Disease Ontology and ChEBI (chemicals), formal semantics are the only way the integration can work without writing custom reconciliation code for every pair.
A side-by-side comparison: Riya knows Rahul
The single example below shows the same fact — Riya knows Rahul, with both being people who joined in 2024 — expressed in both models. This is the most useful diagram in the chapter; print it.
The two columns above explain why most application teams find property graphs faster to ship with. To add metadata to a relationship in Neo4j you write (riya)-[:KNOWS {since: 2019, on: "Picstrand"}]->(rahul) and you are done. To do the same in RDF you either reify the statement (verbose but standard) or you use RDF-star — an extension supported by Apache Jena, GraphDB, and others that lets you write <<ex:riya ex:knows ex:rahul>> ex:since 2019 and treats the embedded triple as a first-class subject. RDF-star solves the verbosity problem and is the modern answer, but it is an extension layered on top of the original model rather than the core.
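The difference in ceremony can be counted directly. The sketch below builds the same "Riya knows Rahul since 2019" fact three ways using plain tuples; the helper names `reify` and `rdf_star` and the statement identifier `_:stmt1` are illustrative assumptions, and the nested tuple only imitates the idea of an embedded triple rather than any library's RDF-star API.

```python
# Property graph: the metadata is a map on the edge itself.
pg_edge = ("riya", "KNOWS", "rahul", {"since": 2019, "on": "Picstrand"})

def reify(s, p, o, metadata, stmt_id):
    """Classic RDF reification: describe the statement, then annotate it."""
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    triples += [(stmt_id, key, value) for key, value in metadata.items()]
    return triples

def rdf_star(s, p, o, metadata):
    """RDF-star style: the embedded triple itself is used as the subject."""
    quoted = (s, p, o)
    return [(quoted, key, value) for key, value in metadata.items()]

meta = {"ex:since": 2019, "ex:on": "Picstrand"}
reified = reify("ex:riya", "ex:knows", "ex:rahul", meta, "_:stmt1")
starred = rdf_star("ex:riya", "ex:knows", "ex:rahul", meta)
```

With two metadata keys, reification yields six triples and the RDF-star form yields two, against a single edge in the property graph.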
Worked: an Indian e-commerce recommendation graph
Build a tiny recommendation graph for a BharatBazaar-style store. The data: User user42 (Riya in Bengaluru) bought Product prod7 (Nike Pegasus 41) on 2026-03-12; Product prod7 is in Category cat3 (Running Shoes); Category cat3 is a subcategory of cat1 (Footwear). Five facts. Express the same data in both graph models, then write a recommendation query in both.
Property graph (Cypher). Loading the data:
CREATE (u:User {id: "user42", name: "Riya", city: "Bengaluru"})
CREATE (p:Product {sku: "prod7", name: "Pegasus 41", price: 12995})
CREATE (c3:Category {id: "cat3", name: "Running Shoes"})
CREATE (c1:Category {id: "cat1", name: "Footwear"})
CREATE (u)-[:BOUGHT {on: date("2026-03-12")}]->(p)
CREATE (p)-[:IN_CATEGORY]->(c3)
CREATE (c3)-[:SUBCATEGORY_OF]->(c1)
Recommendation query: "find products in the same top-level category that Riya has not bought yet":
MATCH (riya:User {id: "user42"})-[:BOUGHT]->(bought:Product)
      -[:IN_CATEGORY]->()-[:SUBCATEGORY_OF*0..]->(top:Category)
      <-[:SUBCATEGORY_OF*0..]-()<-[:IN_CATEGORY]-(rec:Product)
WHERE NOT (riya)-[:BOUGHT]->(rec)
RETURN DISTINCT rec.name, rec.price
ORDER BY rec.price
LIMIT 10
The *0.. is the variable-length path operator — match zero or more SUBCATEGORY_OF edges, so the query catches both products in the same leaf category (Running Shoes) and products in sibling categories under Footwear (e.g. Casual Shoes). The pattern reads like the picture you would draw on a whiteboard.
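What the zero-or-more hop expands to can be sketched in Python. This assumes the category hierarchy is a tree with no cycles; the extra products (prod8, prod9, prod10), categories (cat4, cat8, cat9), and prices are made-up sample data added so the query has something to return.

```python
in_category = {"prod7": "cat3", "prod8": "cat3", "prod9": "cat4", "prod10": "cat9"}
parent = {"cat3": "cat1", "cat4": "cat1", "cat9": "cat8"}     # SUBCATEGORY_OF
bought = {"user42": {"prod7"}}
price = {"prod7": 12995, "prod8": 8995, "prod9": 4999, "prod10": 1999}

def ancestors_or_self(cat):
    """Zero or more SUBCATEGORY_OF hops: the category plus all its ancestors."""
    chain = [cat]
    while chain[-1] in parent:          # assumes the hierarchy has no cycles
        chain.append(parent[chain[-1]])
    return set(chain)

def recommend(user):
    owned = bought.get(user, set())
    tops = set()
    for prod in owned:                  # every category level reachable upward
        tops |= ancestors_or_self(in_category[prod])
    recs = {p for p, c in in_category.items()
            if p not in owned and ancestors_or_self(c) & tops}
    return sorted(recs, key=lambda p: price[p])

recommendations = recommend("user42")
```

Because zero hops are allowed, prod8 matches through the shared leaf category cat3, while prod9 matches only after climbing one level to cat1; prod10 sits under an unrelated root and is excluded.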
RDF (SPARQL). Loading the same data, in Turtle syntax:
@prefix ex: <http://flipkart.example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:user42 rdf:type ex:User ; ex:name "Riya" ; ex:city "Bengaluru" .
ex:prod7 rdf:type ex:Product ; ex:name "Pegasus 41" ; ex:price 12995 .
ex:cat3 rdf:type ex:Category ; ex:name "Running Shoes" .
ex:cat1 rdf:type ex:Category ; ex:name "Footwear" .
ex:user42 ex:bought ex:prod7 .
ex:prod7 ex:inCategory ex:cat3 .
ex:cat3 ex:subcategoryOf ex:cat1 .
Recommendation query in SPARQL using property paths (the + and * operators on predicates):
PREFIX ex: <http://flipkart.example.org/>
SELECT DISTINCT ?recName ?recPrice WHERE {
  ex:user42 ex:bought ?bought .
  ?bought ex:inCategory ?leaf .
  ?leaf ex:subcategoryOf* ?top .
  ?other ex:subcategoryOf* ?top .
  ?rec ex:inCategory ?other .
  ?rec ex:name ?recName ;
       ex:price ?recPrice .
  FILTER NOT EXISTS { ex:user42 ex:bought ?rec . }
}
ORDER BY ?recPrice
LIMIT 10
Both queries return the same recommendations. Both walk the same graph topology. The Cypher version uses ASCII-art patterns; the SPARQL version uses triple patterns with variables. Cypher is roughly half the line count and noticeably easier for someone who has not used either before. SPARQL is more uniform — every clause is a triple pattern, no special node-vs-edge syntax — and integrates cleanly if you also want to merge in product data from an external RDF source like a manufacturer's catalog or a public taxonomy.
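For readers who want to see what the `*` property path and FILTER NOT EXISTS amount to on the triple side, here is a hedged Python sketch over a list of tuples. The helpers `path_star` and `objects`, and the extra products and prices, are assumptions for illustration; a SPARQL engine evaluates paths with its own algorithms.

```python
def path_star(triples, predicate, start):
    """Nodes reachable from start via zero or more predicate hops."""
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen

def objects(triples, subject, predicate):
    return {o for s, p, o in triples if s == subject and p == predicate}

def recommend(triples, user):
    bought = objects(triples, user, "ex:bought")
    tops = set()
    for product in bought:
        for leaf in objects(triples, product, "ex:inCategory"):
            tops |= path_star(triples, "ex:subcategoryOf", leaf)
    recs = set()
    for rec, p, other in triples:
        if p != "ex:inCategory" or rec in bought:      # FILTER NOT EXISTS
            continue
        if path_star(triples, "ex:subcategoryOf", other) & tops:
            for price in objects(triples, rec, "ex:price"):
                recs.add((price, rec))
    return [rec for price, rec in sorted(recs)]        # ORDER BY price

data = [
    ("ex:user42", "ex:bought", "ex:prod7"),
    ("ex:prod7", "ex:inCategory", "ex:cat3"),
    ("ex:prod8", "ex:inCategory", "ex:cat3"),
    ("ex:prod9", "ex:inCategory", "ex:cat4"),
    ("ex:cat3", "ex:subcategoryOf", "ex:cat1"),
    ("ex:cat4", "ex:subcategoryOf", "ex:cat1"),
    ("ex:prod8", "ex:price", 8995),
    ("ex:prod9", "ex:price", 4999),
]
recommended = recommend(data, "ex:user42")
```

Everything, including category membership and price, is read through the same triple lookup, which is the uniformity the SPARQL version trades readability for.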
The takeaway: same underlying graph, same traversal logic, same result, two genuinely different developer experiences. For a recommendation engine inside one company's product, the Cypher version is what most teams ship. For a query that has to join BharatBazaar's data with a public RDF taxonomy of footwear categories published by an industry body, the SPARQL version is the natural fit.
When to choose which
The decision is not about which model is "better" — both are mature, both have scaled to billion-edge production deployments, both have active vendor ecosystems. The decision is about which is better fitted to your problem. Three questions sort it.
Question 1: are you building a product feature or integrating a knowledge base? Product features — recommendations, fraud detection, social graphs, network analysis, master data management inside one company — almost always benefit from property graphs. Faster developer onboarding, less verbose data model, more readable queries, less ceremony. Knowledge bases that integrate data from many sources — biomedical, government, scholarly, multi-organisation enterprise — benefit from RDF. URIs and ontologies are the price you pay for plug-and-play federation.
Question 2: do you need formal reasoning? If your application requires deriving new facts from declared rules — "an employee of a subsidiary of a parent company is also an employee of the parent for compliance purposes", "a substance that is a sub-class of a class with property X also has property X" — RDF with OWL gives you this declaratively, with off-the-shelf reasoners. Property graphs require you to write the inference rules in application code or stored procedures.
Question 3: who else needs to query your data? If only your application queries the graph, the choice is internal — pick whichever your team finds productive. If you publish data for the world to query (Wikidata, DBpedia, open government data, public scientific datasets), RDF is the lingua franca; consumers expect SPARQL endpoints, JSON-LD serialisation, and standard vocabularies (FOAF for people, Schema.org for things, Dublin Core for documents).
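The three questions can be written down as a tiny decision helper. This is only a restatement of the heuristic above in code, not a substitute for judgement; the function name, parameter names, and reason strings are hypothetical.

```python
def choose_model(integrates_external_sources, needs_formal_reasoning,
                 queried_by_outsiders):
    """Return (model, reasons) following the chapter's three questions."""
    reasons = []
    if integrates_external_sources:          # Question 1
        reasons.append("integrates data from many sources")
    if needs_formal_reasoning:               # Question 2
        reasons.append("needs declarative inference")
    if queried_by_outsiders:                 # Question 3
        reasons.append("published for others to query")
    if reasons:
        return "RDF", reasons
    return "property graph", ["single-team product feature"]

recommendation_engine = choose_model(False, False, False)
public_knowledge_base = choose_model(True, False, True)
```

Any single "yes" tips the sketch toward RDF, which is deliberately blunt; in practice a team might answer one question with "a little" and still ship a property graph.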
The market reflects this split. Neo4j is the largest property-graph vendor, used heavily for enterprise knowledge graphs (NASA, eBay, Cisco), fraud detection (Italian financial regulator UIF), and social and recommendation systems. JanusGraph (a Linux Foundation project, originally forked from Titan) and TigerGraph target large-scale property-graph deployments. Memgraph is the in-memory Cypher-compatible competitor. Riverone Neptune supports both models in one service. Apache Jena is the open-source RDF stack of choice: Java, with the Fuseki HTTP server and the ARQ SPARQL engine. Stardog is the commercial RDF leader, used heavily in pharma and finance for knowledge graphs with reasoning. Virtuoso powers DBpedia and many large LOD (Linked Open Data) deployments. GraphDB (by Ontotext) is widely used in publishing and enterprise. Wikidata itself runs on a custom Blazegraph deployment with 1.5 billion triples and a public SPARQL endpoint that anyone can query.
The two communities have started to converge in recent years. RDF-star and SPARQL-star bring property-on-edge expressiveness to RDF without reification. The new ISO GQL standard borrows ideas from SPARQL while keeping the Cypher syntax. Multi-model vector-plus-graph databases like Weaviate (chapter 157) blur the lines further. But for the next decade, the basic split remains: pick property graphs for app development, pick RDF for data integration.
Common confusions
- "RDF and property graphs are the same thing: both are just nodes and edges." They draw the same picture on a whiteboard, but the data model underneath is different. RDF has one primitive (the triple) and forces every fact (type, attribute, relationship) into that shape; property graphs have four primitives (node, label, property, edge) and treat attributes-on-relationships as native. The query languages, the schema mechanisms, and the tooling ecosystems are not interoperable without translation. A Neo4j dump is not a SPARQL endpoint, and importing Wikidata into Neo4j is a multi-week project, not a configuration toggle.
- "SPARQL is harder to learn than Cypher because it is older." It is harder to learn because it is more uniform. Cypher gives you a node syntax (n), an edge syntax [r], and a path syntax that draws an ASCII picture; SPARQL gives you exactly one syntax (the triple pattern), and you express everything through it. Uniformity is a virtue when you are composing queries from many sources or generating queries programmatically, which is why semantic-web tooling lives in SPARQL, but it is a tax when you are sketching a one-off recommendation query on a whiteboard.
- "OWL reasoning is just like SQL views: derived data on demand." OWL inference is grounded in description logic and can be computationally expensive (the more expressive OWL profiles are NEXPTIME-hard to reason over completely). Most production RDF deployments use a restricted profile (OWL 2 RL or OWL 2 EL) that trades expressiveness for tractable rule-based forward-chaining. SQL views are syntactic; OWL inference is semantic. Stardog, GraphDB, and Apache Jena Fuseki each ship a different reasoner, and the same ontology can produce slightly different inferred triples on each.
- "You can always migrate from one model to the other later." In theory yes, in practice no. A property-graph schema with thousands of edge-types and edge-properties is not mechanically convertible to RDF without making URI-minting decisions and choosing between reification, RDF-star, and named graphs, each of which changes how SPARQL queries against the result must be written. Going the other way, an RDF dataset that uses OWL inference loses its inferred triples unless you materialise them first. Teams that have done both migrations report 6–18 month rewrites; treat the model choice as load-bearing for the lifetime of the system.
- "Wikidata uses property graphs because it has properties on edges." Wikidata uses RDF: specifically a Blazegraph cluster with a custom data model called Wikibase that uses RDF reification heavily to attach qualifiers (rank, time of validity, source) to statements. The properties-on-edges feel comes from the Wikibase UI layer, not from the underlying triple store. The public SPARQL endpoint at query.wikidata.org exposes the unsugared triples, including the reification machinery, and writing performant queries against it requires understanding how wdt: direct properties differ from p:/ps:/pq: qualified statement triples.
- "RDF is dead: nobody uses the semantic web." The original "agents browse linked data" vision did not happen, but the data model thrived in the integration layer that vision implied. Wikidata, DBpedia, the EU's open data portal, the UK government's data.gov.uk, the BBC's content metadata, the Library of Congress, the FDA's drug labels, and the entire biomedical ontology stack (SNOMED CT, ICD-11, Gene Ontology, ChEBI) all run on RDF. The death-of-RDF narrative is true for the consumer browser experience and false for the data-integration spine of governments, libraries, and pharma.
Going deeper
The pieces above give you the model split. This section sharpens three places where the choice has practical consequences — schema evolution, federation, and the new convergence work — and points at the papers and production systems that make each one concrete.
Schema evolution: when does adding a property break something?
In a property graph the schema is implicit. Add a new key to some :Person nodes (email, say) and the existing nodes simply do not have it. Queries that did not ask for email keep working unchanged. Queries that filter on email quietly skip the nodes that lack it. This is the headline ergonomic of property graphs and the reason they suit fast-moving product teams; a BharatBazaar engineer can introduce a new edge-type for "wishlisted" without a migration script. The cost is that there is no central place that says what :Person nodes are supposed to have; teams develop ad-hoc Cypher constraints (CREATE CONSTRAINT FOR (p:Person) REQUIRE p.email IS UNIQUE in Neo4j 5 syntax; older releases wrote CREATE CONSTRAINT ON (p:Person) ASSERT p.email IS UNIQUE) that act as guardrails after the fact.
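Both behaviours, the silent skip and the after-the-fact guardrail, can be imitated with dictionaries. This is a toy model of schema-on-read, not Neo4j's constraint machinery; the function names and the sample people are assumptions for the example.

```python
people = [
    {"name": "Riya", "email": "riya@example.org"},
    {"name": "Rahul"},                     # created before "email" existed
]

def with_email_domain(nodes, domain):
    """A filter on a property simply skips nodes that lack it."""
    return [n["name"] for n in nodes
            if n.get("email", "").endswith("@" + domain)]

def add_person(nodes, props):
    """Guardrail in the spirit of a uniqueness constraint on email.

    Returns False and stores nothing when the email is already taken.
    """
    email = props.get("email")
    if email is not None and any(n.get("email") == email for n in nodes):
        return False
    nodes.append(props)
    return True

matches = with_email_domain(people, "example.org")
accepted = add_person(people, {"name": "Meera", "email": "riya@example.org"})
```

Rahul never causes an error; he just never appears in email-based results, which is convenient until someone expects him to.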
In RDF the schema can be explicit (RDFS/OWL), implicit (no schema, just triples), or both (an OWL ontology plus triples that may not satisfy the ontology). Adding foaf:email to a resource is one new triple; nothing else has to change. But if your ontology declares foaf:email as a FunctionalProperty (at most one value per subject), and you add a second email triple, an OWL reasoner will infer that the two values denote the same thing: for two distinct literals that is a contradiction, and for two URIs it is a surprising merge. RDF gives you the option of a strong contract via OWL, which most application teams refuse and most data-integration teams require. The tension is deliberate: you pay the contract cost only where federation forces you to.
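A simplified check for the functional-property situation looks like the sketch below. Note the assumption: it only reports subjects with more than one distinct value, whereas a real OWL reasoner would go further and conclude that the values are equal (or that the data is inconsistent). The function name and the sample triples are hypothetical.

```python
from collections import defaultdict

def functional_violations(triples, functional_props):
    """Find (subject, property) pairs that carry more than one distinct value."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional_props:
            values[(s, p)].add(o)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}

triples = [
    ("ex:riya", "foaf:email", "riya@example.org"),
    ("ex:riya", "foaf:email", "riya.sharma@example.org"),   # second value
    ("ex:rahul", "foaf:email", "rahul@example.org"),
    ("ex:riya", "foaf:name", "Riya Sharma"),
]
conflicts = functional_violations(triples, {"foaf:email"})
```

Running a check like this before loading data is one way teams notice a contract breach early, instead of discovering it through an unexpected inference.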
Why the schema-on-read vs schema-by-ontology distinction maps to organisational structure: when one team owns the data, they own the schema, and inline-evolving the schema (property graphs) is faster. When many teams or many organisations contribute data, the ontology is the contract that prevents one team's "name" field from clashing with another team's "name" field — which is why Wikidata, DBpedia, and the EU open data portal could not be built as property graphs. The choice of model encodes a guess about who will write to your data five years from now.
Federation: SPARQL's killer feature, property graphs' open problem
The single capability that justifies RDF for many organisations is the SPARQL SERVICE clause: a SPARQL query can include a sub-query that runs on a remote SPARQL endpoint and joins results back into the local query. Wikidata exposes a public SPARQL endpoint; so do DBpedia, Wikipathways, the European Bioinformatics Institute, and roughly 600 other endpoints listed in the LOD Cloud. A pharma researcher can write a single query that joins their internal RDF dataset with three public ones, and the query engine sends each sub-query to its remote endpoint, with no ETL step in between.
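The shape of a federated join can be imitated without any network. In the sketch below, `remote_service` is a stand-in for a remote endpoint (here just a function over an in-memory set), and all identifiers and labels are invented; it shows only the join logic, not the SPARQL protocol or any real endpoint's data.

```python
# Pretend this set lives on someone else's server.
REMOTE = {
    ("pub:P1", "rdfs:label", "Kinase A"),
    ("pub:P2", "rdfs:label", "Receptor B"),
}

def remote_service(subject, predicate):
    """Stand-in for a remote endpoint answering (subject, predicate, ?o)."""
    return [o for s, p, o in REMOTE if s == subject and p == predicate]

# Internal data refers to the public resources by their shared identifiers.
local = [
    ("ex:compound7", "ex:targets", "pub:P1"),
    ("ex:compound9", "ex:targets", "pub:P2"),
    ("ex:compound9", "ex:targets", "pub:P3"),   # unknown to the remote side
]

def federated_targets():
    """Join local 'targets' facts with labels fetched from the remote side."""
    rows = []
    for s, p, o in local:
        if p == "ex:targets":
            for label in remote_service(o, "rdfs:label"):
                rows.append((s, label))
    return sorted(rows)

joined = federated_targets()
```

The join works with no mapping table because both sides name the protein with the same identifier; pub:P3 has no remote label, so that row drops out, as it would in a plain (non-OPTIONAL) pattern.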
Property graphs have no equivalent standard. Cypher does not have a remote-execution operator; if you want to combine data from two Neo4j instances you ETL one into the other or use a custom application-layer aggregator. Neo4j Fabric (introduced in Neo4j 4.0) lets you query multiple Neo4j databases in one session, but only Neo4j; cross-vendor federation is not a thing. This is not a bug, it is a market choice — Neo4j sells one product where federation between organisations is rare; Wikidata's reason to exist is federation. If your problem looks like Wikidata, RDF wins for this reason alone.
RDF-star, SPARQL-star, and ISO GQL: the convergence
The split between the two models is narrowing. RDF-star (also called RDF*) lets you write <<ex:riya ex:knows ex:rahul>> ex:since 2019 — embedding a triple as the subject of another triple — which gives you property-on-edge ergonomics without the five-triple reification dance. SPARQL-star extends the query language to match these embedded triples directly. Apache Jena, GraphDB, and Stardog all ship RDF-star support; the W3C is working on standardisation as part of the RDF 1.2 draft. From the property-graph side, the ISO GQL standard (ratified in 2024, the first new ISO database query language since SQL) consolidates Cypher's syntax and incorporates ideas from SPARQL — bound variables, set semantics, optional matches. Vendors are converging on GQL while preserving Cypher as a dialect.
What this means in practice: in five years, it is plausible that a single graph data model with two surface syntaxes (Cypher-flavoured and SPARQL-flavoured) is the industry default, with the formal-semantics layer being optional rather than entangled with the data model. Until then, the model you pick today is the model you live with for the application's lifetime. The papers worth reading: Hartig's Foundations of RDF* and SPARQL*, Angles et al.'s G-CORE paper (a core language design that informed GQL), and the ISO/IEC 39075:2024 GQL standard itself (the first 30 pages are readable; the rest is grammar).
Where this leads next
The next chapter, native adjacency storage and index-free adjacency, goes one layer down — into the storage representation that makes graph traversals fast regardless of which data model sits on top.
- Cypher and Gremlin: query languages for property graphs — the two dialects that drive Neo4j, JanusGraph, Riverone Neptune, and most production property-graph systems.
- Native adjacency storage and index-free adjacency — the storage trick that makes both models traverse a graph in microseconds per hop.
- SPARQL fundamentals: the W3C-standard query language for RDF, with property paths, named graphs, and the SERVICE federation clause.
- Knowledge graphs at Wikidata scale: the production case study that motivates most public-facing RDF deployments.
- Vector-plus-graph: hybrid stores like Weaviate — what happens when the graph is also indexed for similarity search.
References
- Neo4j Cypher Manual — the canonical reference for property-graph query syntax and semantics.
- W3C RDF 1.1 Concepts and Abstract Syntax — the formal definition of the RDF data model.
- SPARQL 1.1 Query Language (W3C Recommendation) — the standard for querying RDF.
- Apache Jena documentation — the open-source Java RDF/SPARQL toolkit.
- Robinson, Webber, Eifrem, Graph Databases (2nd ed., O'Reilly, 2015) — the property-graph standard reference.
- Angles, Arenas, Barceló, et al., Foundations of Modern Graph Query Languages (ACM Computing Surveys, 2017) — formal comparison of property-graph and RDF query languages.