GHZ and W States — padho-wiki

Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.

In short

Three qubits can entangle in two genuinely different ways, and no local operation can turn one into the other, even probabilistically. The GHZ state |\text{GHZ}\rangle = \tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle) is the three-qubit generalisation of the Bell state |\Phi^+\rangle: all-zeros or all-ones, in equal superposition. It violates local realism more sharply than any Bell state (the Mermin/GHZ inequality), but it is fragile: lose any one qubit and the remaining two are left in a classical mixture with no entanglement. The W state |W\rangle = \tfrac{1}{\sqrt{3}}(|001\rangle + |010\rangle + |100\rangle) is the other class: exactly one qubit is |1\rangle, symmetric over which. It is robust: losing any one qubit leaves the other two in a state that is still entangled. The Dür-Vidal-Cirac theorem (2000) proved these two classes are inequivalent under local operations and classical communication (LOCC), even when the conversion only has to succeed with some non-zero probability. GHZ builds the three-qubit cat; W wires the three-qubit network.

You have met the Bell states. Two qubits, four canonical entangled states, all mutually convertible by local rotations — one class, period. Now push the system up by one qubit and ask the obvious question: what does three-qubit entanglement look like?

The naive guess is "more of the same" — a bigger family of GHZ-like states, each maximally entangled in some suitable sense, each the three-qubit generalisation of a Bell state. That guess is wrong in an interesting way. The Hilbert space of three qubits is 8-complex-dimensional, and the set of states that are genuinely entangled across all three qubits splits into two classes that cannot be connected by any local protocol.

One class is represented by the GHZ state, named after Greenberger, Horne, and Zeilinger, who introduced it in 1989 as the sharpest theoretical knife against local realism. It looks like (|000\rangle + |111\rangle)/\sqrt{2}: the three-qubit cat, "all down" or "all up" in equal superposition. The other class is represented by the W state: three qubits, one of them excited, symmetric over which. It looks like (|001\rangle + |010\rangle + |100\rangle)/\sqrt{3}. Both are genuinely three-partite entangled: neither can be written as a product across any bipartition. And yet there is no local protocol that turns one into the other, not even probabilistically.

This chapter introduces both states, builds their preparation circuits, computes their reduced density matrices, and explains the Dür-Vidal-Cirac classification that proves their inequivalence. The take-home is a single fact worth internalising: the Bell-state story was too clean to generalise. Multipartite entanglement is a zoo, not a library shelf.

From two qubits to three — what changes

With two qubits, every maximally entangled state is a Bell state up to local unitaries. The four Bell states are all convertible into each other by applying single-qubit gates on Alice's or Bob's side alone. There is, in a useful sense, one kind of two-qubit entanglement.

With three qubits, the picture breaks. The Hilbert space \mathcal{H}_{ABC} = \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C is 8-dimensional, with computational basis \{|000\rangle, |001\rangle, |010\rangle, |011\rangle, |100\rangle, |101\rangle, |110\rangle, |111\rangle\}. A general three-qubit pure state is a unit vector in this space:

|\psi\rangle_{ABC} = \sum_{i,j,k \in \{0,1\}} c_{ijk}|ijk\rangle, \qquad \sum |c_{ijk}|^2 = 1.

Such a state can be:

Fully separable — |\psi\rangle_{ABC} = |a\rangle_A \otimes |b\rangle_B \otimes |c\rangle_C — no entanglement anywhere.
Biseparable — factors into a product of a single qubit and a two-qubit (possibly entangled) state. For example, |0\rangle_A \otimes |\Phi^+\rangle_{BC}: qubit A is uncoupled, but B and C are entangled with each other. Three such configurations exist (A vs BC, B vs AC, C vs AB).
Genuinely tripartite entangled — cannot be written as a product across any bipartition. The three qubits are bound together as a whole.

The first two are easy to understand — they are extensions of ideas you already have. The third is where the surprise lives. Among genuinely tripartite entangled states, there are at least two inequivalent classes under local operations. GHZ and W are the canonical representatives.

Three-qubit pure states fall into three broad types: fully separable, biseparable, and genuinely tripartite. The genuinely tripartite states themselves split into two LOCC-inequivalent classes — GHZ and W.
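One way to make the three-way split concrete is to look at the single-qubit reduced density matrices of a pure state: a rank-1 reduction means that qubit factors out of the rest. The following is a minimal sketch in Python with NumPy, not something taken from the source. The helper names (`single_qubit_reduction`, `local_ranks`, `kind`, `ket`) are illustrative, and qubit 0 is assumed to be the leftmost bit of each basis label.

```python
import numpy as np

def single_qubit_reduction(psi, keep):
    """Reduced density matrix of qubit `keep` (0, 1 or 2) of a 3-qubit pure state."""
    t = np.asarray(psi, dtype=complex).reshape(2, 2, 2)  # axes = (q0, q1, q2)
    m = np.moveaxis(t, keep, 0).reshape(2, 4)            # kept qubit vs the other two
    return m @ m.conj().T

def local_ranks(psi, tol=1e-10):
    """Rank of each single-qubit reduction; rank 1 means that qubit factors out."""
    return tuple(int(np.linalg.matrix_rank(single_qubit_reduction(psi, k), tol=tol))
                 for k in range(3))

def kind(psi):
    """Classify a normalised 3-qubit pure state by how many qubits factor out."""
    n_pure = local_ranks(psi).count(1)
    if n_pure == 3:
        return "fully separable"
    if n_pure >= 1:
        return "biseparable"
    return "genuinely tripartite"  # no qubit factors out

def ket(*labels):
    """Equal-weight superposition of the given basis labels, e.g. ket('000', '111')."""
    v = np.zeros(8)
    for s in labels:
        v[int(s, 2)] = 1.0
    return v / np.linalg.norm(v)

examples = {
    "|000>": ket("000"),
    "|0>|Phi+>": ket("000", "011"),
    "GHZ": ket("000", "111"),
    "W": ket("001", "010", "100"),
}
for name, psi in examples.items():
    print(name, local_ranks(psi), kind(psi))
```

Note that this rank test puts GHZ and W in the same bin. Telling them apart needs a finer tool than local ranks, which is exactly the point of the classification discussed later.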

The GHZ state — the three-qubit cat

The GHZ state is the first and simplest genuine three-qubit entangled state:

|\text{GHZ}\rangle \;=\; \tfrac{1}{\sqrt{2}}\bigl(|000\rangle + |111\rangle\bigr).

The amplitudes tell a short story. Two of the eight basis states — |000\rangle and |111\rangle — have amplitude 1/\sqrt{2}. The other six have amplitude zero. The three qubits are in perfect lockstep: either all three are |0\rangle, or all three are |1\rangle, with equal superposition.

The GHZ state has amplitude $1/\sqrt{2}$ on exactly two basis states — $|000\rangle$ and $|111\rangle$ — and zero on the other six. It is the three-qubit generalisation of $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$.

Preparation circuit

Building a GHZ state from three qubits initialised in |000\rangle takes one Hadamard and two CNOTs:

Start in |000\rangle.
Apply H to qubit 0. State becomes \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes |0\rangle \otimes |0\rangle = \tfrac{1}{\sqrt{2}}(|000\rangle + |100\rangle).
Apply CNOT(0→1). The CNOT flips qubit 1 whenever qubit 0 is |1\rangle, sending |100\rangle \to |110\rangle. State becomes \tfrac{1}{\sqrt{2}}(|000\rangle + |110\rangle).
Apply CNOT(1→2). Flips qubit 2 whenever qubit 1 is |1\rangle, sending |110\rangle \to |111\rangle. State becomes \tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle) = |\text{GHZ}\rangle.

Why the chain of CNOTs works: the Hadamard puts qubit 0 into the superposition "|0\rangle or |1\rangle with equal amplitude." Each CNOT after it copies the classical value of its control onto its target — in the |0\rangle branch, both targets stay |0\rangle; in the |1\rangle branch, both targets get flipped to |1\rangle. The result is a branch-preserving propagation of qubit 0's value through to qubits 1 and 2 — exactly the all-zeros-or-all-ones structure of GHZ.

The three-gate circuit that prepares a GHZ state on three qubits initialised to $|000\rangle$: $H$ on the first qubit, then a cascade of two CNOTs that propagate the superposition to the other two.
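The three steps above can be checked numerically. This is a small sketch with plain NumPy matrices rather than any particular quantum SDK; the names `kron3`, `cnot_01` and `cnot_12` are illustrative, and qubit 0 is taken to be the leftmost bit, so the basis index is 4*q0 + 2*q1 + q2.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
P0 = np.diag([1.0, 0.0])  # projector |0><0|
P1 = np.diag([0.0, 1.0])  # projector |1><1|

def kron3(a, b, c):
    """Tensor product a (qubit 0) x b (qubit 1) x c (qubit 2)."""
    return np.kron(a, np.kron(b, c))

# CNOT = "do nothing if the control is 0" + "apply X to the target if the control is 1".
cnot_01 = kron3(P0, I2, I2) + kron3(P1, X, I2)
cnot_12 = kron3(I2, P0, I2) + kron3(I2, P1, X)

psi = np.zeros(8)
psi[0] = 1.0                      # |000>
psi = kron3(H, I2, I2) @ psi      # (|000> + |100>)/sqrt(2)
psi = cnot_01 @ psi               # (|000> + |110>)/sqrt(2)
psi = cnot_12 @ psi               # (|000> + |111>)/sqrt(2)
ghz = psi

print(np.round(ghz, 4))
```

Only entries 0 and 7 of the printed vector are non-zero, matching the all-zeros and all-ones branches.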

Verifying the amplitudes — Example 1

Example 1 — derive the GHZ state from the circuit, step by step

Apply the circuit \text{CNOT}(1\to 2) \cdot \text{CNOT}(0\to 1) \cdot (H \otimes I \otimes I) to |000\rangle and verify that the output has amplitude 1/\sqrt{2} on |000\rangle and |111\rangle, and zero on the other six basis states.

Step 1 — Apply H to qubit 0. The Hadamard sends |0\rangle \mapsto \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle). The other two qubits are untouched, so:

(H \otimes I \otimes I)|000\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes |0\rangle \otimes |0\rangle = \tfrac{1}{\sqrt{2}}|000\rangle + \tfrac{1}{\sqrt{2}}|100\rangle.

Why the tensor product just distributes: H acts only on qubit 0, and the identity I on qubits 1 and 2 leaves them alone. By the bilinearity of the tensor product, acting on a product state of three qubits means acting on each factor separately.

Step 2 — Apply CNOT(0→1). CNOT flips qubit 1 iff qubit 0 is |1\rangle. Process each basis-state term:

On |000\rangle: qubit 0 is |0\rangle, target unchanged → |000\rangle.
On |100\rangle: qubit 0 is |1\rangle, target flips → |110\rangle.

So the state becomes:

\tfrac{1}{\sqrt{2}}|000\rangle + \tfrac{1}{\sqrt{2}}|110\rangle.

Step 3 — Apply CNOT(1→2). CNOT flips qubit 2 iff qubit 1 is |1\rangle. Process:

On |000\rangle: qubit 1 is |0\rangle, target unchanged → |000\rangle.
On |110\rangle: qubit 1 is |1\rangle, target flips → |111\rangle.

State:

\tfrac{1}{\sqrt{2}}|000\rangle + \tfrac{1}{\sqrt{2}}|111\rangle.

Step 4 — Read the amplitudes. In the 8-dimensional column-vector form (c_{000}, c_{001}, c_{010}, c_{011}, c_{100}, c_{101}, c_{110}, c_{111}), the output is:

\bigl(\tfrac{1}{\sqrt{2}},\,0,\,0,\,0,\,0,\,0,\,0,\,\tfrac{1}{\sqrt{2}}\bigr).

Why exactly these two entries are non-zero: the circuit starts in |000\rangle, creates a two-branch superposition at step 1 (|000\rangle and |100\rangle), then each CNOT rewires the |1\rangle branch by flipping a bit, but never creates new branches. Two terms in, two terms out, throughout — the circuit is a straight-line propagator, not a branch-multiplier.

Result. The circuit's output is \tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle) = |\text{GHZ}\rangle. Norm check: |1/\sqrt{2}|^2 + |1/\sqrt{2}|^2 = \tfrac{1}{2} + \tfrac{1}{2} = 1. Unit vector, as every quantum state must be.

What this shows. The GHZ state isn't a theoretical curiosity written down abstractly — it falls out of three elementary gates. Any quantum computer that can do H and CNOT can prepare GHZ states, and every major public hardware platform (Compustar Quantum, Querion, IonQ, Quantinuum) does so routinely as a calibration and benchmarking target.

The GHZ state as an 8-component complex column vector. The only non-zero entries are at the all-zeros and all-ones positions, each with amplitude $1/\sqrt{2}$.

Reduced state of one GHZ qubit — Example 2

Here is the first concrete sign that GHZ is fragile. Trace out any one of the three qubits and see what happens to the reduced state of the remaining two — or of just one.

Example 2 — the reduced density matrix of a single GHZ qubit

Compute \rho_0 = \text{tr}_{1,2}(|\text{GHZ}\rangle\langle \text{GHZ}|) — the state of qubit 0 alone when the other two are ignored.

Step 1 — Form the joint density matrix. With |\text{GHZ}\rangle = \tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle):

\rho_{012} = |\text{GHZ}\rangle\langle\text{GHZ}| = \tfrac{1}{2}\bigl(|000\rangle + |111\rangle\bigr)\bigl(\langle 000| + \langle 111|\bigr).

Expanding:

\rho_{012} = \tfrac{1}{2}\bigl(|000\rangle\langle 000| + |000\rangle\langle 111| + |111\rangle\langle 000| + |111\rangle\langle 111|\bigr).

Step 2 — Trace out qubit 2 first. Use the partial-trace rule term by term: for each outer product |abc\rangle\langle def|, isolate the qubit-2 factor |c\rangle\langle f| and apply \text{tr}(|c\rangle\langle f|) = \langle f|c\rangle.

|000\rangle\langle 000|: qubit-2 factor is |0\rangle\langle 0|, trace =1 → contributes |00\rangle\langle 00| on qubits 0, 1.
|000\rangle\langle 111|: qubit-2 factor is |0\rangle\langle 1|, trace = \langle 1|0\rangle = 0 → contributes 0.
|111\rangle\langle 000|: qubit-2 factor is |1\rangle\langle 0|, trace =\langle 0|1\rangle = 0 → contributes 0.
|111\rangle\langle 111|: qubit-2 factor is |1\rangle\langle 1|, trace =1 → contributes |11\rangle\langle 11|.

So:

\rho_{01} = \text{tr}_2(\rho_{012}) = \tfrac{1}{2}\bigl(|00\rangle\langle 00| + |11\rangle\langle 11|\bigr).

Why the cross-terms vanish: the coherent off-diagonal pieces of the GHZ density matrix sit between |000\rangle and |111\rangle — states that disagree on every qubit. Tracing over any single qubit inserts a factor \langle 0|1\rangle = 0 from that qubit's mismatched labels, killing the cross-term. The coherence between the "all zero" and "all one" branches is destroyed the moment you ignore any qubit. This is exactly the GHZ fragility you will see in the classification theorem below.

Step 3 — Trace out qubit 1 from what remains. Apply the same rule to \rho_{01}:

|00\rangle\langle 00| = |0\rangle\langle 0|_0 \otimes |0\rangle\langle 0|_1: qubit-1 trace =1 → contributes |0\rangle\langle 0|_0.
|11\rangle\langle 11| = |1\rangle\langle 1|_0 \otimes |1\rangle\langle 1|_1: qubit-1 trace =1 → contributes |1\rangle\langle 1|_0.

\rho_0 = \text{tr}_1(\rho_{01}) = \tfrac{1}{2}|0\rangle\langle 0| + \tfrac{1}{2}|1\rangle\langle 1| = \tfrac{1}{2}I = \begin{pmatrix}\tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2}\end{pmatrix}.

Step 4 — Interpret. \rho_0 = I/2 is the maximally mixed state on a single qubit — a 50/50 classical coin. Purity \text{tr}(\rho_0^2) = \tfrac{1}{2}, as low as it gets on a qubit. Looking at one GHZ qubit alone, you see nothing but noise: no phase, no preferred direction, no information.

Result. \rho_0 = I/2. By the symmetry of the state, the same holds for qubits 1 and 2: every single-qubit reduction of GHZ is maximally mixed. In the "A versus BC" cut, GHZ is maximally entangled, and Alice's reduction is pure noise.

What this shows. GHZ is maximally entangled in the bipartite sense: cut it in any one-vs-two way, and the one side is maximally mixed. This matches the Schmidt-decomposition story (ch.40): the maximum entropy of a single-qubit reduction is \log_2 2 = 1 bit, and GHZ achieves it on every cut. But the step-2 result, \rho_{01} = \tfrac{1}{2}(|00\rangle\langle 00| + |11\rangle\langle 11|), is more revealing: it is a classical mixture of |00\rangle and |11\rangle, with zero off-diagonal coherence. After tracing out one qubit, the remaining two are no longer entangled. They are classically correlated (always agree on their bits) but share no quantum correlation. This is the first signal of GHZ's fragility: lose one qubit and the entanglement between the other two vanishes.

Tracing out one GHZ qubit leaves a classical mixture of $|00\rangle$ and $|11\rangle$ — no entanglement remains between the two surviving qubits. Tracing out a second qubit leaves $I/2$. GHZ is fragile under loss.
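The partial traces of Example 2 can be reproduced in a few lines. This sketch uses `numpy.einsum` to sum over the matching indices of the traced-out qubit; the variable names are illustrative, and the index order (a, b, c, a', b', c') is a convention chosen here, with qubit 0 leftmost.

```python
import numpy as np

ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)            # (|000> + |111>)/sqrt(2)

# Joint density matrix as a rank-6 tensor with indices (a, b, c, a', b', c').
rho = np.outer(ghz, ghz.conj()).reshape(2, 2, 2, 2, 2, 2)

# Trace out qubit 2: set c = c' and sum.
rho_01 = np.einsum('abcdec->abde', rho).reshape(4, 4)

# Trace out qubits 1 and 2: set b = b', c = c' and sum.
rho_0 = np.einsum('abcdbc->ad', rho)

purity = np.trace(rho_0 @ rho_0)

print(rho_01)   # diagonal: 1/2 on |00><00| and |11><11|, no off-diagonal terms
print(rho_0)    # I/2
print(purity)   # 0.5 up to rounding
```

The absence of off-diagonal entries in `rho_01` is the numerical face of the vanishing cross-terms in step 2.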

The W state — the robust alternative

The W state tells a different story:

|W\rangle \;=\; \tfrac{1}{\sqrt{3}}\bigl(|001\rangle + |010\rangle + |100\rangle\bigr).

Three non-zero amplitudes, each equal to 1/\sqrt{3}. Every basis state with exactly one |1\rangle gets an amplitude; all others are zero. The W state is symmetric under any permutation of the three qubits — swap qubit 1 with qubit 2, the state is unchanged, and likewise for any permutation. It describes "one excitation, distributed equally over three sites."

The "W" comes from Wolfgang Dür, the lead author of the 2000 paper that first identified this state as the canonical representative of the second three-qubit entanglement class. "W" for Wolfgang, and also a useful letter for contrasting with GHZ.

The W state has equal amplitude on the three basis states with exactly one $|1\rangle$, and zero elsewhere. It is permutation-symmetric across the three qubits.

Preparation circuit

Preparing |W\rangle is noticeably harder than preparing |\text{GHZ}\rangle. The standard textbook construction uses controlled rotations and CNOTs with specific angles, structured so that the amplitude gets divided into three equal branches. Here is a clean version (due originally to Diker, 2016; many equivalent circuits exist):

Start in |000\rangle.
Apply R_y(2\theta_1) on qubit 0, with \theta_1 = \arccos(\sqrt{2/3}). Since R_y(2\theta)|0\rangle = \cos\theta\,|0\rangle + \sin\theta\,|1\rangle, this rotates |0\rangle \mapsto \sqrt{2/3}|0\rangle + \sqrt{1/3}|1\rangle.
Controlled on qubit 0 being |0\rangle, apply R_y(2\theta_2) on qubit 1 with \theta_2 = \pi/4. This splits the |0\rangle branch of qubit 0 evenly between |0\rangle and |1\rangle on qubit 1, giving \tfrac{1}{\sqrt{3}}(|000\rangle + |010\rangle + |100\rangle).
One way to finish: flip qubit 2 when qubits 0 and 1 are both |0\rangle (an X on qubit 2 controlled on the pattern 00). This sends |000\rangle \to |001\rangle and leaves the other two branches alone, so that exactly one qubit per branch carries the |1\rangle.

The angles \theta_1 = \arccos(\sqrt{2/3}) and \theta_2 = \pi/4 make the three output branches carry exactly 1/\sqrt{3} each. Working through the full circuit is a useful exercise in tracking amplitudes, but the take-home is this: W states require amplitude-dependent controls (rotations with specific angles), whereas GHZ states require only the two gates H and CNOT. This is the first hardware clue that GHZ and W sit in different corners of the gate-set landscape.

A W-state preparation circuit. Two amplitude-specific rotations ($R_y(2\theta_1)$ and a controlled $R_y(2\theta_2)$) split the amplitude into three equal branches; the trailing gates rearrange bits so exactly one qubit is excited per branch.
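A numerical sketch of one such circuit is given here, under stated assumptions. The first rotation angle is chosen so that qubit 0 ends up in \sqrt{2/3}|0\rangle + \sqrt{1/3}|1\rangle, using the convention R_y(2\theta)|0\rangle = \cos\theta|0\rangle + \sin\theta|1\rangle. The final bit-rearranging step, an X on qubit 2 controlled on qubits 0 and 1 both being |0\rangle, is one possible choice made for this illustration and is not claimed to be the published construction. All names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
P0 = np.diag([1.0, 0.0])  # projector |0><0|
P1 = np.diag([0.0, 1.0])  # projector |1><1|

def ry(phi):
    """Standard R_y(phi) rotation: ry(2*t) maps |0> to cos(t)|0> + sin(t)|1>."""
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

def kron3(a, b, c):
    """Tensor product with qubit 0 as the leftmost factor."""
    return np.kron(a, np.kron(b, c))

theta1 = np.arccos(np.sqrt(2 / 3))  # assumption: gives sqrt(2/3)|0> + sqrt(1/3)|1>
theta2 = np.pi / 4

# Step 1: rotate qubit 0.
step1 = kron3(ry(2 * theta1), I2, I2)
# Step 2: rotate qubit 1 only in the branch where qubit 0 is |0>.
step2 = kron3(P0, ry(2 * theta2), I2) + kron3(P1, I2, I2)
# Step 3 (illustrative choice): flip qubit 2 only when qubits 0 and 1 are both |0>.
step3 = kron3(P0, P0, X) + (np.eye(8) - kron3(P0, P0, I2))

psi = np.zeros(8)
psi[0] = 1.0                        # |000>
w = step3 @ step2 @ step1 @ psi

print(np.round(w, 4))
```

The printed vector has weight only on |001\rangle, |010\rangle and |100\rangle (indices 1, 2 and 4), each with amplitude 1/\sqrt{3}.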

Reduced state of one W qubit

Now the W-state reduction. Trace out qubit 2 from |W\rangle and look at what qubits 0, 1 inherit.

The density matrix is \rho_{012} = |W\rangle\langle W|. Write |W\rangle = \tfrac{1}{\sqrt{3}}(|001\rangle + |010\rangle + |100\rangle). The density matrix has nine terms — three diagonal and six off-diagonal. Apply \text{tr}_2 term by term; for each outer product |abc\rangle\langle def|, pull out the factor \langle f|c\rangle from qubit 2.

Doing the bookkeeping carefully (each cross-term between basis states with the same qubit-2 label contributes; those with different qubit-2 labels vanish):

\rho_{01} = \text{tr}_2(|W\rangle\langle W|) = \tfrac{1}{3}\bigl(|00\rangle\langle 00| + |01\rangle\langle 01| + |01\rangle\langle 10| + |10\rangle\langle 01| + |10\rangle\langle 10|\bigr).

The surviving off-diagonal terms are the crux. The cross-terms between |010\rangle and |100\rangle have the same value of qubit 2 (both zero), so the partial trace does not kill them. Some coherence survives, and the W state leaves genuine entanglement behind after the trace.

To see it clearly: \rho_{01} in the \{|00\rangle, |01\rangle, |10\rangle, |11\rangle\} basis is the 4\times 4 matrix

\rho_{01} = \tfrac{1}{3}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0\end{pmatrix}.

Why this matrix is still entangled: the 2\times 2 block in the |01\rangle, |10\rangle subspace is \tfrac{1}{3}\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}, which has non-zero off-diagonals — coherence, not just classical correlation. That block is proportional to |\Psi^+\rangle\langle \Psi^+| where |\Psi^+\rangle = \tfrac{1}{\sqrt{2}}(|01\rangle + |10\rangle) is a Bell state. The residual entanglement in the "one excitation" sector is literally a Bell-state fragment, weighted against the product-like term |00\rangle\langle 00|.

Trace out qubit 1 further to get the reduced state on qubit 0 alone:

\rho_0 = \text{tr}_1(\rho_{01}) = \tfrac{1}{3}\bigl(|0\rangle\langle 0| + |0\rangle\langle 0| + |1\rangle\langle 1|\bigr) = \tfrac{2}{3}|0\rangle\langle 0| + \tfrac{1}{3}|1\rangle\langle 1| = \begin{pmatrix}\tfrac{2}{3} & 0 \\ 0 & \tfrac{1}{3}\end{pmatrix}.

This is a biased but still mixed state — purity \text{tr}(\rho_0^2) = (2/3)^2 + (1/3)^2 = 5/9, less than 1 but more than 1/2. The single-qubit reduction of W is less mixed than the single-qubit reduction of GHZ.

Tracing out one qubit of the W state leaves the remaining two qubits in a genuinely entangled mixed state (the central $2\times 2$ block has non-zero coherence). Tracing out a second leaves a biased mixed state with purity $5/9$. W is robust under qubit loss in a sense GHZ is not.
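The same bookkeeping can be done numerically, together with a direct entanglement check. The sketch below uses the partial-transpose (PPT) test, which the text above does not use: for two qubits, a negative eigenvalue of the partially transposed density matrix certifies entanglement. Variable names and the index convention (qubit 0 leftmost) are choices made here.

```python
import numpy as np

w = np.zeros(8)
w[[1, 2, 4]] = 1 / np.sqrt(3)               # |001>, |010>, |100>

# Joint density matrix as a tensor with indices (a, b, c, a', b', c').
rho = np.outer(w, w.conj()).reshape(2, 2, 2, 2, 2, 2)

rho_01 = np.einsum('abcdec->abde', rho)     # trace out qubit 2, indices (a, b, a', b')
rho_0 = np.einsum('abdb->ad', rho_01)       # then trace out qubit 1
purity = np.trace(rho_0 @ rho_0)

# Partial transpose on qubit 1: swap its row and column indices (b <-> b').
rho_pt = rho_01.transpose(0, 3, 2, 1).reshape(4, 4)
min_eig = np.linalg.eigvalsh(rho_pt).min()

print(np.round(rho_01.reshape(4, 4) * 3, 6))  # 3 * rho_01, to compare with the matrix above
print(rho_0)                                  # diag(2/3, 1/3)
print(purity)                                 # 5/9
print(min_eig)                                # negative, so rho_01 is entangled
```

The smallest eigenvalue works out to (1 - \sqrt{5})/6, roughly -0.206. Running the same test on the GHZ reduction gives no negative eigenvalue.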

GHZ versus W — the comparison

The two states are both genuinely three-qubit entangled. They both have \rho_0 mixed. They both appear in every decent multi-qubit demonstration on real hardware. But the differences matter:

Property	GHZ	W
Amplitude structure	Two non-zero terms	Three non-zero terms
Preparation gates	H + 2 CNOTs	Rotations with specific angles + CNOTs
Single-qubit reduction \rho_0	I/2 (maximally mixed)	\text{diag}(\tfrac{2}{3}, \tfrac{1}{3}) (biased)
Purity of \rho_0	\tfrac{1}{2}	\tfrac{5}{9}
Two-qubit reduction \rho_{01}	Classical mixture — not entangled	Bell-state fragment — still entangled
Behaviour on qubit loss	Entanglement destroyed	Entanglement preserved
Bell-inequality violation	Maximal, via the Mermin/GHZ inequality	Three-party violation, but not maximal; two-qubit reductions are entangled yet do not violate CHSH
LOCC equivalence	Different class from W	Different class from GHZ
Symmetry	Symmetric under all permutations	Symmetric under all permutations
Typical application	Error correction, quantum cat demos	Quantum networks, decoherence-resistant coding

The single most important row is the two-qubit reduction. Trace out one qubit of a GHZ state, and you are left with classical correlation; trace out one qubit of a W state, and you are left with quantum entanglement. This is the operational signature: GHZ is brittle, W is robust.

GHZ and W differ sharply under qubit loss. Losing any qubit of GHZ collapses the other two into a classical mixture. Losing any qubit of W leaves the other two in an entangled mixed state.
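To put a number on the "entanglement destroyed" versus "entanglement preserved" contrast, one option is the Wootters concurrence of the two-qubit reduction. The concurrence is not used elsewhere in this chapter; it is brought in here as an assumption of the sketch, and the function names are illustrative.

```python
import numpy as np

def reduce_to_qubits_01(psi):
    """Trace out qubit 2 (the rightmost bit) of a 3-qubit pure state."""
    t = np.asarray(psi, dtype=complex).reshape(4, 2)  # rows: qubits 0,1; column: qubit 2
    return t @ t.conj().T

def concurrence(rho):
    """Wootters concurrence of a two-qubit density matrix (0 = separable, 1 = Bell state)."""
    sy = np.array([[0, -1j], [1j, 0]])
    yy = np.kron(sy, sy)
    r = rho @ yy @ rho.conj() @ yy                    # rho times its "spin-flipped" copy
    lam = np.sqrt(np.clip(np.linalg.eigvals(r).real, 0.0, None))
    lam = np.sort(lam)[::-1]                          # decreasing order
    return max(0.0, lam[0] - lam[1] - lam[2] - lam[3])

ghz = np.zeros(8)
ghz[[0, 7]] = 1 / np.sqrt(2)
w = np.zeros(8)
w[[1, 2, 4]] = 1 / np.sqrt(3)

c_ghz = concurrence(reduce_to_qubits_01(ghz))
c_w = concurrence(reduce_to_qubits_01(w))
print(c_ghz, c_w)
```

The GHZ reduction gives concurrence 0 and the W reduction gives 2/3, which is the quantitative version of the two-qubit-reduction row in the table.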

The Dür-Vidal-Cirac theorem — why GHZ and W cannot be interconverted

All of the above is prelude. The fact that makes GHZ and W genuinely different is a theorem, proved in 2000 by Dür, Vidal, and Cirac: no local protocol can turn a GHZ state into a W state, or vice versa, even probabilistically.

The setup is called LOCC — local operations and classical communication. Three parties, each holding one qubit, can do anything they like to their own qubit (apply gates, make measurements, add ancillas) and can talk to each other over classical channels. The question: given a state of one type (say GHZ), can the parties, working only with LOCC, transform it into a state of the other type (say W), with any non-zero probability of success?

Dür, Vidal, and Cirac proved: no. Not even with a tiny probability. The two classes are completely separated under LOCC, even stochastic LOCC.

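More precisely, they classified the three-qubit pure states into exactly six equivalence classes under stochastic LOCC (SLOCC), where two states are equivalent if each can be converted into the other with non-zero probability: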

Fully separable — |a\rangle|b\rangle|c\rangle, three independent single-qubit states.
A–BC biseparable — |a\rangle \otimes |\Psi\rangle_{BC}, qubit A alone and BC entangled.
B–AC biseparable — |b\rangle \otimes |\Psi\rangle_{AC}.
C–AB biseparable — |c\rangle \otimes |\Psi\rangle_{AB}.
GHZ class — all states LOCC-convertible to |\text{GHZ}\rangle (with non-zero probability).
W class — all states LOCC-convertible to |W\rangle (with non-zero probability).

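The six classes are pairwise inequivalent: no two of them can be converted into each other in both directions. The conversions that do exist only run downhill. Non-invertible local operations, such as measuring a qubit, can take a GHZ-class or W-class state to a biseparable one, and a biseparable state to a fully separable one, but nothing local climbs back up, and nothing connects GHZ and W in either direction. The two-qubit story has one "truly entangled" class (Bell); the three-qubit story splits it into two irreconcilable classes. Add a fourth qubit and the classification explodes into continuous families of inequivalent classes. Three qubits sit at exactly the boundary where multipartite entanglement starts being a zoo.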

Why the proof works (in one paragraph): one way to see the separation is through an invariant. The 3-tangle \tau_3 is a polynomial function of the amplitudes of a three-qubit pure state, and an invertible local operation only rescales it by a positive factor, so such an operation can never turn a non-zero \tau_3 into zero or the reverse. For |\text{GHZ}\rangle, the 3-tangle is 1 (maximal). For |W\rangle, it is 0. Non-invertible local operations do not help either: they leave at least one qubit in a pure state of its own, which pushes the state out of the genuinely tripartite classes altogether. Whether \tau_3 vanishes therefore divides the genuinely tripartite states into two camps that no local protocol can bridge.

Common confusions

"Three-qubit entanglement is just GHZ." GHZ is one class. W is another. Biseparable states, entangled across only two of the three qubits, make up three more. "Three-qubit entanglement" is a category with internal structure, not a single state type. If an article or slide says "the three-qubit entangled state" in the singular, it is wrong: there is no single canonical representative. The accurate phrasing is "a three-qubit entangled state."
"GHZ is the most entangled three-qubit state." This depends on the entanglement measure. GHZ maximises the 3-tangle (a specific polynomial invariant) and also the entropy across each one-versus-two cut (A-vs-BC, B-vs-AC, C-vs-AB), where its single-qubit reductions are maximally mixed. W has zero 3-tangle but maximises a different quantity: the entanglement that survives between pairs of qubits after the third is traced out. For that measure, W exceeds GHZ, whose two-qubit reductions are not entangled at all. There is no single winner: "maximal" depends on which direction of the entanglement zoo you look from.
"'Robust to qubit loss' means the state is unchanged when you lose a qubit." It doesn't. It means the reduced state on the surviving qubits still contains entanglement, not that the surviving qubits are still in the original state. You cannot recover the full W state from its two-qubit reduction; you just see that there is non-trivial quantum structure left, as opposed to a classical mixture.
"W can be made from GHZ by applying single-qubit gates." No. Single-qubit gates are local operations; if they could turn GHZ into W, LOCC would connect the two classes, contradicting Dür-Vidal-Cirac. This is a concrete example of how LOCC incomparability blocks every conceivable local protocol, not just the simple-minded gate sequences.
"GHZ and W violate Bell inequalities the same way." They do not. GHZ violates the Mermin inequality, a three-party generalisation of CHSH, all the way to its algebraic bound (in Mermin's original setup, 4 vs a classical bound of 2). W is also non-local as a three-party state but does not saturate the Mermin inequality. Its two-qubit reductions stay entangled, yet entanglement alone does not guarantee a CHSH violation: the correlation matrix of the W reduction is \text{diag}(\tfrac{2}{3}, \tfrac{2}{3}, -\tfrac{1}{3}), so by the Horodecki criterion its largest CHSH value is 2\sqrt{8/9} \approx 1.89, below the classical bound of 2. The experimental signatures are different.
"There are lots of three-qubit entanglement classes." Under stochastic LOCC (conversion with any non-zero probability), there are exactly six classes (counting biseparable as three). Under deterministic conversions the picture is much finer: two states in the same SLOCC class generally cannot be turned into each other with certainty, and equivalence under local unitaries alone already gives a continuum of classes. The finite count of six is a statement about SLOCC. Four qubits open up a continuum of classes even under SLOCC; three qubits are at the sweet spot where the classification is finite and clean.

Going deeper

The two representative states, their circuits, their reduced density matrices, and the qualitative meaning of LOCC inequivalence are the take-home. What follows is the technical backbone: the Dür-Vidal-Cirac classification via polynomial invariants, quantitative measures of multipartite entanglement, the GHZ Mermin inequality, the role of GHZ and W states in quantum error correction and quantum networks, and a preview of multipartite Schmidt-like decompositions.

Polynomial invariants and the 3-tangle

The Dür-Vidal-Cirac classification rests on a polynomial invariant of three-qubit states called the 3-tangle, defined through Cayley's hyperdeterminant:

\tau_3(|\psi\rangle) = 4|d_{ABC}|,

where d_{ABC} is a quartic polynomial in the amplitudes c_{ijk}. For a generic three-qubit pure state, \tau_3 \in [0, 1], with \tau_3 = 1 for |\text{GHZ}\rangle and \tau_3 = 0 for |W\rangle. Invertible LOCC preserves \tau_3 up to a positive rescaling. The strict separation \tau_3 > 0 for GHZ-class states and \tau_3 = 0 for W-class states (and biseparable states) is the invariant that proves the classes are disjoint.
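The explicit form of d_{ABC} is not written out above. The sketch below assumes the Coffman-Kundu-Wootters expression, d_{ABC} = d_1 - 2d_2 + 4d_3, with the three pieces spelled out in the code comments, and evaluates \tau_3 on a few states. The function name `three_tangle` is illustrative.

```python
import numpy as np

def three_tangle(psi):
    """3-tangle 4*|d1 - 2*d2 + 4*d3| of a normalised 3-qubit pure state (CKW form, assumed)."""
    a = np.asarray(psi, dtype=complex).reshape(2, 2, 2)  # a[i, j, k] = c_ijk
    # d1: squares of products of "opposite corners" of the 2x2x2 amplitude cube.
    d1 = (a[0, 0, 0]**2 * a[1, 1, 1]**2 + a[0, 0, 1]**2 * a[1, 1, 0]**2
          + a[0, 1, 0]**2 * a[1, 0, 1]**2 + a[1, 0, 0]**2 * a[0, 1, 1]**2)
    # d2: products of two different opposite-corner pairs.
    d2 = (a[0, 0, 0] * a[1, 1, 1] * a[0, 1, 1] * a[1, 0, 0]
          + a[0, 0, 0] * a[1, 1, 1] * a[1, 0, 1] * a[0, 1, 0]
          + a[0, 0, 0] * a[1, 1, 1] * a[1, 1, 0] * a[0, 0, 1]
          + a[0, 1, 1] * a[1, 0, 0] * a[1, 0, 1] * a[0, 1, 0]
          + a[0, 1, 1] * a[1, 0, 0] * a[1, 1, 0] * a[0, 0, 1]
          + a[1, 0, 1] * a[0, 1, 0] * a[1, 1, 0] * a[0, 0, 1])
    # d3: the two "tetrahedral" products.
    d3 = (a[0, 0, 0] * a[1, 1, 0] * a[1, 0, 1] * a[0, 1, 1]
          + a[1, 1, 1] * a[0, 0, 1] * a[0, 1, 0] * a[1, 0, 0])
    return 4 * abs(d1 - 2 * d2 + 4 * d3)

ghz = np.zeros(8)
ghz[[0, 7]] = 1 / np.sqrt(2)
w = np.zeros(8)
w[[1, 2, 4]] = 1 / np.sqrt(3)
plus3 = np.ones(8) / np.sqrt(8)       # |+>|+>|+>, fully separable
bisep = np.zeros(8)
bisep[[0, 3]] = 1 / np.sqrt(2)        # |0> (|00> + |11>)/sqrt(2)

for name, psi in [("GHZ", ghz), ("W", w), ("|+++>", plus3), ("|0>|Phi+>", bisep)]:
    print(name, three_tangle(psi))
```

Only the GHZ state gives a non-zero value here, which is the separation the classification relies on.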

This is the tip of a large iceberg. For four qubits, the invariant-theoretic classification is due to Verstraete, Dehaene, De Moor, and Verschelde (2002), producing nine families, several of which carry continuous parameters rather than describing a single class. For five or more qubits, no comparably complete classification is known: the moduli space of entanglement classes becomes high-dimensional, and only families of invariants (not a finite list) distinguish them. Multipartite entanglement classification is, in this sense, an open problem for \geq 4 qubits.

Genuine multipartite entanglement measures

Two qubits have one natural measure of entanglement (entanglement entropy, or equivalently concurrence). Three or more qubits require a choice among many:

3-tangle \tau_3: zero on biseparable and W states, maximal on GHZ.
Pairwise entanglement of the two-qubit reductions (for example, their concurrence): maximal on W, zero on GHZ.
Entanglement of formation across different cuts: captures the cost to prepare.
Relative entropy of entanglement: distance from the convex set of separable states.
Geometric entanglement: one minus the maximum squared overlap with a product state.

Each measure captures a different operational aspect. No single measure dominates. In particular, "GHZ is more entangled than W" is not a defendable statement without specifying which measure.

The Mermin-GHZ inequality

The sharpest demonstration of GHZ's quantum-mechanical extremity is the Mermin inequality (1990), also called the GHZ inequality. It is a three-party analogue of the CHSH inequality, designed so that classical local-hidden-variable theories satisfy one bound and quantum GHZ states saturate a strictly larger one.

The setup: three parties, each making a measurement in one of two settings (X or Y). For any local hidden-variable theory, the magnitude of a specific signed combination of four three-party correlation functions is bounded by 2. For quantum mechanics on a GHZ state, that same combination reaches 4. The violation is "all-or-nothing": it arises from a single run of the experiment under specific measurement settings, not from averaged statistics. In the Mermin-GHZ setup, there are four combinations of measurement choices for which the GHZ state produces a deterministic outcome that no classical theory can reproduce. Pan, Bouwmeester, Daniell, Weinfurter, and Zeilinger demonstrated this experimentally in 2000, using entangled photons: the experimental confirmation of a three-particle Bell-type violation.
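The value 4 can be checked directly. The sketch below assumes one standard form of the Mermin combination, \langle XXX\rangle - \langle XYY\rangle - \langle YXY\rangle - \langle YYX\rangle, which is not spelled out in the text above; the operator name `mermin` is illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])

def kron3(a, b, c):
    """Tensor product with qubit 0 as the leftmost factor."""
    return np.kron(a, np.kron(b, c))

# Assumed form of the Mermin operator: XXX - XYY - YXY - YYX.
mermin = kron3(X, X, X) - kron3(X, Y, Y) - kron3(Y, X, Y) - kron3(Y, Y, X)

ghz = np.zeros(8, dtype=complex)
ghz[[0, 7]] = 1 / np.sqrt(2)

# Each of the four terms has the GHZ state as an eigenvector (eigenvalues +1, -1, -1, -1).
terms = {
    "XXX": kron3(X, X, X), "XYY": kron3(X, Y, Y),
    "YXY": kron3(Y, X, Y), "YYX": kron3(Y, Y, X),
}
correlations = {k: np.vdot(ghz, op @ ghz).real for k, op in terms.items()}

value = np.vdot(ghz, mermin @ ghz).real          # quantum value on GHZ
largest = np.linalg.eigvalsh(mermin).max()       # largest value quantum mechanics allows

print(correlations)
print(value, largest)
```

Each individual correlation is exactly plus or minus one on the GHZ state, which is why the argument can be phrased without inequalities: a local hidden-variable assignment of fixed X and Y values cannot reproduce all four signs at once.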

GHZ in quantum error correction

The three-qubit bit-flip code and phase-flip code — the simplest quantum error-correcting codes — encode a single logical qubit into a three-qubit entangled state that looks a lot like GHZ:

|0_L\rangle = |000\rangle, \qquad |1_L\rangle = |111\rangle,

with a logical-+ or logical-- state corresponding to \tfrac{1}{\sqrt{2}}(|000\rangle \pm |111\rangle). The superposition side of the code space is a GHZ state. A single bit-flip error on any physical qubit moves the code state out of the GHZ span into an orthogonal subspace, which stabiliser measurements detect. The three-qubit repetition code scales up to the 5-qubit, 7-qubit Steane, and surface codes used in practice, but the three-qubit GHZ-like structure is the prototype.
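A small sketch of how stabiliser measurements flag a bit flip on the GHZ code state. The text does not name the stabilisers; the pair Z_0Z_1 and Z_1Z_2 used below is the usual choice for the three-qubit bit-flip code and is an assumption of this example, as are the helper names.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def kron3(a, b, c):
    """Tensor product with qubit 0 as the leftmost factor."""
    return np.kron(a, np.kron(b, c))

def on_qubit(op, k):
    """Single-qubit operator `op` acting on qubit k, identity elsewhere."""
    factors = [I2, I2, I2]
    factors[k] = op
    return kron3(*factors)

Z0Z1 = kron3(Z, Z, I2)   # parity check between qubits 0 and 1
Z1Z2 = kron3(I2, Z, Z)   # parity check between qubits 1 and 2

def syndrome(state):
    """Pair of +1/-1 stabiliser values; exact here because the states are eigenstates."""
    return (int(round(state @ Z0Z1 @ state)), int(round(state @ Z1Z2 @ state)))

ghz = np.zeros(8)
ghz[[0, 7]] = 1 / np.sqrt(2)   # logical |+>: (|000> + |111>)/sqrt(2)

syndromes = {"no error": syndrome(ghz)}
for k in range(3):
    syndromes[f"X on qubit {k}"] = syndrome(on_qubit(X, k) @ ghz)

for label, s in syndromes.items():
    print(label, s)
```

The four cases give four distinct sign patterns, so the syndrome identifies which qubit was flipped without revealing whether the logical state was |000\rangle, |111\rangle, or their superposition.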

W in quantum networks

W states are the natural resource for quantum networking scenarios where loss is the dominant error. If a quantum network distributes a W state across three nodes and one node's qubit is lost or decoheres, the remaining two nodes still share an entangled state — a Bell fragment — that can be used for teleportation, QKD, or other protocols. GHZ states, by contrast, collapse to classical correlations under any single-qubit loss, which makes them unsuitable for loss-tolerant networking.

The National Quantum Mission, India's eight-year flagship initiative launched in 2023 with ₹6,000 crore, includes multipartite-entanglement distribution among its mission targets. The Raman Research Institute in Bangalore has active research on generating W-like states in photonic systems for quantum-network applications, and IIT Madras, IISc Bangalore, and the TIFR Mumbai group are working on multipartite protocols that exploit the robustness of W-class states.

Beyond three qubits — towards a Schmidt-like tool

Bipartite pure states have the Schmidt decomposition (ch.40 coming up). Multipartite pure states do not have a single canonical decomposition of the same kind — there is no general "Schmidt decomposition of a three-qubit state" into a finite sum of product terms that is both always achievable and informative. What exists are partial analogues: the tensor-train decomposition, the higher-order singular-value decomposition (HOSVD), and matrix-product states (MPS). Each captures some of the intuition of Schmidt in higher-partite settings, but none exactly generalises it. This is why the multipartite zoo is a zoo: without a diagonalisation-like canonical form, there is no clean inventory of classes.

Where this leads next

Bell states — the two-qubit cousins; the template that three-qubit states generalise in two inconsistent directions.
Entanglement, defined — the definition that this chapter extends to three parties.
Schmidt decomposition — the bipartite canonical form; the tool that works cleanly for two parties and less cleanly for three.
Entanglement classification — the broader taxonomy: LOCC, SLOCC, bipartite cuts, invariants.
Quantum network basics — where multipartite entanglement distribution and loss-robust protocols become real engineering.
The partial trace — the operation that revealed GHZ's fragility and W's robustness side by side.

References

W. Dür, G. Vidal, J. I. Cirac, Three qubits can be entangled in two inequivalent ways (2000) — arXiv:quant-ph/0005115. The classification theorem.
Wikipedia, Greenberger-Horne-Zeilinger state — definition, history, and the GHZ inequality.
Wikipedia, W state — the state, its properties, and applications in quantum networking.
John Preskill, Lecture Notes on Quantum Computation, Ch. 4 on multipartite entanglement — theory.caltech.edu/~preskill/ph229.
Nielsen and Chuang, Quantum Computation and Quantum Information (2010), §2.4 and §12.5 on multipartite entanglement — Cambridge University Press.
Jian-Wei Pan, Dik Bouwmeester, M. Daniell, Harald Weinfurter, Anton Zeilinger, Experimental test of quantum nonlocality in three-photon GHZ entanglement (2000) — Nature 403, 515. The first experimental Mermin-GHZ violation.