In short

In 1964, John Bell proved that no local hidden-variable theory — any classical model in which distant particles carry pre-existing outcomes determined by local variables — can reproduce all the correlations of quantum mechanics. The cleanest test is the CHSH inequality: a specific combination of four correlation expectation values, S = \langle A_0 B_0\rangle + \langle A_0 B_1\rangle + \langle A_1 B_0\rangle - \langle A_1 B_1\rangle. Every local classical theory is forced by algebra to satisfy |S| \leq 2. Quantum mechanics, on a Bell state with the right measurement angles, reaches S = 2\sqrt{2} \approx 2.828 — the Tsirelson bound. Aspect (1982), then Hensen, Giustina, and Shalm (all 2015, loophole-free) measured values clearly above 2, closing the question experimentally. Bell violation does not enable faster-than-light signalling — the no-communication theorem forbids it — but it is the operational core of device-independent quantum cryptography, of the security argument behind satellite-based QKD, and of the sharpest statement we have of what is quantum about quantum mechanics.

In 1935, three physicists in Princeton wrote a paper whose title was a question. Albert Einstein, Boris Podolsky, and Nathan Rosen asked: can quantum-mechanical description of physical reality be considered complete? Their answer was no. Quantum mechanics, they argued, must be an incomplete description of the world — because it seemed to allow two particles, prepared together and separated by any distance, to produce correlated measurement outcomes that no classical story could explain without invoking instantaneous influence across space. Einstein coined his famous dismissal: "spooky action at a distance."

The EPR paper, as it is now known, did not kill quantum mechanics. But it did sharpen the question: if quantum mechanics gives correct predictions, is there some deeper, classical-looking theory underneath? One in which each particle secretly carries "hidden variables" — instructions it received at preparation — that determine outcomes locally, without any action at a distance?

For nearly thirty years, that question sat open as a matter of philosophy. Then, in 1964, an Irish physicist at CERN named John Stewart Bell turned it into a theorem. He showed that any local hidden-variable theory must satisfy a specific inequality on measurement correlations — and that quantum mechanics, in certain experiments, predicts a violation. The question was no longer philosophical. It was an experimental target.

This chapter tells you what Bell proved, derives the cleanest form of the inequality (the CHSH inequality, due to Clauser, Horne, Shimony, and Holt in 1969), shows you why a Bell state reaches 2\sqrt{2}, and surveys the experiments — culminating in the 2015 loophole-free trio — that put the question to rest. It also does an honest hype-check on the word "non-local," which popular accounts persistently misread as "faster than light."

The setup — a two-player game

Picture the experiment as a game, because that is exactly what it is. Two players, Alice and Bob, are separated. They cannot communicate during the game. A referee gives each of them a single-bit input: Alice gets x \in \{0, 1\}, Bob gets y \in \{0, 1\}. Each of them must output a single bit: Alice outputs a \in \{-1, +1\}, Bob outputs b \in \{-1, +1\}. (We use \pm 1 instead of \{0, 1\} because the algebra of expectation values is cleaner.)

The referee scores them by computing a specific correlation quantity after many rounds:

S = \langle A_0 B_0\rangle + \langle A_0 B_1\rangle + \langle A_1 B_0\rangle - \langle A_1 B_1\rangle,

where \langle A_x B_y\rangle is the average product of Alice's output times Bob's output, taken over all rounds where Alice got input x and Bob got input y. The three plus signs and one minus sign are deliberate, and they are the whole reason the game is non-trivial.

The CHSH experimental setupA source in the centre emits an entangled pair of particles, one travelling left to Alice, one travelling right to Bob. Alice has a detector with two measurement settings labelled A zero and A one; Bob has a detector with two settings labelled B zero and B one. Each chooses a setting independently for each pair. The outputs are plus one or minus one. A referee at the bottom collects all outcomes and computes the CHSH quantity S.source|Φ⁺⟩Alicesetting x ∈ {0, 1}A_x: measure axisoutput a = ±1Bobsetting y ∈ {0, 1}B_y: measure axisoutput b = ±1referee computes SS = ⟨A₀B₀⟩ + ⟨A₀B₁⟩ + ⟨A₁B₀⟩ − ⟨A₁B₁⟩pairs distributed; inputs x, y chosen independently; outputs ±1
The CHSH setup. A source emits pairs of particles. Alice and Bob, far apart, each receive one. Each chooses one of two measurement settings. Each produces an output in $\{-1, +1\}$. The CHSH quantity $S$ combines the four correlation averages with three plus signs and one minus sign.

Two questions hang over the game:

  1. What is the best Alice and Bob can do classically? Meaning: before the game, they agree on any strategy whatsoever — they may share as much classical randomness as they like, any pre-arranged lookup table, any hidden variables that were set at the source. But during the game, they cannot communicate. What is the maximum value of S they can achieve?
  2. What is the best they can do quantumly? Meaning: they may share any entangled quantum state prepared at the source, and their measurement boxes are quantum devices. What is the maximum value of S then?

Bell's theorem is the statement that the answers to these two questions are different, and that the difference is experimentally measurable.

The classical bound — |S| \leq 2

This is where the algebra pays for itself. A local hidden-variable theory is any model in which Alice's output depends only on (i) her input x and (ii) some hidden variable \lambda shared with Bob at preparation, and Bob's output depends only on y and \lambda. The locality is the key clause: Alice's output does not depend on Bob's input, and vice versa.

In such a model, let A_x(\lambda) \in \{-1, +1\} be Alice's output given input x and hidden variable \lambda, and similarly B_y(\lambda) \in \{-1, +1\}. The correlation is the average over the distribution of \lambda:

\langle A_x B_y\rangle = \int A_x(\lambda)\,B_y(\lambda)\,\rho(\lambda)\,d\lambda,

where \rho(\lambda) is the probability distribution over hidden variables.

Now look at the combination S(\lambda) = A_0(\lambda)B_0(\lambda) + A_0(\lambda)B_1(\lambda) + A_1(\lambda)B_0(\lambda) - A_1(\lambda)B_1(\lambda), evaluated for a single value of \lambda. Factor the first two terms and the last two:

S(\lambda) = A_0(\lambda)\bigl[B_0(\lambda) + B_1(\lambda)\bigr] + A_1(\lambda)\bigl[B_0(\lambda) - B_1(\lambda)\bigr].

Why factor this way: B_0, B_1 \in \{-1, +1\}. Their sum B_0 + B_1 is \pm 2 if they agree, 0 if they disagree. Their difference B_0 - B_1 is the opposite: 0 if they agree, \pm 2 if they disagree. So at most one of the two bracketed quantities is non-zero for any given \lambda — the other is automatically zero.

Exactly one of the two brackets is \pm 2 and the other is 0, so |S(\lambda)| \leq 2 for every \lambda. Averaging over \lambda (which can only shrink things, since averaging a bounded quantity gives something bounded):

|S| = \left|\int S(\lambda)\,\rho(\lambda)\,d\lambda\right| \leq \int |S(\lambda)|\,\rho(\lambda)\,d\lambda \leq 2.

That is the CHSH inequality: for any local hidden-variable theory, |S| \leq 2. The derivation took four lines. It holds for any shared randomness, any lookup table, any strategy built on any classical resource, no matter how elaborate.

Factoring S(λ) — why the classical bound is 2A two-column visual. Left column: S of lambda equals A zero times the sum of B zero and B one, plus A one times the difference of B zero and B one. Right column: a case table showing if B zero equals B one then the sum is plus or minus two and the difference is zero, or if B zero does not equal B one then the sum is zero and the difference is plus or minus two. In either case exactly one bracket survives with magnitude two, so S of lambda has magnitude at most two.S(λ) = A₀(B₀ + B₁) + A₁(B₀ − B₁)case: B₀ = B₁B₀ + B₁ = ±2B₀ − B₁ = 0S(λ) = ±2 · A₀|S(λ)| = 2case: B₀ ≠ B₁B₀ + B₁ = 0B₀ − B₁ = ±2S(λ) = ±2 · A₁|S(λ)| = 2
Why classical $|S| \leq 2$. For any hidden variable $\lambda$, either $B_0$ and $B_1$ agree — making the first bracket $\pm 2$ and the second $0$ — or they disagree, swapping the roles. Exactly one bracket survives; the other vanishes.

Example 1 — the best classical strategy saturates $S = 2$

Suppose Alice and Bob agree in advance that Alice will always output +1 regardless of her input, and Bob will output +1 when his input is 0 or 1 — again always +1. This is deterministic: no randomness, no tricks. Compute the correlators:

\langle A_0 B_0\rangle = (+1)(+1) = +1, \quad \langle A_0 B_1\rangle = +1, \quad \langle A_1 B_0\rangle = +1, \quad \langle A_1 B_1\rangle = +1.

Then

S = 1 + 1 + 1 - 1 = 2.

The all-+1 strategy saturates the classical bound. The minus sign in the CHSH combination is the whole reason they cannot do better: it asks the correlations to be aligned on three measurement-setting pairs and anti-aligned on the fourth, which is algebraically incompatible with fixed classical answers.

What this shows. Saturating |S| = 2 is easy for a classical team; exceeding it is impossible. If Alice and Bob ever produce S > 2 in the lab, their behaviour is not explicable by any local classical strategy — no matter how they prepare, no matter what random bits they share, no matter what lookup tables they use. That is the content of Bell's theorem in one sentence.

The quantum bound — |S| \leq 2\sqrt 2

Now let Alice and Bob share the Bell state |\Phi^+\rangle = \tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle). Each of them will measure along a direction on the Bloch sphere: Alice in direction \vec{a}_x (depending on her input x \in \{0, 1\}), Bob in direction \vec{b}_y.

The observable "measure spin along direction \vec{n}" is the Pauli operator \vec{n} \cdot \vec{\sigma} = n_x X + n_y Y + n_z Z, where X, Y, Z are the Pauli matrices. Its eigenvalues are \pm 1, which is why we chose \{-1, +1\} as the output alphabet.

For the Bell state, the correlation of Alice measuring along \vec{a} and Bob along \vec{b} is the textbook result (derived from a direct computation of \langle \Phi^+ |\, (\vec{a}\cdot\vec\sigma) \otimes (\vec{b}\cdot\vec\sigma)\, |\Phi^+\rangle):

\langle A(\vec a)\, B(\vec b)\rangle_{\Phi^+} = \vec{a} \cdot \vec{b} = \cos\theta_{ab},

where \theta_{ab} is the angle between the two measurement axes. The correlation is just the dot product; cosines of angles between the Bloch-sphere directions.

Why the correlation is \vec a \cdot \vec b on |\Phi^+\rangle: the Bell state has a rotational structure — (U \otimes U^*)|\Phi^+\rangle = |\Phi^+\rangle for any single-qubit unitary U. This symmetry forces the expectation value to depend only on the angle between the measurement axes, not on their absolute orientation. A direct calculation in the computational basis nails down the dependence as the cosine. The cleanest derivation runs through Schmidt decomposition; the details can wait for ch.22.

Now pick four directions, one for each setting:

The angles between each Alice-Bob pair work out to:

Plug into the CHSH sum:

S = \tfrac{1}{\sqrt 2} + \tfrac{1}{\sqrt 2} + \tfrac{1}{\sqrt 2} - \bigl(-\tfrac{1}{\sqrt 2}\bigr) = \tfrac{4}{\sqrt 2} = 2\sqrt 2.

S = 2\sqrt 2 \approx 2.828. Above the classical bound of 2 by a factor of \sqrt 2. The quantum strategy beats every conceivable classical one.

The optimal CHSH measurement anglesA circle drawn in the x-z plane of the Bloch sphere. Alice's axes A zero and A one are at zero and ninety degrees; Bob's axes B zero and B one are at forty-five and minus forty-five degrees. The angles between the Alice-Bob pairs are all forty-five degrees except the A one, B one pair which is one hundred thirty-five degrees. The cosines of these angles give the four CHSH correlators.zxA₀ = ZA₁ = XB₀ = (Z+X)/√2B₁ = (Z−X)/√245°the four angles:∠(A₀,B₀) = 45°, cos = 1/√2∠(A₀,B₁) = 45°, cos = 1/√2∠(A₁,B₀) = 45°, cos = 1/√2∠(A₁,B₁) = 135°, cos = −1/√2S = 4 × (1/√2) = 2√2beats classical bound of 2
The optimal CHSH configuration. Alice measures along $Z$ or $X$; Bob measures along the two axes $45°$ away from each. Three correlators give $+1/\sqrt 2$, one gives $-1/\sqrt 2$, and the CHSH sum (with its three plus signs and one minus sign) aligns all four contributions constructively to give $2\sqrt 2$.

Example 2 — computing $\langle A_0 B_0\rangle$ for the Bell state

Verify directly that measuring Z on Alice's qubit and (Z+X)/\sqrt 2 on Bob's qubit of |\Phi^+\rangle gives correlation 1/\sqrt 2.

Step 1 — The operator. Alice measures Z; Bob measures (Z+X)/\sqrt 2. The joint observable is

A_0 \otimes B_0 = Z \otimes \frac{Z+X}{\sqrt 2} = \frac{1}{\sqrt 2}\bigl(Z \otimes Z + Z \otimes X\bigr).

Step 2 — Action on |\Phi^+\rangle. First compute Z \otimes Z on |\Phi^+\rangle. Since Z|0\rangle = +|0\rangle and Z|1\rangle = -|1\rangle:

(Z \otimes Z)\tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle) = \tfrac{1}{\sqrt 2}\bigl((+1)(+1)|00\rangle + (-1)(-1)|11\rangle\bigr) = |\Phi^+\rangle.

Why Z \otimes Z is the identity on |\Phi^+\rangle: each component has even parity (both |00\rangle and |11\rangle have 0+0 and 1+1, both even). Z \otimes Z multiplies by (-1)^{\text{parity}}, which is +1 on even-parity states. So (Z \otimes Z)|\Phi^+\rangle = |\Phi^+\rangle. Immediately, \langle \Phi^+|Z \otimes Z|\Phi^+\rangle = \langle \Phi^+|\Phi^+\rangle = 1.

Now compute Z \otimes X on |\Phi^+\rangle. Since X|0\rangle = |1\rangle and X|1\rangle = |0\rangle:

(Z \otimes X)\tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle) = \tfrac{1}{\sqrt 2}\bigl((+1)|01\rangle + (-1)|10\rangle\bigr) = \tfrac{1}{\sqrt 2}(|01\rangle - |10\rangle) = |\Psi^-\rangle.

Then \langle \Phi^+ | \Psi^-\rangle = 0 because |\Phi^+\rangle and |\Psi^-\rangle are orthogonal Bell basis states.

Step 3 — Assemble.

\langle A_0 B_0\rangle = \langle \Phi^+|A_0 \otimes B_0|\Phi^+\rangle = \tfrac{1}{\sqrt 2}\bigl(\langle \Phi^+|Z\otimes Z|\Phi^+\rangle + \langle \Phi^+|Z \otimes X|\Phi^+\rangle\bigr) = \tfrac{1}{\sqrt 2}(1 + 0) = \tfrac{1}{\sqrt 2}.

Result. \langle A_0 B_0\rangle = 1/\sqrt 2, matching \vec a_0 \cdot \vec b_0 = \cos 45°. The dot-product formula was a shortcut; Example 2 is the direct computation under the hood.

What this shows. The quantum correlation formula \langle A(\vec a)B(\vec b)\rangle = \vec a \cdot \vec b for the Bell state is not a definition — it is a computable consequence of the state, the Pauli operators, and the tensor-product structure. Every CHSH experiment that reports a value of S is, in effect, repeating this calculation in hardware.

The Tsirelson bound — why not beyond 2\sqrt 2?

A natural next question: could some other quantum state or set of measurements push S higher than 2\sqrt 2? The answer is no: 2\sqrt 2 is the maximum quantum value, regardless of state, regardless of measurements, for the standard CHSH combination. This is known as the Tsirelson bound (Boris Tsirelson, 1980).

The proof is not hard in outline. Treat A_0, A_1, B_0, B_1 as operators with eigenvalues in \{-1, +1\} (so A_x^2 = I, B_y^2 = I) and with [A_x, B_y] = 0 (Alice's and Bob's operators commute because they act on disjoint factors). Then compute the square of the CHSH operator \hat S = A_0(B_0 + B_1) + A_1(B_0 - B_1):

\hat S^2 = 4 I - [A_0, A_1][B_0, B_1].

The second term's magnitude is bounded by 4 since both commutators are bounded by 2 in operator norm. So \hat S^2 \leq 8\,I, giving \|\hat S\| \leq 2\sqrt 2. Clean identity, clean bound. The algebraic constants of quantum mechanics — specifically the i in [X, Z] = -2iY — put the ceiling at exactly 2\sqrt 2.

The three bounds on CHSHA horizontal number line from zero to four. Tick marks and vertical lines at S equals two (classical bound, labelled CHSH inequality), S equals two root two approximately two point eight two eight (quantum Tsirelson bound), and S equals four (algebraic maximum, labelled PR-box). The regions between are shaded and labelled classical, quantum, and hypothetical.02classical bound(CHSH inequality)2√2quantum maximum(Tsirelson)4algebraic max(PR-box)classical regionquantumno physical theory
Three bounds on the CHSH quantity. Classical local hidden-variable theories live below $2$. Quantum mechanics reaches up to $2\sqrt 2$ — no higher. The algebraic maximum $4$ is reachable by hypothetical theories ("PR-boxes") that are non-signalling but more powerful than quantum mechanics; no such theory is realised in nature.

So CHSH has three bounds, not two:

The gap between 2\sqrt 2 and 4 is a curious thing. It is possible to write down a fictitious theory — the PR-box, after Popescu and Rohrlich (1994) — that saturates |S| = 4, remains non-signalling (so doesn't violate relativity), but is not quantum mechanics. Why does nature stop at 2\sqrt 2 rather than 4? This is an open question in quantum foundations. It is a hint that quantum mechanics is more constrained than pure non-signalling alone would require, and the reasons are not fully understood.

The experiments — Aspect 1982 and the 2015 loophole-free trio

The CHSH inequality was proposed in 1969 as a cleaner experimental target than Bell's original 1964 inequality. The first experiment to report a clear violation was Alain Aspect's 1982 work at Orsay, France. Using polarisation-entangled photon pairs from a calcium cascade, Aspect and collaborators measured S \approx 2.70 \pm 0.02 — more than thirty standard deviations above the classical bound.

But the Aspect experiment, and every subsequent experiment for decades, had loopholes. A loophole is a specific way a dedicated sceptic might still defend a local hidden-variable interpretation, by pointing to an imperfection in the experiment. Three loopholes mattered:

Closing all three loopholes simultaneously took three decades of experimental progress: better photon detectors, space-like separation of measurement events, and verifiably random choice of settings. In 2015, three experiments achieved it:

Three independent experiments, three independent platforms (electron spins, photons, photons), three independent labs, all reporting loophole-free violations of the CHSH inequality within the same year. The question was settled. Local hidden-variable theories are wrong. Quantum mechanics makes the right predictions, and those predictions exceed anything classical physics can accommodate.

Measured CHSH values in landmark experimentsA horizontal plot showing measured S values on an axis from 2 to 3. Four experiments are shown as horizontal dashes with error bars. Aspect 1982 reads 2.70 plus minus 0.02. Hensen 2015 reads 2.42 plus minus 0.20. Giustina 2015 reads above 2.4. Shalm 2015 reads 2.35. Each is well above the classical bound of 2 shown as a red vertical line. The Tsirelson bound of 2.828 is shown as a dashed vertical line.2.02.252.52.75classical: S = 2Tsirelson2√2 ≈ 2.83Aspect 1982photon pairs, Orsay2.70 ± 0.02Hensen 2015 (Delft)electron spins, 1.3 km2.42 ± 0.20Shalm 2015 (NIST)photons, 180 m2.35
Measured CHSH values in selected experiments, with the classical bound at $2$ and the quantum Tsirelson bound at $2\sqrt 2$. All four measurements are above the classical bound by many standard deviations. Hensen, Giustina, and Shalm (all 2015) closed all three major loopholes simultaneously.

What Bell violation means — and what it does not mean

The result is clean and the temptation to over-read it is enormous. Take the temperature down and state carefully what has been proved.

What is established: no local hidden-variable theory can reproduce quantum mechanical predictions for CHSH-type experiments. The world cannot be described by a model in which (a) each particle carries pre-determined outcomes as local properties, and (b) Alice's measurement outcome depends only on her local property and her measurement choice. One of (a) or (b) — or both — must fail.

What is not established: that anything travels faster than light. Bell violation is perfectly compatible with special relativity. Careful: in the literature, the word "non-local" is used in a specific technical sense — violating the Bell inequality — not in the colloquial sense of "one event affects another at spacelike separation." Those are different statements. The no-communication theorem (seen in ch.18) forbids using a Bell state to send information; Bell violation is a statement about correlations that need no causal influence to arise.

What Bell violation also does not prove: that "reality" doesn't exist, that "the moon isn't there when nobody looks," or any of the other sweeping metaphysical claims that hover around quantum foundations. What Bell proved is narrower and sharper: any hidden-variable model attached to quantum mechanics must either reject locality (in the technical sense) or reject counterfactual definiteness (the claim that measurements have definite outcomes even when not performed). Most practising physicists interpret this as: quantum mechanics is not a local classical theory with hidden extras. The deeper philosophical fallout depends on which interpretation you pick — and interpretations of quantum mechanics are a choose-your-own-adventure genre, not an empirical distinction.

Hype check. Bell's theorem does not prove faster-than-light signalling is possible. It does not say the two particles are "communicating instantaneously." It does not say Alice's measurement "changes" Bob's particle. What it says is: the joint probability distribution over Alice's and Bob's measurement outcomes cannot be reproduced by any classical model that respects locality. The correlation is real; the causation is not. Every Bell test uses post-hoc classical communication to compare outcomes; no test has ever shown, or could show, faster-than-light information transfer. Pop-science articles that describe entanglement as "instant communication between particles" are wrong. The technical content of Bell's theorem is precisely what this chapter just said — no more, no less.

Why Bell violation matters — device independence

Beyond the philosophical stakes, Bell violation has a deeply practical consequence. It powers a class of quantum cryptographic protocols called device-independent: protocols whose security depends only on the observed CHSH value, not on any trust in the quantum devices Alice and Bob use.

In ordinary quantum key distribution (QKD), Alice and Bob assume their devices correctly prepare and measure the quantum states they claim to. If a manufacturer slipped in a flaw, security is lost. Device-independent QKD, first proposed by Ekert (1991) and formalised by Mayers and Yao (1998), sidesteps this. If Alice and Bob repeatedly measure CHSH on pairs from a shared source and consistently get S close to 2\sqrt 2, they can certify — without trusting the manufacturer at all — that their source is producing near-Bell-state pairs and their measurements are near-optimal. Any eavesdropper's attempt to tamper would reduce their observed S.

The first demonstrations of device-independent QKD appeared in 2022 (Nadlinger et al., using trapped ions; Zhang et al., using photons), and the field has grown rapidly since. The underlying guarantee — that a quantum state's entanglement can be certified by a number, the CHSH violation, without any further modelling — is sometimes called the self-testing property of Bell states. It is the strongest form of quantum security argument known.

ISRO's 2022 quantum key distribution demonstration from Bengaluru to Hyderabad, part of the National Quantum Mission, uses BB84 rather than device-independent QKD; but the longer-term roadmap, articulated in the mission's programme documents, anticipates moving toward Bell-inequality-based security once the photon-detection efficiencies are high enough. The Raman Research Institute in Bangalore — home to some of India's leading quantum-optics experimentalists — has published CHSH-based entanglement-certification results since the 2010s, and is one of the few Indian labs capable of closing Bell-test loopholes on photonic setups.

Common confusions

Going deeper

The CHSH game, the classical bound, the quantum bound, and the Tsirelson ceiling are the core. What follows is the machinery — a full derivation of the Tsirelson bound via the SDP (semidefinite programme) hierarchy, a pointer into device-independent quantum cryptography and its modern practice, the structure of non-signalling theories beyond quantum, the PR-box, and some historical notes on John Bell himself.

The Tsirelson bound via SDP hierarchy

Tsirelson's \sqrt 2 was proved in 1980 via a direct operator-algebra argument. A more systematic treatment, the Navascués-Pironio-Acín (NPA) hierarchy (2007), turns the problem "what are the quantum values achievable in a Bell scenario?" into a sequence of semidefinite programmes. The k-th level of the hierarchy is a relaxation whose optimal value upper-bounds the true quantum value, and the sequence converges to the exact quantum value as k \to \infty.

For the CHSH scenario, the first level of the NPA hierarchy already saturates at 2\sqrt 2 — the quantum maximum is exactly the level-1 SDP bound. For more complicated scenarios, higher levels are needed. The NPA hierarchy is the standard computational tool in Bell-inequality research today; it is how a working physicist asks "what is the quantum value of this new Bell expression" and gets a tight number.

Finite-level NPA bounds can be computed in standard SDP solvers (Mosek, SCS, CVXOPT). For the interested student: Qiskit does not currently include NPA tooling out of the box; the ncpol2sdpa Python package is the common entry point.

Device-independent QKD and randomness expansion

Device-independent quantum cryptography is the practical pay-off of Bell violation. Two paradigmatic tasks:

Both are active research areas with growing practical relevance as hardware improves. The common thread is: Bell violation certifies quantum behaviour, and quantum behaviour certifies randomness and security.

Non-signalling theories and the PR-box

The non-signalling principle says Alice's marginal statistics cannot depend on Bob's input — the natural causal constraint on any physical theory compatible with relativity. Any theory satisfying non-signalling obeys |S| \leq 4 (the trivial algebraic maximum). Quantum mechanics satisfies non-signalling and obeys the stronger bound |S| \leq 2\sqrt 2. So something stronger than non-signalling is constraining quantum correlations.

Popescu and Rohrlich (1994) asked: is there a physically reasonable theory saturating |S| = 4, stronger than quantum but still non-signalling? They constructed the PR-box: a hypothetical device producing output pairs (a, b) satisfying a \oplus b = x \cdot y (XOR equals product of inputs) with uniform marginal statistics. A PR-box is non-signalling but achieves S = 4. No such device is realised by any known physical theory, but the PR-box is a useful thought-experiment for understanding what constrains quantum correlations beyond non-signalling alone.

Candidate constraints investigated: information causality (Pawłowski et al., 2009), macroscopic locality, and others. None has yet been shown to single out quantum mechanics exactly — so the question "what principle picks 2\sqrt 2 over 4?" remains genuinely open, and it is one of the cleanest open problems in quantum foundations.

Multipartite Bell inequalities and GHZ proofs

Bell's 1964 argument and CHSH concern two-party scenarios. For three or more parties, stronger inequalities exist. The Mermin inequality (1990) for three-party correlations has a classical bound of 2 and a quantum maximum of 4 — a factor-2 gap rather than \sqrt 2. The GHZ argument (Greenberger-Horne-Zeilinger, 1989) goes further: on the GHZ state |\mathrm{GHZ}\rangle = \tfrac{1}{\sqrt 2}(|000\rangle + |111\rangle), certain measurements give outcomes that are perfectly deterministic under quantum mechanics and perfectly impossible under local hidden variables — a single run of a clean GHZ experiment rules out local hidden variables, without any inequality or statistical averaging.

The GHZ argument is pedagogically famous because it gives the cleanest possible demonstration that Bell-type reasoning does not require statistical distributions — it can be done as a simple counterfactual contradiction. It is the three-qubit analogue of CHSH and it is often the example a lecturer reaches for after showing CHSH.

John Bell — a portrait

Bell, born in Belfast in 1928, worked at CERN on particle physics for most of his career. His 1964 paper on the hidden-variable problem was a side project, published in a low-profile journal (Physics Physique Fizika) that folded a few years later. For decades it was a minority interest; the dominant Copenhagen-interpretation view held that questioning quantum mechanics' foundations was unprofessional. Bell, characteristically, did not care. He pursued the foundational questions with a combination of philosophical clarity and technical rigour that is now textbook.

Bell died in 1990 of a cerebral haemorrhage at 62 — before the loophole-free experiments, before the Nobel Committee awarded Aspect, Clauser, and Zeilinger the 2022 Physics Prize for experimental tests of Bell inequalities. His collected essays, Speakable and Unspeakable in Quantum Mechanics, are among the clearest writings on the conceptual foundations of quantum mechanics ever produced, and remain essential reading for anyone who wants to internalise what Bell actually proved.

Where this leads next

References

  1. B. Hensen et al., Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres (Nature, 2015) — arXiv:1508.05949.
  2. Wikipedia, Bell's theorem — history, statement, and context.
  3. Wikipedia, CHSH inequality — the specific inequality used in most modern Bell tests, including its derivation and bounds.
  4. Wikipedia, Loophole-free Bell test — a survey of the 2015 experiments and the closed loopholes.
  5. John Preskill, Lecture Notes on Quantum Computation, Ch. 4 (Bell inequalities and CHSH) — theory.caltech.edu/~preskill/ph229.
  6. Nielsen and Chuang, Quantum Computation and Quantum Information (Cambridge, 2010), §2.6 on the EPR paradox and Bell's inequality — Cambridge University Press.