In short
A classical coin flipped in the air and a qubit in the state |+\rangle = (|0\rangle + |1\rangle)/\sqrt{2} both give "0 or 1 with 50-50 odds" when measured in the computational basis. They are not the same object. Apply a Hadamard gate to each: the coin still gives 50-50, but the qubit gives 0 with probability 1 — deterministically. The difference is that the qubit carries two amplitudes, and amplitudes can be added with signs: they can interfere. A 50-50 classical random mixture is described by two probabilities \{\tfrac12, \tfrac12\} that can only add and reinforce; the superposition |+\rangle is described by two amplitudes \{\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\} whose relative signs can be rearranged to make one of them cancel to zero. Every quantum algorithm's advantage lives in this gap.
Here is a puzzle. Your friend brings you two identical-looking cardboard boxes. Inside each, she says, is a device that outputs a 0 or a 1 when you press the button, with perfectly fair 50-50 odds. The first box contains a spinning ₹1 coin on a little platform, hidden behind a flap that drops when you press the button. The second box contains a single qubit prepared in some specific quantum state, whose button runs a measurement in the computational basis.
You press each button a thousand times. You get roughly 500 zeros and 500 ones from both. Statistics identical. Boxes indistinguishable.
Your friend then pulls out a new button marked "H" and hands it to you. Press H on the first box, then the original button: the coin spins differently, but the odds are still 50-50. Press H on the second box, then the original button: the qubit gives 0. Every time. A thousand presses, a thousand zeros.
That single H button is the difference between classical randomness and quantum superposition, and the whole of quantum computing lives in it. The coin box was storing a classical probability distribution — 50% heads, 50% tails — and no deterministic procedure can extract more from it than that distribution contains. The qubit box was storing a superposition, a specific vector |+\rangle = \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{1}{\sqrt{2}}|1\rangle whose structure — not just its measurement statistics — encodes information that the right gate can read out exactly.
This chapter pulls that difference apart with no hand-waving. You will see why amplitudes are not probabilities, why a Hadamard gate can turn an apparent coin into a certain 0, how the double-slit experiment and the Mach-Zehnder interferometer are the same story in two different experimental clothes, and why the word "superposition" means something strictly richer than the word "random."
The same statistics, two different objects
Write down what the two boxes actually contain.
Box 1 — classical randomness. Inside is a coin that is either already heads or already tails — you just don't know which. Mathematically, the box's state is a probability distribution: \{p(0) = \tfrac12, \, p(1) = \tfrac12\}. Two non-negative real numbers that sum to 1. That's the whole object. When you measure, you sample from the distribution, so you see 0 half the time and 1 half the time. Nothing is hidden in the phase or the sign of a probability — there is no phase, there is no sign.
Box 2 — quantum superposition. Inside is a qubit in the state |+\rangle = \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{1}{\sqrt{2}}|1\rangle. Two complex amplitudes, each of magnitude 1/\sqrt{2}, attached to the two basis states. The Born rule gives the measurement probabilities: p(0) = |1/\sqrt{2}|^2 = 1/2 and p(1) = 1/2. So the measurement statistics in the computational basis look identical to the coin — 50-50. But the object is not a probability distribution. It is a unit vector in \mathbb{C}^2, and the two amplitudes are signed.
The visual is a schematic — the qubit's space is \mathbb{C}^2, which cannot literally fit in a flat page, and the Bloch sphere (chapter 14) is the honest 3D picture. The point of the diagram is the asymmetry: the left box's numbers add, the right box's numbers add with sign.
Why the sign matters: a probability distribution only tells you "how much" of each outcome to expect. A quantum state tells you "how much" and "in what phase." Two probabilities of \tfrac12 cannot combine to zero — two non-negative numbers cannot cancel. Two amplitudes of \tfrac{1}{\sqrt{2}} and -\tfrac{1}{\sqrt{2}} can.
The experiment that distinguishes them — the Hadamard test
Time to build the H button. The Hadamard gate is the single-qubit unitary

H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.

It acts on the computational basis states as

H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = |+\rangle, \qquad H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) = |-\rangle.
Why those outputs: applying a matrix to a basis vector reads off a column of the matrix. The first column of H is \tfrac{1}{\sqrt{2}}(1, 1)^T, which is |+\rangle. The second column is \tfrac{1}{\sqrt{2}}(1, -1)^T, which is |-\rangle. The notation |+\rangle and |-\rangle is mnemonic: the sign between |0\rangle and |1\rangle is plus or minus.
Hadamard is its own inverse: H \cdot H = I. So applying H twice does nothing. In particular, H|+\rangle = HH|0\rangle = |0\rangle. If you start from |+\rangle and apply H, you land exactly on |0\rangle, and a computational-basis measurement then returns 0 with probability 1.
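A quick numerical check of this involution, using plain numpy (a sketch, not tied to any particular quantum library):

```python
import numpy as np

# Hadamard: columns are |+> and |->, amplitudes of magnitude 1/sqrt(2).
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)

ket0 = np.array([1.0, 0.0])
plus = H @ ket0                       # |+> = (|0> + |1>)/sqrt(2)

assert np.allclose(H @ H, np.eye(2))  # H is its own inverse

# H|+> = |0>: a computational-basis measurement now gives 0 with certainty.
probs = np.abs(H @ plus) ** 2         # Born rule: p(i) = |amplitude_i|^2
print(probs)                          # prints [1. 0.]
```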
Now run the same gate on the classical coin. The closest classical analogue of H is a stochastic process, taking the distribution \{\tfrac12, \tfrac12\} to some output distribution p'_i = \sum_j T_{ij} p_j with T_{ij} \geq 0 and \sum_i T_{ij} = 1. Non-negative entries cannot cancel anything. A stochastic map can certainly produce a deterministic output (the map that ignores its input and always emits 0 does), but it then emits 0 for every input, so it reveals nothing about what was in the box. The only reversible stochastic maps are permutations of outcomes, and a permutation sends \{\tfrac12, \tfrac12\} to \{\tfrac12, \tfrac12\}. Once the coin is 50-50, nothing reversible changes that, and nothing irreversible learns from it.
This is the gap. Quantum gates are unitary — reversible, and allowed to have negative (and complex) entries. Classical stochastic matrices are positive — non-negative entries, no cancellation, only mixing. Superposition lives in the first world; classical randomness lives in the second.
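The contrast fits in a few lines of numpy (illustrative; on one bit, the identity and the bit-flip are the only permutations, i.e. the only reversible stochastic maps):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)

# The unitary H separates two states with identical Z-basis statistics:
assert np.allclose(np.abs(H @ plus) ** 2,  [1, 0])   # |+> -> certainly 0
assert np.allclose(np.abs(H @ minus) ** 2, [0, 1])   # |-> -> certainly 1

# Classically, the only reversible stochastic maps on one bit are the
# two permutations, and both leave the 50-50 distribution untouched:
uniform = np.array([0.5, 0.5])
for T in (np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])):
    assert np.allclose(T @ uniform, uniform)
```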
Why this matters. The Hadamard gate is the tool you will use, again and again, to prepare a quantum register in an equal superposition of all input strings at the start of Deutsch–Jozsa, Simon's, Grover's, and Shor's algorithms. Every one of those algorithms relies on the fact that H creates a specific superposition with known amplitudes — not a random mixture. The second half of each algorithm is designed to rearrange those amplitudes by interference until the right answer is the one H (or its inverse) deterministically extracts.
Amplitudes versus probabilities — what phase adds
Probabilities are non-negative real numbers. Amplitudes are complex numbers. Every real number can be written as magnitude \times sign; every complex number can be written as magnitude \times phase e^{i\varphi}. The magnitude of the amplitude is what you square to get a probability; the phase is the extra piece — the angle in the complex plane — that probabilities throw away.
Consider two states that differ only in a sign on |1\rangle:

|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \qquad |-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle).
The probabilities in the computational basis are identical: |1/\sqrt{2}|^2 = 1/2 for both basis outcomes, in both states. Yet the states are physically different — indeed, they are orthogonal: \langle+|-\rangle = \tfrac12 \cdot 1 + \tfrac12 \cdot (-1) = 0. A measurement in the right basis distinguishes them perfectly. Apply H to |+\rangle and you get |0\rangle; apply H to |-\rangle and you get |1\rangle. Same computational-basis statistics, but one is flagged as "was |+\rangle" and the other as "was |-\rangle" once you use the H button to reveal the hidden sign.
The two amplitudes of a qubit combine like two arrows in the complex plane. If you want to know "what is the amplitude for outcome 0?", and outcome 0 has contributions from two different paths inside your circuit, you add the two path-amplitudes as vectors in \mathbb{C} and then square the magnitude of the sum.
This addition-before-squaring is the exact opposite of how probabilities work. For classical random processes, if there are two mutually exclusive paths that lead to outcome 0, you add the path-probabilities: p_0 = p_{\text{path 1}} + p_{\text{path 2}}. The paths' contributions always reinforce each other — two non-negative numbers cannot cancel. For quantum amplitudes, you add the amplitudes first, then square: p_0 = |\alpha_{\text{path 1}} + \alpha_{\text{path 2}}|^2. If the two path amplitudes point in opposite directions, they wipe each other out and the probability drops to zero, even though each individual path would contribute a non-zero amount on its own.
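A tiny numpy contrast of the two rules, using two equal-magnitude path amplitudes of opposite sign:

```python
import numpy as np

a1, a2 = 1 / np.sqrt(2), -1 / np.sqrt(2)   # two path amplitudes, opposite signs

# Classical rule: add the path probabilities -- they can only reinforce.
p_classical = abs(a1) ** 2 + abs(a2) ** 2   # 1/2 + 1/2 = 1

# Quantum rule: add the amplitudes first, then square the magnitude.
p_quantum = abs(a1 + a2) ** 2               # |0|^2 = 0: complete cancellation

assert abs(p_classical - 1.0) < 1e-9
assert p_quantum == 0.0
```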
This "add then square" rule is the Feynman sum-over-paths principle in miniature. In the full path-integral formulation of quantum mechanics, the amplitude to go from A to B is the sum of amplitudes over all possible intermediate routes. Each route contributes e^{iS/\hbar}, where S is a classical action. When neighbouring routes differ by action much larger than \hbar, their phases scramble and their amplitudes cancel; when they differ by less than \hbar, they reinforce. The classical trajectory is where the phase is stationary and the contributions add up constructively. For a qubit circuit, the "paths" are the different sequences of basis states the qubit could be in at each time step, and the rule is the same — the amplitudes add, and the probability is the squared magnitude of the sum.
The double-slit experiment — the canonical interference picture
The cleanest physical demonstration of amplitudes-versus-probabilities is the double-slit experiment. Fire a single particle — a photon, an electron, even a C_{60} fullerene molecule (Zeilinger's group did this in 1999) — at a barrier with two narrow slits. Behind the barrier, a screen records where the particle lands.
If particles were tiny classical billiard balls and the slits were just two possible paths, you would expect the intensity pattern on the screen to be the sum of the intensities from each slit taken alone. Cover slit B, you get a blob behind slit A. Cover slit A, you get a blob behind slit B. Uncover both, you get the sum — two blobs.
That is not what happens. With both slits open, the screen shows a series of alternating bright and dark bands — an interference pattern — with some positions on the screen receiving less light than with only one slit open. In places where "slit A alone" and "slit B alone" would each send some small amount of light, "both slits open" sometimes sends zero. The two light beams are not just adding; they are cancelling at dark fringes and reinforcing at bright fringes.
This is exactly the amplitude-addition rule from the previous section, now played out in space. The amplitude for a photon to land at a point P on the screen is the sum of two path-amplitudes: one for the photon going through slit A to P, one for slit B to P. Each path-amplitude has a magnitude and a phase. The phase depends on the path length — longer paths acquire more phase. When both paths to P have the same length (straight in front of the midpoint between the slits), the amplitudes arrive in phase and add — bright fringe. When the path lengths differ by half a wavelength, the amplitudes are 180° out of phase and cancel — dark fringe.
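The fringe logic can be sketched numerically. The model below is illustrative: both slit amplitudes are given equal weight, and each route's phase grows by 2\pi per wavelength of path length, as in the paragraph above.

```python
import numpy as np

wavelength = 1.0
# Path-length differences between the two slit-to-screen routes.
deltas = np.array([0.0, 0.25, 0.5, 1.0]) * wavelength

# Each route contributes equal magnitude; phase grows 2*pi per wavelength.
phase = 2 * np.pi * deltas / wavelength
amplitude = (1 + np.exp(1j * phase)) / 2     # normalised two-path sum
intensity = np.abs(amplitude) ** 2

# Equal paths -> in phase -> bright fringe; half-wavelength -> dark fringe.
assert np.isclose(intensity[0], 1.0)   # delta = 0: bright
assert np.isclose(intensity[2], 0.0)   # delta = lambda/2: dark
assert np.isclose(intensity[3], 1.0)   # delta = lambda: bright again
```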
The crucial subtlety: the interference pattern appears even when you fire one photon at a time, with seconds between shots. Each photon arrives at a specific point on the screen; after many thousands of photons, the collected dots recreate the bands. No photon interferes with another photon — each photon interferes with itself, because the amplitude to arrive at a given screen point contains contributions from both slits simultaneously. This is the signature that distinguishes quantum superposition (amplitudes through two paths, for one particle) from classical randomness (probabilities for "which path the particle took," as if it took one or the other). The two-slit experiment was how the difference was finally nailed down, over the first half of the 20th century.
C.V. Raman's 1928 observation of inelastic light scattering — India's first Nobel Prize in the sciences — depends on the same quantum logic in a different dress: photons and molecules exchange energy in amplitude-level transitions whose phases interfere to produce the sharp Raman lines. Interference isn't a laboratory curiosity; it's how quantum mechanics writes itself into every spectrum in a pharmaceutical assay, every diffraction pattern in an X-ray crystallography lab, every colour in a diamond's fluorescence.
The Mach-Zehnder interferometer — the QC-native double-slit
The double-slit is the most famous interference experiment; the Mach-Zehnder interferometer is the one that looks most like a quantum circuit. Instead of slits, you use two beam splitters. A single photon is sent into the first beam splitter, which has equal transmissivity and reflectivity — naively, the photon takes the upper path with probability 1/2 and the lower path with probability 1/2 (in fact, it enters a superposition of the two). Two mirrors redirect both paths into a second beam splitter. Two detectors, call them D0 and D1, sit in the two output ports of the second beam splitter.
If the two beam splitters were classical 50-50 random switches, you would expect each detector to fire 50% of the time — a random photon through random switches. What actually happens, for a photon of the right wavelength and identical arm lengths, is that one detector fires with probability 1 and the other never fires; which is which depends on the input port and the beam splitters' sign conventions (the worked example below finds D1). The two beam splitters, together, have funnelled the photon into one specific detector. Unbalance the arms, or insert a glass plate in one arm to shift the phase, and the pattern flips: the silent detector fires, the bright one goes dark. The arm-length difference controls which detector lights up, continuously between 100-0 and 0-100.

This is exactly the Hadamard-test behaviour. The first beam splitter is the first H on the qubit (put the photon into superposition). The mirrors are free propagation, which in the qubit language is a phase rotation on the path-ket. The second beam splitter is the second H. In quantum-circuit notation, the interferometer is the gate sequence

H \; P(\varphi) \; H, \qquad P(\varphi) = \begin{pmatrix} e^{i\varphi} & 0 \\ 0 & 1 \end{pmatrix},

where \varphi is the relative phase picked up in the upper arm.

When \varphi = 0 (balanced arms), the two amplitudes at one detector arrive out of phase and cancel, so the photon reaches the other detector with probability 1 (in the worked example's port labelling, D1). When \varphi = \pi (one arm a half-wavelength longer than the other), the roles reverse. The interferometer is a two-level quantum system with path as its qubit, and every gate you have learned on qubits corresponds to a physical operation you can do to an optical path.
The Mach-Zehnder makes the path-qubit explicit. The "classical coin that lands heads or tails at the first beam splitter" picture would have each detector firing half the time. Real experiments do not see that. What they see is the structured, deterministic behaviour of a two-state quantum system whose amplitudes add and cancel. Students often find the Mach-Zehnder the first moment the "picture" of superposition clicks — because you can literally point at the two paths, and the amplitude-addition is geometric.
Worked examples
Example 1 — |+\rangle versus the classical 50-50 mixture
Consider two systems:
- System A: a qubit prepared in the state |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle).
- System B: a qubit prepared as follows. Flip a fair coin. If heads, prepare |0\rangle; if tails, prepare |1\rangle. Hand the result to a collaborator without telling them the coin result.
Show that System A and System B give identical statistics when measured in the computational basis, but distinguishable statistics when measured in the X basis \{|+\rangle, |-\rangle\}.
Step 1 — Computational-basis measurement of System A.
The Born rule on |+\rangle:

p(0) = |\langle 0|+\rangle|^2 = \left|\tfrac{1}{\sqrt{2}}\right|^2 = \tfrac12, \qquad p(1) = |\langle 1|+\rangle|^2 = \tfrac12.
Why: \langle 0|+\rangle picks out the coefficient of |0\rangle in |+\rangle, namely 1/\sqrt{2}; squaring its modulus gives the probability.
Step 2 — Computational-basis measurement of System B.
System B is a probability-\tfrac12 mixture of |0\rangle and |1\rangle. With probability \tfrac12 the qubit is |0\rangle (measurement gives 0 with certainty); with probability \tfrac12 the qubit is |1\rangle (measurement gives 1 with certainty). So

p(0) = \tfrac12 \cdot 1 + \tfrac12 \cdot 0 = \tfrac12, \qquad p(1) = \tfrac12 \cdot 0 + \tfrac12 \cdot 1 = \tfrac12.
Why: mixtures are linear in probabilities — the law of total probability, conditioning on the coin outcome.
Identical to System A. On Z-basis measurements alone, you cannot tell them apart.
Step 3 — X-basis measurement of System A.
The X-basis projectors are P_+ = |+\rangle\langle +| and P_- = |-\rangle\langle -|. For System A in state |+\rangle:

p(+) = |\langle +|+\rangle|^2 = 1, \qquad p(-) = |\langle -|+\rangle|^2 = 0.
Why: \langle +|+\rangle = 1 (unit vector); \langle -|+\rangle = 0 (orthogonal basis states). System A in state |+\rangle is certainly |+\rangle when you ask the question "|+\rangle or |-\rangle?"
Step 4 — X-basis measurement of System B.
With probability \tfrac12 System B is in |0\rangle. Then p(+ | |0\rangle) = |\langle +|0\rangle|^2 = |1/\sqrt 2|^2 = 1/2. With probability \tfrac12 it is in |1\rangle. Then p(+ | |1\rangle) = |\langle +|1\rangle|^2 = |1/\sqrt 2|^2 = 1/2. Combining:

p(+) = \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot \tfrac12 = \tfrac12, \qquad p(-) = \tfrac12.
Why: the mixture is 50-50 across two states that individually each give 50-50 in the X basis. Both averages are 1/2.
Result.
| Measurement | System A (pure superposition) | System B (classical 50-50 mixture) |
|-------------|-------------------------------|------------------------------------|
| Z basis     | 50-50                         | 50-50                              |
| X basis     | + with certainty              | 50-50                              |
The X-basis measurement — equivalently, the Hadamard-then-Z test — is what tells the pure superposition apart from the classical mixture. System A gives a definite, repeatable outcome in the X basis; System B still gives noise. That experimental difference is the entire evidence that superposition is real and not "just hidden classical randomness."
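A Monte-Carlo sketch of the whole example in numpy (illustrative; the X-basis measurement is implemented as Hadamard-then-Z, as in the text):

```python
import numpy as np

rng = np.random.default_rng(1)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
plus = np.array([1.0, 1.0]) / np.sqrt(2)
n = 100_000

def zero_fraction(state, basis="Z"):
    """Fraction of n shots giving outcome 0 (outcome + in the X basis)."""
    if basis == "X":
        state = H @ state             # X-basis measurement = H, then Z
    return (rng.random(n) < abs(state[0]) ** 2).mean()

# System A: every shot starts from the pure state |+>.
a_z = zero_fraction(plus)
a_x = zero_fraction(plus, "X")

# System B: a fair coin picks |0> or |1>. The Z outcome mirrors the coin;
# in the X basis, |0> and |1> each give + with probability 1/2.
b_z = (rng.random(n) < 0.5).mean()
b_x = (rng.random(n) < 0.5).mean()

assert abs(a_z - 0.5) < 0.01 and abs(b_z - 0.5) < 0.01   # Z: indistinguishable
assert a_x > 0.999 and abs(b_x - 0.5) < 0.01             # X: only A is deterministic
```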
Example 2 — a Mach-Zehnder amplitude calculation
Track a photon through a balanced Mach-Zehnder interferometer with a tunable phase shifter \varphi in the upper arm, and compute the probability of detection at D0 as a function of \varphi.
Label the upper arm |u\rangle and the lower arm |\ell\rangle, and represent each beam splitter as the Hadamard gate. The photon starts in |\ell\rangle (treating "lower arm" as the input port the source illuminates).
Step 1 — First beam splitter.

H|\ell\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle - |\ell\rangle\right).
Why: a 50-50 beam splitter takes the photon into an equal amplitude superposition of the two paths, with a relative sign dictated by its unitary — here the same sign convention as a Hadamard on |1\rangle \to |-\rangle. Other sign conventions exist for beam splitters; they all produce interference.
Step 2 — Phase shifter on the upper arm.

\frac{1}{\sqrt{2}}\left(|u\rangle - |\ell\rangle\right) \;\longrightarrow\; \frac{1}{\sqrt{2}}\left(e^{i\varphi}|u\rangle - |\ell\rangle\right).
Why: a longer optical path, or a glass plate, multiplies the amplitude of the affected arm by e^{i\varphi}. The relative phase between the two arms is what the second beam splitter will turn into an intensity difference.
Step 3 — Second beam splitter (Hadamard again).
Recall H|u\rangle = \tfrac{1}{\sqrt 2}(|u\rangle + |\ell\rangle) and H|\ell\rangle = \tfrac{1}{\sqrt 2}(|u\rangle - |\ell\rangle). Identify |u\rangle and |\ell\rangle with the two output ports |D_0\rangle and |D_1\rangle of the second beam splitter (this is a convention about how ports are wired; the physics is unaffected by relabeling). Apply H:

\frac{1}{\sqrt{2}}\left(e^{i\varphi}\, H|u\rangle - H|\ell\rangle\right) = \frac{e^{i\varphi}}{2}\left(|D_0\rangle + |D_1\rangle\right) - \frac{1}{2}\left(|D_0\rangle - |D_1\rangle\right).
Why: linearity. The beam splitter acts on each arm ket separately.
Collect terms:

\frac{e^{i\varphi} - 1}{2}\,|D_0\rangle + \frac{e^{i\varphi} + 1}{2}\,|D_1\rangle.
Step 4 — Born rule at the detectors.
The probability of detection at D0:

p(D_0) = \left|\frac{e^{i\varphi} - 1}{2}\right|^2 = \frac{2 - (e^{i\varphi} + e^{-i\varphi})}{4} = \frac{1 - \cos\varphi}{2} = \sin^2\!\left(\frac{\varphi}{2}\right).
Why: the second step uses |z|^2 = z\bar z with z = (e^{i\varphi}-1)/2. The third uses e^{i\varphi} + e^{-i\varphi} = 2\cos\varphi. The final step uses the identity 1 - \cos\varphi = 2\sin^2(\varphi/2).
Similarly p(D_1) = \cos^2(\varphi/2), and the two probabilities add to 1.
Result.

p(D_0) = \sin^2\!\left(\frac{\varphi}{2}\right), \qquad p(D_1) = \cos^2\!\left(\frac{\varphi}{2}\right).
What this shows. The detector probabilities oscillate with the phase \varphi. When \varphi = 0 (balanced arms), p(D_0) = 0 and p(D_1) = 1 — the photon deterministically lands at D1, because the amplitudes at D0 cancel and the amplitudes at D1 add. When \varphi = \pi, the situation reverses. Classical random switches could never produce this — they would give p = 1/2 at both detectors regardless of arm length. The continuous phase control is the experimental fingerprint of amplitude addition, and it is the same \sin^2(\varphi/2) pattern that shows up in every two-path interferometer and every one-qubit Ramsey experiment.
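The same calculation can be run numerically as the gate sequence H, phase, H (a sketch using this example's port labels: index 0 = upper arm = D0, index 1 = lower arm = D1):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def detector_probs(phi):
    """H, phase phi on the upper arm, H again.
    Index 0 = upper arm = D0, index 1 = lower arm = D1."""
    P = np.diag([np.exp(1j * phi), 1.0])        # phase shifter on |u>
    state = H @ P @ H @ np.array([0.0, 1.0])    # photon enters the lower port
    return np.abs(state) ** 2

# The numerics reproduce sin^2 and cos^2 at every phase.
for phi in np.linspace(0.0, 2 * np.pi, 9):
    p0, p1 = detector_probs(phi)
    assert np.isclose(p0, np.sin(phi / 2) ** 2)
    assert np.isclose(p1, np.cos(phi / 2) ** 2)
```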
Common confusions
- "Superposition means the qubit is both 0 and 1 at the same time." No. A qubit in superposition is in a specific vector state like \tfrac{1}{\sqrt 2}|0\rangle + \tfrac{1}{\sqrt 2}|1\rangle, with two complex amplitudes. When you measure, you get either 0 or 1 — one classical bit, never both. "Both at once" throws away the amplitudes and leaves you with a slogan that cannot predict the result of the Hadamard test. The correct mental picture is amplitude distribution, not "both values at once."
- "A superposition is just a 50-50 coin you haven't looked at yet." This is the classical-mixture confusion, and Example 1 is the refutation. The |+\rangle state and the classical 50-50 mixture of |0\rangle and |1\rangle give identical statistics in the computational basis but different statistics in the X basis. The X-basis measurement is an experimentally realisable operation that separates the pure superposition from any classical probability assignment. A "hidden coin" can never give a deterministic outcome under a carefully chosen rotated measurement, because you cannot rotate a probability into certainty — but you can rotate an amplitude pattern into certainty.
- "Decoherence just turns a pure superposition into a classical random mixture." This one is approximately true but needs a precise framing. When a qubit interacts with an environment it cannot track, the reduced density matrix of the qubit tends toward a classical mixture, and the off-diagonal elements (which carry the phase information) shrink toward zero. The experimentally measurable quantum behaviour — the Hadamard test, the Mach-Zehnder interference — is lost. So from the qubit's perspective, decoherence is the conversion of a pure superposition into a mixture. But globally, including the environment, the state remains pure and unitary; the information has leaked, not vanished. The full treatment requires density matrices (chapter 19) and the partial trace (chapter 26).
- "Interference only happens for photons and electrons, not for qubits." The two words refer to the same phenomenon. Every qubit gate that produces a definite output from a superposition input is doing the amplitude-addition trick — the Mach-Zehnder is literally a one-qubit Hadamard test in optical clothes. Any time you see amplitudes combining to give non-trivial probabilities, you are watching interference, whether it is labelled "quantum optics" or "quantum computing."
- "Quantum mechanics is just probability with complex numbers." This slogan flattens the theory, and in practice it misleads. Yes, amplitudes are complex and probabilities are their squared magnitudes, but the fact that gates operate on amplitudes (not on probabilities) and that measurement is a specific non-linear projection (not an expectation over a prior) are what make quantum mechanics inequivalent to any classical probability theory. The Bell inequality violations (which you will meet in chapter 45) are the sharpest expression of this inequivalence — no classical probability model with local hidden variables can reproduce quantum correlations.
Going deeper
If you have seen that |+\rangle and a classical 50-50 mixture differ in their X-basis statistics, and you have watched the Mach-Zehnder amplitudes cancel and add, you have the core. What follows is the structural machinery — density matrices, purity, decoherence, and the connection to quantum advantage — that the next chapters will build out fully.
Density matrices — one formalism for superpositions and mixtures
A single vector |\psi\rangle describes a pure state. A probability distribution over vectors \{(p_i, |\psi_i\rangle)\} describes a mixed state. To handle both in one notation, introduce the density matrix:

\rho = \sum_i p_i\, |\psi_i\rangle\langle \psi_i|.
For a pure state |\psi\rangle, the density matrix is \rho_{\text{pure}} = |\psi\rangle\langle \psi| — a rank-1 projector. For a 50-50 mixture of |0\rangle and |1\rangle, the density matrix is \rho_{\text{mix}} = \tfrac12 |0\rangle\langle 0| + \tfrac12 |1\rangle\langle 1| = \tfrac12 I — the maximally mixed state on a qubit, also called the "completely random" state.
Compute \rho for |+\rangle: \rho_+ = |+\rangle\langle+| = \tfrac12\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}. Compare with \rho_{\text{mix}} = \tfrac12\begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix}. The diagonal entries are identical — both give 50-50 on a computational-basis measurement. The off-diagonal entries — the "coherences" — are what distinguish the pure superposition from the mixture. \rho_+ has 1/2 off-diagonal; \rho_{\text{mix}} has 0.
Off-diagonals carry the phase information. When a system decoheres, the off-diagonals decay to zero while the diagonals stay put — a pure superposition becomes a classical mixture, from the subsystem's perspective, because the off-diagonals have been washed out by the environment. That is the density-matrix signature of decoherence.
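Both density matrices take two lines of numpy to build, and the comparison is exactly the diagonal/off-diagonal split described above:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

rho_plus = np.outer(plus, plus)                  # pure |+><+|
rho_mix = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket1, ket1)

# Same diagonals: identical 50-50 computational-basis statistics.
assert np.allclose(np.diag(rho_plus), np.diag(rho_mix))

# Different off-diagonals ("coherences"): only the superposition has them.
assert np.isclose(rho_plus[0, 1], 0.5)
assert rho_mix[0, 1] == 0.0
```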
Purity — one number that separates them
The purity of a state is \text{tr}(\rho^2). For a pure state, \text{tr}(\rho^2) = 1. For a completely mixed state on a d-dimensional system, \text{tr}(\rho^2) = 1/d. For a qubit, the boundary is 1/2 — every state lies on or inside the Bloch ball, with pure states on the surface (the Bloch sphere) and mixed states strictly inside.
Computing for the two states above: \text{tr}(\rho_+^2) = \text{tr}(|+\rangle\langle+|+\rangle\langle +|) = \text{tr}(|+\rangle\langle +|) = 1. And \text{tr}(\rho_{\text{mix}}^2) = \text{tr}((\tfrac12 I)^2) = \text{tr}(\tfrac14 I) = \tfrac12. Purity 1 versus purity 1/2 — a single scalar that tells the superposition apart from the mixture, without needing to choose a measurement basis.
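The same purity computation, in numpy:

```python
import numpy as np

plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho_plus = np.outer(plus, plus)      # pure state |+><+|
rho_mix = np.eye(2) / 2              # maximally mixed qubit

def purity(rho):
    return np.trace(rho @ rho).real

assert np.isclose(purity(rho_plus), 1.0)   # pure: tr(rho^2) = 1
assert np.isclose(purity(rho_mix), 0.5)    # maximally mixed: 1/d = 1/2
```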
Decoherence as a quantitative timescale
Real qubits are not isolated. Each platform has a decoherence time T_2 that measures how long the off-diagonal elements survive before coupling to the environment drags them to zero. Rough orders of magnitude in 2025:
- Superconducting qubits: T_2 \sim 100 \, \mu\text{s}.
- Trapped ions: T_2 \sim 1{-}100 \, \text{s} (hyperfine qubits can reach even longer).
- Neutral atoms: T_2 \sim 1 \, \text{s}.
- NV-centre electron spins in diamond: T_2 \sim 1 \, \text{ms} at room temperature.
A quantum algorithm must finish its interference-sensitive operations well inside T_2, or the off-diagonals will have decayed too far for the measurement pattern to reveal the computation's signature. This is the engineering constraint that shapes how many gates you can actually run. Error correction (Part 17) is the long-term answer: encode one logical qubit in many physical qubits, detect and correct errors before they accumulate, and effectively extend T_2 to astronomical values.
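A toy dephasing model makes the constraint concrete. The sketch below assumes a pure-dephasing channel that shrinks the off-diagonals of |+\rangle\langle+| by e^{-t/T_2} while leaving the diagonals fixed, and tracks what the "H button" then returns (illustrative numbers, superconducting-qubit scale):

```python
import numpy as np

T2 = 100e-6   # illustrative dephasing time, roughly the superconducting scale

def rho_after(t):
    """|+><+| under pure dephasing: diagonals fixed, coherences decay."""
    c = 0.5 * np.exp(-t / T2)
    return np.array([[0.5, c], [c, 0.5]])

def hadamard_test_p0(rho):
    """Probability of outcome 0 after pressing the 'H button'."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    return (H @ rho @ H)[0, 0].real

assert np.isclose(hadamard_test_p0(rho_after(0.0)), 1.0)       # fresh: certain 0
assert np.isclose(hadamard_test_p0(rho_after(T2)), 0.5 * (1 + np.exp(-1)))
assert np.isclose(hadamard_test_p0(rho_after(100 * T2)), 0.5)  # decohered: a coin
```

Once t exceeds a few T_2, the H button returns 50-50: the qubit box has become experimentally indistinguishable from the coin box.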
Why interference is the quantum advantage
Every proven quantum speedup ultimately uses interference. Shor's algorithm engineers an amplitude pattern whose quantum Fourier transform is sharply peaked at integer multiples of 1/r (the period). Grover's algorithm iteratively rotates the amplitude vector toward the marked state — a sequence of interference events. Deutsch–Jozsa distinguishes constant from balanced by making wrong-branch amplitudes cancel. The quantum-simulation advantage uses interference to track the true many-body dynamics of a quantum system.
This gives you the "why quantum is hard to simulate classically": a classical simulator must track all 2^n amplitudes explicitly, and must update each one per gate with floating-point arithmetic that gets the signs and phases right. The state space grows exponentially, and no classical trick can replace the native physical evolution of amplitudes. A quantum computer, being a quantum system itself, carries the amplitudes as physical properties — it does not need to "store" them as numbers.
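The storage cost alone makes the point. Assuming one complex128 number (16 bytes) per amplitude:

```python
def statevector_gib(n):
    """GiB needed to store all 2**n amplitudes at 16 bytes each."""
    return 16 * 2 ** n / 2 ** 30

for n in (20, 30, 40, 50):
    print(f"{n} qubits: {statevector_gib(n):,.3f} GiB")
# Each added qubit doubles the cost; by ~50 qubits (~16 PiB) no classical
# machine can even hold the state, let alone update it gate by gate.
```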
The deep fact is that no classical probabilistic model with non-negative weights can reproduce quantum statistics. This is made precise by Bell's theorem for measurements in correlated two-qubit systems. Bell correlations between entangled qubits violate any non-negative, locally realistic probability model; the only known theory that reproduces the full set of experimentally observed correlations requires amplitudes that can go negative (or complex). Superposition is the single-qubit face of this irreducibility; entanglement is the two-qubit face; both point at the same underlying fact — the world computes with signed amplitudes, and no simulation in non-negative numbers will capture it.
Where this leads next
- The Measurement Problem (Briefly) — chapter 16, the complementary question: if unitary evolution is reversible and measurement is not, where does one stop and the other begin?
- Interference and Phase — deeper dive into how relative phases shape every multi-path amplitude computation.
- Density Matrices — Introduction — the formalism for pure and mixed states in one language.
- Bell Inequality — the two-qubit statement that quantum correlations cannot be reproduced by any classical hidden-variable model.
- Deutsch–Jozsa Algorithm — the first clean demonstration that Hadamard-plus-interference gives a quantum speedup.
References
- John Preskill, Lecture Notes on Quantum Computation — theory.caltech.edu/~preskill/ph229.
- Nielsen and Chuang, Quantum Computation and Quantum Information — Cambridge University Press.
- Wikipedia, Double-slit experiment.
- Wikipedia, Mach–Zehnder interferometer.
- Markus Arndt et al., Wave–particle duality of C_{60} molecules (1999) — Nature, via Zeilinger group.
- Qiskit Textbook, Introduction to Quantum Computing.