In short

Given two quantum states \rho and \sigma on the same Hilbert space, there are two standard numbers that quantify their relationship. Fidelity asks how close they are: F(\rho,\sigma) = \bigl(\text{tr}\sqrt{\sqrt\rho\,\sigma\,\sqrt\rho}\bigr)^2, a number in [0, 1] that equals 1 iff \rho = \sigma and reduces to |\langle\psi|\phi\rangle|^2 for pure states. Trace distance asks how distinguishable they are: D(\rho,\sigma) = \tfrac{1}{2}\,\text{tr}|\rho-\sigma|, also in [0, 1], equal to 1 iff the states have orthogonal supports. Both are preserved by unitaries and monotone under quantum channels. Their operational meanings are sharp: fidelity equals the maximum overlap of purifications on a shared extended space (Uhlmann), and trace distance equals 2p_{\text{correct}} - 1 where p_{\text{correct}} is the best single-shot probability of telling \rho from \sigma. They are related by the Fuchs-van de Graaf inequalities 1 - \sqrt F \leq D \leq \sqrt{1 - F}. Fidelity is the metric experimental papers quote when reporting how well a state was prepared; trace distance is the metric error-correction thresholds and cryptographic security proofs are written in.

You run a quantum circuit on a real device. You wanted the state \rho_{\text{target}} = |\psi\rangle\langle\psi| with |\psi\rangle = \tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle) — a Bell pair. You ask a tomography routine to reconstruct what you actually got, and it hands you back a 4\times 4 matrix \rho_{\text{actual}} that is nearly but not exactly \rho_{\text{target}}. Some elements are slightly off. The eigenvalues are not quite (1, 0, 0, 0); they are more like (0.97, 0.02, 0.008, 0.002).

How close did you get? That is the question this chapter answers. "Close" is a quantitative word, and there are two standard quantitative answers — fidelity and trace distance. They measure different things, they have different operational meanings, and the right tool depends on the question you are asking. An error-correction paper will quote one; a tomography paper will quote the other; a security proof will use the first to prove the second; a cross-platform comparison will use both. Learn them together.

Two questions, two numbers

Before any formulas, separate the two questions cleanly.

"How close are \rho and \sigma?" The answer is fidelity F(\rho, \sigma). F = 1 means the states are identical. F = 0 means they are as far apart as the formalism allows. Higher is better. Fidelity is the quantity experimentalists most often report because it has a clean pure-state limit: if both states are pure, F = |\langle\psi|\phi\rangle|^2, the familiar Born-rule overlap.

"How distinguishable are \rho and \sigma?" The answer is trace distance D(\rho, \sigma). D = 0 means you can't tell them apart by any measurement. D = 1 means a single copy is enough to distinguish them with certainty. Higher is worse if you wanted them equal; higher is better if you wanted to tell them apart. Trace distance is the quantity cryptographic security proofs and error-correction thresholds are usually written in, because it controls the probability that an adversary — or an error — can be caught.

Both are numbers in [0, 1]. Both collapse to familiar classical objects in limits. Both obey strong invariance and monotonicity properties. But they are not the same number, and the relationship between them has its own theorem (the Fuchs-van de Graaf inequalities, below).

[Figure: Two questions about two states. Left box, "Fidelity: how close?", F(ρ, σ) = (tr √(√ρ σ √ρ))², F = 1 iff ρ = σ, pure states: F = |⟨ψ|φ⟩|². Right box, "Trace distance: how different?", D(ρ, σ) = ½ tr|ρ − σ|, D = 0 iff ρ = σ, pure states: D = √(1 − |⟨ψ|φ⟩|²).]
Fidelity and trace distance are the two standard metrics between quantum states. Fidelity measures closeness (higher is better for state preparation); trace distance measures distinguishability (higher means an adversary or an error is easier to catch). Both are constrained to $[0, 1]$ and obey the same invariance rules, but they answer distinct questions.

Fidelity

Fidelity takes its cleanest form for pure states, which is where you should meet it first.

Pure-state fidelity — the Born overlap

For two pure states |\psi\rangle and |\phi\rangle, the fidelity is just the squared modulus of their inner product:

F(|\psi\rangle, |\phi\rangle) \;=\; |\langle\psi|\phi\rangle|^2.

This is exactly the Born-rule probability you already know: if you prepared |\phi\rangle and measured in a basis containing |\psi\rangle, the probability of getting the outcome |\psi\rangle is |\langle\psi|\phi\rangle|^2. Why the square: the inner product \langle\psi|\phi\rangle is a complex amplitude; the observable quantity is the probability |\langle\psi|\phi\rangle|^2, and it is this probability that is 1 when the states are identical and 0 when they are orthogonal.

So for pure states, fidelity has an immediate physical meaning: "if you thought you had |\phi\rangle and were testing whether it is really |\psi\rangle by measuring in the \{|\psi\rangle, \ldots\} basis, fidelity is the probability that you pass the test." Identical states pass with probability 1; orthogonal states always fail.

Mixed-state fidelity — the general formula

For two general density operators, pure-state overlap is no longer defined (you can't take an inner product of density matrices directly; they are operators, not vectors). Fidelity generalises via a surprising but beautiful formula:

Fidelity

The fidelity between two density operators \rho, \sigma on the same Hilbert space is

F(\rho, \sigma) \;=\; \left(\text{tr}\sqrt{\sqrt{\rho}\,\sigma\,\sqrt{\rho}}\right)^2.

It satisfies 0 \leq F(\rho,\sigma) \leq 1, with F = 1 iff \rho = \sigma and F = 0 iff \rho, \sigma have orthogonal supports (no vector is non-zero under both).

The formula has two matrix square roots and a trace, and at first sight is heavy. Notice three things.

First, the definition is symmetric: F(\rho, \sigma) = F(\sigma, \rho), even though the formula is not manifestly symmetric. This is a theorem (proof via Uhlmann's characterisation below), and it is deep — fidelity measures the relationship between states, not the order you supply them in.

Second, when \rho = |\psi\rangle\langle\psi| is pure, the matrix \sqrt\rho = \rho (a rank-1 projector is its own square root), and \sqrt\rho\,\sigma\,\sqrt\rho = |\psi\rangle\langle\psi|\sigma|\psi\rangle\langle\psi| = \langle\psi|\sigma|\psi\rangle\,|\psi\rangle\langle\psi|. The square root of that rank-1 operator is \sqrt{\langle\psi|\sigma|\psi\rangle}\,|\psi\rangle\langle\psi|, whose trace is \sqrt{\langle\psi|\sigma|\psi\rangle}. Squaring: F(|\psi\rangle\langle\psi|, \sigma) = \langle\psi|\sigma|\psi\rangle. Why this matters: when one state is pure, fidelity reduces to the expectation value of the pure state's projector in the mixed state — a single matrix element, not the full double-square-root machinery. Most experimental fidelities are of this "compare to a target pure state" flavour, and this simpler formula is what actually gets computed.

Third, when both \rho and \sigma are pure, the formula further reduces to F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) = |\langle\psi|\phi\rangle|^2 — the pure-state Born overlap from the previous subsection. The general formula is a strict generalisation.
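As a rough numerical companion to the definition, the sketch below evaluates the general formula with NumPy and compares it with the pure-target shortcut \langle\psi|\sigma|\psi\rangle and with the arguments swapped. The helper names psd_sqrt and fidelity, and the noisy Bell state (a 97%/3% mixture of the target with white noise), are illustrative assumptions, not data from any device.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix (eigendecomposition)."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)          # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = (tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = (inner + inner.conj().T) / 2     # re-symmetrise against round-off
    return float(np.real(np.trace(psd_sqrt(inner))) ** 2)

# Target Bell state (|00> + |11>)/sqrt(2) as a density matrix.
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_target = np.outer(psi, psi.conj())

# Hypothetical noisy preparation: 97% target, 3% white noise.
rho_actual = 0.97 * rho_target + 0.03 * np.eye(4) / 4

f_general = fidelity(rho_target, rho_actual)                 # full double-square-root formula
f_shortcut = float(np.real(psi.conj() @ rho_actual @ psi))   # <psi|rho|psi> for a pure target
f_swapped = fidelity(rho_actual, rho_target)                 # symmetry check
print(f_general, f_shortcut, f_swapped)
```

The three numbers should agree up to floating-point error. The eigendecomposition route is used here only to keep the example dependency-free; a library matrix square root would serve equally well.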

Uhlmann's theorem — the operational meaning

Where does the awkward \sqrt{\sqrt\rho\,\sigma\,\sqrt\rho} come from? The clean answer is Uhlmann's theorem, which recharacterises fidelity in terms of pure states on a larger space.

Uhlmann's theorem

Let \rho, \sigma be density operators on \mathcal H. Let \mathcal H' be an ancilla space with \dim\mathcal H' \geq \max(\text{rank}\,\rho, \text{rank}\,\sigma). Then

F(\rho, \sigma) \;=\; \max_{|\psi_\rho\rangle, |\psi_\sigma\rangle}\,|\langle\psi_\rho|\psi_\sigma\rangle|^2,

where the maximum is over all purifications of \rho and \sigma in \mathcal H \otimes \mathcal H'.

Read that carefully. Every density operator \rho on \mathcal H admits purifications — pure states |\psi_\rho\rangle \in \mathcal H \otimes \mathcal H' whose reduction on \mathcal H is \rho (see Purification). These purifications are not unique; ancilla unitaries generate all of them. Uhlmann's theorem says: pick the pair of purifications that overlap the most. The squared magnitude of that maximum overlap is exactly F(\rho, \sigma).

Why this is beautiful: pure-state overlaps are trivial to compute (\langle\psi_\rho|\psi_\sigma\rangle is just a complex number). The operator-level formula with its nested square roots is an indirect computation of this maximum — the same number, reached via matrix algebra rather than ancilla optimisation. Uhlmann's theorem says the two routes always agree.

And the operational reading: fidelity is "the best match the two mixed states can achieve when you are allowed to choose how they sit inside a larger pure-state world." Symmetry F(\rho,\sigma) = F(\sigma,\rho) is immediate from the formula — |\langle\psi_\rho|\psi_\sigma\rangle|^2 = |\langle\psi_\sigma|\psi_\rho\rangle|^2. Monotonicity under channels (next section) has a one-line purification-based proof.

Properties of fidelity

Trace distance

Trace distance starts from a different idea: the L^1-norm on matrices, adapted to the quantum setting.

The definition and the one-norm

Trace distance

The trace distance between two density operators \rho, \sigma on the same Hilbert space is

D(\rho, \sigma) \;=\; \tfrac{1}{2}\,\|\rho - \sigma\|_1 \;=\; \tfrac{1}{2}\,\text{tr}|\rho - \sigma|,

where |A| = \sqrt{A^\dagger A} and the one-norm is \|A\|_1 = \text{tr}|A|. It satisfies 0 \leq D(\rho, \sigma) \leq 1, with D = 0 iff \rho = \sigma and D = 1 iff \rho, \sigma have orthogonal supports.

The factor of \tfrac{1}{2} normalises the range to [0, 1] for states (without it, two orthogonal pure states would give \|\rho-\sigma\|_1 = 2, not 1).

When \rho - \sigma is Hermitian (it always is here — the difference of two Hermitian operators is Hermitian), |\rho - \sigma| = \sqrt{(\rho-\sigma)^2} has the same eigenvectors as \rho - \sigma but with eigenvalues replaced by their absolute values. So if \rho - \sigma has eigenvalues \{\lambda_i\}, then \text{tr}|\rho-\sigma| = \sum_i |\lambda_i|, and

D(\rho, \sigma) \;=\; \tfrac{1}{2}\sum_i |\lambda_i|.

Why the sum of absolute eigenvalues: the one-norm of a Hermitian operator is the sum of the absolute values of its eigenvalues, because the matrix |H| has those as its (non-negative) eigenvalues. This is the quantum analogue of |x_1| + |x_2| + \cdots, the classical L^1 norm.

Notice the nice special case. The eigenvalues of \rho - \sigma sum to \text{tr}(\rho-\sigma) = 1 - 1 = 0, so the positive eigenvalues and the negative eigenvalues have equal total magnitude. If you split \rho - \sigma = P - Q into its positive and negative parts (both PSD, supported on orthogonal subspaces), then \text{tr}(P) = \text{tr}(Q), and

D(\rho,\sigma) \;=\; \text{tr}(P) \;=\; \text{tr}(Q).

This is the Jordan decomposition of the difference, and it is what connects trace distance to measurement probabilities.
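The eigenvalue formula and the Jordan decomposition can be sketched in a few lines of NumPy. The function names trace_distance and jordan_parts and the two qubit states are assumptions chosen for illustration.

```python
import numpy as np

def trace_distance(rho, sigma):
    """D = (1/2) * sum of |eigenvalues| of the Hermitian difference rho - sigma."""
    eigs = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * float(np.sum(np.abs(eigs)))

def jordan_parts(rho, sigma):
    """Split rho - sigma = P - Q with P, Q positive semidefinite on orthogonal supports."""
    vals, vecs = np.linalg.eigh(rho - sigma)
    p = (vecs * np.clip(vals, 0.0, None)) @ vecs.conj().T    # keep positive eigenvalues
    q = (vecs * np.clip(-vals, 0.0, None)) @ vecs.conj().T   # keep |negative eigenvalues|
    return p, q

# Illustrative qubit states (assumed for the example).
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
sigma = np.array([[0.4, 0.0], [0.0, 0.6]], dtype=complex)

d = trace_distance(rho, sigma)
p, q = jordan_parts(rho, sigma)
print(d, np.trace(p).real, np.trace(q).real)   # all three should coincide
```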

Pure-state trace distance

For two pure states, a direct calculation gives

D(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) \;=\; \sqrt{1 - |\langle\psi|\phi\rangle|^2}.

The derivation. Let \rho = |\psi\rangle\langle\psi|, \sigma = |\phi\rangle\langle\phi|. The matrix \rho - \sigma lives in the two-dimensional span of \{|\psi\rangle, |\phi\rangle\}; outside this span, \rho - \sigma is zero. Inside it, you can compute eigenvalues by going to an orthonormal basis of the span and writing the 2\times 2 matrix explicitly. The result is eigenvalues \pm\sqrt{1 - |\langle\psi|\phi\rangle|^2} with zero elsewhere; summing absolute values gives 2\sqrt{1 - |\langle\psi|\phi\rangle|^2}, and dividing by 2 yields the formula.
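The 2\times 2 computation can be written out explicitly. The sketch below assumes an orthonormal basis \{|\psi\rangle, |\psi^\perp\rangle\} of the span and the shorthand c and s, neither of which appears in the text above; the phase of |\psi^\perp\rangle is chosen so that s is real.

|\phi\rangle \;=\; c\,|\psi\rangle + s\,|\psi^\perp\rangle, \qquad c = \langle\psi|\phi\rangle, \quad s = \sqrt{1 - |c|^2}.

\rho - \sigma \;=\; \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} - \begin{pmatrix}|c|^2 & c\,s \\ c^{*}s & s^2\end{pmatrix} \;=\; \begin{pmatrix}s^2 & -c\,s \\ -c^{*}s & -s^2\end{pmatrix}.

\text{tr}(\rho-\sigma) = 0, \qquad \det(\rho-\sigma) = -s^4 - |c|^2 s^2 = -s^2, \qquad\text{so}\qquad \lambda_\pm = \pm s = \pm\sqrt{1 - |\langle\psi|\phi\rangle|^2}.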

So for pure states, F = |\langle\psi|\phi\rangle|^2 and D = \sqrt{1 - F}. The two metrics are exactly related: knowing one determines the other. The story is different for mixed states — see the Fuchs-van de Graaf inequalities below.

Operational meaning — the distinguishing task

Trace distance has the sharpest operational meaning of any quantum-state distance. Here is the game.

A referee picks a fair coin and, depending on the outcome, hands you one copy of either \rho or \sigma. You know what \rho and \sigma are; you just don't know which you received. You are allowed any measurement — projective, POVM, anything — and you must guess which state you hold. Your success probability is

p_{\text{correct}} \;=\; \frac{1}{2} + \frac{1}{2}D(\rho, \sigma).

If \rho = \sigma (D = 0), you guess at chance: p_{\text{correct}} = 1/2. If \rho \neq \sigma have orthogonal supports (D = 1), there is a measurement that tells them apart perfectly: p_{\text{correct}} = 1. Every intermediate D gives a linearly-interpolated optimum.

Sketch of why. Decompose \rho - \sigma = P - Q with P, Q \geq 0 supported on orthogonal subspaces. The optimal measurement is the two-outcome projective measurement \{\Pi_+, \Pi_-\} where \Pi_+ projects onto the support of P and \Pi_- = I - \Pi_+. Probability of guessing \rho when it was \rho plus probability of guessing \sigma when it was \sigma: \tfrac{1}{2}\text{tr}(\Pi_+\rho) + \tfrac{1}{2}\text{tr}(\Pi_-\sigma) = \tfrac{1}{2}(1 + \text{tr}(\Pi_+(\rho-\sigma))) = \tfrac{1}{2}(1 + \text{tr}(P)) = \tfrac{1}{2} + \tfrac{1}{2}D(\rho, \sigma). Why this measurement is optimal: the Helstrom bound says no other measurement does better, and you can see the intuition — you are separating the eigenspaces where \rho dominates (\Pi_+) from those where \sigma dominates (\Pi_-), which is exactly the information the trace distance captures.

This is the Helstrom bound. Every security proof, every state-discrimination protocol, every distinguishing argument in quantum information theory eventually collides with this formula. Trace distance is the natural language for "how much can the adversary (or the error) tell?"
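The measurement described in the sketch can be built numerically from the eigenvectors of \rho - \sigma. The following is a minimal illustration, assuming equal priors as in the game above; the function name helstrom, the eigenvalue cutoff 1e-12, and the two diagonal states are choices made for this example.

```python
import numpy as np

def helstrom(rho, sigma):
    """Return (projector onto the positive part of rho - sigma, success probability)."""
    vals, vecs = np.linalg.eigh(rho - sigma)
    pos = vecs[:, vals > 1e-12]                    # eigenvectors where rho dominates
    proj_plus = pos @ pos.conj().T                 # Pi_+
    proj_minus = np.eye(rho.shape[0]) - proj_plus  # Pi_- = I - Pi_+
    # Guess rho on outcome "+", sigma on outcome "-"; each state has prior 1/2.
    p_correct = (0.5 * np.trace(proj_plus @ rho).real
                 + 0.5 * np.trace(proj_minus @ sigma).real)
    return proj_plus, float(p_correct)

# Illustrative diagonal qubit states (assumed for the example).
rho = np.diag([0.8, 0.2]).astype(complex)
sigma = np.diag([0.3, 0.7]).astype(complex)

proj_plus, p_correct = helstrom(rho, sigma)
d = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))
print(p_correct, 0.5 + 0.5 * d)   # the two values should match
```

For these states the difference is diag(0.5, −0.5), so the projector is onto |0\rangle and both printed values are 0.75.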

[Figure: The distinguishing-task operational meaning. A referee flips a fair coin and sends ρ or σ; the best (Helstrom) measurement projects onto the support of the positive part P and succeeds with probability ½ + ½D, ranging from p = ½ at D = 0 (guess at random) to p = 1 at D = 1 (perfect distinction).]
The operational meaning of trace distance. Given a single copy of either $\rho$ or $\sigma$ (equally likely), the best single-shot probability of identifying which state you hold is $\tfrac{1}{2} + \tfrac{1}{2}D(\rho, \sigma)$. Trace distance is the only metric on quantum states with this crisp a meaning.

Properties of trace distance

Fuchs-van de Graaf — how fidelity and trace distance relate

Two metrics, one underlying "difference" between states. How do they constrain each other?

Fuchs-van de Graaf inequalities

For any two density operators \rho, \sigma,

1 - \sqrt{F(\rho,\sigma)} \;\leq\; D(\rho, \sigma) \;\leq\; \sqrt{1 - F(\rho,\sigma)}.

For pure states the right inequality is tight: D = \sqrt{1 - F}. For mixed states the bounds are not tight in general.

The lower bound D \geq 1 - \sqrt F says: low fidelity forces a large trace distance. Rearranged, it reads F \geq (1 - D)^2, so a small trace distance forces high fidelity: if D \leq \epsilon, then F \geq (1-\epsilon)^2 \geq 1 - 2\epsilon. Near F = 1 the bound itself is weak: for F = 1 - \epsilon it only guarantees D \geq 1 - \sqrt{1-\epsilon} \approx \epsilon/2, which is close to 0. The more useful statement, in practice, is the upper bound.

The upper bound D \leq \sqrt{1 - F} says: high fidelity forces the trace distance to be small. This is the direction experimentalists care about. If you measure a fidelity of F = 0.99 against a target state, then D \leq \sqrt{0.01} = 0.1, so the best single-shot probability of telling your prepared state from the target is at most \tfrac{1}{2} + \tfrac{1}{2}(0.1) = 0.55. Fidelity is the cheaper number to measure (a single expectation value of a projector, for target pure states), and Fuchs-van de Graaf turns it into a trace-distance bound automatically.

Only the upper inequality is sharp in the pure-state limit: F = |\langle\psi|\phi\rangle|^2 and D = \sqrt{1 - F} make it an equality, while the lower bound 1 - \sqrt F stays strictly below D except at F = 0 and F = 1. In the mixed-state interior of state space there is genuine slack, and one metric can be much more informative than the other for specific pairs.
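A quick numerical spot-check of the inequalities on random mixed states is sketched below. The random-state ensemble (a normalised G G^\dagger with complex Gaussian G), the sample count, the tolerance, and all function names are assumptions made for this example; passing the check illustrates the theorem but proves nothing.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(rho, sigma):
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = (inner + inner.conj().T) / 2       # re-symmetrise against round-off
    return float(np.real(np.trace(psd_sqrt(inner))) ** 2)

def trace_distance(rho, sigma):
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def random_state(dim, rng):
    """Random density matrix G G^dagger / tr(G G^dagger); one of many possible ensembles."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def fvdg_bounds(f):
    """Interval [1 - sqrt(F), sqrt(1 - F)] that must contain D."""
    f = min(max(f, 0.0), 1.0)                  # guard against round-off outside [0, 1]
    return 1.0 - np.sqrt(f), np.sqrt(1.0 - f)

rng = np.random.default_rng(0)
violations = 0
for _ in range(200):
    rho, sigma = random_state(3, rng), random_state(3, rng)
    lo, hi = fvdg_bounds(fidelity(rho, sigma))
    d = trace_distance(rho, sigma)
    if not (lo - 1e-9 <= d <= hi + 1e-9):
        violations += 1
print("violations:", violations)
print("bounds at F = 0.99:", fvdg_bounds(0.99))
```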

[Figure: Fuchs-van de Graaf inequalities, fidelity versus trace distance. Horizontal axis: fidelity F from 0 to 1; vertical axis: trace distance D from 0 to 1. The allowed region lies between the lower bound D = 1 − √F and the upper bound D = √(1 − F), with identical states at (F, D) = (1, 0) and orthogonal states at (0, 1).]
Fuchs-van de Graaf inequalities sandwich the trace distance $D$ between two functions of the fidelity $F$. Every valid pair $(F, D)$ lies in the shaded region between the two curves. For pure states, $D = \sqrt{1-F}$ exactly (the upper curve); for mixed states, the actual $D$ can be anywhere between the two bounds. Knowing $F$ alone pins $D$ to a short interval.

Worked examples

Example 1: $|0\rangle$ vs $|+\rangle$ — computing both metrics on two pure states

Compute F and D between the two pure single-qubit states |0\rangle and |+\rangle = \tfrac{1}{\sqrt 2}(|0\rangle + |1\rangle). Both are pure, so both formulas collapse to their pure-state versions — and the two numbers will be related by D = \sqrt{1-F} exactly.

Step 1. Compute the inner product. Why start here: for pure states, fidelity is the squared modulus of the inner product, so the inner product is the one number that determines both F and D.

\langle 0 | + \rangle \;=\; \tfrac{1}{\sqrt 2}(\langle 0|0\rangle + \langle 0|1\rangle) \;=\; \tfrac{1}{\sqrt 2}(1 + 0) \;=\; \tfrac{1}{\sqrt 2}.

Step 2. Compute fidelity.

F(|0\rangle, |+\rangle) \;=\; |\langle 0|+\rangle|^2 \;=\; \tfrac{1}{2}.

The fidelity is 1/2 — not great, but not zero. The two states have a substantial overlap.

Step 3. Compute trace distance via the pure-state formula.

D(|0\rangle, |+\rangle) \;=\; \sqrt{1 - F} \;=\; \sqrt{1 - 1/2} \;=\; \tfrac{1}{\sqrt 2} \;\approx\; 0.707.

Step 4. Verify the distinguishing interpretation. If a referee hands you one copy of either |0\rangle or |+\rangle (each with probability 1/2), the best single-shot probability of guessing correctly is

p_{\text{correct}} \;=\; \tfrac{1}{2} + \tfrac{1}{2}D \;=\; \tfrac{1}{2} + \tfrac{1}{2\sqrt 2} \;\approx\; 0.854.

Over 85\%: much better than chance, but not 100\%, because the states are not orthogonal. Why the optimal measurement is tilted: the Helstrom measurement projects onto the eigenvectors of |0\rangle\langle 0| - |+\rangle\langle +|, and the positive eigenvector is proportional to \cos(\pi/8)|0\rangle - \sin(\pi/8)|1\rangle. Its Bloch vector points along the difference of the two Bloch vectors, tilted by \pi/4 from the north pole away from |+\rangle, so the measurement axis is perpendicular to the bisector of the two states rather than along it. Each state then yields the correct outcome with probability \cos^2(\pi/8) \approx 0.854.

Step 5. Sanity-check with the Fuchs-van de Graaf inequalities. The lower bound gives D \geq 1 - \sqrt F = 1 - 1/\sqrt 2 \approx 0.293. The upper bound gives D \leq \sqrt{1-F} = 1/\sqrt 2 \approx 0.707. The true value D = 1/\sqrt 2 saturates the upper bound — as expected for pure states.

Result. F(|0\rangle, |+\rangle) = 1/2 and D(|0\rangle, |+\rangle) = 1/\sqrt 2 \approx 0.707. The two numbers are exactly related by D = \sqrt{1-F} because both states are pure.

[Figure: |0⟩ and |+⟩ on the Bloch sphere, with |0⟩ at the north pole and |+⟩ at the +x point of the equator. The angle between the Bloch vectors is π/2, giving F = ½ and D = 1/√2; in general F = cos²(ϑ/2) and D = sin(ϑ/2).]
The states $|0\rangle$ (north pole) and $|+\rangle$ (equator, $+x$) sit at a right angle on the Bloch sphere. For pure qubits, fidelity is $\cos^2(\vartheta/2)$ and trace distance is $\sin(\vartheta/2)$ where $\vartheta$ is the Bloch-sphere angle between the states. At $\vartheta = \pi/2$, $F = 1/2$ and $D = 1/\sqrt 2$.

What this shows. For two pure qubit states, the full story of "how close" and "how distinguishable" is contained in the single angle between their Bloch vectors. Fidelity and trace distance are just two different trigonometric functions of that angle — both metrics are legitimate, and for pure states they are redundant.
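The two trigonometric formulas can be checked numerically for a few angles. This is a small sketch, assuming states in the x-z plane of the Bloch sphere compared against |0\rangle; the helper name qubit_from_angle and the chosen angles are illustrative.

```python
import numpy as np

def qubit_from_angle(theta):
    """Pure qubit state at polar angle theta in the x-z plane of the Bloch sphere."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

ket0 = qubit_from_angle(0.0)   # north pole, |0>
for theta in (0.0, np.pi / 4, np.pi / 2, np.pi):
    ket = qubit_from_angle(theta)
    f = abs(np.vdot(ket0, ket)) ** 2                          # |<0|psi>|^2
    diff = np.outer(ket0, ket0.conj()) - np.outer(ket, ket.conj())
    d = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(diff)))        # (1/2) tr|rho - sigma|
    assert np.isclose(f, np.cos(theta / 2) ** 2)              # F = cos^2(theta/2)
    assert np.isclose(d, np.sin(theta / 2))                   # D = sin(theta/2)
    print(f"theta={theta:.3f}  F={f:.3f}  D={d:.3f}")
```

The row for theta = π/2 corresponds to the |0\rangle versus |+\rangle pair of this example.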

Example 2: $I/2$ vs $|0\rangle\langle 0|$ — maximally mixed versus a pure state

Compute F and D between the maximally mixed qubit \rho = I/2 and the pure state \sigma = |0\rangle\langle 0|. One is at the centre of the Bloch ball, the other at the north pole: separated by the full radius of the ball, yet not orthogonal.

Step 1. Compute the difference matrix.

\rho - \sigma \;=\; \tfrac{1}{2}\begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix} - \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} \;=\; \begin{pmatrix}-1/2 & 0 \\ 0 & 1/2\end{pmatrix}.

Step 2. Compute trace distance. The eigenvalues of \rho - \sigma are \pm 1/2; sum of absolute values is 1; divide by 2:

D(I/2, |0\rangle\langle 0|) \;=\; \tfrac{1}{2}\cdot 1 \;=\; \tfrac{1}{2}.

Step 3. Compute fidelity using the pure-target simplification. Since \sigma = |0\rangle\langle 0| is pure, F(\rho, \sigma) = \langle 0|\rho|0\rangle. And \langle 0|(I/2)|0\rangle = 1/2, so

F(I/2, |0\rangle\langle 0|) \;=\; \tfrac{1}{2}.

Step 4. Check the Fuchs-van de Graaf bounds. \sqrt F = 1/\sqrt 2 \approx 0.707. Lower bound: D \geq 1 - 1/\sqrt 2 \approx 0.293. Upper bound: D \leq \sqrt{1 - 1/2} = 1/\sqrt 2 \approx 0.707. The true value D = 1/2 lies strictly inside the bounds (not saturating either): because \rho is mixed, the bounds need not be tight. Why the bounds aren't tight here: the upper bound is guaranteed to be an equality only when both states are pure. A pair involving a genuinely mixed state can land in the interior of the allowed region, so F alone does not pin down D.

Step 5. Check the distinguishing interpretation. Best single-shot success probability is 1/2 + 1/4 = 3/4. If someone hands you either a pure |0\rangle or a maximally mixed state (each with probability 1/2), you can guess correctly 75\% of the time by measuring in the computational basis. If you see 0, guess |0\rangle; if you see 1, guess the mixed state. When the state was the pure |0\rangle you always see 0 and are always right; when it was the mixed state the outcome is random and you are right with probability 1/2. The overall success probability is \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot\tfrac{1}{2} = 3/4, matching 1/2 + D/2.

Result. F(I/2, |0\rangle\langle 0|) = 1/2 and D(I/2, |0\rangle\langle 0|) = 1/2. Notice: both metrics give the same number here, but that is an arithmetic coincidence for this specific pair, not a general fact.

[Figure: Maximally mixed I/2 (origin) versus pure |0⟩ (north pole) on the Bloch ball. Bloch distance |r_σ − r_ρ| = 1; F = ½, D = ½.]
The maximally mixed state sits at the origin of the Bloch ball; $|0\rangle$ sits at the north pole. Their Bloch-vector separation is $1$ (the full radius of the ball; antipodal pure states reach the maximum of $2$). Fidelity is $1/2$ and trace distance is $1/2$. For single qubits, trace distance has a gorgeous Bloch-ball formula: $D(\rho, \sigma) = \tfrac{1}{2}|\vec r_\rho - \vec r_\sigma|$, just half the Euclidean distance between the Bloch vectors.
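The Bloch-ball formula is easy to confirm numerically. The sketch below builds qubit states from Bloch vectors via \rho = \tfrac{1}{2}(I + \vec r\cdot\vec\sigma) and compares the matrix definition of D with half the Euclidean distance; the function names are assumptions made for this example.

```python
import numpy as np

# Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def state_from_bloch(r):
    """rho = (I + r.sigma)/2 for a Bloch vector r with |r| <= 1."""
    return 0.5 * (np.eye(2) + r[0] * X + r[1] * Y + r[2] * Z)

def trace_distance(rho, sigma):
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

r_mixed = np.array([0.0, 0.0, 0.0])   # I/2 at the origin
r_zero = np.array([0.0, 0.0, 1.0])    # |0><0| at the north pole

d_matrix = trace_distance(state_from_bloch(r_mixed), state_from_bloch(r_zero))
d_bloch = 0.5 * np.linalg.norm(r_mixed - r_zero)
print(d_matrix, d_bloch)   # both should equal the D = 1/2 found in this example
```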

What this shows. Fidelity and trace distance measure different things, and for mixed states they do not determine each other. The Fuchs-van de Graaf inequalities give a range, not a formula. When you report an experimental fidelity of 0.99, the implied trace distance is anywhere in [1 - \sqrt{0.99}, \sqrt{0.01}] = [0.005, 0.1] — a 20\times range. Tight bounds require tight metrics, and which tight metric matters depends on what question you're asking.

Applications

The reason these metrics are worth the algebra is that they anchor the most important practical calculations in quantum computing.

At TIFR and IIT Madras, experimental quantum computing groups validating NMR, trapped-ion, and superconducting-qubit platforms quote fidelities against target Bell states, GHZ states, and prepared magic states as the standard benchmark. The published fidelities on the most advanced Indian platforms as of 2025 sit around 0.95-0.99 for two-qubit entangled states, which bounds the trace distance from the target by \sqrt{1 - F} \approx 0.10-0.22 via the Fuchs-van de Graaf upper bound. That is above fault-tolerant threshold for some codes and below it for others: the same number tells both stories, depending on which metric you translate it into.

Common confusions

Going deeper

If you are here for the definitions of fidelity and trace distance, the pure-state special cases, the two operational meanings (Uhlmann for F, Helstrom for D), and the Fuchs-van de Graaf inequalities, you have the package. The rest of this section digs into the proof of Uhlmann's theorem, the diamond norm for channels, the Bures metric as an infinitesimal fidelity, and how these metrics plug into tomography and security proofs.

Uhlmann's theorem — proof sketch

The strategy uses the polar decomposition. Fix any purifications |\psi_\rho\rangle, |\psi_\sigma\rangle \in \mathcal H \otimes \mathcal H' of \rho, \sigma (constructed via the spectral-decomposition recipe of the purification chapter). Any other purifications differ by an ancilla unitary: (I \otimes U)|\psi_\rho\rangle and (I \otimes V)|\psi_\sigma\rangle for unitaries U, V on \mathcal H'. The overlap is

\langle\psi_\rho|(I \otimes U^\dagger V)|\psi_\sigma\rangle.

Choose fixed purifications |\psi_\rho\rangle = \sum_i \sqrt{p_i}\,|u_i\rangle|e_i\rangle and |\psi_\sigma\rangle = \sum_j \sqrt{q_j}\,|v_j\rangle|e_j\rangle using the same ancilla basis \{|e_i\rangle\} and the spectral decompositions \rho = \sum_i p_i|u_i\rangle\langle u_i|, \sigma = \sum_j q_j|v_j\rangle\langle v_j|. Then the overlap becomes \text{tr}(A W) for a specific operator A = \sqrt\rho\sqrt\sigma (more precisely, A_{ij} = \sqrt{p_i q_j}\,\langle u_i|v_j\rangle in the ancilla basis) and a unitary W = U^\dagger V. Maximising |\text{tr}(AW)| over unitaries W is a classical optimisation: the maximum is \|A\|_1 = \text{tr}|A|, achieved when W is the unitary part of the polar decomposition of A^\dagger. And \text{tr}|A| = \text{tr}\sqrt{A^\dagger A} = \text{tr}\sqrt{\sqrt\sigma\rho\sqrt\sigma}, which (by the symmetry of the square-root trace) equals \text{tr}\sqrt{\sqrt\rho\sigma\sqrt\rho} = \sqrt{F(\rho,\sigma)}. Squaring gives the fidelity. Full details in Nielsen and Chuang §9.2 or Preskill Ch.5.
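The optimisation step in this sketch can be illustrated numerically. Under the assumption A = \sqrt\rho\sqrt\sigma from the text, the code below computes \text{tr}|A| as the sum of singular values, builds the maximising unitary from the singular value decomposition, and checks that random unitaries do no better. The random states, the seed, and the helper names are choices made for this example.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def random_state(dim, rng):
    """Random density matrix G G^dagger / tr(G G^dagger); an assumed ensemble."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
rho, sigma = random_state(3, rng), random_state(3, rng)

a = psd_sqrt(rho) @ psd_sqrt(sigma)        # A = sqrt(rho) sqrt(sigma)
u, s, vh = np.linalg.svd(a)                # A = u @ diag(s) @ vh
sqrt_f = float(np.sum(s))                  # ||A||_1 = tr|A|

w_opt = vh.conj().T @ u.conj().T           # unitary factor of the polar decomposition of A^dagger
best = abs(np.trace(a @ w_opt))            # equals tr|A|

# Random unitaries (Q factor of a complex Gaussian matrix) should not beat the optimum.
others = []
for _ in range(100):
    q = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]
    others.append(abs(np.trace(a @ q)))

# Direct evaluation of tr sqrt(sqrt(rho) sigma sqrt(rho)) for comparison.
sr = psd_sqrt(rho)
inner = sr @ sigma @ sr
direct = float(np.trace(psd_sqrt((inner + inner.conj().T) / 2)).real)
print(sqrt_f, best, max(others), direct)
```

The first, second, and fourth printed values should agree, and the third should not exceed them.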

Beyond fidelity and trace distance — the diamond norm

The two metrics above measure distance between states. There is also a metric for distance between channels, the diamond norm:

\|\mathcal E_1 - \mathcal E_2\|_{\diamond} \;=\; \sup_{\rho} \|(\mathcal E_1 \otimes \text{id})(\rho) - (\mathcal E_2 \otimes \text{id})(\rho)\|_1,

where the supremum is over all \rho on \mathcal H \otimes \mathcal H (the system plus a reference of the same dimension). The diamond norm is the operational channel-distinguishability metric: it bounds how well any strategy, including those using entanglement with a reference, can distinguish two channels with a single use. For channels that discard their input and prepare fixed states \rho and \sigma, it reduces to \|\rho - \sigma\|_1 = 2D(\rho, \sigma), and it is the natural language for channel-level error-correction thresholds.

The Bures metric — an infinitesimal fidelity

The Bures distance is built directly from fidelity:

d_B(\rho, \sigma) \;=\; \sqrt{2\bigl(1 - \sqrt{F(\rho,\sigma)}\bigr)}.

Unlike fidelity itself (which is 1, not 0, for identical states and so is not a distance), the Bures distance vanishes iff \rho = \sigma and satisfies the triangle inequality. Unlike trace distance (which comes from a norm), its infinitesimal form, the Bures metric, defines a genuine Riemannian structure on the space of density operators. In this metric, \rho and \sigma are close iff F is close to 1. The Bures metric is used in quantum metrology, the theory of how precisely a parameter can be estimated from quantum measurements, because its metric tensor is, up to a factor of 4, the quantum Fisher information, whose inverse bounds parameter-estimation precision via the quantum Cramér-Rao inequality.
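The distance formula above can be evaluated with the same fidelity routine used for states. The sketch below computes d_B for three simple qubit states and checks the triangle inequality on that one triple; the function names and the choice of states are illustrative assumptions.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(rho, sigma):
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = (inner + inner.conj().T) / 2
    return float(np.real(np.trace(psd_sqrt(inner))) ** 2)

def bures_distance(rho, sigma):
    """d_B = sqrt(2 (1 - sqrt(F)))."""
    f = min(max(fidelity(rho, sigma), 0.0), 1.0)   # guard against round-off outside [0, 1]
    return float(np.sqrt(2.0 * (1.0 - np.sqrt(f))))

rho1 = np.diag([1.0, 0.0]).astype(complex)   # |0><0|
rho2 = np.eye(2, dtype=complex) / 2          # maximally mixed
rho3 = np.diag([0.0, 1.0]).astype(complex)   # |1><1|

d12 = bures_distance(rho1, rho2)
d23 = bures_distance(rho2, rho3)
d13 = bures_distance(rho1, rho3)
print(d12, d23, d13, d13 <= d12 + d23)
```

For orthogonal states F = 0, so d13 takes the largest value the formula allows, \sqrt 2.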

Quantum tomography — reading a state

Quantum state tomography is the experimental procedure of reconstructing an unknown density operator \rho from repeated measurements. You measure expectation values of a complete set of observables (for a qubit, \langle\sigma_x\rangle, \langle\sigma_y\rangle, \langle\sigma_z\rangle; for n qubits, 4^n - 1 Pauli strings) and solve for \rho. The reconstructed \hat\rho is an estimate — it has statistical error from finite samples, and it may even fail to be a valid density operator (positive semi-definite) if the statistics are noisy. Post-processing with maximum-likelihood estimation returns a valid \hat\rho close to the raw estimate. Fidelity against a target state \rho_{\text{target}} is then the one-number benchmark. In Indian NMR quantum computing (IIT Madras, TIFR Mumbai), tomographic reconstruction of deviation density matrices has been the daily experimental currency for two decades — each new algorithm's output is reported as a fidelity against the ideal output state.
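For a single qubit the reconstruction step can be sketched as plain linear inversion from the three Pauli expectation values. The measured values below are hypothetical, and rescaling the Bloch vector back into the ball is only a crude stand-in for the maximum-likelihood post-processing mentioned above, not an implementation of it.

```python
import numpy as np

# Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def linear_inversion(ex, ey, ez):
    """Reconstruct a qubit state from Pauli expectation values: rho = (I + r.sigma)/2."""
    r = np.array([ex, ey, ez], dtype=float)
    norm = np.linalg.norm(r)
    if norm > 1.0:
        # Noisy data can land outside the Bloch ball, giving a non-physical estimate.
        # Rescaling is a simplification used here in place of maximum likelihood.
        r = r / norm
    return 0.5 * (np.eye(2) + r[0] * X + r[1] * Y + r[2] * Z)

# Hypothetical measured expectation values for a state meant to be |0>.
rho_hat = linear_inversion(0.05, -0.02, 0.96)

ket0 = np.array([1, 0], dtype=complex)
f = float(np.real(ket0.conj() @ rho_hat @ ket0))   # pure target: F = <0|rho_hat|0>
print(f)
```

With a pure target the benchmark reduces to (1 + \langle\sigma_z\rangle)/2 here, which is why the double-square-root formula is not needed.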

Monotonicity as a structural theorem

The fact that both F and D are monotone under CPTP channels (F cannot decrease, D cannot increase) is called the data-processing inequality. It is the quantum analogue of the classical fact that processing a random variable cannot increase its distinguishability from another random variable. The proof goes via purification for F (channels correspond to unitaries on a larger space, so fidelity among purifications is preserved), and via the convex structure of trace distance for D. Data processing is the single deepest property both metrics share, and it is why one-shot quantum protocols can be reasoned about using a handful of state distances rather than a blizzard of measurement statistics.
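Monotonicity can be seen in a small numerical experiment. The sketch below applies a depolarizing channel, chosen only as a convenient example of a CPTP map, to two assumed qubit states and compares F and D before and after; the noise strength 0.3 and the function names are illustrative.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(rho, sigma):
    s = psd_sqrt(rho)
    inner = s @ sigma @ s
    inner = (inner + inner.conj().T) / 2
    return float(np.real(np.trace(psd_sqrt(inner))) ** 2)

def trace_distance(rho, sigma):
    return 0.5 * float(np.sum(np.abs(np.linalg.eigvalsh(rho - sigma))))

def depolarize(rho, p):
    """Depolarizing channel on a unit-trace state: keep rho with weight 1 - p, mix in I/d with weight p."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

# Illustrative qubit states (assumed for the example).
rho = np.array([[0.9, 0.1], [0.1, 0.1]], dtype=complex)
sigma = np.array([[0.2, 0.0], [0.0, 0.8]], dtype=complex)

f_before, d_before = fidelity(rho, sigma), trace_distance(rho, sigma)
rho_out, sigma_out = depolarize(rho, 0.3), depolarize(sigma, 0.3)
f_after, d_after = fidelity(rho_out, sigma_out), trace_distance(rho_out, sigma_out)
print(f_before, f_after)   # fidelity should not decrease
print(d_before, d_after)   # trace distance should not increase
```

For this particular channel the difference of the outputs is (1 − p)(\rho − \sigma), so the trace distance shrinks by exactly the factor 1 − p.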

Indian context — NMR quantum computing and fidelity benchmarks

The Indian NMR quantum computing programme at TIFR and IIT Madras (Anil Kumar, T. S. Mahesh, and collaborators) spent the 2000s developing tomographic reconstruction techniques for deviation density matrices on liquid-state NMR qubits. Each published implementation of Deutsch-Jozsa, Grover, or Shor on a small NMR processor is evaluated by computing the fidelity of the experimental output state against the theoretical ideal. Fidelities around 0.95-0.99 on 3-7 qubit NMR experiments established these techniques as the standard benchmark long before superconducting and trapped-ion platforms caught up. The metric's use in the NISQ era on those newer platforms is continuous with this earlier Indian work.

Where this leads next

References

  1. Wikipedia, Fidelity of quantum states — definition, pure-state limit, Uhlmann's theorem.
  2. Wikipedia, Trace distance — definition, operational meaning, Helstrom bound.
  3. Nielsen and Chuang, Quantum Computation and Quantum Information (2010), §9.2 (distance measures for quantum information) — Cambridge University Press.
  4. John Preskill, Lecture Notes on Quantum Computation, Ch. 3 and Ch. 5 — theory.caltech.edu/~preskill/ph229.
  5. Armin Uhlmann, The "transition probability" in the state space of a ⋆-algebra (1976), Reports on Mathematical Physics — DOI:10.1016/0034-4877(76)90060-4.
  6. John Watrous, The Theory of Quantum Information (2018), Ch. 3 (similarity and distance among states and channels) — cs.uwaterloo.ca/~watrous/TQI.