Fredkin (CSWAP) — padho-wiki

Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.

In short

The Fredkin gate — also called CSWAP, for controlled-SWAP — is the three-qubit gate that exchanges the two target qubits if and only if the control qubit is |1\rangle. If the control is |0\rangle, the targets pass through untouched. Its 8 \times 8 matrix is block-diagonal: an identity on the four basis states where the control is 0, and a SWAP on the four basis states where the control is 1. Like Toffoli, it is classically universal for reversible computation — every Boolean function can be written as a circuit of Fredkins and ancilla bits. Its most famous quantum use is the SWAP test: a compact circuit that estimates the overlap |\langle \psi | \phi \rangle|^2 between two unknown states without measuring either of them. On real hardware, Fredkin is expensive — a standard decomposition uses one Toffoli plus two CNOTs, so roughly eight CNOTs and a handful of single-qubit gates.

The last chapter met Toffoli, the three-qubit gate that is universal for classical reversible computation. Fredkin is Toffoli's sibling: a different three-qubit gate, same universality status, same self-inverse structure, but a completely different geometric personality. Instead of flipping a target bit conditionally, Fredkin exchanges two target bits conditionally. And because "exchange two things if a flag is set" is a surprisingly versatile primitive, Fredkin shows up in places Toffoli does not — most famously in the SWAP test, a little two-page miracle that lets a quantum computer estimate how similar two unknown quantum states are, without ever measuring them directly.

This chapter is the story of Fredkin. What it does to the eight three-qubit basis states, why its 8 \times 8 matrix splits so cleanly into two blocks, how it relates to Toffoli (cheap interconversion, though with a non-zero gate cost), and how the SWAP test builds on it to give you a probability readout of state overlap. By the end you should be able to sketch the CSWAP circuit symbol, write the matrix down from memory, and walk a state through the SWAP test step by step to see the overlap pop out.

What Fredkin does

The rule, in one sentence: Fredkin swaps the two target qubits if and only if the control qubit is |1\rangle.

The three qubits have named roles. One is the control — call it qubit 1. The other two are the targets — qubits 2 and 3. The gate looks at the control. If the control reads 0, qubits 2 and 3 are left exactly as they were. If the control reads 1, qubits 2 and 3 trade places — whatever state was on qubit 2 is now on qubit 3, and vice versa.

Write the input as |a\,b\,c\rangle where a is the control and b, c are the two target bits. The defining formula is:

\text{Fredkin}\,|a\,b\,c\rangle \;=\; \begin{cases} |a\,b\,c\rangle & \text{if } a = 0 \\ |a\,c\,b\rangle & \text{if } a = 1 \end{cases}

On the eight computational-basis states of three qubits the action is tabulated below.

Input \|abc\rangle	Control a	Output
\|000\rangle	0	\|000\rangle
\|001\rangle	0	\|001\rangle
\|010\rangle	0	\|010\rangle
\|011\rangle	0	\|011\rangle
\|100\rangle	1	\|100\rangle
\|101\rangle	1	\|110\rangle
\|110\rangle	1	\|101\rangle
\|111\rangle	1	\|111\rangle

Why only two rows move: the swap acts on the last two bits. Of the four basis states with control = 1, two of them (|100\rangle and |111\rangle) already have equal target bits, so swapping does nothing. Only |101\rangle and |110\rangle have differing target bits, and those two trade places.

The compressed view is exactly the SWAP story of the previous chapters, with an extra control layer on top: Fredkin = (SWAP on targets) · (control gates it all).

The 8-row truth table of Fredkin. The four control-0 states are fixed. Of the four control-1 states, two are fixed (targets already equal) and two trade places.

The circuit symbol

Fredkin's circuit notation is a direct fusion of the SWAP symbol (two ×'s connected by a vertical line) and the controlled-gate dot. A filled dot sits on the control wire; below it, the two target wires each carry an ×, and a single vertical line runs from the dot through both targets.

The Fredkin symbol: filled dot on the control, two `×` symbols on the two targets, all tied by a single vertical line. When the control reads 1, the two `×`'s fire and the targets exchange.

The matrix

\text{Fredkin} \;=\; \begin{pmatrix} I_4 & 0 \\ 0 & \text{SWAP} \end{pmatrix} \;=\; \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}

Reading the matrix. Four 1's on the diagonal (rows 1–4) come from the control-0 half: those basis states are left alone. The bottom-right block has 1's at rows 5, 8 on its own diagonal (these are |100\rangle and |111\rangle, the two control-1 states with equal target bits), and two off-diagonal 1's at rows 6–7 — that is the SWAP that exchanges |101\rangle and |110\rangle. Every other entry is zero.

Why the matrix is pure permutation (every entry is 0 or 1): Fredkin, like SWAP and Toffoli, is a reversible Boolean function. It maps each basis state to exactly one other basis state. Permutations have exactly one 1 per row and per column and zeros everywhere else.

Fredkin's $8 \times 8$ matrix: an identity $I_4$ on the four control-0 states, and a SWAP block on the four control-1 states. The two off-diagonal blocks are zero.

Fredkin is its own inverse

Look at the block structure. The top-left identity is self-inverse. The bottom-right SWAP squares to the identity (\text{SWAP}^2 = I). So \text{Fredkin}^2 = I — two Fredkins in a row give you the identity. This matches the picture: swapping the targets twice (whenever the control says to) leaves them where they started, and the control was never touched. Fredkin is Hermitian, unitary, and self-adjoint — all the things you expect a clean reversible gate to be.

Why self-inverse gates matter: when you want to uncompute a Fredkin in a larger circuit (say, because you need to restore an ancilla to its initial state), you just apply Fredkin again. No dagger needed, no separate hardware calibration. This makes Fredkin convenient as a building block.

Why Fredkin is classically universal

Classically, a Boolean function takes some input bits and produces some output bits. Toffoli is universal for reversible Boolean functions — you can build any of them from Toffolis and ancilla bits. Fredkin shares this property. Every reversible Boolean function can be expressed as a circuit of Fredkin gates (plus ancilla bits). This is a theorem, proved by Fredkin and Toffoli in 1982 in the same paper that introduced both gates.

The cleanest way to see it is to show that Fredkin can simulate AND, OR, NOT, and XOR — the four universal classical operations. Because Fredkin is conditional, all the tricks involve seeding the targets with specific ancilla values and reading off what ends up where.

Fredkin as AND

Feed the three qubits |a, b, 0\rangle — control = a, target 1 = b, target 2 = 0. Apply Fredkin. If a = 0, nothing happens, and the output is |0, b, 0\rangle. If a = 1, the targets swap, and the output is |1, 0, b\rangle. Now look at just the third qubit in both cases:

a = 0: third qubit is 0.
a = 1: third qubit is b.

So the third qubit carries the value a \wedge b — the AND of the two input bits. The AND has been computed into the third qubit, reversibly, in one Fredkin.

Why this works: AND is "b if a else 0." Fredkin with the second target fixed at 0 implements exactly that — when a = 1, the b flows into the third slot; when a = 0, the third slot stays 0.

Fredkin as OR, NOT, XOR

OR: feed |a, 1, b\rangle. If a = 0, the middle stays 1, bottom stays b. If a = 1, the middle becomes b and the bottom becomes 1. Look at the middle qubit: it's 1 when a = 0 (in which case the OR is simply b — wait, actually \text{OR}(a=0, b) = b, not 1)... a cleaner route to OR uses De Morgan: a \vee b = \neg(\neg a \wedge \neg b), which needs one Fredkin and a couple of NOTs.
NOT: feed |a, 0, 1\rangle. If a = 0, output is |0, 0, 1\rangle. If a = 1, output is |1, 1, 0\rangle. The second qubit now holds a, and the third holds \neg a — so a Fredkin with targets 0 and 1 produces both a and its complement simultaneously, a useful quantum-computing primitive called fanout of the classical value.
XOR: XOR is not the Fredkin's natural fit — it is a Toffoli's natural fit. You can build it by composing Fredkins with ancilla management, but the Toffoli recipe is shorter.

The punchline

Fredkin can express any reversible Boolean function, which (with ancilla) gives you any Boolean function at all. That is what "classically universal for reversible computation" means. It is the same status Toffoli has — and the choice between them is usually dictated by hardware or by the algorithmic structure at hand.

Fredkin ↔ Toffoli interconversion

You can build one from the other at a modest cost. The standard recipe is:

\text{Fredkin}(a; b, c) \;=\; \text{CNOT}(c, b) \cdot \text{Toffoli}(a, b; c) \cdot \text{CNOT}(c, b)

Read left-to-right as matrix multiplication (right-to-left as time order, so you apply the first CNOT first, then the Toffoli, then the second CNOT). The CNOTs on the two targets convert a swap into a flip-and-unflip pattern that Toffoli can handle.

A three-gate decomposition of Fredkin: one CNOT on the two targets, one Toffoli with both the control and one target as controls, one more CNOT on the two targets.

Why this works — walk |1, 1, 0\rangle through. Start with the control = 1, targets = 1 and 0. After CNOT(c \to b) the middle bit becomes 1 \oplus 0 = 1 — unchanged for this input (the target is the middle wire). So state is |1, 1, 0\rangle. Now apply Toffoli(a, b; c): controls are both 1, target c flips from 0 to 1. State is |1, 1, 1\rangle. Finally CNOT(c \to b): bottom bit is 1, so middle bit flips from 1 to 0. Final state: |1, 0, 1\rangle. This matches Fredkin's table for input |1, 1, 0\rangle — targets 1 and 0 swap to 0 and 1, giving |1, 0, 1\rangle.

The total cost of one Fredkin is then one Toffoli + two CNOTs = roughly 6 CNOTs (for the Toffoli, from the Barenco decomposition) + 2 more CNOTs = 8 CNOTs + 8 single-qubit gates on a CNOT-native chip. That is slightly more expensive than a Toffoli (6 CNOTs + 8 single-qubit gates). On hardware where Toffoli is the primitive, Fredkin costs three gates; on hardware where CNOT is the primitive, Fredkin costs eight.

The SWAP test — comparing two quantum states

Fredkin's most celebrated quantum application is the SWAP test: a compact circuit that estimates the overlap |\langle \psi | \phi \rangle|^2 between two unknown quantum states, using one Fredkin, two Hadamards, and one measurement. It is the rare quantum subroutine that is short enough to carry around in your head, yet powerful enough to be the workhorse of several quantum-machine-learning and state-comparison protocols.

The setup: you have three qubits. One is an ancilla prepared in |0\rangle. The other two carry the two states you want to compare — call them |\psi\rangle and |\phi\rangle. You do not know what |\psi\rangle and |\phi\rangle are; you just have one copy of each.

The SWAP test: a Hadamard on the ancilla, a controlled-SWAP of $|\psi\rangle$ and $|\phi\rangle$, another Hadamard on the ancilla, and a measurement. The probability of reading 0 encodes the overlap.

The circuit, step by step

Step 1: Hadamard the ancilla. The ancilla starts in |0\rangle. Hadamard sends it to |+\rangle = \tfrac{1}{\sqrt 2}(|0\rangle + |1\rangle). The joint state is now

\tfrac{1}{\sqrt 2}\bigl(|0\rangle + |1\rangle\bigr) \otimes |\psi\rangle \otimes |\phi\rangle \;=\; \tfrac{1}{\sqrt 2}\bigl(|0\rangle|\psi\rangle|\phi\rangle + |1\rangle|\psi\rangle|\phi\rangle\bigr).

Step 2: apply Fredkin with the ancilla as control. Fredkin swaps the second and third qubits only when the ancilla is 1. So the |0\rangle branch is unchanged, and the |1\rangle branch has |\psi\rangle and |\phi\rangle swapped:

\tfrac{1}{\sqrt 2}\bigl(|0\rangle|\psi\rangle|\phi\rangle + |1\rangle|\phi\rangle|\psi\rangle\bigr).

Why the swap lands inside the superposition: the Fredkin gate is linear, so it acts separately on each branch of the ancilla's superposition. The ancilla being |0\rangle in one branch and |1\rangle in the other means the swap fires in exactly one of the two branches.

Step 3: Hadamard the ancilla again. The second Hadamard interferes the two branches. Using H|0\rangle = \tfrac{1}{\sqrt 2}(|0\rangle + |1\rangle) and H|1\rangle = \tfrac{1}{\sqrt 2}(|0\rangle - |1\rangle):

\tfrac{1}{2}\bigl(|0\rangle + |1\rangle\bigr)|\psi\rangle|\phi\rangle + \tfrac{1}{2}\bigl(|0\rangle - |1\rangle\bigr)|\phi\rangle|\psi\rangle

= \;\tfrac{1}{2}\,|0\rangle\bigl(|\psi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle\bigr) + \tfrac{1}{2}\,|1\rangle\bigl(|\psi\rangle|\phi\rangle - |\phi\rangle|\psi\rangle\bigr).

Step 4: measure the ancilla. The probability of getting 0 is the squared norm of the |0\rangle coefficient:

P(0) \;=\; \tfrac{1}{4}\bigl\|\,|\psi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle\,\bigr\|^2.

Expand the squared norm using \|v\|^2 = \langle v | v \rangle:

Why four cross terms: (A + B)^\dagger(A + B) = A^\dagger A + A^\dagger B + B^\dagger A + B^\dagger B. Each of those four inner products in our case factors into a product of single-qubit inner products on the two copies.

Each |\psi\rangle and |\phi\rangle is normalised, so \langle\psi|\psi\rangle = \langle\phi|\phi\rangle = 1. And \langle\psi|\phi\rangle\langle\phi|\psi\rangle = |\langle\psi|\phi\rangle|^2 (for complex z, z \cdot \bar z = |z|^2, and \langle\phi|\psi\rangle = \overline{\langle\psi|\phi\rangle}). So the bracket collapses to

\|\cdot\|^2 \;=\; 1 + 2|\langle\psi|\phi\rangle|^2 + 1 \;=\; 2 + 2|\langle\psi|\phi\rangle|^2.

Divide by 4:

\boxed{\;P(0) \;=\; \tfrac{1}{2} + \tfrac{1}{2}|\langle\psi|\phi\rangle|^2.\;}

Reading the result. When |\psi\rangle = |\phi\rangle, the overlap is |\langle\psi|\phi\rangle|^2 = 1 and P(0) = 1 — the ancilla measurement always returns 0. When |\psi\rangle \perp |\phi\rangle (the two states are orthogonal), the overlap is 0 and P(0) = 1/2 — the measurement is a 50/50 coin flip. For intermediate overlaps, P(0) interpolates smoothly between 1/2 and 1. So if you run the circuit many times and measure a frequency of 0's, you can back out |\langle\psi|\phi\rangle|^2 via \hat O = 2 P(0) - 1.

Why this is remarkable

You have just learned a specific functional of two quantum states — their overlap — without ever measuring either state directly. Measuring |\psi\rangle in some basis would collapse it and destroy the phase information. Measuring |\phi\rangle likewise. The SWAP test bypasses both collapses: the ancilla is what you measure, and the two states enter, get shuffled by the controlled-SWAP, and exit with their identities still coherent. The cost is that you need many shots (the result is probabilistic) and the two states get consumed (at least once per shot).

Full quantum-state tomography, by contrast, requires exponentially many measurements in the qubit count — you would have to reconstruct every amplitude of |\psi\rangle and |\phi\rangle, then compute their overlap classically. The SWAP test trades generality (you only get the overlap, not the full states) for efficiency (a single scalar, from a constant-depth circuit).

Example 1 — verify CSWAP on basis states

Check two specific inputs by hand to build trust in the gate.

Setup. Take two computational-basis inputs: |1\rangle|01\rangle (control = 1, targets = 0, 1) and |0\rangle|01\rangle (control = 0, same targets).

Why this matches the table: input |101\rangle has control a = 1 and target bits b = 0, c = 1. Fredkin sends this to |acb\rangle = |110\rangle, which is what we got.

Step 2 — apply CSWAP to |0\rangle|01\rangle. The control is 0, so nothing happens. The output is |001\rangle, same as the input.

Result. The two basis states pass the sanity check: swap fires when the control is 1, does nothing when the control is 0. Both match the formula \text{Fredkin}|abc\rangle = |a\rangle|c^a b^a\rangle where the superscript means "swap if a=1."

Left: control = 1 fires the swap, so $|101\rangle$ becomes $|110\rangle$. Right: control = 0 leaves the targets alone, so $|001\rangle$ passes through.

Example 2 — SWAP test on $|+\rangle$ and $|-\rangle$

The two states |+\rangle = \tfrac{1}{\sqrt 2}(|0\rangle + |1\rangle) and |-\rangle = \tfrac{1}{\sqrt 2}(|0\rangle - |1\rangle) are orthogonal — they differ only in the sign on their |1\rangle component. The SWAP test should read that orthogonality.

Step 1 — compute the overlap directly.

\langle +|-\rangle \;=\; \tfrac{1}{\sqrt 2}(\langle 0| + \langle 1|) \cdot \tfrac{1}{\sqrt 2}(|0\rangle - |1\rangle) \;=\; \tfrac{1}{2}(\langle 0|0\rangle - \langle 0|1\rangle + \langle 1|0\rangle - \langle 1|1\rangle) \;=\; \tfrac{1}{2}(1 - 0 + 0 - 1) \;=\; 0.

The overlap is 0, so |\langle +|-\rangle|^2 = 0.

Why the cross terms drop: \langle 0|1\rangle = \langle 1|0\rangle = 0 by orthogonality of the computational basis. Only the diagonal terms \langle 0|0\rangle = 1 and \langle 1|1\rangle = 1 survive.

Step 2 — plug into the SWAP-test formula.

P(0) \;=\; \tfrac{1}{2} + \tfrac{1}{2}\cdot 0 \;=\; \tfrac{1}{2}.

Step 3 — interpret. The ancilla comes out 0 half the time and 1 half the time — a fair coin flip. If you run the circuit 1000 times, you expect about 500 zeros and 500 ones, plus or minus the usual \sqrt{1000} \approx 32 statistical fluctuation.

Result. A 50/50 ancilla distribution is the signature of orthogonal states. If the SWAP test returned, say, 70% zeros and 30% ones, the overlap would be 2 \cdot 0.7 - 1 = 0.4, meaning |\langle \psi | \phi \rangle|^2 = 0.4 — a moderately similar pair. If the test returned 100% zeros, the two states are identical up to a global phase. The full range of P(0) is [\tfrac{1}{2}, 1], with \tfrac{1}{2} meaning "totally different" and 1 meaning "identical."

For orthogonal input states, the SWAP test gives a 50/50 ancilla distribution — no information about which state was which, only the overlap (zero, in this case).

Common confusions

"CSWAP and SWAP are the same gate." They are not. SWAP is a two-qubit gate that always exchanges its two inputs. CSWAP is a three-qubit gate that exchanges its two targets only if a third control qubit is in |1\rangle. You can obtain SWAP from CSWAP by pinning the control to |1\rangle — but that is a specific choice, not a general equivalence.
"The SWAP test tells you which state is which." No. It tells you a single scalar, |\langle \psi | \phi \rangle|^2, which is the magnitude-squared of their overlap. It does not tell you what either state is, nor does it tell you the relative phase. Two pairs (|\psi\rangle, |\phi\rangle) with the same overlap magnitude are indistinguishable by the SWAP test alone.
"One SWAP test gives you the answer." No — the output is probabilistic. Each shot gives you one classical bit from the ancilla. To estimate P(0) to precision \epsilon, you need roughly 1/\epsilon^2 shots. For \epsilon = 0.01 you need 10,000 shots; for \epsilon = 0.001 you need a million. This is the cost of learning a continuous quantity from discrete measurements.
"The SWAP test works on classical bits." Not as written. If |\psi\rangle and |\phi\rangle are classical bit-strings (computational-basis states), the SWAP test reduces to a bit-equality check that any classical XOR circuit can do in one gate. The point of the SWAP test is that it works on superposition states, where the inner product is a non-trivial complex number and no classical comparison makes sense.
"Fredkin is the same as Toffoli because they are both universal." Both are universal for reversible Boolean computation, but they are different gates. Toffoli is a controlled-controlled-NOT; Fredkin is a controlled-SWAP. Each can simulate the other at a cost of a few extra gates, but the matrices are distinct, the circuit symbols are distinct, and the natural applications are distinct (Toffoli for AND, Fredkin for conditional routing).
"Fredkin is cheap because it is one gate in the diagram." On real hardware, Fredkin is not a primitive. It decomposes into about 8 CNOTs + 8 single-qubit gates. A circuit drawn with 5 Fredkins implements approximately 40 CNOTs in compiled form — a hefty cost on NISQ hardware where two-qubit fidelities are around 99%.

Going deeper

You now know what Fredkin does, how to build it from Toffoli and CNOTs, and how the SWAP test extracts the overlap |\langle\psi|\phi\rangle|^2 from two unknown states. The remaining sections cover the rigorous derivation of the SWAP-test formula, its generalisation to distinguishability measures, a preview of quantum fingerprinting, the Fredkin-Toffoli equivalence proof, and a note on hardware cost.

The SWAP-test derivation — rigorous bookkeeping

The bracket expansion in the main text used a short-hand; here is the careful version. After the controlled-SWAP the joint state of (ancilla, register A, register B) is

|\Psi\rangle \;=\; \tfrac{1}{\sqrt 2}\bigl(|0\rangle|\psi\rangle|\phi\rangle + |1\rangle|\phi\rangle|\psi\rangle\bigr).

Apply the second Hadamard only to the ancilla. In terms of the identity operators on the two registers,

(H \otimes I \otimes I)|\Psi\rangle \;=\; \tfrac{1}{2}|0\rangle\bigl(|\psi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle\bigr) + \tfrac{1}{2}|1\rangle\bigl(|\psi\rangle|\phi\rangle - |\phi\rangle|\psi\rangle\bigr).

The probability of the ancilla reading 0 is

P(0) \;=\; \bigl\|\tfrac{1}{2}\bigl(|\psi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle\bigr)\bigr\|^2 \;=\; \tfrac{1}{4}\bigl(\langle\psi|\langle\phi| + \langle\phi|\langle\psi|\bigr)\bigl(|\psi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle\bigr).

Expanding the bracket gives four terms. Two of them simplify by normalisation: \langle\psi|\psi\rangle\langle\phi|\phi\rangle = 1 and \langle\phi|\phi\rangle\langle\psi|\psi\rangle = 1. The other two combine to \langle\psi|\phi\rangle\langle\phi|\psi\rangle + \langle\phi|\psi\rangle\langle\psi|\phi\rangle = 2|\langle\psi|\phi\rangle|^2 using \langle\phi|\psi\rangle = \overline{\langle\psi|\phi\rangle} and z \bar z = |z|^2. Dividing by 4: P(0) = \tfrac{1}{2} + \tfrac{1}{2}|\langle\psi|\phi\rangle|^2.

The fidelity F(\psi, \phi) := |\langle\psi|\phi\rangle| is a standard distance measure between quantum states (actually, \sqrt{F} is). The SWAP test extracts F^2, which for pure states uniquely determines F.

Quantum fingerprinting

Buhrman, Cleve, Watrous, and de Wolf (2001) extended the SWAP test into a communication-complexity protocol called quantum fingerprinting [3]. The setup: Alice has an n-bit string x, Bob has an n-bit string y, and they want to decide whether x = y using as little communication as possible. Classical protocols require \Theta(n) bits of communication in the worst case. But if Alice and Bob each encode their string into a quantum state |\psi_x\rangle (called a fingerprint) of O(\log n) qubits, and send these to a referee, the referee can run the SWAP test on |\psi_x\rangle and |\psi_y\rangle and decide equality with high probability — using O(\log n) qubits total. That is an exponential saving in communication, powered entirely by the SWAP test.

The encoding is cleverer than simple computational-basis embedding — it uses error-correcting codes so that x \neq y guarantees |\langle\psi_x|\psi_y\rangle|^2 \leq \tfrac{1}{2} (say), which the SWAP test can reliably distinguish from 1.

Distinguishability and trace distance

The SWAP-test output |\langle\psi|\phi\rangle|^2 is related to the trace distance D(|\psi\rangle, |\phi\rangle) = \tfrac{1}{2}\||\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|\|_1 = \sqrt{1 - |\langle\psi|\phi\rangle|^2}. The trace distance is the operationally meaningful measure of how well you can distinguish two states by a single measurement — and the SWAP test gives you direct access to the quantity that controls it. Run many SWAP tests, estimate |\langle\psi|\phi\rangle|^2, take the complement-square-root, and you have the trace distance.

Fredkin-Toffoli equivalence

Both Fredkin and Toffoli are universal for classical reversible computation, and the precise meaning of "equivalent" is that each can be built from a constant number of the other (with ancilla qubits as needed). The converse of the Fredkin-from-Toffoli construction in the main text goes through the observation that Toffoli(a, b; c) is the same unitary as \text{CSWAP}(a; 0, c \oplus b) \cdot \text{things} — the details are a few lines of algebra that Fredkin and Toffoli worked out in their 1982 paper. The upshot is that no matter which gate your hardware likes, the reversible-logic toolkit stays accessible.

Hardware cost of CSWAP

On today's superconducting chips (Compustar Heron, Querion Willow), Fredkin is not a hardware primitive. The compiler decomposes it: one Toffoli + two CNOTs = 6 + 2 = 8 CNOTs. Each CNOT has a two-qubit error of roughly 10^{-3} on the best current machines, so one Fredkin carries about 8 \times 10^{-3} = 8 \times 10^{-3} \approx 1\% total error — not catastrophic, but not free. In a SWAP test, the Fredkin is the dominant error source, which puts a ceiling on how small an overlap you can reliably distinguish: below |\langle\psi|\phi\rangle|^2 \approx \text{hardware error}, the measured P(0) is dominated by noise. Indian experimental groups at TIFR (Mumbai) and IIT Madras have been exploring SWAP-test demonstrations on small superconducting and photonic platforms as part of the National Quantum Mission — with target applications including quantum-machine-learning similarity kernels and quantum-chemistry state comparisons.

Fredkin in fault-tolerant computation

When you move from noisy hardware to fault-tolerant quantum computation (error-corrected logical qubits), the cost story changes. Clifford gates (CNOT, H, S) are cheap; non-Clifford gates (T and Toffoli) are expensive, because they require magic-state distillation. Fredkin, through its Toffoli-plus-CNOTs decomposition, inherits a T-count of 7 (from the Toffoli's optimal Clifford+T decomposition). So the fault-tolerant Fredkin costs 7 magic T-states plus Clifford overhead — roughly the same as a Toffoli. The equivalence is preserved at both the noisy and the fault-tolerant levels.

Where this leads next

Toffoli gate — the sibling three-qubit universal gate; each can build the other.
SWAP and iSWAP — the uncontrolled two-qubit SWAP, the building block inside Fredkin.
Reversible computation — the Fredkin-Toffoli 1982 paper and the classical theory behind both gates.
SWAP test — the dedicated chapter on the protocol, its error analysis, and its applications in QML and quantum fingerprinting.
Controlled-U gates — the general framework for "apply unitary if control is |1\rangle" of which CSWAP is one instance.
Quantum fingerprinting — the exponential-compression application of the SWAP test.

References

Edward Fredkin and Tommaso Toffoli, Conservative logic (1982) — the original paper introducing the Fredkin gate (as "conservative gate") and proving its universality for reversible Boolean computation. International Journal of Theoretical Physics 21, 219.
Nielsen and Chuang, Quantum Computation and Quantum Information (2010), §1.4.4 and §4.3 (Fredkin gate and the SWAP test) — Cambridge University Press.
Harry Buhrman, Richard Cleve, John Watrous, and Ronald de Wolf, Quantum fingerprinting (2001) — the SWAP-test-based exponential-compression protocol for equality testing. arXiv:quant-ph/0102001.
Wikipedia, Fredkin gate — matrix form, truth table, circuit diagram, and historical notes.
Wikipedia, Swap test — the protocol, its probability derivation, and applications.
John Preskill, Lecture Notes on Quantum Computation, Ch. 6 — theory.caltech.edu/~preskill/ph229.