In short

The S gate is a 90° rotation of the Bloch sphere about the z-axis; the T gate is a 45° rotation about the same axis. Both are diagonal in the computational basis: S = \text{diag}(1, i) and T = \text{diag}(1, e^{i\pi/4}). S leaves |0\rangle alone and multiplies |1\rangle by i; T leaves |0\rangle alone and multiplies |1\rangle by e^{i\pi/4}. S is the square root of Z and T is the fourth root: S^2 = Z, T^4 = Z, T^8 = I. Their inverses are distinct from themselves: S^\dagger = \text{diag}(1, -i), T^\dagger = \text{diag}(1, e^{-i\pi/4}). Together with H and CNOT, S generates the Clifford group — a set of gates that is powerful but classically simulable (Gottesman–Knill theorem). Adding T makes the gate set universal. That is why T is famously "expensive" on a real fault-tolerant quantum computer: it cannot be implemented transversally alongside the Cliffords in most error-correcting codes; it must be injected via magic-state distillation, a protocol that typically consumes hundreds of physical operations per single reliable T.

You have met the Pauli Z gate — the 180° rotation about the z-axis, which multiplies |1\rangle by -1 and leaves |0\rangle alone. A clean, self-inverse gate: Z twice is the identity. But 180° is a coarse move. What if you wanted a smaller rotation about the same axis — a 90° phase flip, or a 45° one?

That is exactly what S and T deliver. S is half of Z: rotate by only 90° about the z-axis and you have multiplied |1\rangle by i instead of by -1. Two S gates back-to-back complete the half-circle and give you Z. T is half of S: rotate 45° about z and you multiply |1\rangle by e^{i\pi/4}. Four Ts give Z, and eight Ts return to the identity.

That makes S the "square root of Z" and T the "fourth root of Z" — or equivalently, "the square root of S." They are the next two gates in the single-qubit zoo after the Paulis and the Hadamard, and they carry a weight disproportionate to their simple matrices. The T gate, in particular, is the single most consequential gate in fault-tolerant quantum computing theory: it is the one gate that escapes the Clifford group and unlocks the full universe of quantum algorithms — at a fault-tolerance cost that drives almost every resource estimate for quantum advantage.

This chapter lays out both gates across all four pictures — matrix, action on basis states, circuit symbol, and Bloch sphere — then explains the Clifford/non-Clifford distinction that makes T the load-bearing gate of the field.

The matrices and what they do

Write them down once:

S \;=\; \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad T \;=\; \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}.

Two diagonal 2\times 2 matrices. The top-left entry is 1 in both cases, so |0\rangle is untouched. The bottom-right entry is a complex phase — i = e^{i\pi/2} for S, and e^{i\pi/4} for T. That phase is what distinguishes them from the identity and from each other.

Why diagonal matrices represent phase-only rotations: a diagonal matrix \text{diag}(a, b) acts on \alpha|0\rangle + \beta|1\rangle by multiplying \alpha by a and \beta by b. If |a| = |b| = 1 (as here — they are phases), the probabilities |\alpha|^2 and |\beta|^2 are unchanged, only the phases shift. No bit-flip happens; this is pure phase manipulation.
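This phase-only behaviour is easy to verify numerically. The sketch below (Python with numpy, which this chapter does not otherwise assume) applies S and T to an arbitrary state and confirms the measurement probabilities never move:

```python
import numpy as np

# The two phase gates in the convention used above:
# S = diag(1, i), T = diag(1, e^{i*pi/4}).
S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])

# An arbitrary normalised state alpha|0> + beta|1>.
psi = np.array([0.6, 0.8j])

probs_before = np.abs(psi) ** 2
probs_after_S = np.abs(S @ psi) ** 2
probs_after_T = np.abs(T @ psi) ** 2

# Pure-phase gates shift phases only; probabilities are untouched.
print(np.allclose(probs_before, probs_after_S))  # True
print(np.allclose(probs_before, probs_after_T))  # True
```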

Figure: the S and T matrices side by side, each labelled with the rotation it performs about the z-axis.
The two phase gates. $S$ adds a $\pi/2$ phase (multiplication by $i$) to the amplitude of $|1\rangle$; $T$ adds a $\pi/4$ phase. Both are identities on $|0\rangle$. Two $S$s give $Z$; four $T$s give $Z$; eight $T$s return to the identity.

The other common names for these gates are the phase gate (S, sometimes denoted P(\pi/2)) and the \pi/8 gate (T). The "\pi/8" naming for T is a historical accident: T is sometimes written as \text{diag}(e^{-i\pi/8}, e^{i\pi/8}) — which differs from the form above only by a global phase of e^{-i\pi/8}. Since global phases are unobservable, both forms describe the same gate. In this track, we use the cleaner form T = \text{diag}(1, e^{i\pi/4}) throughout.

Action on the computational basis

Multiply S and T into |0\rangle and |1\rangle and you get:

S|0\rangle = |0\rangle, \qquad S|1\rangle = i|1\rangle.
T|0\rangle = |0\rangle, \qquad T|1\rangle = e^{i\pi/4}|1\rangle.

Why |0\rangle is untouched in both cases: the top-left entry of both matrices is 1. Multiplying any column vector that has 0 in its second slot by a diagonal matrix whose second entry is a phase leaves the first entry unchanged — and |0\rangle = (1, 0)^T has 0 in its second slot. So only the amplitude of |1\rangle ever picks up a phase.

On a single computational-basis state, S and T do nothing visible — the result is the same state multiplied by a global phase, which cannot be measured. You will only see the effect of S or T when the qubit is in a superposition, where the phase on |1\rangle becomes a relative phase against the amplitude on |0\rangle.

Action on |+⟩ and |−⟩

Apply S to |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle):

S|+\rangle = \tfrac{1}{\sqrt{2}}(S|0\rangle + S|1\rangle) = \tfrac{1}{\sqrt{2}}(|0\rangle + i|1\rangle) = |+i\rangle.

Why this is called |+i\rangle: this is the standard notation for the state on the +y axis of the Bloch sphere. It is the equator-state one-quarter-turn counter-clockwise from |+\rangle, with an i in the |1\rangle amplitude instead of a +1 or -1.

Apply S to |-\rangle:

S|-\rangle = \tfrac{1}{\sqrt{2}}(S|0\rangle - S|1\rangle) = \tfrac{1}{\sqrt{2}}(|0\rangle - i|1\rangle) = |-i\rangle.

So S rotates the equator by 90°: |+\rangle \to |+i\rangle, |-\rangle \to |-i\rangle. Four applications of S bring you all the way around the equator back to the start — which matches S^4 = I.
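A quick numerical check of the quarter-turn picture (a numpy sketch, assumed available):

```python
import numpy as np

S = np.diag([1, 1j])
plus = np.array([1, 1]) / np.sqrt(2)       # |+>
minus = np.array([1, -1]) / np.sqrt(2)     # |->
plus_i = np.array([1, 1j]) / np.sqrt(2)    # |+i>
minus_i = np.array([1, -1j]) / np.sqrt(2)  # |-i>

print(np.allclose(S @ plus, plus_i))    # True: S|+> = |+i>
print(np.allclose(S @ minus, minus_i))  # True: S|-> = |-i>

# Four quarter-turns close the loop: S^4 = I.
print(np.allclose(np.linalg.matrix_power(S, 4), np.eye(2)))  # True
```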

For T, the analogous calculation gives:

T|+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + e^{i\pi/4}|1\rangle).

This is a state 45° along the equator, between |+\rangle and |+i\rangle. It has no standard single-letter name the way |+\rangle, |-\rangle, |+i\rangle, |-i\rangle do — though in fault-tolerance contexts it is exactly the "magic state" |T\rangle = T|+\rangle that you will meet later in this chapter. And that is precisely why T is useful: it reaches states the Paulis and H and S cannot.

Picture: the Bloch sphere

Every diagonal 2\times 2 unitary whose top-left entry is 1 is a rotation about the z-axis on the Bloch sphere. S is a 90° rotation; T is a 45° rotation. The Pauli Z, for comparison, is the 180° rotation you already know.

Figure: two Bloch spheres side by side, showing S as a 90° rotation about the z-axis (carrying |+⟩ to |+i⟩) and T as a 45° rotation (carrying |+⟩ midway to |+i⟩).
$S$ rotates the Bloch sphere $90°$ about the $z$-axis, carrying $|+\rangle$ to $|+i\rangle$. $T$ rotates $45°$, carrying $|+\rangle$ to a state halfway between $|+\rangle$ and $|+i\rangle$. Both leave the $z$-axis points $|0\rangle$ and $|1\rangle$ fixed.

Two points fall out immediately:

The poles are fixed points of both rotations. Any rotation about the z-axis leaves the north and south poles stationary, which matches the algebra: S|0\rangle = |0\rangle, T|0\rangle = |0\rangle, and |1\rangle is the same state as i|1\rangle or e^{i\pi/4}|1\rangle up to a global phase. Nothing observable changes on a pole.

Equator states rotate into other equator states. |+\rangle sits on the equator at the +x point; S takes it to the +y point (|+i\rangle); T takes it to the point halfway between. All points on the equator are connected by phase-only rotations, which is why those rotations never change the |0\rangle-vs-|1\rangle measurement outcome — only the relative phase between the two amplitudes, which is an equator position.

Picture: circuit symbols

In a quantum circuit, S and T are drawn as labelled boxes on a single wire — exactly like H, X, Y, Z.

Figure: circuit symbols for S and T — a single wire labelled |ψ⟩ passing through a labelled box. S adds a π/2 phase to |1⟩; T adds a π/4 phase.
Circuit notation for $S$ and $T$. A single labelled box on one wire. When you see a $T$ in a real circuit-decomposition report, pay attention: it is the most expensive gate in the diagram.

The inverses S^\dagger and T^\dagger are drawn the same way, with a dagger (\dagger) added to the label: S^\dagger is the box "S†", and T^\dagger is the box "T†". Both are valid gates in a circuit and are used just as often as their undaggered counterparts — a +90° rotation and a -90° rotation are equally useful building blocks.

Relationships between S, T, and Z

You have already seen the slogan — S is the square root of Z, T is the fourth root of Z. Verify it explicitly.

S^2 = Z. Multiply S by itself:

S^2 = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & i^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = Z.

Why this worked: multiplying two diagonal matrices just multiplies their diagonal entries. i \cdot i = i^2 = -1, which is precisely the bottom-right entry of Z. Two 90° rotations about the same axis combine into one 180° rotation about that axis — the algebra says the same thing.

T^2 = S. Multiply T by itself:

T^2 = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/2} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} = S.

Why e^{i\pi/4} \cdot e^{i\pi/4} = e^{i\pi/2}: the exponents add — this is the defining rule of the complex exponential, e^a \cdot e^b = e^{a+b}. And e^{i\pi/2} = \cos(\pi/2) + i\sin(\pi/2) = 0 + i \cdot 1 = i, by Euler's formula.

Combining: T^4 = (T^2)^2 = S^2 = Z, and T^8 = Z^2 = I. So eight T gates in a row undo themselves. Geometrically: eight 45° rotations about z amount to a single 360° rotation, which is the identity.

T as the square root of S. Since T^2 = S, we say T = \sqrt{S}. And since S = \sqrt{Z}, we have T = \sqrt{\sqrt{Z}} = Z^{1/4}. T is the fourth root of the Pauli Z gate, just as S is the square root of it.
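All four root relations take a few lines of numpy to confirm (an illustrative sketch, not part of the chapter's required toolchain):

```python
import numpy as np

S = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])
Z = np.diag([1, -1])

print(np.allclose(S @ S, Z))                                 # S^2 = Z
print(np.allclose(T @ T, S))                                 # T^2 = S
print(np.allclose(np.linalg.matrix_power(T, 4), Z))          # T^4 = Z
print(np.allclose(np.linalg.matrix_power(T, 8), np.eye(2)))  # T^8 = I
```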

Inverses: S† and T†

Every unitary has an inverse (its conjugate transpose, U^\dagger), and for diagonal unitaries the inverse is easy: just conjugate each diagonal phase.

S^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix}, \qquad T^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\pi/4} \end{pmatrix}.

Why this is the inverse: the conjugate transpose of a diagonal matrix is the same matrix with each diagonal entry replaced by its complex conjugate. \overline{i} = -i (the conjugate of a + bi is a - bi, and the conjugate of i = 0 + 1i is 0 - 1i = -i). Similarly \overline{e^{i\pi/4}} = e^{-i\pi/4}. So S^\dagger rotates by -90° about z, and T^\dagger rotates by -45°.

Check that S \cdot S^\dagger = I explicitly:

S \cdot S^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & i \cdot (-i) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I.

Why i \cdot (-i) = 1: by definition i^2 = -1, so i \cdot (-i) = -i^2 = -(-1) = 1. Equivalently, i = e^{i\pi/2} and -i = e^{-i\pi/2}, and e^{i\pi/2} \cdot e^{-i\pi/2} = e^0 = 1.

The critical thing to notice: S \neq S^\dagger and T \neq T^\dagger. Unlike the Paulis and H (which are their own inverses — they are Hermitian-and-unitary), S and T are not Hermitian. You must apply their daggered versions to undo them. This is a real source of bugs in student calculations and in circuit compilation pipelines: applying S twice gives you Z, not I.
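The pitfall is easy to demonstrate numerically (numpy sketch):

```python
import numpy as np

S = np.diag([1, 1j])
Z = np.diag([1, -1])
S_dag = S.conj().T  # conjugate transpose: diag(1, -i)

print(np.allclose(S @ S_dag, np.eye(2)))  # True: S S† = I, the real inverse
print(np.allclose(S @ S, Z))              # True: S S = Z, NOT the identity
print(np.allclose(S, S_dag))              # False: S is not self-inverse
```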

Worked examples

Example 1: T acting on |+⟩

Compute T|+\rangle where |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle).

Step 1. Use linearity.

T|+\rangle = T\cdot\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = \tfrac{1}{\sqrt{2}}(T|0\rangle + T|1\rangle).

Why linearity applies: every quantum gate is a linear operator. Applying T to a sum of kets is the sum of T applied to each ket. This is the property that lets all of quantum computation be written in terms of matrices.

Step 2. Substitute the action on basis states.

\tfrac{1}{\sqrt{2}}(T|0\rangle + T|1\rangle) = \tfrac{1}{\sqrt{2}}(|0\rangle + e^{i\pi/4}|1\rangle).

Why: T|0\rangle = |0\rangle (top-left entry of T is 1), and T|1\rangle = e^{i\pi/4}|1\rangle (bottom-right entry is e^{i\pi/4}). Substitute and factor.

Step 3. Write the result in closed form.

T|+\rangle = \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{e^{i\pi/4}}{\sqrt{2}}|1\rangle.

Step 4. Sanity-check the probabilities.

  • Probability of measuring 0: |1/\sqrt{2}|^2 = 1/2.
  • Probability of measuring 1: |e^{i\pi/4}/\sqrt{2}|^2 = |e^{i\pi/4}|^2 \cdot 1/2 = 1 \cdot 1/2 = 1/2.
  • Sum = 1. Correctly normalised. Why |e^{i\pi/4}|^2 = 1: any complex exponential e^{i\theta} has modulus 1 (its real part is \cos\theta, its imaginary part is \sin\theta, and \cos^2\theta + \sin^2\theta = 1). So a pure-phase factor never changes a probability.

Result. T|+\rangle = \tfrac{1}{\sqrt{2}}|0\rangle + \tfrac{e^{i\pi/4}}{\sqrt{2}}|1\rangle — a state with 50-50 measurement probabilities in the computational basis, but a specific relative phase e^{i\pi/4} between its two amplitudes. Geometrically, it sits on the Bloch equator at the point 45° counter-clockwise from |+\rangle, halfway to |+i\rangle.

Figure: a single Bloch sphere showing |+⟩ at the +x equator point and T|+⟩ 45° counter-clockwise along the equator, midway to |+i⟩.
Applying $T$ to $|+\rangle$ produces a new state on the Bloch equator, $45°$ from $|+\rangle$ on the way to $|+i\rangle$. The computational-basis probabilities are unchanged (still 50-50), but the relative phase is now $e^{i\pi/4}$.
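Example 1 checks out numerically (numpy sketch):

```python
import numpy as np

T = np.diag([1, np.exp(1j * np.pi / 4)])
plus = np.array([1, 1]) / np.sqrt(2)

out = T @ plus
probs = np.abs(out) ** 2
print(np.allclose(probs, [0.5, 0.5]))  # True: still a 50-50 state

# The relative phase between the |1> and |0> amplitudes is pi/4.
rel_phase = np.angle(out[1] / out[0])
print(np.isclose(rel_phase, np.pi / 4))  # True
```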

Example 2: The H T H circuit

Apply the circuit H \cdot T \cdot H to |0\rangle. This sequence — Hadamard, then T, then another Hadamard — is one of the simplest non-trivial circuits in quantum computing.

Step 1. First Hadamard on |0\rangle.

H|0\rangle = |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle).

Why: the Hadamard turns |0\rangle into the equal superposition |+\rangle — you saw this in the Hadamard chapter.

Step 2. Apply T.

T|+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + e^{i\pi/4}|1\rangle).

Why: as you computed in Example 1. T inserts the phase e^{i\pi/4} on the |1\rangle amplitude.

Step 3. Apply the second Hadamard. Use linearity and H|0\rangle = |+\rangle, H|1\rangle = |-\rangle.

H\cdot\tfrac{1}{\sqrt{2}}(|0\rangle + e^{i\pi/4}|1\rangle) = \tfrac{1}{\sqrt{2}}(H|0\rangle + e^{i\pi/4}H|1\rangle) = \tfrac{1}{\sqrt{2}}(|+\rangle + e^{i\pi/4}|-\rangle).

Why: linearity of H, then substitution of its action on the computational basis. The relative phase e^{i\pi/4} is unaffected — it is a scalar multiplying the |1\rangle piece.

Step 4. Expand |+\rangle and |-\rangle back into |0\rangle and |1\rangle and collect.

\tfrac{1}{\sqrt{2}}\Big(\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) + e^{i\pi/4}\tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle)\Big)
= \tfrac{1}{2}\big((1 + e^{i\pi/4})|0\rangle + (1 - e^{i\pi/4})|1\rangle\big).

Why: distribute and collect like terms. The \tfrac{1}{\sqrt{2}} \cdot \tfrac{1}{\sqrt{2}} = \tfrac{1}{2} comes from the two normalisations.

Step 5. Examine the state on the Bloch sphere. Check measurement probabilities.

  • |1 + e^{i\pi/4}|^2 = (1 + \cos(\pi/4))^2 + \sin^2(\pi/4) = 1 + 2\cos(\pi/4) + \cos^2(\pi/4) + \sin^2(\pi/4) = 2 + \sqrt{2}.
  • |1 - e^{i\pi/4}|^2 = (1 - \cos(\pi/4))^2 + \sin^2(\pi/4) = 2 - \sqrt{2}.
  • Probability of 0 is (2 + \sqrt 2)/4 \approx 0.854. Probability of 1 is (2 - \sqrt 2)/4 \approx 0.146. Why these are correct: |1 + e^{i\theta}|^2 = (1+\cos\theta)^2 + \sin^2\theta = 2 + 2\cos\theta, and |1 - e^{i\theta}|^2 = 2 - 2\cos\theta. With \theta = \pi/4 and \cos(\pi/4) = \sqrt{2}/2, you get the numbers above. Divide by 4 = 2^2 (the normalisation squared) to get probabilities, and 0.854 + 0.146 = 1.

Result. H\,T\,H\,|0\rangle is a biased superposition: measurement gives 0 with probability (2 + \sqrt{2})/4 \approx 85.4\% and 1 with probability (2 - \sqrt{2})/4 \approx 14.6\%. On the Bloch sphere, the state lies in the y-z plane, tilted 45° away from the +z pole — exactly what a 45° rotation about the x-axis does to |0\rangle. This is "conjugating a z-rotation by a basis-change that swaps x and z": H T H is effectively a 45° rotation about the x-axis.

Figure: the HTH circuit — a single wire labelled |0⟩ passing through three boxes in sequence, H, T, H. Effective operation: a 45° rotation about the x-axis, with P(0) ≈ 0.854 and P(1) ≈ 0.146.
The $HTH$ circuit. By sandwiching $T$ (a $z$-rotation) between two Hadamards, you effectively convert it into an $x$-rotation by the same angle. The biased output probabilities are a direct fingerprint of the $45°$ rotation axis not aligning with either measurement axis.
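The whole calculation collapses to one matrix product in numpy (sketch, as before):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])

ket0 = np.array([1, 0])
out = H @ T @ H @ ket0  # rightmost gate acts first

p0, p1 = np.abs(out) ** 2
print(round(p0, 3), round(p1, 3))  # 0.854 0.146

# Match the closed forms (2 ± sqrt 2)/4 derived in Step 5.
print(np.isclose(p0, (2 + np.sqrt(2)) / 4))  # True
print(np.isclose(p1, (2 - np.sqrt(2)) / 4))  # True
```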

What this shows. H U H is the "basis-change sandwich" — wrapping a gate U with two Hadamards converts it into the gate you would get by swapping x and z. Here, T = R_z(\pi/4) (up to a global phase), so HTH = R_x(\pi/4). This trick generalises: HZH = X, HXH = Z, HYH = -Y, and now HTH is the x-axis 45° rotation. When a real compiler needs an x-rotation but the hardware only provides z-rotations, it surrounds the z-rotation with Hadamards.

Why S and T are special: Clifford and non-Clifford

Every gate you have met so far — X, Y, Z, H, S — shares a special property: it belongs to the Clifford group. What that means, concretely, is: if you take any Pauli matrix P and compute U P U^\dagger for a Clifford U, the result is always another Pauli matrix (possibly with a minus sign or factor of i).

You saw this in the last chapter. HXH = Z, HZH = X, HYH = -Y. Similarly, conjugating by S gives SXS^\dagger = Y, SYS^\dagger = -X, SZS^\dagger = Z. In every case, Paulis get mapped to Paulis.
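These conjugation identities can be checked mechanically (numpy sketch):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1])
S = np.diag([1, 1j])
Sd = S.conj().T

# Conjugation by the Clifford gate S maps Paulis to (signed) Paulis.
print(np.allclose(S @ X @ Sd, Y))   # True: S X S† =  Y
print(np.allclose(S @ Y @ Sd, -X))  # True: S Y S† = -X
print(np.allclose(S @ Z @ Sd, Z))   # True: S Z S† =  Z
```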

Figure: the Clifford gates (H, S, CNOT, X, Y, Z) are efficiently simulable on classical hardware (Gottesman–Knill theorem); T is the one non-Clifford addition needed for universality. Clifford + T is a universal gate set: any unitary can be approximated to arbitrary accuracy by a sequence of these gates (Solovay–Kitaev theorem).
The Clifford group — generated by $H$, $S$, and CNOT — is everything except the $T$ gate in this picture. Clifford-only circuits, however complex they look, are classically simulable in polynomial time. The moment you add a single $T$ gate, the simulation becomes (believed to be) classically hard, and the gate set becomes universal for quantum computing.

The Gottesman–Knill theorem

Here is the punchline that makes Cliffords different from everything else: a quantum circuit built entirely from Clifford gates, starting from a computational-basis state and ending with computational-basis measurement, can be simulated efficiently on a classical computer.

This result — the Gottesman–Knill theorem [5] — says that no amount of Clifford-only quantum circuitry gives any computational advantage over a laptop. A circuit with a million H's, S's, CNOTs, and Paulis is no harder to simulate classically than a circuit with ten. The Clifford group is, in some sense, "classically tame."

Why Cliffords are classically tractable: Clifford gates preserve the Pauli structure of a state. If you describe the initial state by which Paulis commute with it (the "stabiliser formalism"), each Clifford gate just updates this description by mapping Paulis to Paulis — a quick bookkeeping update. A classical computer can track this updated description in polynomial time, and it uses the updated stabilisers to compute measurement probabilities efficiently.

This is why Cliffords are, somewhat paradoxically, the "easy" quantum gates. They create superposition, they create entanglement (via CNOT), they do lots of quantum-looking things — but none of that "quantum-looking" activity is computationally powerful on its own.

T breaks the Clifford ceiling

Enter the T gate. Under conjugation, T does not map Paulis to Paulis. Check what T X T^\dagger gives:

T X T^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & e^{-i\pi/4} \end{pmatrix}.

Work through it and you get exactly \tfrac{1}{\sqrt{2}}(X + Y) — a linear combination of Paulis, not a single Pauli with some phase. That is what it means to be non-Clifford: conjugation leaves the Pauli group and lands outside it.
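You can confirm both claims numerically: that T X T^\dagger equals (X + Y)/\sqrt{2}, and that no single Pauli, with any phase, matches it (numpy sketch):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1])
I = np.eye(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])

conj = T @ X @ T.conj().T

# T X T† is an equal mixture of X and Y, not a single Pauli.
print(np.allclose(conj, (X + Y) / np.sqrt(2)))  # True

# No phased Pauli (phase in {1, -1, i, -i}) equals it.
is_pauli = any(np.allclose(conj, ph * P)
               for P in (I, X, Y, Z) for ph in (1, -1, 1j, -1j))
print(is_pauli)  # False
```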

Adding T to the Clifford set has two consequences:

  1. The gate set becomes universal. With \{H, S, T\} plus CNOT, you can approximate any single-qubit or multi-qubit unitary to arbitrary accuracy. The Solovay–Kitaev theorem makes this efficient: any target unitary can be approximated to within \epsilon by a sequence of O(\log^c(1/\epsilon)) gates from this set. The exact value of c is an area of ongoing improvement; the important thing is that the cost is polylogarithmic.

  2. Classical simulation stops working. A circuit with just one T gate, let alone many, is no longer known to be efficiently simulable classically. (Widely believed, though not proven unconditionally — a proof would resolve major open questions in complexity theory.)

So T is the single gate that takes your otherwise-classical-feeling Clifford circuit and gives it genuine quantum computational power.

Why T is "expensive" in practice

In a fault-tolerant quantum computer — one that corrects its own errors using quantum error correction — most Clifford gates can be implemented transversally: you apply the gate to each physical qubit of an encoded logical qubit, and the error-correction code naturally protects the operation. Transversal gates are cheap: one logical gate = a few physical gates, and errors do not propagate uncontrollably.

But T cannot be transversal in most codes (the Eastin–Knill theorem states roughly that no code can implement a universal gate set transversally; one gate must be done in a costlier way). In the dominant error-correction scheme (surface code), T is implemented via magic-state distillation: you prepare a large number of low-quality copies of a special "magic state" |T\rangle = T|+\rangle, use Clifford operations and measurements to distil them into fewer, higher-quality copies, and consume one high-quality copy to implement one reliable T gate on a logical qubit.

The cost: to produce one reliable T gate, you typically need to prepare and distil hundreds to thousands of noisy magic states. Recent resource estimates for breaking 2048-bit RSA via Shor's algorithm — the big-ticket application of quantum computing — are dominated by the cost of T-gate injection. One such estimate (Gidney & Ekerå, 2019) put the dominant cost at roughly 2.7 billion Toffoli gates, each paid for in distilled magic states, with the distillation factories that supply them built from thousands of physical qubits apiece. The enormous qubit counts you see in quantum-computing roadmaps (20 million, 100 million) are mostly about having enough parallel magic-state factories to feed the T-gate demand.

That is why T is the "expensive" gate. In a textbook circuit, T costs the same as S or H — one clock tick, one box in the diagram. In a fault-tolerant implementation, T is where your budget goes.

Practical upshot

When a quantum compiler decomposes a unitary into the standard gate set \{H, S, T, \text{CNOT}\}, it counts T-count and T-depth as the primary cost metrics. An algorithm with T-count 100 is cheaper to run fault-tolerantly than one with T-count 10,000, even if they have the same total gate count. This metric has driven an entire sub-field of quantum circuit compilation — people work hard to rewrite circuits to use fewer Ts, because every T saved is potentially thousands of physical operations saved downstream.


Going deeper

If you are here for the single-qubit gate zoo, you have S and T. The key takeaways: S is a 90° z-rotation (S = \sqrt{Z}), T is a 45° z-rotation (T = \sqrt{S}), and T is the "expensive" non-Clifford gate that makes universality possible. The rest of this section digs into the Clifford group structure that makes T the lever, the Solovay–Kitaev theorem that formalises "universality to arbitrary precision," the cost of magic-state distillation as it appears in real resource estimates, and a brief mention of T counts in famous algorithms like Shor's factoring.

The Clifford+T universal gate set

The Clifford+T gate set \{H, S, T, \text{CNOT}\} is the standard universal gate set used in most of the quantum-computing literature. Why this particular collection?

Some hardware platforms natively support different sets, and there is a compilation step that re-expresses everything in terms of the native gates. When you see "T count" reported for an algorithm, it implicitly assumes Clifford+T.

The Solovay–Kitaev theorem

The Solovay–Kitaev theorem makes precise what "universal" means. It says: given any single-qubit unitary U and any desired precision \epsilon > 0, there exists a sequence of gates from \{H, S, T, H^\dagger, S^\dagger, T^\dagger\} — say, g_1 g_2 \cdots g_L — of length L = O(\log^c(1/\epsilon)) for some constant c, such that \|U - g_1 g_2 \cdots g_L\| < \epsilon.

In plain terms: you can approximate any rotation you like, to any precision you like, using only a polylogarithmic number of Clifford+T gates. The original Solovay–Kitaev bound had c = 3.97; modern improvements bring it close to c = 1.

This matters because it says universality is not just a matter of "can you get there" (existence) but of "can you get there efficiently" (polylogarithmic cost). The practical effect: any unitary you would ever want to apply in an algorithm can be compiled into a Clifford+T sequence of manageable length.
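To make "approximate any rotation" concrete, here is a toy brute-force search (illustrative only; real compilers use far smarter number-theoretic synthesis, not enumeration). It enumerates products of I, H, and T and tracks how close they get to R_z(\pi/8), a rotation outside the gate set itself:

```python
import numpy as np
from itertools import product
from functools import reduce

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
I2 = np.eye(2)

def dist(U, V):
    """Distance up to global phase: 0 exactly when U = e^{i phi} V."""
    return np.sqrt(max(0.0, 1 - abs(np.trace(U.conj().T @ V)) / 2))

# Target: R_z(pi/8), not itself expressible as a single Clifford+T gate.
theta = np.pi / 8
target = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

# Best approximation over all gate sequences of each length.
best = {}
for length in (2, 4, 6, 8):
    best[length] = min(dist(target, reduce(np.matmul, seq))
                       for seq in product((I2, H, T), repeat=length))

# Longer sequences never do worse (I is in the set), and they creep
# toward the target, as Solovay-Kitaev guarantees.
print(best[8] <= best[6] <= best[4] <= best[2])  # True
```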

Magic-state distillation, briefly

The magic-state distillation protocol (Bravyi & Kitaev 2005) [6] produces one high-fidelity |T\rangle = T|+\rangle state from many low-fidelity copies, using only Clifford operations and computational-basis measurements. The output error rate decreases quadratically (or faster) with each round of distillation, so a constant number of rounds suffices to produce arbitrarily high-quality magic states.

But the input cost is high: typical protocols consume 15 or more noisy magic states per round to produce one distilled state, and multiple rounds of distillation are stacked to drive the error rate below the surface-code threshold. The result is a factory: several hundred to several thousand physical qubits producing one reliable T gate per distillation cycle. For an algorithm needing millions of T gates (like Shor's factoring for RSA-2048), you need a bank of such factories running in parallel.

This is why resource estimates for quantum advantage are dominated by T-count: the Clifford part of the algorithm is almost free in terms of physical qubits (it adds only a few overhead per logical gate), while each T gate drags in an entire magic-state factory.

T-counts in famous algorithms

Some representative numbers, to give a sense of scale.

These numbers are what fault-tolerant hardware targets are calibrated against. When you see "10 million qubits needed for Shor," the majority of those qubits are magic-state distillation factories supplying T gates, not data qubits holding the computation's state.

The dihedral coset problem

One last connection for the more advanced reader. Non-Clifford gate resource counting is closely tied to the dihedral coset problem — a problem in computational group theory that is believed classically hard and for which quantum algorithms offer substantial speedups. This connection reveals that the T gate's non-Clifford magic is, at heart, the same magic that lets quantum computers solve certain group-theoretic problems classical computers cannot. It is too technical to fully develop here, but the point is: the T gate is not an arbitrary choice. It is deeply connected to the algebraic structure that makes quantum speedups possible.

The National Quantum Mission dimension

India's National Quantum Mission (launched 2023 with a ₹6000 crore budget over 8 years) explicitly includes fault-tolerant quantum computing as a research pillar, with magic-state distillation and T-gate-efficient compilation among the topics funded. Indian researchers at IIT Madras, TIFR, and the Raman Research Institute have active work on error-correction codes and on circuit compilation optimising T-counts for the superconducting and trapped-ion platforms being developed domestically. The T gate is, in a very practical sense, where the scaling challenge sits — and it is what a substantial fraction of India's quantum-computing research budget is paying to understand better.


References

  1. Wikipedia, Quantum logic gate — Phase and T gates — matrix forms, circuit symbols, and Clifford-group placement.
  2. Nielsen and Chuang, Quantum Computation and Quantum Information (2010), §4.2 and §10.6 on fault tolerance — Cambridge University Press.
  3. John Preskill, Lecture Notes on Quantum Computation, Ch. 7 (fault tolerance and the Clifford hierarchy) — theory.caltech.edu/~preskill/ph229.
  4. Qiskit Textbook, Single Qubit Gates — hands-on S, T examples with a live simulator.
  5. Wikipedia, Gottesman–Knill theorem — why Clifford circuits are classically simulable.
  6. Sergey Bravyi and Alexei Kitaev, Universal quantum computation with ideal Clifford gates and noisy ancillas (2005) — arXiv:quant-ph/0403025. The origin of magic-state distillation.