Pauli X, Y, Z — padho-wiki

In short

The three Pauli matrices X, Y, Z are the three simplest non-trivial single-qubit gates. X = \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix} flips |0\rangle \leftrightarrow |1\rangle: the quantum NOT. Z = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix} flips the sign of |1\rangle: the phase flip. Y = \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix} does both at once, up to an overall factor of i. Geometrically, each is a 180° rotation of the Bloch sphere about the corresponding axis: X about the x-axis, Y about y, Z about z. They are Hermitian (X^\dagger = X, etc.) and unitary, so they are their own inverses: X^2 = Y^2 = Z^2 = I. Distinct Paulis anticommute (XY = -YX), and the product of two taken in cyclic order is i times the third: XY = iZ, YZ = iX, ZX = iY. Together with I, the Paulis form a basis for the 2\times 2 Hermitian matrices, which is why every single-qubit observable can be written as a real combination of them. They are the backbone of quantum error correction, the Clifford group, and the Bloch-vector formalism.

In the last chapter you met the Hadamard — one gate, one job, creates superposition. Now meet the three gates that were there before anyone started talking about quantum computing: Pauli X, Y, Z.

They came from spin physics. In 1927, Wolfgang Pauli was trying to write down what a spin-1/2 particle — an electron — actually is quantum-mechanically. He wrote down three 2\times 2 matrices that are the generators of rotation for such a particle. Those three matrices are X, Y, Z. A century later, the same three matrices are the first gates you meet in every quantum computing course, because a qubit is a spin-1/2 system — and the mathematics of rotating a spin and rotating a qubit are literally the same mathematics.

Three matrices, three axes, one infinite family. Each one does something concrete to a qubit: X swaps |0\rangle and |1\rangle, the quantum version of a classical NOT gate. Z leaves |0\rangle alone but multiplies |1\rangle by -1, a "phase flip" that is invisible to a computational-basis measurement but can change the outcome of any algorithm that later interferes the two amplitudes, for example with a Hadamard. Y combines the two. Every other single-qubit gate you will ever see (the rotation gates R_x(\theta), R_y(\theta), R_z(\theta), the phase gates S and T, even the Hadamard itself) is, up to a global phase, the exponential of some combination of X, Y, Z. Learn these three and you have the alphabet.

This chapter builds them in every picture you know. Action on basis states. Action on the Bloch sphere. Matrix form. Commutation relations. The Pauli group. The identities XY = iZ and YZ = iX and ZX = iY that make everything else tick.

Defining the three matrices

Write them down once and keep them on your palm.

X \;=\; \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}, \qquad Y \;=\; \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}, \qquad Z \;=\; \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}.

Three 2\times 2 matrices. Each is Hermitian — take the conjugate transpose and you get the matrix back. Each is unitary — multiply by its conjugate transpose and you get the identity. Each is traceless — the diagonal entries sum to zero. And each, when squared, gives the identity:

X^2 = Y^2 = Z^2 = I.

Why each one squares to I: a unitary that is also Hermitian satisfies U^2 = U\cdot U = U^\dagger U = I. So every Hermitian-and-unitary matrix is its own inverse. The three Paulis are the three simplest non-trivial matrices of this type, which is why they keep appearing.
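The four properties can be confirmed with a few lines of plain Python. This is a minimal sketch with the matrices stored as nested lists; the helper names `matmul`, `dagger`, and `trace` are illustrative choices, not part of any library.

```python
# The identity and the three Pauli matrices as nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    """Sum of the diagonal entries."""
    return a[0][0] + a[1][1]

for name, P in [("X", X), ("Y", Y), ("Z", Z)]:
    hermitian = dagger(P) == P
    unitary = matmul(dagger(P), P) == I2
    traceless = trace(P) == 0
    self_inverse = matmul(P, P) == I2
    # Each line prints the name followed by four True values.
    print(name, hermitian, unitary, traceless, self_inverse)
```

All entries are 0, \pm 1, or \pm i, so the arithmetic is exact and plain `==` comparisons are safe here.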

The three Pauli matrices, each with its action on a qubit and its Bloch-sphere rotation. $X$ swaps basis states, $Z$ leaves the computational basis alone but flips the sign of $|1\rangle$, $Y$ does both with a factor of $i$.

Sometimes in the physics literature these are written as \sigma_x, \sigma_y, \sigma_z, with a subscript telling you which axis they rotate about. Some books abbreviate them as \sigma_1, \sigma_2, \sigma_3. In quantum computing, the clean uppercase X, Y, Z is the standard — so that is what you will see here.

Action on the basis states

Before the Bloch picture, see what each gate does to the four most common states: |0\rangle, |1\rangle, |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), |-\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle).

X — the quantum NOT

Compute X|0\rangle and X|1\rangle by matrix multiplication.

X|0\rangle = \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\begin{pmatrix}1 \\ 0\end{pmatrix} = \begin{pmatrix}0 \\ 1\end{pmatrix} = |1\rangle.

X|1\rangle = \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\begin{pmatrix}0 \\ 1\end{pmatrix} = \begin{pmatrix}1 \\ 0\end{pmatrix} = |0\rangle.

Why these work out so cleanly: X is the permutation matrix that swaps the two basis rows. Its columns are (0,1)^T and (1,0)^T — which are |1\rangle and |0\rangle exactly. A matrix's columns are its images of |0\rangle and |1\rangle.

What about X|+\rangle and X|-\rangle? Use linearity:

X|+\rangle = X\cdot\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = \tfrac{1}{\sqrt{2}}(X|0\rangle + X|1\rangle) = \tfrac{1}{\sqrt{2}}(|1\rangle + |0\rangle) = |+\rangle.

|+\rangle is unchanged by X. Similarly X|-\rangle = \tfrac{1}{\sqrt{2}}(|1\rangle - |0\rangle) = -|-\rangle — |-\rangle picks up a minus sign.

Why |+\rangle is fixed: |+\rangle is an eigenstate of X with eigenvalue +1. Symmetric in |0\rangle and |1\rangle, so swapping them does nothing. Similarly |-\rangle is an eigenstate with eigenvalue -1, because swapping the two entries of an antisymmetric combination produces a minus sign.

Z — the phase flip

Now the Z gate.

Z|0\rangle = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}\begin{pmatrix}1 \\ 0\end{pmatrix} = \begin{pmatrix}1 \\ 0\end{pmatrix} = |0\rangle.

Z|1\rangle = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}\begin{pmatrix}0 \\ 1\end{pmatrix} = \begin{pmatrix}0 \\ -1\end{pmatrix} = -|1\rangle.

|0\rangle is unchanged. |1\rangle picks up a minus sign. Notice that if you measure a single qubit in the computational basis, |1\rangle and -|1\rangle both give outcome 1 with probability 1 — the sign is a global phase on a computational-basis state, and global phases are physically undetectable on their own. So Z, acting alone on |0\rangle or |1\rangle, does nothing you can see.

Why Z is still interesting: the minus sign becomes visible in a superposition. On |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), the sign becomes a relative phase, and relative phases are absolutely observable.

Check it: Z|+\rangle = \tfrac{1}{\sqrt{2}}(Z|0\rangle + Z|1\rangle) = \tfrac{1}{\sqrt{2}}(|0\rangle - |1\rangle) = |-\rangle. Similarly Z|-\rangle = |+\rangle. So Z swaps |+\rangle \leftrightarrow |-\rangle — a visible, physical change of state.

Y — flip and phase together

Finally Y.

Y|0\rangle = \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}\begin{pmatrix}1 \\ 0\end{pmatrix} = \begin{pmatrix}0 \\ i\end{pmatrix} = i|1\rangle.

Y|1\rangle = \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}\begin{pmatrix}0 \\ 1\end{pmatrix} = \begin{pmatrix}-i \\ 0\end{pmatrix} = -i|0\rangle.

Y flips |0\rangle to |1\rangle (like X would), but it also multiplies by i. Similarly |1\rangle goes to |0\rangle times -i. The geometric content: on the Bloch sphere, Y rotates 180° about the y-axis, which takes |0\rangle through a half-great-circle in the xz-plane whose midpoint is the |+\rangle state on the +x equator. (X, by contrast, carries |0\rangle through the yz-plane.) The factors of \pm i do not move any point on the sphere. They are the global phase that separates the Hermitian matrix Y from the rotation matrix R_y(\pi) = -iY, and they keep the algebra consistent.

Why Y is X and Z "combined": you can check directly that Y = iXZ. So Y is the product of X and Z, up to a factor of i. Applying Y is applying a bit flip and a phase flip together in one move.

The full table

Collecting all four actions — the one you should have glued above your desk.

The complete action of the three Pauli gates on the four most common single-qubit basis states. Read the row for the input you care about; the cell gives you the output.

Spot the pattern. X and Z each fix two of the four states and swap the other two (with signs). Y never fixes any of the four — it always sends a state to something different (with a factor of \pm i). Each gate's eigenstates — the states it fixes up to a \pm 1 scalar — are exactly the poles of its rotation axis: |0\rangle, |1\rangle for Z; |+\rangle, |-\rangle for X; |+i\rangle, |-i\rangle for Y.
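The table can be regenerated mechanically. The sketch below is one way to do it in plain Python: it applies each gate to each state and names the result as a phase times one of the four states. The helpers `apply` and `label` are hypothetical names, and the states |+\rangle and |-\rangle are stored without their common factor 1/\sqrt{2} so that every comparison is exact.

```python
# Pauli matrices as nested lists.
gates = {
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
# Unnormalised representatives: the common factor 1/sqrt(2) of |+> and |->
# is dropped so that every comparison below is exact.
states = {"|0>": [1, 0], "|1>": [0, 1], "|+>": [1, 1], "|->": [1, -1]}
phases = {"": 1, "-": -1, "i": 1j, "-i": -1j}

def apply(m, v):
    """Matrix-times-column for a 2x2 matrix and a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def label(v):
    """Write v as (phase)(state) by searching the four phases and four states."""
    for phase_name, c in phases.items():
        for state_name, s in states.items():
            if v == [c * s[0], c * s[1]]:
                return phase_name + state_name
    raise ValueError("not a phase times one of the listed states")

table = {g: {s: label(apply(m, v)) for s, v in states.items()}
         for g, m in gates.items()}

print("input".ljust(6), "X".ljust(6), "Y".ljust(6), "Z".ljust(6))
for s in states:
    print(s.ljust(6), *(table[g][s].ljust(6) for g in "XYZ"))
```

Reading the Y column, for instance, gives i|1>, -i|0>, -i|->, i|+> for the inputs |0>, |1>, |+>, |->: never the input state itself.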

Picture: Pauli gates as Bloch-sphere rotations

Now the geometric picture. Each of X, Y, Z is a 180° rotation of the Bloch sphere about its own axis.

X — 180° about the x-axis

Under X, the Bloch sphere rotates 180° around the x-axis. The +x and -x points, |+\rangle and |-\rangle, sit on the rotation axis and don't move (the state vector changes by at most a sign). The z-axis poles, |0\rangle and |1\rangle, sweep through the yz-plane and land at each other. The y-axis points, |+i\rangle and |-i\rangle, also swap.

The $X$ gate rotates the Bloch sphere $180°$ about the $x$-axis. $|0\rangle$ at the north pole swings through a half-circle and lands at $|1\rangle$ at the south. $|+\rangle$ and $|-\rangle$ sit on the axis and don't move.

Z — 180° about the z-axis

Under Z, the Bloch sphere rotates 180° around the z-axis. The poles |0\rangle and |1\rangle sit on the rotation axis and don't move (the vector |1\rangle picks up the sign -1). Every point on the equator moves to the opposite point: |+\rangle \leftrightarrow |-\rangle and |+i\rangle \leftrightarrow |-i\rangle.

Y — 180° about the y-axis

Y rotates 180° about the y-axis. The \pm y poles — |+i\rangle and |-i\rangle — are on the axis. The z-poles and x-equator points rotate through half-turns: |0\rangle \leftrightarrow |1\rangle and |+\rangle \leftrightarrow |-\rangle. On the Bloch sphere Y looks like X composed with Z (or Z composed with X, up to a sign), which matches the algebraic identity Y = iXZ = -iZX.

Together, the three Paulis are the three orthogonal 180° rotations of the Bloch sphere. Nothing else is like them; they are the skeleton of the rotation group for a qubit.
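The rotation picture can be checked numerically. The sketch below assumes the usual definition of the Bloch vector as the triple of expectation values (\langle X\rangle, \langle Y\rangle, \langle Z\rangle) and picks an arbitrary test state; the angles and helper names are illustrative only.

```python
import cmath
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def apply(m, v):
    """Matrix-times-column for a 2x2 matrix and a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def expectation(m, v):
    """<v|m|v> for a normalised state v (real when m is Hermitian)."""
    w = apply(m, v)
    return (v[0].conjugate() * w[0] + v[1].conjugate() * w[1]).real

def bloch_vector(v):
    return [expectation(P, v) for P in (X, Y, Z)]

# An arbitrary pure state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>.
theta, phi = 1.0, 0.5
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

r = bloch_vector(psi)
rotated = {name: bloch_vector(apply(P, psi))
           for name, P in (("X", X), ("Y", Y), ("Z", Z))}
# A 180-degree turn about an axis keeps that component and negates the other two.
expected = {"X": [r[0], -r[1], -r[2]],
            "Y": [-r[0], r[1], -r[2]],
            "Z": [-r[0], -r[1], r[2]]}
for name in "XYZ":
    print(name, [round(c, 6) for c in rotated[name]])
```

Each gate leaves the component along its own axis alone and flips the sign of the other two, which is exactly a half-turn about that axis.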

Self-inverse, Hermitian, unitary

Three algebraic properties of every Pauli matrix.

Hermitian. X^\dagger = X, Y^\dagger = Y, Z^\dagger = Z. Take the conjugate transpose and the matrix is unchanged. Check the middle one: Y^\dagger is Y transposed ((0, -i; i, 0) \to (0, i; -i, 0)) then conjugated (i \to -i and vice-versa), giving back (0, -i; i, 0) = Y. Hermiticity is why Paulis are observables — the eigenvalues of a Hermitian matrix are real, so measuring X, Y, or Z gives a real number (specifically \pm 1, as the eigenvalues are always \pm 1).

Unitary. X^\dagger X = I, Y^\dagger Y = I, Z^\dagger Z = I. Because each is Hermitian, X^\dagger X = X \cdot X = X^2, so unitarity and the self-inverse property X^2 = I are the same statement.

Self-inverse. X^2 = Y^2 = Z^2 = I. Applying any Pauli twice is the identity.

Why self-inverse follows from "Hermitian + unitary": if P is Hermitian then P^\dagger = P. If P is also unitary then P^\dagger P = I. Combining: P \cdot P = I, so P^2 = I. Every matrix that is both Hermitian and unitary is its own inverse. The Paulis are the prototypical examples.

A consequence: the only eigenvalues of a Pauli matrix are \pm 1. (If P|\psi\rangle = \lambda|\psi\rangle then |\psi\rangle = P^2|\psi\rangle = \lambda^2|\psi\rangle, so \lambda^2 = 1, so \lambda = \pm 1.) Measuring a Pauli observable always gives \pm 1 as the outcome. This is the content of the phrase "the Paulis are the \pm 1-valued observables on a qubit."
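The same conclusion drops out of the characteristic polynomial. Each Pauli has trace 0 and determinant -1, so \lambda^2 - \text{tr}(P)\lambda + \det(P) = \lambda^2 - 1 = 0. A small sketch that solves this quadratic for each matrix (the function name `eigenvalues` is an illustrative choice):

```python
import cmath

paulis = {
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def eigenvalues(m):
    """Roots of the characteristic polynomial l^2 - tr(m) l + det(m) = 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

spectra = {name: eigenvalues(P) for name, P in paulis.items()}
for name, (lam_plus, lam_minus) in spectra.items():
    # Values are complex numbers numerically equal to +1 and -1.
    print(name, lam_plus, lam_minus)
```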

Commutation and anticommutation

Here is where the algebra gets interesting.

XY does not equal YX

Compute both orderings.

XY \;=\; \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\begin{pmatrix}0 & -i \\ i & 0\end{pmatrix} \;=\; \begin{pmatrix}0\cdot 0 + 1\cdot i & 0\cdot(-i) + 1\cdot 0 \\ 1\cdot 0 + 0\cdot i & 1\cdot(-i) + 0\cdot 0\end{pmatrix} \;=\; \begin{pmatrix}i & 0 \\ 0 & -i\end{pmatrix} \;=\; iZ.

Why iZ falls out: the product matrix has i on the top-left and -i on the bottom-right, which factors as i times \text{diag}(1, -1) = Z. Pulling out the i makes the Pauli identity visible.

Now the other ordering.

YX \;=\; \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix} \;=\; \begin{pmatrix}0\cdot 0 + (-i)\cdot 1 & 0\cdot 1 + (-i)\cdot 0 \\ i\cdot 0 + 0\cdot 1 & i\cdot 1 + 0\cdot 0\end{pmatrix} \;=\; \begin{pmatrix}-i & 0 \\ 0 & i\end{pmatrix} \;=\; -iZ.

So XY = iZ and YX = -iZ. These are not equal — in fact they are negatives of each other. The quantity that measures how much two gates fail to commute is the commutator:

[X, Y] \;=\; XY - YX \;=\; iZ - (-iZ) \;=\; 2iZ.

The three commutation relations

By the same calculation (try the others for yourself if you like, or see the going-deeper section), you can check:

[X, Y] = 2iZ, \qquad [Y, Z] = 2iX, \qquad [Z, X] = 2iY.

These three identities — the Pauli commutation relations — are the single most important algebraic fact about the Pauli matrices. Written compactly with the Levi-Civita symbol \varepsilon_{abc}:

[\sigma_a, \sigma_b] \;=\; 2i\,\varepsilon_{abc}\,\sigma_c

where a, b, c range over x, y, z, the repeated index c is summed over, and \varepsilon_{abc} is +1 for cyclic orderings of (x, y, z), -1 for anticyclic orderings, and 0 if any two indices are equal.

Why these commutation relations are the same as angular momentum: they are the defining identities of the Lie algebra \mathfrak{su}(2), which is the algebra of infinitesimal rotations in 3D. The Pauli matrices generate the rotation group for spin-1/2 particles and equivalently for qubits.
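For readers who would rather let a machine do the remaining two products, here is a minimal check of all three commutation relations in plain Python. The helpers `matmul`, `commutator`, and `scale` are illustrative names.

```python
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

checks = {
    "[X, Y] = 2iZ": commutator(X, Y) == scale(2j, Z),
    "[Y, Z] = 2iX": commutator(Y, Z) == scale(2j, X),
    "[Z, X] = 2iY": commutator(Z, X) == scale(2j, Y),
}
for relation, holds in checks.items():
    print(relation, holds)  # each relation prints True
```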

Anticommutation

A second set of relations lives alongside the commutators.

\{X, Y\} \;=\; XY + YX \;=\; iZ + (-iZ) \;=\; 0.

Distinct Paulis anticommute — their product plus their reverse-ordered product is zero. Combined with the commutator relations, the Paulis satisfy

\{X, Y\} = \{Y, Z\} = \{Z, X\} = 0, \qquad \{X, X\} = \{Y, Y\} = \{Z, Z\} = 2I.

You can sum up both commutation and anticommutation in one compact identity:

\sigma_a \sigma_b \;=\; \delta_{ab}\,I + i\,\varepsilon_{abc}\,\sigma_c.

When a = b you get \sigma_a^2 = I. When a \neq b you get i \varepsilon_{abc} \sigma_c — which is iZ for ab = xy, -iZ for ab = yx, and so on. This one line encodes everything there is to know about the Pauli products.
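The compact identity covers nine products at once, which makes it a good candidate for a brute-force check. In the sketch below the indices 0, 1, 2 stand for x, y, z, and `levi_civita` uses a closed-form expression that is valid only for indices drawn from {0, 1, 2}; both choices are assumptions of this example.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
paulis = [X, Y, Z]  # index 0, 1, 2 stands for x, y, z

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def levi_civita(a, b, c):
    """epsilon_{abc} for indices in {0, 1, 2}: +1 cyclic, -1 anticyclic, 0 otherwise."""
    return (a - b) * (b - c) * (c - a) // 2

def right_hand_side(a, b):
    """delta_{ab} I + i * sum_c epsilon_{abc} sigma_c."""
    delta = 1 if a == b else 0
    return [[delta * I2[i][j]
             + 1j * sum(levi_civita(a, b, c) * paulis[c][i][j] for c in range(3))
             for j in range(2)] for i in range(2)]

identity_holds = all(matmul(paulis[a], paulis[b]) == right_hand_side(a, b)
                     for a in range(3) for b in range(3))
print(identity_holds)  # True
```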

The cyclic commutator structure: walk around the triangle in the direction of the arrows and the commutator gives $2i$ times the next Pauli. Walk the other way and you pick up a minus sign.

Products of Paulis — the three useful identities

Rewriting the compact identity \sigma_a \sigma_b = \delta_{ab}I + i\varepsilon_{abc}\sigma_c as three separate lines, the products you will use every day are

XY = iZ, \qquad YZ = iX, \qquad ZX = iY.

And the anti-cyclic direction flips sign:

YX = -iZ, \qquad ZY = -iX, \qquad XZ = -iY.

Derive YZ = iX to make sure it lands.

YZ \;=\; \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}\begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix} \;=\; \begin{pmatrix}0\cdot 1 + (-i)\cdot 0 & 0\cdot 0 + (-i)\cdot(-1) \\ i\cdot 1 + 0\cdot 0 & i\cdot 0 + 0\cdot(-1)\end{pmatrix} \;=\; \begin{pmatrix}0 & i \\ i & 0\end{pmatrix}.

Factor out the i: \begin{pmatrix}0 & i \\ i & 0\end{pmatrix} = i\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix} = iX. ✓

Why the identities are symmetric under cyclic permutations: the Paulis are the generators of rotations about the three orthogonal axes of a 3D space, and a 120° rotation about the diagonal (1, 1, 1) axis carries x to y, y to z, and z to x. The corresponding single-qubit unitary conjugates X to Y, Y to Z, and Z to X, so any identity among the Paulis stays true when the labels are cycled.

The Pauli group

Take X, Y, Z, together with the identity I, and also include signs and factors of i. Close the set under multiplication. What you get is a finite group called the Pauli group on one qubit.

The single-qubit Pauli group

The single-qubit Pauli group \mathcal{P}_1 is the set

\mathcal{P}_1 \;=\; \{\,\pm I,\, \pm iI,\, \pm X,\, \pm iX,\, \pm Y,\, \pm iY,\, \pm Z,\, \pm iZ\,\}.

It has 16 elements and is generated by X, Y, and Z: the product XYZ = iI supplies the phases. (X and Z alone generate only the 8 elements \pm I, \pm X, \pm Z, \pm XZ, so the phase iI must be added as a generator if Y is left out.)

Why 16? Because there are 4 matrices (I, X, Y, Z) and 4 phases (\pm 1, \pm i), giving 4 \times 4 = 16. Some textbooks quotient out the phases and work with the projective Pauli group, which has just 4 elements (I, X, Y, Z); which version you use depends on whether you care about overall phases.

The Pauli group on n qubits is the set of tensor products of n single-qubit Paulis, with phases: \mathcal{P}_n has 4 \times 4^n = 4^{n+1} elements. This is the group that quantum error-correcting codes live in: any error operator on n qubits can be expanded as a linear combination of tensor products of single-qubit Paulis, so a code that corrects a set of Pauli errors also corrects their linear combinations.
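The element count for one qubit can be reproduced by brute force: start from X, Y, Z and keep multiplying until nothing new appears. The closure routine below is a generic sketch (the names `to_key`, `mul`, and `generate` are hypothetical), not an efficient group-theory algorithm.

```python
def to_key(m):
    """Hashable form of a matrix: a tuple of tuples of complex numbers."""
    return tuple(tuple(complex(x) for x in row) for row in m)

def mul(a, b):
    """Product of two 2x2 matrices in tuple form."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def generate(generators):
    """Close a set of 2x2 matrices under multiplication."""
    group = {to_key(g) for g in generators}
    frontier = list(group)
    while frontier:
        a = frontier.pop()
        for b in list(group):
            for product in (mul(a, b), mul(b, a)):
                if product not in group:
                    group.add(product)
                    frontier.append(product)
    return group

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

pauli_group = generate([X, Y, Z])
print(len(pauli_group))  # 16
```

Because every entry is 0, \pm 1, or \pm i, the products are exact and can be stored in a set without any rounding tolerance.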

Conjugation by the Hadamard

One beautiful fact that ties this chapter to the previous one: conjugating a Pauli by a Hadamard permutes the three Paulis.

H X H = Z, \qquad H Z H = X, \qquad H Y H = -Y.

Why this permutation happens: H is a 180° rotation about the axis halfway between x and z. Under this rotation, the x-axis swaps with the z-axis, and the y-axis flips sign. The conjugation H P H applies this geometric rotation to the Pauli operator, so X \to Z, Z \to X, Y \to -Y.

This is why H plus the Paulis gives you a much larger set of useful gates — the Clifford group, which you will meet in Part 6.
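The three conjugation identities can be verified directly. This sketch assumes the standard Hadamard matrix H = \tfrac{1}{\sqrt{2}}\begin{pmatrix}1 & 1 \\ 1 & -1\end{pmatrix} from the previous chapter and compares with a small tolerance, because 1/\sqrt{2} is not exact in floating point.

```python
import math

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison with a floating-point tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def conjugate_by_h(P):
    """H P H (H is its own inverse, so this is conjugation)."""
    return matmul(H, matmul(P, H))

minus_Y = [[-e for e in row] for row in Y]
results = {
    "HXH = Z": close(conjugate_by_h(X), Z),
    "HZH = X": close(conjugate_by_h(Z), X),
    "HYH = -Y": close(conjugate_by_h(Y), minus_Y),
}
for identity, holds in results.items():
    print(identity, holds)  # each identity prints True
```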

Worked examples

Example 1: |+⟩ is a +1 eigenstate of X

Apply the Pauli X to the state |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) and show that the result is |+\rangle itself — so that |+\rangle is an eigenvector of X with eigenvalue +1.

Step 1. Write |+\rangle as a column vector.

|+\rangle \;=\; \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ 1\end{pmatrix}.

Why: |+\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) — the two amplitudes are both 1/\sqrt{2}, so the column vector has 1/\sqrt{2} in each row.

Step 2. Multiply X by this column.

X|+\rangle \;=\; \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ 1\end{pmatrix} \;=\; \frac{1}{\sqrt{2}}\begin{pmatrix}0\cdot 1 + 1\cdot 1 \\ 1\cdot 1 + 0\cdot 1\end{pmatrix} \;=\; \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ 1\end{pmatrix}.

Why the same column comes out: X swaps the top and bottom entries of any column vector. If the entries are the same (both 1/\sqrt{2} here), swapping produces the identical vector. This is the mechanical reason |+\rangle is an eigenvector.

Step 3. Recognise the result as |+\rangle.

\frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ 1\end{pmatrix} \;=\; \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \;=\; |+\rangle.

Step 4. Confirm |-\rangle is the -1 eigenstate.

X|-\rangle \;=\; \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ -1\end{pmatrix} \;=\; \frac{1}{\sqrt{2}}\begin{pmatrix}-1 \\ 1\end{pmatrix} \;=\; -\frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ -1\end{pmatrix} \;=\; -|-\rangle.

Result. |+\rangle is the +1 eigenstate of X; |-\rangle is the -1 eigenstate. Together they are the eigenbasis of X, also called the X-basis or the plus-minus basis.

$|+\rangle$ and $|-\rangle$ sit on the $x$-axis of the Bloch sphere — the rotation axis of $X$. Points on a rotation's axis are fixed by that rotation (up to sign), which is exactly what being an eigenstate means.

What this shows. Eigenstates of a Pauli gate are geometrically the points on its rotation axis. The eigenvalues \pm 1 correspond to whether you are at the positive or negative end of the axis. This is why the three "axis-bases" — |0\rangle/|1\rangle for Z, |+\rangle/|-\rangle for X, |+i\rangle/|-i\rangle for Y — are the three natural measurement bases for a qubit.

Example 2: Compute Y|0⟩ by matrix multiplication

Compute Y|0\rangle explicitly by multiplying the Pauli Y matrix by the column vector |0\rangle = (1, 0)^T. Verify the answer is i|1\rangle.

Step 1. Write down the matrix and the column.

Y \;=\; \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}, \qquad |0\rangle \;=\; \begin{pmatrix}1 \\ 0\end{pmatrix}.

Step 2. Perform the matrix-times-column multiplication entry by entry. The top entry of the output is the dot product of the top row of Y with the column |0\rangle:

[Y|0\rangle]_0 \;=\; 0\cdot 1 + (-i)\cdot 0 \;=\; 0.

The bottom entry is the dot product of the bottom row of Y with |0\rangle:

[Y|0\rangle]_1 \;=\; i\cdot 1 + 0\cdot 0 \;=\; i.

Why matrix-times-column works row-by-row: each entry of the output is one row of the matrix dotted with the column. For a 2\times 2 matrix times a column, you do two dot products — one per row. Do it in that order and the arithmetic is clean.

Step 3. Collect into a column vector.

Y|0\rangle \;=\; \begin{pmatrix}0 \\ i\end{pmatrix}.

Step 4. Recognise this as i|1\rangle.

\begin{pmatrix}0 \\ i\end{pmatrix} \;=\; i\begin{pmatrix}0 \\ 1\end{pmatrix} \;=\; i|1\rangle.

Why the factor of i matters: i|1\rangle and |1\rangle are physically the same state (they differ only by a global phase). But in a circuit where this output feeds into another gate or combines with other superpositions, the i becomes a relative phase — and relative phases are observable. So the i is "invisible locally, essential globally."

Step 5. Cross-check with the Bloch picture. Y implements a 180° rotation about the y-axis. Starting from |0\rangle at the north pole, that rotation carries the point through a half-circle in the xz-plane and lands at |1\rangle at the south pole. So on the Bloch sphere the state goes cleanly from |0\rangle to |1\rangle. The factor of i out front is a global phase — it does not move the point on the Bloch sphere at all, since |1\rangle and i|1\rangle are the same physical state. The Bloch picture and the matrix algebra agree: start at the north pole, end at the south pole. Why the Bloch rotation is "clean" while the matrix has a factor of i: the Pauli matrix Y and the proper rotation R_y(\pi) differ by a global phase — specifically Y = i R_y(\pi) — and global phases are invisible on the Bloch sphere but present in the matrix algebra. This is a recurring feature of Pauli gates versus rotation gates: same physics, different overall phase conventions.

Result. Y|0\rangle = i|1\rangle. The action is: start at the north pole of the Bloch sphere, rotate 180° about the y-axis, land at the south pole. The i is an unobservable global phase in isolation but a real relative phase when this output participates in a superposition.

The $Y$ gate carries $|0\rangle$ on a half-circle about the $y$-axis, passing through $|+\rangle$ at the halfway point and landing at $|1\rangle$ (with a global phase factor of $i$). The algebraic counterpart is $Y = iXZ$: a bit flip and a phase flip composed.

What this shows. The Pauli Y combines the actions of X (a bit flip) and Z (a phase flip) in a single rotation, with the factor of i making the product Hermitian (XZ on its own is unitary but anti-Hermitian). Unlike X or Z, applying Y to a basis state produces an imaginary amplitude. On a single basis state that i is only a global phase. It matters when Y acts on a superposition, where the two basis states pick up +i and -i respectively, or when Y is applied conditionally as a controlled gate, because it then shows up as a relative phase.
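The relation Y = i R_y(\pi) quoted in Step 5 can be checked in a few lines. The sketch assumes the common convention R_y(\theta) = \cos(\theta/2)\,I - i\sin(\theta/2)\,Y, which gives a real matrix; if your course defines the rotation gates differently, the phase may differ.

```python
import math

Y = [[0, -1j], [1j, 0]]

def ry(theta):
    """Rotation by theta about the y-axis: cos(theta/2) I - i sin(theta/2) Y."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def close(a, b, tol=1e-12):
    """Entrywise comparison with a floating-point tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Multiply R_y(pi) by the global phase i and compare with Y.
i_ry_pi = [[1j * entry for entry in row] for row in ry(math.pi)]
same_up_to_phase = close(i_ry_pi, Y)

# Acting on |0> = (1, 0) picks out the first column of each matrix.
ry_on_zero = [row[0] for row in ry(math.pi)]  # approximately (0, 1) = |1>
y_on_zero = [row[0] for row in Y]             # (0, i) = i|1>
print(same_up_to_phase, y_on_zero)
```

The rotation matrix sends |0\rangle to |1\rangle with no phase, while Y sends it to i|1\rangle: the same point on the Bloch sphere, reached with a different overall phase.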

Common confusions

"The factors of i and -i in Y are just cosmetic, so you can drop them." No. Drop them and Y is no longer Hermitian. The real matrix \begin{pmatrix}0 & -1 \\ 1 & 0\end{pmatrix} = -iY is still unitary, but it squares to -I, has eigenvalues \pm i, and is not an observable. The imaginary entries are what make Y^\dagger = Y: transposing swaps -i and i, and conjugating swaps them back. If you also drop the minus sign you get X, not a third independent matrix.
"Pauli X is the same as the classical NOT." True in action on basis states — X|0\rangle = |1\rangle and X|1\rangle = |0\rangle, just like a classical NOT flips bits. But X is a genuine quantum gate: it is linear, it acts on superpositions (X(\alpha|0\rangle + \beta|1\rangle) = \alpha|1\rangle + \beta|0\rangle), and it does not randomise anything. A classical NOT is a deterministic flip on a classical bit; the quantum X is a deterministic rotation that happens to permute the computational basis, but its full job is unitary rotation of the Bloch sphere.
"Z doesn't do anything because Z|0\rangle = |0\rangle." True on |0\rangle alone — it is an eigenstate of Z with eigenvalue +1. But Z|1\rangle = -|1\rangle, and more importantly Z is highly non-trivial on superpositions: Z|+\rangle = |-\rangle is a real, measurable change of state. The minus sign on |1\rangle is only invisible if you are about to measure in the computational basis and then never use the result in further quantum processing. In any algorithm that interferes |1\rangle with |0\rangle (and most do, with a Hadamard at the end), the sign matters.
"|+\rangle is a Z eigenstate because it starts from |0\rangle, which is a Z eigenstate." No — this is the single sharpest error to avoid. |+\rangle is an X eigenstate (you just proved it in Example 1), not a Z eigenstate. Z|+\rangle = |-\rangle, a different state. The eigenstates of Z are |0\rangle and |1\rangle; the eigenstates of X are |+\rangle and |-\rangle; the eigenstates of Y are |+i\rangle and |-i\rangle. No state is an eigenstate of more than one Pauli simultaneously — because different Paulis don't share eigenstates (they anticommute, so if P_1|\psi\rangle = \lambda_1|\psi\rangle and P_2|\psi\rangle = \lambda_2|\psi\rangle, then 0 = \{P_1,P_2\}|\psi\rangle = 2\lambda_1\lambda_2|\psi\rangle, forcing \lambda_1 = 0 or \lambda_2 = 0, impossible for non-zero Pauli eigenvalues).
"Pauli X and the CNOT are the same gate." The CNOT (controlled-NOT, a two-qubit gate) applies X to the target qubit when the control is |1\rangle and does nothing when the control is |0\rangle. So X is the one-qubit gate; CNOT is the two-qubit controlled version of it. Every time you see "bit flip on qubit B depending on qubit A," the underlying one-qubit gate is X and the control structure makes it CNOT.
"The commutation relation [X, Y] = iZ has a factor of i that can be absorbed into Z." The correct identity is [X, Y] = 2iZ, not iZ. The factor of 2 matters — it comes from the two contributions XY and -YX each equal to \pm iZ, and subtracting gives 2iZ. Don't drop the 2; it is where the "angular momentum" structure of these operators lives.

Going deeper

If you are just here to know what X, Y, Z are and what they do — you have it. Three 180° rotations, three axis bases, self-inverse, anticommuting in pairs, and with the cyclic product XY = iZ. The rest of this section goes further: the Paulis as a basis for every 2\times 2 Hermitian matrix, the Bloch-vector formula \rho = (I + \vec r\cdot\vec\sigma)/2, the generalisation to qudits (Gell-Mann matrices), and how the same Pauli matrices live in Dirac's equation for the relativistic electron.

The Paulis span all Hermitian matrices on a qubit

A remarkable fact: every 2 \times 2 Hermitian matrix M can be written uniquely as a real linear combination of I, X, Y, Z:

M \;=\; m_0 I + m_x X + m_y Y + m_z Z, \qquad m_0, m_x, m_y, m_z \in \mathbb{R}.

Why four real numbers suffice: a 2\times 2 Hermitian matrix has four independent real entries (two real diagonals and one complex off-diagonal with two real components). The four matrices I, X, Y, Z span exactly this four-dimensional real vector space.

Finding the coefficients is a trace calculation. Because the Paulis are orthogonal under the Hilbert-Schmidt inner product (\text{tr}(P_a P_b) = 2\delta_{ab} for P_a, P_b \in \{X, Y, Z\}, and \text{tr}(I P_a) = 0 for any Pauli, and \text{tr}(I^2) = 2), the coefficients are

m_0 = \tfrac{1}{2}\text{tr}(M), \qquad m_x = \tfrac{1}{2}\text{tr}(XM), \qquad m_y = \tfrac{1}{2}\text{tr}(YM), \qquad m_z = \tfrac{1}{2}\text{tr}(ZM).

This is why Paulis are the natural "basis" for single-qubit observables: any measurement you can define on a qubit is some real combination of I, X, Y, Z. You can read off the components with four trace computations.
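As a worked instance of the trace formulas, the sketch below decomposes one Hermitian matrix and rebuilds it from its four coefficients. The matrix M is an arbitrary example chosen so that the coefficients come out as whole numbers; the function name `pauli_coefficients` is illustrative.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
BASIS = (I2, X, Y, Z)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def pauli_coefficients(M):
    """(m_0, m_x, m_y, m_z) with m_a = tr(P_a M) / 2, real for Hermitian M."""
    return [(trace(matmul(P, M)) / 2).real for P in BASIS]

# An arbitrary Hermitian example: real diagonal, conjugate off-diagonals.
M = [[3, 1 - 2j], [1 + 2j, -1]]
coeffs = pauli_coefficients(M)
print(coeffs)  # [1.0, 1.0, 2.0, 2.0]

# Rebuild M as m_0 I + m_x X + m_y Y + m_z Z.
rebuilt = [[sum(m * P[i][j] for m, P in zip(coeffs, BASIS)) for j in range(2)]
           for i in range(2)]
print(rebuilt == M)  # True
```

So this M equals I + X + 2Y + 2Z, and the off-diagonal entry 1 - 2i is m_x - i m_y, as the matrices for X and Y predict.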

The Bloch-vector representation of a density matrix

From the last chapter's preview of density matrices: every qubit density matrix has the form

\rho \;=\; \frac{I + \vec r \cdot \vec\sigma}{2} \;=\; \frac{I + r_x X + r_y Y + r_z Z}{2},

where \vec{r} = (r_x, r_y, r_z) is a real 3-vector with |\vec r| \leq 1 (the Bloch vector). The expansion you just saw is exactly this formula with m_0 = 1/2 (so that \text{tr}(\rho) = 1) and m_a = r_a/2.

Pure states have |\vec r| = 1 (on the Bloch sphere); mixed states have |\vec r| < 1 (inside the Bloch ball). The Pauli expansion is the reason the Bloch vector works: it is the natural coordinate system for the space of single-qubit states.

A consequence: the expectation value of a Pauli observable in the state \rho is the corresponding component of the Bloch vector:

\langle X\rangle_\rho = \text{tr}(X\rho) = r_x, \qquad \langle Y\rangle_\rho = r_y, \qquad \langle Z\rangle_\rho = r_z.

The Bloch vector's components are literally the mean values of the three Paulis. Experimental quantum state tomography — measuring \langle X\rangle, \langle Y\rangle, \langle Z\rangle on many copies of an unknown state — reads out the Bloch vector directly, giving you the full density matrix.
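A minimal numerical illustration of this readout: build \rho from a chosen Bloch vector, then recover the vector from the three traces. The vector (0.3, -0.4, 0.5) is an arbitrary mixed-state example, and the helper names are assumptions of this sketch.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def density_matrix(r):
    """rho = (I + r_x X + r_y Y + r_z Z) / 2 for a Bloch vector r."""
    rx, ry, rz = r
    return [[(I2[i][j] + rx * X[i][j] + ry * Y[i][j] + rz * Z[i][j]) / 2
             for j in range(2)] for i in range(2)]

r = (0.3, -0.4, 0.5)  # |r|^2 = 0.5 < 1, so this is a mixed state
rho = density_matrix(r)

# Expectation values tr(P rho) reproduce the Bloch vector components.
measured = [complex(trace(matmul(P, rho))).real for P in (X, Y, Z)]
print([round(c, 6) for c in measured])
```

An experiment estimates the same three numbers from measurement statistics rather than from an exact trace, so real tomography recovers them only up to sampling error.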

Qudits and the Gell-Mann matrices

For higher-dimensional systems (qudits, with d levels), the Paulis generalise. For d = 3 you get the eight Gell-Mann matrices \lambda_1, \ldots, \lambda_8, which are traceless Hermitian 3\times 3 matrices generalising X, Y, Z to a three-level system (a qutrit). For general d, you get d^2 - 1 traceless Hermitian matrices that together with I span all Hermitian operators on the qudit; they are called the generalised Gell-Mann matrices.

One specific-to-quantum-computing generalisation: the generalised Pauli group on a qudit is generated by the shift operator X: |k\rangle \mapsto |k+1 \mod d\rangle and the clock operator Z: |k\rangle \mapsto \omega^k|k\rangle with \omega = e^{2\pi i/d}. For d = 2, X and Z reduce to the single-qubit Paulis you know. For d = 3, they give the qutrit Paulis used in some error-correction schemes.
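The shift and clock operators are easy to build for any d. The sketch below constructs them for a qutrit and checks the relation ZX = \omega XZ, which is not stated above but follows from the two definitions and reduces to ZX = -XZ at d = 2. Function names are illustrative.

```python
import cmath

def shift(d):
    """X on a qudit: |k> -> |k+1 mod d>."""
    return [[1 if row == (col + 1) % d else 0 for col in range(d)]
            for row in range(d)]

def clock(d):
    """Z on a qudit: |k> -> omega^k |k>, with omega = exp(2 pi i / d)."""
    omega = cmath.exp(2j * cmath.pi / d)
    return [[omega ** row if row == col else 0 for col in range(d)]
            for row in range(d)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison with a floating-point tolerance."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

d = 3
omega = cmath.exp(2j * cmath.pi / d)
Xd, Zd = shift(d), clock(d)
zx = matmul(Zd, Xd)
omega_xz = [[omega * e for e in row] for row in matmul(Xd, Zd)]
print(close(zx, omega_xz))  # True: ZX = omega XZ for the qutrit

# At d = 2 the construction reduces to the ordinary Pauli X and Z.
print(close(shift(2), [[0, 1], [1, 0]]), close(clock(2), [[1, 0], [0, -1]]))
```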

Pauli matrices in Dirac's electron

The same three Pauli matrices that appear in quantum computing were introduced by Pauli to describe the spin of an electron. In the non-relativistic theory (the Pauli equation, which extends Schrödinger's equation to include spin), a spin-1/2 electron's state is a 2-component column vector called a spinor, acted on by X, Y, Z. The spin operators are S_x = \tfrac{\hbar}{2}X, S_y = \tfrac{\hbar}{2}Y, S_z = \tfrac{\hbar}{2}Z: the Paulis multiplied by \hbar/2. Their commutation relations [S_a, S_b] = i\hbar \varepsilon_{abc} S_c are the defining relations of quantum angular momentum.

In Dirac's 1928 relativistic equation for the electron, the Pauli matrices reappear as building blocks of a larger 4\times 4 structure, the Dirac matrices \gamma^\mu. In the standard (Dirac) representation, \gamma^0 is built from the 2\times 2 identity in block-diagonal form, and the three spatial matrices \gamma^1, \gamma^2, \gamma^3 are built from X, Y, Z in off-diagonal blocks. This is why Dirac's equation naturally incorporates spin: the Pauli matrices are already there, embedded in the relativistic algebra.

Bose's 1924 paper on photon statistics is an early piece of the neighbouring story of particle statistics. Satyendra Nath Bose did not write down Pauli matrices, but his work on indistinguishable particles (bosons) is the counterpart to the fermion side, where Pauli's exclusion principle is expressed by anticommuting creation operators, an algebraic structure reminiscent of the anticommuting Pauli matrices. Indian physics therefore contributed a foundational part of the statistics story that sits alongside Pauli's work, even though the matrices themselves are Pauli's.

The Clifford group, one step more

You have now met X, Y, Z (the Pauli group), and H (the Hadamard). The next gate in the tour is S, the phase gate: the 90° rotation about the z-axis, with matrix S = \text{diag}(1, i). Together \{H, S, X, Y, Z\} generate the single-qubit Clifford group, a finite group with 24 elements once gates that differ only by a global phase are identified. It maps Paulis to Paulis under conjugation (you saw HXH = Z already; similarly SXS^\dagger = Y, etc.).

The single-qubit Clifford group plus one non-Clifford gate (conventionally T, the 45° z-rotation) can approximate any single-qubit gate, and adding CNOT makes the set universal for quantum computing. Since S = T^2, the smallest single-qubit generating set is \{H, T\}; S, the inverses S^\dagger and T^\dagger, and the Paulis \{X, Y, Z\} (which are Cliffords anyway) are usually listed alongside for convenience. The Paulis are the foundation; the Clifford gates are the scaffolding; T is the door out to the full universal gate set.
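The conjugation rules for S can be checked the same way as those for H. This sketch uses S = \text{diag}(1, i) as given above and verifies SXS^\dagger = Y together with two companions, SYS^\dagger = -X and SZS^\dagger = Z, which are not stated in the text but follow from the same matrices.

```python
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
S = [[1, 0], [0, 1j]]
S_DAG = [[1, 0], [0, -1j]]  # conjugate transpose of S

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate_by_s(P):
    """S P S^dagger."""
    return matmul(S, matmul(P, S_DAG))

minus_X = [[-e for e in row] for row in X]
results = {
    "S X S^dagger = Y": conjugate_by_s(X) == Y,
    "S Y S^dagger = -X": conjugate_by_s(Y) == minus_X,
    "S Z S^dagger = Z": conjugate_by_s(Z) == Z,
    "S^2 = Z": matmul(S, S) == Z,
}
for identity, holds in results.items():
    print(identity, holds)  # each identity prints True
```

Geometrically this is the quarter-turn about the z-axis at work: x goes to y, y goes to -x, and z stays put.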

Where this leads next

Rotation gates R_x, R_y, R_z — the continuous-angle versions of the Pauli rotations. R_z(\pi) = -iZ, etc.
Phase gates S and T — the 90° and 45° rotations about the z-axis. Needed to go beyond Cliffords.
The Clifford group — the finite group generated by H, S, CNOT that permutes Paulis under conjugation.
The Pauli group on n qubits — tensor products of single-qubit Paulis with phases, the natural arena for error correction.
The Bloch sphere — the geometric home of single-qubit states, on which the Paulis are 180° rotations.
A preview of density matrices — where the Pauli expansion \rho = (I + \vec r\cdot\vec\sigma)/2 comes from.
