In short

QPE with t ancilla qubits produces a t-bit measurement outcome that, with probability at least 4/\pi^2 \approx 0.405, is the nearest t-bit integer to 2^t\varphi. The amplitude distribution has a sinc-like profile centred on the true phase, with tails that fall off as 1/\text{distance}^2. To guarantee precision \varepsilon (so that |\widetilde\varphi - \varphi| < \varepsilon) with failure probability at most \delta, use

t \;\ge\; \log_2(1/\varepsilon) + \log_2\!\Bigl(2 + \frac{1}{2\delta}\Bigr)

ancillas. The extra qubits beyond \log_2(1/\varepsilon) are "padding" that amplifies the correct outcome. Alternative boosting strategies: run QPE multiple times and take the median (majority vote), or wrap QPE in amplitude amplification for quadratic speedup. The trade-off: higher precision and confidence cost more U-applications and more qubits.

You have a unitary U, an eigenstate |u\rangle with eigenvalue e^{2\pi i\varphi}, and you want to learn \varphi. The previous chapter built the phase estimation circuit — t ancillas, a controlled-U^{2^k} ladder, an inverse QFT, and a measurement — and showed that when \varphi is exactly k/2^t, the measurement returns k deterministically.

That was the clean case. But eigenvalues of real unitaries rarely land on convenient binary fractions. The phase of T on |1\rangle is 1/8 — fine, exactly 3 bits. The phase of R_z(\pi/5) on |0\rangle is -1/20 — not a binary fraction at all: 1/20 has no terminating binary expansion. What does QPE do then?

The answer has two parts. First, the measurement returns the nearest t-bit integer with high probability — and "high" is quantifiable. Second, the outputs that are not the nearest integer form a tail that decays as 1/\text{distance}^2, so even when QPE fails to land on the nearest integer, it usually lands close. The trade-off is controllable: add more ancillas, run more trials, or wrap in amplitude amplification — each boost has a specific cost in gate count and qubit count.

This chapter works out the distribution in detail, derives the canonical 4/\pi^2 success-probability bound, and then shows the three standard ways to drive the failure probability as low as you want. By the end you will know exactly how many ancillas and how many U-applications are required for any target accuracy — a resource-accounting question that sits at the centre of every real QPE deployment.

The setup and the question

Run QPE with t ancillas on an eigenstate with phase \varphi \in [0, 1). Let a be the nearest integer to 2^t \varphi (i.e. the "true answer" rounded to t bits) and let \delta\varphi = \varphi - a/2^t be the rounding error, with |\delta\varphi| \le 1/2^{t+1}.

The measurement outcome is some integer y \in \{0, 1, \ldots, 2^t - 1\} with probability P(y) that depends on the amplitudes of the post-inverse-QFT ancilla state. The question is:

  1. What is P(a) — the probability of the correct rounding?
  2. How does P(y) fall off for y \ne a?
  3. How small can you make "P(\text{bad outcome})" by changing t?

The amplitude distribution — a sinc-like profile

Start from the ancilla state just before the inverse QFT. From the previous chapter, it is

|\Phi_\varphi\rangle \;=\; \frac{1}{\sqrt{2^t}}\sum_{x=0}^{2^t - 1} e^{2\pi i\, x\varphi}\,|x\rangle.

Apply \text{QFT}^\dagger. By the inverse QFT's definition, its matrix entry is (\text{QFT}^\dagger)_{yx} = \tfrac{1}{\sqrt{2^t}}\, e^{-2\pi i\, xy/2^t}. So the amplitude on |y\rangle after the inverse QFT is

\alpha_y \;=\; \langle y | \text{QFT}^\dagger | \Phi_\varphi\rangle \;=\; \frac{1}{2^t}\sum_{x=0}^{2^t - 1} e^{-2\pi i\, xy/2^t}\,e^{2\pi i\, x\varphi} \;=\; \frac{1}{2^t}\sum_{x=0}^{2^t - 1} e^{2\pi i\, x (\varphi - y/2^t)}.

Write \beta = \varphi - y/2^t. The sum becomes a geometric series with ratio r = e^{2\pi i \beta}:

\alpha_y \;=\; \frac{1}{2^t}\sum_{x=0}^{2^t - 1} r^x \;=\; \frac{1}{2^t}\cdot\frac{r^{2^t} - 1}{r - 1} \;=\; \frac{1}{2^t}\cdot\frac{e^{2\pi i\, 2^t \beta} - 1}{e^{2\pi i \beta} - 1}.

Why the geometric-series formula: \sum_{x=0}^{N-1} r^x = (r^N - 1)/(r - 1) whenever r \ne 1. This is the same identity that shows up in DFT orthogonality proofs; here it compresses a sum of 2^t phase-exponentials into a closed form involving just the endpoints.
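The compression can be sanity-checked numerically. A minimal sketch — the values of t and \beta are arbitrary illustrations, not from the text:

```python
import cmath

# Compare the direct sum of 2^t phase exponentials with the closed form
# (r^N - 1)/(r - 1); t and beta are illustrative choices.
t, beta = 5, 0.137
N = 2 ** t
r = cmath.exp(2j * cmath.pi * beta)

direct = sum(r ** x for x in range(N))
closed = (r ** N - 1) / (r - 1)

assert abs(direct - closed) < 1e-9
```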

Take the modulus squared to get the probability. Use the identity |e^{i\theta} - 1| = 2|\sin(\theta/2)|:

P(y) \;=\; |\alpha_y|^2 \;=\; \frac{1}{2^{2t}}\cdot\frac{4\sin^2(\pi\, 2^t \beta)}{4\sin^2(\pi \beta)} \;=\; \frac{\sin^2(\pi\, 2^t \beta)}{2^{2t}\,\sin^2(\pi \beta)}.

Here \beta = \varphi - y/2^t is the "distance" between the true phase and the fractional outcome.

Why this looks like a sinc function: for small arguments, \sin\theta \approx \theta, so the numerator (together with the 1/2^{2t} prefactor) behaves like (\pi\, 2^t \beta)^2 / 2^{2t} = (\pi \beta)^2 while the denominator behaves like (\pi \beta)^2 — their ratio is roughly 1 when |\beta| \ll 1/2^t, i.e. near the correct output. Far from the correct output, the numerator merely oscillates between 0 and 1 while the denominator grows, so P(y) decays — mimicking a \text{sinc}^2 function.
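A quick numerical check of this profile, computing P(y) directly from the amplitude sum; the phase 0.2 and t = 6 are illustrative choices:

```python
import cmath, math

def qpe_dist(phi, t):
    """QPE outcome probabilities P(y), computed from the raw amplitude sum."""
    N = 2 ** t
    return [abs(sum(cmath.exp(2j * math.pi * x * (phi - y / N))
                    for x in range(N)) / N) ** 2
            for y in range(N)]

phi, t = 0.2, 6                    # 0.2 is not a 6-bit binary fraction
P = qpe_dist(phi, t)
a = round(2 ** t * phi)            # nearest 6-bit integer to 2^t * phi

assert abs(sum(P) - 1) < 1e-9                        # normalised
assert max(range(2 ** t), key=lambda y: P[y]) == a   # peaked at y = a
assert P[a] >= 4 / math.pi ** 2                      # canonical lower bound
```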

[Figure: QPE output distribution over outcomes y, sharply peaked at y = a (the nearest integer to 2^t\varphi), with side bars at a \pm 1, a \pm 2, \ldots falling off as the inverse square of the distance; the peak satisfies P(a) \ge 4/\pi^2 \approx 0.405.]
The measurement distribution of QPE on a phase $\varphi$ whose rounded value is $a = \text{round}(2^t\varphi)$. The dominant outcome $y = a$ occurs with probability at least $4/\pi^2 \approx 0.405$; outcomes at distance $d$ from $a$ fall off as $\sim 1/d^2$. Most of the remaining probability mass lies on the immediate neighbours $a \pm 1$.

The 4/π² lower bound on the correct outcome

Focus on the "correct" outcome y = a (the nearest integer to 2^t\varphi). For this outcome, \beta = \varphi - a/2^t = \delta\varphi, the rounding error, with |\delta\varphi| \le 1/2^{t+1}. Let \Delta = 2^t \delta\varphi \in [-1/2, 1/2].

Then

P(a) \;=\; \frac{\sin^2(\pi \Delta)}{2^{2t}\,\sin^2(\pi \Delta/2^t)}.

For t \ge 1, the denominator's argument \pi\Delta/2^t is small — at most \pi/2^{t+1} in magnitude. Since |\sin\theta| \le |\theta| for all real \theta (with equality only at \theta = 0), we get \sin^2(\pi\Delta/2^t) \le (\pi\Delta/2^t)^2, so

P(a) \;\ge\; \frac{\sin^2(\pi \Delta)}{2^{2t}\cdot(\pi \Delta/2^t)^2} \;=\; \frac{\sin^2(\pi\Delta)}{\pi^2\Delta^2}.

Now use a standard trigonometric inequality: for |\Delta| \le 1/2, |\sin(\pi\Delta)| \ge 2|\Delta|. (Reason: \sin is concave on [0, \pi], so on [0, 1/2] the graph of \sin(\pi\Delta) lies above the chord from (0, 0) to (1/2, 1), which has slope 2; the case \Delta < 0 follows by symmetry.) Squaring:

\sin^2(\pi\Delta) \;\ge\; 4\Delta^2,

so

P(a) \;\ge\; \frac{4\Delta^2}{\pi^2 \Delta^2} \;=\; \frac{4}{\pi^2} \;\approx\; 0.405.

Why the bound is saturated at \Delta = \pm 1/2: that is the case where the phase \varphi sits exactly halfway between two adjacent t-bit fractions. The QPE output is split between the two nearest integers a and a \pm 1, each getting probability exactly 4/\pi^2 in the limit of large t, with the remaining 1 - 8/\pi^2 \approx 0.19 of probability spread over more distant outcomes. When \varphi is better-aligned with a t-bit grid point, P(a) is larger — at \Delta = 0, P(a) = 1 exactly.

So no matter what the true phase is, QPE returns the nearest t-bit rounding with probability at least 4/\pi^2 \approx 40.5\%. This is the canonical accuracy guarantee of QPE.
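The guarantee can be probed numerically with the closed-form P(a); the grid of 1000 sample phases and t = 8 are illustrative choices:

```python
import math

def p_nearest(phi, t):
    """Closed-form probability of measuring the nearest t-bit integer."""
    N = 2 ** t
    delta = N * phi - round(N * phi)      # Delta in [-1/2, 1/2]
    if delta == 0:
        return 1.0
    return (math.sin(math.pi * delta) ** 2
            / (N ** 2 * math.sin(math.pi * delta / N) ** 2))

t = 8
worst = min(p_nearest(k / 1000, t) for k in range(1000))

assert worst >= 4 / math.pi ** 2   # the bound holds at every sampled phase
assert worst < 0.42                # and is nearly saturated near half-grid phases
```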

The decay of the tails

For outcomes y far from a, the probability decays as 1/|y - a|^2. To see this, set y = a + m for integer m and compute \beta = \varphi - y/2^t = \delta\varphi - m/2^t, so |\beta| \ge (|m| - 1/2)/2^t. For |m| \ge 1 (with |\beta| \le 1/2, measuring distance cyclically), the chord inequality |\sin(\pi\beta)| \ge 2|\beta| bounds the denominator: \sin^2(\pi\beta) \ge 4(|m| - 1/2)^2/2^{2t}. The numerator is \sin^2(\pi\, 2^t \beta) = \sin^2(\pi(2^t\delta\varphi - m)) \le 1. So

P(a + m) \;\le\; \frac{1}{2^{2t}} \cdot \frac{2^{2t}}{4(|m| - 1/2)^2} \;=\; \frac{1}{4(|m| - 1/2)^2}.

This tail decay is not exponential — it is inverse-square. The bulk of the probability lives near a, but a heavy-ish tail stretches out. This matters for the next question: how much do you have to pad t to guarantee tiny failure probability?
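The inverse-square envelope can be verified directly; the constant 1/4 below comes from the chord inequality |\sin\pi\beta| \ge 2|\beta|, which is safe for every phase, and the phase 0.3217 and t = 7 are illustrative:

```python
import math

t = 7
N = 2 ** t
phi = 0.3217                      # illustrative phase, not a binary fraction
a = round(N * phi)

def P(y):
    beta = phi - y / N
    return (math.sin(math.pi * N * beta) ** 2
            / (N ** 2 * math.sin(math.pi * beta) ** 2))

# Envelope: P(a + m) <= 1/(4 (|m| - 1/2)^2), valid while |beta| <= 1/2.
for m in range(1, 31):
    env = 1 / (4 * (m - 0.5) ** 2)
    assert P(a + m) <= env + 1e-12
    assert P(a - m) <= env + 1e-12
```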

Resource accounting — how many ancillas for precision ε and confidence 1 − δ

You want an estimate \widetilde\varphi with |\widetilde\varphi - \varphi| < \varepsilon with probability at least 1 - \delta.

The idea: use t ancillas, where t is larger than the minimum needed just to represent \varepsilon in binary. The extra ancillas are "padding" that tighten the distribution.

Let t = n + p where n = \lceil\log_2(1/\varepsilon)\rceil is the minimum needed for precision \varepsilon, and p is the number of padding qubits. The QPE circuit runs with t ancillas; you keep only the top n bits of the outcome as your estimate. The question becomes: how large must p be so that the top-n-bit rounding is within \varepsilon of \varphi with probability \ge 1 - \delta?

The standard analysis (following Nielsen & Chuang §5.2.1) goes as follows. The estimate y/2^t is off by more than \varepsilon only if the full t-bit outcome satisfies |y - a| \ge 2^p: whenever |y - a| \le 2^p - 1, combining with the rounding error |\delta\varphi| \le 1/2^{t+1} gives |y/2^t - \varphi| < 2^p/2^t = \varepsilon. Using the tail bound P(a + m) \le 1/\bigl(4(|m| - 1/2)^2\bigr),

P(|y - a| \ge 2^{p}) \;\le\; \sum_{|m| \ge 2^{p}} \frac{1}{4(|m| - 1/2)^2} \;=\; \frac{1}{2}\sum_{m=2^{p}}^{\infty}\frac{1}{(m-1/2)^2} \;\le\; \frac{1}{2(2^{p} - 1)}.

Why the tail-sum bound: since (m - 1/2)^2 \ge m(m-1), each term obeys 1/(m-1/2)^2 \le 1/(m-1) - 1/m, and the sum telescopes to \sum_{m=M}^{\infty} 1/(m-1/2)^2 \le 1/(M-1); take M = 2^p.

Setting this failure probability \le \delta and solving for p:

\frac{1}{2(2^{p} - 1)} \;\le\; \delta \;\Longleftrightarrow\; 2^{p} \;\ge\; 1 + \frac{1}{2\delta} \;\Longleftrightarrow\; p \;\ge\; \log_2\!\Bigl(1 + \frac{1}{2\delta}\Bigr).

Enlarging the argument of the logarithm from 1 + 1/(2\delta) to 2 + 1/(2\delta) gives the commonly-quoted, slightly conservative form

t \;=\; n + p \;\ge\; \log_2(1/\varepsilon) + \log_2\!\Bigl(2 + \frac{1}{2\delta}\Bigr).

Why the second term looks "cheap": you pay only about one extra qubit each time you halve the target failure probability — \log_2(1/\delta) extra qubits in total. Getting from \delta = 1/2 to \delta = 10^{-6} costs about \log_2(10^6/2) \approx 19 extra qubits — completely manageable. The dominant cost scales with the desired precision \varepsilon, not with the confidence.
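The telescoping tail-sum bound \sum_{m \ge M} 1/(m - 1/2)^2 \le 1/(M - 1) behind this padding count can be checked numerically; truncating the infinite sum at a large cutoff only makes it smaller:

```python
# Verify sum_{m=M}^inf 1/(m - 1/2)^2 <= 1/(M - 1) for several cutoffs M.
for M in (2, 4, 8, 64):
    partial = sum(1 / (m - 0.5) ** 2 for m in range(M, 200_000))
    assert partial <= 1 / (M - 1)
```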

A concrete example: suppose you want 6-bit precision (\varepsilon = 2^{-6} \approx 0.016) with 99% success (\delta = 0.01). Then

t \;\ge\; 6 + \log_2(2 + 50) \;=\; 6 + \log_2(52) \;\approx\; 6 + 5.7 \;=\; 11.7 \;\Longrightarrow\; t = 12.

So 12 ancillas give you 6 bits of phase precision with 99% confidence. The "padding" of 6 extra qubits is the price of pushing the failure probability from 60% down to 1%.

[Figure: total U-applications in QPE versus precision bits t on a logarithmic vertical axis; the curve 2^t - 1 is a straight line on the log scale, with dashed reference lines at t = 10 (about 10^3 U-applications) and t = 20 (about 10^6).]
Gate count in QPE versus the number of ancilla qubits $t$. Each additional bit of precision doubles the total number of $U$-applications. At $t = 20$ (phase precision $\approx 10^{-6}$), QPE demands about a million $U$-applications — and if $U$ itself is built from, say, $10^4$ gates, the total circuit is $10^{10}$ gates. This is why high-precision QPE remains out of reach of current NISQ hardware.

Boosting — three strategies compared

If adding more ancillas is not enough, or if you want even better guarantees, there are three canonical boosting techniques.

Strategy 1 — more ancillas (the logarithmic padding you just saw)

Idea. Scale t as n + O(\log(1/\delta)).

Cost. O(2^t) total U-applications, t ancilla qubits. Each extra qubit in t doubles U-applications.

When to use. The default. Cheap and transparent. The failure probability scales as roughly 2^{-p}: every extra qubit of padding halves it.

Strategy 2 — majority vote over multiple runs

Idea. Run QPE r times independently, each with the same t ancillas. Each run returns an estimate \widetilde\varphi_i. Compute the median (or a majority vote on each bit). For large r, the median concentrates around the true \varphi with high probability.

Cost. Total r \cdot 2^t U-applications. t ancilla qubits (reusable across runs). r parallel runs (or r sequential with reinitialisation).

How confidence scales. By a Hoeffding-like bound, r runs push the failure probability down to e^{-\Omega(r)} — exponentially fast in the number of runs. A few tens of runs achieve nine-nines confidence.

When to use. When ancilla qubits are expensive but circuit depth is forgiving (NISQ hardware). Majority voting is easier to implement than a longer coherent QPE — each run is shorter, reducing cumulative decoherence error.
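The exponential-in-r decay of the failure probability can be computed exactly with a binomial sum; the per-run success p = 0.8 is an illustrative padded figure:

```python
import math

def majority_fail(p, r):
    """Probability that at most half of r independent runs succeed (r odd),
    i.e. that a majority vote over the runs returns a bad answer."""
    return sum(math.comb(r, k) * p ** k * (1 - p) ** (r - k)
               for k in range((r + 1) // 2))

p = 0.8                                   # per-run success after padding
fails = [majority_fail(p, r) for r in (5, 15, 31)]

assert fails[0] > fails[1] > fails[2]     # failure drops exponentially in r
assert fails[2] < 1e-3                    # a few tens of runs suffice
```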

Strategy 3 — amplitude amplification

Idea. Run QPE inside a Grover-like amplitude amplification. The amplitude on the "correct outcome" is at least 2/\pi \approx 0.64; amplitude amplification drives it arbitrarily close to 1 using O(1/\text{amp}) rounds.

Cost. O(1) rounds of amplitude amplification, each involving a full QPE forward-and-backward plus a reflection. Total U-applications is O(2^t), like standard QPE, but with a quadratic reduction in the number of repetitions compared to classical (multi-run) amplification.

When to use. In theoretical analyses and in algorithms where QPE is one subroutine inside a larger quantum routine. Amplitude amplification is a heavy hammer — it is worth using only when the cost of a single QPE run is already part of the budget.

The three strategies compose: for instance, run amplitude-amplified QPE with t ancillas and then take the median of r runs. Most practical proposals use some hybrid.

Worked examples

Example 1: the amplitude at y = a for a worst-case phase

Take a phase \varphi that lies exactly midway between two t-bit fractions — say \varphi = (a + 1/2)/2^t for some integer a. Then \Delta = 2^t\delta\varphi = 1/2, and from the amplitude formula

P(a) \;=\; \frac{\sin^2(\pi\cdot 1/2)}{2^{2t}\sin^2(\pi\cdot (1/2)/2^t)} \;=\; \frac{1}{2^{2t}\sin^2(\pi/2^{t+1})}.

Step 1. For large t, use \sin\theta \approx \theta with \theta = \pi/2^{t+1}:

\sin^2(\pi/2^{t+1}) \;\approx\; (\pi/2^{t+1})^2 \;=\; \pi^2/2^{2t+2}.

Step 2. Substitute:

P(a) \;\approx\; \frac{1}{2^{2t}\cdot\pi^2/2^{2t+2}} \;=\; \frac{2^{2t+2}}{2^{2t}\pi^2} \;=\; \frac{4}{\pi^2} \;\approx\; 0.405.

Why the approximation is tight here: for t \ge 3, the small-angle approximation \sin\theta \approx \theta is accurate to better than 2\% at \theta = \pi/2^{t+1}. So the bound P(a) \ge 4/\pi^2 is essentially saturated at this phase — it is the worst-case phase for QPE.

Step 3. By symmetry, P(a + 1) equals P(a) at this worst-case phase (the distribution is symmetric about the midpoint between a and a+1). So P(a) + P(a+1) \approx 8/\pi^2 \approx 0.81: more than 80\% of the probability mass lies on the two nearest integers, a and a+1, with the remaining \approx 19\% spread over farther outcomes.

Result. At the worst-case phase \varphi = (a+1/2)/2^t, QPE returns a with probability \approx 4/\pi^2, returns a+1 with the same probability, and either way produces an estimate within 1/2^t of the true phase with probability \approx 81\%.

[Figure: bar chart of the worst-case distribution for \varphi = (a + 1/2)/2^t; the two central bars at y = a and y = a+1 both have height 4/\pi^2 \approx 0.405, the bars at a-1 and a+2 sit near 0.045, and the remaining bars are tiny.]
At the worst-case phase midway between two $t$-bit fractions, QPE splits its probability roughly evenly between the two nearest integers $a$ and $a+1$, each at $\approx 4/\pi^2$. The two nearest outcomes together carry $\approx 81\%$ of total probability; the remaining tail spreads out with $1/\text{distance}^2$ decay.

What this shows. The 4/\pi^2 bound is tight: at the worst-case phase, you cannot do better with standard QPE alone. But the remaining \approx 60\% is not all "lost" — most of it (\approx 40.5\%) lands on the other nearest outcome a+1, which is also within 1/2^t of the true phase. If your application only needs precision 1/2^t (not exact identification of the nearest integer), the effective success probability is \approx 81\% even at the worst case. This milder criterion is what matters for most applications.
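These worst-case numbers can be confirmed from the closed-form distribution; t = 10 and a = 300 are arbitrary illustrative choices:

```python
import math

t = 10
N = 2 ** t
a = 300
phi = (a + 0.5) / N     # worst-case phase: midway between two t-bit fractions

def P(y):
    beta = phi - y / N
    return (math.sin(math.pi * N * beta) ** 2
            / (N ** 2 * math.sin(math.pi * beta) ** 2))

assert abs(P(a) - 4 / math.pi ** 2) < 1e-3   # each neighbour sits near 4/pi^2
assert abs(P(a) - P(a + 1)) < 1e-12          # symmetric 50/50 split
assert P(a) + P(a + 1) > 0.81                # > 81% on the two nearest outcomes
```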

Example 2: how many ancillas for 6-bit precision with 99% success

Target precision: \varepsilon = 2^{-6} \approx 0.016 (6 bits after the binary point). Target confidence: 1 - \delta = 99\%, so \delta = 0.01.

Step 1. Compute the minimum precision ancillas:

n \;=\; \lceil\log_2(1/\varepsilon)\rceil \;=\; \lceil\log_2(64)\rceil \;=\; 6.

Why: \varepsilon = 2^{-6} requires representing the phase to the nearest 1/64 — which is 6 bits of precision, hence n = 6 ancillas just to name that precision.

Step 2. Compute the padding:

p \;\ge\; \log_2\!\Bigl(1 + \frac{1}{2\delta}\Bigr) \;=\; \log_2\!\Bigl(1 + \frac{1}{0.02}\Bigr) \;=\; \log_2(51) \;\approx\; 5.67.

Round up: p = 6. Why padding: extra qubits make the QPE distribution tighter, pushing the probability that the outcome falls outside the precision window below \delta. Each extra padding qubit roughly halves the failure probability.

Step 3. Total ancillas:

t \;=\; n + p \;=\; 6 + 6 \;=\; 12.

Step 4. Sanity-check with the simplified formula

t \;\ge\; \log_2(1/\varepsilon) + \log_2(2 + 1/(2\delta)) \;=\; 6 + \log_2(2 + 50) \;=\; 6 + \log_2(52) \;\approx\; 6 + 5.70 \;=\; 11.70.

Round up: t = 12. Consistent.

Step 5. Compute gate cost: 2^t - 1 = 2^{12} - 1 = 4095 applications of U. (If U itself is built from g elementary gates, total circuit gates \approx 4095 g.)

Result. 12 ancillas, 4095 controlled-U applications total, plus an O(t^2) = 144-gate inverse QFT. For a U that is, say, 1000 elementary gates, the full circuit is about 4.1 \times 10^6 gates. That is at the edge of what a high-fidelity NISQ machine can do without error correction.
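Steps 1–5 can be collected into a small calculator — a sketch of this chapter's padding formula, not a library routine:

```python
import math

def qpe_resources(eps, delta):
    """Minimum ancillas t and total controlled-U applications for precision
    eps and failure probability at most delta, using the commonly-quoted
    bound t >= log2(1/eps) + log2(2 + 1/(2*delta))."""
    t = math.ceil(math.log2(1 / eps) + math.log2(2 + 1 / (2 * delta)))
    return t, 2 ** t - 1

t, u_apps = qpe_resources(eps=2 ** -6, delta=0.01)
assert (t, u_apps) == (12, 4095)

t20, _ = qpe_resources(eps=2 ** -20, delta=0.01)
assert t20 == 26        # 20-bit precision at 99% confidence
```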

[Table: minimum ancilla count t for several precision / confidence targets.]

  precision \varepsilon          confidence 1 - \delta    ancillas t
  6 bits (\approx 0.016)         99%                      12
  10 bits (\approx 10^{-3})      99%                      16
  20 bits (\approx 10^{-6})      99%                      26
  6 bits                         99.9999%                 25
Ancilla count grows linearly in the precision bits and logarithmically in the confidence. Each extra bit of precision roughly doubles the $U$-gate count; each extra 9 of confidence adds about $3$–$4$ ancillas. For 20-bit precision with 99% confidence, you need 26 ancillas and about $10^8$ $U$-applications — a cryptographically-relevant scale.

What this shows. Scaling QPE to useful precision is expensive. 6 bits of phase is only enough to distinguish about 64 distinct phase buckets — borderline for detailed chemistry or for small demonstration instances of order-finding. 20 bits pushes the resolution to about one part in 10^6, the scale of chemistry-grade eigenvalue estimates; cryptographically relevant runs of Shor's algorithm on a 2048-bit RSA modulus demand thousands of bits of phase precision, far beyond even this. The difference between just these two regimes — 4095 U-applications versus about 10^8 — is why QPE is the heavy subroutine that makes fault-tolerant hardware a prerequisite for real cryptographic attacks.

Common confusions

Going deeper

If you are here to know how accurate QPE is and how to size a circuit for a target precision — you have it. The key numbers are 4/\pi^2 for raw success, \log_2(1/\delta) padding qubits for confidence scaling, and 2^t - 1 total U-applications. What follows: the full sinc distribution analysis, Hoeffding bounds for majority voting, amplitude amplification as a quadratic boost, iterative QPE tradeoffs, Heisenberg-limited phase estimation, and the applications that push each strategy to its limit.

The full probability distribution via the sinc profile

The closed-form expression

P(y) \;=\; \frac{\sin^2(\pi\, 2^t \beta)}{2^{2t}\sin^2(\pi\beta)}, \qquad \beta = \varphi - y/2^t

can be recognised as the square of a Dirichlet kernel — the analogue of the \text{sinc}^2 function in discrete signal processing. For small \beta, this is well-approximated by (\sin(\pi 2^t\beta)/(\pi 2^t\beta))^2 = \text{sinc}^2(2^t\beta).

Two useful corollaries:

(i) Main-lobe width. The main lobe of \text{sinc}^2(2^t\beta) has half-width 1/2^t in \beta-units, so in y-units the main lobe spans \pm 1 around the correct integer a. Most probability lies on \{a-1, a, a+1\}.

(ii) Aliasing. Because the denominator \sin^2(\pi\beta) is periodic with period 1, the distribution wraps around: outcomes at y = a + 2^t - 1 are "adjacent" to a under the cyclic identification. For typical \varphi well inside the grid this wrapping is negligible, but for \varphi near 0 or 1 it creates the "zero crossing" ambiguity of QPE — something to be aware of in Shor's algorithm implementations.
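Corollary (i) is easy to confirm numerically; the phase 0.3717 and t = 8 are illustrative choices:

```python
import math

t = 8
N = 2 ** t
phi = 0.3717                    # illustrative phase, not an 8-bit fraction
a = round(N * phi)

def P(y):
    beta = phi - (y % N) / N    # cyclic identification of outcomes
    return (math.sin(math.pi * N * beta) ** 2
            / (N ** 2 * math.sin(math.pi * beta) ** 2))

assert abs(sum(P(y) for y in range(N)) - 1) < 1e-9   # normalised
assert P(a - 1) + P(a) + P(a + 1) > 0.9              # main lobe carries the mass
```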

Majority voting — Hoeffding-style analysis

Suppose QPE succeeds (returns within \varepsilon of \varphi) with probability p \ge 4/\pi^2. Run r independent QPE instances and take the median of the outputs. By a Hoeffding bound, the probability that more than half of the outputs are "failure" outcomes (outside the \varepsilon window) decays as \exp(-2r(p - 1/2)^2). For p = 4/\pi^2 \approx 0.4, p - 1/2 \approx -0.1, which fails the Hoeffding assumption since p < 1/2: majority voting only helps once each run succeeds more often than not.

The fix: make each run already boost p above 1/2 using a bit of padding, then majority-vote on top. For instance, with two padding qubits per run, the tail bound puts each run's success probability at p \ge 0.8. Then 2r(p - 1/2)^2 = 0.18 r, so r = 30 runs reduce the failure probability to e^{-5.4} \approx 0.005. Combined: 6 ancilla qubits for precision 2^{-6} plus two padding = 8 qubits per run, times 30 runs.

This is usually better than pure padding for NISQ: it keeps individual runs short (less decoherence) and trades qubit count for wall-clock time.

Amplitude amplification as a quadratic boost

Grover's amplitude amplification takes a unitary A and a "marking" operation O and produces a state where the O-marked amplitude is amplified. Applied to QPE:

  1. Let A = QPE circuit (forward only, no measurement).
  2. Let O = oracle that marks "correct" outcomes (those within \varepsilon of 2^t\varphi).
  3. The amplitude on marked states after A is \sqrt{p} \ge 2/\pi.
  4. Amplitude amplification takes this to \sim 1 in O(1/\sqrt{p}) rounds, each round applying the oracle O followed by the reflection A S_0 A^\dagger about the initial state (where S_0 is the reflection about |0\rangle).

Total cost: O(1/\sqrt p) \cdot \text{QPE-cost} = O(2^t) U-applications, since p \ge 4/\pi^2 is a constant. The quadratic saving is in the number of repetitions needed to land on a good outcome — O(1/\sqrt p) coherent rounds versus the O(1/p) expected classical repetitions — and for very stringent confidence (say \delta = 2^{-40}), amplification-based schemes beat plain voting's O(\log(1/\delta) \cdot 2^t) cost.

This is the mechanism behind quantum counting: combine amplitude amplification with QPE on the Grover operator itself to directly output \theta (where \sin^2\theta = number of marked items / total), with full amplification giving quadratic speedup over classical sampling.

Heisenberg-limited phase estimation

The best-possible scaling for estimating \varphi using N total U-applications is Heisenberg-limited: error scales as 1/N, compared to the standard quantum limit of 1/\sqrt N that classical or non-adaptive sampling achieves. QPE saturates the Heisenberg limit (error \sim 1/2^t with 2^t applications), which is why it is the asymptotically-optimal algorithm for eigenvalue estimation.

Giovannetti, Lloyd, and Maccone (2006) [arXiv:quant-ph/0412078] formalised the Heisenberg limit for general quantum metrology; Higgins et al. (2007) implemented it experimentally using Bayesian adaptive protocols. The adaptive variant sometimes called Heisenberg-limited phase estimation achieves 1/N scaling with fewer qubits than vanilla QPE but more classical post-processing.

Iterative QPE — the NISQ-friendly variant

Iterative QPE extracts one bit of \varphi at a time using a single ancilla, with classical feedforward of previously-extracted bits as phase corrections on the remaining circuits. Total qubit count: 1 ancilla + target register. Total U-applications: still 2^t - 1, same as vanilla QPE. Cost structure: depth is still O(2^t) per full estimation, but the circuits are shorter per iteration, which reduces coherent error accumulation at the cost of classical overhead.

Dobšíček et al. (2007) and Svore et al. (2013) are standard references for the NISQ version. On current IBM Quantum hardware, iterative QPE is typically the method of choice for demonstrations beyond trivial precisions.
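The feedback structure can be illustrated with a toy classical simulation, under the simplifying assumption that \varphi is an exact t-bit fraction (so every single-ancilla measurement is deterministic); this is a sketch of the scheme's control flow, not a hardware-ready routine:

```python
import math

def iterative_qpe(phi, t):
    """Toy simulation of single-ancilla iterative QPE for a phase that is
    exactly t bits (phi = k/2^t), extracting bits least-significant-first
    with the running estimate fed back as a phase correction."""
    est = 0.0
    for k in range(t - 1, -1, -1):
        # After controlled-U^(2^k) plus the feedback rotation, the ancilla
        # carries phase theta; P(measure 1) = sin^2(pi * theta).
        theta = (2 ** k) * phi - est / 2
        p1 = math.sin(math.pi * theta) ** 2
        bit = 1 if p1 > 0.5 else 0      # deterministic for exact t-bit phases
        est = bit / 2 + est / 2         # prepend the new bit to the estimate
    return est

assert iterative_qpe(5 / 32, 5) == 5 / 32   # recovers phi = 0.00101 in binary
```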

Applications of high-precision QPE

Indian context — resource analyses at IIT and IIIT

Resource-accounting analyses for Shor's algorithm and chemistry-scale QPE are an active research area in Indian quantum groups. Work at IIT Madras and IIIT Hyderabad on fault-tolerant compilation of QPE circuits, and at TIFR on approximate QFT implementations, contributes to the global effort to push the \varepsilon-\delta trade-off on real hardware. Industrial partners like C-DAC (Centre for Development of Advanced Computing) are building benchmarks for Indian-made quantum processors that include QPE as a standard reference algorithm.

Where this leads next

References

  1. Alexei Kitaev, Quantum measurements and the Abelian stabilizer problem (1995) — arXiv:quant-ph/9511026. Original phase estimation, bit-by-bit construction.
  2. Nielsen and Chuang, Quantum Computation and Quantum Information (2010), §5.2.1 — the canonical accuracy analysis — Cambridge University Press.
  3. Wikipedia, Quantum phase estimation algorithm.
  4. Vittorio Giovannetti, Seth Lloyd, Lorenzo Maccone, Quantum Metrology (2006) — arXiv:quant-ph/0412078. Heisenberg-limited estimation.
  5. Miroslav Dobšíček et al., Arbitrary accuracy iterative quantum phase estimation algorithm using a single ancillary qubit (2007) — arXiv:quant-ph/0610214. NISQ-friendly iterative QPE.
  6. Qiskit Textbook, Quantum Phase Estimation — Accuracy Analysis — hands-on implementation and accuracy discussion.