In short

The exact optimal iteration count for Grover's algorithm is

k_\text{opt} = \mathrm{round}\!\left(\frac{\pi}{4\theta} - \frac{1}{2}\right), \quad \sin\theta = \frac{1}{\sqrt N}.

For large N, \theta \approx 1/\sqrt N and k_\text{opt} \approx \frac{\pi}{4}\sqrt N. Small cases: N = 4 \Rightarrow k = 1 (probability exactly 1); N = 8 \Rightarrow k = 2 (probability \approx 0.95); N = 16 \Rightarrow k = 3 (probability \approx 0.96); N = 2^{20} \approx 10^6 \Rightarrow k = 804; N = 2^{40} \approx 10^{12} \Rightarrow k \approx 823{,}000.

The crucial subtlety is that the success probability P_k = \sin^2((2k+1)\theta) oscillates with k; it does not grow monotonically. Past the optimal k the state rotates beyond |w\rangle and the probability drops: a single extra iteration can send you from 96\% success down to 58\%. "Run Grover longer to be safer" is wrong; you must stop at the right time.

When N or the number M of marked items is unknown, the Boyer-Brassard-Høyer-Tapp (BBHT) adaptive algorithm picks random iteration counts from a range that grows geometrically each round, recovering expected O(\sqrt{N/M}) total queries with no prior knowledge. For applications where overshoot must never hurt, fixed-point Grover trades a small constant factor for a guarantee that extra iterations can never drag the success probability back below a chosen threshold.

You have just seen the geometric picture: Grover's algorithm is a rotation by 2\theta per iteration in a 2D plane, and you stop when the state is closest to |w\rangle. The rotation angle 2\theta = 2\arcsin(1/\sqrt N) and the starting angle \theta are fixed by N; the only quantity you control is how many iterations to run.

That sounds like a simple question with a simple answer: "run until the probability is maximised." But "maximised" is where it gets surprising. Classical intuition says: running an iterative algorithm longer can only help — worst case, it wastes time. Grover breaks that intuition. Run too few iterations and the state has not yet rotated to |w\rangle; run too many and it has rotated past |w\rangle. The success probability rises, peaks, falls, rises again, falls again — oscillating forever. You have exactly one measurement to make, and if you time it wrong you can hand back garbage even with a correctly implemented algorithm.

This chapter pins down the exact iteration count, tabulates the success probability as a function of k for small N, handles what to do when N is unknown, and shows how Grover's fragility to the iteration count connects to its fragility to noise. Every answer is one line of trigonometry — but the trigonometry has sharp corners.

The exact formula

From the geometric picture, after k iterations the state is

G^k|s\rangle = \sin((2k+1)\theta)\,|w\rangle + \cos((2k+1)\theta)\,|s'\rangle

where \sin\theta = 1/\sqrt N. The probability of measuring w at this point is

P_k = \sin^2\bigl((2k+1)\theta\bigr).

P_k is maximised when (2k+1)\theta = \pi/2, i.e., when k satisfies

k = \frac{\pi/2 - \theta}{2\theta} = \frac{\pi}{4\theta} - \frac{1}{2}.

Since k must be an integer, pick whichever integer is closest:

\boxed{\;k_\text{opt} = \mathrm{round}\!\left(\frac{\pi}{4\theta} - \frac{1}{2}\right).\;}

Why round and not floor: \sin^2 is symmetric about its peak, so if (2k+1)\theta is slightly past \pi/2 the probability is the same as if it were slightly short. The best integer k is whichever one makes (2k+1)\theta closest to \pi/2, which is exactly what rounding to the nearest integer achieves. Flooring would systematically stop one iteration early for roughly half of all N and give up probability for no good reason.
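As a sanity check, here is a minimal Python sketch of the formula (standard library only; the function name grover_k_opt is just for illustration):

```python
import math

def grover_k_opt(N):
    """Optimal iteration count and success probability for one marked item among N."""
    theta = math.asin(1 / math.sqrt(N))        # sin(theta) = 1/sqrt(N)
    k = round(math.pi / (4 * theta) - 0.5)     # the boxed formula
    p = math.sin((2 * k + 1) * theta) ** 2     # P_k at the chosen k
    return k, p

for N in (4, 8, 16, 1 << 20):
    k, p = grover_k_opt(N)
    print(f"N = {N:>7}: k_opt = {k:>3}, P = {p:.6f}")
# N =       4: k_opt =   1, P = 1.000000
# N =       8: k_opt =   2, P = 0.945312
# N =      16: k_opt =   3, P = 0.961319
# N = 1048576: k_opt = 804, P = 1.000000  (0.99999976 before rounding)
```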

The large-N approximation

For large N, \theta is small, and \sin\theta \approx \theta (the first term of the Taylor series). So \theta \approx 1/\sqrt N, and

k_\text{opt} \approx \frac{\pi}{4 \cdot 1/\sqrt N} - \frac{1}{2} = \frac{\pi}{4}\sqrt N - \frac{1}{2}.

Rounded to an integer, k_\text{opt} \approx \frac{\pi}{4}\sqrt N for any N much larger than 1. This is the famous "\frac{\pi}{4}\sqrt N queries" figure. The prefactor \pi/4 \approx 0.785 is roughly 0.8, not 1 and not \ln 2; it comes from the ratio of the rotation needed to reach \pi/2 to the rotation of 2\theta per iteration.

Numerical error from the approximation

For N = 16, \theta \approx 0.2527 rad. The exact formula gives k_\text{opt} = \mathrm{round}(3.108 - 0.5) = \mathrm{round}(2.608) = 3. The large-N approximation gives \frac{\pi}{4}\sqrt{16} = \pi \approx 3.14, rounded to 3. Agreement.

For N = 8, \theta = \arcsin(1/\sqrt 8) \approx 0.3614 rad. Exact: \mathrm{round}(\pi/(4 \cdot 0.3614) - 0.5) = \mathrm{round}(2.173 - 0.5) = \mathrm{round}(1.673) = 2. Approximate: \frac{\pi}{4}\sqrt 8 \approx 2.22, rounded to 2. Agreement again — the approximation is essentially exact once N \ge 8.

A table of special cases

Plugging in round numbers of qubits:

| N | \theta (rad) | k_\text{opt} | P_{k_\text{opt}} |
|---|---|---|---|
| 4 | \pi/6 \approx 0.524 | 1 | 1.0000 |
| 8 | 0.3614 | 2 | 0.9453 |
| 16 | 0.2527 | 3 | 0.9613 |
| 32 | 0.1777 | 4 | 0.9992 |
| 64 | 0.1253 | 6 | 0.9966 |
| 256 | 0.0625 | 12 | 0.99995 |
| 2^{10} = 1{,}024 | 0.0313 | 25 | 0.9995 |
| 2^{20} \approx 10^6 | \approx 10^{-3} | 804 | >0.99999 |
| 2^{40} \approx 10^{12} | \approx 10^{-6} | 823{,}549 | >0.999999 |

Two patterns are worth noting.

Pattern 1: success probability approaches 1 as N grows. The residual gap between P_{k_\text{opt}} and 1 is the penalty for (2k+1)\theta not being exactly \pi/2. For large N, the rotation angle 2\theta \approx 2/\sqrt N is small, so the k-th rotation can be chosen to land within \theta \approx 1/\sqrt N of \pi/2. The resulting probability is \cos^2(\text{small}) \approx 1 - 1/N. So for N = 10^6, P_{k_\text{opt}} \approx 1 - 10^{-6} — the algorithm is effectively deterministic.

Pattern 2: N = 4 is unique. Only at N = 4 does P_{k_\text{opt}} hit 1 exactly. For every other N, the geometry lets you get close to 1 but never reach it. This is an artefact of the discrete geometry: hitting the peak exactly at k = 1 needs 3\theta = \pi/2, which needs \theta = \pi/6, which needs \sin\theta = 1/2, which needs N = 4. No other single-marked N has this property. (Multi-marked versions do; see below.)

Oscillation — the counter-intuitive part

Plot P_k = \sin^2((2k+1)\theta) as a function of k for any fixed N, and you get the signature Grover curve: a rising slope, a peak near k_\text{opt}, a descending slope, a minimum near 2k_\text{opt}, and so on. The curve is periodic in k with period \pi/(2\theta) \approx \pi\sqrt N / 2, which is the spacing between consecutive peaks.

[Figure: success probability $P(\text{measure } |w\rangle)$ versus iteration count $k$ for $N = 16$, rising from $0.0625$ at $k = 0$ to a peak of $0.961$ at $k = 3$ and then falling; a dashed line and marker highlight $k_\text{opt} = 3$.]
Grover's success probability as a function of iteration count for $N = 16$. The curve rises to $\approx 0.96$ at $k = 3$, overshoots, and drops to $\approx 0.02$ by $k = 6$. This oscillation is why you cannot "run Grover for safety margin": three extra iterations past the optimum destroy about $94\%$ of the success probability. The exact answer depends on stopping at the right integer.

For N = 16, nine extra iterations (k = 12) leave the success probability at about 0.001, below even the 1/16 you started with. The algorithm is periodic, and each full period wastes about \pi\sqrt N / 2 queries accomplishing nothing.

The practical rule. If you know N, compute k_\text{opt}, run exactly that many iterations, measure. One shot. If you are even one iteration off, the success probability drops by a noticeable amount. Two or three iterations off, and the drop is drastic.

Two worked examples

Example 1: $N = 8$ — compute $k_\text{opt}$ and verify

Setup. Three qubits. N = 8. One marked state |w\rangle.

Step 1. Compute \theta.

\sin\theta = \frac{1}{\sqrt 8} = \frac{1}{2\sqrt 2} \approx 0.3536.
\theta = \arcsin(0.3536) \approx 0.3614 \text{ rad} \approx 20.70°.

Step 2. Optimal iteration count.

k_\text{opt} = \mathrm{round}\!\left(\frac{\pi}{4 \cdot 0.3614} - \frac{1}{2}\right) = \mathrm{round}(2.173 - 0.5) = \mathrm{round}(1.673) = 2.

Why 2 and not 1: because k = 1 would give angle 3\theta \approx 62.1°, which is 27.9° short of the target 90°, probability \sin^2(62.1°) \approx 0.781. k = 2 gives angle 5\theta \approx 103.5°, which is 13.5° past 90°, probability \sin^2(103.5°) \approx 0.945. k = 2 is closer to the target, even though it overshoots; rounding down would have cost you 0.164 in success probability.

Step 3. Verify the probability.

P_2 = \sin^2(5\theta) = \sin^2(1.807 \text{ rad}) = \sin^2(103.5°) \approx (0.972)^2 \approx 0.945.
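A quick numerical check of Steps 1 to 3, as a minimal Python sketch (standard library only):

```python
import math

theta = math.asin(1 / math.sqrt(8))            # ~0.3614 rad, Step 1
for k in (1, 2, 3):
    angle = (2 * k + 1) * theta
    print(f"k = {k}: angle = {math.degrees(angle):5.1f} deg, P = {math.sin(angle) ** 2:.3f}")
# k = 1: angle =  62.1 deg, P = 0.781
# k = 2: angle = 103.5 deg, P = 0.945
# k = 3: angle = 144.9 deg, P = 0.330
```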

Step 4. Comparison to classical. The classical worst case requires 7 queries to f. Grover uses 2 queries (two oracle calls, one per iteration) and returns the marked input with probability 94.5\%. Even counting the one verification query to confirm the answer, that is 3 quantum queries against 7 classical — a 2.3 \times advantage. Small, but already visible at N = 8.

[Figure: the plane spanned by $|s'\rangle$ and $|w\rangle$, with arrows showing the state after $k = 0, 1, 2$ iterations at angles $20.7°$, $62.1°$, $103.5°$; the $k = 2$ arrow, just past vertical, is marked as the optimal stopping point ($P \approx 0.945$, per-iteration rotation $2\theta \approx 41.4°$).]
$N = 8$ geometric trajectory. Two iterations rotate the state by $4\theta \approx 82.8°$, landing at $5\theta \approx 103.5°$ — about $13.5°$ past the $|w\rangle$ axis. The success probability $\sin^2(103.5°) \approx 0.945$ is the best an integer number of iterations can do for $N = 8$.

Result. For N = 8, Grover uses 2 oracle queries to return the marked input with probability 94.5\%. Running 1 or 3 iterations would give 78\% and 33\% respectively, much worse. Getting the count right matters even for small N.

Example 2: $N = 16$ — tabulate the oscillation

Setup. Four qubits. N = 16. One marked state. From the geometric-picture chapter, \theta \approx 0.2527 rad, k_\text{opt} = 3.

Step 1. Compute P_k for k from 0 to 13. Use P_k = \sin^2((2k+1)\theta) with \theta = 0.2527. Why go all the way to k = 13: the probability repeats with period \pi/(2\theta) \approx 6.2 iterations, and after roughly two periods (\pi/\theta \approx 12.4 iterations, one full revolution of the state in the plane) the state is back near where it started. Tabulating two full periods makes the periodicity visible.

| k | (2k+1)\theta (deg) | P_k |
|---|---|---|
| 0 | 14.48° | 0.0625 |
| 1 | 43.43° | 0.472 |
| 2 | 72.39° | 0.908 |
| 3 | 101.34° | \boxed{0.961} |
| 4 | 130.30° | 0.580 |
| 5 | 159.25° | 0.126 |
| 6 | 188.21° | 0.020 |
| 7 | 217.16° | 0.364 |
| 8 | 246.11° | 0.836 |
| 9 | 275.07° | 0.992 |
| 10 | 304.02° | 0.687 |
| 11 | 332.98° | 0.207 |
| 12 | 361.93° | 0.00113 |
| 13 | 390.89° | 0.264 |
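A sketch that regenerates the table above (standard library only):

```python
import math

N = 16
theta = math.asin(1 / math.sqrt(N))            # ~0.2527 rad
for k in range(14):
    angle = (2 * k + 1) * theta
    print(f"k = {k:2d}: (2k+1)*theta = {math.degrees(angle):6.2f} deg, P_k = {math.sin(angle) ** 2:.3f}")
```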

Step 2. Read off the pattern. Two local maxima: k = 3 (probability 0.961) and k = 9 (probability 0.992). Two local minima: k = 6 (probability 0.020) and k = 12 (probability 0.001). The function is roughly periodic in k with period \pi/(2\theta) \approx 6.2.

Step 3. Why k = 9 looks like it "wins." It gives 0.992 > 0.961. But it uses three times as many oracle queries: 9 vs 3. The quadratic speedup comes from using the smallest viable k, not the peak with the highest probability — a three-times-more-expensive algorithm that improves success from 96\% to 99\% is a bad trade. In practice you always pick the first peak: k_\text{opt} = 3.

Step 4. Interpret the minima. At k = 6, (2 \cdot 6 + 1)\theta \approx 188°, just past one half-rotation. The state sits close to -|s'\rangle (the 180° point), where the amplitude on |w\rangle vanishes; it has not quite reached -|s\rangle at 180° + \theta \approx 194.5°, where the probability would be back to \sin^2\theta = 1/N = 0.0625. The result is P_6 \approx 0.02: the algorithm has "unwound" to below even the starting probability, and a measurement gives essentially no information.

Result. For N = 16, the unique efficient choice is k = 3 iterations, giving 96.1\% success. Any k < 3 is too early; any 3 < k < 9 is past the first peak; any k \ge 9 is wasteful. The correct iteration count is a single integer, and miscounting costs you.

When N or M is unknown — adaptive Grover

The formula k_\text{opt} = \mathrm{round}(\pi/(4\theta) - 1/2) needs \theta, which needs N (or more generally N/M, the ratio of total inputs to marked inputs). In real search problems, one of these might not be known — you may be looking for some marked input in a database without knowing how many marked inputs there are.

The Boyer-Brassard-Høyer-Tapp (BBHT) adaptive scheme handles this. The idea is to try Grover with a random small number of iterations, check the result classically, and grow the "range" of iteration counts geometrically until you succeed.

The algorithm

  1. Initialise m = 1 and pick a growth factor \lambda \in (1, \tfrac{4}{3}] (the original paper uses \lambda = 6/5, which works).
  2. Pick a random integer j \in \{0, 1, \ldots, \lfloor m \rfloor\}.
  3. Apply j Grover iterations to |s\rangle and measure to obtain an outcome x.
  4. Classically evaluate f(x). If f(x) = 1, output x and stop.
  5. Otherwise set m \leftarrow \lambda \cdot m and go to step 2.
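A minimal classical simulation of this loop: the quantum part is replaced by its known success probability \sin^2((2j+1)\theta_M), so the sketch only illustrates the query-count behaviour, not a real quantum run. The function name and the choice of 200 trials are illustrative, not from the paper.

```python
import math
import random

random.seed(1)

def bbht_total_queries(N, M, lam=6/5):
    """Simulate one BBHT run; return the total number of Grover iterations (oracle calls) used."""
    theta = math.asin(math.sqrt(M / N))
    m, total = 1.0, 0
    while True:
        j = random.randint(0, int(m))          # step 2: random j in {0, ..., floor(m)}
        total += j                             # j oracle calls spent this round
        # steps 3-4: a measurement after j iterations yields a marked item
        # with probability sin^2((2j+1) theta); verification is classical
        if random.random() < math.sin((2 * j + 1) * theta) ** 2:
            return total
        m *= lam                               # step 5: grow the range and retry

N, M = 1 << 20, 1
runs = [bbht_total_queries(N, M) for _ in range(200)]
print(f"mean oracle calls over 200 runs: {sum(runs) / len(runs):.0f}")
print(f"known-M optimum (pi/4)*sqrt(N/M): {math.pi / 4 * math.sqrt(N / M):.0f}")
```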

Why it works

If M is the (unknown) number of marked items and \theta_M = \arcsin\sqrt{M/N}, the success probability after j iterations is \sin^2((2j+1)\theta_M). Averaging over random j \in \{0, 1, \ldots, m\} gives an expected success probability of at least 1/4 whenever m \ge 1/\sin(2\theta_M) \approx \sqrt{N/M}/2 [5]. So once m has grown past \sqrt{N/M}/2, each round succeeds with probability \ge 1/4 and the expected number of remaining rounds is \le 4.

Total queries: the geometric growth of m means the cumulative iteration count up to round r is O(\lambda^r). When m first reaches \sqrt{N/M}/2, r = O(\log \sqrt{N/M}), and the total queries up to that point are O(\lambda^r) = O(\sqrt{N/M}). After that, each round uses O(\sqrt{N/M}) queries and succeeds with constant probability, so the expected additional queries are also O(\sqrt{N/M}).

Net: the BBHT adaptive scheme uses O(\sqrt{N/M}) expected oracle queries — the same complexity as if M were known in advance. The price of ignorance is a constant factor, not a scaling factor.

Sketch of the decision tree

[Figure: BBHT adaptive Grover decision tree. Initialise $m = 1$; pick random $j \in \{0, \ldots, m\}$; run $j$ Grover iterations; measure $x$; if $f(x) = 1$, output $x$ and stop, otherwise set $m \leftarrow \lambda m$ (with $\lambda \approx 6/5$) and repeat. Expected total queries: $O(\sqrt{N/M})$.]
BBHT adaptive Grover. Without knowing $M$ (the number of marked items), pick a random iteration count $j$ in a range $\{0, 1, \ldots, m\}$, run Grover with $j$ iterations, measure, and verify classically. If you fail, grow the range by a factor of $\lambda \approx 6/5$ and try again. The expected total queries come out to $O(\sqrt{N/M})$.

Practical consequence. If you are running Grover in a setting where the number of marked items is unknown — say, a SAT solver where the number of satisfying assignments is the thing you are trying to discover — BBHT gets you the quadratic speedup with no prior information. The constant factor is slightly worse than \pi/4, but the asymptotic scaling is optimal.

The related problem: quantum counting

A sibling problem: given the search oracle, can you determine the number M of marked items efficiently? Yes — quantum counting (also Brassard-Høyer-Tapp) uses quantum phase estimation to measure \theta_M = \arcsin\sqrt{M/N} to precision \epsilon using O(1/\epsilon) oracle queries. For M itself, you recover M to additive error \sqrt{MN}\epsilon, which is the quantum analogue of approximate counting. You can run counting first to determine M, then plain Grover with the optimal k, for a total of O(\sqrt{N/M}) queries — the same scaling as BBHT, just organised differently.

The exponential-decay envelope — why noise kills Grover

The pristine picture assumes perfect gates. On a real quantum computer each gate has a small error rate p_\text{gate} (typically 10^{-3} to 10^{-4} on current NISQ hardware). An oracle-plus-diffusion iteration uses O(n) gates, so each iteration has error probability about n \cdot p_\text{gate}. After k iterations, the accumulated error is roughly 1 - (1 - n p_\text{gate})^k \approx k n p_\text{gate} — the probability that the state has depolarised away from the intended trajectory.

For large N, k \approx \frac{\pi}{4}\sqrt N and n = \log_2 N. The total gate count is \frac{\pi}{4}\sqrt N \cdot n = O(\sqrt N \log N). Multiply by p_\text{gate} and the probability the algorithm runs to completion correctly is roughly \exp(-\sqrt N \log N \cdot p_\text{gate}) — an exponentially decaying envelope in problem size.

Concrete: for p_\text{gate} = 10^{-3} and N = 2^{20}, the expected number of errors is \sqrt{2^{20}} \cdot 20 \cdot 10^{-3} \approx 20. The probability of error-free execution is e^{-20} \approx 2 \times 10^{-9}. Grover at this size is completely destroyed by noise without error correction.

This is why the \sqrt N scaling does not translate to immediate cryptographic impact. Breaking AES-128 would require searching 2^{128} keys, i.e. \sqrt{2^{128}} = 2^{64} Grover iterations: a halving of the effective key length, meaning AES-128 becomes as hard to break quantumly as a 64-bit key is classically. On paper. In practice, running 2^{64} Grover iterations with near-unit probability of correctness requires gate error rates far below 10^{-20}, out of reach without fault-tolerant quantum computing. The asymptotic speedup is real; the practical threat is generations away.

Fixed-point Grover — overshoot-proof success probability

For applications that need guaranteed progress even when k is not exactly tuned, the fixed-point Grover variant (Yoder-Low-Chuang, 2014) uses a modified sequence of iterations with the property that, once the success probability has reached a chosen threshold 1 - \delta^2, no number of further iterations can pull it back below that threshold. The trade-off: for a fixed target \delta, the query count is a small constant factor larger than standard Grover's. The benefit is that overshoot is no longer destructive; the algorithm is safe to run longer than necessary, making it robust to an imprecisely chosen iteration count and to unknown M.

Fixed-point Grover is the right choice for amplitude amplification inside larger algorithms: when Grover is a subroutine and you cannot afford to have its success probability collapse because of an overshoot, this guarantee is essential.

Common confusions

Going deeper

The formula k_\text{opt} = \mathrm{round}(\pi/(4\theta) - 1/2) and the oscillation picture are the essentials. What follows extends the story: the multi-marked formula, rigorous BBHT analysis, exact Grover, quantum counting, noise calculations, and how iteration-count questions scale up to fault-tolerant cryptographic attacks.

The multi-marked formula

For M marked items out of N total, \sin\theta_M = \sqrt{M/N} and

k_\text{opt} = \mathrm{round}\!\left(\frac{\pi}{4\theta_M} - \frac{1}{2}\right) \approx \frac{\pi}{4}\sqrt{\frac{N}{M}} - \frac{1}{2}.

Examples:

| N | M | \theta_M (deg) | k_\text{opt} | P |
|---|---|---|---|---|
| 16 | 1 | 14.48° | 3 | 0.961 |
| 16 | 4 | 30° | 1 | 1.000 |
| 16 | 8 | 45° | 0 | 0.500 |
| 64 | 1 | 7.18° | 6 | 0.996 |
| 64 | 4 | 14.48° | 3 | 0.961 |
| 64 | 16 | 30° | 1 | 1.000 |
| 64 | 32 | 45° | 0 | 0.500 |

Note the pattern in the rows with M = N/4: \theta_M = 30° exactly, k_\text{opt} = 1, probability exactly 1. And in the rows with M = N/2: \theta_M = 45°, k_\text{opt} = 0 (zero iterations!), meaning the initial state is already as good as it gets and one measurement returns a marked item with probability 1/2. Running more iterations cannot help: every odd multiple of 45° has \sin^2 = 1/2, so the success probability stays pinned at 1/2 however many iterations you run, and Grover offers no advantage over guessing when M = N/2.
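A sketch reproducing the table and the M = N/2 observation (standard library only; the helper name k_opt is illustrative):

```python
import math

def k_opt(N, M):
    theta = math.asin(math.sqrt(M / N))
    k = round(math.pi / (4 * theta) - 0.5)
    return k, math.sin((2 * k + 1) * theta) ** 2

for N, M in [(16, 1), (16, 4), (16, 8), (64, 1), (64, 4), (64, 16), (64, 32)]:
    k, p = k_opt(N, M)
    print(f"N = {N:2d}, M = {M:2d}: k_opt = {k}, P = {p:.3f}")

# M = N/2: the success probability is 1/2 for every iteration count, not just k = 0
theta = math.asin(math.sqrt(0.5))
print([round(math.sin((2 * k + 1) * theta) ** 2, 3) for k in range(5)])
# [0.5, 0.5, 0.5, 0.5, 0.5]
```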

Rigorous BBHT analysis

Claim: the BBHT scheme with \lambda = 6/5 uses O(\sqrt{N/M}) expected oracle queries.

Lemma. For any m \ge 0 and \theta_M > 0, averaging over uniformly random j \in \{0, 1, \ldots, m\}:

\mathbb{E}_j[\sin^2((2j+1)\theta_M)] = \frac{1}{2} - \frac{\sin(4(m+1)\theta_M)}{4(m+1)\sin(2\theta_M)}.

Proof sketch. Expand \sin^2 = (1 - \cos)/2 and sum the resulting geometric series of cosines. The \cos sum telescopes via product-to-sum identities.

Consequence. When m \ge 1/\sin(2\theta_M), the right-hand term is at most 1/4, so the expected success probability is at least 1/4. Any single round with m at least this large succeeds with probability \ge 1/4; the expected number of additional rounds is \le 4.
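A numerical spot-check of the lemma and of the 1/4 bound (the parameter choices N = 2^{20}, M = 3 are arbitrary):

```python
import math

def avg_success(m, theta):
    """Brute-force average of sin^2((2j+1) theta) over j = 0, ..., m."""
    return sum(math.sin((2 * j + 1) * theta) ** 2 for j in range(m + 1)) / (m + 1)

def closed_form(m, theta):
    return 0.5 - math.sin(4 * (m + 1) * theta) / (4 * (m + 1) * math.sin(2 * theta))

N, M = 1 << 20, 3
theta = math.asin(math.sqrt(M / N))
threshold = math.ceil(1 / math.sin(2 * theta))   # m >= 1/sin(2 theta) triggers the bound
for m in (10, 100, threshold):
    print(f"m = {m:4d}: brute force = {avg_success(m, theta):.6f}, closed form = {closed_form(m, theta):.6f}")
# the two columns agree; for m >= threshold the average is at least 1/4
```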

Query count. Each round with parameter m runs Grover with up to m iterations (actually a random number j \le m, so on average m/2). The cumulative iterations up to round r where m_r = \lambda^r are \sum_{r' \le r} m_{r'}/2 = O(\lambda^r) = O(m_r). The stopping m is O(\sqrt{N/M}), so total queries are O(\sqrt{N/M}). QED.

The hidden constant depends on \lambda and on the random-j uniform distribution; the original paper uses \lambda = 6/5 to get a tight constant of about 8. The optimal-k-known version has constant \pi/4 \approx 0.785.

Quantum counting

Given a Grover oracle O, the unitary G = D \cdot O has eigenvalues e^{\pm 2i\theta_M} (as a rotation by 2\theta_M in the 2D plane). Applying quantum phase estimation to G extracts the phase 2\theta_M to t-bit precision using 2^t applications of G. Setting t = \log_2(1/\epsilon) gives precision \epsilon in O(1/\epsilon) queries. From \theta_M you recover M = N \sin^2\theta_M.
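Restricted to the plane spanned by |w\rangle and |s'\rangle, the Grover iterate is just a rotation by 2\theta_M, so its eigenvalues are e^{\pm 2i\theta_M}; phase estimation reads off exactly this phase. A small numerical check (this sketch assumes numpy is available):

```python
import numpy as np

N, M = 64, 4
theta = np.arcsin(np.sqrt(M / N))                # theta_M
# Grover iterate restricted to span{|w>, |s'>}: a rotation by 2*theta
G = np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
              [np.sin(2 * theta),  np.cos(2 * theta)]])
phases = np.angle(np.linalg.eigvals(G))
print(phases, 2 * theta)                         # phases are +/- 2*theta_M
print(N * np.sin(abs(phases[0]) / 2) ** 2)       # recovers M = N sin^2(theta_M) = 4
```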

Estimating \theta_M to precision \delta costs O(1/\delta) applications of G and yields M = N\sin^2\theta_M with additive error roughly \sqrt{MN}\,\delta. Pinning M down to a small additive error (say, "is M between 1000 and 1010?") therefore needs \delta \approx 1/\sqrt{MN}, i.e. O(\sqrt{MN}) queries. When M is at most polynomial in the number of qubits n = \log_2 N, that is O(\sqrt N \cdot \mathrm{poly}(\log N)) queries, still far below the classical \Theta(N) needed for exact counting.

Quantum counting is a subroutine in several advanced algorithms, including quantum mean estimation (Brassard-Høyer-Mosca-Tapp) and quantum Monte Carlo (Montanaro 2015). In each case, Grover's amplitude amplification underwrites a quadratic improvement over classical Monte Carlo.

Exact Grover for arbitrary N

Standard Grover reaches success probability exactly 1 only when \sqrt{M/N} = \sin\!\bigl(\pi/(2(2k+1))\bigr) for some integer k; for a single marked item that means N = 4 and nothing else, as noted above. For arbitrary N, the exact-Grover construction (Long 2001, among others) modifies the diffusion operator slightly:

D_\phi = I - (1 - e^{i\phi})|s\rangle\langle s|,

with a tuneable phase \phi (standard Grover is the case \phi = \pi), and with the oracle's phase flip replaced by a matching tuneable phase. Choosing \phi correctly for the given N makes the final state land exactly on |w\rangle after an integer number of iterations. The result: deterministic success probability 1 for any N, at the cost of a pair of slightly modified operators that depend on N.

Whether this is worth implementing depends on the application. For most search problems, the > 99.99\% success probability of standard Grover at N \ge 256 is sufficient, and the small probability of a wrong answer is cheaply handled by one classical verification query. For very small N where success probability below 100\% matters, exact Grover is available.

Connection to cryptanalysis scales

AES-128 has 2^{128} possible keys. A Grover attack on AES-128 performs \sqrt{2^{128}} = 2^{64} Grover iterations, giving a 2\times reduction in effective key length. NIST's post-quantum guidance treats AES-128 as providing 64 bits of quantum security, which is below the recommended 128-bit threshold. Hence: AES-256 is the post-quantum-safe symmetric choice, because \sqrt{2^{256}} = 2^{128} iterations meets the threshold.

Actual construction of the oracle for "AES key check" requires the entire AES encryption circuit inside a quantum computer: each AES round becomes O(n^2) quantum gates, and 10 AES rounds plus key-schedule logic produce a circuit of \sim 10{,}000 gates that must be iterated 2^{64} times. Total gate count: on the order of 10^{23}. With error rates of 10^{-3}, the probability of a single error-free run is about e^{-10^{20}}, indistinguishable from zero; the run needs error correction.

Fault-tolerant estimates (Grassl-Langenberg-Roetteler-Steinwandt 2016) put the qubit requirement for attacking AES-128 at \sim 3000 logical qubits, which via the surface code translates to \sim 10^7 to 10^8 physical qubits. Current machines have \sim 1000 physical qubits. The practical threat is far off, but asymptotically real — which is why post-quantum cryptography migration is an active policy topic, including in India through the National Quantum Mission's cryptography workstream.

Noise scaling — quantitative

With depolarising error rate p per two-qubit gate and G gates per Grover iteration and k iterations total, the probability of a clean run is

P_\text{clean} \approx (1 - p)^{Gk} \approx \exp(-pGk).

For p = 10^{-3}, G = O(n) = O(\log N), k = \frac{\pi}{4}\sqrt N:

P_\text{clean} \approx \exp\!\left(-10^{-3} \cdot \log N \cdot \frac{\pi}{4}\sqrt N\right).

This drops below 0.5 once p \cdot \log_2 N \cdot \frac{\pi}{4}\sqrt N \approx \ln 2, which for p = 10^{-3} happens around N of a few thousand. So on p = 10^{-3} hardware, Grover is practically usable only up to N of order 10^3 to 10^4. Beyond that, error correction is mandatory.
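A sketch of the crossover calculation under the stated assumptions (p = 10^{-3}, G \approx \log_2 N gates per iteration, k = \frac{\pi}{4}\sqrt N):

```python
import math

def p_clean(N, p_gate=1e-3):
    gates_per_iteration = math.log2(N)         # G = O(n), taking the constant as 1
    k = math.pi / 4 * math.sqrt(N)             # optimal iteration count
    return math.exp(-p_gate * gates_per_iteration * k)

for n in (8, 10, 12, 14, 16, 20):
    N = 2 ** n
    print(f"N = 2^{n:2d}: P_clean = {p_clean(N):.3f}")
# P_clean crosses 0.5 between N = 2^12 and N = 2^14 on these assumptions
```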

This is the quantitative reason NISQ-era Grover demonstrations are confined to small N (n \le 5 or 6 qubits in published experiments) — not because the algorithm doesn't work, but because error accumulation over \sqrt N gate-rounds destroys the signal before N becomes interesting.

Where this leads next

References

  1. Lov K. Grover, A fast quantum mechanical algorithm for database search (1996) — arXiv:quant-ph/9605043. Original paper.
  2. Boyer, Brassard, Høyer, Tapp, Tight bounds on quantum searching (1998) — arXiv:quant-ph/9605034. The BBHT adaptive algorithm and rigorous iteration-count analysis.
  3. Wikipedia, Grover's algorithm — number of iterations.
  4. Nielsen and Chuang, Quantum Computation and Quantum Information, §6.1.4 — Cambridge University Press.
  5. John Preskill, Lecture Notes on Quantum Computation, Chapter 6 — theory.caltech.edu/~preskill/ph229. Clear derivation with the oscillation curve.
  6. Yoder, Low, Chuang, Fixed-point quantum search with an optimal number of queries (2014) — arXiv:1409.3305. The monotone-probability variant.