In short
Amplitude estimation takes the amplitude amplification machinery and turns it into a measurement device for the success probability p itself. Suppose A prepares a state |\Psi\rangle = A|0\rangle whose probability of landing in a "good" subspace is p. Amplification rotated |\Psi\rangle toward the good subspace; estimation measures the rotation angle and recovers p from it.
The idea: the amplification operator Q = -A S_0 A^\dagger S_\Pi is a rotation by 2\theta in the 2D good-bad plane, and its eigenvalues on that plane are e^{\pm 2i\theta} with \sin\theta = \sqrt p. Apply quantum phase estimation to Q — with |\Psi\rangle as the input state (a superposition of Q's two eigenvectors) and t ancilla qubits — and the measurement outcome gives you \theta to t bits of precision. Compute p = \sin^2\theta from the estimate and you are done.
The scaling: O(2^t) = O(1/\varepsilon) queries to A give you p with additive error \varepsilon. Classically, estimating a probability p to additive error \varepsilon by Monte Carlo sampling needs \Theta(1/\varepsilon^2) samples — quadratically more. This quadratic improvement over classical Monte Carlo is one of the most practically important quantum speedups, with applications in finance (option pricing), chemistry (expectation values), and Bayesian statistics.
Amplitude amplification showed you how to reach the good outcome. Amplitude estimation asks a different question: how good is the probability itself? Is p one in ten, or one in a hundred, or one in a million?
A lot of problems reduce to estimating a probability. What fraction of option-pricing paths end up exercising? What fraction of molecular configurations fall within a given energy band? What fraction of Bayesian posterior samples satisfy a hypothesis? In every one of these, you want the number p, not just a single "good" sample.
Classically, the way to estimate p is Monte Carlo: run the procedure m times, count the fraction of successes, invoke the law of large numbers. To get additive accuracy \varepsilon, you need m = \Theta(1/\varepsilon^2) samples — the famous 1/\sqrt m scaling of sample-mean error. To estimate a probability to the second decimal place (\varepsilon = 0.01) needs 10\,000 samples; to the third decimal place (\varepsilon = 10^{-3}), 10^6 samples. Monte Carlo's bottleneck is exactly this quadratic penalty on accuracy.
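The 1/\sqrt m law is easy to see in simulation. A minimal numpy sketch (the values of p and m here are illustrative):

```python
import numpy as np

# Classical baseline, simulated: the Monte Carlo estimate of p has standard
# deviation sqrt(p(1-p)/m) -- the 1/sqrt(m) law quoted above.
rng = np.random.default_rng(0)
p = 0.25
for m in (100, 10_000, 1_000_000):
    # 5000 independent Monte Carlo runs of m samples each.
    p_hats = rng.binomial(m, p, size=5000) / m
    rmse = np.sqrt(np.mean((p_hats - p) ** 2))
    print(f"m={m:>9}: empirical error {rmse:.5f}, "
          f"predicted {np.sqrt(p * (1 - p) / m):.5f}")
```

Each hundredfold increase in m buys only one extra decimal digit of accuracy.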
Amplitude estimation replaces 1/\varepsilon^2 with 1/\varepsilon. The procedure needs O(1/\varepsilon) applications of the state-preparation unitary A, which is a quadratic speedup over the classical \Theta(1/\varepsilon^2). For \varepsilon = 10^{-3} that is 1\,000 quantum queries versus 10^6 classical samples.
The construction is short: take the amplification operator Q from the previous chapter, run quantum phase estimation on it, read the phase as \theta, and compute p = \sin^2\theta. Every piece is already built. The chapter is about stitching them together.
From amplification to estimation — a change of perspective
Recall Q = -A\,S_0\,A^\dagger\,S_\Pi. You proved in the amplitude amplification chapter that, restricted to the 2D plane spanned by |\text{good}\rangle and |\text{bad}\rangle, Q acts as the rotation matrix

\begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}

where \sin\theta = \sqrt p.
A 2D rotation matrix has eigenvalues e^{\pm 2i\theta} and eigenvectors |\psi_\pm\rangle = \frac{1}{\sqrt 2}(|\text{bad}\rangle \mp i|\text{good}\rangle) — a standard piece of linear algebra. So Q has a known spectrum: two eigenvalues on the unit circle, at angles \pm 2\theta. The angle of those eigenvalues encodes the success probability you want to estimate, through p = \sin^2\theta.
If only you had a way to extract the phase of a unitary's eigenvalue... quantum phase estimation is exactly that tool. That is the whole punchline. Everything else is implementation detail.
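The spectrum claim is quick to verify numerically. A plain numpy check, using p = 1/4 as an example value:

```python
import numpy as np

# Sanity check: the 2x2 rotation by 2*theta has eigenvalues e^{+-2i theta},
# with sin(theta) = sqrt(p). Here p = 1/4, so theta = pi/6.
p = 0.25
theta = np.arcsin(np.sqrt(p))
Q = np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
              [np.sin(2 * theta),  np.cos(2 * theta)]])

eigvals = np.linalg.eigvals(Q)
expected = [np.exp(2j * theta), np.exp(-2j * theta)]
# Every computed eigenvalue matches one of the two expected phases.
for ev in eigvals:
    assert any(abs(ev - ex) < 1e-12 for ex in expected)
print(sorted(np.angle(eigvals)))   # angles +-2*theta = +-pi/3
```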
Decomposing |\Psi\rangle in Q's eigenbasis
QPE extracts the phase of an eigenvalue when the input is an eigenstate. Your state |\Psi\rangle is not an eigenstate of Q — it is the state you prepared with A|0\rangle. But it is a superposition of Q's two eigenstates in the plane, and that turns out to be exactly what QPE needs.
The eigenstates of a 2D rotation matrix are

|\psi_\pm\rangle = \frac{1}{\sqrt 2}\left(|\text{bad}\rangle \mp i\,|\text{good}\rangle\right),

with Q|\psi_\pm\rangle = e^{\pm 2i\theta}|\psi_\pm\rangle (the sign convention depends on how you orient the rotation; either way, one eigenvalue is e^{+2i\theta} and the other is e^{-2i\theta}).
Rearrange: |\text{bad}\rangle = \tfrac{1}{\sqrt 2}(|\psi_+\rangle + |\psi_-\rangle) and |\text{good}\rangle = \tfrac{i}{\sqrt 2}(|\psi_+\rangle - |\psi_-\rangle). Then

|\Psi\rangle = \cos\theta\,|\text{bad}\rangle + \sin\theta\,|\text{good}\rangle = \frac{1}{\sqrt 2}\left(e^{i\theta}|\psi_+\rangle + e^{-i\theta}|\psi_-\rangle\right)

after a small amount of algebra using \sin\theta = (e^{i\theta} - e^{-i\theta})/(2i) and \cos\theta = (e^{i\theta} + e^{-i\theta})/2.
The upshot, ignoring the overall phase: |\Psi\rangle is a superposition of the two eigenstates |\psi_+\rangle and |\psi_-\rangle, with equal weight 1/\sqrt 2 on each.
Why equal weight matters: when you run QPE with |\Psi\rangle as the input, QPE will happily output either of the two eigenvalue phases, with probability 1/2 for each. That is, half the time the measurement reads out (an approximation of) +2\theta, and half the time it reads -2\theta — equivalently, the phase 1 - \theta/\pi in the [0, 1) convention QPE uses. Both give you \theta up to a sign, and since p = \sin^2\theta only depends on \theta modulo sign, this ambiguity is harmless. You just need to remember that the two outcomes are both valid and both lead to the same p.
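The decomposition can be verified numerically. The sketch below fixes the basis as (|bad⟩, |good⟩) and uses the eigenvector convention |\psi_\pm\rangle = (|\text{bad}\rangle \mp i|\text{good}\rangle)/\sqrt 2; as noted, this is one of two equivalent sign choices:

```python
import numpy as np

# Numeric check of the eigenbasis decomposition of |Psi>, for p = 1/4.
p = 0.25
theta = np.arcsin(np.sqrt(p))
bad = np.array([1, 0], dtype=complex)
good = np.array([0, 1], dtype=complex)

psi_plus = (bad - 1j * good) / np.sqrt(2)
psi_minus = (bad + 1j * good) / np.sqrt(2)
Psi = np.cos(theta) * bad + np.sin(theta) * good

# |Psi> = (e^{+i theta} |psi_+> + e^{-i theta} |psi_->) / sqrt(2)
reconstructed = (np.exp(+1j * theta) * psi_plus
                 + np.exp(-1j * theta) * psi_minus) / np.sqrt(2)
assert np.allclose(Psi, reconstructed)

# The weight on each eigenstate has magnitude 1/sqrt(2): probability 1/2 each.
assert np.isclose(abs(np.vdot(psi_plus, Psi)) ** 2, 0.5)
assert np.isclose(abs(np.vdot(psi_minus, Psi)) ** 2, 0.5)
print("decomposition verified")
```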
The amplitude-estimation circuit
Assemble the pieces.
Inputs. The state preparation A (with inverse A^\dagger), the good-subspace reflection S_\Pi, and a choice of t ancilla qubits for phase estimation. Larger t gives higher precision.
Ingredients for one round.
- A t-qubit ancilla register, initialised to |0\rangle^{\otimes t}.
- The target register, initialised to |0\rangle then hit with A to become |\Psi\rangle.
- Controlled applications of Q^{2^k} for k = 0, 1, \ldots, t - 1, with ancilla qubit k as the control.
- An inverse QFT on the ancillas.
- A measurement of the ancillas, yielding an integer y \in \{0, 1, \ldots, 2^t - 1\}.
Recovering p. From y, compute

\tilde\theta = \frac{\pi y}{2^t}, \qquad \tilde p = \sin^2\tilde\theta.

That is the estimate. With probability at least 8/\pi^2 \approx 0.81, the outcome y is within 1 of the true 2\theta \cdot 2^t / (2\pi) = \theta \cdot 2^t/\pi, giving |\tilde\theta - \theta| \le \pi/2^t and (with a bit more work) |\tilde p - p| \le 2\pi\sqrt{p(1-p)}/2^t + \pi^2/4^t. For large t and small p, this is essentially |\tilde p - p| = O(\sqrt p / 2^t).
Why the error scales like 1/2^t, not 1/2^{t/2}
The sample-mean error of classical Monte Carlo is 1/\sqrt m: to halve the error you need 4 \times as many samples. The error of amplitude estimation falls exponentially in t: adding one ancilla (which doubles the number of Q applications) halves the error, so halving the error costs 2\times the queries instead of 4\times. That is the quadratic speedup.
How the scaling comes out of QPE. QPE on Q with t ancillas measures the phase \varphi = 2\theta/(2\pi) = \theta/\pi (using the convention that QPE reads a phase \varphi \in [0, 1) such that the eigenvalue is e^{2\pi i\varphi}) to accuracy \pm 1/2^t. So \theta is estimated to accuracy \pm \pi/2^t.
Then p = \sin^2\theta, and the error in p from an error \Delta\theta in \theta is \Delta p \approx |2\sin\theta\cos\theta|\cdot\Delta\theta = \sin(2\theta) \cdot \Delta\theta. For small \theta (i.e. small p), \sin(2\theta) \approx 2\sqrt p, giving \Delta p \approx 2\sqrt p \cdot \pi/2^t = O(\sqrt p / 2^t).
Cost in terms of \varepsilon. Set \Delta p = \varepsilon and solve: 2^t \approx 2\pi\sqrt p/\varepsilon. Total applications of A needed: the Q-ladder calls Q a total of 1 + 2 + 4 + \ldots + 2^{t-1} = 2^t - 1 \approx 2^t times, and each Q uses A and A^\dagger once. So the total is O(2^t) = O(\sqrt p / \varepsilon) applications of A.
For the classical comparison: Monte Carlo estimating p to accuracy \varepsilon needs \Theta(p(1-p)/\varepsilon^2) = O(p/\varepsilon^2) samples for small p. Ratio:

\frac{\text{classical}}{\text{quantum}} = \frac{p/\varepsilon^2}{\sqrt p/\varepsilon} = \frac{\sqrt p}{\varepsilon}.
For p = 10^{-3} and \varepsilon = 10^{-4}, this ratio is \sqrt{10^{-3}}/10^{-4} \approx 316. Amplitude estimation is \sim 300\times cheaper in this regime, in A-applications. Whether the cheaper cost translates to cheaper wall-clock time depends on whether A can be implemented efficiently on a real quantum computer — which is where engineering meets the algorithm, and the subject of the applications section below.
Why the ratio improves as p shrinks: classical Monte Carlo wastes most of its samples on the bad subspace, using 1/p samples just to land on one good outcome before even starting to measure the probability. Quantum amplitude estimation extracts the angle \theta of the full state without ever having to land in the good subspace first. The smaller p gets, the more classical Monte Carlo suffers and the bigger the quantum advantage.
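The ratio is easy to tabulate. A back-of-envelope sketch, dropping constant factors exactly as the text does:

```python
import numpy as np

def advantage(p: float, eps: float) -> float:
    """Classical-to-quantum query ratio, constants dropped:
    classical ~ p / eps^2 samples, quantum ~ sqrt(p) / eps queries."""
    classical = p / eps ** 2
    quantum = np.sqrt(p) / eps
    return classical / quantum   # = sqrt(p) / eps

# The worked number from the text: p = 1e-3, eps = 1e-4 gives ~316x.
print(round(advantage(1e-3, 1e-4)))   # 316
```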
Worked example 1: estimating p = 1/4 with t = 4 ancillas
Take a small, concrete case where you can track every number.
Example 1: $A$ prepares a state with $p = 1/4$; estimate with $t = 4$
Setup. Suppose A prepares

|\Psi\rangle = \frac{\sqrt 3}{2}\,|\text{bad}\rangle + \frac{1}{2}\,|\text{good}\rangle,

so p = |1/2|^2 = 1/4 and you want to estimate it. Use t = 4 ancilla qubits.

Step 1: compute the true \theta.

\theta = \arcsin\sqrt p = \arcsin\tfrac{1}{2} = \frac{\pi}{6} \approx 0.5236.
Step 2: compute what QPE should read out. QPE on Q reads the phase \varphi = 2\theta/(2\pi) = \theta/\pi = 1/6 \approx 0.1667. With t = 4 ancillas, the measured integer y concentrates near 2^t \cdot \varphi = 16 \cdot (1/6) = 16/6 \approx 2.67.
So y will be 2 or 3 most of the time (because the true value 2.67 sits between them). The probability of reading y = 3 is higher than y = 2 because 2.67 is closer to 3.
Step 3: compute \tilde p from each plausible outcome.
| y | \tilde\theta = \pi y/2^t | \tilde p = \sin^2\tilde\theta |
|---|---|---|
| 2 | \pi \cdot 2/16 \approx 0.3927 | \sin^2(0.3927) \approx 0.146 |
| 3 | \pi \cdot 3/16 \approx 0.5890 | \sin^2(0.5890) \approx 0.308 |
| 13 | \pi \cdot 13/16 \approx 2.552 | \sin^2(2.552) \approx 0.308 |
| 14 | \pi \cdot 14/16 \approx 2.749 | \sin^2(2.749) \approx 0.146 |
Why y = 13 and y = 14 also appear: the eigenstate |\psi_-\rangle of Q has eigenvalue e^{-2i\theta}, which QPE reads as the phase 1 - \varphi = 1 - 1/6 = 5/6 \approx 0.8333, so y = 16 \cdot 5/6 \approx 13.33. Half the time the measurement lands near y = 3 (from the +2\theta eigenstate); half the time near y = 13 (from the -2\theta eigenstate). Both give the same \tilde p because \sin^2(\pi - x) = \sin^2(x).
Step 4: compare with the truth. The true p = 0.25. The two most likely \tilde p values are 0.146 and 0.308. The error is \sim 0.06 in the worse case — roughly 1/2^t = 1/16, consistent with the theoretical bound. With just t = 4 ancillas and \sim 16 queries to A, you have pinned p to within \pm 0.1 or so, from a single run of the circuit.
Check the scaling. Classical Monte Carlo estimating p = 0.25 to accuracy \pm 0.1 would need roughly p(1-p)/\varepsilon^2 = 0.1875/0.01 \approx 19 samples. In this tiny regime the advantage is marginal (19 vs 16 — a wash). The quadratic advantage only emerges for much tighter accuracies; the next example spells that out.
Result. A t = 4 amplitude-estimation circuit on p = 1/4 produces \tilde p \in \{0.146, 0.308\} with high probability, recovering p to within about \pm 0.06 using \sim 16 queries to the state preparation A. The true p sits right between the two peaks, as the geometry predicts.
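The two-peak distribution in the table above can be reproduced from the standard QPE outcome formula P(y) = |\sin(\pi 2^t d)/(2^t \sin(\pi d))|^2 with d = \varphi - y/2^t, averaged over the two eigenphases. A simulation sketch (qpe_prob is a helper defined here, not library code):

```python
import numpy as np

# Ideal QPE outcome distribution for example 1: p = 1/4, t = 4.
t = 4
N = 2 ** t
theta = np.arcsin(np.sqrt(0.25))   # pi/6

def qpe_prob(y: int, phi: float) -> float:
    """Probability of readout y for a single eigenphase phi."""
    d = phi - y / N
    s = np.sin(np.pi * d)
    if np.isclose(s, 0.0):
        return 1.0   # exact phase: readout is deterministic
    return float((np.sin(np.pi * N * d) / (N * s)) ** 2)

# |Psi> is an equal mixture of the eigenphases 1/6 and 1 - 1/6 = 5/6.
phis = [theta / np.pi, 1 - theta / np.pi]
probs = np.array([0.5 * (qpe_prob(y, phis[0]) + qpe_prob(y, phis[1]))
                  for y in range(N)])
assert np.isclose(probs.sum(), 1.0)

# The four most likely outcomes are y = 2, 3, 13, 14, peaked at 3 and 13.
top = np.argsort(probs)[::-1][:4]
print(sorted(top), probs[3], probs[13])
```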
Worked example 2: Monte Carlo versus amplitude estimation at \varepsilon = 10^{-3}
Now the comparison that matters for applications.
Example 2: estimating an integral to $0.1\%$ accuracy
Setup. Suppose you have a probability distribution \pi over some space (say, future prices of a stock), and you want to estimate the probability p = \Pr_\pi[\text{price ends above a strike}], which is the classical problem of option pricing. Suppose the true p \approx 0.3 and you want the estimate to be within \varepsilon = 10^{-3} (i.e., \pm 0.001, or 0.1\% of the range).
Classical Monte Carlo cost. Number of samples needed:

m \approx \frac{p(1-p)}{\varepsilon^2} = \frac{0.3 \times 0.7}{(10^{-3})^2} = 210\,000.
So you need to draw about two hundred thousand sample paths, evaluate each, and average.
Quantum amplitude estimation cost. The quantum version needs 2^t \approx 2\pi\sqrt p/\varepsilon iterations of Q:

2^t \approx \frac{2\pi\sqrt{0.3}}{10^{-3}} \approx 3\,440.
So t \approx \log_2 3\,440 \approx 12 ancilla qubits suffice, and the number of A-applications is roughly 2 \cdot 3\,440 \approx 6\,900 (each Q uses A and A^\dagger once).
Comparison.
| Metric | Classical Monte Carlo | Quantum amplitude estimation |
|---|---|---|
| Applications of A | \sim 210\,000 | \sim 6\,900 |
| Ratio | 1 | \sim 30\times fewer |
Amplitude estimation needs roughly 30 times fewer applications of A to reach the same accuracy. For \varepsilon = 10^{-4} the ratio grows to \sim 300\times; for \varepsilon = 10^{-5}, \sim 3\,000\times. The quadratic improvement compounds as accuracy tightens.
Why the ratio is \sqrt p / \varepsilon: classical scales as p/\varepsilon^2, quantum as \sqrt p / \varepsilon. Dividing, p/\varepsilon^2 \div \sqrt p / \varepsilon = \sqrt p / \varepsilon. So for fixed p, halving \varepsilon doubles the quantum-to-classical advantage. This is the "quadratic" nature of the speedup: quantum costs scale with the square root of the classical cost in the \varepsilon \to 0 regime.
What this means in practice. A quantum computer that can implement A — the state preparation encoding the price-path distribution — and run a t = 12 ancilla amplitude-estimation circuit, can estimate the probability to 0.1\% using \sim 7\,000 quantum queries. A classical computer doing the same task needs \sim 210\,000 Monte Carlo samples. If each quantum query takes the same time as a classical sample (a big "if" — real quantum hardware has much higher per-operation latency today), amplitude estimation wins by a factor of 30. For \varepsilon = 10^{-5}, the factor grows to \sim 3\,000, which is large enough to eat significant constant-factor overhead and still win.
Caveat. This comparison ignores the cost of building A. If A is a complicated quantum circuit that loads a classical distribution into a quantum state, that cost is non-trivial and is sometimes the bottleneck. The remainder of this chapter's applications section discusses where amplitude estimation's theoretical speedup actually translates to practice.
Result. For 0.1\% accuracy estimates of a probability p \sim 0.3, amplitude estimation needs \sim 7\,000 quantum queries versus \sim 210\,000 classical Monte Carlo samples — a 30\times reduction in queries. The ratio grows with tighter accuracy targets, as \sqrt p / \varepsilon.
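The arithmetic in example 2 can be reproduced directly:

```python
import numpy as np

# Cost estimates for example 2: p ~ 0.3, eps = 1e-3.
p, eps = 0.3, 1e-3

mc_samples = p * (1 - p) / eps ** 2           # classical Monte Carlo
grover_iters = 2 * np.pi * np.sqrt(p) / eps   # ~2^t iterations of Q
t = int(np.ceil(np.log2(grover_iters)))       # ancilla count
a_calls = 2 * grover_iters                    # each Q uses A and A^dagger once

# Roughly 210,000 samples vs ~6,900 A-applications with t = 12 ancillas.
print(int(mc_samples), int(grover_iters), t, int(a_calls))
```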
Where amplitude estimation shows up
Four settings are particularly important.
1. Option pricing in finance. An option is a contract whose payoff depends on the future price of some asset — a stock, a currency, a commodity. Pricing the option under a stochastic model (say, geometric Brownian motion) reduces to estimating the expectation \mathbb{E}[\text{payoff}] over the model distribution, and when the payoff is an indicator function (e.g., "price ends above strike"), this is a pure probability estimation. Stamatopoulos et al. (arXiv:1905.02666) work out explicit quantum circuits for European and Asian option pricing using amplitude estimation. For realistic options and accuracies, the classical cost is \sim 10^6 to 10^8 samples; amplitude estimation brings this down to \sim 10^3 to 10^4 queries. The catch: constructing the quantum state that encodes the stochastic model is itself expensive, sometimes needing quantum-generative techniques or amplitude encoding of classical data.
2. Quantum chemistry — expectation values. Computing \langle\psi|H|\psi\rangle for a molecular Hamiltonian H is the core of the variational quantum eigensolver (VQE) used for finding molecular ground-state energies. VQE classically estimates the expectation via repeated sampling; amplitude estimation replaces this with a quadratic-speedup subroutine, reducing the number of circuit repetitions required for a given accuracy. Knill-Ortiz-Somma (2007) and subsequent work have developed this into a building block for quantum chemistry simulators.
3. Bayesian inference. Estimating a posterior probability p(\text{hypothesis} | \text{data}) reduces (in many formulations) to estimating an amplitude, which amplitude estimation can speed up quadratically. This connects to quantum algorithms for Bayesian networks and probabilistic inference, where the classical sampling cost dominates and amplitude estimation offers a principled improvement.
4. Mean estimation of a random variable. Given a distribution and a function f on its support, estimate \mathbb{E}[f]. Montanaro's 2015 algorithm wraps amplitude estimation around a careful state-preparation scheme to achieve the quadratic speedup for any bounded random variable, not just probabilities. This is the cleanest statement of quantum's Monte Carlo advantage: whenever you have a random quantity whose mean you want, quantum can learn it to accuracy \varepsilon with O(1/\varepsilon) quantum queries instead of O(1/\varepsilon^2) classical samples.
Where amplitude estimation does not help:
- Problems that require structured classical data. If the mean you want is an average over a database the quantum computer does not already have access to, you first have to load the data — and the loading cost can erase the quadratic advantage. This is the infamous QRAM bottleneck.
- Small-N Monte Carlo. If you only need 100 samples anyway, the constant-factor overhead of amplitude estimation (including the QFT at the end) can exceed the classical cost.
- Non-bounded random variables. Amplitude estimation handles probabilities and bounded expectations. For variables with heavy tails, the quadratic advantage is either lost or needs careful truncation.
Common confusions
- "Amplitude estimation is just faster Monte Carlo." Technically yes, but the constants are large and the engineering is hard. You need a fault-tolerant quantum computer (to run deep QFT circuits without decoherence), an efficient implementation of A, and a way to load classical data into the state. In the near-term-hardware regime, the quadratic speedup is largely theoretical; it becomes real on full-scale quantum computers.
- "It gives the exact p." No. The output is a quantised estimate \tilde p = \sin^2(\pi y/2^t) for some integer y. The resolution is \sim 1/2^t; the accuracy of \tilde p relative to p is O(\sqrt p/2^t). For t in the single digits you get a coarse estimate; for t = 20 you get six-decimal accuracy. This is a finite-precision estimator, not an oracle that returns the exact probability.
- "Amplitude estimation doesn't need fault-tolerant hardware." For practical accuracy targets, it likely does. The QPE ladder on Q requires coherent application of Q^{2^{t-1}}, a very deep circuit. On NISQ hardware, decoherence corrupts the phase measurement long before t is large enough to beat classical. Active research on iterative amplitude estimation (see below) trades depth for an adaptive approach, achieving the speedup with shorter individual circuits but more of them.
- "Amplitude estimation always wins over Monte Carlo." It wins asymptotically, for sufficiently tight \varepsilon and assuming equal per-query cost. In practice, the constants (QFT overhead, QRAM data loading, fault-tolerant compilation) can make quantum slower on problems where classical Monte Carlo is already cheap. The clearest wins are in regimes where classical needs 10^8+ samples to achieve the required accuracy.
- "The two outcomes y and 2^t - y are a bug." No — they are a feature of the eigenstructure. The input |\Psi\rangle is an equal superposition of Q's two eigenstates, so the measurement yields both phases with equal probability. Both give the same \tilde p via \sin^2\tilde\theta; the two-peak measurement distribution is the correct behaviour.
- "You need to know p to choose t." For worst-case analysis yes — the bound |\tilde p - p| = O(\sqrt p/2^t) uses p on the right-hand side. In practice you pick t based on the target accuracy \varepsilon, using a rough upper bound p \le 1 (giving \varepsilon \ge 1/2^t). For finer control, pilot runs or iterative variants (see Going deeper) estimate p progressively and set t adaptively.
Going deeper
If you came here to understand what amplitude estimation does — run QPE on the amplification iteration Q to read the angle \theta, compute p = \sin^2\theta, enjoy the quadratic Monte Carlo speedup — you have it. The rest of this section states the BHMT theorem precisely, sketches the iterative amplitude estimation variant that avoids deep QFT circuits, discusses applications in Indian finance and their concrete circuit costs, and explains why amplitude estimation sits at the theoretical heart of quantum Monte Carlo.
The BHMT theorem, formally
Brassard, Høyer, Mosca, and Tapp (arXiv:quant-ph/0005055) proved the main quantitative result.
Theorem (BHMT 2002, informal). Let A be a quantum algorithm producing |\Psi\rangle = A|0\rangle with good-subspace probability p. For any integer t \ge 1, there is a quantum circuit using 2^t applications of A and A^\dagger, plus one inverse QFT on t qubits, that outputs an estimate \tilde p satisfying

|\tilde p - p| \le \frac{2\pi\sqrt{p(1-p)}}{2^t} + \frac{\pi^2}{4^t}
with probability at least 8/\pi^2 \approx 0.81.
The first term is the dominant one for large t and small p, giving the O(\sqrt p/2^t) scaling. The second term is a correction from the discreteness of the measurement.
To achieve a target additive accuracy \varepsilon with high probability, one picks t = \lceil \log_2(2\pi\sqrt p/\varepsilon) \rceil + O(1), giving O(\sqrt p/\varepsilon) total queries. Repeating the entire estimation \log(1/\delta) times and taking the median boosts the probability of success from 8/\pi^2 to 1 - \delta at logarithmic extra cost — the standard technique for reliability amplification.
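The median trick can be sanity-checked with a quick simulation. The model below is an assumption for illustration: each run independently "succeeds" (lands within the error bound) with probability 8/\pi^2, and the median fails only when at least half the runs fail:

```python
import numpy as np

# Reliability amplification by taking the median of r independent runs.
rng = np.random.default_rng(0)
p_single = 8 / np.pi ** 2   # ~0.81 per-run success probability (BHMT bound)

def median_failure_rate(r: int, trials: int = 200_000) -> float:
    """Empirical probability that more than half of r runs fail."""
    fails = rng.random((trials, r)) > p_single
    return float(np.mean(fails.sum(axis=1) > r // 2))

# Failure rate decays exponentially in r (Chernoff bound).
print(median_failure_rate(1), median_failure_rate(5), median_failure_rate(15))
```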
Iterative amplitude estimation
The catch with full-blown amplitude estimation is that the 2^t sequential applications of Q must run coherently within a single very deep circuit. On any realistic hardware, decoherence corrupts the phase estimation long before t is large enough to beat classical Monte Carlo.
Iterative amplitude estimation (Grinko, Gacon, Zoufal, Woerner, 2021, arXiv:1912.05559) is a variant that reaches the same quadratic speedup using shorter individual circuits. The idea: instead of running phase estimation with one big t, run a sequence of amplitude amplification rounds at carefully chosen depths k_1 < k_2 < k_3 < \ldots, each producing a binary measurement outcome; combine the outcomes via a classical post-processing rule that progressively narrows an interval around \theta.
The overall query count is still O(1/\varepsilon) (up to log factors), but the deepest single circuit can be kept substantially shallower than the full QPE ladder, and depth-limited variants interpolate between the classical 1/\varepsilon^2 and fully quantum 1/\varepsilon scalings. For realistic noisy hardware, iterative amplitude estimation is the more practical variant, and it is what most current experimental demonstrations use.
Comparison with classical variance reduction
Classical Monte Carlo has its own tricks: importance sampling, control variates, stratified sampling, and quasi-Monte Carlo. These techniques reduce the variance of the estimator, sometimes dramatically. A well-tuned importance sampler can reduce the sample complexity by orders of magnitude below the naive 1/\varepsilon^2.
Does quantum amplitude estimation still win after classical optimisations? The answer depends on the problem. For problems where the classical variance reduction itself reduces the effective p (e.g., importance sampling makes the "good" events common), the gap between quantum and classical narrows. For problems where the rare-event nature is intrinsic (e.g., tail risk in finance, certain condensed-matter expectation values), the quantum advantage survives variance reduction, because quantum estimates the actual distribution's p, not a reweighted version.
A fair comparison always requires classical-best-vs-quantum, not classical-naive-vs-quantum. This is an active area: Montanaro and Pallister (arXiv:1607.01083) give a careful analysis showing the quadratic speedup survives for a broad class of problems, while noting regimes where classical variance reduction closes the gap.
Amplitude estimation in quantum chemistry
For molecular Hamiltonian H = \sum_j c_j P_j (a linear combination of Pauli strings P_j), computing \langle\psi|H|\psi\rangle means measuring each \langle\psi|P_j|\psi\rangle and linearly combining. Each Pauli expectation is a quantity in [-1, 1], which with a simple encoding becomes an amplitude to be estimated.
The classical VQE approach: measure \langle P_j\rangle with O(1/\varepsilon_j^2) shots per term, total O(N/\varepsilon^2) shots across N terms. The amplitude-estimation approach: O(1/\varepsilon_j) per term, total O(N/\varepsilon) queries. For precision-demanding chemistry (where \varepsilon \sim 10^{-4} "chemical accuracy" is standard) this is a very large quadratic saving in principle, though again gated by the depth of A and the overhead of fault-tolerance.
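In rough numbers (N and \varepsilon below are assumed illustrative values, not taken from a specific molecule):

```python
# Back-of-envelope shot counts for an N-term Hamiltonian at chemical accuracy.
N = 1000          # Pauli terms in the decomposition (small-molecule scale)
eps = 1e-4        # per-term accuracy target ("chemical accuracy" regime)

vqe_shots = N / eps ** 2    # classical sampling: O(N / eps^2)
ae_queries = N / eps        # amplitude estimation: O(N / eps)
print(f"{vqe_shots:.0e} shots classically vs {ae_queries:.0e} quantum queries")
```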
Indian context: BSE, NSE, and the fintech quantum programme
Monte Carlo simulation is a daily workhorse of Indian financial infrastructure. BSE (formerly the Bombay Stock Exchange) and NSE (the National Stock Exchange) process tens of millions of transactions per day, and their derivative markets — equity futures, index options, currency options — rely on risk models that estimate tail probabilities, value-at-risk, and expected shortfall through extensive Monte Carlo. A typical daily risk computation at a large market-maker can cost 10^8 or 10^9 sampled paths.
Quantum amplitude estimation would — if a fault-tolerant machine existed — reduce this to 10^4 to 10^5 queries. Indian academic-industry research collaborations at IIT Delhi, IIT Bombay, ISI Kolkata, and fintech-focused centres are exploring this transition. RBI (Reserve Bank of India) has working groups on quantum-resilient and quantum-enabled finance, with focus on both the offensive (post-quantum cryptography for settlement systems) and the speedup (quantum Monte Carlo for risk) sides. The National Quantum Mission earmarks financial modelling as one of its application domains.
The timeline matters. Today's Indian quantum hardware — at TIFR, IIT Madras, IISc — is at the tens-of-qubits, noisy-hardware regime. Practical amplitude estimation at the accuracy finance needs probably wants at least t = 15 and deep Q ladders, which in turn wants thousands of fault-tolerant logical qubits, which in turn wants millions of physical qubits with current error rates. Commercial quantum-advantage for option pricing on Indian exchanges is likely a 2035-plus proposition, not a near-term one. But the theoretical foundations are settled, the circuits are understood, and the prep work — algorithm design, quantum-classical interfaces, workforce training — is happening now.
The maximum-likelihood estimator and its optimality
A more refined analysis of amplitude estimation uses the maximum-likelihood estimator given the measurement outcomes. For a single QPE run with outcome y, the likelihood L(\theta | y) is a known function peaked near the true \theta; maximising it gives the MLE \hat\theta, which is asymptotically efficient in the sense that its variance achieves the Cramér-Rao lower bound.
Suzuki-Uno-Raymond et al. (arXiv:1904.10246) develop maximum-likelihood amplitude estimation further, using the MLE across multiple QPE runs (at different t's) to extract more information per query. This achieves near-optimal scaling with tighter constants than the textbook QPE-based estimator, and it is the algorithm of choice for near-term demonstrations where every query counts.
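A toy version of the likelihood idea can be simulated classically. This is an illustrative sketch, not the authors' exact algorithm: the Grover-power schedule and shot count below are assumptions, and the quantum measurement after k Grover iterations is replaced by its known outcome distribution, a Bernoulli with success probability \sin^2((2k+1)\theta):

```python
import numpy as np

# Toy maximum-likelihood amplitude estimation for p = 1/4.
rng = np.random.default_rng(1)
p_true = 0.25
theta_true = np.arcsin(np.sqrt(p_true))

schedule = [0, 1, 2, 4, 8]   # Grover powers k (an assumed schedule)
shots = 100
hits = [rng.binomial(shots, np.sin((2 * k + 1) * theta_true) ** 2)
        for k in schedule]

# Grid-search maximum likelihood over theta in (0, pi/2).
grid = np.linspace(1e-4, np.pi / 2 - 1e-4, 20_000)
loglik = np.zeros_like(grid)
for k, h in zip(schedule, hits):
    q = np.clip(np.sin((2 * k + 1) * grid) ** 2, 1e-12, 1 - 1e-12)
    loglik += h * np.log(q) + (shots - h) * np.log(1 - q)

p_hat = float(np.sin(grid[np.argmax(loglik)]) ** 2)
print(p_hat)
assert abs(p_hat - p_true) < 0.05   # far tighter than 100 plain samples give
```

The higher Grover powers make the likelihood sharply peaked in \theta, which is where the extra information per query comes from.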
Why amplitude estimation is asymptotically optimal
Just like amplitude amplification saturates the BBBV lower bound for the amplification problem, amplitude estimation saturates the Heisenberg limit for phase estimation: any algorithm with black-box access to A and S_\Pi needs \Omega(1/\varepsilon) queries to estimate p to accuracy \varepsilon, for any \varepsilon much smaller than \sqrt p. This is proved via information-theoretic arguments going back to Nayak and Wu (1999) and refined through subsequent work.
The Heisenberg bound is the fundamental quantum speed limit for parameter estimation: to learn a continuous parameter with error \varepsilon requires \Omega(1/\varepsilon) queries, no matter how clever your quantum algorithm. Classical statistics gives 1/\varepsilon^2. The factor-of-two-in-the-exponent gap is exactly the quantum advantage, and it is tight — amplitude estimation is as good as it can be, up to constants.
Where this leads next
- Amplitude amplification — the sister algorithm. Amplification boosts p; estimation measures it.
- Quantum phase estimation — the workhorse subroutine amplitude estimation rides on top of.
- Quantum Monte Carlo — the broader framework of quantum speedups for sampling and integration.
- Option pricing on quantum computers — the finance application in concrete circuit detail.
- Mean estimation with quantum queries — Montanaro's generalisation to arbitrary bounded random variables.
References
- Brassard, Høyer, Mosca, Tapp, Quantum Amplitude Amplification and Estimation (2000/2002) — arXiv:quant-ph/0005055. The original framework paper, defining both amplification and estimation.
- Ashley Montanaro, Quantum speedup of Monte Carlo methods (2015) — arXiv:1504.06987. The general Monte Carlo quadratic speedup theorem.
- Wikipedia, Amplitude estimation — a compact summary.
- John Preskill, Lecture Notes on Quantum Computation, Chapter 6 — theory.caltech.edu/~preskill/ph229. The clearest textbook treatment.
- Stamatopoulos et al., Option Pricing using Quantum Computers (2019) — arXiv:1905.02666. Explicit circuits for European and Asian options.
- Grinko, Gacon, Zoufal, Woerner, Iterative Quantum Amplitude Estimation (2019) — arXiv:1912.05559. The shorter-circuit variant practical on noisy hardware.