In short

Factoring a 2048-bit RSA modulus on a fault-tolerant quantum computer takes, by the current best public estimate (Gidney-Ekerå 2021, arXiv:1905.09749), roughly 20 million physical qubits running for 8 hours. That number is built from four layers: \sim 8000 logical qubits for Shor's circuit, \sim 10^{10} T gates after windowed-arithmetic optimisation, a surface-code overhead of \sim 2400 physical qubits per logical qubit at code distance d \approx 25 (including routing), and magic-state distillation factories that consume \sim 20 physical qubit-rounds per T gate. Today's leading hardware — IBM Condor at 1121 qubits, Google Willow at 105 qubits demonstrating below-threshold error correction, IonQ Tempo at 64 qubits — ships \sim 10^3 physical qubits without full error correction. The gap is four orders of magnitude in qubit count plus two orders of magnitude in error rate. Mosca's rule says: if your secret needs to stay secret for x years, migration takes y years, and Q-day arrives in z years, you are safe only if x + y < z. With z \approx 15-25 years plausible and y \approx 5-10 for migration, any secret with x > 5 is already at risk today. India's NQM-aligned crypto-agility mandate (CERT-In 2024) targets 2030 for substantial migration — ahead of the quantum threat and in time for harvest-now-decrypt-later secrets.

Here is the most useful single number in all of quantum cryptanalysis. Twenty million physical qubits, running for eight hours, factors a 2048-bit RSA modulus. That is Gidney and Ekerå's 2021 estimate, and it is the one a Chief Information Security Officer at the State Bank of India, a policy analyst at MeitY, or a Member of Parliament asking about the Digital India crypto roadmap can actually hold on to. It replaces the academic shrug of "Shor's algorithm runs in polynomial time" with a concrete quantity you can compare against the best quantum hardware anyone has ever built — which, as of this chapter, is about a thousand physical qubits without error correction.

The gap between those two numbers — 2 \times 10^7 and 10^3 — is four orders of magnitude. That gap is where the entire engineering story of quantum cryptanalysis lives. Closing it is a research programme that spans superconducting fabrication, cryogenics, control electronics, surface-code error correction, and magic-state distillation. Every one of those has a scaling curve, and the product of those curves sets the date that RSA-2048 breaks operationally. The current consensus among specialists is that date lands somewhere between 2035 and 2050; a minority argue it is never.

This chapter builds the 20-million-qubit number from scratch — every layer, every factor, every conservative assumption. You will end up with a clear picture of what separates today's hardware from cryptanalytically-useful Shor, how much of that separation is a counting exercise versus a physics breakthrough, and what policy actions follow from the estimate even while the hardware does not yet exist. The India-specific migration calendar sits at the end, grounded in the same numbers.

The resource pyramid

The path from "polynomial-time algorithm" to "millions of physical qubits" is a tower of overheads, each multiplying the previous. The key diagram of the whole chapter:

[Figure: the resource pyramid, from polynomial-time algorithm to 20 million qubits. Five tiers, top to bottom: abstract Shor circuit, O(n³) gates, "polynomial time"; concrete logical circuit, ~8000 logical qubits and ~10¹⁰ T gates (Beauregard 2003 + windowed arithmetic); surface-code encoding at d ≈ 25, ~2400 physical qubits per logical; magic-state distillation factories, another ~10⁶ physical qubits; total ~2×10⁷ physical qubits and 8 hours runtime (Gidney-Ekerå 2021). Each transition is annotated with its multiplication factor.]
Four multiplicative overheads sit between "polynomial-time algorithm" and "physical hardware." Each tier is built on real physics and engineering: reversible arithmetic forces the logical-qubit count above $n$; surface-code error correction multiplies each logical qubit by $\sim 2400$ physical (a distance-25 patch plus routing); magic-state distillation adds another $\sim 10^6$ physical qubits on top. The product is $\sim 2\times 10^7$.

The rest of this chapter walks down the pyramid, one layer at a time.
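The pyramid is just four multiplications. A minimal sketch reproducing the headline figure from the chapter's rounded layer numbers (the ~3.5×10⁵ T/s effective rate is the pipelined figure implied by the 8-hour runtime, not a value from the paper):

```python
# Rebuild the 20-million-qubit estimate from the pyramid's layers.
logical_qubits = 8_000        # Layer 1: peak logical count at n = 2048
t_gates = 1e10                # Layer 2: rule-of-thumb T-gate budget
phys_per_logical = 2_400      # Layer 3: d = 25 surface code incl. routing
factory_qubits = 1e6          # Layer 4: ten distillation factories

encoding = logical_qubits * phys_per_logical   # surface-code layer, ~1.9e7
total = encoding + factory_qubits              # ~2.0e7 physical qubits
runtime_h = t_gates / 3.5e5 / 3600             # ~8 h at ~3.5e5 clean |T>/s

print(f"total: {total:.1e} physical qubits, runtime ~{runtime_h:.0f} h")
```

Every later section of the chapter justifies one of these four input numbers.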

Layer 1: the logical-qubit count

Shor's algorithm for an n-bit modulus N needs three logical registers (see ch. 77): an ancilla register of 2n + 1 qubits for phase estimation, a data register of n qubits that the modular-multiplication unitary acts on, and a scratch register for reversible modular arithmetic.

For n = 2048, the ancilla contributes 4097 qubits and the data contributes 2048. The scratch is where most of the design effort has gone. A naive reversible multiplication circuit needs \Omega(n^2) scratch qubits to undo intermediate computations; Beauregard's 2003 construction (arXiv:quant-ph/0205095) brings it down to 2n + 3; later refinements (Gidney 2019-2021) shave further with windowed arithmetic — the idea of performing w bits of multiplication in one shot using a classically-precomputed lookup table, at the cost of extra ancillas to index the table.

Gidney-Ekerå's final count, using windowed arithmetic with w \approx 5:

\text{logical qubits} \;\approx\; 3n + 0.002 \, n^2 \big/ w \;\approx\; 6150 + 1700 \;\approx\; 7800 \text{ at } n = 2048.

(Their exact count is 7853 in the paper's Table 2.)

Why roughly three times n: each register is \sim n qubits. Ancilla at 2n, data at n, scratch at 2n+3 plus windowing tables — but the scratch is reused rather than held for the whole run, so the peak concurrent count, pruned carefully, lands near 3-4n, with a slowly growing quadratic correction that windowing turns into an engineering knob.

Example 1: logical-qubit count for RSA-2048

Setup. RSA-2048 means N is a 2048-bit integer, so n = 2048.

Step 1. Ancilla (phase-estimation register). Size t = 2n + 1 = 4097. Why 2n+1: the phase estimation must resolve fractions with denominator up to N, so 2^{-t} \lesssim 1/N^2, giving t \ge 2\log_2 N. The "+1" is a small over-provisioning to push the success probability above 1/2.

Step 2. Data register. Size n = 2048. Holds |y\rangle, the state U_a acts on.

Step 3. Scratch for modular multiplication. Beauregard's reversible circuit uses 2n + 3 = 4099 scratch qubits during a single controlled-multiplication, released between gates. Not a separate permanent allocation — it is reused across the t controlled exponentiations. Why 2n+3 and not just n: reversible arithmetic cannot overwrite inputs. Every intermediate product must live on a fresh wire or be uncomputed with an inverse subroutine. Beauregard's specific design uses QFT-based adders with 2n+3 carrying qubits as the tight bound.

Step 4. Windowing lookup tables. With window size w = 5, the precomputed table of 2^w = 32 values of a^{kw} \bmod N costs \sim 32 \cdot n \approx 65000 qubit-uses of storage, but — because these are classical precomputations fed in as controlled-copy operations — they do not blow up the register count. The table contributes roughly O(n) extra qubits for addressing, totalling a few hundred more.

Step 5. Sum. Ancilla + data + scratch = 4097 + 2048 + 4099 \approx 10244 logical qubits if every register were held simultaneously. Because the scratch is uncomputed and reused across steps, the peak concurrent allocation is lower: Gidney-Ekerå report 7853 at the algorithm's busiest moment. Round to \sim 8000 logical qubits.

Result. RSA-2048 requires \sim 8000 logical qubits — a useful number to carry around. For comparison, the corresponding RSA-1024 circuit needs \sim 4000; RSA-4096 needs \sim 16000.
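The register accounting above collapses into a one-line scaling function. A sketch using the chapter's own fit (3n + 0.002 n²/w), not Gidney-Ekerå's exact tables:

```python
def shor_logical_qubits(n: int, w: int = 5) -> int:
    """Peak concurrent logical qubits for Shor on an n-bit modulus,
    using the chapter's windowed-arithmetic fit: ~3n + 0.002 n^2 / w."""
    return round(3 * n + 0.002 * n * n / w)

# Register breakdown at n = 2048 (Example 1's numbers):
n = 2048
ancilla = 2 * n + 1        # phase-estimation register: 4097
data = n                   # holds |y>: 2048
scratch = 2 * n + 3        # Beauregard reversible-arithmetic workspace: 4099
peak = shor_logical_qubits(n)   # ~7800 (the paper's exact figure: 7853)
```

The fit reproduces the ~8000 figure at n = 2048; for other moduli it is a scaling guide, not a circuit-level count.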

[Figure: horizontal stacked bar of the peak logical-qubit allocation at n = 2048 — ancilla (phase estimation) 4097, data register 2048, modular-multiplication scratch 4099 (hatched: reused per controlled-U), windowing-table addressing ~200; peak concurrent ~7853 logical qubits (Gidney-Ekerå 2021).]
The three registers and the scratch overhead. Ancilla and data are permanently allocated; scratch is reused across the $t = 4097$ controlled-exponentiation steps, so the peak concurrent count is $\sim 8000$ rather than $\sim 10000$. Windowing adds a small addressing overhead that trades qubits for T-gate savings.

What this shows. Eight thousand logical qubits is a lot — but it is not an unimaginable number. It is the kind of count that a near-future fault-tolerant machine might reach. The real wall is not the logical-qubit count; it is the physical-qubit cost per logical qubit, which the next layer multiplies in.

Layer 2: the T-gate count

Every logical gate in Shor's algorithm eventually has to be expressed in the Clifford + T gate set — the standard fault-tolerant universal set. Among these, the T gate (a \pi/4 Z-rotation) is overwhelmingly the expensive one, because it cannot be implemented "transversally" on the surface code and requires an off-line distillation process (Layer 4 below).

The T-gate budget therefore dominates the fault-tolerant cost. Counting Ts is how you measure the real difficulty of a quantum algorithm.

Where the T gates come from

Modular exponentiation is the source of almost all T gates in Shor:

The exponentiation ladder runs t \approx 2n controlled multiplications; each multiplication is \sim n controlled modular additions; each n-bit addition needs \sim n Toffoli gates; and each Toffoli decomposes into 7 T gates. So: \text{T gates} \sim 7 \cdot O(n^3). For n = 2048 without optimisation: \sim 6 \times 10^{10} T gates.

Windowed arithmetic (Gidney 2019-2021) pushes this down substantially. With window size w, you can perform w bits of multiplication in one lookup plus a controlled addition, saving a factor of w in T-gate count at the cost of some ancillas and a classical precomputation. Gidney-Ekerå optimise w end-to-end and report

\text{T gates} \;\approx\; 0.3 \cdot n^3 \;\sim\; 3 \times 10^{9} \text{ at } n = 2048.

Their Table 3 shows 2.7 \times 10^9 T gates for a full factoring run, achieved through layered windowing in both multiplication and exponentiation ladders.

Round to \sim 10^{10} T gates as the rule-of-thumb figure, because later fault-tolerance overheads add back factors the windowing saved.
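A sketch of the two T-count fits used above — the unoptimised 7n³ and the windowed 0.3n³. Both are the chapter's rounded models, not exact circuit counts:

```python
def t_count_naive(n: int) -> int:
    # ~n^3 Toffolis in the exponentiation ladder, 7 T gates per Toffoli
    return 7 * n**3

def t_count_windowed(n: int) -> float:
    # Gidney-Ekerå's windowed-arithmetic fit (~0.3 n^3)
    return 0.3 * n**3

n = 2048
naive = t_count_naive(n)        # ~6e10
windowed = t_count_windowed(n)  # ~2.6e9, matching the reported 2.7e9
```

Windowing buys roughly a 20× constant-factor saving here; the cubic exponent is untouched.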

Why the T count is what matters for fault tolerance: Clifford gates (Hadamard, CNOT, S, X, Z) have cheap fault-tolerant implementations on the surface code — CNOT, for instance, is transversal: you apply it independently to every physical qubit in the logical patch. No code can implement a universal gate set transversally (Eastin-Knill theorem), and for the surface code the T gate is the one left out; it needs magic states, which must be distilled off-line. Distilling one magic state to the error rate Shor demands costs \sim 20 physical qubit-rounds, which is what pushes the physical-qubit pyramid into the millions.

Historical progression

Year | Source | Logical qubits | T gates (n = 2048) | Physical qubits
1994 | Shor | polynomial | polynomial | not estimated
2002 | Beauregard | \sim 2n | \sim n^3 \approx 10^{10} | not estimated
2005 | Van Meter, Itoh | \sim 5n | \sim 10^{10} | \sim 10^{10}
2012 | Fowler, Martinis, Cleland | — | — | \sim 10^{9}
2017 | Gheorghiu-Mosca | \sim 5n | \sim 10^{10} | 2 \times 10^{9}
2019 | Gidney (windowed arithmetic) | \sim 3n | \sim 10^{9} | 5 \times 10^7
2021 | Gidney-Ekerå | \sim 8000 | \sim 3 \times 10^9 | \sim 2 \times 10^7

Each refinement has shaved an order of magnitude from the physical-qubit count over about seven years. If that trend continues (no guarantee), RSA-2048 could sit at \sim 10^6 physical qubits by 2028 and \sim 10^5 by 2035. The trend is partly algorithmic (windowed arithmetic, better distillation) and partly physics (surface codes with adjusted distances, alternative codes like Hastings-Haah codes). But the floor — information-theoretically, how few qubits could you possibly need? — is not yet established. There is no known lower bound excluding further improvements.

Layer 3: surface-code encoding

A physical qubit on today's best hardware has a gate error rate around 10^{-3} per two-qubit gate. Shor on RSA-2048 needs \sim 10^{10} operations executed without a single uncorrected error. Run it raw and the expected number of errors would be \sim 10^{10} \times 10^{-3} = 10^7 — guaranteed failure many times over.

The fix is quantum error correction. Encode each logical qubit in many physical qubits; detect errors via syndrome measurements; correct them before they accumulate. The leading code for superconducting and ion-trap hardware is the surface code, which encodes one logical qubit in a d \times d grid of physical qubits, where d is the code distance.

The code-distance formula

Below the code's threshold p_\text{th} \approx 1\%, the logical error rate falls exponentially in d:

p_L \;\sim\; \left(\frac{p}{p_\text{th}}\right)^{d/2}.

For physical error rate p = 10^{-3} and threshold p_\text{th} = 10^{-2}, the ratio is 0.1. Need logical error rate p_L < 1/G where G = 10^{10} gates, so:

p_L < 10^{-10} \implies 0.1^{d/2} < 10^{-10} \implies d/2 > 10 \implies d \ge 21.

Gidney-Ekerå pick d = 27 to leave safety margin for syndrome-measurement errors and to handle the "worst case" gates that consume an extra code cycle of distance. The physical qubit overhead per logical qubit is then

\text{overhead} \;=\; 2d^2 + \text{routing} \;\approx\; 2 \cdot 729 + 500 \;\approx\; 2000 \text{ physical}.

With pipelining and lattice-surgery overhead they report an effective \sim 4000 physical qubits per logical qubit.

Why 2d^2 and not d^2: a surface-code patch needs both data qubits (the physical qubits encoding the logical state) and measure qubits (ancillas that read out syndromes). A standard planar layout interleaves them in a d \times d grid of data qubits plus a slightly smaller grid of (d-1) \times (d-1) measure qubits — \sim 2d^2 total. Routing patches between logical qubits adds further overhead.
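The distance criterion above can be solved directly. A minimal sketch, using base-10 logs and the chapter's p_L ~ (p/p_th)^{d/2} model — no safety margin included:

```python
import math

def required_distance(p: float, p_th: float, gates: float) -> int:
    """Smallest d with (p/p_th)^(d/2) strictly below 1/gates."""
    return math.floor(2 * math.log10(gates) / math.log10(p_th / p)) + 1

def patch_qubits(d: int) -> int:
    """Physical qubits per surface-code patch: d^2 data + (d-1)^2 measure."""
    return d * d + (d - 1) * (d - 1)

d_min = required_distance(1e-3, 1e-2, 1e10)   # 21, matching the derivation
patch = patch_qubits(25)                       # 1201, used in Example 2
```

Gidney-Ekerå's d = 27 sits above this bare minimum precisely because of the syndrome-measurement and worst-case-gate margins the text mentions.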

Example 2: surface-code overhead at d = 25

Setup. Physical error rate p = 10^{-3}, logical target p_L < 10^{-10}, code distance d = 25.

Step 1. Data qubits in the patch. A distance-d surface code has d^2 = 625 data qubits arranged in a square. These carry the encoded logical information redundantly; a single physical error flips one data qubit but does not flip the logical qubit.

Step 2. Measure qubits for syndrome extraction. Between each pair of data qubits sits an ancilla used to measure the X- or Z-parity of the four surrounding data qubits. Layout: (d-1)(d-1) = 576 additional ancilla qubits. Why these extra qubits do the real work: data qubits alone cannot detect errors; you need to measure stabilisers (operators like X_1 X_2 X_3 X_4) that commute with the logical operators. Syndrome measurement requires ancillas to entangle with the data and be measured.

Step 3. Total per logical patch. Data + measure = 625 + 576 = 1201 physical qubits per logical qubit. Why the rounded number in the literature is 2d^2: d^2 + (d-1)^2 = 2d^2 - 2d + 1 \approx 2d^2 for large d. The exact count per patch is 1201 at d = 25, often quoted as "\sim 1250" in rougher approximations.

Step 4. Logical error rate check. With p/p_\text{th} = 0.1 and d = 25:

p_L \approx 0.1^{12.5} = 10^{-12.5} \approx 3 \times 10^{-13}.

The algorithm has \sim 10^{10} logical-gate operations, so expected errors per run: 10^{10} \cdot 3 \times 10^{-13} = 3 \times 10^{-3}. A single run of Shor fails due to an uncorrected error with probability \sim 0.3\% — acceptable; the algorithm is self-verifying classically (just check p_1 \cdot p_2 = N and retry otherwise), so a 0.3\% failure just means running once more on average.

Step 5. Add routing overhead. The logical qubits must exchange information via lattice surgery — merging and splitting patches. Routing regions and buffer patches add roughly \sim 2\times overhead, bringing the effective per-logical cost to \sim 2400 physical qubits.

Step 6. Multiply by logical-qubit count. For Shor on RSA-2048 with \sim 8000 logical qubits: 8000 \times 2400 = 1.9 \times 10^7 physical qubits, just for the surface-code encoding layer. Magic-state distillation (Layer 4) adds further.

Result. At code distance d = 25 with p = 10^{-3}, each logical qubit costs \sim 2400 physical qubits including routing. Shor on RSA-2048 costs \sim 2 \times 10^7 physical qubits at this layer alone.
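Example 2's arithmetic, end to end, as a sketch in the chapter's rough model (2× routing, p_L = (p/p_th)^{d/2}):

```python
# Recompute Example 2 with the chapter's numbers.
d = 25
p_ratio = 0.1                        # p / p_th = 1e-3 / 1e-2
patch = d**2 + (d - 1)**2            # data + measure qubits: 1201
per_logical = 2 * patch              # ~2x routing overhead -> ~2400
p_L = p_ratio ** (d / 2)             # ~3e-13 logical error per gate
gates = 1e10
errors_per_run = gates * p_L         # ~3e-3 expected uncorrected errors
total = 8000 * per_logical           # ~1.9e7 physical qubits (Layer 3 only)
```

A 0.3% chance of an uncorrected error per run means about one extra run per three hundred — negligible, since Shor's output is classically verifiable.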

[Figure: schematic surface-code patch at d = 5 — 25 data qubits (●) and 16 measure qubits (□), 41 physical per logical. Scaling: d = 5 → 41, d = 15 → 421, d = 25 (Shor target) → 1201, plus routing → ~2400 physical per logical; total for Shor on RSA-2048: 8000 × 2400 ≈ 1.9×10⁷ physical qubits before magic-state distillation.]
A small surface-code patch at $d = 5$ for schematic clarity; the actual Shor target uses $d \approx 25$. Data qubits (filled circles) carry the encoded state; measure qubits (open squares) extract syndromes that reveal errors. The physical-to-logical ratio grows as $\sim 2d^2$ plus routing overhead — at $d = 25$ that is roughly $2400$ physical qubits per logical qubit.

What this shows. The surface code is the largest multiplicative factor in the pyramid. A 2400\times overhead is the price of running 10^{10} operations on 10^{-3}-error-rate hardware. Drive the physical error rate down to 10^{-4} and the required distance drops to d \approx 13, with \sim 340 physical per logical before routing — \sim 700 including it, a \sim 3.5\times saving. Every factor of 10 improvement in physical error rate buys roughly a factor of 3-4 in physical-qubit count.

Layer 4: magic-state distillation

Surface codes implement Clifford gates transversally — cheaply, with no distillation overhead. But Shor's algorithm needs T gates (and their adjoint T^\dagger) to achieve universality, and the surface code cannot do T transversally. The workaround: consume a magic state |T\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt 2 per T gate, where the magic state is produced in a dedicated subroutine.

Producing a magic state at sufficient quality is itself a fault-tolerant quantum computation: magic-state distillation. Noisy copies go in, cleaner copies come out. A single round of the standard Bravyi-Kitaev 15-to-1 protocol takes 15 noisy |T\rangle states and produces one cleaner |T\rangle with error suppressed from \epsilon to \approx 35 \epsilon^3. Multiple rounds can be cascaded.

For Shor on RSA-2048, each consumed T state must carry error below \sim 10^{-10} — one part in the 10^{10}-T budget. Starting from raw injected states at \epsilon \approx 10^{-3}, one 15-to-1 round reaches 35\epsilon^3 \approx 3.5 \times 10^{-8}; a second cascaded round lands near 10^{-21}, comfortably below target. Each two-round factory occupies \sim 10^5 physical qubits.

To feed the main algorithm's T-gate rate of \sim 10^5/s (over an 8-hour run, 10^{10} Ts / 3 \times 10^4 s), you need \sim 10 distillation factories running in parallel. Total physical-qubit cost for distillation: \sim 10^6 physical qubits.

Adding this to the Layer 3 surface-code encoding: 1.9 \times 10^7 + 10^6 \approx 2 \times 10^7 physical qubits.
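The factory arithmetic checks out in a few lines. A sketch of the 15-to-1 cascade and throughput numbers, assuming a raw injected-state error of 10⁻³ and a target of 10⁻¹⁰ (one error per the 10¹⁰-T budget):

```python
def distill_rounds(eps_raw: float, eps_target: float):
    """Cascaded 15-to-1 distillation: each round maps eps -> 35 * eps^3."""
    eps, rounds = eps_raw, 0
    while eps > eps_target:
        eps = 35 * eps**3
        rounds += 1
    return rounds, eps

rounds, eps_out = distill_rounds(1e-3, 1e-10)   # two rounds suffice
t_gates, runtime_s = 1e10, 8 * 3600
rate_needed = t_gates / runtime_s               # ~3.5e5 clean |T>/s overall
factory_qubits = 10 * 1e5                       # 10 factories x ~1e5 qubits
```

Note the cubic suppression: the second round overshoots the target by eleven orders of magnitude, which is why two rounds — not three — set the factory footprint.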

[Figure: magic-state distillation pipeline — ten parallel factories, each ~10⁵ physical qubits, converting 15 noisy |T⟩ states into one clean |T⟩ and together feeding ~10⁵ |T⟩/s to the main Shor circuit (~8000 logical qubits at d = 25, ~10¹⁰ T gates, 8 hours, 1.9×10⁷ physical qubits); the factories add ~10⁶ physical qubits, for ~2×10⁷ total.]
Magic-state distillation runs alongside the main Shor circuit. Ten parallel factories, each at $\sim 10^5$ physical qubits, produce clean $|T\rangle$ states at $\sim 10^5/\text{s}$ — fast enough to feed the algorithm's $10^{10}$-T budget in the 8-hour runtime window. The factories add $\sim 10^6$ physical qubits on top of the Layer 3 encoding, rounding the total to $\sim 2 \times 10^7$.

Why distillation is its own layer: the main algorithm and the factories live on different parts of the same chip and communicate through lattice surgery. Optimising their scheduling — how many factories, what distillation protocol, how to route the distilled states to where they are needed — is a full optimisation problem in itself. Gidney-Ekerå's work on the 20-million-qubit estimate is largely an accounting of this scheduling, not of the abstract algorithm.

Layer 5: runtime and today's hardware

The wall-clock runtime of Shor on RSA-2048 follows from the T-gate count and the T-gate execution rate. Each T gate consumes one magic state, produced at \sim 10^5/s across all factories running in parallel. Total runtime:

\frac{10^{10} \text{ T}}{10^5 \text{ T/s}} \;=\; 10^5 \text{ s} \;\approx\; 28 \text{ hours}.

With pipelining, buffering, and allowing parallelism within the algorithm, Gidney-Ekerå drive this down to \sim 8 hours.

Now the gap to today's hardware:

Platform | Year | Physical qubits | 2-qubit gate error | Error correction?
Google Sycamore | 2019 | 54 | 5 \times 10^{-3} | Demonstration only
IBM Osprey | 2022 | 433 | 2 \times 10^{-3} | None in production
IonQ Aria | 2022 | 32 | 5 \times 10^{-4} | None
IBM Condor | 2023 | 1121 | \sim 2 \times 10^{-3} | None
Google Willow | 2024 | 105 | 3 \times 10^{-3} | Below-threshold demonstration at d = 3, 5, 7
IonQ Tempo | 2025 | 64 | \sim 3 \times 10^{-4} | None in production
Quantinuum H2 | 2024 | 56 | \sim 2 \times 10^{-3} | Logical-qubit demonstrations
Shor-capable (GE 2021) | ~2040 | \sim 2 \times 10^7 | < 10^{-3} | Full surface code, d = 25

Current hardware sits at \sim 10^3 physical qubits. The target is \sim 2 \times 10^7. Gap: 4 orders of magnitude.

[Figure: log-scale bar chart of physical-qubit counts — IonQ Aria 32, IonQ Tempo 64, Google Willow 105, IBM Osprey 433, IBM Condor 1121, versus the Shor target of 2×10⁷; a dashed arrow marks the ~4-order-of-magnitude gap.]
Today's leading machines ($\sim 10^3$ physical qubits, no full error correction) versus the Shor-capable requirement ($\sim 2 \times 10^7$ physical qubits with distance-25 surface code). The gap is four orders of magnitude in count, plus the qualitative distance of moving from pre-threshold to below-threshold error rates. Google Willow's 2024 demonstration of below-threshold surface codes on 105 qubits is the most important hardware result on the path.

When does RSA-2048 break?

The gap has been shrinking. In 2005 the physical-qubit estimate was \sim 10^{10}; in 2012, \sim 10^9; in 2019, \sim 5 \times 10^7; in 2021, \sim 2 \times 10^7. If these improvements continue at the same rate — and the physical-qubit counts on the hardware side continue doubling every 2-3 years — the lines cross somewhere in the mid-2030s.

Michele Mosca's rule, published in 2015 and updated in 2018, is the standard framing (arXiv:1512.03780):

If your data needs to stay secret for x years, and migration to quantum-safe cryptography takes y years, and Q-day arrives in z years, you need x + y < z.

Mosca's own estimates, first given in 2015: a 1/7 chance that RSA-2048 is broken by 2026, and a 1/2 chance by 2031.

For realistic y = 5-10 years of migration time across large infrastructure, any secret with x > 5 years is already at risk under the z = 15 scenario. That is the policy logic behind harvest-now-decrypt-later being a present concern even though the quantum computer does not yet exist.
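Mosca's inequality is trivially mechanisable. A sketch with the scenario values quoted above:

```python
def mosca_safe(x_secret: float, y_migration: float, z_qday: float) -> bool:
    """Mosca's rule: data stays safe only if x + y < z."""
    return x_secret + y_migration < z_qday

# z = 15-year Q-day, y = 10-year migration: anything needing more than
# 5 years of secrecy is already exposed to harvest-now-decrypt-later.
assert not mosca_safe(6, 10, 15)   # at risk
assert mosca_safe(2, 10, 15)       # safe under these assumptions
```

The value of writing it down is that the three inputs are forced into the open: organisations routinely know y and debate z, but rarely inventory x per data class.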

Hype check. "RSA is dead" is overstated; "RSA is safe for now" is also overstated. The accurate statement is: RSA-2048 is secure against every adversary possessing only classical computers, and will remain so for as long as GNFS is the best classical factoring algorithm — a state of affairs that has held for 30 years without challenge. RSA-2048 is not secure against a fault-tolerant quantum adversary, and such an adversary will likely exist sometime between 2035 and 2050. The gap between "classical safety" and "quantum threat" is large enough that governments, banks, and long-lived infrastructure providers must migrate, and small enough that the migration should be underway now.

What is definitely not true: today's quantum computers can break RSA. The largest honestly-factored integer on a real quantum computer is N = 21 (2024, with substantial compile-time assistance); RSA-2048 factoring is \sim 10^4 \times further in qubit count and \sim 10^7 \times further in gate count.

What is also not true: that symmetric crypto collapses. AES-256 remains quantum-secure at the 2^{128} post-Grover level, which is still astronomically hard. Post-quantum cryptography protects specifically the public-key algorithms — RSA, ECC, Diffie-Hellman — which Shor's algorithm breaks. Everything built on symmetric primitives (disk encryption, VPNs after key-exchange, PGP message bodies) keeps most of its security.

India's crypto-agility response

India's digital infrastructure is among the world's most cryptographically dense: Aadhaar authenticates over a billion enrolled identities, UPI signs billions of payment transactions every month, and DigiLocker, eSign, and the CCA PKI root anchor government document workflows — nearly all of it resting on the RSA and ECC primitives Shor breaks.

The National Quantum Mission (launched April 2023, ₹6003 crore over 8 years) has four thematic hubs, one of which — at C-DOT and in partnership with IISc Bangalore — is responsible for post-quantum migration. CERT-In published the Crypto-Agility Framework in January 2024, recommending:

  1. All new deployments use hybrid schemes (classical + post-quantum) by end-2025.
  2. Critical national infrastructure (Aadhaar, UPI, DigiLocker) complete migration audits by end-2027.
  3. Substantial migration to post-quantum primitives by 2030 — ahead of the earliest plausible z in Mosca's rule.
System | Current cryptography | Post-quantum migration target | Deadline
TLS (web / HTTPS) | ECDH + ECDSA | Kyber + Dilithium (hybrid) | 2027
Aadhaar eSign | RSA-2048 + ECC-256 | Dilithium hybrid | 2028
UPI signatures | ECDSA | Dilithium hybrid | 2029
DigiLocker | RSA-4096 + ECC-384 | Dilithium + Falcon hybrid | 2029
CCA PKI root | RSA-4096 | SPHINCS+ (offline) + Dilithium | 2030
Long-lived secrets (MoD, intelligence) | Classified | Already migrated (per MoD 2024 statement) | 2024

Industrial enablement: TCS, Infosys, Wipro, and HCL run PQC practices offering migration tooling to enterprise clients. Open Quantum Safe (liboqs) integration is certified by NIC for government-contracted software. IIT Madras, IIT Bombay, IIT Kanpur, and IISc Bangalore offer graduate programmes in post-quantum cryptography, producing \sim 200 specialised cryptographers per year (projected to 500/year by 2028 under NQM).

The migration calendar is ahead of the threat calendar. For systems that follow the CERT-In roadmap, RSA-2048 breaking in 2040 will be a historical footnote rather than a crisis. For systems that do not, it will be a crisis. That is the honest policy situation.

Common confusions

Going deeper

The estimate above is the headline number. The "going deeper" material below covers Gidney-Ekerå's windowed arithmetic in more detail, alternative error-correction codes that might reduce the overhead, recent post-2021 refinements, and the hardware-specific roadmaps of IBM, Google, IonQ, Quantinuum, and PsiQuantum.

Gidney-Ekerå's windowed arithmetic

The key algorithmic innovation enabling the 2021 estimate is Gidney's windowed arithmetic (2019, arXiv:1905.07682). The idea: instead of performing modular multiplication bit-by-bit, process w bits at a time using a classically-precomputed lookup table.

Classical multiplication of a by b takes n steps (one per bit of b). Quantum multiplication inside Shor's controlled-U_a^{2^k} takes n reversible additions per multiplication. Windowed multiplication replaces this with \lceil n/w\rceil operations, each of which looks up a value a \cdot k \bmod N for k \in \{0, 1, \ldots, 2^w - 1\} in a table and adds it.

The lookup itself costs 2^w controlled additions inside the quantum circuit, so the net Toffoli count is \sim 2^w \cdot n / w — optimised at w \approx \log_2 n, giving \sim n^2 / \log n per multiplication rather than n^2. Over the full exponentiation ladder this saves a factor of \log n \approx 11 at n = 2048, contributing the order-of-magnitude improvement from the 2012 Fowler et al estimate to the 2021 Gidney-Ekerå estimate.
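The window-size trade-off can be scanned numerically. This sketch uses the two-term cost model implied above — ~2^w Toffolis per table lookup plus ~n per controlled addition, with n/w lookups per multiplication; under this crude model the optimum lands a couple of bits below log₂ n, with a saving of the same order as the factor-of-log n quoted:

```python
import math

def per_mult_toffolis(n: int, w: int) -> int:
    # n/w lookups; each costs ~2^w (table walk) + ~n (controlled addition)
    return math.ceil(n / w) * (2**w + n)

n = 2048
best_w = min(range(1, 16), key=lambda w: per_mult_toffolis(n, w))
saving = n**2 / per_mult_toffolis(n, best_w)   # vs the unwindowed ~n^2
```

Past the optimum, the 2^w table walk dominates and the cost climbs steeply — which is why practical window sizes stay in single digits even though the asymptotic argument points at log₂ n.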

Alternative codes

The surface code is the leading choice but not the only one. Two families are under active research:

If bivariate bicycle codes mature, the \sim 2400 \times surface-code overhead could drop by 5-10\times, pushing RSA-2048 below 10^6 physical qubits. This is speculative but consistent with the rate of historical improvements.

Recent post-2021 refinements

Since Gidney-Ekerå's 2021 paper, further refinements have been published: Regev's 2023 multidimensional variant of Shor lowers the asymptotic gate count per run to roughly O(n^{1.5}) at the cost of \sim\sqrt{n} runs and more qubits; Chevignard, Fouque, and Schrottenloher (2024) cut the logical-qubit count to under 1700 by trading space for circuit depth; and magic-state cultivation (Gidney, Shutty, Jones 2024) reduces the cost of producing |T\rangle states severalfold.

None of these breaks the 20-million-qubit ceiling decisively; all of them could be combined for a 2-3\times reduction to \sim 10^7 physical qubits. That is still a factor of 10^4 above today's hardware.

Hardware roadmaps

Major hardware providers have published roadmaps (current as of early 2026): IBM targets Starling (\sim 200 logical qubits, 2029) and Blue Jay (\sim 2000 logical qubits, 2033); Google's published milestones culminate in a \sim 10^6-physical-qubit error-corrected machine; Quantinuum's Apollo targets fully fault-tolerant operation around 2030; PsiQuantum is building toward million-qubit photonic systems; IonQ tracks progress in algorithmic qubits rather than raw counts.

None of these roadmaps promises Shor-capable hardware before 2035. Several target "utility-scale" fault tolerance (not necessarily Shor-capable) in the late 2020s or early 2030s. The cross-over from utility-scale to cryptanalysis-scale is the open question.

Why lower error rates matter more than more qubits

The logical error rate falls as (p/p_\text{th})^{d/2}, so the required distance scales inversely with \log(p_\text{th}/p). For p = 10^{-3} (today's leading), d = 25 gives p_L \sim 10^{-12}, costing \sim 2400 physical/logical with routing. If p drops to 10^{-4} (a ten-fold improvement), the same p_L is achieved at d \approx 13, costing \sim 700 physical/logical with the same routing overhead — a \sim 3.5\times reduction.

In short: one order of magnitude in error rate buys a three-to-four-fold qubit saving. Hardware improvements on the error-rate axis are more valuable than equivalent improvements on the qubit-count axis.
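A quick sweep of the bare distance criterion (no safety margin) makes the trade concrete — the saving per decade of error rate comes out around 3-4× in patch size under this model:

```python
import math

def distance_and_patch(p: float, p_th: float = 1e-2, gates: float = 1e10):
    """Minimum code distance and rough per-logical cost (~2 d^2, pre-routing)."""
    d = math.floor(2 * math.log10(gates) / math.log10(p_th / p)) + 1
    return d, 2 * d * d

d_now, q_now = distance_and_patch(1e-3)        # d = 21 without margin
d_better, q_better = distance_and_patch(1e-4)  # d = 11 without margin
ratio = q_now / q_better                       # ~3.6x fewer qubits
```

The chapter's quoted distances (25 and 13) add a few units of margin on top of these minima, but the ratio between the two regimes barely moves.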

Google Willow's 2024 below-threshold demonstration is important specifically because it validates that suppression works in practice — moving from "surface codes work in principle" to "surface codes work on real hardware at realistic physical error rates." It is the single most significant published result between 2021 and 2026 on the Shor-readiness path.

Where this leads next

References

  1. Craig Gidney and Martin Ekerå, How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits (2021) — arXiv:1905.09749. The current best concrete estimate.
  2. Austin Fowler, John Martinis, Andrew Cleland et al., Surface codes: Towards practical large-scale quantum computation (2012) — arXiv:1208.0928. The foundational resource paper on surface-code factoring estimates.
  3. Stephane Beauregard, Circuit for Shor's algorithm using 2n+3 qubits (2003) — arXiv:quant-ph/0205095. The space-optimal modular-exponentiation circuit.
  4. Peter Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer (1994/1995) — arXiv:quant-ph/9508027. The founding paper.
  5. Michele Mosca, Cybersecurity in an era with quantum computers: will we be ready? (2015) — arXiv:1512.03780. The x + y < z policy framing.
  6. John Preskill, Lecture Notes on Quantum Computation, Chapter 7 — theory.caltech.edu/~preskill/ph229. Full derivation of Shor plus fault-tolerance treatment.