In short
Factor N = 21 with Shor's algorithm, and something changes compared to N = 15. Take a = 2: the order is r = 6 (powers 2, 4, 8, 16, 11, 1 modulo 21), and 6 is not a power of 2. The phase estimates s/r = s/6 cannot be represented exactly in a t-bit binary ancilla register, so the measurement distribution spreads across many bins instead of concentrating on six sharp peaks. With t = 10 ancillas, typical outcomes cluster near y \approx 171, 341, 512, 683, 853 — the closest integers to s \cdot 2^{10}/6 for s = 1, 2, 3, 4, 5. You measure y = 171, compute \widetilde\varphi = 171/1024 \approx 0.1670, feed it to continued fractions, get the convergent 1/6, identify r = 6, compute x = 2^3 = 8 \bmod 21, check x \neq \pm 1, and take \gcd(7, 21) = 7 and \gcd(9, 21) = 3. Factors.
Compared to N = 15: more qubits, more tolerance for bin-smearing, more care in continued-fractions interpretation. And experimentally: no fully honest, non-precompiled demonstration of Shor on N = 21 exists on real quantum hardware as of 2026. Every published result on composites larger than 15 has used some degree of classical circuit simplification that assumes knowledge of the answer. This chapter explains why — and what "honest" means as a standard.
You learned the algorithm on N = 15 in ch. 80, where the arithmetic was particularly clean: the order r = 4 is a power of 2, the phase estimates s/r land exactly on 8-bit binary fractions, and the measurement distribution is four sharp peaks. Every worked example of Shor on real hardware uses N = 15 for exactly this reason — it is the composite where everything behaves nicely.
Now you meet N = 21, the next odd composite that is not a prime power. And almost everything that made N = 15 clean now bends. Take a = 2: the order is r = 6, a number with no power-of-two factor in sight. Take a = 5: still r = 6. Take a = 10: r = 6 again. Most coprime bases for N = 21 give r = 6, and the phase estimates s/6 cannot be expressed exactly in any binary ancilla register. The measurement distribution smears across many bins instead of concentrating on six. Continued fractions still works — it has to, because it was designed for approximate inputs — but more of the algorithmic care lives in the classical post-processing.
This chapter does two things. First, a worked example: factor N = 21 with a = 2 and r = 6, tracing the quantum output and the continued-fractions post-processing honestly. Second, the uncomfortable experimental fact: no published demonstration of Shor's on N = 21 has been done without some form of "cheating" — precompiling the circuit using classical knowledge of the answer. You will see what "cheating" means, why it is difficult to avoid on small NISQ hardware, and why "honest" demonstrations are the benchmark the field has been struggling to cross.
Why r = 6 is structurally harder
For N = 15 and a = 7, the quantum circuit's measurement distribution was four sharp peaks at y = 0, 64, 128, 192 — exact multiples of 2^t / r = 256/4 = 64. With t = 8 ancillas, y = s \cdot 64 was an integer for every s, so the amplitude concentrated perfectly.
For N = 21 and a = 2, the orbit of 2 modulo 21 is
2^1 = 2,\; 2^2 = 4,\; 2^3 = 8,\; 2^4 = 16,\; 2^5 = 32 \equiv 11,\; 2^6 = 64 \equiv 1 \pmod{21},
so r = 6. Use t = 10 ancillas (with n = \lceil \log_2 21 \rceil = 5; the conservative rule is t \geq 2n + 1 = 11, but t = 10 is enough for this walkthrough). The ideal phases are s/r = s/6 for s \in \{0, 1, 2, 3, 4, 5\}. Converted to t = 10 binary bits: s \cdot 2^{10}/6 = s \cdot 170.\overline{6}, which is not an integer except for s = 0 and s = 3.
The measurement distribution therefore concentrates near, but not exactly at, the integer values y closest to s \cdot 170.\overline{6}:
| s | s/6 | s \cdot 1024/6 | Peak bin y^* | y^*/1024 | Error \lvert y^*/1024 - s/6 \rvert |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1/6 \approx 0.1\overline{6} | 170.67 | 171 | 0.1670 | 3.3 \times 10^{-4} |
| 2 | 2/6 \approx 0.3\overline{3} | 341.33 | 341 | 0.3330 | 3.3 \times 10^{-4} |
| 3 | 3/6 = 0.5 | 512 | 512 | 0.5 | 0 (exact) |
| 4 | 4/6 \approx 0.6\overline{6} | 682.67 | 683 | 0.6670 | 3.3 \times 10^{-4} |
| 5 | 5/6 \approx 0.8\overline{3} | 853.33 | 853 | 0.8330 | 3.3 \times 10^{-4} |
Why the error matters: the peak at the nearest integer y^* carries most of the probability, but amplitude leaks into the neighbouring bins y^* \pm 1, y^* \pm 2, \ldots — the measurement distribution has finite width, centred on s/r with a tail. The standard analysis (Nielsen and Chuang §5.2.1) shows the probability of measuring the integer nearest to 2^t s/r is at least 4/\pi^2 \approx 0.405 for any t, with the remaining \lesssim 0.6 spread over nearby bins. For t \geq 2n + 1 = 11, the phase estimate fed to continued fractions still recovers r correctly with probability \gtrsim 0.4 per run.
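The peak-bin arithmetic in the table can be checked mechanically. A minimal sketch — plain Python, no quantum simulation, just the nearest-integer computation:

```python
t, r = 10, 6  # ancilla bits, and the order of a = 2 modulo 21

# For each s, the ideal phase s/r maps to the (generally non-integer)
# bin s * 2^t / r; the measurement peak sits at the nearest integer y*.
for s in range(r):
    ideal = s * 2**t / r
    y_star = round(ideal)
    err = abs(y_star / 2**t - s / r)
    print(f"s={s}  ideal={ideal:8.2f}  y*={y_star:4d}  error={err:.1e}")
```

Running it reproduces the table's peak bins 0, 171, 341, 512, 683, 853 and the \sim 3.3 \times 10^{-4} truncation errors.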
The outcome s = 3 happens to be exact because 3/6 = 1/2, which reduces to a clean binary fraction — but s = 3 is also an unlucky case: \gcd(3, 6) = 3 \neq 1, so continued fractions on \widetilde\varphi = 1/2 returns the reduced denominator 2, a divisor of r, not r itself. The two "good" outcomes for a single-run recovery of r = 6 are s = 1 and s = 5 (the values coprime to r). Each lands on a slightly-off-integer peak — the best quantum outcome is now an approximation to s/r, not an exact representation.
The setup — a = 2, r = 6
Walk through Shor on N = 21 with a = 2.
Classical preamble. \gcd(2, 21) = 1 — compute Euclid: 21 = 10 \cdot 2 + 1, 2 = 2 \cdot 1 + 0, so \gcd = 1. Proceed. For reference (not used by the algorithm), the powers of 2 modulo 21:
2^1 = 2,\; 2^2 = 4,\; 2^3 = 8,\; 2^4 = 16,\; 2^5 = 11,\; 2^6 = 1 \pmod{21}.
So the classical order is r = 6. This is the number the quantum subroutine is about to recover — by sampling a phase \approx s/6 and running continued fractions.
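For reference, the brute-force order computation that the quantum subroutine replaces fits in a few lines. A sketch — at cryptographic sizes this loop is exactly what becomes infeasible:

```python
def mult_order(a, N):
    """Smallest r >= 1 with a^r ≡ 1 (mod N). Assumes gcd(a, N) == 1."""
    x, r = a % N, 1
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

# Most coprime bases modulo 21 share the order 6.
orders = {a: mult_order(a, 21) for a in (2, 5, 10)}
```

mult_order(2, 21) returns 6, as do the other bases mentioned above; mult_order(7, 15) returns the familiar 4 from the N = 15 chapter.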
Quantum registers. n = 5 data qubits (to hold residues 0, 1, \ldots, 20), and t = 10 ancilla qubits. The target register starts in |1\rangle = |00001\rangle.
Why t = 10: the rule of thumb t \geq 2n + 1 gives t = 11, but for a pedagogical demonstration t = 10 is already enough for Shor to succeed with probability close to 0.4 per run. A fault-tolerant production run would use t = 11 or t = 12 to push the per-run success probability above 0.5. The exact threshold depends on the interplay between binary-truncation error and the \gcd(s, r) = 1 requirement.
Circuit. Standard Shor: Hadamard the ancillas, apply the ladder of controlled-U_2^{2^k} gates for k = 0, 1, \ldots, 9, apply inverse QFT, measure. Each controlled-U_2^{2^k} implements multiplication by 2^{2^k} \bmod 21 on the 5-qubit data register. Precomputed constants:
2^{2^k} \bmod 21 = 2,\ 4,\ 16,\ 4,\ 16,\ 4,\ 16,\ 4,\ 16,\ 4 \quad \text{for } k = 0, 1, \ldots, 9.
Why the pattern repeats: from 2^6 \equiv 1 \pmod{21}, higher powers of 2 cycle with period 6, so 2^{2^k} \bmod 21 depends only on 2^k \bmod 6. The residues 2^k \bmod 6 run 1, 2, 4, 2, 4, \ldots (period 2 from k = 1 onward), which is why the constants alternate between 4 and 16. The details don't matter for the algorithm — each 2^{2^k} \bmod 21 is just a classical constant fed into the k-th controlled modular-multiplication subcircuit.
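The constants come straight from modular exponentiation — Python's three-argument pow computes them directly:

```python
# Classical constants for the controlled multipliers: 2^(2^k) mod 21, k = 0..9.
constants = [pow(2, 2**k, 21) for k in range(10)]
# After k = 0 (which gives 2), the list alternates 4, 16, 4, 16, ...
print(constants)
```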
One successful run — y = 171
Run the circuit. Say the measurement returns y = 171, the expected peak for s = 1. Compute:
\widetilde\varphi = y / 2^t = 171/1024 \approx 0.1670.
Continued fractions time. Apply Euclid to 171 and 1024:
1024 = 5 \cdot 171 + 169, \quad 171 = 1 \cdot 169 + 2, \quad 169 = 84 \cdot 2 + 1, \quad 2 = 2 \cdot 1 + 0.
Why this is the continued-fraction computation: the quotients in each step — 5, 1, 84, 2 — are exactly the integer partial quotients of the continued-fraction expansion of 171/1024. Euclid's algorithm and the continued-fraction expansion are the same process, read in opposite directions.
So 171/1024 = [0;\, 5,\, 1,\, 84,\, 2]. The convergents (successive best rational approximations) are computed from the partial quotients [0; 5, 1, 84, 2] using the standard recurrence h_k = a_k h_{k-1} + h_{k-2}, \; k_k = a_k k_{k-1} + k_{k-2}:
| depth | quotient a_k | numerator h_k | denominator k_k | convergent |
|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0/1 |
| 1 | 5 | 1 | 5 | 1/5 |
| 2 | 1 | 1 | 6 | 1/6 |
| 3 | 84 | 85 | 509 | 85/509 |
| 4 | 2 | 171 | 1024 | 171/1024 |
Why stop at depth 2: you want the convergent s/r with r \leq N = 21. Depth 0 gives 0/1 (denominator 1); depth 1 gives 1/5 (denominator 5); depth 2 gives 1/6 (denominator 6 — still \leq 21); depth 3 gives 85/509 (denominator 509 > 21, stop). The last convergent with denominator \leq N is the one to test.
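The whole expansion — Euclid's quotients plus the convergent recurrence — fits in one short function. A sketch (the h/k names follow the table above):

```python
def convergents(num, den):
    """Convergents (h_k, k_k) of num/den, computed via Euclid's algorithm.

    Each Euclid step's quotient is a partial quotient a_k of the continued
    fraction; the recurrence h_k = a_k*h_{k-1} + h_{k-2} (and likewise for
    k_k) builds the successive best rational approximations.
    """
    h_prev2, h_prev = 0, 1   # seeds h_{-2}, h_{-1}
    k_prev2, k_prev = 1, 0   # seeds k_{-2}, k_{-1}
    result = []
    while den:
        a, rem = divmod(num, den)        # partial quotient a_k
        h = a * h_prev + h_prev2
        k = a * k_prev + k_prev2
        result.append((h, k))
        h_prev2, h_prev, k_prev2, k_prev = h_prev, h, k_prev, k
        num, den = den, rem
    return result

# Candidate order: denominator of the last convergent with k <= N = 21.
convs = convergents(171, 1024)
candidate_r = [k for _, k in convs if k <= 21][-1]
```

convergents(171, 1024) reproduces the table — (0, 1), (1, 5), (1, 6), (85, 509), (171, 1024) — and the denominator cap picks out candidate_r = 6.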
Candidate order: r = 6. Verify: 2^6 = 64 = 3 \cdot 21 + 1 \equiv 1 \pmod{21}. Confirmed.
Extracting the factors
With r = 6:
- r even? 6 is even. Yes.
- a^{r/2} \not\equiv -1 \pmod N? Compute x = 2^{r/2} = 2^3 = 8 \pmod{21}. Is 8 \equiv -1 \pmod{21}? No: -1 \equiv 20, and 8 \neq 20. Condition holds.
Take the two gcds:
\gcd(x - 1, N) = \gcd(7, 21) = 7, \qquad \gcd(x + 1, N) = \gcd(9, 21) = 3.
Verify: 7 \times 3 = 21. Both prime. Done.
Example 1: factor N = 21 with a = 2
Step 1. Pick and check a. Choose a = 2. \gcd(2, 21) = 1 (from 21 = 10 \cdot 2 + 1, 2 = 2 \cdot 1 + 0). Proceed. Why not take a random a for this example: any coprime a that yields r = 6 — and most of them do — behaves identically. a = 2 is the cleanest choice, because its powers modulo 21 form the doubling sequence you already know by heart.
Step 2. Quantum order finding. Prepare 10 ancilla qubits in |0\rangle^{\otimes 10} and 5 data qubits in |1\rangle = |00001\rangle. Apply H^{\otimes 10}, the controlled-U_2^{2^k} ladder for k = 0, 1, \ldots, 9, and \text{QFT}^{-1}. Measure. Suppose the outcome is y = 171, so \widetilde\varphi = 171/1024 \approx 0.1670.
Step 3. Continued fractions. Expand \widetilde\varphi via Euclid: 1024 = 5 \cdot 171 + 169, 171 = 1 \cdot 169 + 2, 169 = 84 \cdot 2 + 1, 2 = 2 \cdot 1 + 0. Partial quotients [0; 5, 1, 84, 2]. Convergents with denominator \leq N = 21: 0/1, 1/5, 1/6. Take the last: candidate r = 6. Why stop before the convergent with denominator 509: the theorem says a convergent s'/r' of \widetilde\varphi with r' \leq N is guaranteed to equal s/r whenever \widetilde\varphi is within 1/(2N^2) of s/r. For larger r' there is no such guarantee. The correct r is the denominator of the last convergent with r' \leq N. Verify: 2^6 = 64 \equiv 1 \pmod{21}. Confirmed.
Step 4. Check conditions. r = 6 is even. x = 2^{r/2} = 2^3 = 8 \bmod 21. Is 8 \equiv \pm 1 \pmod{21}? 8 \neq 1 and 8 \neq 20. Both conditions hold.
Step 5. Extract factors.
\gcd(8 - 1, 21) = \gcd(7, 21) = 7, \qquad \gcd(8 + 1, 21) = \gcd(9, 21) = 3.
Verify: 3 \times 7 = 21. Both prime.
Result. 21 = 3 \times 7.
What this shows. The algorithm works on N = 21 identically to N = 15 in structure — five steps, one quantum subroutine, four classical wrappers. But the measurement outcome is now an approximation to s/r, not an exact binary representation. The continued-fractions step is where the approximation is corrected. Everything about Shor's robustness lives in that correction.
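The classical wrapper around the quantum measurement — steps 3 to 5 — can be collected into one post-processing function. A sketch assuming a measured bin y; here Fraction.limit_denominator performs the continued-fraction search for the best approximation with denominator \leq N:

```python
from fractions import Fraction
from math import gcd

def shor_postprocess(y, t, N, a):
    """Classical post-processing: measured bin y -> factors of N, or None to rerun."""
    # Continued fractions: best rational approximation to y/2^t with denominator <= N.
    r = Fraction(y, 2**t).limit_denominator(N).denominator
    if pow(a, r, N) != 1:
        return None   # candidate failed verification (divisor of r, or junk): rerun
    if r % 2 == 1:
        return None   # odd order: pick a different a
    x = pow(a, r // 2, N)
    if x == N - 1:
        return None   # a^(r/2) ≡ -1 (mod N): pick a different a
    return sorted((gcd(x - 1, N), gcd(x + 1, N)))
```

shor_postprocess(171, 10, 21, 2) walks exactly the steps above and returns [3, 7]; the unlucky outcome y = 512 (the s = 3 peak) yields candidate r = 2, fails the verification pow(2, 2, 21) == 1, and returns None — signalling a rerun.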
Side-by-side — N = 15 vs N = 21
Example 2: compare the two measurement distributions
The cleanest way to see why N = 21 is harder is to put the phase-estimate distributions side by side.
N = 15, a = 7, t = 8, r = 4. The possible phases s/4 for s \in \{0, 1, 2, 3\} are 0, 0.25, 0.5, 0.75 — all exactly representable as 8-bit binary fractions 0/256, 64/256, 128/256, 192/256. Measurement probability concentrates perfectly on four bins.
N = 21, a = 2, t = 10, r = 6. The possible phases s/6 for s \in \{0, 1, 2, 3, 4, 5\} are 0, 0.1\overline{6}, 0.3\overline{3}, 0.5, 0.6\overline{6}, 0.8\overline{3}. Only 0 and 0.5 are exact in 10-bit binary. The others are approximately 171/1024, 341/1024, 683/1024, 853/1024, each surrounded by a sinc-like tail of lower-probability neighbouring bins.
Key numbers.
| Property | N = 15 | N = 21 |
|---|---|---|
| Order r (typical a) | 4 | 6 |
| r a power of 2? | Yes | No |
| Ancilla qubits t | 8 | 10 (or 11) |
| Data qubits n | 4 | 5 |
| Peak locations | exact integers 64s | nearest integers to 170.\overline{6}\,s |
| Peak probability per good outcome | 1/r = 1/4 | \approx 0.4/(r-1) \approx 0.08 |
| Per-run success probability | 3/4 | \approx 0.4 |
| Demonstrated honestly on hardware? | Yes (ion traps, 2016) | No (as of 2026) |
Why the per-run success numbers differ so much: for N = 15, three of the four outcomes (ignoring s = 0) immediately give r = 4 from continued fractions. For N = 21, only the outcomes s \in \{1, 5\} — those coprime to r = 6 — give r directly; the others give divisors of r or wrong answers that require additional runs to reconcile. The probability of drawing a coprime s is \phi(r)/r = 2/6 = 1/3 in the best case, and binary-truncation further spreads the amplitude.
What this shows. The jump from N = 15 to N = 21 is not just "a bigger number" — it is a structural change in how cleanly the quantum output maps to the answer. N = 15 works because r = 4 divides 2^t; N = 21 works despite r = 6 not dividing 2^t, and the "despite" is where most of the algorithmic effort lives. Every Shor target N larger than 15 faces a version of this challenge.
The honest-vs-cheated distinction
The most uncomfortable fact about experimental Shor's algorithm, as of 2026, is this: no genuine, non-precompiled demonstration of Shor has ever been run on real hardware for any N > 15. Every published "Shor factors N = 21" or "N = 35" or larger result has used some form of circuit simplification that depends on classical knowledge of the answer.
The technical term for this is precompilation. The process is:
- Classically determine r for the specific a, N pair.
- Classically simplify the circuit using that knowledge — for instance, replacing the controlled-U_a^{2^k} gates for k beyond \log_2 r with identity (because U_a^r = I), or collapsing the entire modular-exponentiation ladder into a smaller hard-coded permutation.
- Run the simplified circuit and verify that the measurement distribution matches the expected peaks.
This is not fraud — papers disclose their precompilation — but it is a qualitatively weaker demonstration than running the full generic Shor circuit and watching it output r without being told the answer.
What counts as honest. A demonstration is honest in this sense if:
- The modular-exponentiation subcircuit is implemented as a general-purpose quantum circuit (one that could factor any N of the appropriate size, fed N as input rather than hard-coded).
- The phase-estimation output is processed by generic continued fractions, not by classical logic that assumes the specific r.
- The experimental claim is that the circuit would work on hard unknown-r inputs, not just on the specific N, a pair being demonstrated.
By these criteria, only the ion-trap experiment by Monz et al. 2016 (arXiv:1507.08852) comes close for N = 15. Even Monz et al. use some compilation tricks specific to ion-trap architectures. For N = 21, 35, 51, no published result is fully honest — the Smolin, Smith, and Vargo 2013 commentary (arXiv:1301.7007) titled "Pretending to factor large numbers on a quantum computer" laid out the critique in detail, pointing out that some reported factorisations can be achieved with circuits that contain no quantum entanglement at all.
Common confusions
- "N = 21 is just a slightly harder number than N = 15." Structurally, it is a different regime. N = 15 has r = 4, a power of 2; N = 21 has r = 6, which is not. Every step downstream of the phase-estimation measurement is harder: the peak widens, continued fractions must handle approximate input, and the success probability per run drops. "Slightly harder" misses the structural point.
- "Shor handles any N — the theoretical algorithm is complete." The theoretical algorithm is complete for any N and runs in O(n^3) on a perfect quantum computer. The experimental realisations are not complete — every published demonstration on N > 15 has precompiled the circuit using classical knowledge of the answer. Theory and experiment are in very different states.
- "Precompilation is normal software engineering — any compiler simplifies circuits." Up to a point. A compiler simplifying a classical circuit is legitimate; a quantum-algorithm implementation that hard-codes the answer during classical compilation is a qualitatively different thing. The distinction matters specifically for demonstrations of quantum advantage or supremacy — if you compile in the answer, you have not demonstrated that the quantum computer can find the answer. The field's "honest" standard tracks this line carefully.
- "Shor on N = 21 requires only slightly more qubits than on N = 15." The abstract qubit count rises from 13 to \sim 16. But the fidelity requirements rise substantially — amplitude-leakage into neighbouring bins compounds with every noisy gate, so the effective number of qubits needed for a successful run at realistic gate fidelities is much higher. The experimental threshold for an honest N = 21 demonstration has been estimated at \sim 30 to \sim 50 physical qubits with error rates below 10^{-4} per gate — noticeably beyond what small ion-trap or superconducting-qubit demonstrations have achieved.
- "The 'cheating' critique means experimental Shor is fake." No. Experimental Shor demonstrations have validated components of the algorithm — coherent multi-qubit control, controlled modular arithmetic, phase estimation, inverse QFT — on real hardware. These are real engineering accomplishments. What they have not done is demonstrate the full algorithm operating on unknown inputs at scale. That is the gap between demonstrations-as-proof-of-components and demonstrations-as-cryptographic-threats.
Going deeper
The worked examples above complete the algorithmic picture. The rest of this section develops the formal analysis of non-power-of-two orders, the Smolin-Smith-Vargo critique of pretending-to-factor, and a careful comparison of resource requirements from N = 15 up to N = 2048-bit RSA.
Formal analysis of non-power-of-two orders
When r does not divide 2^t, the phase-estimation output probability distribution is given by (Nielsen and Chuang §5.2.1, Eq. 5.26):
\Pr(y \mid s) = \frac{1}{2^{2t}} \, \frac{\sin^2\!\big(\pi\,(2^t s/r - y)\big)}{\sin^2\!\big(\pi\,(s/r - y/2^t)\big)}, \quad \text{with each eigenphase } s/r \text{ occurring with weight } 1/r.
This is a sinc-squared-like distribution centred at y = 2^t s / r (not necessarily an integer) with width \sim 1. The total probability of measuring y within 1 of 2^t s /r — the "nearest-integer" outcome — is at least 4/\pi^2 \approx 0.405, a bound due to Cleve-Ekert-Macchiavello-Mosca (arXiv:quant-ph/9708016). This is the foundational bound that makes Shor work at all for non-power-of-two r.
The number of ancilla qubits needed is t \geq 2n + 1, which guarantees that with probability \geq 1 - 1/(2(N-1)) the measurement output is within distance 1/(2N^2) of s/r, allowing continued fractions to identify s/r uniquely. For N = 21, 2n + 1 = 11; our worked example with t = 10 succeeded because the specific outcome s = 1, y = 171 happened to fall well within the precision window, but a more conservative implementation would use t = 11 or t = 12.
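A sketch that evaluates this distribution numerically for s = 1 (eigenphase 1/6, t = 10) and checks the two claims above — the peak sits at y = 171, and that bin alone carries at least 4/\pi^2 of the probability:

```python
import math

t = 10
M = 2**t          # number of ancilla bins, 1024
phi = 1 / 6       # eigenphase s/r for s = 1, r = 6

def prob(y):
    """Pr(y | s): the sinc^2-like phase-estimation distribution for eigenphase phi."""
    den = M**2 * math.sin(math.pi * (phi - y / M)) ** 2
    if den == 0:
        return 1.0   # exact-phase case: all amplitude lands in one bin
    return math.sin(math.pi * (M * phi - y)) ** 2 / den

dist = [prob(y) for y in range(M)]
peak = max(range(M), key=dist.__getitem__)   # index of the most likely bin
```

The distribution sums to 1 (the amplitudes come from a unitary transform), peaks at y = 171, and the peak probability is about 0.68 — comfortably above the 4/\pi^2 \approx 0.405 bound.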
The Smolin-Smith-Vargo critique
In 2013, John Smolin, Graeme Smith, and Alexander Vargo published a short paper (arXiv:1301.7007) titled "Pretending to factor large numbers on a quantum computer." The argument: several high-profile "demonstrations" of Shor on N = 15, 21, 143, 56153 between 2001 and 2012 had used circuits that, after classical precompilation, contained no quantum entanglement — the "quantum" circuits were effectively classical permutations of basis states, and could be simulated in polynomial time on any laptop.
The critique did not claim the experiments were dishonest — papers disclosed their compilation steps — but argued that calling them "quantum factoring demonstrations" was misleading, because no quantum advantage was actually demonstrated. A 2012 photonic experiment claiming to factor 143 was shown by Smolin-Smith-Vargo to be a simplified circuit that implemented a 3-qubit phase-kickback operation with classical post-processing, and did not use Shor's algorithm in any essential way.
The critique forced a recalibration in the experimental community. Subsequent demonstrations — notably Monz et al. 2016 on N = 15 in ion traps — made explicit efforts to avoid the Smolin-Smith-Vargo critique by implementing generic modular-exponentiation subcircuits and verifying entanglement during the run. For N = 21 and above, no subsequent experiment has met the same standard.
Resource requirements — N = 15 to N = 2048-bit RSA
The theoretical scaling is O(n^3) in the bit-size n of N. Plugging in:
| N | n = \log_2 N | Logical qubits | Toffolis | Physical qubits (surface code) |
|---|---|---|---|---|
| 15 | 4 | \sim 13 | \sim 60 | \sim 10^4 if error-corrected |
| 21 | 5 | \sim 16 | \sim 130 | \sim 2 \times 10^4 if error-corrected |
| 35 | 6 | \sim 19 | \sim 220 | \sim 3 \times 10^4 if error-corrected |
| 143 = 11 \times 13 | 8 | \sim 26 | \sim 510 | \sim 5 \times 10^4 if error-corrected |
| 10^{10} | 34 | \sim 110 | \sim 39000 | \sim 10^6 if error-corrected |
| 2048-bit RSA | 2048 | \sim 6000 | \sim 10^9 | \sim 2 \times 10^7 |
For small n, the physical-qubit counts assume a fault-tolerant implementation; in practice, small demonstrations have used NISQ-era hardware without error correction, relying on low gate depth and raw fidelity to get meaningful results on a single-shot basis.
The operational threshold for cryptographic relevance is 2048-bit RSA — about 10^4\times more physical qubits than the largest current machines (as of 2026). The gap between "demonstrations on N = 15" and "operational Shor on N = 2048" spans roughly 10 orders of magnitude in total resource cost. Closing even one order of magnitude has taken roughly a decade of engineering progress.
The scaling gap in one picture
To see the problem from the other side: the N = 15 circuit has \sim 60 Toffoli gates and can be run (at various fidelities) on today's NISQ hardware. The N = 21 circuit has \sim 130 Toffoli gates and is borderline — an honest demonstration without precompilation is at the edge of what the best current devices can do. For N = 143, the gate count is \sim 510, well beyond NISQ reach. For 2048-bit RSA, the gate count is 10^9, requiring full fault tolerance.
Every factor of two in bit-size n multiplies the resource cost by roughly 8 (from the n^3 scaling). Moving from N = 15 (n = 4) to N = 2048-bit RSA (n = 2048) is a factor of 512 in n and therefore \sim 10^8 in gate count — consistent with the 60 vs 10^9 comparison. Shor's polynomial scaling is real, but the constants and exponents compound into an enormous absolute number.
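The back-of-envelope arithmetic here is worth making explicit. A sketch of the cubic-scaling estimate — illustration only; the 60-Toffoli baseline is the chapter's figure, not a measured count:

```python
n_15, n_rsa = 4, 2048            # bit-sizes of N = 15 and 2048-bit RSA
ratio = (n_rsa // n_15) ** 3     # cubic scaling: 512^3 = 134,217,728 ~ 1.3e8
estimated_gates = 60 * ratio     # ~ 8e9, within an order of magnitude of ~1e9
```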
What an honest N = 21 demonstration would prove
An honest, fault-tolerant demonstration of Shor on N = 21 — on a machine that did not know the answer in advance, implementing generic modular exponentiation, with measurement outputs recovered via generic continued fractions — would prove:
- Quantum hardware can implement the modular-exponentiation subcircuit at a scale (\sim 20 logical qubits, \sim 100 Toffolis) beyond what classical simulation trivially handles at single-shot fidelity.
- Phase estimation with non-exact phases (s/r where r is not a power of 2) behaves as theory predicts.
- The continued-fractions post-processing correctly recovers r from approximate input.
This has not yet been done. As of 2026, it is an open benchmark for the experimental quantum-computing community — a step on the roadmap between N = 15 (solved) and cryptographically relevant scales (still distant).
Where this leads next
- Factoring 15 step by step — the clean case that this chapter compares against.
- The Shor circuit end to end — the full algorithm at arbitrary N, with the 2048-bit RSA resource estimate.
- Continued fractions — the classical post-processing that corrects approximate phase estimates back to the exact denominator.
- Phase estimation accuracy — the sinc-squared distribution and the 4/\pi^2 concentration bound.
- Quantum phase estimation — the underlying subroutine used by Shor.
References
- Smolin, Smith, Vargo — Pretending to factor large numbers on a quantum computer (2013) — arXiv:1301.7007. The canonical critique of precompiled Shor demonstrations.
- Monz et al. — Realization of a scalable Shor algorithm (2016) — arXiv:1507.08852. Ion-trap demonstration of N = 15 with minimal precompilation.
- Lucero et al. — Computing prime factors with a Josephson phase qubit quantum processor (2012) — arXiv:1202.5707. Early superconducting-qubit N = 15 demonstration with explicit precompilation.
- Vandersypen et al. — Experimental realization of Shor's quantum factoring algorithm using nuclear magnetic resonance (2001) — arXiv:quant-ph/0112176. The original IBM NMR demonstration.
- Nielsen and Chuang — Quantum Computation and Quantum Information §5.3 — Cambridge University Press.
- Wikipedia — Shor's algorithm — experimental demonstrations section.