In short

The quantum Fourier transform on n qubits uses O(n^2) gates; the classical fast Fourier transform on the same N = 2^n amplitudes uses O(n \cdot 2^n) arithmetic operations. The gap is exponential in n, which is why articles love to say "quantum Fourier transforms are exponentially faster." That sentence is misleading without a footnote. The FFT takes a classical list in and produces a classical list out — and the list is the point of running it. The QFT takes a quantum state in and produces a quantum state out — and you can only sample one basis element from that state per run. For signal processing, image compression, radar, audio, MP3 encoding, the QFT is not a drop-in replacement; it is the wrong tool. The QFT only wins inside algorithms — Shor, phase estimation, amplitude estimation, hidden subgroup — where the answer is already encoded as a feature (a peak, a period, a sparse pattern) of the transformed state, and one measurement sample is enough to read that feature out.

If you have read the last few chapters, you have seen two numbers that seem to promise something huge: the classical FFT transforms N = 2^n amplitudes in O(N \log N) operations, while the quantum QFT applies the same transform with O(n^2) gates.

Written in terms of n, that is O(n \cdot 2^n) versus O(n^2). For n = 30 (a billion-amplitude signal), it is 30 billion versus roughly 500. On paper, a sixty-million-fold speedup for a task the world spends billions of CPU-hours per year on.

If this were the full story, JPEGs would be compressed on quantum computers. MP3s would be encoded on quantum computers. Radar signals would be processed on quantum computers. None of that has happened, and none of it is planned. The honest comparison has a twist that every pop-science article leaves out: the QFT and the FFT do not have the same input, do not produce the same output, and are not drop-in replacements for each other.

This chapter is the careful, side-by-side comparison. By the end, you should be able to tell, for any Fourier-transform problem you see, whether the QFT offers a real advantage or whether the FFT is what you actually want.

Rehashing the operation counts

Start with the fact everyone agrees on.

Classical FFT. Cooley and Tukey's 1965 algorithm transforms N complex numbers in O(N \log N) arithmetic operations. Writing N = 2^n, this is O(n \cdot 2^n). The input is an array of N complex numbers sitting in classical memory; the output is another array of N complex numbers sitting in classical memory. Each operation is a standard floating-point multiply-add, the kind your CPU does billions of times per second.

Quantum QFT. The n-qubit QFT circuit is built from n Hadamards, about n(n-1)/2 controlled phase gates, and \lfloor n/2 \rfloor SWAPs. Total: O(n^2) quantum gates. The input is a quantum state on n qubits, held as amplitudes in superposition; the output is another quantum state on the same n qubits. Each gate is a physical operation (a microwave pulse, an optical pulse) with finite error.
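The gate-count breakdown above is easy to tabulate; a minimal sketch (the helper name qft_gate_count is ours, not a library function):

```python
def qft_gate_count(n):
    """n Hadamards + n(n-1)/2 controlled-phase gates + floor(n/2) SWAPs."""
    return n + n * (n - 1) // 2 + n // 2

assert qft_gate_count(30) == 480           # the "roughly 500 gates" quoted for n = 30
assert qft_gate_count(2048) == 2_099_200   # a Shor-scale register is still only ~n^2
```

The quadratic term (the controlled-phase gates) dominates for any n of interest.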

[Figure: operation count on a log scale against n = log₂(N), for n = 5 to 40. The quantum QFT curve grows as O(n²), nearly flat on the log scale; the classical FFT curve grows as O(n · 2ⁿ), far steeper. At n = 30: roughly 500 quantum gates versus about 30 billion classical operations.]
The quantum QFT grows polynomially in $n$; the classical FFT grows exponentially. At $n = 30$ qubits, the classical computation is 60 million times larger than the quantum one. The gap widens with every additional qubit.

So the operation counts really are exponentially separated. The catch is everything else.

What they actually compute — same map, different medium

Both the FFT and the QFT compute the same linear map. Given N input coefficients a_0, a_1, \ldots, a_{N-1}, both produce the N output coefficients

\tilde{a}_k \;=\; \frac{1}{\sqrt N}\sum_{j=0}^{N-1} e^{2\pi i\, jk/N}\,a_j, \qquad k = 0, 1, \ldots, N-1.
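A quick NumPy check that this really is the map both algorithms compute. One caveat on conventions: NumPy's forward FFT uses the e^{-2\pi i jk/N} sign, so the map above corresponds to NumPy's inverse FFT with unitary normalisation (the size N = 8 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(42)
N = 8
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Direct evaluation of the map: (1/sqrt(N)) * sum_j e^{+2*pi*i*j*k/N} * a_j
jj = np.arange(N)
kk = jj[:, None]
direct = np.exp(2j * np.pi * jj * kk / N) @ a / np.sqrt(N)

# NumPy's forward FFT uses the opposite sign, so the text's (+) convention
# is NumPy's INVERSE FFT with unitary ("ortho") normalisation.
assert np.allclose(direct, np.fft.ifft(a, norm="ortho"))
```

The unitary normalisation (1/\sqrt N on both directions) is what makes the map a legitimate quantum operation: it preserves the norm of the amplitude vector.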

Where they differ is in how the coefficients are stored.

                                  Classical FFT                 Quantum QFT
Representation of N numbers       N separate memory cells       n = log N qubits, held as amplitudes
Storage scaling                   linear in N                   logarithmic in N
Access to any one coefficient     O(1) — read a memory cell     requires measurement; destructive
Reading all N coefficients        O(N) reads                    Ω(N) repeated runs or tomography
Writing N coefficients as input   O(N) writes                   O(N) worst-case state prep
Precision                         machine float (64-bit)        controlled by circuit depth

The classical FFT's input and output are arrays you can touch. Every cell is a number sitting in RAM; you can read it, write it, print it, save it, compare it to another cell. The QFT's input and output are quantum states — wavefunctions — that exist in the joint amplitude space of n qubits. You cannot touch a quantum state directly; you can only apply unitaries to it or measure it.

The readout difference — the whole story in one paragraph

This is the single paragraph that explains why the QFT is not a drop-in replacement for the FFT.

After a classical FFT, you have N numbers sitting in memory. You can read all N of them. You can copy them, print them, reuse them in the next stage of your signal-processing pipeline. After a quantum QFT, you have a state |\psi\rangle = \sum_k \tilde{a}_k |k\rangle. You can measure it and get one basis state |k\rangle with probability |\tilde{a}_k|^2. One measurement, one number, out of N possibilities. If you want more, you re-run the entire circuit and measure again. To estimate all N amplitudes to reasonable precision you need \Omega(N) runs — which kills the exponential speedup because each run is O(n^2) gates.
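A minimal simulation of that readout gap, using a random normalised vector as a stand-in for a post-QFT state (the shot count and accuracy threshold are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
N = 2 ** n
amps = rng.standard_normal(N) + 1j * rng.standard_normal(N)
amps /= np.linalg.norm(amps)          # a normalised stand-in for a post-QFT state
probs = np.abs(amps) ** 2

# One run of the circuit yields ONE index k, drawn with probability |a_k|^2.
one_run = rng.choice(N, p=probs)

# Recovering even the magnitudes |a_k|^2 takes many repeated runs;
# the phases of a_k are never observable this way at all.
shots = 200_000
counts = np.bincount(rng.choice(N, size=shots, p=probs), minlength=N)
estimate = counts / shots
assert np.max(np.abs(estimate - probs)) < 0.01
```

Note what the simulation cannot do: no number of samples from |a_k|^2 reveals the complex phases of the amplitudes, which a classical FFT output hands you for free.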

[Figure: two panels. Left, the classical FFT output: an array of N memory cells ã₀, ã₁, …, ã_{N−1}; reading all N yields N classical numbers, readable, copyable, printable. Right, the quantum QFT output: |ψ⟩ = Σₖ ãₖ|k⟩, N amplitudes in superposition; a single measurement yields one basis state |k⟩ with probability |ãₖ|².]
The classical FFT returns an array you can read. The quantum QFT returns a superposition; measurement collapses it to a single basis state sampled from the distribution $|\tilde{a}_k|^2$.

There is no workaround. The no-cloning theorem forbids making a copy of the output state to measure repeatedly; and "quantum tomography" — the procedure of fully characterising the output state — itself requires \Omega(N) runs. So the speedup is real for the computation but vanishes for any task that requires you to see all the outputs.

What this means for Fourier-transform applications

Think through every domain where the FFT is a workhorse: audio coding (MP3), image compression (JPEG's DCT blocks), radar returns, OFDM modems in 5G base stations, NMR spectroscopy.

In none of these cases does a quantum computer help. Even if the FFT step itself could magically be made free, the rest of the pipeline needs the classical array — which means loading N samples, which costs O(N), wiping any speedup.

Where the QFT does help: algorithms where the answer is a sparse feature of the transformed state — a peak, a period, a bit pattern — and where the quantum computer can extract that feature by measurement.

The unifying rule: the QFT wins when the input state is already prepared by a prior quantum step (no classical loading cost) AND the output is concentrated in a small number of basis states (one measurement sample is enough). In every case, the QFT is a subroutine inside a bigger algorithm — not a standalone Fourier-transformer.

Can a quantum computer compute a classical DFT fast?

Take this question literally. You have N complex numbers a_0, \ldots, a_{N-1} sitting in classical memory — say, 1024 audio samples — and you want the classical DFT coefficients \tilde{a}_0, \ldots, \tilde{a}_{N-1}. Can a quantum computer do this exponentially faster than an FFT?

The answer is no, and the reasoning is worth going through carefully.

Step 1: loading the input. You need to encode \mathbf{a} = (a_0, \ldots, a_{N-1}) as a quantum state

|\mathbf{a}\rangle \;=\; \frac{1}{\|\mathbf{a}\|}\sum_{j=0}^{N-1} a_j|j\rangle.

This is called state preparation. For a generic classical input vector, state preparation takes \Omega(N) gates — you need to write down N complex numbers, one for each amplitude. This is provably tight; no circuit can prepare an arbitrary N-amplitude state in fewer gates.

The QRAM assumption (a hypothetical quantum random-access memory that loads a classical vector in O(\log N) time) would relax this, but QRAM has not been built and there is active debate about whether a scalable QRAM is physically realisable.

Step 2: the QFT. This part is cheap — O(n^2) = O(\log^2 N) gates. No problem.

Step 3: reading the output. The QFT has produced |\tilde{\mathbf{a}}\rangle = \sum_k \tilde{a}_k |k\rangle. You want to read all N coefficients \tilde{a}_k. Each measurement yields one sample from the distribution |\tilde{a}_k|^2. To estimate all amplitudes to accuracy \varepsilon, you need \Omega(N^2/\varepsilon^2) samples in the worst case (one for each \tilde{a}_k, with concentration). This is dominant.

Total cost. O(N) + O(\log^2 N) + \Omega(N^2/\varepsilon^2) = \Omega(N^2/\varepsilon^2), which is worse than the classical FFT's O(N \log N).
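The tally can be sketched numerically for a concrete size (N = 1024 and \varepsilon = 0.01 are arbitrary illustrative choices; the expressions are the asymptotic terms from the text with constants set to 1):

```python
import math

# Illustrative cost tally for the classical-in, classical-out pipeline.
N, eps = 1024, 0.01
load = N                          # Omega(N) state-preparation gates
qft = math.log2(N) ** 2           # O(log^2 N) gates: the cheap part
readout = N ** 2 / eps ** 2       # Omega(N^2 / eps^2) measurement runs

classical_fft = N * math.log2(N)  # O(N log N) classical operations
assert qft < classical_fft < load + qft + readout
```

The QFT term is the only one that is small, and it is sandwiched between two terms that each exceed the entire classical cost.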

So the honest answer to "can a quantum computer compute a classical DFT faster?" is: not if you include input loading and output readout, which you must if the input and output are classical.

Hype check. You will see headlines like "quantum computers give exponential speedups for Fourier transforms." That is true only under a very narrow reading: the internal gate count of the QFT subroutine is exponentially smaller than the classical FFT's operation count, for the same data expressed as a quantum amplitude vector. It is not true that a quantum computer will Fourier-transform your MP3 faster, or your radar return faster, or your JPEG faster. The QFT is an algorithmic primitive, not a Fourier-transform processor.

Two worked examples

Example 1: image compression via FFT

A 256 \times 256 greyscale image is 65{,}536 pixels. JPEG compression applies the discrete cosine transform to 8 \times 8 blocks — the DCT is closely related to the FFT and shares its asymptotic cost. For simplicity, pretend we apply a single 256 \times 256 2D FFT (an FFT of length 65{,}536 viewed as a 2D signal) to the whole image.

Step 1. Classical FFT cost. For N = 65{,}536 = 2^{16}, the 2D FFT cost is O(N \log N) = 65{,}536 \times 16 \approx 10^6 operations. On a modern CPU (10 GFLOPs), this is about 100 microseconds — essentially free. Why 2D FFT is O(N \log N) with N pixels: the 2D transform factorises as 1D FFTs along rows, then columns. Each row is \sqrt{N} pixels and costs \sqrt{N}\log\sqrt{N}; there are \sqrt{N} rows. Total per axis is \sqrt{N} \cdot \sqrt{N}\log\sqrt{N} = (N/2) \log N, and with two axes, N \log N.
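Both the operation-count arithmetic and the row-then-column factorisation can be checked directly (the random image is a stand-in for real pixel data):

```python
import numpy as np

# Operation-count estimate for a 256x256 image.
N = 256 * 256
assert N * int(np.log2(N)) == 1_048_576       # about 1e6 ops, as estimated

# The 2D transform really does factorise into 1D FFTs along rows, then columns.
rng = np.random.default_rng(0)
img = rng.standard_normal((256, 256))
by_axes = np.fft.fft(np.fft.fft(img, axis=1), axis=0)
assert np.allclose(np.fft.fft2(img), by_axes)
```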

Step 2. Could a quantum computer do better? Loading a 65{,}536-pixel image into a quantum state requires \Omega(65{,}536) state-preparation gates — comparable to the classical FFT cost. The QFT itself is O(16^2) = 256 gates — fast, yes, but we just spent tens of thousands of gates loading the data. Why loading is \Omega(N): arbitrary-state preparation has an information-theoretic lower bound of \Omega(N) gates. An N-amplitude state requires N complex numbers to specify; you cannot write them down with fewer than \Omega(N) gates in the worst case.

Step 3. Extracting the output. JPEG needs every DCT coefficient (to quantise it and write it to a file). Reading N amplitudes from the post-QFT state requires \Omega(N) samples just to see each index once — and each sample returns only one coefficient, non-deterministically. Estimating all the coefficients to useful accuracy takes \Omega(N^2/\varepsilon^2) runs — vastly worse than the classical N \log N.

Result. For image compression, the quantum computer is not faster — it is dramatically slower, and requires hardware that does not exist. The classical FFT remains the right tool, forever.

[Figure: the three stages of a quantum FFT for image compression. Load input: Ω(N) gates. QFT: O(n²) gates. Read all N coefficients: Ω(N²/ε²) runs. The load and read stages dwarf the QFT stage, so the quantum total far exceeds the classical O(N log N).]
For classical-in, classical-out Fourier problems, the QFT is the tiny middle block. The state preparation and measurement stages dominate, making the quantum total worse than classical FFT.

What this shows. Even if every gate were free, state preparation and readout kill the speedup for any task that needs a classical output array. The QFT's O(n^2) cost is not the bottleneck — and therefore reducing it doesn't help.

Example 2: Shor's period-finding — where QFT wins

Shor's algorithm factors a 1024-bit RSA modulus N. The inner loop is period-finding: find the period r of f(x) = a^x \bmod N.

Step 1. The quantum circuit. Prepare a uniform superposition on n = 2048 ancilla qubits (the work register is \log_2 N = 1024 qubits, and you double the ancilla for good precision). Apply modular exponentiation: |x\rangle|0\rangle \to |x\rangle|a^x \bmod N\rangle. The cost of this step is dominated by the modular arithmetic, about O(n^3) = 10^{10} gates. Apply the inverse QFT to the ancilla. Measure. Why n = 2048 ancilla qubits: the QFT must distinguish multiples of 1/r from other fractions. Number-theoretic estimates (the continued-fraction bound) require the ancilla register to be about 2 \log_2 N qubits to guarantee success with high probability.

Step 2. The QFT itself. O(n^2) = 2048^2 \approx 4 \times 10^6 gates. A small fraction of the overall circuit, but still a meaningful chunk.

Step 3. The classical alternative. To find r classically via FFT, you would need to compute f(x) for x = 0, 1, 2, \ldots, 2^{2048} — that is 2^{2048} function evaluations, each involving modular exponentiation of 1024-bit numbers. The number 2^{2048} is larger than the number of atoms in the observable universe. The FFT step on this length-2^{2048} signal would be 2048 \cdot 2^{2048} operations — absurd. Why classical period-finding is exponential in the bit-length: the only known classical method is to loop x through enough values to detect a period, and the period r can be as large as N itself. For a 2048-bit N, r can be up to 2^{2048}.
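Python's arbitrary-precision integers make the scale of that number concrete:

```python
# 2^2048 in decimal: about 10^616, versus roughly 10^80 atoms
# in the observable universe.
x = 2 ** 2048
assert len(str(x)) == 617
```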

Step 4. Where the quantum speedup lives. After the inverse QFT, the ancilla state is concentrated on multiples of 2^n/r. One measurement samples a random multiple of 1/r, and the classical continued-fraction algorithm recovers r. No need to read every coefficient — the answer is a single basis state.
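The whole peak-and-recover pattern fits in a short classical simulation for the textbook case N = 15, a = 7 (period r = 4). This is a sketch of the mathematics, not of Shor's circuit: the ancilla size n = 8 and the collapse offset x0 are toy choices, and the QFT is stood in for by NumPy's unitary inverse FFT, which shares its sign convention:

```python
import numpy as np
from fractions import Fraction

N, a = 15, 7
n = 8                    # toy ancilla size; Shor uses ~2*log2(N) qubits
Q = 2 ** n

f = [pow(a, x, N) for x in range(Q)]
r = next(x for x in range(1, Q) if f[x] == 1)    # true period: 4

# After modular exponentiation and a measurement of the work register,
# the ancilla is uniform over {x0, x0 + r, x0 + 2r, ...}.
x0 = 3
psi = np.zeros(Q, dtype=complex)
psi[x0::r] = 1 / np.sqrt(len(range(x0, Q, r)))

# The QFT's map (unitary, e^{+2*pi*i*j*k/Q}) is NumPy's ifft with "ortho" norm.
probs = np.abs(np.fft.ifft(psi, norm="ortho")) ** 2

# All the probability sits on multiples of Q/r = 64.
peaks = np.flatnonzero(probs > 1e-12)
assert list(peaks) == [0, 64, 128, 192]

# One measurement yields some k = s*Q/r; continued fractions recover r from
# k/Q when gcd(s, r) = 1 (otherwise you re-run; here we take s = 3).
k = 192
assert Fraction(k, Q).limit_denominator(N).denominator == r
```

The measurement outcome k = 0 (the s = 0 peak) tells you nothing, which is one reason Shor's algorithm repeats the quantum step a handful of times.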

Result. Shor's total cost is O(n^3) \approx 10^{10} gates for 1024-bit factoring. Classical cost is 2^{O(n^{1/3} \log^{2/3} n)} via the number field sieve, or about 2^{80} \approx 10^{24} operations for 1024-bit. The quantum advantage is real, exponential, and it depends entirely on the QFT doing its job as a subroutine — not as a standalone Fourier transform.

[Figure: the stages of Shor's algorithm. State preparation H^⊗n: O(n) gates. Modular exponentiation: O(n³) gates. Inverse QFT: O(n²) gates. Measurement returns s/r: one sample is the answer. The output is concentrated on a single basis state; no classical loading, no classical readout of the amplitude array.]
Shor's algorithm is the paradigm case where QFT's speedup is real. The input is prepared by a cheap superposition (not by loading classical data), and the output is a single basis state (not the full amplitude list).

What this shows. Shor's algorithm is the Fourier-transform problem reshaped to match the QFT's strengths: (a) input that is not a classical array but a cheap quantum superposition; (b) output that is a single peak rather than the full spectrum. The QFT's exponential cost advantage is fully captured because neither input nor output acts as a classical-to-quantum bottleneck.

When each wins — a direct summary

Put the two together.

                              FFT wins                                          QFT wins
Input                         classical data (audio, image, radar)              quantum state prepared by a prior quantum subroutine
Output needed                 full array of coefficients                        one sparse feature (peak, period, bit pattern)
Access pattern                read every coefficient                            one measurement sample
Representative applications   signal processing, compression, comms, imaging    Shor's, phase estimation, hidden subgroup, amplitude estimation
Bottleneck                    the FFT itself                                    state prep + readout (cheap in the right context)
Hardware                      GPUs, ASICs, FFT chips — all commodity            fault-tolerant quantum computer (doesn't yet exist at scale)

The short version: the QFT is not the quantum FFT. It is a Fourier-transform-shaped piece of quantum algorithmic machinery. The FFT is what you use for Fourier analysis; the QFT is what you use for quantum algorithms that exploit Fourier structure.

Common confusions

"The QFT is just a faster FFT." It is not. The two compute the same linear map, but on different media: the FFT's output is an array you can read; the QFT's output is a state you can only sample.

"Measuring after a QFT gives you the spectrum." One measurement gives one basis state, sampled from |\tilde{a}_k|^2. Recovering the full spectrum costs \Omega(N) repeated runs, which erases the speedup.

"Quantum computers will speed up signal processing." For classical-in, classical-out Fourier tasks, state preparation and readout dominate, and the quantum total is worse than the classical FFT.

Going deeper

If you have the picture — QFT and FFT compute the same map but on different media, and the readout problem makes the QFT useless for classical-in-classical-out tasks — you have the main point of this chapter. The rest digs into rigorous complexity comparison, input-output models, quantum signal processing, and the economics of classical FFT hardware.

Rigorous complexity comparison

The two cost numbers need careful bracketing.

Classical FFT runs on a Turing machine or a standard RAM model. The cost O(N \log N) counts basic arithmetic operations — complex multiplies and adds — each of which is O(1) on real hardware when N fits in a register. For very large N (say, N > 2^{30}), memory-bandwidth bottlenecks begin to dominate, and the effective cost can be closer to O(N \log^2 N) on a naive implementation; cache-aware variants (FFTW, Intel MKL) keep close to O(N \log N) in practice.

Quantum QFT runs in the circuit model. O(n^2) gates counts fundamental two-qubit operations. On real hardware, these compile to a larger number of native gates (CZ plus single-qubit rotations), and the circuit depth (longest sequential chain) matters more than the total gate count for coherence-bound machines. Naive QFT has depth O(n^2); parallel constructions give O(n \log n) depth at the cost of more ancillas.

A cleaner question: if you are trying to do the same task on the same data representation, which algorithm wins? Answer: depends on the data representation.

Input-output model assumptions

Quantum algorithms are often compared against classical ones under subtly different input/output models. The key models: classical-in, classical-out, where input loading and output readout must both be counted and each costs \Omega(N); quantum-in, quantum-out, where the state is handed over by a prior quantum subroutine and consumed by a later one, so only the gate count matters; and QRAM-assisted, where a hypothetical memory loads classical data in O(\log N) time, an assumption about hardware that has not been built.

The honest comparison is always stated in a specific model, with specific assumptions about input preparation and output extraction. "Quantum is faster" without the model is not a claim; it is a sentence.

Quantum signal processing — a post-QFT primitive

Since about 2017, the quantum algorithms community has developed a generalisation of the QFT called quantum signal processing (QSP), introduced by Low, Chuang, and others. QSP takes a unitary U (or a block encoding of a Hermitian matrix) and applies a polynomial transformation to its eigenvalues. The QFT is recovered as one special case. QSP gives tight gate-count bounds for Hamiltonian simulation, amplitude amplification, linear-system solving, and eigenvalue estimation — often improving on QFT-based protocols by constant or logarithmic factors.

QSP is not a replacement for QFT in pedagogical exposition — the QFT is still the right first example of a quantum Fourier transform — but it is the modern framework within which QFT-based speedups are studied rigorously.

Quantum chemistry's specific use of QFT

Ground-state energy estimation runs QPE on the Hamiltonian H, which (as seen in the previous chapter) internally applies an inverse QFT. Chemistry-relevant precision is about 1 kcal/mol, which translates to roughly 2^{-10} in atomic units of energy — so n = 10 to 15 ancilla qubits for the QFT, a manageable subcircuit inside a much larger simulation. The QFT is not the bottleneck here; the Hamiltonian simulation circuits e^{-iH \delta t} are. But the QFT at the end is what converts the accumulated phase into a measurable bit string.
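The ancilla-count arithmetic, as a back-of-envelope sketch (1 kcal/mol ≈ 1.6 × 10⁻³ hartree is the standard conversion; the three extra qubits for success probability are a typical, not universal, padding):

```python
import math

# Resolve the phase to ~1 kcal/mol in atomic units of energy.
eps_hartree = 1.6e-3
bits = math.ceil(math.log2(1 / eps_hartree))   # phase-precision qubits
assert bits == 10

ancillas = bits + 3       # a few extra qubits for high success probability
assert 10 <= ancillas <= 15
```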

Economics of classical FFT — why quantum cannot compete

Modern FFT hardware is extraordinarily fast. A single GPU (NVIDIA H100) computes FFTs at about 10^{13} floating-point operations per second. FPGA-based FFT processors used in 5G base stations do millions of 1024-point FFTs per second with sub-microsecond latency. Digital signal processors (DSPs) in phones run FFTs on every audio frame in real time, at microwatts of power.

A quantum computer that could match this throughput would need to (a) load millions of classical vectors per second into quantum states, (b) run a shallow QFT, (c) measure the output, all within the coherence window, and (d) do so with error rates low enough to be useful. As of 2025, the coherence times of the best superconducting qubits are around 100 microseconds; a single QFT on n = 20 qubits compiles to roughly 10^4 physical operations; the error rate per operation is about 10^{-3}. End-to-end fidelity after 10^4 operations is essentially zero.
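The end-to-end fidelity estimate is one line of arithmetic, under the simple independent-error model the figures above imply:

```python
import math

# 10^4 operations at a 10^-3 error rate per operation.
ops, p_err = 10_000, 1e-3
fidelity = (1 - p_err) ** ops
assert fidelity < 1e-4                                  # effectively zero
assert math.isclose(fidelity, math.exp(-10), rel_tol=0.01)
```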

The economics will not change soon. Classical FFT hardware is on a Moore's-law-like curve; quantum hardware is improving but is several orders of magnitude away from competing on classical-input-classical-output tasks, and may never compete on those tasks because the readout bottleneck is fundamental, not an engineering defect.

Indian signal-processing industry and the QFT

India's signal-processing industry runs on the classical FFT. BSNL, Reliance Jio, and Airtel base stations — by the millions across the country — compute FFTs in their OFDM modems thousands of times per second per user. ISRO satellite ground stations Fourier-transform spacecraft telemetry and imaging data in real time. The Indian pharmaceutical industry uses FFT-based NMR spectroscopy for structure determination. None of this is going quantum any time soon, and none of it should.

Where India's National Quantum Mission expects quantum to matter is in different problems: quantum chemistry (drug discovery, catalysis), quantum-safe cryptography (migrating Aadhaar and UPI to post-quantum signatures before large-scale Shor's attacks become feasible), and quantum sensing. The National Quantum Mission's stated priorities deliberately do not include "quantum Fourier transforms for signal processing" — the research community has correctly identified that as a dead end.

Where this leads next

References

  1. Wikipedia, Quantum Fourier transform — comparison with classical FFT.
  2. Wikipedia, Fast Fourier transform — applications across signal processing, communications, and imaging.
  3. Andrew M. Childs, Robin Kothari, Rolando D. Somma, Quantum algorithm for systems of linear equations with exponentially improved dependence on precision (2017) — arXiv:1511.02306. A modern QSP-flavoured analysis of QFT-based algorithms.
  4. John Preskill, Lecture Notes on Quantum Computation, Chapter 6 — QFT and its scope — theory.caltech.edu/~preskill/ph229.
  5. Stephen Jordan, Quantum Algorithm Zoo — curated catalogue of quantum algorithms, with QFT-based ones marked and scope honestly described.
  6. Ewin Tang, A Quantum-Inspired Classical Algorithm for Recommendation Systems (2019) — arXiv:1807.04271. The first major dequantisation result, showing classical algorithms can match several QFT-based quantum ML speedups.