In short

A lattice is the set of all integer combinations of a few basis vectors in high-dimensional space. The lattice problems that cryptography rests on — SVP (find the shortest non-zero vector), CVP (find the lattice point closest to a given target), and LWE (Learning With Errors: recover a secret from many noisy linear equations modulo a prime q) — are believed hard even for quantum computers. No known quantum algorithm beats a 2^{\Theta(n)} runtime on them; the period-finding trick Shor uses to break RSA simply does not apply. Three of NIST's four post-quantum standards are lattice-based: ML-KEM (Kyber) for key exchange, ML-DSA (Dilithium) for signatures, and FN-DSA (FALCON) as a shorter-signature alternative. Their keys (~1 kB) and signatures (~2–3 kB) are larger than RSA's 256 bytes, but small enough that Chrome and Cloudflare already deploy them in production TLS, and Signal and Apple iMessage in their messaging protocols. India's National Quantum Mission and CERT-In's crypto-agility framework target a Kyber+Dilithium migration of UPI, Aadhaar, and CCA certificate chains by 2030.

Take a sheet of graph paper. Pick two arrows drawn on it — any two arrows, as long as they do not point in the same direction. Add them, subtract them, add double one to triple the other; every combination lands on a specific dot on the paper. The set of all such dots — all integer combinations of your two arrows — is a lattice in two dimensions. It looks like a regularly spaced grid of points, slanted and stretched by whatever arrows you chose.

Now ask a question that looks innocent: given the two arrows, which lattice point is closest to the origin — other than the origin itself? In 2D you can almost see the answer. Your eye picks out the nearest dot in a fraction of a second. But now do the same in 512 dimensions. The arrows are vectors with 512 coordinates each. The lattice has infinitely many points, most of them far from the origin, some of them disconcertingly close. You cannot see it, you cannot draw it, and no algorithm anyone has ever written finds the answer in sub-exponential time. That one question — what is the shortest non-zero vector in a high-dimensional lattice — is the mathematical rock on which three of the four NIST post-quantum standards are built.

This chapter is about why that rock is solid. You will see what a lattice actually is, what the shortest-vector and closest-vector problems are, what Learning With Errors adds on top, and how Kyber and Dilithium wrap the hardness into key-exchange and signature algorithms your phone's TLS stack will run for the next twenty years. By the end you will understand what your bank means when it says it is moving to Kyber, why the keys are suddenly 1 kB instead of 32 bytes, and why — despite those bigger numbers — the lattice family is considered the least painful path out of the quantum threat.

A lattice is a grid in high-dimensional space

Start with the simplest concrete example. Take the vectors \mathbf{b}_1 = (3, 1) and \mathbf{b}_2 = (1, 2) in the plane. The lattice generated by them is

L = \{ n_1 (3, 1) + n_2 (1, 2) : n_1, n_2 \in \mathbb{Z} \}.

Every point in L is an integer combination of the two basis vectors. Plot a few: (0, 0), (3, 1), (1, 2), (4, 3), (-3, -1), (2, -1)... the pattern is a tilted parallelogram grid that fills the plane.
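The brute-force version of "pick out the nearest dot by eye" fits in a few lines — a sketch that enumerates small integer combinations (the search radius of 5 is an assumption that happens to suffice for this lattice):

```python
# Brute-force "find the nearest dot to the origin" for the lattice
# generated by b1 = (3, 1) and b2 = (1, 2). In 2D a tiny search radius
# suffices; in dimension 512 the same search space is astronomical.
b1, b2 = (3, 1), (1, 2)

best, best_norm2 = None, None
for n1 in range(-5, 6):
    for n2 in range(-5, 6):
        if n1 == 0 and n2 == 0:
            continue                       # exclude the origin itself
        v = (n1 * b1[0] + n2 * b2[0], n1 * b1[1] + n2 * b2[1])
        norm2 = v[0] ** 2 + v[1] ** 2      # squared Euclidean length
        if best_norm2 is None or norm2 < best_norm2:
            best, best_norm2 = v, norm2

print(best, best_norm2)    # a shortest vector, squared length 5
```

The search finds squared length 5, achieved by ±(2, −1) and ±(1, 2) — shorter than or equal to both basis vectors.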

[Figure: a 2D lattice of dots in a tilted grid, with basis arrows b₁ = (3, 1) and b₂ = (1, 2) drawn from the origin and a shortest non-zero vector highlighted.]
A 2D lattice generated by $\mathbf{b}_1 = (3, 1)$ and $\mathbf{b}_2 = (1, 2)$. Every dot is an integer combination of the two arrows. A shortest non-zero vector of this lattice is $\mathbf{s} = \mathbf{b}_1 - \mathbf{b}_2 = (2, -1)$, of length $\sqrt{5}$ — as short as $\mathbf{b}_2$ and shorter than $\mathbf{b}_1$. Finding a shortest vector is easy in 2D; the same problem in 512 dimensions is believed intractable.

Why the shortest vector is not a basis vector: the basis is any set of generators — you can describe the same lattice with many different bases, some of which have "short" vectors and some of which are deliberately distorted to have long ones. The shortest vector is an intrinsic property of the lattice, independent of which basis you use to name it. The gap between the basis you are given and the shortest vector you want to find is exactly the hardness hiding in the problem.
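A quick way to see basis non-uniqueness: multiply the basis by any integer matrix with determinant ±1 and the generated point set is unchanged. A sketch (the search radius and box size are assumptions chosen just large enough for the comparison to be exact):

```python
# Two bases, one lattice. Multiplying a basis by an integer matrix with
# determinant +-1 (a unimodular matrix) renames the lattice without
# changing its points.

def points(b1, b2, r):
    """Integer combinations n1*b1 + n2*b2 with |n1|, |n2| <= r."""
    return {(n1 * b1[0] + n2 * b2[0], n1 * b1[1] + n2 * b2[1])
            for n1 in range(-r, r + 1) for n2 in range(-r, r + 1)}

good = ((3, 1), (1, 2))                  # the basis from the text
# Apply U = [[2, 1], [1, 1]] (det = 1) to get a skewed basis:
bad = ((7, 4), (4, 3))                   # 2*b1 + b2 and b1 + b2

def in_box(pts, m=6):
    """Restrict to the window near the origin we can check exhaustively."""
    return {p for p in pts if abs(p[0]) <= m and abs(p[1]) <= m}

near = in_box(points(*good, r=12))
near_bad = in_box(points(*bad, r=12))
print(near == near_bad)                  # True: same point set near origin
```

The skewed basis names exactly the same dots — it is just a worse description of them, which is the whole point: hard lattice instances hand you a bad basis.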

The rank of a lattice is the number of basis vectors. In cryptography the rank is large — 256, 512, 1024. The vectors live in \mathbb{Z}^n or \mathbb{Z}_q^n (integers modulo a prime q), and there is no way to draw the picture. But the definition is identical: a lattice is the set of integer combinations of a basis.

Lattice (formal)

Given linearly independent vectors \mathbf{b}_1, \ldots, \mathbf{b}_n \in \mathbb{R}^n, the lattice generated by them is

L = \left\{ \sum_{i=1}^n n_i \mathbf{b}_i : n_i \in \mathbb{Z} \right\}.

The vectors \mathbf{b}_1, \ldots, \mathbf{b}_n form a basis of L. A lattice has infinitely many different bases; they all describe the same point set.

The three hard problems

With a lattice in hand, three questions give rise to the entire family of hard problems.

SVP — the shortest vector problem

Given a basis of a lattice L, find a non-zero vector \mathbf{v} \in L of minimum length.

In 2D you solve this by eye. In 3D with a good basis you can still do it by hand. In dimension n with an arbitrary basis, the best known classical algorithm runs in time 2^{\Theta(n)} (lattice sieving, refined through BKZ-style block reduction). The best quantum algorithm (Grover-flavoured sieving) runs in about 2^{0.265 n} — still exponential, just with a smaller constant in the exponent.

SVP is NP-hard to solve exactly (under randomised reductions), and hard to approximate within small constant factors. Approximating it within the polynomial factors that cryptography actually relies on is not known to be NP-hard — but no efficient algorithm for that regime is known either, despite decades of effort. This worst-case hardness is the bedrock under everything that follows.

CVP — the closest vector problem

Given a basis of a lattice L and a target vector \mathbf{t} \in \mathbb{R}^n, find \mathbf{v} \in L minimising \|\mathbf{v} - \mathbf{t}\|.

This is SVP's slightly meaner cousin. You are not asked for the shortest lattice vector; you are handed a random-looking point and asked to snap it to the nearest lattice point. SVP reduces to CVP, so CVP is at least as hard — and in practice the two problems stand or fall together. CVP is the problem decryption has to solve in many lattice cryptosystems — and it is what an attacker has to solve to decrypt without the private key.

LWE — Learning With Errors

This is the one your TLS handshake actually uses. Oded Regev introduced it in 2005 and it has been the workhorse of lattice cryptography ever since.

Fix a prime q (say q = 3329, as in Kyber) and a dimension n (say n = 256). The secret is a vector \mathbf{s} \in \mathbb{Z}_q^n with small entries. The LWE sample generator draws a uniformly random \mathbf{a} \in \mathbb{Z}_q^n, a small error e from some Gaussian-like distribution, and outputs the pair

(\mathbf{a}, \; b = \langle \mathbf{a}, \mathbf{s} \rangle + e \pmod q).

You see many such pairs, (\mathbf{a}_1, b_1), (\mathbf{a}_2, b_2), \ldots, (\mathbf{a}_m, b_m). Your task: recover \mathbf{s}.

Without the errors, this is a linear system: A \mathbf{s} = \mathbf{b} \pmod q, easily solved by Gaussian elimination once m \ge n. With the errors — even tiny errors, one-bit errors — Gaussian elimination catastrophically amplifies the noise and gives nonsense. The problem is hard.
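The amplification can be seen concretely at toy scale. A sketch with n = 4 and q = 97 (a toy prime, chosen so the example matrix is invertible): exact Gaussian elimination over \mathbb{Z}_q recovers the secret, while the same elimination on a ±1-noisy right-hand side returns an unrelated vector:

```python
# Why tiny errors break Gaussian elimination: exact system -> instant
# recovery; +-1 noise on b -> the "solution" lands nowhere near s.
q = 97

def solve_mod(A, b, q):
    """Gauss-Jordan elimination over Z_q (q prime): returns x with A x = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] % q != 0)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, q)          # modular inverse of the pivot
        M[col] = [v * inv % q for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] % q != 0:
                f = M[r][col]
                M[r] = [(v - f * w) % q for v, w in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

s = [2, 1, 96, 0]                              # the secret (96 = -1 mod 97)
A = [[3, 5, 1, 7], [4, 2, 8, 3], [1, 6, 2, 4], [5, 3, 4, 2]]
clean = [sum(a * x for a, x in zip(row, s)) % q for row in A]
noisy = [(c + e) % q for c, e in zip(clean, [1, 96, 0, 1])]   # errors +-1

print(solve_mod(A, clean, q))   # [2, 1, 96, 0] — exact recovery
print(solve_mod(A, noisy, q))   # some unrelated vector: noise amplified
```

The noisy solve returns s + A^{-1}e mod q, and A^{-1}e is not small — it is essentially uniform, which is exactly the amplification the text describes.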

[Figure: the LWE equation b = A s + e mod q — a public random matrix A times a small secret s, plus a small error e, yields the public b; the adversary's task is to recover s from (A, b). Without e the system is trivial Gaussian elimination; with e the best known attack is 2^{\Theta(n)}.]
The LWE equation. Without the error term $\mathbf{e}$, recovering $\mathbf{s}$ from $(A, \mathbf{b})$ is a linear system solved in polynomial time. Adding a small error destroys linear-algebraic recovery completely: the errors accumulate under Gaussian elimination, and no algorithm — classical or quantum — is known that solves the noisy version in less than exponential time.

Why small errors make the problem hard: Gaussian elimination on A \mathbf{s} = \mathbf{b} takes linear combinations of rows. Each combination adds a linear combination of errors too. To isolate a single coordinate of \mathbf{s} you eventually need a combination whose error is small relative to q — but the number of such lucky combinations shrinks exponentially as n grows. The best classical attack (BKZ lattice sieving) essentially searches the lattice \Lambda(A) for a short vector that un-cancels the error; that search is 2^{\Theta(n)}.

Regev's (2005) headline result: solving average-case LWE is at least as hard as solving worst-case lattice problems (GapSVP and SIVP) to within approximation factors. Worst-case-to-average-case reductions are rare and precious in cryptography. They mean that "I picked LWE instances at random" does not accidentally give an easier problem than "I carefully chose the hardest possible lattice." If any lattice is hard, random ones are hard. That is the guarantee lattice cryptography offers that RSA never could.

Why lattice problems resist quantum attacks

Shor's algorithm breaks factoring and discrete log because those problems reduce to period-finding in a periodic function — and the quantum Fourier transform detects periods with exponential speedup. Lattice problems have no such period. There is no function f that is periodic on L in a way that QFT can exploit in polynomial time.

The best known quantum attacks on SVP and LWE are refinements of classical lattice sieving, with the inner nearest-neighbour search accelerated by Grover's algorithm. The exponent drops from (heuristic) 2^{0.292 n} classical to about 2^{0.265 n} quantum — a meaningful but not asymptotically crippling speedup. To compensate, a scheme targeting quantum-security level \lambda raises n modestly; Kyber's parameters already budget for this. No polynomial-time quantum algorithm for SVP, CVP, or LWE is known. The conjecture that none exists is the load-bearing assumption under every lattice-based standard.

[Figure: two-column comparison — factoring has a periodic function f(x) = a^x mod N whose period r the QFT extracts, making Shor polynomial-time; LWE has no periodic structure (A is uniformly random), the QFT yields no signal, and the best quantum attack remains exponential sieving around 2^(0.27 n).]
The structural reason lattice problems resist Shor. RSA hides behind a periodic function whose period the quantum Fourier transform extracts in polynomial time. LWE hides behind a uniformly random matrix; there is no period to find, the QFT produces no distinguishable signal, and the best quantum attack is still exponential.

Ring-LWE and Module-LWE — adding structure for speed

Plain LWE with n = 256 needs a 256 \times 256 matrix A, which is \approx 65{,}000 entries. Keys would be huge. The fix is to replace the unstructured matrix with one built from a polynomial ring.

Let R_q = \mathbb{Z}_q[x] / (x^n + 1) — the ring of polynomials of degree < n with coefficients mod q, where x^n is identified with -1. Ring-LWE replaces each entry of the LWE vectors with a polynomial in R_q. Multiplication in R_q is fast: the Number Theoretic Transform (a discrete cousin of the FFT) computes it in O(n \log n) time. Keys shrink from O(n^2) to O(n).
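Reduction by x^n = -1 makes ring multiplication "negacyclic": coefficients that wrap past degree n come back with a sign flip. A schoolbook sketch (Kyber's NTT computes the same product in O(n \log n) instead of this O(n^2) loop):

```python
# Multiplication in R_q = Z_q[x]/(x^n + 1): ordinary polynomial
# multiplication, then wrap-around using x^n = -1.

def negacyclic_mul(a, b, q):
    """Product of coefficient lists a, b in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:                  # x^k = x^(k-n) * x^n = -x^(k-n)
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c

# (1 + x)(1 + x^3) = 1 + x + x^3 + x^4 = x + x^3 in Z_17[x]/(x^4 + 1)
print(negacyclic_mul([1, 1, 0, 0], [1, 0, 0, 1], 17))   # [0, 1, 0, 1]
```

The x^4 term wraps around to the constant coefficient with a minus sign and cancels the 1 — that sign flip is the only difference from ordinary cyclic convolution.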

Module-LWE (used by Kyber and Dilithium) sits between plain LWE and Ring-LWE: vectors are short tuples of ring elements, not single ring elements. It hits a sweet spot — small enough keys, flexible enough parameters for multiple security levels, no single polynomial ring carrying all the weight.

Why the algebraic structure does not help the attacker: the best known attacks on Ring-LWE and Module-LWE run in the same 2^{\Theta(n)} time as plain LWE. The extra ring structure could in principle give an attacker leverage, but despite fifteen years of targeted cryptanalysis nobody has found any. NIST's confidence in lattice cryptography comes partly from this: the structured variants have survived every attack aimed specifically at exploiting their algebra.

Kyber — the key-exchange standard

CRYSTALS-Kyber, standardised as ML-KEM in NIST FIPS 203 (August 2024), is the post-quantum replacement for RSA/ECDH key exchange. It is a Key Encapsulation Mechanism (KEM): Alice publishes a public key, Bob encapsulates a fresh random shared secret against it, and Alice recovers the secret with her private key.

The simplified construction:

Setup. Pick the Module-LWE parameters: rank k \in \{2, 3, 4\}, dimension n = 256, prime q = 3329, small-error distribution \chi (centred binomial).

Key generation. Alice samples a public matrix A \in R_q^{k \times k} and small secret/error vectors \mathbf{s}, \mathbf{e} \in R_q^k, computes \mathbf{t} = A \mathbf{s} + \mathbf{e}, and publishes the public key (A, \mathbf{t}); the private key is \mathbf{s}.

Encapsulation (Bob). Sample small-error \mathbf{r}, \mathbf{e}_1, e_2 and a random message bit-string m \in \{0, 1\}^{256}; compute \mathbf{u} = A^T \mathbf{r} + \mathbf{e}_1 and v = \mathbf{t}^T \mathbf{r} + e_2 + \text{Encode}(m), where \text{Encode} places each bit of m at magnitude \lfloor q/2 \rfloor; send the ciphertext (\mathbf{u}, v).

Decapsulation (Alice). Compute v - \mathbf{s}^T \mathbf{u} and round coefficients to the nearest of \{0, \lfloor q/2 \rfloor\} to recover m. Then H(m) is the shared secret.

Why decapsulation works: expand v - \mathbf{s}^T \mathbf{u} = (A \mathbf{s} + \mathbf{e})^T \mathbf{r} + e_2 + \text{Encode}(m) - \mathbf{s}^T (A^T \mathbf{r} + \mathbf{e}_1) = \mathbf{e}^T \mathbf{r} - \mathbf{s}^T \mathbf{e}_1 + e_2 + \text{Encode}(m). The first three terms are all sums of products of small quantities; their total stays well below q/4 in magnitude. The last term carries the message bit at magnitude q/2. Rounding to the nearest multiple of \lfloor q/2 \rfloor strips the noise and reveals m. Without \mathbf{s}, an attacker sees (\mathbf{u}, v) and must solve a Module-LWE instance to recover m — believed quantum-hard.
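The keygen/encaps/decaps algebra above can be sketched with plain LWE vectors rather than module elements — toy parameters and illustrative names, not the FIPS 203 API:

```python
# Toy LWE-based KEM round trip: t = A s + e; u = A^T r + e1;
# v = t^T r + e2 + bit*(q/2); decaps rounds v - s.u to 0 or q/2.
import random

random.seed(1)
n, q, eta = 8, 3329, 2
small = lambda: [random.randint(-eta, eta) for _ in range(n)]
dot = lambda x, y: sum(a * b for a, b in zip(x, y)) % q

# Key generation: public (A, t), private s.
A = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
s, e = small(), small()
t = [(dot(A[i], s) + e[i]) % q for i in range(n)]

def encaps(bit):
    r, e1, e2 = small(), small(), random.randint(-eta, eta)
    u = [(dot([A[i][j] for i in range(n)], r) + e1[j]) % q for j in range(n)]
    v = (dot(t, r) + e2 + bit * (q // 2)) % q
    return u, v

def decaps(u, v):
    d = (v - dot(u, s)) % q                       # = e.r - s.e1 + e2 + bit*q/2
    return 1 if q // 4 < d < 3 * q // 4 else 0    # round to nearest of {0, q/2}

for bit in (0, 1, 1, 0):
    u, v = encaps(bit)
    assert decaps(u, v) == bit
print("round trips OK")
```

With these parameters the residual noise is at most 8·4 + 8·4 + 2 = 66, comfortably below q/4 = 832, so rounding always succeeds — the same margin argument Kyber's parameters make at scale.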

Kyber ships three parameter sets:

| Parameter set | Rank k | NIST category (≈ AES) | Public key | Ciphertext |
|---|---|---|---|---|
| ML-KEM-512 | 2 | 1 (AES-128) | 800 B | 768 B |
| ML-KEM-768 | 3 | 3 (AES-192) | 1184 B | 1088 B |
| ML-KEM-1024 | 4 | 5 (AES-256) | 1568 B | 1568 B |

ML-KEM-768 is the default in production deployments (Chrome, Cloudflare, Signal): NIST category 3, roughly AES-192 equivalent, keys small enough to carry in a TLS handshake.

Dilithium — the signature standard

CRYSTALS-Dilithium, standardised as ML-DSA in NIST FIPS 204, replaces RSA/ECDSA for digital signatures. The construction is in the same Module-LWE family; the mechanics of signing use a Fiat-Shamir with aborts transformation that turns a lattice-based identification protocol into a non-interactive signature.

The simplified idea: the public key is \mathbf{t} = A \mathbf{s}_1 + \mathbf{s}_2 for secret small vectors \mathbf{s}_1, \mathbf{s}_2. To sign a message M, the signer:

  1. Samples a random nonce vector \mathbf{y} with small coefficients.
  2. Computes \mathbf{w} = A \mathbf{y} and hashes: c = H(M, \text{HighBits}(\mathbf{w})) — a small-weight polynomial.
  3. Computes \mathbf{z} = \mathbf{y} + c \cdot \mathbf{s}_1.
  4. Rejection sampling: if any coefficient of \mathbf{z} is too large (would leak information about \mathbf{s}_1), throw it all away and restart from step 1. Otherwise output signature (\mathbf{z}, c, \ldots).

Verification recovers A \mathbf{z} - c \mathbf{t} and checks that its high bits equal those hashed into c.

Why rejection sampling: without it, the distribution of \mathbf{z} leaks information about the secret \mathbf{s}_1 (each signature is a noisy measurement of c \cdot \mathbf{s}_1). By throwing away signatures whose \mathbf{z} falls outside a safe region, the remaining signatures have a distribution that is provably statistically independent of \mathbf{s}_1 — so an adversary cannot stitch many signatures into a reconstruction attack. The cost: about 3–7 rejections per successful signature, still just microseconds on a modern CPU.
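A one-dimensional simulation makes the abort condition concrete. The parameters G, B and the shift c_s1 below are toy stand-ins for Dilithium's \gamma and \beta bounds:

```python
# Why rejected signatures protect the secret: y is uniform on [-G, G],
# so z = y + c*s1 leaks s1 through its shifted range. Keeping only
# |z| <= G - B, where B bounds |c*s1|, leaves z uniform on
# [-(G-B), G-B] regardless of what s1 is.
import random

random.seed(7)
G, B = 100, 10          # toy gamma and rejection bound
c_s1 = 7                # the secret-dependent shift (|c*s1| <= B)

accepted = []
for _ in range(20000):
    y = random.randint(-G, G)
    z = y + c_s1
    if abs(z) <= G - B:            # the "abort" check
        accepted.append(z)

# Accepted z covers [-(G-B), G-B] symmetrically: its mean carries no
# trace of c_s1, so signatures reveal nothing about the secret.
mean = sum(accepted) / len(accepted)
print(min(accepted), max(accepted), round(mean, 1))
```

Changing c_s1 to any other value with |c_s1| ≤ B leaves the accepted distribution identical — which is exactly the statistical-independence claim in the text.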

Dilithium parameter sets:

| Parameter set | NIST category | Public key | Signature |
|---|---|---|---|
| ML-DSA-44 | 2 | 1312 B | 2420 B |
| ML-DSA-65 | 3 | 1952 B | 3309 B |
| ML-DSA-87 | 5 | 2592 B | 4627 B |

Worked example — Kyber-vs-RSA sizes and speeds

Example 1: wire cost of upgrading a handshake from RSA-2048 to ML-KEM-768

Setup. A merchant POS terminal in Jaipur completes a TLS 1.3 handshake with a payments-switch server in Mumbai for every UPI transaction. Today the handshake uses X25519 for key exchange; tomorrow NPCI mandates hybrid X25519+ML-KEM-768. Compare the wire cost and compute cost per handshake.

Step 1 — wire sizes. Count bytes added to the handshake.

| Primitive | Public key | Ciphertext / key share |
|---|---|---|
| X25519 | 32 B | 32 B |
| RSA-2048 key transport | 256 B | 256 B |
| ML-KEM-768 | 1184 B | 1088 B |

Why Kyber is ~37× X25519: Kyber's public key encodes k = 3 polynomials of degree 256 with coefficients in \mathbb{Z}_{3329} (\approx 12 bits each), plus a seed. That is 3 \times 256 \times 12 / 8 \approx 1152 bytes, plus 32 bytes of seed, giving 1184. The lattice security comes from having enough coefficients that the best lattice-sieving attack still costs 2^{192} operations — there is no compression of the underlying hard instance without losing security.

Step 2 — hybrid handshake overhead. Hybrid X25519+ML-KEM-768 carries both key shares and both ciphertexts. Extra bytes over pure X25519:

\Delta = 1184 + 1088 = 2272 \text{ bytes per handshake.}

A typical TCP initial window is 14 kB (IW10 at 1460 MSS). The hybrid handshake still fits comfortably in one round trip.
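The byte arithmetic of Steps 1–2, written out (the sizes are the ML-KEM-768 encodings quoted above):

```python
# ML-KEM-768 public-key size from first principles, plus the hybrid
# handshake overhead over pure X25519.
k, n, coeff_bits, seed = 3, 256, 12, 32

pk = k * n * coeff_bits // 8 + seed   # 3 * 256 * 12 / 8 + 32 = 1184 B
ct = 1088                             # compressed ciphertext (spec value)
extra = pk + ct                       # hybrid overhead over pure X25519

print(pk, extra)                      # 1184 2272
assert extra < 10 * 1460              # fits the IW10 initial window
```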

Step 3 — compute cost. On a Cortex-A55 (the ARM core in typical Indian POS hardware):

| Operation | X25519 | ML-KEM-768 |
|---|---|---|
| Keygen | ≈ 70 μs | ≈ 40 μs |
| Encap / shared secret | ≈ 70 μs | ≈ 50 μs |
| Decap | — | ≈ 40 μs |

Kyber is faster than X25519 on this hardware, despite the bigger keys. The reason is that X25519's dominant operation is 255-bit modular multiplication (expensive on small ARM cores without a dedicated multiplier), while Kyber's dominant operation is the Number Theoretic Transform on 256-coefficient polynomials mod 3329 — which reduces to 13-bit multiplications that Cortex-A55 handles comfortably.

Step 4 — end-to-end transaction impact. A UPI authorisation round-trip has a typical end-to-end latency of 300 ms, dominated by the switching network. The extra ~90 μs of Kyber compute and ~2 kB of handshake bytes add well under 1 ms to the total. The user does not notice.

Result. ML-KEM-768 is a drop-in replacement for the key-exchange portion of a TLS handshake: small enough to fit in the first TCP round trip, fast enough to not add perceptible latency, and strong enough to survive harvest-now-decrypt-later. The migration is happening at Chrome, Cloudflare, Signal, and Apple iMessage; NPCI's hybrid-handshake pilot followed in 2027.

[Figure: horizontal bar chart of wire sizes — X25519 pk 32 B, RSA-2048 modulus 256 B, ML-KEM-768 pk 1184 B and ct 1088 B, ML-DSA-65 pk ≈ 2 kB and signature ≈ 3.3 kB — with a dashed line marking the 256 B RSA baseline.]
Public keys, ciphertexts, and signatures for the main classical and post-quantum primitives. Kyber's public key is 4–5× RSA-2048's modulus; Dilithium signatures are ~13× RSA's. Kyber's values fit inside a single 1500-byte TCP segment; Dilithium's keys and signatures span two or three — the wire cost is measurable but not dealbreaking.

What this shows. The PQC wire cost is real but not catastrophic. For handshake-sized operations (TLS, SSH, IKE), the extra 1–3 kB fits in the first TCP round-trip window. For high-volume small-message applications (DNSSEC, TOTP, streaming telemetry), the aggregate cost is more visible — and an active area of optimisation.

Worked example — walking through an LWE decryption by hand

Example 2: a 4-dimensional LWE encryption and decryption

Setup. Take n = 4, q = 17, and Regev-style bit encryption. The secret is \mathbf{s} = (2, 1, -1, 0)^T \in \mathbb{Z}_{17}^4. The public key is built from m = 5 LWE samples, each one a small error added to a random \langle \mathbf{a}_i, \mathbf{s} \rangle. Encrypt and decrypt one message bit.

Step 1 — generate the public key. Pick random \mathbf{a}_i \in \mathbb{Z}_{17}^4 and small e_i \in \{-1, 0, 1\}.

  • \mathbf{a}_1 = (3, 5, 1, 7), e_1 = 1, b_1 = 3{\cdot}2 + 5{\cdot}1 + 1{\cdot}(-1) + 7{\cdot}0 + 1 = 11 \bmod 17.
  • \mathbf{a}_2 = (4, 2, 8, 3), e_2 = -1, b_2 = 4{\cdot}2 + 2{\cdot}1 + 8{\cdot}(-1) + 3{\cdot}0 - 1 = 1 \bmod 17.
  • \mathbf{a}_3 = (1, 6, 2, 4), e_3 = 0, b_3 = 2 + 6 - 2 + 0 = 6 \bmod 17.
  • \mathbf{a}_4 = (5, 3, 4, 2), e_4 = 1, b_4 = 10 + 3 - 4 + 0 + 1 = 10 \bmod 17.
  • \mathbf{a}_5 = (2, 7, 3, 5), e_5 = -1, b_5 = 4 + 7 - 3 + 0 - 1 = 7 \bmod 17.

Public key: the five pairs (\mathbf{a}_i, b_i). The secret \mathbf{s} stays hidden.

Step 2 — encrypt a message bit \mu = 1. Pick a random subset S \subseteq \{1, \ldots, 5\} — say S = \{1, 3, 5\}. Compute

\mathbf{u} = \sum_{i \in S} \mathbf{a}_i = (3+1+2, 5+6+7, 1+2+3, 7+4+5) = (6, 18, 6, 16) \equiv (6, 1, 6, 16) \bmod 17,
v = \sum_{i \in S} b_i + \mu \cdot \lfloor q/2 \rfloor = (11 + 6 + 7) + 1 \cdot 8 = 32 \equiv 15 \bmod 17.

Send (\mathbf{u}, v) = ((6, 1, 6, 16), 15). Why this encryption works: the ciphertext looks uniformly random to anyone without \mathbf{s} (the sum of uniform \mathbf{a}_i is uniform; the sum of b_i plus the encoded bit is masked by the random subset). With \mathbf{s}, the receiver can cancel the A\mathbf{s} part and recover the bit — up to a small error that is far from the message magnitude q/2 = 8.

Step 3 — decrypt. Compute v - \langle \mathbf{u}, \mathbf{s} \rangle \bmod 17.

\langle \mathbf{u}, \mathbf{s} \rangle = 6 \cdot 2 + 1 \cdot 1 + 6 \cdot (-1) + 16 \cdot 0 = 12 + 1 - 6 + 0 = 7.
v - \langle \mathbf{u}, \mathbf{s} \rangle = 15 - 7 = 8 \bmod 17.

Step 4 — round to nearest. The candidate values for the encoded bit are 0 and \lfloor q/2 \rfloor = 8. The computed value is 8 — closer to 8 than to 0 — so \mu = 1. Correct.

Step 5 — check that the error stayed small. Track the error term through the computation. The "signal" was 1 \cdot 8 = 8. The residual error is

\sum_{i \in S} e_i = 1 + 0 + (-1) = 0.

Total recovered value: 8 + 0 = 8, exactly the encoded bit magnitude. In a real Kyber instance the error is a polynomial-weighted sum of many small errors, bounded below q/4 with overwhelming probability. If an attacker tried to recover \mathbf{s} from the public key (\mathbf{a}_i, b_i) without solving LWE, they would face a \mathbb{Z}_{17}^4 search — fine at n=4, astronomically hopeless at n = 256.

Result. The same arithmetic your phone will run, at scale n = 256 rather than n = 4, is what protects your TLS handshake in the post-quantum era.
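The five steps can be replayed mechanically; every intermediate value below matches the hand computation:

```python
# The n = 4, q = 17 worked example, checked end to end.
q, s = 17, (2, 1, -1, 0)
samples = [((3, 5, 1, 7), 1), ((4, 2, 8, 3), -1), ((1, 6, 2, 4), 0),
           ((5, 3, 4, 2), 1), ((2, 7, 3, 5), -1)]

dot = lambda x, y: sum(a * b for a, b in zip(x, y))
pub = [(a, (dot(a, s) + e) % q) for a, e in samples]    # Step 1
assert [b for _, b in pub] == [11, 1, 6, 10, 7]

S, mu = [0, 2, 4], 1                                    # Step 2: S = {1, 3, 5}
u = tuple(sum(pub[i][0][j] for i in S) % q for j in range(4))
v = (sum(pub[i][1] for i in S) + mu * (q // 2)) % q
assert (u, v) == ((6, 1, 6, 16), 15)

d = (v - dot(u, s)) % q                                 # Steps 3-4
assert d == 8                                           # 15 - 7 = 8

# Round to the nearest of {0, q/2}: closer to 8 than to 0 -> bit 1.
recovered = 1 if abs(d - q // 2) < min(d, q - d) else 0
print("recovered bit:", recovered)                      # 1
```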

[Figure: flow diagram of the LWE round trip at n = 4, q = 17 — the five public-key pairs (aᵢ, bᵢ) on the left, an encryption box forming (u, v) = ((6, 1, 6, 16), 15) from subset S = {1, 3, 5} and bit μ = 1 in the middle, and a decryption box computing v − ⟨u, s⟩ = 8 and rounding to recover μ = 1 on the right.]
The LWE encryption round-trip at $n = 4$, $q = 17$. The public key is five noisy linear equations in the secret $\mathbf{s}$. Encryption sums a random subset of them and adds the message bit in the high half of the modulus. Decryption uses $\mathbf{s}$ to cancel the $A\mathbf{s}$ part; the residual error is small, so rounding reveals the bit. An attacker without $\mathbf{s}$ must solve LWE.

Indian industrial and academic lattice work

Lattice cryptography is where most of India's PQC engineering effort is concentrated. The SETS lab at IIT Kharagpur (Prof. Debdeep Mukhopadhyay) published one of the fastest public Cortex-M4 implementations of Dilithium and has contributed side-channel-resistant hardware designs to the CRYSTALS reference codebase. IIT Madras hosts the Centre for Cryptography, Cybersecurity and Distributed Trust (C3iHub) working on lattice cryptanalysis and formal verification of lattice-crypto code. IISc Bangalore works on lattice-based homomorphic encryption — same underlying hardness, applied to computing on encrypted data, with national-priority applications in private biometric matching for Aadhaar. ISI Kolkata contributes number-theoretic foundations and lattice-reduction attacks.

On the deployment side, CERT-In's draft 2026 PQC transition framework recommends hybrid X25519+ML-KEM-768 for all new TLS deployments from 2027 and Dilithium-based signatures in the CCA root-ceremony pipeline from 2028. The RBI cyber-security framework's "crypto-agility" clause (2024) implicitly requires all scheduled commercial banks to make lattice deployment configuration-switchable by end of 2027. State Bank of India and HDFC Bank have completed internal crypto-agility audits; production Kyber deployment to customer-facing endpoints is targeted for early 2028.

Common confusions

Going deeper

If you understand that a lattice is the set of integer combinations of basis vectors, that SVP and CVP and LWE are the hard problems cryptography rests on, that Kyber encodes message bits in the high half of q and decapsulates by rounding noisy inner products, and that Dilithium uses Fiat-Shamir with rejection sampling to sign messages — you have the core of this chapter. The material below is for readers who want Regev's worst-case-to-average-case reduction, Ring-LWE's algebraic structure, primal versus dual attacks, and the full FIPS 203 / 204 specifications.

Ajtai's worst-case-to-average-case reduction

The progenitor of modern lattice cryptography is Ajtai's 1996 result: breaking the Short Integer Solution (SIS) problem on average is at least as hard as approximating the Shortest Vector Problem in the worst case. This was the first cryptosystem whose security was provably reducible to a worst-case lattice problem, rather than to an average-case assumption hoped to be hard. Regev's (2005) LWE result extended the reduction to encryption (not just one-way functions) using a quantum reduction — LWE is at least as hard as worst-case SVP/SIVP under a polynomial-time quantum reduction, later classically matched by Brakerski et al. for certain parameter regimes.

Ring-LWE

Ring-LWE replaces \mathbb{Z}_q^n with the polynomial ring R_q = \mathbb{Z}_q[x]/(x^n + 1) where n is a power of 2. A sample (a, b) \in R_q \times R_q is formed as b = a \cdot s + e where the multiplication is polynomial multiplication mod (x^n + 1), and s and e have small coefficients.

The hardness of Ring-LWE is conjectured to be nearly equivalent to LWE for randomly chosen parameters, with a worst-case-to-average-case reduction to Ideal-SVP in the ring of integers of a cyclotomic number field (Lyubashevsky–Peikert–Regev 2010). The conjecture is stronger than plain LWE's — if an attack exploited the ring structure, Ring-LWE would fall first — but no such attack is known.

Arithmetic in R_q is fast. The Number Theoretic Transform (NTT) — a discrete analogue of the FFT over \mathbb{Z}_q when q \equiv 1 \pmod{2n} — computes polynomial multiplication in O(n \log n) operations of \mathbb{Z}_q-multiplication. Kyber uses NTT throughout its fast path; a single Kyber-768 key generation executes about six NTTs on 256-coefficient polynomials, each \sim 10 μs on a modern x86 core.
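Why pointwise multiplication works can be checked directly at toy scale: with q = 17 and n = 4 we have q \equiv 1 \pmod{2n}, and \psi = 2 satisfies \psi^4 = -1 mod 17, so the odd powers of \psi are exactly the roots of x^4 + 1. A sketch (evaluation at the roots is shown rather than a full butterfly NTT):

```python
# NTT correctness in miniature: reducing mod x^4 + 1 commutes with
# evaluating at the roots of x^4 + 1, so pointwise products of
# evaluations ARE the negacyclic product's evaluations.
q, n, psi = 17, 4, 2
assert pow(psi, n, q) == q - 1                  # psi^4 = -1 mod 17

def evaluate(poly, x):
    return sum(c * pow(x, i, q) for i, c in enumerate(poly)) % q

def negacyclic_mul(a, b):
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sgn = 1 if i + j < n else -1        # x^n = -1
            c[(i + j) % n] = (c[(i + j) % n] + sgn * a[i] * b[j]) % q
    return c

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
c = negacyclic_mul(a, b)
for i in range(n):
    root = pow(psi, 2 * i + 1, q)               # an odd power: a root of x^4 + 1
    assert evaluate(c, root) == evaluate(a, root) * evaluate(b, root) % q
print("evaluation at roots matches pointwise products")
```

The real NTT just computes those n evaluations (and their inverse) in O(n \log n) with a butterfly network instead of Horner-style evaluation.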

Module-LWE

Module-LWE, used by Kyber and Dilithium, sits between plain LWE (rank-n scalars) and Ring-LWE (rank-1 polynomials). Its samples are formed from matrices over a structured ring: \mathbf{b} = A \mathbf{s} + \mathbf{e} where A \in R_q^{k \times k}, \mathbf{s}, \mathbf{e} \in R_q^k with small coefficients. Changing k tunes the security–size trade-off without changing the ring, giving a single implementation that serves multiple security levels.
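One Module-LWE sample at toy scale (k = 2, n = 4, q = 17; the concrete polynomials below are illustrative, not drawn from any spec):

```python
# A Module-LWE sample b = A s + e over R_17 = Z_17[x]/(x^4 + 1):
# every entry of A, s, e is itself a polynomial in the ring.
q, n, k = 17, 4, 2

def ring_mul(a, b):
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sgn = 1 if i + j < n else -1        # x^n = -1 in R_q
            c[(i + j) % n] = (c[(i + j) % n] + sgn * a[i] * b[j]) % q
    return c

def ring_add(a, b):
    return [(x + y) % q for x, y in zip(a, b)]

A = [[[3, 5, 1, 7], [4, 2, 8, 3]],              # A in R_q^(2x2)
     [[1, 6, 2, 4], [5, 3, 4, 2]]]
s = [[1, 0, 16, 0], [0, 1, 0, 0]]               # small secret (16 = -1)
e = [[1, 0, 0, 16], [0, 16, 1, 0]]              # small error

b = [ring_add(ring_add(ring_mul(A[i][0], s[0]),
                       ring_mul(A[i][1], s[1])), e[i]) for i in range(k)]
print(b)   # the public vector; recovering s from (A, b) is Module-LWE
```

Raising k from 2 to 3 or 4 is all that separates ML-KEM-512 from -768 and -1024 — the ring, and hence the implementation, never changes.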

Primal vs dual Regev encryption

The Kyber construction described above is the primal (Regev/LPR-style) form: the public key contains \mathbf{b} = A \mathbf{s} + \mathbf{e}, the ciphertext is (\mathbf{u}, v) = (A^T \mathbf{r} + \mathbf{e}_1, \mathbf{b}^T \mathbf{r} + e_2 + \text{Encode}(m)), and decryption computes v - \mathbf{s}^T \mathbf{u}.

There is also a dual Regev form (Gentry–Peikert–Vaikuntanathan), which reverses the roles of key and randomness: the public key is a short-vector syndrome and encryption uses fresh LWE samples. Dual Regev is the workhorse of identity-based and other advanced lattice constructions; Kyber itself stays with the primal form and relies on aggressive ciphertext compression for parameter efficiency.

Lattice sieving and BKZ block reduction

The best classical attack on high-dimensional lattices is sieving. Given a lattice L \subset \mathbb{R}^n, generate a large pool of lattice vectors (say 2^{0.2 n} of them), then repeatedly subtract pairs of close vectors to produce shorter ones. Eventually the pool contains vectors shorter than any initially generated. The asymptotic cost is around 2^{0.292 n} classically (Becker–Ducas–Gama–Laarhoven) and about 2^{0.265 n} quantumly (with Grover's amplitude amplification on the nearest-neighbour search).

BKZ (Block Korkine–Zolotarev) reduction is the more practical algorithm: it works on sub-blocks of the basis, running sieving on each block and propagating the result through the full basis. BKZ's running time grows with block size; increasing the block size trades time for shorter output vectors. Kyber's parameters assume an attacker runs BKZ with the largest affordable block size; under the core-SVP model (cost \approx 2^{0.292 \beta} for block size \beta), exhausting ML-KEM-768's \sim 2^{192} classical budget corresponds to block sizes above 600.

Side channels and constant-time implementation

Kyber and Dilithium implementations must be constant-time — no secret-dependent branches, no secret-dependent memory accesses — otherwise timing side channels leak the secret. The original CRYSTALS reference code is constant-time; many cryptanalytic papers (including published attacks on early Falcon implementations) demonstrate what happens when it is not. The Indian engineering contribution to Dilithium hardware includes formally verified constant-time assembly for ARM Cortex-M4, Cortex-A53/A55, and RISC-V.

Decryption-failure attacks

Because Kyber has a small but non-zero decryption-failure probability, an attacker who can observe many encryptions of chosen messages and flag which ones decrypt correctly can mount a decryption-failure attack: the pattern of failures leaks information about the secret. The Fujisaki–Okamoto transform (used in ML-KEM) upgrades a chosen-plaintext-secure encryption scheme into a chosen-ciphertext-secure KEM by making the encryption deterministic under a hash of the message and checking, on decryption, that the recovered message re-encrypts to the received ciphertext. Chosen-ciphertext attackers cannot probe decryption outcomes without producing valid ciphertexts — which requires knowing the message anyway.

Formal specification pointers

Where this leads next

References

  1. Oded Regev, On Lattices, Learning with Errors, Random Linear Codes, and Cryptography (JACM 2009) — arXiv:quant-ph/0405057.
  2. NIST, FIPS 203 — Module-Lattice-Based Key-Encapsulation Mechanism Standard (ML-KEM) — csrc.nist.gov/pubs/fips/203.
  3. NIST, FIPS 204 — Module-Lattice-Based Digital Signature Standard (ML-DSA) — csrc.nist.gov/pubs/fips/204.
  4. Chris Peikert, A Decade of Lattice Cryptography (2016) — eprint.iacr.org/2015/939.
  5. Wikipedia, Lattice-based cryptography.
  6. John Preskill, Lecture Notes on Quantum Computation, Chapter 8 (Cryptography) — theory.caltech.edu/~preskill/ph229.