Trotter-Suzuki Decomposition

Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.

In short

Given a Hamiltonian H = \sum_{j=1}^J h_j that is a sum of k-local pieces, you want to implement the time-evolution unitary e^{-iHt} on a quantum computer. The obstacle: e^{A+B} \neq e^A e^B when A and B do not commute, so you cannot just multiply the exponentials of the individual terms. Trotter's 1959 formula provides the fix: for large n,

e^{-iHt} \approx \left(\prod_{j=1}^J e^{-i h_j t/n}\right)^n.

Each factor e^{-i h_j t/n} is a small, bounded-size unitary — easy to implement as a short quantum circuit because h_j is k-local. The price is an error that scales as O(t^2/n) per Trotter step of the sum — so to achieve total precision \epsilon, you need n = O(t^2 / \epsilon) Trotter steps and a total gate count of O(J \cdot t^2 / \epsilon).

Seth Lloyd's 1996 Science paper [2] turned this observation into the first efficient quantum-simulation algorithm: polynomial in the number of qubits, polynomial in the evolution time, polynomial in 1/\epsilon. Everything in Part 15 — higher-order Trotter, linear combinations of unitaries, qubitisation — is a refinement of, or a replacement for, this single idea.

You have a Hamiltonian in hand. Last chapter gave you the zoo — Ising, Heisenberg, Hubbard, molecular — and convinced you that every one of them is a sum of k-local terms, H = \sum_j h_j. Your quantum computer wants to run e^{-iHt}, the unitary that evolves a state for time t under H. You know what each e^{-i h_j t} looks like individually — a small bounded-size unitary on the few qubits h_j touches, which you can compile into elementary gates. What you need is a way to glue those pieces into the full e^{-iHt}.

Here is the wrinkle. If A and B were ordinary numbers, you could write e^{A+B} = e^A \cdot e^B without a second thought. But Hamiltonian terms are matrices, and matrices do not commute in general. The Pauli operators X and Z famously satisfy XZ = -ZX — their product depends on the order. For non-commuting operators, e^{A+B} is not the same as e^A e^B. If you just multiplied the exponentials of the individual terms, you would be computing the wrong unitary.

Trotter's insight — the foundation of quantum Hamiltonian simulation — is that if you chop the time into very small slices, the error from pretending the pieces commute shrinks faster than the number of slices grows. In the limit of infinitely many slices, the approximation becomes exact. For any finite number of slices, the error is controllable. That is what you are going to derive and count in this chapter.

The core obstacle — non-commutativity

Start with the sharpest possible picture. Two 2 \times 2 matrices:

A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = X, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = Z.

Compute both orderings.

AB = XZ = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad BA = ZX = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.

Why they are different: X flips the basis then Z adds a sign to |1\rangle; but Z first adds a sign then X flips — the sign ends up on the other basis element. The commutator [X, Z] = XZ - ZX = -2iY is not zero. Two Pauli operators on the same qubit generically do not commute.

Expand e^{X+Z} as a power series to second order:

e^{X+Z} = I + (X+Z) + \frac{(X+Z)^2}{2} + \ldots = I + (X+Z) + \frac{X^2 + XZ + ZX + Z^2}{2} + \ldots

Now expand e^X e^Z, also to second order:

e^X e^Z = \left(I + X + \frac{X^2}{2}\right)\left(I + Z + \frac{Z^2}{2}\right) + \ldots

= I + X + Z + XZ + \frac{X^2 + Z^2}{2} + \ldots

Compare the second-order pieces. In e^{X+Z} you have \frac{1}{2}(XZ + ZX) — the symmetric combination. In e^X e^Z you have XZ alone. The difference is

e^X e^Z - e^{X+Z} = \frac{XZ - ZX}{2} + \ldots = \frac{[X,Z]}{2} + \ldots

Why this matters: whenever [A, B] \neq 0, the product e^A e^B differs from e^{A+B} at second order in the size of A and B. The leading error is proportional to the commutator. If you make A and B smaller — say, replace A by A/n and B by B/n — the commutator gets smaller by 1/n^2. That is the lever Trotter will pull.

Non-commutativity made visible. For a single qubit, evolving by $A$ then $B$, by $B$ then $A$, and by $A+B$ directly land at three different points on the Bloch sphere. The gap between endpoint 1 / 2 and the true endpoint is controlled by the commutator $[A, B]$. Trotter's trick is to cut the time into many small pieces so each gap becomes tiny.

So if A and B do not commute, e^{A+B} is not e^A e^B. What is it?

Trotter's formula — the idea

Trotter's 1959 theorem [1] says this. For any bounded operators A and B (bounded means their operator norms \|A\| and \|B\| are finite — always true for k-local terms with bounded coefficients), the limit

e^{A+B} = \lim_{n \to \infty} \left(e^{A/n} \, e^{B/n}\right)^n

holds exactly. The product of exponentials, if you slice each one into n pieces and interleave them, converges to the exponential of the sum.

Heuristic picture. Cut the time interval [0, t] into n small slices, each of length \Delta t = t/n. Over one slice, evolve first by A alone for time \Delta t, then by B alone for time \Delta t. Do this n times in a row. In the limit n \to \infty, \Delta t \to 0, the non-commutativity error in each slice shrinks like (\Delta t)^2 = (t/n)^2, and since there are n slices, the total error shrinks like t^2 / n \to 0. You end up with the correct unitary e^{A+B}.

That is the idea in one breath. Now make it quantitative.

The first-order Trotter formula

Substitute A \to -iAt, B \to -iBt into Trotter's identity (the -i factors and the time t are absorbed into the operators). The formula becomes

e^{-i(A+B)t} = \lim_{n \to \infty} \left(e^{-iAt/n} \, e^{-iBt/n}\right)^n.

For finite n, define the first-order Trotter approximation

U_1(t, n) \;=\; \left(e^{-iAt/n} \, e^{-iBt/n}\right)^n.

The question is: how close is U_1(t, n) to the true unitary e^{-i(A+B)t}?

The error bound, derived

You can read off the leading-order error from the Baker-Campbell-Hausdorff expansion. For small \Delta t = t/n:

e^{-iA\Delta t} e^{-iB\Delta t} = \exp\!\left(-i(A+B)\Delta t - \frac{\Delta t^2}{2}[A, B] + O(\Delta t^3)\right).

Why: the BCH formula says e^X e^Y = e^{X + Y + \tfrac{1}{2}[X, Y] + \ldots}. Plug in X = -iA\Delta t, Y = -iB\Delta t. The commutator [X, Y] = -\Delta t^2 [A, B]. The leading correction to e^{-i(A+B)\Delta t} is -\tfrac{1}{2}\Delta t^2 [A, B].

Step from one slice to n slices. Each slice introduces an error of size \frac{1}{2}\Delta t^2 \|[A, B]\| in the operator norm. The errors accumulate — roughly, the total error after n slices is n times the per-slice error:

\|U_1(t, n) - e^{-i(A+B)t}\| \;\leq\; \frac{n \cdot \Delta t^2 \cdot \|[A, B]\|}{2} \;=\; \frac{t^2 \, \|[A, B]\|}{2 n}.

This is the first-order Trotter error bound. It scales as O(t^2 / n) — quadratic in the total time t, inverse in the number of slices n.

Generalising to a sum of J terms

For a local Hamiltonian H = \sum_{j=1}^J h_j, apply the same idea piece by piece. One Trotter step through all J terms is

S_1(\Delta t) \;=\; e^{-i h_1 \Delta t} \, e^{-i h_2 \Delta t} \cdots e^{-i h_J \Delta t}.

The full simulation is n Trotter steps:

U_1^H(t, n) \;=\; \big(S_1(t/n)\big)^n \;=\; \left(\prod_{j=1}^{J} e^{-i h_j t/n}\right)^n.

The error per step picks up contributions from every pair of non-commuting terms. A standard bound [2][4]:

\|U_1^H(t, n) - e^{-iHt}\| \;\leq\; \frac{t^2}{2n} \sum_{j < k} \|[h_j, h_k]\|.

If each \|h_j\| \leq \Lambda (a uniform bound on each term's operator norm) and there are J terms, the sum of commutator norms is at most J^2 \Lambda^2, giving

\|U_1^H(t, n) - e^{-iHt}\| \;=\; O\!\left(\frac{J^2 \Lambda^2 t^2}{n}\right).

For k-local Hamiltonians with \|h_j\| = O(1) and J polynomial in the number of qubits N, this is the bound. For many physical Hamiltonians the commutator count is much smaller — only pairs of terms that share a qubit can fail to commute, which for nearest-neighbour 2-local Hamiltonians gives O(N) non-zero commutators, not O(N^2).

First-order Trotter formula

Let H = \sum_{j=1}^{J} h_j be a sum of bounded Hermitian operators on N qubits. The first-order Trotter approximation of e^{-iHt} with n Trotter steps is

U_1^H(t, n) = \left(\prod_{j=1}^{J} e^{-i h_j t/n}\right)^n.

Error bound. \|U_1^H(t, n) - e^{-iHt}\| \leq \dfrac{t^2}{2n} \sum_{j < k} \|[h_j, h_k]\|.

Gate count. Each factor e^{-i h_j t/n} implements in O(1) gates (since h_j is k-local with bounded k). Total gates: O(J \cdot n). To reach error \epsilon, set n = O(J^2 \Lambda^2 t^2 / \epsilon), giving total gates O(J^3 \Lambda^2 t^2 / \epsilon).

For practical k-local Hamiltonians (J = O(N), \Lambda = O(1)): total gates O(N^3 t^2 / \epsilon). Higher-order formulas (next chapter) improve the dependence on t and \epsilon.

Reading the bound. Three knobs control the cost:

Longer evolution (t bigger) costs quadratically more steps for the same precision.
Tighter precision (\epsilon smaller) costs inverse-linearly more steps.
More non-commuting term pairs (J bigger, or more commutator weight) cost linearly more per step.

A picture of one Trotter step

A first-order Trotter simulation of $e^{-iHt}$ unrolled as a quantum circuit. The total evolution is cut into $n$ slices; each slice runs the $J$ terms of the Hamiltonian in sequence, each for a short time $\Delta t = t/n$. Total gate count is $J \cdot n$, all bounded-size pieces, provided the Hamiltonian is $k$-local.

Why locality is the load-bearing assumption

Take a moment to see what makes this work. Each factor e^{-i h_j t/n} is the exponential of a k-local operator. An operator that touches k qubits is a 2^k \times 2^k matrix — size 4 for k=2, size 16 for k=4, a fixed number. Its exponential is another 2^k \times 2^k unitary, and by the Solovay-Kitaev theorem any such unitary can be built out of O(4^k) gates from a universal gate set, to error \epsilon, in \text{polylog}(1/\epsilon) steps.

For bounded k, this is O(1) gates per factor. The full Trotter circuit has J \cdot n factors, each O(1) gates, so the total gate count is O(J n). Setting n = O(J^2 t^2 / \epsilon) to hit error \epsilon gives total gates O(J^3 t^2 / \epsilon) — polynomial in every parameter. This is Lloyd's efficient quantum simulation theorem [2].

Contrast with a generic (non-local) Hamiltonian. An arbitrary 2^N \times 2^N Hermitian has 4^N independent real parameters. There is no hope of decomposing e^{-iHt} into anything polynomial in N; the unitary simply has too much information in it. Locality is what makes the problem tractable, and Trotter is the algorithm that exploits it.

Why this is the Feynman-Lloyd theorem: Feynman (chapter 129) argued that nature, being quantum-mechanical, can only be efficiently simulated by another quantum system. Lloyd's 1996 paper gave the first proof — an explicit quantum algorithm (first-order Trotter) that simulates any k-local Hamiltonian in polynomial time, polynomial space, and with bounded error. Before Lloyd, "efficient quantum simulation" was a hope. After Lloyd, it was a theorem.

Examples — work through the counts

Example 1: simulate $e^{-i(X+Z)t}$ on one qubit

Setup. The simplest non-commuting Hamiltonian: two Paulis on one qubit. H = X + Z. You want to apply e^{-iHt} to a qubit in state |0\rangle. Take t = 1 (in some unit of inverse energy) and aim for error \epsilon = 10^{-2} using first-order Trotter.

Step 1. Split and exponentiate each term. X is 1-local, Z is 1-local, both single-qubit Paulis. The exponentials are standard single-qubit rotations:

e^{-iX\Delta t} = \cos(\Delta t)\, I - i \sin(\Delta t)\, X, \qquad e^{-iZ\Delta t} = \begin{pmatrix} e^{-i\Delta t} & 0 \\ 0 & e^{i\Delta t} \end{pmatrix}.

Why: for any Pauli P with P^2 = I, e^{-iP\theta} = \cos\theta \, I - i\sin\theta \, P. Paulis are involutions; their exponentials are rotations by angle \theta on the Bloch sphere about the corresponding axis.

Step 2. Choose n. The commutator is [X, Z] = -2iY, so \|[X, Z]\| = 2. The error bound is \|U_1 - e^{-i(X+Z)}\| \leq t^2 \|[X,Z]\| / (2n) = 1 / n. For error \epsilon = 10^{-2}, set n = 100 Trotter steps.

Step 3. Build the circuit. One Trotter step is R_X(2/n) \cdot R_Z(2/n) where R_P(\theta) = e^{-iP\theta/2} is the standard rotation convention. Repeat n times. Total gate count: 2n = 200 single-qubit rotations. Modern compilers collapse consecutive same-axis rotations, but for the plain algorithm: 200 gates.

Step 4. Verify numerically. The exact e^{-i(X+Z)} is a rotation on the Bloch sphere about the axis (1, 0, 1)/\sqrt{2} by angle 2\sqrt{2}. The Trotter approximation U_1(1, 100) is close — you can compute \|U_1 - e^{-i(X+Z)}\| explicitly and check it is roughly 10^{-2}, matching the bound.

Result. First-order Trotter with n = 100 steps reaches 1% precision on e^{-i(X+Z)} at t = 1, using 200 single-qubit rotations.

Example 1 circuit. 100 Trotter steps, each applying $R_Z$ then $R_X$ with small angle, approximates $e^{-i(X+Z)}$ to 1% precision.

What this shows. Even for one qubit, Trotter is doing real work: [X, Z] \neq 0, so you cannot just apply e^{-iX} once and e^{-iZ} once. You have to interleave many small rotations to approximate the true evolution.

Example 2: gate count for the 3×3 Ising model

Setup. From the last chapter's example: the transverse-field Ising model on a 3×3 periodic lattice.

H = -J \sum_{\langle i,j\rangle} Z_i Z_j - h \sum_{i=1}^{9} X_i, \qquad J = h = 1.

J and h here are energy scales (units of inverse time), not the number of terms. The number of terms: 18 bond ZZ terms + 9 field X terms = 27 terms total, all k-local with k = 2. You want to simulate e^{-iHt} for t = 10 (in units of inverse energy) with total error \epsilon = 10^{-2}.

Step 1. Choose the Trotter order and step count. First-order Trotter error bound: \|U_1 - e^{-iHt}\| \leq t^2 C / n where C is a problem-dependent constant that depends on commutators. For a 2D Ising model with all coefficients O(1), numerical experiments and analytical bounds give C on the order of tens (rough estimate).

Using C \approx 30 (absorbing the sum of commutator norms):

\epsilon = \frac{C t^2}{n} \;\Longrightarrow\; n = \frac{C t^2}{\epsilon} = \frac{30 \cdot 100}{0.01} = 3 \times 10^5.

Step 2. Count gates per Trotter step. 27 terms per step, each a single small rotation (an R_{ZZ} or R_X). An R_{ZZ} compiles to 2 CNOTs + 1 single-qubit rotation = 3 elementary gates. An R_X is 1 gate. So gates per Trotter step: 18 \cdot 3 + 9 \cdot 1 = 63 elementary gates.

Step 3. Total gate count. 63 \times 3 \times 10^5 \approx 2 \times 10^7 gates.

Why this matters: 20 million gates is well beyond NISQ reach (current hardware's gate-error ceiling is around 10^{-3}, and 10^7 gates all need to work to get one measurement). To actually run this simulation, you need either a fault-tolerant machine or a shorter time / larger error tolerance. Using higher-order Trotter (next chapter) or LCU would reduce this to perhaps 10^4 gates.

Step 4. Back-of-envelope with higher-order. Fourth-order Trotter (chapter 132) reduces n to O(t^{5/4}/\epsilon^{1/4}). For t = 10, \epsilon = 10^{-2}: n \sim 10^{5/4} / (10^{-2})^{1/4} \sim 18 \cdot 3.2 \approx 60. A dramatic reduction — at the cost of more gates per Trotter step (factor ~5).

Result. First-order Trotter on the 3×3 Ising model for t = 10, \epsilon = 10^{-2}: roughly 2 \times 10^7 gates. Fourth-order Trotter: roughly 3 \times 10^3 gates. A 10,000× reduction, which is why the next chapter exists.

Gate count for simulating the 3×3 Ising model at $t = 10$ to precision $10^{-2}$, using three Trotter orders. First-order needs ~$2 \times 10^7$ gates; higher orders (next chapter) reduce this by four orders of magnitude.

What this shows. First-order Trotter is simple and always works, but its gate count explodes for long simulations. Knowing the number of terms, the time, and the precision lets you estimate in advance whether a simulation is feasible on available hardware.

Common confusions

"Trotter is exact." No — Trotter is exact only in the n \to \infty limit. For any finite n, there is a nonzero error proportional to t^2 / n. The statement "Trotter becomes exact for small step sizes" is right as a limit; "Trotter is exact" is wrong.
"If the Hamiltonian terms commute, there is no Trotter error." Correct. If every [h_j, h_k] = 0, then e^{-iHt} = \prod_j e^{-i h_j t} exactly, and Trotter gives the answer in n = 1 step. This is the reason diagonalisable Hamiltonians (which are sums of commuting projectors) are trivial to simulate.
"More Trotter steps always beats the noise trade-off on hardware." No, and this is a real engineering dilemma. Each Trotter step costs gates, and each gate has a small error probability p_{\text{gate}}. For G total gates, the circuit error is roughly G \cdot p_{\text{gate}}, which grows with n. So there is an optimal n that balances Trotter error (falling) against gate noise (rising). Past the optimum, more steps make things worse.
"Trotter and qubitisation are alternative approaches to the same problem." Close, but they solve the problem differently. Trotter uses the Hamiltonian directly through its time-evolution operator e^{-iHt}. Qubitisation (Low-Chuang 2017) uses a block encoding of H as a sub-block of a larger unitary, and achieves better asymptotic scaling — O(t + \log(1/\epsilon)) instead of Trotter's O(t^2/\epsilon). For very long simulations or very tight precision, qubitisation is the winner; for short simulations and reasonable precision, Trotter is simpler and sometimes cheaper.
"First-order Trotter is good enough for any problem." The asymptotic scaling n \propto t^2 / \epsilon says otherwise. For chemical-accuracy calculations (\epsilon \sim 10^{-3} Hartree) on molecules over tens of atoms, the gate count at first order exceeds 10^{10} — beyond any foreseeable machine. Higher-order Trotter (next chapter) and post-Trotter methods (LCU, qubitisation) were developed precisely because first-order is not good enough for real chemistry.
"Trotter only works for Pauli-sum Hamiltonians." Not quite. Trotter works for any sum H = \sum_j h_j where each h_j can be efficiently exponentiated. Pauli-sum form is the most common because e^{-i P \theta} for a Pauli string P is standard. But h_j can be any k-local operator — a projector, a hopping term, a four-fermion Coulomb term — as long as you have a gate implementation for e^{-i h_j \Delta t}.

Going deeper

You now have first-order Trotter: the formula, the error bound, the gate count, the picture. That is enough to write a working quantum simulator for any k-local Hamiltonian. What follows goes into the structure below the bound — Baker-Campbell-Hausdorff corrections, tighter commutator-scaling bounds, the modern post-Trotter landscape (LCU, qDRIFT, qubitisation), and applications to quantum chemistry where real gate-count engineering happens.

Baker-Campbell-Hausdorff corrections

The formula e^A e^B = e^{A + B + \tfrac{1}{2}[A, B] + \tfrac{1}{12}([A, [A, B]] - [B, [A, B]]) + \ldots} is the Baker-Campbell-Hausdorff expansion. Every term in the BCH series is a nested commutator of A and B. When you Trotterise with small step \Delta t = t/n, each slice introduces a correction

\Delta t \cdot (A + B) + \tfrac{\Delta t^2}{2}[A, B] + O(\Delta t^3)

in the exponent. Summing over n slices, the linear part gives the desired t(A + B); the quadratic commutator term gives a total error \propto n \cdot \Delta t^2 = t^2/n, which is the first-order Trotter error. Pushing BCH to third order gives the leading O(t^3/n^2) correction after careful reshuffling — which is the error of second-order (symmetric) Trotter. Higher-order Suzuki formulas kill these corrections systematically.

Commutator-scaling bounds — tighter than worst-case

The bound \|U_1 - e^{-iHt}\| \leq \tfrac{t^2}{2n} \sum_{j < k} \|[h_j, h_k]\| is tighter than the worst-case bound O(J^2 \Lambda^2 t^2 / n) because many commutators vanish. For a nearest-neighbour 2-local Hamiltonian on an N-qubit lattice, only terms sharing a qubit fail to commute — there are O(N) such non-zero commutators, not O(N^2). Childs et al. 2019 [6] proved commutator-based Trotter bounds with explicit constants, showing that physical Hamiltonians often need far fewer steps than worst-case analysis predicts. This "commutator scaling" is why real-world quantum-chemistry simulations use Trotter in practice.

Post-Trotter methods

Three modern algorithms improve over Trotter.

Linear Combinations of Unitaries (LCU) [5] (Berry, Childs, Kothari 2015). Expand e^{-iHt} = \sum_k c_k U_k as a linear combination of unitaries — for instance, via the Taylor series e^{-iHt} = \sum_k (-iHt)^k / k!. Apply each U_k via a multiplexed quantum oracle and use amplitude amplification to select the weighted sum. Scaling: O(\tau \log(\tau/\epsilon) / \log\log(\tau/\epsilon)) where \tau = t \|H\|_1.
qDRIFT (Campbell 2019). Randomly sample terms h_j from H with probability proportional to \|h_j\| and apply them. Expected gate count scales as O(t^2 \|H\|_1^2 / \epsilon) — independent of the number of terms J, a huge win for Hamiltonians with many small terms.
Qubitisation (Low, Chuang 2017). Block-encode H into a larger unitary and implement e^{-iHt} via Quantum Signal Processing. Scaling: O(t \|H\| + \log(1/\epsilon)) — optimal in both t and \epsilon. See chapter 134.

The hierarchy: Trotter is simplest and most versatile; LCU is better scaling but needs more ancilla qubits; qubitisation is optimal but requires the most complex block-encoding machinery.

Application to quantum chemistry

For real molecules, first-order Trotter applied to the molecular Hamiltonian (Part 15 later chapters) gives gate counts of order 10^{10}–10^{14} for chemical accuracy — well out of reach of NISQ. Much of the last decade of quantum-chemistry-on-QC research is about engineering the constant factors. Key tricks:

Term ordering. Reordering the terms so commuting groups evolve together can shave orders of magnitude off the gate count.
Fermionic SWAP networks. Exploit the structure of Jordan-Wigner encodings to compile each e^{-i h_j \Delta t} efficiently.
Trotter error from the actual commutator structure. Rather than worst-case bounds, numerical studies estimate the true Trotter error for specific molecules and show that fewer steps often suffice.
Adaptive step sizes. Use small Trotter steps where the Hamiltonian changes rapidly and larger steps elsewhere.

A 2020 paper by Campbell and co-authors estimated that simulating FeMoCo — the nitrogen-fixation enzyme active site — needs around 10^{10} gates with heavily optimised Trotter. With qubitisation, that drops to 10^{8}. Both are beyond NISQ; fault tolerance is required. The open question is whether the hardware gets there before the classical competition (tensor networks, DMRG extended to larger systems) solves the same problems.

Indian research connections

TIFR's theoretical condensed-matter group has published on Trotter-based simulation of Hubbard models; IISc Bangalore's quantum chemistry group (Mukhopadhyay, Mandal, and others) has VQE + Trotter hybrid-algorithm papers. India's National Quantum Mission explicitly lists Hamiltonian simulation as a priority use case for the future fault-tolerant quantum computing effort.

Where this leads next

Higher-Order Trotter — chapter 132, Suzuki's symmetric and fourth-order formulas that dramatically improve the error-vs-gate-count trade-off.
Linear Combinations of Unitaries — chapter 133, the first major post-Trotter method.
Qubitisation and Block Encoding — chapter 134, the optimal-scaling modern method.
Local Hamiltonians — chapter 130, the structural assumption that makes Trotter efficient.
Feynman Revisited — chapter 129, the original motivation for quantum Hamiltonian simulation.

References

Wikipedia, Lie product formula (Trotter product formula) — the original Trotter 1959 identity and its extensions.
Seth Lloyd, Universal Quantum Simulators, Science 273, 1073 (1996) — DOI:10.1126/science.273.5278.1073.
Nielsen and Chuang, Quantum Computation and Quantum Information, §4.7 — Cambridge University Press.
John Preskill, Lecture Notes on Quantum Computation, Chapter 7 — theory.caltech.edu/~preskill/ph229.
Berry, Childs, Kothari, Hamiltonian simulation with nearly optimal dependence on all parameters (2015) — arXiv:1501.01715.
Childs, Su, Tran, Wiebe, Zhu, A Theory of Trotter Error (2019) — arXiv:1912.08854.