Why Isn't Checking 20 Examples Enough to Prove a Universal Statement?

You are asked to show that "for every positive integer n, the expression n^2 - n is even." You quickly try n = 1, 2, 3, \ldots, 20 and all twenty come out even. Your teacher reads your work and writes: "This is not a proof." You are confused. Twenty cases! Every single one worked! What more could possibly be wanted?

The answer is uncomfortable but clean: this universal statement quantifies over an infinite set. Twenty examples cover twenty of infinitely many cases. The remaining cases are untested, and mathematics does not accept extrapolation from a finite sample as proof. This article explains why, when checking more examples is useful, and what replaces it.

The core reason: "every" is infinite

The statement "for every positive integer n, n^2 - n is even" makes a claim about \mathbb{N} = \{1, 2, 3, \ldots\}. This set is countably infinite. Checking 20 values — even checking a billion — leaves infinitely many values not checked. A single hidden failure among those untested values would make the statement false. Your 20 successes cannot rule that out.

Why finite checking cannot suffice: the logical form of the claim is \forall n \in \mathbb{N},\, P(n). To verify \forall, you need every single n to satisfy P. To falsify \forall, you need one n where P(n) fails. Your 20 successes rule out a falsifier among those 20 values; they say nothing about the values that remain, and removing 20 elements from a countably infinite set still leaves a countably infinite set. Mathematics does not accept "probably works" as a proof.
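The asymmetry between verifying and falsifying can be made concrete with a short Python sketch. The helper name `first_counterexample` and its `limit` parameter are illustrative choices, not part of any library. What matters is how to read the return value: a number is a disproof, while `None` says only that nothing failed below the limit.

```python
from typing import Callable, Optional


def first_counterexample(pred: Callable[[int], bool], limit: int) -> Optional[int]:
    """Return the smallest n in 1..limit where pred(n) is False, else None.

    Hypothetical helper for illustration. A None result is NOT a proof:
    it only reports that no failure was found among the first `limit` cases.
    """
    for n in range(1, limit + 1):
        if not pred(n):
            return n
    return None


def n_squared_minus_n_is_even(n: int) -> bool:
    return (n * n - n) % 2 == 0


# Twenty successes: no falsifier among n = 1..20, and nothing more than that.
print(first_counterexample(n_squared_minus_n_is_even, 20))  # None

# A false claim ("every positive integer is below 1000") passes the same
# test at limit 20 and is caught only when the search reaches far enough.
print(first_counterexample(lambda n: n < 1000, 20))    # None
print(first_counterexample(lambda n: n < 1000, 5000))  # 1000
```

The true claim and the false claim are indistinguishable at limit 20, which is exactly the teacher's objection.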

History's most famous trap: the n^2 - n + 41 primes

Euler noticed that the polynomial p(n) = n^2 - n + 41 produces primes for a remarkable run of values:

p(1) = 41 (prime)
p(2) = 43 (prime)
p(3) = 47 (prime)
\ldots
p(40) = 1601 (prime)

Forty straight primes. A student who checks the first forty positive integers and concludes "n^2 - n + 41 is always prime" would have forty successful cases, twice as many as your twenty. Then:

p(41) = 41^2 - 41 + 41 = 41^2 = 1681 = 41 \times 41.

Not prime. The polynomial fails at n = 41, and nothing in the run of successes below that point gives any warning. Forty examples did not establish the universal claim. Nor is the formula "usually prime, rarely composite" in some vague sense: p(n) is provably divisible by 41, and therefore composite, at n = 41 and at every positive multiple of 41. The pattern looks like a law for forty inputs and then breaks.
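The claim about multiples of 41 follows from a one-line factorisation. As a sketch, write n = 41k for a positive integer k (the symbol k is introduced here for illustration):

```latex
\begin{align*}
p(41k) &= (41k)^2 - 41k + 41 \\
       &= 41\,(41k^2 - k + 1), \\
41k^2 - k + 1 &= k(41k - 1) + 1 \;\ge\; 41 \quad \text{for } k \ge 1 .
\end{align*}
```

Both factors are at least 41, so p(41k) is composite for every k \ge 1. Notice that this is itself a small proof: one symbolic argument settles infinitely many inputs, which no list of examples could do.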

This is the cleanest illustration of why example-checking never substitutes for a proof. If you had stopped at 40 and written a "proof by example," you would have confidently published a false theorem.
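The run of primes and its breakdown are easy to reproduce. The following is a minimal sketch using plain trial division; the function names `is_prime` and `p` are chosen here for illustration, and the search bound of 100 is arbitrary.

```python
def is_prime(m: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def p(n: int) -> int:
    return n * n - n + 41


# Every value for n = 1..40 is prime.
print(all(is_prime(p(n)) for n in range(1, 41)))  # True

# The first failure among the positive integers.
first_failure = next(n for n in range(1, 100) if not is_prime(p(n)))
print(first_failure, p(first_failure))  # 41 1681
```

A check that stops at 40 reports nothing but successes; the same check run one step further reports a disproof.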

Larger traps lurk at astronomical n

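Some false universal claims pass every test a computer can run. A well-known example from the distribution of primes is the inequality \pi(x) < \mathrm{li}(x), where \pi(x) counts the primes up to x and \mathrm{li}(x) is the logarithmic integral. It holds at every value that has ever been computed directly, yet Littlewood proved in 1914 that it fails for some x. The first failure has never been located; it is known only to lie somewhere below roughly 10^{316}, far beyond the reach of direct checking. The counterexample is known to exist from deeper arguments, not from a search.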

No human, and no computer, can check 10^{100} cases. Even if you could, a universal statement about \mathbb{N} requires you to go further still. This is the structural reason the proof-by-examples approach is abandoned the moment you leave primary-school arithmetic: the universe of inputs is too large for case-by-case checking, and the first counterexample can hide arbitrarily far away.
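A rough back-of-envelope estimate gives a feel for the scale. The rate of 10^{18} checks per second is an assumption chosen to be generous, and the age of the universe is rounded to about 4.4 \times 10^{17} seconds:

```latex
\frac{10^{100}\ \text{cases}}{10^{18}\ \text{cases per second}} = 10^{82}\ \text{seconds},
\qquad
\frac{10^{82}\ \text{s}}{4.4 \times 10^{17}\ \text{s}} \approx 2 \times 10^{64}\ \text{ages of the universe}.
```

Even under that assumption, the finite check alone is hopeless, and it would still leave infinitely many cases untouched.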

What a proof does that examples cannot

A proof replaces checking one input at a time with one uniform argument that handles all inputs at once. For the original claim, here is the full direct proof:

"Let n be any positive integer. Write n^2 - n = n(n - 1). The numbers n and n - 1 are consecutive integers, so exactly one of them is even. Therefore their product n(n - 1) is even. Hence n^2 - n is even for every positive integer n. \square"

Look at what the proof does: it takes an arbitrary n (a placeholder, not a specific number) and uses a general property (consecutive integers alternate parity) that applies uniformly to every input. Because the argument never depends on a specific value of n, it simultaneously covers n = 1, n = 2, n = 41, n = 10^{100}, and every other positive integer. One paragraph, infinitely many cases.
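If the step "one of two consecutive integers is even" feels too quick, it can be unpacked into two cases. This is one possible way to write it out, with k an integer introduced for illustration:

```latex
\begin{align*}
\text{Case 1, } n = 2k:\quad     & n(n-1) = 2k\,(2k-1) = 2\,\bigl[k(2k-1)\bigr], \\
\text{Case 2, } n = 2k+1:\quad   & n(n-1) = (2k+1)\,(2k) = 2\,\bigl[k(2k+1)\bigr].
\end{align*}
```

Every positive integer falls into exactly one of the two cases, and in both the product is 2 times an integer. Two cases, each argued symbolically, cover all of \mathbb{N}.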

This is why proofs exist. Examples cover a finite sample; proofs cover the whole quantifier.

When are examples useful, then?

Examples are not worthless. They are useful for three specific jobs — none of which is "proving the claim":

Building intuition. Trying n = 1, 2, 3 helps you see why the claim is true. The examples give you something to point at while you search for the general argument.
Finding counterexamples. If the claim turns out to be false, a single example is all you need to refute it. Checking n = 1, 2, 3, \ldots is exactly how you might discover that a conjecture fails at some small n. For falsification, examples are decisive; for verification, they are not.
Pattern-spotting toward a proof. The first few examples often reveal the algebraic form of the eventual proof. When you see that n^2 - n = n(n-1) gives 0, 2, 6, 12, 20, 30, \ldots, you notice every result is even and the product structure hints at the "one of two consecutive integers is even" argument. Examples feed the scratch work that writes the proof.

What examples never do is replace the proof. They inform it and test it; they cannot certify it.
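The pattern-spotting step can be done on scratch paper or with a few lines of code. The sketch below tabulates n, n - 1, and n^2 - n for a handful of inputs and marks which factor is even; the range 1..8 and the name `rows` are arbitrary choices for illustration.

```python
# Tabulate n, n - 1, and n^2 - n, marking which of the two factors is even.
rows = []
for n in range(1, 9):
    value = n * n - n
    even_factor = "n" if n % 2 == 0 else "n - 1"
    rows.append((n, n - 1, value, even_factor))
    print(f"n={n}  n-1={n - 1}  n^2-n={value:2d}  even factor: {even_factor}")
```

The table suggests that the even factor alternates between n - 1 and n. That observation is a hint toward the consecutive-integers argument, not a substitute for it.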

Where the temptation comes from

In science, twenty successful experiments can be strong evidence. A drug works on twenty patients; we record the effect and publish. Science accepts inductive evidence because empirical claims cannot be settled by deduction alone, so its conclusions are held with some degree of confidence and stay open to revision. The twenty successes matter because a well-chosen sample can stand in for a finite population.

Mathematics is different. The domain is usually infinite — all integers, all real numbers, all continuous functions, all sets. No finite sample is a "reasonable fraction" of infinity. So mathematics demands deductive arguments that apply to every case simultaneously, rather than inductive extrapolation from observed cases. A student arriving from science class to maths class often imports the inductive habit and is surprised when it is rejected.

The reflex to install

When the claim contains "for all," "every," "each," "whenever," or an implicit universal quantifier (as in "n^2 - n is always even"), finite example-checking is not a proof. Your job is:

Compute a few examples on scratch to convince yourself the claim is probably true.
Look at the examples for a pattern that suggests a general argument.
Write a proof that works on a symbolic n, not on a list of specific values.

The examples fuel the proof. The proof is what earns the claim. Twenty examples is a great starting point and a terrible ending point — the work is not done until the proof handles the infinite remainder.

When one example is enough: existence claims

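There is one kind of claim where a single example finishes the job: an existence claim. "There exists an even integer." Answer: 4. Done. For \exists, one witness suffices. For \forall over an infinite domain, no finite number of witnesses suffices. (Over a finite domain, checking every case is a legitimate proof by exhaustion, because nothing is left untested.)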
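A witness search shows the contrast in code. This is a sketch, with `first_witness` a hypothetical helper: it returns as soon as one witness is found, and that single value settles the existence claim. It would loop forever if no witness existed, so it should only be run on claims you expect to be true.

```python
from itertools import count
from math import isqrt


def first_witness(pred, start=1):
    """Return the smallest n >= start with pred(n) True.

    Hypothetical helper for illustration; does not terminate if no
    witness exists.
    """
    return next(n for n in count(start) if pred(n))


# "There exists an even positive integer": one witness and the claim is proved.
print(first_witness(lambda n: n % 2 == 0))  # 2


# "There exists a positive n for which n^2 - n + 41 is a perfect square."
def value_is_square(n: int) -> bool:
    v = n * n - n + 41
    return isqrt(v) ** 2 == v


print(first_witness(value_is_square))  # 41
```

Any witness will do; the search happens to return the smallest one. The second search also illustrates that a counterexample to a universal claim is a witness for the corresponding existence claim: a perfect square greater than 1 is never prime.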

Always read the quantifier before deciding how much example-checking matters.