Inclusion-Exclusion Principle

In short

The inclusion-exclusion principle computes the size of a union of sets. For two sets: |A \cup B| = |A| + |B| - |A \cap B|. For three sets, add back the triple intersection: |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|. The general formula alternates signs through all possible intersections, ensuring every element is counted exactly once.

In a class of 40 students, 25 play cricket and 18 play football. The sports teacher asks: how many students play at least one of the two sports? If you add 25 + 18 = 43, you get a number larger than the class size. Something has been counted twice — the students who play both cricket and football. If 10 students play both, the correct count is 25 + 18 - 10 = 33.

This "add, then subtract the overlap" logic is the inclusion-exclusion principle for two sets. The principle scales to any number of sets, and it is one of the most powerful tools in counting arguments — especially when direct counting leads to overcounting.

Two sets: the simplest case

Take two finite sets A and B. Elements of A \cup B fall into three mutually exclusive regions: those in A only, those in B only, and those in both.

When you compute |A| + |B|, every element in A \cap B is counted twice — once as a member of A and once as a member of B. Subtracting |A \cap B| removes the extra copy:

|A \cup B| = |A| + |B| - |A \cap B|

Two overlapping sets. Adding $|A|$ and $|B|$ double-counts the intersection $A \cap B$. Subtracting $|A \cap B|$ once fixes the count: $|A \cup B| = |A| + |B| - |A \cap B|$.

Proof for two sets

Partition A \cup B into three disjoint parts:

A \setminus B (elements in A but not B)
B \setminus A (elements in B but not A)
A \cap B (elements in both)

Then |A \cup B| = |A \setminus B| + |B \setminus A| + |A \cap B|.

Since |A| = |A \setminus B| + |A \cap B|, it follows that |A \setminus B| = |A| - |A \cap B|. Similarly, |B \setminus A| = |B| - |A \cap B|. Substituting:

|A \cup B| = (|A| - |A \cap B|) + (|B| - |A \cap B|) + |A \cap B| = |A| + |B| - |A \cap B|

The proof is complete. Every element of A \cup B is counted exactly once on the right-hand side.

Three sets

For three sets A, B, C, a similar overcounting argument applies, but with one more correction layer.

Start with |A| + |B| + |C|. An element in exactly two of the three sets is counted twice, and an element in all three sets is counted three times. Subtract all pairwise intersections: |A \cap B| + |A \cap C| + |B \cap C|. Now an element in exactly two sets is counted 2 - 1 = 1 time (correct), but an element in all three sets is counted 3 - 3 = 0 times (gone). Add back the triple intersection |A \cap B \cap C|:

|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|

Three sets create seven regions. Adding $|A| + |B| + |C|$ overcounts the pairwise intersections and triple-counts the centre. Subtracting the three pairwise intersections eliminates the triple-counted centre entirely, so it must be added back once.

Tracking the count for an element in all three sets

Consider an element x \in A \cap B \cap C. Here is how it gets counted at each stage:

Operation	Contribution to x's count	Running total
$+	A	$
$+	B	$
$+	C	$
$-	A \cap B	$
$-	A \cap C	$
$-	B \cap C	$
$+	A \cap B \cap C	$

The final count is 1 — exactly right. The same tracking works for elements in exactly one set (count: 1 - 0 + 0 = 1) and elements in exactly two sets (count: 2 - 1 + 0 = 1). Every element of the union is counted exactly once.

Step:

An element that belongs to all three sets starts at count $0$. Each $+|A|$, $+|B|$, $+|C|$ raises it by $1$. Each pairwise subtraction lowers it. The triple intersection addition brings it back to exactly $1$.

The general formula

For n finite sets A_1, A_2, \ldots, A_n, the inclusion-exclusion principle states:

Inclusion-Exclusion Principle

\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i} |A_i| - \sum_{i < j} |A_i \cap A_j| + \sum_{i < j < k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap A_2 \cap \cdots \cap A_n|

More compactly:

\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{\emptyset \neq S \subseteq \{1,\ldots,n\}} (-1)^{|S|+1} \left|\bigcap_{i \in S} A_i\right|

The pattern: add all single sets, subtract all pairwise intersections, add all triple intersections, subtract all quadruple intersections, and so on. The signs alternate, starting with +.

Why it works. Take any element x that belongs to exactly m of the n sets (with m \geq 1). Its contribution to the right-hand side is:

\binom{m}{1} - \binom{m}{2} + \binom{m}{3} - \cdots + (-1)^{m+1}\binom{m}{m}

By the binomial theorem, \sum_{r=0}^{m}(-1)^r\binom{m}{r} = (1-1)^m = 0. Removing the r = 0 term (\binom{m}{0} = 1) gives:

\sum_{r=1}^{m}(-1)^r\binom{m}{r} = -1

Multiplying both sides by -1:

\sum_{r=1}^{m}(-1)^{r+1}\binom{m}{r} = 1

Every element belonging to at least one set contributes exactly 1 to the total. Elements in none of the sets contribute 0. The formula counts every element of the union exactly once.

Applications in counting

Counting elements NOT in any set (the complement form)

If the sets A_1, \ldots, A_n are subsets of a universal set U, then the number of elements in U that belong to none of the A_i is:

|U| - \left|\bigcup_{i=1}^n A_i\right|

This complement form is often the more natural way to solve a problem: count the "bad" cases using inclusion-exclusion, then subtract from the total.

Counting integers not divisible by given primes

How many integers from 1 to 100 are divisible by neither 2 nor 3 nor 5?

Let A_2 = multiples of 2 up to 100, A_3 = multiples of 3, A_5 = multiples of 5.

|A_2| = \lfloor 100/2 \rfloor = 50
|A_3| = \lfloor 100/3 \rfloor = 33
|A_5| = \lfloor 100/5 \rfloor = 20
|A_2 \cap A_3| = \lfloor 100/6 \rfloor = 16
|A_2 \cap A_5| = \lfloor 100/10 \rfloor = 10
|A_3 \cap A_5| = \lfloor 100/15 \rfloor = 6
|A_2 \cap A_3 \cap A_5| = \lfloor 100/30 \rfloor = 3

By inclusion-exclusion: |A_2 \cup A_3 \cup A_5| = 50 + 33 + 20 - 16 - 10 - 6 + 3 = 74.

Integers divisible by none of 2, 3, 5: 100 - 74 = 26.

This technique connects to the Euler totient function in number theory: \phi(n) counts integers from 1 to n that are coprime to n, and the formula for \phi(n) is derived using inclusion-exclusion on the prime factors of n.

Multiples of $2$, $3$, and $5$ among the integers $1$ to $100$. Each circle shows the total count of multiples; overlaps show the shared multiples. Inclusion-exclusion gives $74$ integers divisible by at least one of $2, 3, 5$, leaving $26$ coprime to $30$.

Two worked examples

Example 1: In a survey of $200$ households, $120$ have a TV, $80$ have a computer, and $50$ have both. How many households have at least one of the two? How many have neither?

Step 1. Define the sets.

Let T = set of households with a TV, C = set of households with a computer.

|T| = 120, |C| = 80, |T \cap C| = 50.

Why: the problem gives us the sizes of two sets and their intersection — exactly the data needed for the two-set inclusion-exclusion formula.

Step 2. Apply the two-set formula.

|T \cup C| = |T| + |C| - |T \cap C| = 120 + 80 - 50 = 150

Why: the 50 households with both devices are in T and in C, so adding |T| + |C| counts them twice. Subtracting 50 corrects this.

Step 3. Compute the complement.

\text{Neither} = 200 - |T \cup C| = 200 - 150 = 50

Why: the universal set has 200 households. Those not in T \cup C have neither a TV nor a computer.

Step 4. Sanity check by accounting for all four regions.

TV only: 120 - 50 = 70
Computer only: 80 - 50 = 30
Both: 50
Neither: 50
Total: 70 + 30 + 50 + 50 = 200. This matches the total number of households.

Result. 150 households have at least one device; 50 have neither.

The four disjoint regions account for every household: $70$ (TV only) + $50$ (both) + $30$ (computer only) + $50$ (neither) = $200$. The inclusion-exclusion formula gives the union $T \cup C = 150$ in one step.

The Venn diagram splits the 200 households into four non-overlapping groups. The numbers in all four regions add up to 200, confirming the calculation.

Example 2: How many integers from $1$ to $1{,}000$ are divisible by $3$, $5$, or $7$?

Step 1. Define the sets.

Let A = multiples of 3 in \{1, \ldots, 1000\}, B = multiples of 5, C = multiples of 7.

Why: "divisible by 3, 5, or 7" means belonging to A \cup B \cup C. This calls for the three-set inclusion-exclusion formula.

Step 2. Compute the individual set sizes.

|A| = \lfloor 1000/3 \rfloor = 333, \quad |B| = \lfloor 1000/5 \rfloor = 200, \quad |C| = \lfloor 1000/7 \rfloor = 142.

Why: the number of multiples of d from 1 to N is \lfloor N/d \rfloor (the greatest integer function). For d = 3: 1000/3 = 333.33\ldots, so there are 333 multiples.

Step 3. Compute the pairwise intersections.

Since 3, 5, 7 are pairwise coprime, A \cap B = multiples of \text{lcm}(3,5) = 15, and so on.

Why: a number divisible by both 3 and 5 is divisible by \text{lcm}(3,5) = 15. Since 3 and 5 are coprime, the lcm is their product.

Step 4. Compute the triple intersection and apply the formula.

|A \cap B \cap C| = \lfloor 1000/105 \rfloor = 9

|A \cup B \cup C| = 333 + 200 + 142 - 66 - 47 - 28 + 9 = 543

Why: the three-set inclusion-exclusion formula with alternating signs: add singles (675), subtract pairs (141), add the triple (9), giving 675 - 141 + 9 = 543.

Result. 543 integers from 1 to 1{,}000 are divisible by 3, 5, or 7.

The three-set inclusion-exclusion applied to divisibility. Adding singles ($675$), subtracting pairs ($141$), adding the triple ($9$) gives $543$ integers divisible by at least one of $3$, $5$, $7$. The remaining $457$ are coprime to $105 = 3 \times 5 \times 7$.

The final line of the diagram hints at a deeper connection: the fraction of integers coprime to 105 in any long interval approaches \phi(105)/105 = 48/105, and 457/1000 \approx 0.457 is close to 48/105 \approx 0.457 — the Euler totient function in action.

Common confusions

"Inclusion-exclusion only works for two or three sets." The principle works for any finite number of sets. For n sets, the formula has 2^n - 1 terms (one for each non-empty subset). In practice, most exam problems involve two or three sets.
"You always add the pairwise intersections." The pairwise intersections are always subtracted (they carry a - sign). The signs alternate: + for singles, - for pairs, + for triples, - for quadruples, and so on. The sign of a term involving |S| sets is (-1)^{|S|+1}.
"If two sets are disjoint, inclusion-exclusion is not needed." When A \cap B = \emptyset, the formula gives |A \cup B| = |A| + |B| - 0 = |A| + |B|. It still works — the principle does not require the sets to overlap.
"I can use inclusion-exclusion only with set sizes." The principle applies to any additive quantity over sets — probabilities, for instance. For events A and B: P(A \cup B) = P(A) + P(B) - P(A \cap B). The structure is identical.
"The formula for three sets has seven terms." It has seven terms if you count each |A_i|, |A_i \cap A_j|, and |A_1 \cap A_2 \cap A_3| separately: 3 + 3 + 1 = 7. More generally, for n sets, the number of terms is \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n - 1.

Going deeper

If you can apply the two-set and three-set formulas, set up divisibility problems, and use the complement form, you are ready for the main applications: derangements, surjection counting, and the Euler totient function. The rest of this section covers the general proof and an important variant.

Proof of the general formula by induction

Base case. For n = 1: |A_1| = |A_1|. True.

Inductive step. Assume the formula holds for n - 1 sets. Write \bigcup_{i=1}^n A_i = \left(\bigcup_{i=1}^{n-1} A_i\right) \cup A_n. Apply the two-set formula:

\left|\bigcup_{i=1}^n A_i\right| = \left|\bigcup_{i=1}^{n-1} A_i\right| + |A_n| - \left|\left(\bigcup_{i=1}^{n-1} A_i\right) \cap A_n\right|

The intersection distributes: \left(\bigcup_{i=1}^{n-1} A_i\right) \cap A_n = \bigcup_{i=1}^{n-1}(A_i \cap A_n). Apply the inductive hypothesis to both the first union (over n-1 sets) and the intersection union (also over n-1 sets, namely A_1 \cap A_n, \ldots, A_{n-1} \cap A_n). Collecting terms and tracking signs yields the formula for n sets.

The "at least / exactly" variants

Inclusion-exclusion also answers "how many elements belong to exactly m of the n sets?" Let S_k = \sum_{|T|=k}\left|\bigcap_{i \in T} A_i\right| (the sum of all k-fold intersection sizes). Then:

At least m sets: \sum_{k=m}^{n}(-1)^{k-m}\binom{k}{m}S_k
Exactly m sets: \sum_{k=m}^{n}(-1)^{k-m}\binom{k}{m}S_k

For m = 0 (the complement — elements in none of the sets), this gives: |U| - S_1 + S_2 - S_3 + \cdots + (-1)^n S_n, which is the familiar complement form.

These "exactly m" formulas appear in problems like "how many integers from 1 to 100 are divisible by exactly one of 2, 3, 5" — a natural extension of the divisibility examples above.

Where this leads next

Derangements — permutations with no fixed points, counted elegantly by applying inclusion-exclusion.
Set Operations — union, intersection, complement — the language behind every formula in this article.
Combinations — Basics — the \binom{n}{r} counts that appear in the general proof and the "exactly m" variants.
Fundamental Principle of Counting — the addition principle for disjoint cases, which inclusion-exclusion extends to overlapping cases.
Number Theory — Basics — the Euler totient function and Mobius inversion, both powered by inclusion-exclusion.