Why Does Cramer's Rule Work — What's the Determinant Doing Geometrically?

In short

The determinant of a 2 \times 2 matrix \begin{pmatrix} a & b \\ c & d \end{pmatrix} is the signed area of the parallelogram whose two sides are the column vectors (a, c) and (b, d). That area equals ad - bc, with sign telling you whether the second column sits counter-clockwise (positive) or clockwise (negative) from the first. Cramer's rule then says: solving \{ax + by = e,\; cx + dy = f\} is the same as asking, "by what scale factors x and y must I stretch the two column vectors so their sum lands at (e, f)?" The answer falls straight out as area ratios — x = D_x / D is the factor for the first column, y = D_y / D for the second — because replacing column 1 with (e, f) produces a parallelogram whose area is exactly x times the original.

You have already met Cramer's rule as a recipe: three little determinant grids, divide, done. That sibling article shows the mechanics. This one answers the harder question: why does the recipe work at all? Why on earth does replacing a column with the constants (e, f) and dividing produce the answer?

The answer is geometric, and once you see it, Cramer's rule stops being a formula to memorise and becomes something obvious — like saying "if I double the side of a square, the area quadruples." That kind of obvious. CBSE Class 12 (Matrices and Determinants) and JEE Mains both lean on Cramer's rule, and JEE Advanced loves to test the geometric meaning of D = 0. Knowing why is the difference between answering one question and answering five.

1. The determinant is the area of a parallelogram

Pick two vectors in the plane:

\vec{u} = (a, c), \qquad \vec{v} = (b, d).

These are the two columns of the coefficient matrix \begin{pmatrix} a & b \\ c & d \end{pmatrix}. Plant both vectors at the origin. They span a parallelogram — the four corners are (0, 0), \vec{u}, \vec{v}, and \vec{u} + \vec{v}.

The area of that parallelogram is

\text{Area} = |ad - bc|.

The signed version, ad - bc (no absolute value), is what you call the determinant. The sign just records orientation: positive if you sweep counter-clockwise from \vec{u} to \vec{v} (like turning a steering wheel left), negative if clockwise (turning right). When \vec{u} and \vec{v} point along the same line, the parallelogram is squashed flat — area zero — and the determinant is zero.

Figure. Left: the parallelogram spanned by the two columns of the coefficient matrix. Its area is the determinant $D$. Right: replace the first column with the constants vector $(e, f)$; the new parallelogram has area $D_x$. Cramer's rule says the ratio $D_x / D$ is exactly $x$, the factor applied to the first column in the combination that lands on $(e, f)$. Areas in the diagram are schematic; the *ratio* is what matters.

You can derive the area-equals-ad - bc formula by chopping the bounding rectangle and subtracting the pieces outside the parallelogram (four right triangles and two small corner rectangles), or by using the cross-product formula |\vec{u} \times \vec{v}| from physics. Both give the same answer. For now, just trust it: \text{area} = ad - bc, with sign.
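
If you would rather check than trust, here is a minimal Python sketch that compares ad - bc with the area obtained by walking around the four corners (the shoelace formula, a standard polygon-area method that the article itself does not use). The function names det2 and shoelace_area and the sample vectors are assumptions chosen for illustration.

```python
def det2(u, v):
    """Signed area of the parallelogram with sides u = (a, c) and v = (b, d)."""
    a, c = u
    b, d = v
    return a * d - b * c


def shoelace_area(points):
    """Signed area of a polygon whose vertices are listed in order."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2


# Sample columns (illustrative): u = (2, 1), v = (1, 3).
u, v = (2, 1), (1, 3)
corners = [(0, 0), u, (u[0] + v[0], u[1] + v[1]), v]

print(det2(u, v))              # 5
print(shoelace_area(corners))  # 5.0
print(det2(v, u))              # -5: swapping the columns flips the orientation
```

The two numbers agree, and swapping the order of the columns changes only the sign, which is the orientation bookkeeping described above.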

2. What is the system \{ax + by = e,\; cx + dy = f\} really asking?

The standard reading of the system is "find (x, y) that satisfies both equations." That is the row picture — each equation is a line, and you want the intersection.

But there is a second reading — the column picture — and this is where Cramer's rule lives. Rewrite the system as a single vector equation:

x \begin{pmatrix} a \\ c \end{pmatrix} + y \begin{pmatrix} b \\ d \end{pmatrix} = \begin{pmatrix} e \\ f \end{pmatrix}.

Read it out loud: "x times the first column vector \vec{u} = (a, c) plus y times the second column vector \vec{v} = (b, d) equals the constants vector (e, f)."

So the question is: which combination of stretches (x, y) adds these two arrows tip-to-tail to land at (e, f)? It is the same question a sailor asks when they want to reach a port using two known wind directions — how much of each wind do they ride?

If \vec{u} and \vec{v} are not parallel, there is exactly one such combination: two non-parallel arrows span the whole plane, and every target decomposes along them in exactly one way. If they are parallel (so D = 0), every linear combination lands on the single line through the origin that contains both columns; you either hit (e, f) in infinitely many ways (if (e, f) lies on that line) or never (if it doesn't). That is the same trichotomy you saw in Systems of Linear Equations, now from the column-picture angle.
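
The column picture is the same system, just bracketed differently. A small Python sketch makes that concrete; the helper names row_picture and column_picture and the sample numbers are hypothetical, chosen only for illustration.

```python
def row_picture(a, b, c, d, x, y):
    """Left-hand sides of the two equations, evaluated one row at a time."""
    return (a * x + b * y, c * x + d * y)


def column_picture(a, b, c, d, x, y):
    """x times the first column plus y times the second column."""
    u = (a, c)
    v = (b, d)
    return (x * u[0] + y * v[0], x * u[1] + y * v[1])


# Any coefficients and any trial (x, y) give the same point both ways.
coeffs = (2, 1, 1, 3)   # a, b, c, d (illustrative values)
trial = (4, -1)         # x, y (illustrative values)

print(row_picture(*coeffs, *trial))     # (7, 1)
print(column_picture(*coeffs, *trial))  # (7, 1)
```

Solving the system means picking the trial (x, y) for which this common point equals (e, f).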

3. Cramer's rule as area ratios

Here is the punchline. The original parallelogram has area D — sides \vec{u} and \vec{v}. Now imagine you replace the first side \vec{u} with (e, f) — but (e, f) = x\vec{u} + y\vec{v}, by the column-picture equation. The new parallelogram has sides (x\vec{u} + y\vec{v}) and \vec{v}.

What is its area? Two facts about parallelogram areas — both visible from a sketch:

(1) Adding a multiple of one side to the other does not change the area. (Slide one corner along the opposite side; the base and height are unchanged.)
(2) Scaling one side by k scales the area by k.

So the parallelogram with sides (x\vec{u} + y\vec{v}) and \vec{v} has the same area as the parallelogram with sides x\vec{u} and \vec{v} — the y\vec{v} part adds a multiple of the second side to the first, which by (1) leaves the area alone. And by (2), scaling \vec{u} by x scales the area by x. Therefore

D_x = x \cdot D.

Divide both sides by D (which is nonzero, by assumption) and you get

\boxed{x = \frac{D_x}{D}.}

Why this is the whole rule: D_x is literally the area of the parallelogram you get by replacing column 1 with the target vector (e, f). Because the target vector decomposes as x\vec{u} + y\vec{v} and the y\vec{v} part contributes nothing to the area, you are left with a parallelogram whose first side is x\vec{u} — and its area is exactly x times the original area D.

The same argument with the second column replaced gives D_y = y \cdot D, hence y = D_y / D. That is Cramer's rule, fully derived from two geometric facts: sliding one side along the other leaves the area of a parallelogram unchanged, and scaling a side scales the signed area by the same factor.
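
The area argument can be cross-checked with plain algebra. The block below is a short verification, not a separate proof strategy: it substitutes e = ax + by and f = cx + dy (the two equations of the system) into the expanded determinants D_x = ed - bf and D_y = af - ec.

```latex
\begin{aligned}
D_x &= ed - bf \\
    &= (ax + by)\,d - b\,(cx + dy) \\
    &= adx + bdy - bcx - bdy \\
    &= x\,(ad - bc) = x \cdot D, \\[4pt]
D_y &= af - ec \\
    &= a\,(cx + dy) - (ax + by)\,c \\
    &= acx + ady - acx - bcy \\
    &= y\,(ad - bc) = y \cdot D.
\end{aligned}
```

The cancelled terms (bdy in the first computation, acx in the second) are the algebraic shadow of the sliding step: the part of the target that points along the untouched column contributes no area.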

Three worked examples

Example 1 — see $D$ as area for $\{2x + y = 5,\; x + 3y = 8\}$

The two columns of the coefficient matrix are \vec{u} = (2, 1) and \vec{v} = (1, 3). Plant both at the origin and span the parallelogram.

Compute the determinant:

D = ad - bc = (2)(3) - (1)(1) = 6 - 1 = 5.

So the parallelogram with corners (0, 0), (2, 1), (3, 4), (1, 3) has area 5. You can check this against the 3 \times 4 bounding box by subtracting everything outside the parallelogram: two right triangles with legs 2 and 1, two right triangles with legs 1 and 3, and two 1 \times 1 squares in the bottom-right and top-left corners of the box:

\text{box} - 4 \text{ triangles} - 2 \text{ squares} = 12 - 2 \cdot \tfrac12 (2)(1) - 2 \cdot \tfrac12 (1)(3) - 2 \cdot (1)(1) = 12 - 2 - 3 - 2 = 5.

Forget the two corner squares and you get 7 instead of 5, which is the usual slip with this method. The formula and the geometry agree on what "area" means, with the determinant doing the bookkeeping for you.

Why the determinant is faster than counting: the box-and-triangles bookkeeping changes with the shape, because which pieces sit outside the parallelogram depends on where the two columns point, and it is easy to miss a piece. The formula |ad - bc| works for any parallelogram, regardless of orientation. That is what makes determinants the universal "area meter" of linear algebra.

Example 2 — Cramer for the same system

Continuing with 2x + y = 5 and x + 3y = 8 — so a = 2, b = 1, c = 1, d = 3, e = 5, f = 8.

You have D = 5 from above. Now compute the two swapped determinants:

D_x = ed - bf = (5)(3) - (1)(8) = 15 - 8 = 7.

D_y = af - ec = (2)(8) - (5)(1) = 16 - 5 = 11.

x = \frac{D_x}{D} = \frac{7}{5}, \qquad y = \frac{D_y}{D} = \frac{11}{5}.

Check. 2(\tfrac75) + \tfrac{11}{5} = \tfrac{14 + 11}{5} = \tfrac{25}{5} = 5 ✓. And \tfrac75 + 3(\tfrac{11}{5}) = \tfrac{7 + 33}{5} = \tfrac{40}{5} = 8 ✓.

The geometric reading. x = 7/5 says: scale the first column (2, 1) by 7/5 to get (14/5, 7/5). y = 11/5 says: scale the second column (1, 3) by 11/5 to get (11/5, 33/5). Add them: (14/5 + 11/5,\; 7/5 + 33/5) = (25/5, 40/5) = (5, 8) — exactly the constants vector. The two scaled columns added tip-to-tail land precisely at (e, f), just as the column picture demanded.
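
The whole example fits in a few lines of Python. This is a minimal sketch, assuming integer coefficients and using exact fractions; the function name cramer_2x2 is hypothetical, and the variable names follow the article's a, b, c, d, e, f.

```python
from fractions import Fraction


def cramer_2x2(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f by Cramer's rule (integer inputs assumed)."""
    D = a * d - b * c
    if D == 0:
        raise ValueError("D = 0: the columns are parallel, so the rule does not apply")
    Dx = e * d - b * f   # first column replaced by (e, f)
    Dy = a * f - e * c   # second column replaced by (e, f)
    return Fraction(Dx, D), Fraction(Dy, D)


x, y = cramer_2x2(2, 1, 1, 3, 5, 8)
print(x, y)  # 7/5 11/5

# Column-picture check: x * (2, 1) + y * (1, 3) should land on (5, 8).
print(x * 2 + y * 1, x * 1 + y * 3)  # 5 8
```

Exact fractions are used so the tip-to-tail check lands on (5, 8) exactly instead of on a rounded floating-point neighbour.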

Example 3 — when $D = 0$, the geometry collapses

Take

x + 2y = 4, \qquad 2x + 4y = 9.

The columns are \vec{u} = (1, 2) and \vec{v} = (2, 4) = 2\vec{u}. The two columns point along the same line, so the parallelogram they span has zero area:

D = (1)(4) - (2)(2) = 0.

Geometrically, every linear combination x\vec{u} + y\vec{v} lands somewhere on the line through the origin with direction (1, 2). The constants vector (4, 9): does it lie on that line? Every point on the line has a second coordinate equal to twice its first, so a point with first coordinate 4 would have to be (4, 8), not (4, 9). The target sits off the line, so no combination of the two columns can reach it. No solution.

If instead the constants had been (4, 8) — on the line — then every combination (x, y) with x + 2y = 4 would land on it. Infinitely many solutions.

Cramer's rule divides by D = 0 in both cases, so it cannot return a solution in either. The numerators do differ: for (4, 9), D_x = (4)(4) - (2)(9) = -2 and D_y = (1)(9) - (4)(2) = 1, a nonzero number over zero; for (4, 8), D_x = D_y = 0, giving 0/0. The column picture explains why the rule breaks: when the two arrows are parallel, they cover only a one-dimensional subset of the plane, not the whole plane, so most targets are unreachable and the few that are reachable have infinitely many ways to be reached.

Why this matches the row picture: parallel column vectors (1, 2) and (2, 4) correspond to proportional row coefficients — the two equations describe parallel (or coincident) lines. The column picture's "arrows lie on a single line" and the row picture's "lines are parallel" are the same condition D = 0 seen from opposite angles. JEE Advanced loves to ask which case you are in given a parameter — knowing both pictures is what makes those questions easy.
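
The case analysis in this example can be sketched in Python. This is an illustration under stated assumptions, not a standard library routine: the function name classify_system is hypothetical and inputs are assumed to be exact (integers or fractions). It applies the same signed-area test to the target as to the columns: a target is parallel to a column exactly when the parallelogram they span has zero area.

```python
def classify_system(a, b, c, d, e, f):
    """Classify ax + by = e, cx + dy = f as 'unique', 'none' or 'infinite'."""
    u, v, target = (a, c), (b, d), (e, f)

    def area(p, q):
        # Signed area of the parallelogram with sides p and q.
        return p[0] * q[1] - q[0] * p[1]

    if area(u, v) != 0:
        return "unique"        # columns span the plane
    if u == (0, 0) and v == (0, 0):
        # Degenerate case: both columns are zero, only (0, 0) is reachable.
        return "infinite" if target == (0, 0) else "none"
    # Columns lie on one line; is the target on that line too?
    on_line = area(target, v) == 0 and area(u, target) == 0
    return "infinite" if on_line else "none"


print(classify_system(1, 2, 2, 4, 4, 9))  # none
print(classify_system(1, 2, 2, 4, 4, 8))  # infinite
print(classify_system(2, 1, 1, 3, 5, 8))  # unique
```

The two areas computed in the parallel case are the same quantities as D_x and D_y, so this is the column picture and the determinant bookkeeping saying the same thing.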

Why Cramer's rule is more than a memorised recipe

Most students meet Cramer's rule as a sequence of moves: write the matrix, compute three determinants, divide. That works for exam questions but leaves the rule looking like a magic trick. The geometric reading replaces magic with a single sentence: the determinant measures area, and the area scales linearly with each side, so the area-after-replacement gives you the scale factor directly. The same one-sentence intuition extends to 3 \times 3 systems (volume of a parallelepiped, scaled by the corresponding x, y, z) and beyond.
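
To see the 3 \times 3 version in action, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names det3 and cramer_3x3, the cofactor expansion used for the determinant, and the sample system, which does not come from the article. Each unknown is the ratio of two signed volumes, exactly as each unknown above was a ratio of two signed areas.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix (signed volume spanned by its columns)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer_3x3(m, rhs):
    """Solve m @ (x, y, z) = rhs by Cramer's rule (integer inputs assumed)."""
    D = det3(m)
    if D == 0:
        raise ValueError("D = 0: the columns do not span space")
    solution = []
    for col in range(3):
        # Replace one column with the constants and take the volume ratio.
        replaced = [
            [rhs[r] if k == col else m[r][k] for k in range(3)]
            for r in range(3)
        ]
        solution.append(Fraction(det3(replaced), D))
    return tuple(solution)


# Sample system (illustrative):
#   x +  y +  z =  6
#       2y + 5z = -4
#  2x + 5y -  z = 27
matrix = [[1, 1, 1], [0, 2, 5], [2, 5, -1]]
constants = [6, -4, 27]
print(*cramer_3x3(matrix, constants))  # 5 3 -2
```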

This is also why D = 0 is not just "the formula breaks down" — it is the geometric statement that the column vectors fail to span the plane. The two arrows lie along a line, and most points in the plane are unreachable. The rule fails because the geometry fails, not the other way around.
