In short

A system of n linear equations in n unknowns can be written as AX = B, where A is the coefficient matrix, X is the column of unknowns, and B is the column of constants. If \det A \neq 0, the system has a unique solution given by X = A^{-1}B or by Cramer's rule. If \det A = 0, the system is either inconsistent (no solution) or has infinitely many solutions, depending on B.

A shopkeeper in Jaipur sells two types of namkeen — let's call them type P and type Q. On Monday, 3 packets of P and 2 packets of Q sell for a total of Rs 190. On Tuesday, 5 packets of P and 4 packets of Q sell for Rs 350. What is the price of each type?

Writing x for the price of P and y for the price of Q, you get two equations:

3x + 2y = 190
5x + 4y = 350

You already know how to solve this by substitution or elimination from earlier algebra. But here is a different way to look at it. Write the system as a single matrix equation:

\begin{bmatrix} 3 & 2 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 190 \\ 350 \end{bmatrix}

This is AX = B, where A = \begin{bmatrix} 3 & 2 \\ 5 & 4 \end{bmatrix}, X = \begin{bmatrix} x \\ y \end{bmatrix}, and B = \begin{bmatrix} 190 \\ 350 \end{bmatrix}.

Now \det A = 3 \cdot 4 - 2 \cdot 5 = 12 - 10 = 2 \neq 0. The matrix is invertible. So you can multiply both sides on the left by A^{-1}:

X = A^{-1} B = \frac{1}{2}\begin{bmatrix} 4 & -2 \\ -5 & 3 \end{bmatrix} \begin{bmatrix} 190 \\ 350 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 760 - 700 \\ -950 + 1050 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 60 \\ 100 \end{bmatrix} = \begin{bmatrix} 30 \\ 50 \end{bmatrix}

So x = 30 and y = 50. Each packet of type P costs Rs 30, each packet of type Q costs Rs 50.

The matrix form did the same work as substitution — but it concentrated all the structure into a single equation AX = B and a single condition \det A \neq 0. For two equations, the advantage is modest. For three or more equations, it is decisive.
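The shopkeeper system can be checked numerically. Here is a minimal sketch using NumPy (an illustration, not part of the original derivation); `np.linalg.solve` factors A rather than forming the inverse explicitly, but it answers the same question: find X with AX = B.

```python
import numpy as np

# Coefficient matrix and constant vector for the namkeen system.
A = np.array([[3.0, 2.0],
              [5.0, 4.0]])
B = np.array([190.0, 350.0])

# Solve AX = B directly (LU factorisation under the hood).
X = np.linalg.solve(A, B)
print(X)  # → [30. 50.]
```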

The matrix form: AX = B

Any system of n linear equations in n unknowns can be written in matrix form.

Matrix form of a linear system

The system

a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2
\vdots
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n

can be written as AX = B, where

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \quad X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad B = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}

A is the coefficient matrix, X is the unknown vector, and B is the constant vector.

If B is the zero vector (every b_i = 0), the system is called homogeneous. If at least one b_i \neq 0, it is called non-homogeneous. This distinction matters because the two types behave differently when \det A = 0.

Method 1: The inverse method

When \det A \neq 0, the matrix A is invertible, and the unique solution is

X = A^{-1} B

This is the most direct method. You compute A^{-1} (using the adjoint formula or row reduction, as described in the Inverse of Matrix article), then multiply it by B. The result is the column of unknowns.

The method is clean and conceptual, but it has a practical drawback: computing A^{-1} requires all the cofactors, and then you multiply the entire inverse by B. If you only need the solution and not the inverse itself, Cramer's rule is faster.
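For the 2 × 2 case, the adjoint formula used in the opening example can be written out directly. The sketch below (the helper name `inverse_2x2` is ours, not from the text) mirrors the formula A^{-1} = (1/\det A) [[d, -b], [-c, a]]:

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2x2 matrix by the adjoint formula."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    # Swap the diagonal, negate the off-diagonal, divide by det.
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3.0, 2.0], [5.0, 4.0]])
B = np.array([190.0, 350.0])
X = inverse_2x2(A) @ B  # X = A^{-1} B
print(X)  # → [30. 50.]
```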

Method 2: Cramer's rule

Cramer's rule gives each unknown directly as a ratio of two determinants, without computing the full inverse.

Derivation for 2 equations

Start with the 2 \times 2 system:

a_1 x + b_1 y = c_1
a_2 x + b_2 y = c_2

Multiply the first equation by b_2 and the second by b_1:

a_1 b_2 x + b_1 b_2 y = c_1 b_2
a_2 b_1 x + b_1 b_2 y = c_2 b_1

Subtract the second from the first:

(a_1 b_2 - a_2 b_1) x = c_1 b_2 - c_2 b_1

The coefficient of x on the left is \det A = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}.

The right side is \begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix} — the determinant you get by replacing the first column of A (the coefficients of x) with the constants c_1, c_2.

So:

x = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}

Similarly, multiplying the first equation by a_2 and the second by a_1, subtracting, and recognising determinants gives:

y = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}

The numerator for y is the determinant with the second column replaced by the constants. The pattern is clear: to find the k-th unknown, replace the k-th column of A with B and take the determinant. Divide by \det A.

The general rule

Cramer's rule

For the system AX = B with \det A \neq 0, the unique solution is

x_k = \frac{\det A_k}{\det A}, \qquad k = 1, 2, \ldots, n

where A_k is the matrix obtained by replacing the k-th column of A with the column B.

Each unknown has its own determinant in the numerator, and they all share the same determinant \det A in the denominator. The rule is elegant and direct — no inverse matrix to compute, no row reduction to perform. The cost is that you compute n+1 determinants (one for \det A and one for each unknown).
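The rule translates directly into a short routine: replace one column at a time and take the ratio of determinants. A teaching sketch (the function name `cramer` is ours; for large systems `np.linalg.solve` is far more efficient and numerically stable):

```python
import numpy as np

def cramer(A, B):
    """Solve AX = B by Cramer's rule; requires det A != 0."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    det_A = np.linalg.det(A)
    if abs(det_A) < 1e-12:
        raise ValueError("det A = 0: Cramer's rule does not apply")
    n = A.shape[0]
    X = np.empty(n)
    for k in range(n):
        A_k = A.copy()
        A_k[:, k] = B  # replace the k-th column with B
        X[k] = np.linalg.det(A_k) / det_A
    return X

# The system 2x + 3y = 8, 4x - y = 2 (Example 1 below):
print(cramer([[2, 3], [4, -1]], [8, 2]))  # → [1. 2.]
```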

Why it works in general

Here is a slick proof using the properties of determinants. Start from AX = B, which says that B is a linear combination of the columns of A:

x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n = \mathbf{b}

where \mathbf{a}_k is the k-th column of A. Now form the matrix A_k by replacing \mathbf{a}_k with \mathbf{b}:

A_k = [\mathbf{a}_1 \;\cdots\; \mathbf{a}_{k-1} \;\; \mathbf{b} \;\; \mathbf{a}_{k+1} \;\cdots\; \mathbf{a}_n]

Substitute \mathbf{b} = x_1 \mathbf{a}_1 + \cdots + x_n \mathbf{a}_n into the k-th column:

\det A_k = \det[\mathbf{a}_1 \;\cdots\; x_1 \mathbf{a}_1 + \cdots + x_n \mathbf{a}_n \;\cdots\; \mathbf{a}_n]

The determinant is multilinear in its columns, so this splits into n terms. But every term with j \neq k has two identical columns (\mathbf{a}_j appears both in position j and in the linear combination in position k), so those terms are zero. Only the term with j = k survives:

\det A_k = x_k \det[\mathbf{a}_1 \;\cdots\; \mathbf{a}_k \;\cdots\; \mathbf{a}_n] = x_k \det A

Divide by \det A: x_k = \frac{\det A_k}{\det A}.

Worked examples

Example 1: A 2 x 2 system using Cramer's rule

Solve the system:

2x + 3y = 8
4x - y = 2

Step 1. Compute \det A.

\det A = \begin{vmatrix} 2 & 3 \\ 4 & -1 \end{vmatrix} = 2(-1) - 3(4) = -2 - 12 = -14

Why: \det A \neq 0, so a unique solution exists. Cramer's rule applies.

Step 2. Compute \det A_1 (replace column 1 with B).

\det A_1 = \begin{vmatrix} 8 & 3 \\ 2 & -1 \end{vmatrix} = 8(-1) - 3(2) = -8 - 6 = -14

Why: the first column of A holds the coefficients of x. Replacing it with the constants [8, 2]^T gives the numerator for x.

Step 3. Compute \det A_2 (replace column 2 with B).

\det A_2 = \begin{vmatrix} 2 & 8 \\ 4 & 2 \end{vmatrix} = 2(2) - 8(4) = 4 - 32 = -28

Why: the second column holds the coefficients of y. Replacing it with [8, 2]^T gives the numerator for y.

Step 4. Apply Cramer's rule.

x = \frac{\det A_1}{\det A} = \frac{-14}{-14} = 1, \qquad y = \frac{\det A_2}{\det A} = \frac{-28}{-14} = 2

Why: each unknown is the ratio of its modified determinant to the original determinant. The negative signs cancel cleanly here.

Result: x = 1, y = 2.

The two equations represent two straight lines. The lines 2x + 3y = 8 and 4x - y = 2 intersect at exactly one point: (1, 2). A unique intersection is guaranteed because \det A \neq 0 — the two lines are not parallel.

The graph makes the algebra visible. Two non-parallel lines in a plane always meet at exactly one point. The determinant being nonzero is the algebraic way of saying "the lines are not parallel."

Example 2: A 3 x 3 system using Cramer's rule

Solve the system:

x + y + z = 6
x + 2y + 3z = 14
2x + y + z = 7

Step 1. Write the coefficient matrix and compute \det A.

A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 1 & 1 \end{bmatrix}

Expand along row 1: \det A = 1(2 - 3) - 1(1 - 6) + 1(1 - 4) = -1 + 5 - 3 = 1

Why: \det A = 1 \neq 0, so the system has a unique solution. The clean value 1 will make the arithmetic especially nice.

Step 2. Compute \det A_1 (replace column 1 with [6, 14, 7]^T).

\det A_1 = \begin{vmatrix} 6 & 1 & 1 \\ 14 & 2 & 3 \\ 7 & 1 & 1 \end{vmatrix} = 6(2-3) - 1(14-21) + 1(14-14) = -6 + 7 + 0 = 1

Why: each 3 \times 3 determinant is expanded the same way — along the first row, using cofactors.

Step 3. Compute \det A_2 (replace column 2 with [6, 14, 7]^T).

\det A_2 = \begin{vmatrix} 1 & 6 & 1 \\ 1 & 14 & 3 \\ 2 & 7 & 1 \end{vmatrix} = 1(14-21) - 6(1-6) + 1(7-28) = -7 + 30 - 21 = 2

Step 4. Compute \det A_3 (replace column 3 with [6, 14, 7]^T).

\det A_3 = \begin{vmatrix} 1 & 1 & 6 \\ 1 & 2 & 14 \\ 2 & 1 & 7 \end{vmatrix} = 1(14-14) - 1(7-28) + 6(1-4) = 0 + 21 - 18 = 3

Step 5. Apply Cramer's rule.

x = \frac{1}{1} = 1, \quad y = \frac{2}{1} = 2, \quad z = \frac{3}{1} = 3

Why: since \det A = 1, each unknown equals its own modified determinant. The answer (1, 2, 3) can be verified: 1+2+3=6, 1+4+9=14, 2+2+3=7. All three equations check out.

Result: x = 1, y = 2, z = 3.

[Figure: three planes, one per equation (x + y + z = 6, x + 2y + 3z = 14, 2x + y + z = 7), intersecting at the single point labeled (1, 2, 3).]
Three planes in three-dimensional space, each representing one equation. When $\det A \neq 0$, the three planes intersect at exactly one point — the unique solution $(1, 2, 3)$. If the determinant were zero, the planes would either be parallel, coincide, or intersect along a line instead of a point.

In three dimensions, each equation is a plane. Two planes typically intersect in a line. Three planes typically intersect in a point — but only if they are "independent" in the sense that no one of them is redundant. The determinant measures exactly this independence.
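The five steps of Example 2 can be replayed numerically. This sketch computes all four determinants with NumPy (floating-point values, so they come out only approximately integer) and verifies the solution against the original equations:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0],
              [2.0, 1.0, 1.0]])
B = np.array([6.0, 14.0, 7.0])

det_A = np.linalg.det(A)  # ≈ 1
dets = []
for k in range(3):
    A_k = A.copy()
    A_k[:, k] = B          # replace column k with the constants
    dets.append(np.linalg.det(A_k))

X = np.array(dets) / det_A
print(np.round(X, 10))     # → [1. 2. 3.]
assert np.allclose(A @ X, B)  # all three equations check out
```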

Non-homogeneous systems: what happens when \det A = 0

When \det A \neq 0, the story is clean: there is exactly one solution. When \det A = 0, the story splits into two cases, and you have to look at the constant vector B to tell them apart.

Case 1: No solution (inconsistent). The system asks for something impossible. Geometrically, two of the planes are parallel (in the 3 \times 3 case), or the three planes form a triangular prism with no common point. There is no (x, y, z) that satisfies all equations simultaneously.

Case 2: Infinitely many solutions (dependent). The equations are not all independent — one of them is a consequence of the others. Geometrically, the planes intersect along a line (or even a whole plane). There are infinitely many solutions, parameterised by one or more free variables.

The condition that separates these two cases involves the augmented matrix [A \mid B]. The rule: if \text{rank}(A) = \text{rank}([A \mid B]), the system has infinitely many solutions; if \text{rank}(A) < \text{rank}([A \mid B]), it has no solution.

Here "rank" means the number of linearly independent rows. In practice, you find it by row-reducing the augmented matrix and seeing whether any row of the form [0 \; 0 \; \cdots \; 0 \mid c] with c \neq 0 appears (inconsistent) or not (infinitely many solutions).

A concrete example of inconsistency. Consider

x + y = 3
2x + 2y = 8

The second equation is 2 times the first, except the right side is 8 instead of 6. The first equation says x + y = 3; the second says x + y = 4. Both cannot be true. The determinant of A = \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} is 0, and the augmented matrix row-reduces to \begin{bmatrix} 1 & 1 & | & 3 \\ 0 & 0 & | & 2 \end{bmatrix}. The second row says 0 = 2 — a contradiction. No solution.

A concrete example of infinitely many solutions. Consider

x + y = 3
2x + 2y = 6

Now the second equation is exactly 2 times the first. Both say x + y = 3. There is really only one equation in two unknowns, so y = 3 - x for any x. Infinitely many solutions: (0, 3), (1, 2), (5, -2), and so on.
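Both singular examples can be classified mechanically with the rank test. A sketch (the function name `classify` and the tolerance are our choices, not from the text):

```python
import numpy as np

def classify(A, B):
    """Classify a square system AX = B as unique / infinite / none."""
    A = np.asarray(A, dtype=float)
    aug = np.column_stack([A, B])  # the augmented matrix [A | B]
    if abs(np.linalg.det(A)) > 1e-12:
        return "unique solution"
    if np.linalg.matrix_rank(A) == np.linalg.matrix_rank(aug):
        return "infinitely many solutions"
    return "no solution"

print(classify([[1, 1], [2, 2]], [3, 8]))  # → no solution
print(classify([[1, 1], [2, 2]], [3, 6]))  # → infinitely many solutions
print(classify([[2, 3], [4, -1]], [8, 2]))  # → unique solution
```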

Homogeneous systems: AX = O

A homogeneous system is one where every constant is zero: AX = O (the zero vector). These systems have a special property: X = O is always a solution. The zero vector satisfies every homogeneous equation, because a_{i1} \cdot 0 + a_{i2} \cdot 0 + \cdots = 0 for every i. This solution — all unknowns equal to zero — is called the trivial solution.

The question for homogeneous systems is not whether a solution exists (it always does), but whether a non-trivial solution exists — one where at least one unknown is nonzero.

The answer depends on the determinant: if \det A \neq 0, the trivial solution is the only solution; if \det A = 0, there are infinitely many solutions, including non-trivial ones.

Notice that the "no solution" case never happens for homogeneous systems. The reason is simple: X = O is always a solution, so the system is always consistent. The only question is whether X = O is the only solution or one of infinitely many.
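When \det A = 0, a non-trivial solution can actually be exhibited. One way (a sketch using the singular value decomposition, a tool beyond this article's scope) is to take the right-singular vector belonging to the zero singular value, which spans the null space of A:

```python
import numpy as np

# Singular matrix: the second row is twice the first, so det A = 0.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.isclose(np.linalg.det(A), 0.0))  # → True

# The last row of Vt is the direction with singular value ~0,
# i.e. a nonzero X with AX = O.
_, s, Vt = np.linalg.svd(A)
x = Vt[-1]
print(np.allclose(A @ x, 0))  # → True
```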

Consistency conditions: the complete picture

Here is the full summary for a system of n equations in n unknowns.

Non-homogeneous system (AX = B, B \neq O):

| Condition | Number of solutions |
|---|---|
| \det A \neq 0 | Exactly one (unique) |
| \det A = 0 and \text{rank}(A) = \text{rank}([A \mid B]) | Infinitely many |
| \det A = 0 and \text{rank}(A) < \text{rank}([A \mid B]) | None |

Homogeneous system (AX = O):

| Condition | Solutions |
|---|---|
| \det A \neq 0 | Only the trivial solution X = O |
| \det A = 0 | Infinitely many (including non-trivial) |

The determinant is the gatekeeper. When it is nonzero, everything is determined — one solution for non-homogeneous, only the trivial solution for homogeneous. When it is zero, the floodgates open.

Common confusions

Going deeper

If you can solve 2 \times 2 and 3 \times 3 systems using Cramer's rule and the inverse method, and you understand the consistency conditions, you have what you need. The material below covers Gaussian elimination in detail and connects the matrix approach to the geometry of linear transformations.

Gaussian elimination: the row reduction method

The most general method for solving linear systems — one that works whether \det A is zero or not, and whether the system is square or not — is Gaussian elimination. You form the augmented matrix [A \mid B] and apply row operations to reduce A to row echelon form (upper triangular with leading 1s). Then you read off the solution by back-substitution.

For the system in Example 2:

\left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 1 & 2 & 3 & 14 \\ 2 & 1 & 1 & 7 \end{array}\right]

R_2 \to R_2 - R_1:

\left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 1 & 2 & 8 \\ 2 & 1 & 1 & 7 \end{array}\right]

R_3 \to R_3 - 2R_1:

\left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 1 & 2 & 8 \\ 0 & -1 & -1 & -5 \end{array}\right]

R_3 \to R_3 + R_2:

\left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & 1 & 3 \end{array}\right]

Now back-substitute: from row 3, z = 3. From row 2, y + 2(3) = 8, so y = 2. From row 1, x + 2 + 3 = 6, so x = 1. The same answer: (1, 2, 3).

The advantage of Gaussian elimination is that it handles all cases uniformly — unique solution, no solution, or infinitely many solutions — without checking the determinant first. A row of the form [0 \; 0 \; 0 \mid c] with c \neq 0 signals inconsistency. A row of all zeros (including the augmented entry) signals a free variable and infinitely many solutions.
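The elimination steps above can be mechanised. This sketch (our own helper, assuming a unique solution exists) adds partial pivoting, which the hand computation skipped but which keeps the arithmetic stable on a computer:

```python
import numpy as np

def gaussian_elimination(A, B):
    """Solve AX = B by row reduction and back-substitution."""
    M = np.column_stack([np.asarray(A, float), np.asarray(B, float)])
    n = len(B)
    for col in range(n):
        # Partial pivoting: bring the largest pivot into place.
        pivot = col + np.argmax(np.abs(M[col:, col]))
        M[[col, pivot]] = M[[pivot, col]]
        # Eliminate everything below the pivot.
        for row in range(col + 1, n):
            M[row] -= (M[row, col] / M[col, col]) * M[col]
    # Back-substitution on the upper-triangular system.
    X = np.zeros(n)
    for row in range(n - 1, -1, -1):
        X[row] = (M[row, -1] - M[row, row + 1:n] @ X[row + 1:]) / M[row, row]
    return X

# The system from Example 2:
print(gaussian_elimination([[1, 1, 1], [1, 2, 3], [2, 1, 1]], [6, 14, 7]))
# → [1. 2. 3.]
```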

The geometry of singular systems

When \det A = 0 for a 3 \times 3 system, several geometric configurations are possible:

  1. Three planes intersecting in a line. Each pair of planes meets in a line, and all three lines are the same. The system has infinitely many solutions, parameterised by one free variable.

  2. Three planes forming a triangular prism. Each pair meets in a line, but the three lines are parallel and distinct. No common point. No solution.

  3. Two planes parallel, third crosses both. No common point. No solution.

  4. All three planes coinciding. The system has infinitely many solutions (two free variables).

The determinant tells you that something unusual is happening. The rank of the augmented matrix tells you exactly which case you are in.

Connection to linear independence

The columns of A represent n vectors in \mathbb{R}^n. The equation AX = B asks: can B be written as a linear combination of these column vectors? When \det A \neq 0, the columns are linearly independent, so they span all of \mathbb{R}^n and every B is reachable in exactly one way. When \det A = 0, the columns are dependent, and B is reachable only if it happens to lie in the smaller space they span.

This is the bridge between the algebraic theory of determinants and the geometric theory of vector spaces. The determinant is counting whether you have enough independent directions to reach any point in the space.
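The span question is exactly a rank question, which NumPy can answer directly (a sketch; `np.linalg.matrix_rank` counts independent columns numerically, using a small tolerance):

```python
import numpy as np

# Columns of A as vectors in R^3; det A != 0 means they are
# independent and span all of R^3, so every B is reachable.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0],
              [2.0, 1.0, 1.0]])
print(np.linalg.matrix_rank(A))  # → 3 (full rank)

# A singular matrix has dependent columns: rank < n, and B is
# reachable only when it lies in the columns' span.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.matrix_rank(S))  # → 1
```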

Where this leads next