In short
A vector space is any collection of objects you can add together and scale — arrows, polynomials, functions, signals. A linear transformation is a rule that maps one vector space to another while respecting addition and scaling. Eigenvalues and eigenvectors are the special directions a transformation stretches without rotating. Together, these three ideas form the backbone of linear algebra and appear everywhere from quantum mechanics to machine learning.
You know how to multiply a matrix by a vector. Take the matrix
A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}
and the vector \mathbf{v} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. The product is
A\mathbf{v} = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \cdot 1 + 1 \cdot 1 \\ 0 \cdot 1 + 3 \cdot 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}.
You fed in the vector (1, 1) and got out the vector (3, 3). The matrix moved the vector — here it stretched it to three times its length while, as it happens, keeping its direction; most other inputs have their direction changed too, as you will see shortly. Do this with every vector in the plane, and the matrix rearranges the entire plane: stretching some regions, compressing others, rotating some directions, flipping others.
That is the shift this article is about. Until now, a matrix has been a rectangle of numbers you manipulate with row operations. From here on, think of a matrix as a machine that moves space. The numbers inside the matrix are instructions for the machine. The interesting questions are: what does this machine do to space? Which directions does it stretch? By how much? Are there directions it leaves alone?
Those questions are what linear algebra is really about. The language you need — vector spaces, linear transformations, eigenvalues — is the subject of this article.
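If you want to play with the machine view numerically, here is a minimal sketch using numpy (assumed installed); the matrix is the one from the opening example:

```python
import numpy as np

# The matrix from the opening example, viewed as a machine that moves space.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

v = np.array([1.0, 1.0])
print(A @ v)   # the vector (1, 1) comes out as (3, 3)

# Feed in a few more directions and watch how each one is moved.
for direction in ([1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]):
    d = np.array(direction)
    print(d, "->", A @ d)
```

Running this shows each input direction and where the machine sends it.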
Vector spaces: the idea behind the arrows
You have been working with vectors in \mathbb{R}^2 (pairs of numbers) and \mathbb{R}^3 (triples). But the word "vector" in mathematics is broader than that.
Consider polynomials of degree at most 2. Any such polynomial looks like a + bx + cx^2. You can add two of them:
(a_1 + b_1 x + c_1 x^2) + (a_2 + b_2 x + c_2 x^2) = (a_1 + a_2) + (b_1 + b_2)x + (c_1 + c_2)x^2.
You can multiply one by a number:
k(a + bx + cx^2) = ka + kb\,x + kc\,x^2.
These two operations — addition and scalar multiplication — obey exactly the same rules as addition and scaling of arrows in the plane: addition is commutative and associative, there is a zero element (the zero polynomial), scalar multiplication distributes over addition, and so on. Every rule you have used for vectors in \mathbb{R}^2 works here too.
That is the insight: polynomials behave like vectors. Not because they are arrows in space, but because they satisfy the same algebraic rules. A vector space is any collection of objects that follows these rules. The objects are called vectors, even if they are polynomials, matrices, functions, or signals.
Vector space
A vector space over the real numbers \mathbb{R} is a set V equipped with two operations — addition (\mathbf{u} + \mathbf{v}) and scalar multiplication (c\mathbf{v}) — satisfying the following axioms for all \mathbf{u}, \mathbf{v}, \mathbf{w} \in V and all scalars c, d \in \mathbb{R}:
- \mathbf{u} + \mathbf{v} \in V (closure under addition)
- c\mathbf{v} \in V (closure under scalar multiplication)
- \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u} (commutativity)
- (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}) (associativity)
- There exists a zero vector \mathbf{0} such that \mathbf{v} + \mathbf{0} = \mathbf{v}
- For every \mathbf{v}, there exists -\mathbf{v} such that \mathbf{v} + (-\mathbf{v}) = \mathbf{0}
- c(d\mathbf{v}) = (cd)\mathbf{v}
- 1 \cdot \mathbf{v} = \mathbf{v}
- c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v} (distributivity over vector addition)
- (c + d)\mathbf{v} = c\mathbf{v} + d\mathbf{v} (distributivity over scalar addition)
Ten axioms look like a lot, but most of them are things you already assume without thinking when you do arithmetic with vectors. The list is the formal guarantee that all the algebra you have been doing in \mathbb{R}^2 and \mathbb{R}^3 carries over intact to the new setting.
The payoff of abstraction is enormous. Any theorem you prove using only these ten axioms automatically applies to every vector space — arrows in 3D, polynomials, infinite-dimensional function spaces, the state spaces of quantum mechanics. You prove it once and use it everywhere.
Basis and dimension
Inside any vector space, a basis is a minimal set of vectors from which every other vector can be built by addition and scaling. In \mathbb{R}^2, the standard basis is \{(1, 0), (0, 1)\}: every vector (a, b) is a(1, 0) + b(0, 1). In the space of polynomials of degree at most 2, a natural basis is \{1, x, x^2\}: every such polynomial a + bx + cx^2 is a \cdot 1 + b \cdot x + c \cdot x^2.
The number of vectors in a basis is called the dimension of the space. \mathbb{R}^2 has dimension 2, and \mathbb{R}^3 has dimension 3. The polynomial space \{a + bx + cx^2\} also has dimension 3 (three basis elements: 1, x, x^2). It and \mathbb{R}^3 are different spaces, but they have the same structure — a fact that linear algebra makes precise.
The deep point is that once you choose a basis, every vector in the space can be described by its coordinates — a list of numbers. A vector in a 3-dimensional space needs three coordinates, regardless of whether the "vectors" are arrows, polynomials, or something else entirely. This is why matrices — which are just arrays of numbers — can represent operations on polynomials, functions, and signals, not only on geometric arrows.
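The coordinate idea is easy to check numerically. The sketch below (numpy assumed) represents polynomials by their coordinate vectors in the basis \{1, x, x^2\} and adds them as vectors:

```python
import numpy as np

# A polynomial a + b x + c x^2 is represented by its coordinates (a, b, c)
# in the basis {1, x, x^2}.  Adding polynomials is then vector addition.
p = np.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2
q = np.array([4.0, 0.0, -1.0])  # 4 - x^2

coeff_sum = p + q               # coordinates of the sum polynomial
print(coeff_sum)                # [5. 2. 2.]  -> 5 + 2x + 2x^2

# Cross-check against evaluating the polynomials at a sample point.
x = 1.7
eval_p = p[0] + p[1] * x + p[2] * x**2
eval_q = q[0] + q[1] * x + q[2] * x**2
eval_sum = coeff_sum[0] + coeff_sum[1] * x + coeff_sum[2] * x**2
print(np.isclose(eval_p + eval_q, eval_sum))  # True
```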
Linear transformations: the machines
A linear transformation is a function T: V \to W between two vector spaces that respects the two operations:
T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \quad \text{and} \quad T(c\mathbf{v}) = c\,T(\mathbf{v})
for all \mathbf{u}, \mathbf{v} \in V and all scalars c.
These two rules say: the transformation commutes with addition and scaling. If you add two vectors and then transform, you get the same result as transforming each one and then adding. If you scale a vector and then transform, you get the same result as transforming and then scaling.
Why does this matter? Because every matrix multiplication \mathbf{v} \mapsto A\mathbf{v} is a linear transformation — and conversely, every linear transformation between finite-dimensional vector spaces can be represented by a matrix (once you choose bases). So linear transformations are what matrices do; the matrix is just the numerical encoding.
Here are three concrete examples to build intuition.
Rotation. The matrix
R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
rotates every vector in \mathbb{R}^2 by \theta radians anticlockwise. For \theta = 90° = \pi/2:
R_{\pi/2} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
The vector (1, 0) becomes (0, 1) — rotated 90° anticlockwise, exactly as expected.
Reflection. The matrix
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
reflects every vector across the x-axis: (a, b) \mapsto (a, -b). The x-component stays, the y-component flips sign.
Differentiation. Consider the vector space of polynomials of degree at most 3. The derivative operator D that sends p(x) to p'(x) is a linear transformation:
D(p + q) = D(p) + D(q), \qquad D(cp) = c\,D(p).
Linearity holds because the derivative of a sum is the sum of the derivatives, and the derivative of a constant times a polynomial is the constant times the derivative. Using the basis \{1, x, x^2, x^3\}, the matrix of D is:
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}
because D(1) = 0, D(x) = 1, D(x^2) = 2x, D(x^3) = 3x^2. The columns of the matrix are the coordinate vectors of D applied to each basis element. Differentiation — which you have been doing as calculus — is just matrix multiplication in disguise.
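The claim that differentiation is matrix multiplication can be tested directly. A short numpy sketch using this matrix of D:

```python
import numpy as np

# Matrix of D on polynomials of degree <= 3, basis {1, x, x^2, x^3}:
# column j holds the coordinates of D applied to the j-th basis element,
# since D(1) = 0, D(x) = 1, D(x^2) = 2x, D(x^3) = 3x^2.
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

# p(x) = 5 + 4x + 3x^2 + 2x^3 as a coordinate vector.
p = np.array([5.0, 4.0, 3.0, 2.0])

print(D @ p)   # [4. 6. 6. 0.]  i.e. p'(x) = 4 + 6x + 6x^2
```

Multiplying by D really does differentiate: the output coordinates are those of p'(x).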
Notice what all three transformations have in common: the origin stays fixed, straight lines remain straight, and parallel lines remain parallel. Those are the geometric consequences of linearity. Any transformation that bends a straight line or moves the origin cannot be linear.
Eigenvalues and eigenvectors: the special directions
Here is the question that makes linear algebra powerful. Given a linear transformation (or equivalently, a matrix A), are there any vectors that the transformation simply stretches (or compresses, or flips) without changing direction?
Such a vector \mathbf{v} would satisfy
A\mathbf{v} = \lambda\mathbf{v}
for some scalar \lambda. The transformation sends \mathbf{v} to a scalar multiple of itself — the same line, just scaled by \lambda. The scalar \lambda is called an eigenvalue and \mathbf{v} is called the corresponding eigenvector.
Take the matrix from the start of this article:
A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}
Try the vector \mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}:
A\mathbf{v}_1 = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 0 \end{pmatrix}
The output is 2 times the input. So \mathbf{v}_1 = (1, 0) is an eigenvector with eigenvalue \lambda_1 = 2. The matrix stretches this direction by a factor of 2.
Now try \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}:
A\mathbf{v}_2 = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3\begin{pmatrix} 1 \\ 1 \end{pmatrix}
The output is 3 times the input. So \mathbf{v}_2 = (1, 1) is an eigenvector with eigenvalue \lambda_2 = 3. The matrix stretches this direction by a factor of 3.
Every vector off these two lines gets both stretched and rotated by A. But vectors along these two special directions — the eigenvectors — get stretched without any rotation. They are the skeleton of the transformation.
Eigenvalue and eigenvector
Let A be an n \times n matrix. A nonzero vector \mathbf{v} is an eigenvector of A if there exists a scalar \lambda such that
A\mathbf{v} = \lambda\mathbf{v}.
The scalar \lambda is called the eigenvalue corresponding to \mathbf{v}. The zero vector is never counted as an eigenvector, even though A\mathbf{0} = \lambda\mathbf{0} for any \lambda.
Finding eigenvalues: the characteristic equation
You do not have to guess eigenvectors. There is a systematic method.
Rewrite A\mathbf{v} = \lambda\mathbf{v} as (A - \lambda I)\mathbf{v} = \mathbf{0}, where I is the identity matrix. This is a homogeneous system. It has a nonzero solution \mathbf{v} if and only if the matrix A - \lambda I is singular — that is, its determinant is zero:
\det(A - \lambda I) = 0.
This is called the characteristic equation. For a 2 \times 2 matrix, it is a quadratic in \lambda; for a 3 \times 3 matrix, a cubic; and so on. The roots of this equation are the eigenvalues.
For A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}:
\det(A - \lambda I) = \det\begin{pmatrix} 2 - \lambda & 1 \\ 0 & 3 - \lambda \end{pmatrix} = (2 - \lambda)(3 - \lambda) = 0.
So \lambda = 2 or \lambda = 3 — exactly the eigenvalues you found by inspection.
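You can confirm both the characteristic-equation calculation and the eigenvalues with numpy:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# det(A - lambda I) should vanish exactly at the eigenvalues.
for lam in (2.0, 3.0):
    print(np.linalg.det(A - lam * np.eye(2)))   # 0.0 (up to rounding)

# numpy finds the same roots directly.
print(np.sort(np.linalg.eigvals(A)))   # [2. 3.]
```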
Worked examples
Example 1: Eigenvalues and eigenvectors of a 2×2 matrix
Find the eigenvalues and eigenvectors of
A = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}.
Step 1. Write the characteristic equation.
\det(A - \lambda I) = \det\begin{pmatrix} 4 - \lambda & 2 \\ 1 & 3 - \lambda \end{pmatrix} = (4 - \lambda)(3 - \lambda) - 2 = \lambda^2 - 7\lambda + 10 = 0.
Why: expanding the determinant gives a quadratic in \lambda. The roots of this quadratic are the eigenvalues.
Step 2. Solve the quadratic.
\lambda^2 - 7\lambda + 10 = (\lambda - 2)(\lambda - 5) = 0.
So \lambda_1 = 2 and \lambda_2 = 5.
Why: factor by looking for two numbers that multiply to 10 and add to -7. That gives -2 and -5.
Step 3. Find the eigenvector for \lambda_1 = 2. Solve (A - 2I)\mathbf{v} = \mathbf{0}:
\begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Both rows say the same thing: v_1 + v_2 = 0, so v_2 = -v_1. Taking v_1 = 1: \mathbf{v}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
Why: a singular matrix always has at least one free variable. That free variable gives a family of eigenvectors along one line — you pick any nonzero representative.
Step 4. Find the eigenvector for \lambda_2 = 5. Solve (A - 5I)\mathbf{v} = \mathbf{0}:
\begin{pmatrix} -1 & 2 \\ 1 & -2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Both rows say v_1 = 2v_2. Taking v_2 = 1: \mathbf{v}_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
Why: same logic. The matrix A - 5I is singular, so the system has a nontrivial solution lying along a line.
Result: Eigenvalues \lambda_1 = 2, \lambda_2 = 5. Eigenvectors \mathbf{v}_1 = (1, -1), \mathbf{v}_2 = (2, 1).
You can verify: A(1, -1)^T = (4 \cdot 1 + 2 \cdot (-1),\; 1 \cdot 1 + 3 \cdot (-1))^T = (2, -2)^T = 2(1, -1)^T. And A(2, 1)^T = (4 \cdot 2 + 2 \cdot 1,\; 1 \cdot 2 + 3 \cdot 1)^T = (10, 5)^T = 5(2, 1)^T. Both eigenvectors land back on their own line, scaled by their eigenvalue.
Example 2: A linear transformation on polynomials
Consider the vector space V of polynomials of degree at most 2, with basis \{1, x, x^2\}. Define the linear transformation T: V \to V by
T(p)(x) = p(x) + x\,p'(x),
where p'(x) is the derivative. Find the matrix of T and its eigenvalues.
Step 1. Apply T to each basis element.
T(1) = 1, \qquad T(x) = 2x, \qquad T(x^2) = 3x^2.
Why: a linear transformation is completely determined by what it does to a basis. Once you know T(1), T(x), T(x^2), you can find T of any polynomial by linearity.
Step 2. Write the matrix. Express each output in coordinates relative to the basis \{1, x, x^2\}:
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}.
Why: each column of the matrix is the coordinate vector of T applied to the corresponding basis element. Since the outputs are all multiples of basis elements, the matrix is diagonal.
Step 3. The matrix is already diagonal, so the eigenvalues are the diagonal entries: \lambda_1 = 1, \lambda_2 = 2, \lambda_3 = 3. The eigenvectors (in coordinates) are the standard basis vectors, which correspond to the polynomials 1, x, and x^2.
Why: a diagonal matrix stretches each basis direction independently. The diagonal entries are the stretching factors — the eigenvalues.
Step 4. Verify: T(x^2) = 3x^2 = 3 \cdot x^2. The polynomial x^2 is mapped to 3 times itself. It is an eigenvector of the transformation T with eigenvalue 3.
Result: The eigenvalues of T are 1, 2, 3, with eigenvectors 1, x, x^2 respectively.
This example shows why eigenvectors matter beyond geometry. The transformation T is not about arrows or rotations — it acts on polynomials. But it still has eigenvalues and eigenvectors, and they tell you the same thing: which inputs pass through the machine and come out scaled, without mixing with other components.
Diagonalization: choosing the right coordinates
The polynomial example above was special: the matrix of T came out diagonal. That happened because the basis \{1, x, x^2\} consists entirely of eigenvectors. In a diagonal matrix, each basis direction is acted on independently — no mixing, no rotation, just pure scaling.
Most matrices are not diagonal when you write them in the standard basis. But if you can find a basis of eigenvectors, then in that basis the matrix becomes diagonal. This process is called diagonalization.
Here is the recipe. Suppose A is an n \times n matrix with n linearly independent eigenvectors \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n and corresponding eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_n. Form the matrix P whose columns are the eigenvectors:
P = \begin{pmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{pmatrix}.
Then
P^{-1}AP = D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.
The matrix D is diagonal, with the eigenvalues on the diagonal. The matrix P is the change of basis from the standard basis to the eigenvector basis. In the eigenvector basis, A acts as pure scaling along each axis.
For the matrix A = \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix} from Example 1, with eigenvectors (1, -1) and (2, 1):
P = \begin{pmatrix} 1 & 2 \\ -1 & 1 \end{pmatrix}, \qquad P^{-1}AP = \begin{pmatrix} 2 & 0 \\ 0 & 5 \end{pmatrix}.
You can verify this by computing P^{-1} = \frac{1}{3}\begin{pmatrix} 1 & -2 \\ 1 & 1 \end{pmatrix} and multiplying out P^{-1}AP.
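The same check can be done in numpy, letting the computer invert P:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
P = np.array([[1.0, 2.0],
              [-1.0, 1.0]])   # columns: eigenvectors (1, -1) and (2, 1)
D = np.diag([2.0, 5.0])

# In the eigenvector basis, A becomes the diagonal matrix D.
print(np.linalg.inv(P) @ A @ P)   # [[2. 0.], [0. 5.]]
```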
Why diagonalization matters
A diagonal matrix is the easiest matrix to work with. For instance, computing A^{100} for a general 2 \times 2 matrix is a nightmare of matrix multiplication. But if A = PDP^{-1}, then
A^{100} = (PDP^{-1})(PDP^{-1})\cdots(PDP^{-1}) = PD^{100}P^{-1},
because every interior P^{-1}P cancels.
Powers of a diagonal matrix are just powers of the diagonal entries. Diagonalization turns a hard computation into an easy one by changing to the right coordinate system.
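A quick numerical check of this identity, using the matrices from Example 1 (a power of 10 rather than 100 keeps the numbers easy to compare):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
P = np.array([[1.0, 2.0],
              [-1.0, 1.0]])
Pinv = np.linalg.inv(P)

# A^10 two ways: brute-force repeated multiplication versus P D^10 P^{-1},
# where D^10 is just the diagonal entries raised to the 10th power.
direct = np.linalg.matrix_power(A, 10)
via_diag = P @ np.diag([2.0**10, 5.0**10]) @ Pinv

print(np.allclose(direct, via_diag))   # True
```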
This is the deep lesson: the right basis simplifies everything. Linear algebra is, in large part, the art of choosing coordinates that make a problem easy. Eigenvectors are the coordinates that make linear transformations easy.
When diagonalization fails
Not every matrix can be diagonalized. The matrix
B = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}
has characteristic equation (2 - \lambda)^2 = 0, so \lambda = 2 is the only eigenvalue (with algebraic multiplicity 2). Solving (B - 2I)\mathbf{v} = \mathbf{0} gives \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\mathbf{v} = \mathbf{0}, which yields only v_2 = 0 — there is only one independent eigenvector, (1, 0). You cannot form a basis of eigenvectors. This matrix is not diagonalizable.
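The failure is easy to see numerically with a short numpy sketch:

```python
import numpy as np

B = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# lambda = 2 is the only eigenvalue.  Look at B - 2I:
M = B - 2.0 * np.eye(2)
print(M)   # [[0. 1.], [0. 0.]] -- so M v = 0 forces v2 = 0

# The eigenspace is only the line spanned by (1, 0): one independent
# direction, not enough for a basis of R^2, so B is not diagonalizable.
v = np.array([1.0, 0.0])
print(B @ v)   # (2, 0), which is 2 v
```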
Such matrices are handled by a more general decomposition called the Jordan normal form, which uses a near-diagonal structure with 1's above the diagonal to account for the missing eigenvectors. It is the subject of a more advanced course, but the key idea is the same: find the simplest form of the matrix by choosing the best basis.
Common confusions
- "An eigenvalue of zero means there is no eigenvector." An eigenvalue of \lambda = 0 is perfectly valid — it means the transformation collapses the eigenvector to the zero vector. The matrix \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} has eigenvalue 0 with eigenvector (0, 1): it crushes the y-axis to a point. The determinant of a matrix is the product of its eigenvalues, so \lambda = 0 means the matrix is singular.
- "Eigenvectors are unique." An eigenvector is not unique — any nonzero scalar multiple of an eigenvector is also an eigenvector with the same eigenvalue. What is unique is the eigenspace: the line (or plane, or subspace) of all eigenvectors for a given eigenvalue. When you write "the eigenvector is (1, -1)," you really mean "the eigenspace is spanned by (1, -1)."
- "Every matrix has real eigenvalues." The rotation matrix \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} (rotation by 90°) has characteristic equation \lambda^2 + 1 = 0, giving \lambda = \pm i. No real eigenvalues — because a 90° rotation does not leave any real direction fixed. Complex eigenvalues arise naturally from rotations and oscillations.
- "A vector space must contain arrows." Polynomials, matrices, continuous functions on [0, 1], sequences of real numbers — all of these are vector spaces. The abstract definition captures the structure (addition and scaling), not the shape of the objects. Calling them "vectors" is about what you can do with them, not what they look like.
- "Linear transformation means the graph is a straight line." The function f(x) = 2x + 3 has a straight-line graph but is not a linear transformation: it fails linearity, since f(0) = 3 \neq 0. Linear transformations must send the zero vector to the zero vector. The function f(x) = 2x is a linear transformation; f(x) = 2x + 3 is an affine transformation.
Going deeper
If you came here to understand what vector spaces, linear transformations, and eigenvalues are, you have the picture — you can stop here. What follows is for readers who want to see how these ideas connect to the rest of mathematics.
The spectral theorem
For symmetric matrices — matrices where A = A^T — something remarkable happens. The eigenvalues are always real, the eigenvectors corresponding to different eigenvalues are always orthogonal (perpendicular), and the matrix is always diagonalizable. This is the spectral theorem, one of the most important results in linear algebra.
For example, A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} is symmetric. Its eigenvalues are 4 and 2, with eigenvectors (1, 1) and (1, -1) — which are indeed perpendicular (1 \cdot 1 + 1 \cdot (-1) = 0). Symmetric matrices decompose space into mutually perpendicular stretching directions. This is why they are so central to physics (moments of inertia, stress tensors) and statistics (covariance matrices, principal component analysis).
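numpy's eigh routine, which is specialised to symmetric matrices, confirms both claims for this example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])   # symmetric: A equals its transpose

# eigh returns real eigenvalues (in ascending order) and orthonormal
# eigenvectors for a symmetric matrix.
vals, vecs = np.linalg.eigh(A)
print(vals)   # [2. 4.]

# Eigenvectors for the two distinct eigenvalues are perpendicular.
print(np.dot(vecs[:, 0], vecs[:, 1]))   # 0.0 (up to rounding)
```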
Eigenvalues and differential equations
The system of differential equations
\dot{\mathbf{x}} = A\mathbf{x}
has solutions of the form e^{\lambda t}\mathbf{v}, where \lambda is an eigenvalue and \mathbf{v} is the corresponding eigenvector. If \lambda > 0, the solution grows exponentially along the eigenvector direction. If \lambda < 0, it decays. If \lambda is complex, you get oscillations.
The eigenvalues of A therefore control the qualitative behaviour of every dynamical system described by the equation \dot{\mathbf{x}} = A\mathbf{x}. This is why eigenvalues appear in stability analysis, control theory, quantum mechanics (where the eigenvalues of the Hamiltonian are the energy levels), and vibration analysis (where they give the natural frequencies).
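You can verify the eigenvector-solution claim numerically at a sample time t, using the eigenpair \lambda = 5, \mathbf{v} = (2, 1) from Example 1:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam = 5.0
v = np.array([2.0, 1.0])   # eigenpair of A from Example 1

# x(t) = e^{lam t} v has derivative x'(t) = lam e^{lam t} v, while
# A x(t) = e^{lam t} (A v) = e^{lam t} lam v -- the same vector.
t = 0.3
x = np.exp(lam * t) * v
x_dot = lam * np.exp(lam * t) * v

print(np.allclose(A @ x, x_dot))   # True: x(t) solves x' = A x
```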
The dimension theorem (rank-nullity)
For a linear transformation T: V \to W, the rank-nullity theorem says:
\dim(\ker T) + \dim(\text{im}\, T) = \dim V.
The kernel (null space) \ker T is the set of vectors that T sends to zero. The image \text{im}\, T is the set of vectors that T can produce. The theorem says these two dimensions always add up to the dimension of the domain. It is the fundamental bookkeeping rule of linear algebra: whatever T "wastes" on the kernel, it makes up for in the image.
For the differentiation operator D on polynomials of degree at most 3: the kernel is the constant polynomials (dimension 1, since D(c) = 0), and the image is the polynomials of degree at most 2 (dimension 3). Indeed, 1 + 3 = 4 = \dim V.
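The bookkeeping can be checked with numpy's matrix_rank, using the matrix of the differentiation operator from earlier:

```python
import numpy as np

# Matrix of the derivative operator on polynomials of degree <= 3,
# in the basis {1, x, x^2, x^3}.
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

rank = int(np.linalg.matrix_rank(D))   # dim im D
nullity = D.shape[1] - rank            # dim ker D
print(rank, nullity, rank + nullity)   # 3 1 4 -- and dim V = 4
```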
Where this leads next
You now have the vocabulary for abstract linear algebra. The road ahead branches in several directions.
- Matrix Transformations — a geometric study of what specific 2 \times 2 and 3 \times 3 matrices do to space: rotations, reflections, projections, shears.
- Diagonalization — the full theory, including when it works, when it fails, and what to do when it fails (Jordan form).
- Inner Product Spaces — adding the notion of angle and length to abstract vector spaces, leading to orthogonality, projections, and the spectral theorem.
- Systems of Linear Equations — the connection between matrices and solving simultaneous equations, via row reduction and the rank-nullity theorem.
- Singular Value Decomposition — a generalisation of diagonalization that works for every matrix, not just square ones with enough eigenvectors.