In short
The Mean Value Theorem (Lagrange's form) says: if f is continuous on [a, b] and differentiable on (a, b), then there exists a point c in (a, b) where the instantaneous rate of change equals the average rate of change — that is, f'(c) = \frac{f(b) - f(a)}{b - a}. Cauchy's MVT generalises this to ratios of two functions.
A car drives from Delhi to Agra — a distance of 200 km — in exactly 2 hours. Its average speed over the trip is 200 \div 2 = 100 km/h.
Here is a question: at some point during the trip, was the car's speedometer reading exactly 100 km/h?
Think about it. The car started at some speed, maybe 0 (at a red light), maybe 60. It ended at some speed. Its average was 100. Must it have passed through 100 at some instant?
Yes. And the reason is not complicated. If the car's speed were always less than 100 km/h, it could not have covered 200 km in 2 hours: the distance would be less than 100 \times 2 = 200 km. If the speed were always greater than 100 km/h, the distance would be more than 200 km. So the speed was at most 100 at some instant and at least 100 at another. Since the speed changes continuously (the speedometer needle cannot jump between two readings without passing through the values in between), it must have read exactly 100 at least once.
That is the Mean Value Theorem. At some point during the journey, the instantaneous speed equals the average speed. The theorem says this is true not just for cars, but for any smooth function: the instantaneous rate of change must, at some point, equal the average rate of change.
From Rolle to Lagrange: tilting the picture
You have already met Rolle's theorem: if a smooth curve starts and ends at the same height, there is a horizontal tangent somewhere in between. The Mean Value Theorem is the generalisation to curves that start and end at different heights.
The idea is a beautiful geometric trick. Suppose f(a) \neq f(b). Draw the straight line from (a, f(a)) to (b, f(b)); this is the secant line. Its slope is the average rate of change:
\frac{f(b) - f(a)}{b - a}
Now here is the trick: define a new function by subtracting this secant line from f. Call it
g(x) = f(x) - \left[f(a) + \frac{f(b) - f(a)}{b - a}(x - a)\right]
What is g? It is the vertical distance from the curve to the secant line. At x = a: g(a) = f(a) - f(a) = 0. At x = b: g(b) = f(b) - \left[f(a) + \frac{f(b) - f(a)}{b - a}(b - a)\right] = f(b) - f(b) = 0. So g(a) = g(b) = 0.
The function g satisfies all the conditions of Rolle's theorem: it is continuous on [a, b] (because f is, and the secant is a line), differentiable on (a, b), and g(a) = g(b). So there exists a c in (a, b) where g'(c) = 0.
Compute g'(x):
g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}
Setting g'(c) = 0:
f'(c) = \frac{f(b) - f(a)}{b - a}
That is it. The Mean Value Theorem falls out of Rolle's theorem in five lines, just by tilting the picture.
The formal statement
Lagrange's Mean Value Theorem
Let f be a function such that:
- f is continuous on [a, b],
- f is differentiable on (a, b).
Then there exists at least one point c \in (a, b) such that
f'(c) = \frac{f(b) - f(a)}{b - a}
Reading the theorem geometrically. The right-hand side is the slope of the secant line joining (a, f(a)) and (b, f(b)). The left-hand side is the slope of the tangent line at x = c. The theorem says: somewhere between a and b, the tangent is parallel to the secant.
Rolle's theorem is a special case. When f(a) = f(b), the secant is horizontal (slope zero), and the MVT reduces to f'(c) = 0 — exactly Rolle's theorem.
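To make the statement concrete, here is a minimal Python sketch that locates one such c numerically. The function f(x) = x^3 on [0, 2] and the helper name `find_mvt_point` are illustrative choices, not part of the theorem. The helper assumes that f'(x) minus the secant slope changes sign across the interval, which holds for this example but is not guaranteed in general.

```python
def find_mvt_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection on f'(x) - slope. Assumes a sign change on [lo, hi]."""
    g_lo = f_prime(lo) - slope
    g_hi = f_prime(hi) - slope
    if g_lo * g_hi > 0:
        raise ValueError("f'(x) - slope does not change sign on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f_prime(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid                  # root lies in the left half
        else:
            lo, g_lo = mid, g_mid     # root lies in the right half
    return (lo + hi) / 2


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)   # average rate of change on [a, b]
c = find_mvt_point(f_prime, secant_slope, a, b)
print(c, f_prime(c), secant_slope)
```

For this example the equation 3c^2 = 4 can also be solved by hand, giving c = 2/\sqrt{3} \approx 1.1547, which the bisection result should match.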
The proof, laid out step by step
Define the auxiliary function
g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)
Verify the conditions of Rolle's theorem for g:
- Continuity on [a, b]: f is continuous by hypothesis, and the second term is a linear function (continuous everywhere). So g is continuous.
- Differentiability on (a, b): f is differentiable by hypothesis, and a linear function is differentiable everywhere. So g is differentiable.
- Equal endpoint values:
  - g(a) = f(a) - f(a) - \frac{f(b)-f(a)}{b-a}(a - a) = 0
  - g(b) = f(b) - f(a) - \frac{f(b)-f(a)}{b-a}(b - a) = f(b) - f(a) - [f(b) - f(a)] = 0
So g(a) = g(b) = 0.
By Rolle's theorem, there exists c \in (a, b) with g'(c) = 0.
Compute g'(c):
g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}
Setting g'(c) = 0:
f'(c) = \frac{f(b) - f(a)}{b - a}
which is the conclusion of the MVT. \blacksquare
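The steps of the proof can be checked numerically on a specific function. The sketch below is only an illustration under assumed choices: f(x) = e^x on [0, 1], for which the point c can be written in closed form as \ln(e - 1). It confirms that the auxiliary function vanishes at both endpoints and that its derivative vanishes at c.

```python
import math

a, b = 0.0, 1.0
f = math.exp
slope = (f(b) - f(a)) / (b - a)   # secant slope, equal to e - 1 here


def g(x):
    """Auxiliary function: f minus the secant line through (a, f(a))."""
    return f(x) - f(a) - slope * (x - a)


def g_prime(x):
    """Derivative of g; the derivative of exp is exp itself."""
    return f(x) - slope


c = math.log(slope)               # solves exp(c) = e - 1

g_at_a, g_at_b, g_prime_at_c = g(a), g(b), g_prime(c)
print(g_at_a, g_at_b, c, g_prime_at_c)
```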
Geometric interpretation
The picture is the theorem. Draw the curve y = f(x) between x = a and x = b. Draw the secant line connecting the two endpoints. Now imagine sliding a line parallel to that secant, moving it vertically. At some position, this sliding line will be tangent to the curve — it will touch the curve at exactly one point (locally). That touching point is the c guaranteed by the MVT.
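The sliding line can touch the curve at more than one position. As a small illustration (the cubic f(x) = x^3 on [-1, 1] is chosen here purely as an example), the sketch below solves f'(c) = secant slope by hand, that is 3c^2 = slope, and keeps the solutions that fall inside the open interval.

```python
import math


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


a, b = -1.0, 1.0
slope = (f(b) - f(a)) / (b - a)            # secant slope on [-1, 1]

# 3c^2 = slope has two roots, c = -sqrt(slope / 3) and c = +sqrt(slope / 3).
candidates = [-math.sqrt(slope / 3), math.sqrt(slope / 3)]

# Keep only candidates inside (a, b) where the tangent slope matches the secant slope.
valid = [c for c in candidates
         if a < c < b and math.isclose(f_prime(c), slope)]
print(valid)
```

Both candidates survive the filter, so this curve has two tangent lines parallel to the secant.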
Applications of the MVT
The MVT is not just a theoretical statement — it is the tool behind nearly every result in the qualitative theory of derivatives. Here are the most important applications.
Application 1: A function with zero derivative is constant
If f'(x) = 0 for every x in an interval (a, b), then f is constant on (a, b).
Proof using MVT. Pick any two points x_1 < x_2 in (a, b). By the MVT, there is a c between them with f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}. But f'(c) = 0, so f(x_2) - f(x_1) = 0, so f(x_1) = f(x_2). Since this holds for any two points, f is constant.
This is the foundation of antiderivatives: two functions with the same derivative differ by a constant.
Application 2: The monotonicity test
If f'(x) > 0 for all x in (a, b), then f is strictly increasing on (a, b).
Proof using MVT. Take x_1 < x_2 in (a, b). By MVT, f(x_2) - f(x_1) = f'(c)(x_2 - x_1) for some c between them. Since f'(c) > 0 and x_2 - x_1 > 0, the product is positive, so f(x_2) > f(x_1). The function is increasing.
You will use this result constantly in the Monotonicity article.
Application 3: Bounding function values
The MVT gives precise bounds on how much a function can change. Since f(b) - f(a) = f'(c)(b - a) for some c \in (a, b), if you know that m \le f'(x) \le M on (a, b), then
m(b - a) \le f(b) - f(a) \le M(b - a).
This is a powerful way to estimate function values without computing them exactly.
Example 1: Estimate $\sqrt{26}$ using the MVT
Step 1. Take f(x) = \sqrt{x}, with a = 25 and b = 26. You know f(25) = 5. You want f(26) = \sqrt{26}.
Why: \sqrt{25} is known exactly. The MVT will relate \sqrt{26} - \sqrt{25} to the derivative.
Step 2. By the MVT, f(26) - f(25) = f'(c)(26 - 25) = f'(c) for some c \in (25, 26).
Why: the interval has length 1, so the factor (b - a) is just 1.
Step 3. Compute f'(x) = \frac{1}{2\sqrt{x}}. For c \in (25, 26):
\frac{1}{2\sqrt{26}} < f'(c) < \frac{1}{2\sqrt{25}} = \frac{1}{10}
Why: \sqrt{x} is increasing, so \frac{1}{2\sqrt{x}} is decreasing. The largest value of f' on (25, 26) occurs at the left endpoint.
Step 4. So \sqrt{26} = 5 + f'(c), and f'(c) < \frac{1}{10} = 0.1. Also, since 5.1^2 = 26.01 > 26 gives \sqrt{26} < 5.1, we have f'(c) > \frac{1}{2\sqrt{26}} > \frac{1}{2 \times 5.1} \approx 0.098. So
5.098 < \sqrt{26} < 5.1
Why: the MVT has turned the problem into a derivative estimate. The actual value is \sqrt{26} \approx 5.0990.
Result: \sqrt{26} lies between 5.098 and 5.1, with the MVT giving the rigorous bounds.
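The arithmetic in this example can be replayed in a few lines of Python. This is a sketch of the same bounding argument, not a general-purpose routine: the lower bound on f' deliberately uses 5.1 in place of \sqrt{26}, exactly as in Step 4, so the bounds never rely on the quantity being estimated.

```python
import math

a, b = 25.0, 26.0
f_a = math.sqrt(a)                 # known exactly: sqrt(25) = 5


def f_prime(x):
    """Derivative of sqrt(x)."""
    return 1.0 / (2.0 * math.sqrt(x))


# f' is decreasing, so its largest value on [25, 26] is at the left endpoint.
M = f_prime(a)                     # 1/10
# Lower bound for f' on (25, 26): sqrt(26) < 5.1 because 5.1**2 = 26.01 > 26.
m = 1.0 / (2.0 * 5.1)

lower = f_a + m * (b - a)
upper = f_a + M * (b - a)
print(lower, math.sqrt(b), upper)
```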
Example 2: Prove that $|\sin a - \sin b| \le |a - b|$ for all real $a, b$
Step 1. Take f(x) = \sin x. This function is continuous everywhere and differentiable everywhere.
Why: \sin x satisfies the MVT hypotheses on any interval.
Step 2. By the MVT applied to f on [a, b] (assuming a < b), there is some c \in (a, b) with
\sin b - \sin a = \cos(c)\,(b - a)
Why: f'(x) = \cos x.
Step 3. Take absolute values:
|\sin b - \sin a| = |\cos(c)|\,|b - a| \le |b - a|
Why: |\cos(c)| \le 1 for all c. This is the only property of cosine needed.
Step 4. The case a = b is trivial (both sides are zero). If a > b, swap the labels.
Result: |\sin a - \sin b| \le |a - b| for all real a, b. This inequality is called a Lipschitz condition with constant 1.
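A numerical spot check is no substitute for the proof, but it is a quick way to see the inequality in action. The sketch below samples random pairs (the range [-10, 10], the sample count and the seed are arbitrary choices) and records the largest ratio |\sin a - \sin b| / |a - b| it encounters.

```python
import math
import random

random.seed(0)                     # fixed seed so the run is repeatable

worst_ratio = 0.0
for _ in range(10_000):
    a = random.uniform(-10.0, 10.0)
    b = random.uniform(-10.0, 10.0)
    if a == b:
        continue                   # the case a = b is trivial
    ratio = abs(math.sin(a) - math.sin(b)) / abs(a - b)
    worst_ratio = max(worst_ratio, ratio)

print(worst_ratio)
```

Up to floating-point rounding, the largest observed ratio should never exceed the Lipschitz constant 1.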
Common confusions
- "The MVT tells you the value of c." No. The theorem says c exists. It does not tell you what c is. For specific functions you can solve f'(c) = \frac{f(b)-f(a)}{b-a} to find c, but for the general statement, existence is all you get.
- "There is exactly one such c." Not necessarily. There can be multiple points where the tangent is parallel to the secant. The theorem guarantees at least one.
- "The MVT works for functions that are continuous everywhere." Continuity alone is not enough: you also need differentiability on the open interval. The function |x| on [-1, 1] is continuous but not differentiable at x = 0. The secant has slope 0, but wherever f' exists it is either +1 or -1 (never 0), so the conclusion fails.
- "The MVT is the same as Rolle's theorem." Rolle's theorem is a special case (when f(a) = f(b)). The MVT is strictly more general. In fact, the MVT is proved using Rolle's theorem, by subtracting the secant line.
Cauchy's Mean Value Theorem
There is a more general version that handles two functions simultaneously. It is called Cauchy's Mean Value Theorem (or the generalised MVT).
Cauchy's Mean Value Theorem
Let f and g be functions such that:
- Both are continuous on [a, b],
- Both are differentiable on (a, b),
- g'(x) \neq 0 for all x \in (a, b).
Then there exists c \in (a, b) such that
\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}
Why condition 3? The condition g'(x) \neq 0 ensures that g(b) \neq g(a) (by Rolle's theorem — if g(a) = g(b), then g' would be zero somewhere). So the right-hand side is well-defined, and the left-hand side does not have a zero denominator.
The proof. Define the auxiliary function
h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(x) - g(a)]
Check: h(a) = 0 and h(b) = f(b) - f(a) - \frac{f(b)-f(a)}{g(b)-g(a)}[g(b) - g(a)] = 0. The function h is continuous on [a, b] and differentiable on (a, b). By Rolle's theorem, h'(c) = 0 for some c \in (a, b).
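Differentiating, h'(x) = f'(x) - \frac{f(b) - f(a)}{g(b) - g(a)}\,g'(x). Setting h'(c) = 0:
f'(c) = \frac{f(b) - f(a)}{g(b) - g(a)}\,g'(c)
Since g'(c) \neq 0, divide both sides by g'(c):
\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}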
Lagrange's MVT is a special case. Set g(x) = x. Then g'(x) = 1, g(b) - g(a) = b - a, and Cauchy's MVT becomes f'(c) = \frac{f(b) - f(a)}{b - a} — Lagrange's MVT.
Why Cauchy's MVT matters. It is the tool behind L'Hôpital's rule. When you have a limit of \frac{f(x)}{g(x)} as x \to a, with both numerator and denominator going to zero, Cauchy's MVT is the theorem that justifies replacing it with the limit of \frac{f'(x)}{g'(x)}, provided that limit exists. Without Cauchy, L'Hôpital's rule would be a heuristic, not a theorem.
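As with Lagrange's form, the point c in Cauchy's theorem can be located numerically for a concrete pair of functions. The sketch below uses the illustrative pair f(x) = x^2 and g(x) = x^3 on [1, 2] (so g'(x) = 3x^2 is nonzero on the interval) and a hypothetical helper `bisect_root`, which assumes the function it is given changes sign on the interval.

```python
def bisect_root(func, lo, hi, tol=1e-12):
    """Plain bisection. Assumes func changes sign on [lo, hi]."""
    v_lo = func(lo)
    if v_lo * func(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        v_mid = func(mid)
        if v_lo * v_mid <= 0:
            hi = mid
        else:
            lo, v_lo = mid, v_mid
    return (lo + hi) / 2


def f(x):
    return x ** 2


def g(x):
    return x ** 3


def f_prime(x):
    return 2 * x


def g_prime(x):
    return 3 * x ** 2


a, b = 1.0, 2.0
ratio = (f(b) - f(a)) / (g(b) - g(a))     # right-hand side of Cauchy's MVT


def h_prime(x):
    """Derivative of the auxiliary function h used in the proof."""
    return f_prime(x) - ratio * g_prime(x)


c = bisect_root(h_prime, a, b)
print(c, f_prime(c) / g_prime(c), ratio)
```

Here the equation can also be solved by hand: 2/(3c) = 3/7 gives c = 14/9, which lies in (1, 2).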
Going deeper
If you came here to understand what the MVT says and how to use it, you have it — you can stop here. What follows is for readers who want to see the theorem's deeper consequences and its place in analysis.
The MVT as a bridge between local and global
The most important role of the MVT in analysis is this: it converts local information (the value of f' at individual points) into global information (the behaviour of f over an entire interval).
Without the MVT, knowing that f'(x) > 0 at every point would not logically imply that f is increasing. The derivative is defined at a single point using a limit; increasing-ness is a statement about pairs of points across an interval. The MVT is the bridge:
f(x_2) - f(x_1) = f'(c)(x_2 - x_1), with c between x_1 and x_2
This equation relates function values (a global property) to the derivative (a local property). Every theorem about monotonicity, about extrema, about concavity — all of them ultimately rest on this bridge.
The MVT for integrals
There is a companion result for integrals. If f is continuous on [a, b], then
\int_a^b f(x)\,dx = f(c)(b - a)
for some c \in (a, b). This says: the area under the curve equals the area of a rectangle whose height is f(c) — a sort of "average height." This is the MVT for integrals, and it is the tool behind many averaging arguments in analysis and probability.
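A short numerical sketch of the "average height" idea follows. The example function f(x) = x^2 on [0, 3], the midpoint-rule integrator and the helper names are all illustrative assumptions. The code approximates the integral, divides by the interval length to get the average value, and then finds a point c where f takes that value.

```python
def midpoint_integral(func, lo, hi, n=10_000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return width * sum(func(lo + (i + 0.5) * width) for i in range(n))


def bisect_root(func, lo, hi, tol=1e-12):
    """Plain bisection. Assumes func changes sign on [lo, hi]."""
    v_lo = func(lo)
    if v_lo * func(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        v_mid = func(mid)
        if v_lo * v_mid <= 0:
            hi = mid
        else:
            lo, v_lo = mid, v_mid
    return (lo + hi) / 2


def f(x):
    return x ** 2


a, b = 0.0, 3.0
average = midpoint_integral(f, a, b) / (b - a)    # average height of f on [a, b]
c = bisect_root(lambda x: f(x) - average, a, b)   # point where f(c) equals it
print(average, c)
```

By hand, the integral is 9, the average height is 3, and c = \sqrt{3} \approx 1.732, so the numerical values should agree with these up to the integration error.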
Where this leads next
The MVT is the engine behind the next set of results in differential calculus.
- Monotonicity — the derivative test for increasing/decreasing functions, proved using the MVT.
- Monotonicity — Applications — using monotonicity to prove inequalities and analyse composite functions.
- Maxima and Minima — First Derivative Test — finding and classifying critical points using the sign of f'.
- Rolle's Theorem — the special case that powers the entire MVT family.
- L'Hôpital's Rule — evaluating \frac{0}{0} and \frac{\infty}{\infty} limits, justified by Cauchy's MVT.