In short

The Mean Value Theorem (Lagrange's form) says: if f is continuous on [a, b] and differentiable on (a, b), then there exists a point c in (a, b) where the instantaneous rate of change equals the average rate of change — that is, f'(c) = \frac{f(b) - f(a)}{b - a}. Cauchy's MVT generalises this to ratios of two functions.

A car drives from Delhi to Agra — a distance of 200 km — in exactly 2 hours. Its average speed over the trip is 200 \div 2 = 100 km/h.

Here is a question: at some point during the trip, was the car's speedometer reading exactly 100 km/h?

Think about it. The car started at some speed, maybe 0 (at a red light), maybe 60. It ended at some speed. Its average was 100. Must it have passed through 100 at some instant?

Yes. And the reason is not complicated. If the car's speed were always less than 100, it could not have covered 200 km in 2 hours: the distance would be less than 100 \times 2 = 200. If the speed were always greater than 100, the distance would be more than 200. So the speed was at most 100 at some moment and at least 100 at another. Since the speed is a continuous function (a speedometer needle cannot jump from one reading to another without sweeping through the values in between), the speedometer must have read exactly 100 at least once.

That is the Mean Value Theorem. At some point during the journey, the instantaneous speed equals the average speed. The theorem says this is true not just for cars, but for any smooth function: the instantaneous rate of change must, at some point, equal the average rate of change.
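The car argument can be tried out numerically. The sketch below assumes a made-up position function (a cubic chosen only so that the car covers 200 km in 2 hours, starting and ending at rest); the names `position`, `speed` and `bisect` are illustrative, not part of any library.

```python
def position(t):
    """Hypothetical distance (km) after t hours; chosen so position(2) == 200."""
    return 150 * t**2 - 50 * t**3


def speed(t):
    """Derivative of position: instantaneous speed in km/h."""
    return 300 * t - 150 * t**2


def bisect(fn, lo, hi, tol=1e-12):
    """Find a root of fn in [lo, hi], assuming fn(lo) and fn(hi) differ in sign."""
    f_lo = fn(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = fn(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid      # root lies in the right half
        else:
            hi = mid                   # root lies in the left half
    return (lo + hi) / 2


average_speed = (position(2) - position(0)) / (2 - 0)   # 200 km / 2 h

# The speed is 0 at t = 0 and 150 at t = 1, so it crosses 100 in between.
t_star = bisect(lambda t: speed(t) - average_speed, 0.0, 1.0)
print(average_speed, t_star, speed(t_star))
```

For this particular profile the speedometer reads 100 km/h twice, once while speeding up and once while slowing down. The theorem promises at least one such moment, not exactly one.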

From Rolle to Lagrange: tilting the picture

You have already met Rolle's theorem: if a smooth curve starts and ends at the same height, there is a horizontal tangent somewhere in between. The Mean Value Theorem is the generalisation to curves that start and end at different heights.

The idea is a beautiful geometric trick. Suppose f(a) \neq f(b). Draw the straight line from (a, f(a)) to (b, f(b)) — this is the secant line. Its slope is the average rate of change:

\text{slope of secant} = \frac{f(b) - f(a)}{b - a}

Now here is the trick: define a new function by subtracting this secant line from f. Call it

g(x) = f(x) - \left[f(a) + \frac{f(b) - f(a)}{b - a}(x - a)\right]

What is g? It is the vertical distance from the curve to the secant line. At x = a: g(a) = f(a) - f(a) = 0. At x = b: g(b) = f(b) - \left[f(a) + \frac{f(b) - f(a)}{b - a}(b - a)\right] = f(b) - f(b) = 0. So g(a) = g(b) = 0.

The function g satisfies all the conditions of Rolle's theorem: it is continuous on [a, b] (because f is, and the secant is a line), differentiable on (a, b), and g(a) = g(b). So there exists a c in (a, b) where g'(c) = 0.

Compute g'(x):

g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}

Setting g'(c) = 0:

f'(c) = \frac{f(b) - f(a)}{b - a}

That is it. The Mean Value Theorem falls out of Rolle's theorem in five lines, just by tilting the picture.
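To see the tilting trick on a concrete curve, here is a minimal sketch that assumes f(x) = x^3 on [0, 2] (an example chosen for this illustration, not taken from the text). It builds g, checks that g vanishes at both endpoints, and then locates the point where g' is zero by bisection.

```python
def f(x):
    return x**3


def f_prime(x):
    return 3 * x**2


a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)   # average rate of change on [a, b]


def g(x):
    """Vertical gap between the curve and the secant line."""
    return f(x) - (f(a) + secant_slope * (x - a))


def g_prime(x):
    return f_prime(x) - secant_slope


# For this f, g'(a) < 0 < g'(b), so bisection on g' locates the Rolle point.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if g_prime(mid) < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(g(a), g(b))                     # zero gap at both endpoints
print(c, f_prime(c), secant_slope)    # tangent slope at c matches the secant
```

Bisection works here only because g' changes sign exactly once on this interval. For a general f you would need a more careful root search.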

The Mean Value Theorem in one picture. The dashed line is the secant from $a$ to $b$. At the point $c$, the tangent (red) is parallel to the secant — their slopes are equal. The theorem guarantees that such a $c$ exists.

The formal statement

Lagrange's Mean Value Theorem

Let f be a function such that:

  1. f is continuous on [a, b],
  2. f is differentiable on (a, b).

Then there exists at least one point c \in (a, b) such that

f'(c) = \frac{f(b) - f(a)}{b - a}

Reading the theorem geometrically. The right-hand side is the slope of the secant line joining (a, f(a)) and (b, f(b)). The left-hand side is the slope of the tangent line at x = c. The theorem says: somewhere between a and b, the tangent is parallel to the secant.

Rolle's theorem is a special case. When f(a) = f(b), the secant is horizontal (slope zero), and the MVT reduces to f'(c) = 0 — exactly Rolle's theorem.

The proof, laid out step by step

Define the auxiliary function

g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)

Verify the conditions of Rolle's theorem for g:

  1. Continuity on [a, b]: f is continuous by hypothesis, and the part subtracted from it, f(a) + \frac{f(b) - f(a)}{b - a}(x - a), is a linear function (continuous everywhere). So g is continuous on [a, b].

  2. Differentiability on (a, b): f is differentiable by hypothesis, and a linear function is differentiable everywhere. So g is differentiable.

  3. Equal endpoint values:

    • g(a) = f(a) - f(a) - \frac{f(b)-f(a)}{b-a}(a - a) = 0
    • g(b) = f(b) - f(a) - \frac{f(b)-f(a)}{b-a}(b - a) = f(b) - f(a) - [f(b) - f(a)] = 0

    So g(a) = g(b) = 0.

By Rolle's theorem, there exists c \in (a, b) with g'(c) = 0.

Compute g'(x):

g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}

Setting g'(c) = 0:

f'(c) = \frac{f(b) - f(a)}{b - a}

which is the conclusion of the MVT. \blacksquare

Geometric interpretation

The picture is the theorem. Draw the curve y = f(x) between x = a and x = b. Draw the secant line connecting the two endpoints. Now imagine sliding a line parallel to that secant, moving it vertically. At some position, this sliding line will be tangent to the curve — it will touch the curve at exactly one point (locally). That touching point is the c guaranteed by the MVT.

For $f(x) = \frac{1}{2}x^2 - 2x + 3$ on $[1, 5]$: the secant (dashed) has slope $\frac{5.5 - 1.5}{5 - 1} = 1$. The tangent at $c = 3$ also has slope $f'(3) = 3 - 2 = 1$. The tangent and secant are parallel.
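The numbers in this example are easy to reproduce. A short sketch, with the function names chosen here for illustration:

```python
def f(x):
    return 0.5 * x**2 - 2 * x + 3


def f_prime(x):
    return x - 2


a, b = 1, 5
secant_slope = (f(b) - f(a)) / (b - a)

# f'(c) = c - 2 is linear, so f'(c) = secant_slope can be solved by hand.
c = secant_slope + 2

print(f(a), f(b), secant_slope, c)
```

Here c lands at the midpoint of [1, 5]. That is a feature of quadratics: for any quadratic, the MVT point is the midpoint of the interval.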

Applications of the MVT

The MVT is not just a theoretical statement — it is the tool behind nearly every result in the qualitative theory of derivatives. Here are the most important applications.

Application 1: A function with zero derivative is constant

If f'(x) = 0 for every x in an interval (a, b), then f is constant on (a, b).

Proof using MVT. Pick any two points x_1 < x_2 in (a, b). By the MVT, there is a c between them with f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}. But f'(c) = 0, so f(x_2) - f(x_1) = 0, so f(x_1) = f(x_2). Since this holds for any two points, f is constant.

This is the foundation of antiderivatives: two functions with the same derivative on an interval differ by a constant there.
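The step from "zero derivative" to "same derivative" is short enough to write out. In the sketch below, F and G stand for any two functions with F' = G' on (a, b), and h and C are labels introduced only for this illustration.

```latex
\begin{aligned}
h(x) &= F(x) - G(x) \\
h'(x) &= F'(x) - G'(x) = 0 \quad \text{for all } x \in (a, b) \\
\Rightarrow\ h(x) &= C \quad \text{(a zero derivative forces a constant)} \\
\Rightarrow\ F(x) &= G(x) + C
\end{aligned}
```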

Application 2: The monotonicity test

If f'(x) > 0 for all x in (a, b), then f is strictly increasing on (a, b).

Proof using MVT. Take x_1 < x_2 in (a, b). By MVT, f(x_2) - f(x_1) = f'(c)(x_2 - x_1) for some c between them. Since f'(c) > 0 and x_2 - x_1 > 0, the product is positive, so f(x_2) > f(x_1). Since x_1 < x_2 were arbitrary, f is strictly increasing.

You will use this result constantly in the Monotonicity article.
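A quick sanity check of the monotonicity test, assuming the sample function f(x) = x^3 + x (chosen here because its derivative 3x^2 + 1 is obviously positive). Checking finitely many points proves nothing by itself; the sketch only shows the test and the conclusion agreeing on a sample.

```python
def f(x):
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1      # always at least 1, so strictly positive


xs = [-2 + 0.25 * k for k in range(17)]        # sample points from -2 to 2
values = [f(x) for x in xs]

derivative_positive = all(f_prime(x) > 0 for x in xs)
strictly_increasing = all(v1 < v2 for v1, v2 in zip(values, values[1:]))
print(derivative_positive, strictly_increasing)
```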

Application 3: Bounding function values

The MVT gives precise bounds on how much a function can change. We know that f(b) - f(a) = f'(c)(b - a) for some c in (a, b). If, in addition, m \le f'(x) \le M for every x in (a, b), then

m(b - a) \le f(b) - f(a) \le M(b - a)

This is a powerful way to estimate function values without computing them exactly.
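As a rough illustration, the inequality can be wrapped in a small helper. The function name `mvt_bounds` and the test case (the natural logarithm on [1, 1.1]) are assumptions made for this sketch.

```python
import math


def mvt_bounds(f_a, m, M, a, b):
    """Bounds on f(b), given f(a) and m <= f'(x) <= M on (a, b)."""
    width = b - a
    return f_a + m * width, f_a + M * width


# f(x) = ln x on [1, 1.1]: f'(x) = 1/x lies between 1/1.1 and 1.
a, b = 1.0, 1.1
lower, upper = mvt_bounds(math.log(a), 1 / b, 1 / a, a, b)
print(lower, math.log(b), upper)
```

The true value of ln 1.1 sits between the two printed bounds, and the bounds needed only the value ln 1 = 0 and a range for the derivative.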

Example 1: Estimate $\sqrt{26}$ using the MVT

Step 1. Take f(x) = \sqrt{x}, with a = 25 and b = 26. You know f(25) = 5. You want f(26) = \sqrt{26}.

Why: \sqrt{25} is known exactly. The MVT will relate \sqrt{26} - \sqrt{25} to the derivative.

Step 2. By the MVT, f(26) - f(25) = f'(c)(26 - 25) = f'(c) for some c \in (25, 26).

Why: the interval has length 1, so the factor (b - a) is just 1.

Step 3. Compute f'(x) = \frac{1}{2\sqrt{x}}. For c \in (25, 26):

\frac{1}{2\sqrt{26}} < f'(c) < \frac{1}{2\sqrt{25}} = \frac{1}{10}

Why: \sqrt{x} is increasing, so \frac{1}{2\sqrt{x}} is decreasing. The largest value of f' on (25, 26) occurs at the left endpoint.

Step 4. So \sqrt{26} = 5 + f'(c), and f'(c) < \frac{1}{10} = 0.1, which gives \sqrt{26} < 5.1. Feeding this upper bound back in, f'(c) > \frac{1}{2\sqrt{26}} > \frac{1}{2 \times 5.1} \approx 0.098. So

5.098 < \sqrt{26} < 5.1

Why: the MVT has turned the problem into a derivative estimate. The actual value is \sqrt{26} \approx 5.0990.

Result: \sqrt{26} lies between 5.098 and 5.1, with the MVT giving the rigorous bounds.
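The bounds can be checked against a calculator value. The sketch below also solves f'(c) = \sqrt{26} - 5 for c, to confirm that the point promised by the MVT really lies in (25, 26); the variable names are illustrative.

```python
import math


def f_prime(x):
    return 1 / (2 * math.sqrt(x))


upper = 5 + f_prime(25)                 # 5 + 1/10
lower = 5 + 1 / (2 * 5.1)               # uses sqrt(26) < 5.1, which the upper bound gives
actual = math.sqrt(26)

# Solve 1 / (2 * sqrt(c)) = sqrt(26) - 5 for c.
c = 1 / (4 * (actual - 5) ** 2)
print(lower, actual, upper, c)
```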

The curve $y = \sqrt{x}$ near $x = 25$. The secant from $(25, 5)$ to $(26, \sqrt{26})$ has slope $\sqrt{26} - 5$. The MVT says this slope equals $f'(c) = \frac{1}{2\sqrt{c}}$ for some $c$ between 25 and 26. Since $f'$ is bounded between $\frac{1}{2\sqrt{26}}$ and $\frac{1}{10}$, the estimate is pinned down.

Example 2: Prove that $|\sin a - \sin b| \le |a - b|$ for all real $a, b$

Step 1. Take f(x) = \sin x. This function is continuous everywhere and differentiable everywhere.

Why: \sin x satisfies the MVT hypotheses on any interval.

Step 2. By the MVT applied to f on [a, b] (assuming a < b):

\sin b - \sin a = \cos(c) \cdot (b - a) \quad \text{for some } c \in (a, b)

Why: f'(x) = \cos x.

Step 3. Take absolute values:

|\sin b - \sin a| = |\cos(c)| \cdot |b - a| \le 1 \cdot |b - a| = |b - a|

Why: |\cos(c)| \le 1 for all c. This is the only property of cosine needed.

Step 4. The case a = b is trivial (both sides are zero). If a > b, swap the labels.

Result: |\sin a - \sin b| \le |a - b| for all real a, b. This inequality says that \sin is Lipschitz continuous with constant 1.
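A numerical spot check of the inequality, under the assumption that random pairs in [-10, 10] are a fair sample. It is no substitute for the proof; it only confirms that the ratio |\sin a - \sin b| / |a - b| never climbs above 1 on the pairs tried.

```python
import math
import random

random.seed(0)                       # fixed seed so the run is repeatable
worst_ratio = 0.0
for _ in range(10_000):
    a = random.uniform(-10, 10)
    b = random.uniform(-10, 10)
    if abs(a - b) < 1e-6:
        continue                     # skip pairs where rounding would dominate
    ratio = abs(math.sin(a) - math.sin(b)) / abs(a - b)
    worst_ratio = max(worst_ratio, ratio)

print(worst_ratio)                   # stays at or below 1, up to rounding
```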

The sine curve cannot change faster than its argument. The slope of the red secant joining any two points on the curve is bounded by $|\cos c| \le 1$, which means the sine curve always lies between the two dashed lines of slope $\pm 1$ through any given point. The MVT makes this precise.

Cauchy's Mean Value Theorem

There is a more general version that handles two functions simultaneously. It is called Cauchy's Mean Value Theorem (or the generalised MVT).

Cauchy's Mean Value Theorem

Let f and g be functions such that:

  1. Both are continuous on [a, b],
  2. Both are differentiable on (a, b),
  3. g'(x) \neq 0 for all x \in (a, b).

Then there exists c \in (a, b) such that

\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}

Why condition 3? The condition g'(x) \neq 0 ensures that g(b) \neq g(a) (by Rolle's theorem — if g(a) = g(b), then g' would be zero somewhere). So the right-hand side is well-defined, and the left-hand side does not have a zero denominator.

The proof. Define the auxiliary function

h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(x) - g(a)]

Check: h(a) = 0 and h(b) = f(b) - f(a) - \frac{f(b)-f(a)}{g(b)-g(a)}[g(b) - g(a)] = 0. The function h is continuous on [a, b] and differentiable on (a, b). By Rolle's theorem, h'(c) = 0 for some c \in (a, b).

h'(x) = f'(x) - \frac{f(b) - f(a)}{g(b) - g(a)} \cdot g'(x)

Setting h'(c) = 0:

f'(c) = \frac{f(b) - f(a)}{g(b) - g(a)} \cdot g'(c)

Since g'(c) \neq 0, divide both sides by g'(c):

\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)} \qquad \blacksquare

Lagrange's MVT is a special case. Set g(x) = x. Then g'(x) = 1, g(b) - g(a) = b - a, and Cauchy's MVT becomes f'(c) = \frac{f(b) - f(a)}{b - a} — Lagrange's MVT.
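Cauchy's theorem can be checked on a concrete pair. The sketch below assumes f(x) = x^3 and g(x) = x^2 on [1, 2] (an example chosen for this illustration) and finds c by bisection on the derivative of the auxiliary function h from the proof.

```python
def f(x):
    return x**3


def g(x):
    return x**2


def f_prime(x):
    return 3 * x**2


def g_prime(x):
    return 2 * x            # nonzero on (1, 2), as condition 3 requires


a, b = 1.0, 2.0
ratio = (f(b) - f(a)) / (g(b) - g(a))        # (8 - 1) / (4 - 1) = 7/3


def h_prime(x):
    """Derivative of the auxiliary function h from the proof."""
    return f_prime(x) - ratio * g_prime(x)


# For this pair, h'(1) < 0 < h'(2), so bisection finds the zero of h'.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if h_prime(mid) < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(c, f_prime(c) / g_prime(c), ratio)
```

By hand, f'(c)/g'(c) = 3c/2, so setting it equal to 7/3 gives c = 14/9, which is what the bisection converges to.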

Why Cauchy's MVT matters. It is the tool behind L'Hôpital's rule. When you have a limit of \frac{f(x)}{g(x)} as x \to a, with both numerator and denominator going to zero, Cauchy's MVT is the theorem that justifies replacing it with the limit of \frac{f'(x)}{g'(x)}, provided that second limit exists. Without Cauchy, L'Hôpital's rule would be a heuristic, not a theorem.
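A small worked instance shows the mechanism. The choice of f(t) = 1 - \cos t and g(t) = t^2 on [0, x] with x > 0 is an assumption made for this illustration, and c_x denotes the point supplied by Cauchy's MVT on that interval.

```latex
\begin{aligned}
\frac{1 - \cos x}{x^2}
  &= \frac{f(x) - f(0)}{g(x) - g(0)}
  && f(t) = 1 - \cos t,\ g(t) = t^2 \\
  &= \frac{f'(c_x)}{g'(c_x)} = \frac{\sin c_x}{2 c_x}
  && \text{for some } c_x \in (0, x) \\
  &\to \frac{1}{2} \quad \text{as } x \to 0^+
  && \text{since } c_x \to 0 \text{ and } \tfrac{\sin t}{t} \to 1
\end{aligned}
```

Because c_x is squeezed between 0 and x, it is forced to 0 along with x. That is how a statement about the ratio of the functions turns into a statement about the ratio of their derivatives.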

Going deeper

If you came here to understand what the MVT says and how to use it, you have it — you can stop here. What follows is for readers who want to see the theorem's deeper consequences and its place in analysis.

The MVT as a bridge between local and global

The most important role of the MVT in analysis is this: it converts local information (the value of f' at individual points) into global information (the behaviour of f over an entire interval).

Without the MVT, knowing that f'(x) > 0 at every point would not logically imply that f is increasing. The derivative is defined at a single point using a limit; increasing-ness is a statement about pairs of points across an interval. The MVT is the bridge:

f(x_2) - f(x_1) = f'(c)(x_2 - x_1)

This equation relates function values (a global property) to the derivative (a local property). Every theorem about monotonicity, about extrema, about concavity — all of them ultimately rest on this bridge.

The MVT for integrals

There is a companion result for integrals. If f is continuous on [a, b], then

\int_a^b f(x)\,dx = f(c)(b - a)

for some c \in (a, b). This says: the area under the curve equals the area of a rectangle whose height is f(c) — a sort of "average height." This is the MVT for integrals, and it is the tool behind many averaging arguments in analysis and probability.
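The integral version can be tried numerically as well. The sketch below assumes f(x) = x^2 on [0, 3] (chosen for illustration), approximates the integral with a midpoint rule, and then searches for the c whose height f(c) matches the average height.

```python
def f(x):
    return x**2


a, b = 0.0, 3.0

# Midpoint-rule approximation of the integral of f over [a, b].
n = 100_000
dx = (b - a) / n
integral = sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx

average_height = integral / (b - a)          # close to 9 / 3 = 3

# f is increasing on [0, 3], so bisection finds c with f(c) = average_height.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) < average_height:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(integral, average_height, c)
```

The exact answer is c = \sqrt{3}, since the integral is 9 and the average height is 3.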

Where this leads next

The MVT is the engine behind the next set of results in differential calculus.