In short
The second derivative f''(x) is the derivative of f'(x) — it measures how the slope is changing. The nth derivative f^{(n)}(x) is obtained by differentiating n times. For products of two functions, the Leibniz theorem gives a binomial-like formula: (fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)}.
Take f(x) = x^3. Its derivative is f'(x) = 3x^2. At x = 1 the slope is 3; at x = 2 the slope is 12; at x = 3 the slope is 27. The slope is getting larger — the curve is getting steeper.
But how fast is the slope getting larger? At x = 1 the slope is 3, at x = 2 the slope is 12, a jump of 9 over one unit. At x = 2 the slope is 12, at x = 3 the slope is 27, a jump of 15. The slope itself is changing at a changing rate.
You can make this precise. The slope is f'(x) = 3x^2, and this is itself a function. Differentiate it:

f''(x) = \frac{d}{dx}\left(3x^2\right) = 6x
At x = 1, the slope is changing at a rate of 6. At x = 2, the slope is changing at a rate of 12. At x = 3, the rate is 18. The function f''(x) = 6x tells you all of this at once. It is called the second derivative of f — the derivative of the derivative.
And you can keep going. Differentiate f''(x) = 6x to get f'''(x) = 6. Differentiate that to get f^{(4)}(x) = 0. After that, every further derivative is 0 — there is nothing left to differentiate.
So for f(x) = x^3, the chain of derivatives is:

f(x) = x^3 \;\longrightarrow\; f'(x) = 3x^2 \;\longrightarrow\; f''(x) = 6x \;\longrightarrow\; f'''(x) = 6 \;\longrightarrow\; f^{(4)}(x) = 0
Each derivative peels off one layer of the polynomial. After three differentiations (the degree of the polynomial), you reach a constant. One more differentiation kills it.
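This peeling-off process is easy to mechanise. Here is a small sketch (the helper name is mine) that represents a polynomial by its coefficient list and differentiates until the zero polynomial is reached:

```python
def diff_poly(coeffs):
    # Differentiate a polynomial given as [a0, a1, a2, ...], where a_k multiplies x^k.
    # The power rule sends a_k * x^k to k * a_k * x^(k-1).
    return [k * c for k, c in enumerate(coeffs)][1:]

f = [0, 0, 0, 1]          # f(x) = x^3
chain = [f]
while chain[-1]:          # stop once the zero polynomial (empty list) appears
    chain.append(diff_poly(chain[-1]))

print(chain)
# [[0, 0, 0, 1], [0, 0, 3], [0, 6], [6], []]
# i.e. x^3 -> 3x^2 -> 6x -> 6 -> 0, matching the chain above
```

Three differentiations (the degree) reach the constant 6; the fourth kills it, exactly as described.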
This raises a natural question: can you always keep differentiating? For polynomials, the chain terminates at zero. But for functions like e^x or \sin x, it never does — they can be differentiated infinitely many times, each time producing another interesting function. The general theory of nth derivatives handles all of this.
Notation
Higher derivatives have several notations, all meaning the same thing.
| Derivative | Lagrange | Leibniz | Short Leibniz |
|---|---|---|---|
| First | f'(x) | \dfrac{d y}{dx} | \dfrac{d}{dx}f |
| Second | f''(x) | \dfrac{d^2 y}{dx^2} | \dfrac{d^2}{dx^2}f |
| Third | f'''(x) | \dfrac{d^3 y}{dx^3} | \dfrac{d^3}{dx^3}f |
| nth | f^{(n)}(x) | \dfrac{d^n y}{dx^n} | \dfrac{d^n}{dx^n}f |
The Leibniz notation d^2y/dx^2 looks like a fraction squared. It is not — the superscripts are part of the symbol, recording that you have applied the operator d/dx twice. Read it as "dee-two-y dee-x-squared."
For the first three derivatives, primes work fine: f', f'', f'''. Beyond that, primes become unreadable (f''''' for the fifth derivative is a forest of tick marks), so the parenthetical notation f^{(n)} takes over. The parentheses are essential: f^{(5)} is the fifth derivative, while f^5 means f raised to the fifth power.
The second derivative and what it measures
The first derivative f'(x) measures the rate of change of f — the slope of the graph at each point.
The second derivative f''(x) measures the rate of change of the slope. Geometrically, it captures concavity: how the curve bends.
- If f''(x) > 0, the slope is increasing. The curve bends upward — like the inside of a bowl. This is called concave up.
- If f''(x) < 0, the slope is decreasing. The curve bends downward — like the top of a hill. This is called concave down.
- If f''(x) = 0 and the concavity changes sign, the point is an inflection point — where the curve switches from bending one way to bending the other.
There is a physical way to think about this. If s(t) is the position of a moving object at time t, then s'(t) is its velocity and s''(t) is its acceleration — the rate at which the velocity is changing. A positive acceleration means the velocity is increasing: the object speeds up if it is moving forward, or slows down if it is moving backward. The second derivative is how physics measures the push.
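The acceleration picture can be checked numerically: a central second difference of the position function approximates s''(t). A minimal sketch (names are mine), using s(t) = t^3, whose exact acceleration is 6t:

```python
def second_diff(s, t, h=1e-4):
    # Central second-difference approximation to the acceleration s''(t).
    return (s(t + h) - 2 * s(t) + s(t - h)) / (h * h)

def s(t):
    return t ** 3          # position at time t; exact acceleration is s''(t) = 6t

print(second_diff(s, 2.0))   # approximately 12, since 6 * 2 = 12
```

The approximation agrees with the exact value 6t to several decimal places for moderate h.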
Here is a concrete illustration. Take f(x) = x^3 - 3x + 2. Its first derivative is f'(x) = 3x^2 - 3 and its second derivative is f''(x) = 6x.
- At x = -1: f''(-1) = -6 < 0. The curve is concave down. And indeed, x = -1 is a local maximum — the curve peaks there. The second derivative test for maxima says: if f'(a) = 0 and f''(a) < 0, then a is a local maximum. Here f'(-1) = 3(1) - 3 = 0 and f''(-1) = -6 < 0, confirming the maximum.
- At x = 1: f''(1) = 6 > 0. The curve is concave up. And x = 1 is a local minimum — the curve dips there. The second derivative test says: if f'(a) = 0 and f''(a) > 0, then a is a local minimum. Here f'(1) = 0 and f''(1) = 6 > 0, confirming the minimum.
- At x = 0: f''(0) = 0. This is the inflection point — the curve switches from concave down (for x < 0) to concave up (for x > 0).
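The three checks above follow a single recipe, the second derivative test. A minimal sketch (the function name is mine) applied to f(x) = x^3 - 3x + 2:

```python
def classify_critical_point(fp, fpp, a, tol=1e-9):
    # Second derivative test: fp and fpp are f' and f'' as callables.
    if abs(fp(a)) > tol:
        return "not a critical point"
    if fpp(a) > 0:
        return "local minimum"
    if fpp(a) < 0:
        return "local maximum"
    return "test inconclusive"

fp  = lambda x: 3 * x ** 2 - 3   # f'(x)  for f(x) = x^3 - 3x + 2
fpp = lambda x: 6 * x            # f''(x)

print(classify_critical_point(fp, fpp, -1))  # local maximum
print(classify_critical_point(fp, fpp,  1))  # local minimum
print(classify_critical_point(fp, fpp,  0))  # not a critical point (f'(0) = -3)
```

Note the "inconclusive" branch: when f''(a) = 0 the test says nothing, which is exactly the x^4 caveat discussed under common confusions below.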
The nth derivative
For some functions, you can write down the nth derivative directly, without differentiating n times one by one.
Polynomials. If f(x) = x^m (where m is a positive integer), then:

f'(x) = mx^{m-1}, \quad f''(x) = m(m-1)x^{m-2}, \quad f'''(x) = m(m-1)(m-2)x^{m-3}, \;\ldots
The pattern is clear. After k differentiations:

f^{(k)}(x) = m(m-1)(m-2)\cdots(m-k+1)\,x^{m-k} = \frac{m!}{(m-k)!}\,x^{m-k}
for k \leq m. Once k > m, the derivative is 0 — you have differentiated away all the terms. In particular, f^{(m)}(x) = m! (a constant) and f^{(m+1)}(x) = 0.
So the fifth derivative of x^5 is 5! = 120, and the sixth derivative of x^5 is 0.
The coefficient \dfrac{m!}{(m-k)!} is called the falling factorial, sometimes written (m)_k or m^{\underline{k}}. It counts the number of ways to arrange k objects chosen from m — a connection to combinatorics that is not accidental, as the Leibniz theorem below will make clear.
Exponentials. If f(x) = e^{ax}, then f'(x) = ae^{ax}, f''(x) = a^2 e^{ax}, and in general:

f^{(n)}(x) = a^n e^{ax}
Each differentiation pulls out one factor of a. The function e^x (where a = 1) is its own derivative at every order — f^{(n)}(x) = e^x for all n. Up to a constant multiple, e^x is the only function equal to its own derivative; this is its defining property.
Sine and cosine. If f(x) = \sin x:

f'(x) = \cos x, \quad f''(x) = -\sin x, \quad f'''(x) = -\cos x, \quad f^{(4)}(x) = \sin x
The fourth derivative brings you back to \sin x — a cycle of period 4. So:

(\sin x)^{(n)} = \sin\!\left(x + \frac{n\pi}{2}\right)
Each differentiation shifts the argument by \pi/2, which is equivalent to cycling through \sin, \cos, -\sin, -\cos.
Similarly, (\cos x)^{(n)} = \cos\!\left(x + \frac{n\pi}{2}\right).
Logarithms. If f(x) = \ln x:

f'(x) = x^{-1}, \quad f''(x) = -x^{-2}, \quad f'''(x) = 2x^{-3}, \quad f^{(4)}(x) = -6x^{-4}, \;\ldots
The pattern: f^{(n)}(x) = (-1)^{n-1}(n-1)!\,x^{-n} for n \geq 1.
Here is a summary table:
| Function f(x) | nth derivative f^{(n)}(x) | Notes |
|---|---|---|
| x^m | \dfrac{m!}{(m-n)!}\,x^{m-n} | Zero for n > m |
| e^{ax} | a^n e^{ax} | Never zero |
| \sin(ax + b) | a^n\sin\!\left(ax + b + \frac{n\pi}{2}\right) | Cycles with period 4 |
| \cos(ax + b) | a^n\cos\!\left(ax + b + \frac{n\pi}{2}\right) | Cycles with period 4 |
| \ln x | (-1)^{n-1}(n-1)!\,x^{-n} | Alternating signs |
| (ax+b)^m | a^n\cdot\dfrac{m!}{(m-n)!}(ax+b)^{m-n} | Zero for n > m if m \in \mathbb{N} |
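Two rows of this table can be spot-checked numerically. A sketch (the helper names are mine): the power rule row via the falling factorial, and the sine row via the phase-shift formula:

```python
from math import sin, cos, pi, factorial, isclose

def power_nth(x, m, n):
    # nth derivative of x^m (m a non-negative integer): m!/(m-n)! * x^(m-n), zero once n > m.
    if n > m:
        return 0.0
    return factorial(m) / factorial(m - n) * x ** (m - n)

def sin_nth(x, n):
    # nth derivative of sin x via the phase shift: sin(x + n*pi/2).
    return sin(x + n * pi / 2)

# Fifth derivative of x^5 is 5! = 120; the sixth is 0.
print(power_nth(2.0, 5, 5))   # 120.0
print(power_nth(2.0, 5, 6))   # 0.0

# The cycle sin -> cos -> -sin -> -cos -> sin:
x = 0.7
assert isclose(sin_nth(x, 1), cos(x))
assert isclose(sin_nth(x, 2), -sin(x))
assert isclose(sin_nth(x, 4), sin(x))
```

The `assert` lines confirm the period-4 cycle stated in the table's notes column.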
These closed-form nth derivatives are the building blocks that make the Leibniz theorem practical — without them, the theorem would give a sum whose individual terms you could not compute.
The Leibniz theorem
You know the product rule: (fg)' = f'g + fg'. What about the second derivative of a product? Differentiate the product rule:

(fg)'' = (f'g + fg')' = f''g + f'g' + f'g' + fg'' = f''g + 2f'g' + fg''
And the third:

(fg)''' = f'''g + 3f''g' + 3f'g'' + fg'''
The coefficients are 1, 1, then 1, 2, 1, then 1, 3, 3, 1. These are the binomial coefficients — exactly the same numbers that appear in Pascal's triangle and in the expansion of (a+b)^n.
To see the third derivative in detail, differentiate (fg)'' = f''g + 2f'g' + fg'' using the product rule on each term:

(fg)''' = (f'''g + f''g') + 2(f''g' + f'g'') + (f'g'' + fg''') = f'''g + 3f''g' + 3f'g'' + fg'''
Coefficients: 1, 3, 3, 1. The pattern continues at every order, and the result is the Leibniz theorem (also called the Leibniz rule for differentiation).
Leibniz theorem
If f and g are n times differentiable, then

(fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)},
where f^{(0)} = f and g^{(0)} = g (the "zeroth derivative" is the function itself), and \binom{n}{k} = \dfrac{n!}{k!(n-k)!}.
Written out, this is:

(fg)^{(n)} = f^{(n)}g + \binom{n}{1} f^{(n-1)}g' + \binom{n}{2} f^{(n-2)}g'' + \cdots + \binom{n}{n-1} f'g^{(n-1)} + fg^{(n)}
Compare this with the binomial theorem:

(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} = a^n + \binom{n}{1} a^{n-1}b + \cdots + b^n
The structures are identical. The binomial theorem expands powers of a sum; the Leibniz theorem expands derivatives of a product. In both, the binomial coefficients \binom{n}{k} do the counting.
Proof by mathematical induction
The proof uses mathematical induction on n.
Base case (n = 1). The Leibniz theorem claims (fg)^{(1)} = \binom{1}{0}f^{(1)}g^{(0)} + \binom{1}{1}f^{(0)}g^{(1)} = f'g + fg'. This is the product rule, which is already known.
Inductive step. Assume the theorem holds for some n = m:

(fg)^{(m)} = \sum_{k=0}^{m} \binom{m}{k} f^{(k)} g^{(m-k)}
Differentiate both sides with respect to x to get (fg)^{(m+1)}. On the right side, each term \binom{m}{k}f^{(k)}g^{(m-k)} is a product, so the product rule gives:

\left(\binom{m}{k} f^{(k)} g^{(m-k)}\right)' = \binom{m}{k} f^{(k+1)} g^{(m-k)} + \binom{m}{k} f^{(k)} g^{(m-k+1)}
Sum over all k from 0 to m:

(fg)^{(m+1)} = \sum_{k=0}^{m} \binom{m}{k} f^{(k+1)} g^{(m-k)} + \sum_{k=0}^{m} \binom{m}{k} f^{(k)} g^{(m-k+1)}
In the first sum, shift the index: let j = k + 1, so k = j - 1 and the sum runs from j = 1 to j = m + 1:

\sum_{j=1}^{m+1} \binom{m}{j-1} f^{(j)} g^{(m+1-j)}
In the second sum, replace k by j (just renaming):

\sum_{j=0}^{m} \binom{m}{j} f^{(j)} g^{(m+1-j)}
Now combine. The j = 0 term appears only in the second sum, giving \binom{m}{0}f^{(0)}g^{(m+1)} = fg^{(m+1)}. The j = m + 1 term appears only in the first sum, giving \binom{m}{m}f^{(m+1)}g^{(0)} = f^{(m+1)}g. For 1 \leq j \leq m, both sums contribute, and the coefficients add:

\binom{m}{j-1} + \binom{m}{j} = \binom{m+1}{j}
This is Pascal's identity — the defining recurrence of the binomial coefficients. So the combined sum is:

(fg)^{(m+1)} = \sum_{j=0}^{m+1} \binom{m+1}{j} f^{(j)} g^{(m+1-j)},
which is exactly the Leibniz theorem for n = m + 1.
By induction, the theorem holds for all positive integers n. \square
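The finished theorem can also be spot-checked numerically. A sketch (helper name is mine) with f = \sin x and g = e^x, whose nth derivatives are both known in closed form; for comparison I use the independent closed form (e^x \sin x)^{(n)} = 2^{n/2} e^x \sin(x + n\pi/4), which follows by applying the product rule repeatedly (each differentiation multiplies by \sqrt{2} and shifts the phase by \pi/4):

```python
from math import sin, exp, comb, pi, isclose

def leibniz_nth(x, n):
    # Leibniz sum for (f g)^(n) with f = sin and g = exp:
    # f^(k)(x) = sin(x + k*pi/2)  and  g^(k)(x) = e^x for every k.
    return sum(comb(n, k) * sin(x + k * pi / 2) * exp(x) for k in range(n + 1))

x = 0.3
for n in range(8):
    # Compare against the closed form 2^(n/2) * e^x * sin(x + n*pi/4).
    assert isclose(leibniz_nth(x, n), 2 ** (n / 2) * exp(x) * sin(x + n * pi / 4))
print("Leibniz sum matches the closed form for n = 0..7")
```

Two entirely different routes to the same numbers: the binomial-coefficient sum on one side, a direct phase-and-amplitude formula on the other.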
Worked examples
Example 1: $n$th derivative of $x^2 e^x$
Find the nth derivative of f(x) = x^2 e^x.
Step 1. Identify the two factors for the Leibniz theorem.
Let u = x^2 and v = e^x.
Why: choose u to be the polynomial, because its higher derivatives eventually become zero, collapsing the sum.
Step 2. Compute the derivatives of u and v.

u = x^2, \quad u' = 2x, \quad u'' = 2, \quad u^{(k)} = 0 \text{ for } k \geq 3
v = e^x, \quad v^{(k)} = e^x \text{ for every } k
Why: x^2 is a degree-2 polynomial, so its third and higher derivatives vanish. And e^x is its own derivative at every order.
Step 3. Apply the Leibniz theorem. Since u^{(k)} = 0 for k \geq 3, only three terms survive:

(x^2 e^x)^{(n)} = \binom{n}{0} x^2\, e^x + \binom{n}{1} (2x)\, e^x + \binom{n}{2} (2)\, e^x = e^x\!\left[x^2 + 2nx + n(n-1)\right]
Why: each v^{(k)} = e^x factors out, leaving a polynomial in x and n.
Step 4. Verify for small n. For n = 1: (x^2 e^x)' = e^x(x^2 + 2x). Check: by the product rule, (x^2 e^x)' = 2xe^x + x^2 e^x = e^x(x^2 + 2x). Matches. For n = 2: the formula gives e^x(x^2 + 4x + 2). Check: differentiate e^x(x^2 + 2x) to get e^x(x^2 + 2x) + e^x(2x + 2) = e^x(x^2 + 4x + 2). Matches.
Why: always verify a general formula against known cases — it catches sign errors and miscounted binomial coefficients.
Result: (x^2 e^x)^{(n)} = e^x\!\left[x^2 + 2nx + n(n-1)\right].
The Leibniz theorem turned an n-step computation into a three-term sum. The trick is that one of the two factors (x^2) has only finitely many non-zero derivatives, so the infinite-looking sum collapses.
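Beyond checking small n by hand, there is a systematic check: if the claimed formula really is the nth derivative, then differentiating it once must reproduce the formula with n replaced by n + 1. A numerical sketch of that recurrence (helper names are mine):

```python
from math import exp, isclose

def g(x, n):
    # Claimed nth derivative of x^2 e^x from the worked example.
    return exp(x) * (x * x + 2 * n * x + n * (n - 1))

def deriv(f, x, h=1e-6):
    # Central first difference approximating f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.8
for n in range(6):
    # Differentiating the nth formula must give the (n+1)th formula.
    assert isclose(deriv(lambda t: g(t, n), x), g(x, n + 1), rel_tol=1e-6)
print("recurrence holds for n = 0..5")
```

At n = 0 this also re-verifies the base case, since g(x, 0) = x^2 e^x is the original function.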
Example 2: $n$th derivative of $x^3 \sin x$
Find the nth derivative of f(x) = x^3 \sin x.
Step 1. Let u = x^3 and v = \sin x.
Why: again, u is the polynomial (which will terminate), and v has a clean nth derivative formula.
Step 2. Compute derivatives.

u = x^3, \quad u' = 3x^2, \quad u'' = 6x, \quad u''' = 6, \quad u^{(k)} = 0 \text{ for } k \geq 4
v^{(k)} = \sin\!\left(x + \frac{k\pi}{2}\right)
Why: the nth derivative of \sin x is \sin(x + n\pi/2), as established earlier.
Step 3. Apply the Leibniz theorem. Four terms survive (since u^{(k)} = 0 for k \geq 4):

(x^3 \sin x)^{(n)} = \binom{n}{0} x^3 \sin\!\left(x + \frac{n\pi}{2}\right) + \binom{n}{1} (3x^2) \sin\!\left(x + \frac{(n-1)\pi}{2}\right) + \binom{n}{2} (6x) \sin\!\left(x + \frac{(n-2)\pi}{2}\right) + \binom{n}{3} (6) \sin\!\left(x + \frac{(n-3)\pi}{2}\right)
Step 4. Simplify the binomial coefficients.
Why: \binom{n}{1} \cdot 3 = 3n, \binom{n}{2} \cdot 6 = 3n(n-1), \binom{n}{3} \cdot 6 = n(n-1)(n-2).
Step 5. Verify for n = 1. The formula gives:

(x^3 \sin x)' = x^3 \sin\!\left(x + \frac{\pi}{2}\right) + 3x^2 \sin x = x^3 \cos x + 3x^2 \sin x
Check by direct differentiation: (x^3\sin x)' = 3x^2\sin x + x^3\cos x. Matches.
Result: (x^3\sin x)^{(n)} = x^3\sin\!\left(x + \frac{n\pi}{2}\right) + 3nx^2\sin\!\left(x + \frac{(n-1)\pi}{2}\right) + 3n(n-1)x\sin\!\left(x + \frac{(n-2)\pi}{2}\right) + n(n-1)(n-2)\sin\!\left(x + \frac{(n-3)\pi}{2}\right).
The formula has four terms because x^3 has derivatives up to order 3. A degree-m polynomial multiplied by any function whose nth derivative has a closed form will always produce a Leibniz sum with exactly m + 1 terms.
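The same recurrence check as in Example 1 applies here: the derivative of the nth formula must equal the (n+1)th formula. A numerical sketch (helper names are mine):

```python
from math import sin, pi, isclose

def F(x, n):
    # Claimed nth derivative of x^3 sin x (the four surviving Leibniz terms).
    return (x ** 3 * sin(x + n * pi / 2)
            + 3 * n * x ** 2 * sin(x + (n - 1) * pi / 2)
            + 3 * n * (n - 1) * x * sin(x + (n - 2) * pi / 2)
            + n * (n - 1) * (n - 2) * sin(x + (n - 3) * pi / 2))

def deriv(f, x, h=1e-6):
    # Central first difference approximating f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.1
for n in range(6):
    # Differentiating the nth formula must reproduce the (n+1)th formula.
    assert isclose(deriv(lambda t: F(t, n), x), F(x, n + 1), rel_tol=1e-6)
print("recurrence holds for n = 0..5")
```

Note that F(x, 0) collapses to x^3 \sin x, because the last three coefficients (3n, 3n(n-1), n(n-1)(n-2)) all vanish at n = 0.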
Common confusions
- "The second derivative is just the first derivative squared." No. f''(x) means differentiate f'(x), not square it. For f(x) = x^3, f'(x) = 3x^2 and f''(x) = 6x. But (f'(x))^2 = 9x^4, which is entirely different.
- "d^2y/dx^2 = (dy/dx)^2." Same confusion in Leibniz notation. The superscript 2 in d^2y/dx^2 records that d/dx has been applied twice, not that anything is being squared.
- "The nth derivative of a product is f^{(n)} \cdot g^{(n)}." This is wrong. The product rule at first order already shows that (fg)' \neq f'g'. The correct formula is the Leibniz theorem, which involves a sum of n+1 terms with binomial coefficients.
- "If f''(x) = 0 at a point, it must be an inflection point." Not necessarily. The condition f''(x) = 0 is necessary but not sufficient for an inflection point. You also need the concavity to change — f'' must switch sign. For f(x) = x^4, f''(0) = 0 but f''(x) = 12x^2 \geq 0 everywhere, so there is no change in concavity and x = 0 is not an inflection point.
- "The Leibniz theorem is different from the binomial theorem." They are structurally the same. The binomial theorem says (a + b)^n = \sum \binom{n}{k} a^k b^{n-k}. The Leibniz theorem says (fg)^{(n)} = \sum \binom{n}{k} f^{(k)} g^{(n-k)}: the binomial coefficients are identical, with the order of differentiation playing the role of the exponent. The analogy is exact.
Going deeper
If you came here to learn about second derivatives, nth derivative formulas, and the Leibniz theorem, you have that — you can stop here. The rest explores what higher derivatives mean conceptually, how they connect to Taylor series, and some edge cases.
What the nth derivative measures
Each order of derivative captures a new layer of information about a function's behaviour near a point.
- f(a): the value — where the function is.
- f'(a): the slope — which direction the function is heading.
- f''(a): the concavity — whether the function is curving up or down.
- f'''(a): the jerk (in physics) — the rate of change of acceleration. This controls how abruptly the curvature changes.
- f^{(n)}(a): the nth-order refinement — each derivative adds one more piece of local information.
This is the idea behind the Taylor series. If you know f(a), f'(a), f''(a), ..., f^{(n)}(a), you can build a polynomial that approximates f near x = a:

f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n
The more derivatives you include, the better the approximation. In this sense, higher derivatives are the raw material for the most powerful approximation tool in mathematics.
A concrete Taylor approximation
To see higher derivatives in action, approximate e^x near x = 0 using its first few derivatives. Since every derivative of e^x is e^x, and e^0 = 1, the Taylor polynomial of degree n at a = 0 is:

P_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}
How good is this? At x = 1, the exact value is e \approx 2.71828. The approximations:
| n | Polynomial value at x=1 | Error |
|---|---|---|
| 1 | 1 + 1 = 2 | 0.718 |
| 2 | 1 + 1 + 0.5 = 2.5 | 0.218 |
| 3 | 2.5 + 0.1\overline{6} = 2.6\overline{6} | 0.052 |
| 4 | 2.6\overline{6} + 0.041\overline{6} = 2.708\overline{3} | 0.010 |
| 5 | 2.708\overline{3} + 0.008\overline{3} = 2.71\overline{6} | 0.002 |
Each additional term shrinks the error sharply — at x = 1 the error is dominated by the first neglected term, 1/(n+1)!, so it falls by a factor of roughly n + 2 at each step. The nth derivative controls the nth term of the approximation, and each term makes the polynomial fit the curve more tightly.
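The table above can be reproduced in a few lines (a sketch; the function name is mine):

```python
from math import factorial, e

def taylor_exp(x, n):
    # Degree-n Taylor polynomial of e^x at a = 0; every coefficient is 1/k!.
    return sum(x ** k / factorial(k) for k in range(n + 1))

for n in range(1, 6):
    approx = taylor_exp(1.0, n)
    print(n, approx, e - approx)   # matches the table: errors 0.718, 0.218, 0.052, ...
```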
Functions that are infinitely differentiable
Most functions you meet in school — polynomials, e^x, \sin x, \cos x, rational functions away from their poles — can be differentiated any number of times. Such functions are called smooth (or C^\infty).
But there are functions that are differentiable once or twice but not further. The function f(x) = x^{5/2} for x \geq 0 is twice differentiable everywhere, but its third derivative does not exist at x = 0 (the computation involves x^{-1/2}, which blows up). How many times you can differentiate a function at a point is a measure of how "regular" it is there.
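The breakdown for x^{5/2} is easy to see numerically: f''(x) = \tfrac{15}{4}\sqrt{x} is finite at 0, but f'''(x) = \tfrac{15}{8}x^{-1/2} grows without bound as x \to 0^+. A sketch:

```python
def fpp(x):
    # Second derivative of x**2.5 for x >= 0: (5/2)*(3/2)*x**0.5 = 3.75*sqrt(x).
    return 3.75 * x ** 0.5

def fppp(x):
    # Third derivative of x**2.5 for x > 0: (5/2)*(3/2)*(1/2)*x**(-0.5) = 1.875/sqrt(x).
    return 1.875 * x ** -0.5

print(fpp(0.0))   # 0.0: the second derivative exists right up to x = 0
for x in (1e-2, 1e-4, 1e-6):
    print(x, fppp(x))   # 18.75, 187.5, 1875.0: the third derivative blows up as x -> 0+
```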
The operator point of view
Define the operator D by Df = f'. Then D^2 f = f'', D^3 f = f''', and so on. The Leibniz theorem in this notation becomes:

D^n(fg) = \sum_{k=0}^{n} \binom{n}{k} (D^k f)(D^{n-k} g)
Compare this with the algebraic identity (a + b)^n = \sum \binom{n}{k} a^k b^{n-k}. The formal similarity is so strong that mathematicians sometimes write D^n(fg) = (D_f + D_g)^n(fg), where D_f acts only on f and D_g acts only on g. This "operator binomial theorem" is not just a mnemonic — it is a rigorous identity in the algebra of differential operators, and it is the starting point for much of the theory of partial differential equations.
Leibniz theorem for three or more functions
The Leibniz theorem extends to products of three or more functions, using the multinomial coefficient instead of the binomial coefficient:

(fgh)^{(n)} = \sum_{i+j+k=n} \frac{n!}{i!\,j!\,k!}\, f^{(i)} g^{(j)} h^{(k)}
This is less commonly needed in practice, but it arises in certain JEE problems. The structure is the same — it mirrors the multinomial theorem (a + b + c)^n = \sum \frac{n!}{i!j!k!} a^i b^j c^k.
Where this leads next
Higher derivatives are foundational to many topics in calculus and analysis. Here are the most direct continuations:
- Parametric Differentiation — the second derivative in parametric form, which uses the machinery you have just learned.
- Differentiation of Functions w.r.t. Functions — differentiating one function with respect to another, a chain rule reinterpretation.
- Rules of Differentiation — the product, quotient, and sum rules that the Leibniz theorem generalises.
- Mathematical Induction — the proof technique used to establish the Leibniz theorem.
- Differentiability — when and why derivatives exist, and what happens at corners and cusps.