In short
The derivative \frac{dy}{dx} is literally a rate — how fast y changes per unit change in x. When several quantities are linked by an equation and all change with time, you differentiate the whole equation with respect to t and the chain rule connects their rates. This is the setup behind related-rate problems, marginal analysis, and every physics equation that has a "d/dt" in it.
A spherical balloon is being inflated at a constant rate of 100 cm^3 per second. When the balloon's radius is 10 cm, how fast is the radius growing?
You know the volume: V = \frac{4}{3}\pi r^3. You know \frac{dV}{dt} = 100 cm^3/s. You want \frac{dr}{dt} when r = 10. The volume and the radius are not independent — they are connected by a formula — and both are changing with time. The derivative is the tool that links their rates.
This is a related rates problem: two (or more) quantities are related by an equation, both change with time, and you know the rate of one and want the rate of the other. The entire strategy is: differentiate the connecting equation with respect to time, then solve for the unknown rate.
The derivative is a rate
Before plunging into problems, it is worth pausing on what the derivative means when the independent variable is time.
If s(t) is the position of a car at time t, then \frac{ds}{dt} is the velocity — how fast position changes per unit time. If v(t) is the velocity, then \frac{dv}{dt} is the acceleration — how fast velocity itself changes per unit time.
If T(t) is the temperature of a cup of tea at time t, then \frac{dT}{dt} is the rate of cooling — how many degrees the tea loses per second. If P(t) is the population of a city, \frac{dP}{dt} is the population growth rate.
In each case, the derivative gives the instantaneous rate: not the average over some period, but the rate right now. The units of the derivative are always units of the numerator per unit of the denominator — metres per second, degrees per second, people per year.
The sign matters. A positive \frac{dT}{dt} means the temperature is rising; a negative one means it is falling. A positive \frac{ds}{dt} means the object moves in the positive direction; negative means it moves in the negative direction. The derivative encodes both the magnitude and the direction of change.
Related rates: the core technique
Here is the complete method, written out once so you can apply it mechanically.
Step 1. Draw a picture and label every quantity that changes with time. Assign a variable to each.
Step 2. Write down the equation that connects the variables. This is usually a formula from geometry or physics.
Step 3. Differentiate both sides with respect to time t. Use the chain rule wherever a variable depends on t.
Step 4. Plug in all known values — the specific instant in time, the known rates, the known quantities at that instant.
Step 5. Solve for the unknown rate.
The balloon problem at the start of the article is a perfect first application.
The balloon, solved
V = \frac{4}{3}\pi r^3. Differentiate both sides with respect to t:
\frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt}
The chain rule produced \frac{dr}{dt} automatically — the radius is a function of time, so differentiating r^3 with respect to t gives 3r^2 \cdot \frac{dr}{dt}.
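Plug in \frac{dV}{dt} = 100 and r = 10:
100 = 4\pi (10)^2 \frac{dr}{dt} = 400\pi \frac{dr}{dt}, so \frac{dr}{dt} = \frac{100}{400\pi} = \frac{1}{4\pi} cm/s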
When the radius is 10 cm, the radius grows at \frac{1}{4\pi} cm per second — about 0.08 cm/s. That is surprisingly slow, considering the volume is increasing at 100 cm^3/s. The reason: at r = 10, the surface area is 4\pi(100) = 400\pi \approx 1257 cm^2. The 100 cm^3 of air spreads over this large surface, so the radius barely budges.
Notice something: if you asked the same question when r = 1, you would get \frac{dr}{dt} = \frac{100}{4\pi} \approx 7.96 cm/s — a hundred times faster. A small balloon inflates rapidly; a large balloon inflates slowly. The derivative quantifies this precisely.
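The comparison between the two radii can be checked numerically. The sketch below is only an illustration: the function name `radius_rate` is an assumed name, and the function simply rearranges \frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt} for \frac{dr}{dt}.

```python
import math

def radius_rate(dV_dt, r):
    """Return dr/dt for a sphere, rearranged from dV/dt = 4*pi*r^2 * dr/dt."""
    return dV_dt / (4 * math.pi * r ** 2)

# Inflation rate from the problem: 100 cm^3/s
for r in (10, 1):
    print(f"r = {r} cm: dr/dt = {radius_rate(100, r):.4f} cm/s")
```

Because the rate scales as 1/r^2, shrinking the radius by a factor of 10 raises \frac{dr}{dt} by a factor of 100.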
The chain rule in disguise
Every related-rates problem is a chain rule problem in disguise. The chain rule says
\frac{dV}{dt} = \frac{dV}{dr} \cdot \frac{dr}{dt}
and this is exactly what you computed for the balloon: \frac{dV}{dr} = 4\pi r^2 (how volume responds to radius) times \frac{dr}{dt} (how radius responds to time) gives \frac{dV}{dt} (how volume responds to time). The chain rule chains together the two sensitivities.
When there are three linked quantities (say, the angle, the radius, and the area of a sector), you may need two chain-rule steps:
\frac{dA}{dt} = \frac{dA}{dr} \cdot \frac{dr}{d\theta} \cdot \frac{d\theta}{dt}
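or, if A depends on both r and \theta directly:
\frac{dA}{dt} = \frac{\partial A}{\partial r} \cdot \frac{dr}{dt} + \frac{\partial A}{\partial \theta} \cdot \frac{d\theta}{dt}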
The multi-variable case requires partial derivatives, which you will meet later. For now, the single-chain version handles all standard problems.
Worked examples
Example 1: The sliding ladder
A 5-metre ladder leans against a vertical wall. The foot of the ladder slides away from the wall at 0.5 m/s. When the foot is 3 m from the wall, how fast is the top of the ladder sliding down the wall?
Step 1. Let x be the distance from the foot of the ladder to the wall, and y be the height of the top of the ladder above the ground. Both change with time.
Why: the ladder has constant length, but as the foot moves out, the top moves down. Two changing quantities, linked by the ladder's length.
Step 2. By Pythagoras: x^2 + y^2 = 25 (the ladder is 5 m long).
Step 3. Differentiate with respect to t:
2x\frac{dx}{dt} + 2y\frac{dy}{dt} = 0
Why: the right side is zero because 25 is a constant — the ladder's length does not change. The chain rule turns each variable into its own rate.
Step 4. When x = 3: y = \sqrt{25 - 9} = 4 m. Given \frac{dx}{dt} = 0.5 m/s.
2(3)(0.5) + 2(4)\frac{dy}{dt} = 0, so \frac{dy}{dt} = -\frac{3}{8} m/s
Why: the negative sign means y is decreasing — the top of the ladder is sliding down, which is exactly what you expect.
Result: The top slides down at \frac{3}{8} m/s when the foot is 3 m from the wall.
Something worth noticing: as x approaches 5 (the ladder almost flat on the ground), y approaches 0, and \frac{dy}{dt} = -\frac{x}{y} \cdot \frac{dx}{dt} blows up: the downward speed of the top of the ladder grows without limit. In reality, the ladder would separate from the wall before that happens.
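The ladder result can be cross-checked without any calculus by nudging the foot slightly and measuring how the height responds. This is a minimal sketch, not part of the original solution: the function names and the step size `dt` are illustrative assumptions.

```python
import math

LADDER = 5.0   # ladder length in metres
DX_DT = 0.5    # speed of the foot in m/s

def height(x):
    """Height of the top of the ladder when the foot is x metres from the wall."""
    return math.sqrt(LADDER ** 2 - x ** 2)

def dy_dt_formula(x):
    """Rate from differentiating x^2 + y^2 = 25: dy/dt = -(x/y) * dx/dt."""
    return -(x / height(x)) * DX_DT

def dy_dt_numeric(x, dt=1e-6):
    """Central-difference estimate of dy/dt as the foot passes position x."""
    ahead = height(x + DX_DT * dt)
    behind = height(x - DX_DT * dt)
    return (ahead - behind) / (2 * dt)

print(dy_dt_formula(3.0))   # -0.375, i.e. -3/8 m/s
print(dy_dt_numeric(3.0))   # numerical estimate, close to -0.375
```

Evaluating `dy_dt_formula` at values of x closer to 5 shows the magnitude growing rapidly, which is the blow-up described above.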
Example 2: Shadow of a walking person
A 1.8 m tall person walks away from a 6 m tall lamppost at 1.2 m/s. How fast is the tip of their shadow moving when they are 4 m from the post?
Step 1. Let x be the distance of the person from the post, and s be the length of the shadow. The tip of the shadow is at distance x + s from the post.
Why: the shadow extends beyond the person. Its tip is farther from the post than the person is.
Step 2. By similar triangles (the lamppost, the person, and the shadow tip form two similar triangles):
\frac{6}{x + s} = \frac{1.8}{s}
Cross-multiply: 6s = 1.8(x + s) = 1.8x + 1.8s, so 4.2s = 1.8x, giving
s = \frac{1.8}{4.2}x = \frac{3}{7}x
Step 3. The tip of the shadow is at distance d = x + s = x + \frac{3}{7}x = \frac{10}{7}x. Differentiate:
\frac{dd}{dt} = \frac{10}{7} \cdot \frac{dx}{dt}
Why: since d is a constant multiple of x, the rate of d is the same constant multiple of the rate of x. No chain rule complexity here — the relationship is linear.
Step 4. Plug in \frac{dx}{dt} = 1.2 m/s:
\frac{dd}{dt} = \frac{10}{7}(1.2) = \frac{12}{7} \approx 1.71 m/s
Why: this is the speed of the shadow tip, not the rate of growth of the shadow itself. The shadow tip moves faster than the person because the shadow lengthens as the person walks away.
Result: The tip of the shadow moves at \frac{12}{7} \approx 1.71 m/s — and this is constant, independent of x. The shadow tip always moves at the same speed, no matter how far the person is from the post.
The surprise is that the shadow-tip speed is constant. This happens because the height ratio is fixed, making d a fixed multiple of x. If the ground were curved, or the lamppost were not vertical, the algebra would be different.
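The constancy of the shadow-tip speed can be seen numerically by estimating the speed at several distances. The sketch below assumes the same flat ground and vertical lamppost as the example; the function names and sample distances are illustrative choices.

```python
LAMP_HEIGHT = 6.0     # metres
PERSON_HEIGHT = 1.8   # metres
WALK_SPEED = 1.2      # metres per second

def tip_distance(x):
    """Distance of the shadow tip from the post when the person is x metres away."""
    # Similar triangles give s = 1.8x / (6 - 1.8) = (3/7) x
    shadow = PERSON_HEIGHT * x / (LAMP_HEIGHT - PERSON_HEIGHT)
    return x + shadow

def tip_speed(x, dt=1e-6):
    """Central-difference estimate of the shadow-tip speed at position x."""
    step = WALK_SPEED * dt
    return (tip_distance(x + step) - tip_distance(x - step)) / (2 * dt)

for x in (1.0, 4.0, 10.0):
    print(f"x = {x} m: tip speed ~ {tip_speed(x):.4f} m/s")
```

Each distance gives approximately the same value, 12/7 m/s, because `tip_distance` is a linear function of x.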
Applications in physics
The derivative appears everywhere in physics under different names.
Velocity and acceleration. If s(t) is displacement, v(t) = \frac{ds}{dt} is velocity and a(t) = \frac{dv}{dt} = \frac{d^2s}{dt^2} is acceleration. Newton's second law F = ma can be written F = m\frac{d^2s}{dt^2} — force is mass times the second derivative of position.
Take a concrete case. A stone is thrown vertically upward with initial velocity 20 m/s. Its height at time t is h(t) = 20t - 4.9t^2 (ignoring air resistance). The velocity is h'(t) = 20 - 9.8t. At t = 0, the velocity is 20 m/s (upward). At t = 20/9.8 \approx 2.04 s, the velocity is zero — the stone has reached its peak. After that, h'(t) < 0 — the stone falls back down. The acceleration is h''(t) = -9.8 m/s^2 throughout, constant and downward — exactly g.
The sign of h'(t) tells you the direction of motion. The sign of h''(t) tells you the direction of the force. These are not the same. A stone at the top of its arc has h' = 0 (momentarily stationary) but h'' = -9.8 (gravity still pulling it down). The velocity is zero but the acceleration is not — which is why the stone does not stay at the top.
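The stone example can be tabulated directly from the formulas above. This is a small sketch; the sample times are arbitrary choices for illustration, and the peak height printed is simply h evaluated at the time when the velocity is zero.

```python
def height(t):
    """Height in metres: h(t) = 20t - 4.9t^2."""
    return 20 * t - 4.9 * t ** 2

def velocity(t):
    """h'(t) = 20 - 9.8t, in m/s."""
    return 20 - 9.8 * t

ACCELERATION = -9.8  # h''(t), constant, in m/s^2

t_peak = 20 / 9.8  # time at which the velocity is zero
print(f"peak at t = {t_peak:.2f} s, height = {height(t_peak):.2f} m")

for t in (0.0, 1.0, 3.0):
    direction = "up" if velocity(t) > 0 else "down"
    print(f"t = {t:.1f} s: v = {velocity(t):.1f} m/s ({direction}), a = {ACCELERATION} m/s^2")
```

The velocity changes sign at the peak while the acceleration keeps the same negative value throughout.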
Current. If Q(t) is the charge that has flowed past a point in a circuit, then I = \frac{dQ}{dt} is the current — the rate of charge flow. A current of 5 amperes means 5 coulombs of charge flow past the point every second.
Power. If W(t) is the work done up to time t, then P = \frac{dW}{dt} is the power — the rate at which energy is delivered.
Density as a derivative. If m(x) is the mass of a non-uniform rod from position 0 to position x, then \rho(x) = \frac{dm}{dx} is the linear density — the mass per unit length at position x. A rod whose mass function is m(x) = x^2 (grams, with x in cm) has density \rho(x) = 2x g/cm — denser at the far end than at the near end. The derivative reveals how mass is distributed.
In each case, the physical quantity is the derivative. Velocity is not "related to" the derivative — it is the derivative of displacement with respect to time. Learning to read \frac{d(\text{something})}{dt} as "the rate at which (something) changes" is the single most important skill this article teaches.
Applications in geometry
Many geometric quantities — areas, volumes, lengths — depend on a parameter that changes with time, and the derivative tells you how the geometric quantity evolves.
Expanding circle. If a circle's radius grows at rate \frac{dr}{dt}, its area A = \pi r^2 grows at
\frac{dA}{dt} = 2\pi r \frac{dr}{dt}
This is \frac{dA}{dt} = (\text{circumference}) \times \frac{dr}{dt} — which makes geometric sense. A thin ring of width dr at radius r has area approximately 2\pi r \cdot dr, and the rate of area growth is that ring area per unit time.
Expanding sphere. Similarly, V = \frac{4}{3}\pi r^3 gives \frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt} = (\text{surface area}) \times \frac{dr}{dt}. The rate of volume growth equals the surface area times the rate of radius growth — a thin shell of thickness dr at radius r has volume approximately 4\pi r^2 \cdot dr.
These are not coincidences. They reflect the fact that the derivative of a "bulk" quantity with respect to its "boundary" parameter is the "boundary" quantity. Area is the integral of circumference; volume is the integral of surface area. The derivative reverses this.
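The bulk-to-boundary relationship can be verified numerically. The sketch below uses a generic central-difference helper (the name `derivative` and the sample radius are assumptions for illustration) and compares the result with the circumference and surface-area formulas.

```python
import math

def derivative(f, r, dr=1e-6):
    """Central-difference estimate of f'(r)."""
    return (f(r + dr) - f(r - dr)) / (2 * dr)

def area(r):
    """Area of a circle of radius r."""
    return math.pi * r ** 2

def volume(r):
    """Volume of a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3

r = 2.5  # arbitrary sample radius
print(derivative(area, r), 2 * math.pi * r)          # dA/dr vs circumference
print(derivative(volume, r), 4 * math.pi * r ** 2)   # dV/dr vs surface area
```

In each printed pair the two numbers agree to several decimal places.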
A cone being filled with water. Water is poured into an inverted cone (apex at the bottom) at a rate of 12 cm^3/s. The cone has a half-angle of 30°, so the radius of the water surface at height h is r = h/\sqrt{3}. What is the rate of change of the water level when h = 6 cm?
The volume of the water is V = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi \cdot \frac{h^2}{3} \cdot h = \frac{\pi h^3}{9}.
Differentiate with respect to t: \frac{dV}{dt} = \frac{\pi}{3} h^2 \frac{dh}{dt}.
With \frac{dV}{dt} = 12 and h = 6: 12 = \frac{\pi}{3}(36)\frac{dh}{dt} = 12\pi \frac{dh}{dt}, so \frac{dh}{dt} = \frac{1}{\pi} \approx 0.318 cm/s.
The water level rises at about 0.32 cm/s when the water is 6 cm deep. At h = 1 cm, the same calculation gives \frac{dh}{dt} = \frac{12}{\pi/3} = \frac{36}{\pi} \approx 11.5 cm/s — much faster, because the cone is narrow at the bottom and the same volume of water raises the level much more. This is the same phenomenon as the balloon: the rate of a linear measurement depends on the current size of the object.
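The two water-level rates can be reproduced with a few lines of Python. This is a sketch specific to the cone in this example (half-angle 30°, so V = \frac{\pi h^3}{9}); the function name is an illustrative assumption.

```python
import math

def water_level_rate(dV_dt, h):
    """dh/dt for the 30-degree half-angle cone, from dV/dt = (pi/3) * h^2 * dh/dt."""
    return dV_dt / ((math.pi / 3) * h ** 2)

# Fill rate from the problem: 12 cm^3/s
for h in (1, 6):
    print(f"h = {h} cm: dh/dt = {water_level_rate(12, h):.3f} cm/s")
```

As with the balloon, the rate falls off as 1/h^2, so the level at h = 1 rises 36 times faster than at h = 6.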
Marginal cost and marginal revenue
In economics, the derivative has a name: marginal. If C(x) is the total cost of producing x units of a good, then
C'(x) = \frac{dC}{dx}
is the marginal cost.
This is the approximate cost of producing one additional unit when you are already at x units. It is not the average cost C(x)/x — it is the incremental cost. The difference matters.
Similarly, if R(x) is the total revenue from selling x units, then R'(x) is the marginal revenue — the additional revenue from selling one more unit.
Profit maximisation. Profit P(x) = R(x) - C(x) is maximised when P'(x) = 0, i.e., when R'(x) = C'(x): marginal revenue equals marginal cost (provided the critical point is a maximum, which holds when P''(x) < 0). This is one of the central results of microeconomics, and it is nothing more than setting the derivative of profit to zero.
For example, if C(x) = 500 + 10x + 0.01x^2 (in rupees) and R(x) = 50x - 0.02x^2, then:
C'(x) = 10 + 0.02x, \qquad R'(x) = 50 - 0.04x
Setting C'(x) = R'(x): 10 + 0.02x = 50 - 0.04x, so 0.06x = 40, giving x \approx 667 units.
At 667 units, the cost of producing one more unit exactly equals the revenue from selling one more unit. Producing beyond this point loses money on each additional unit.
Average cost vs. marginal cost. These are different quantities. The average cost at 667 units is C(667)/667 = (500 + 6670 + 4449)/667 \approx 17.4 rupees per unit. The marginal cost at 667 units is C'(667) = 10 + 13.3 = 23.3 rupees. The marginal cost is higher — each additional unit costs more than the average. This happens when the average cost curve is rising, and the relationship between average and marginal (the marginal pulls the average toward itself) is one of the central ideas in microeconomics.
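The cost and revenue example can be checked numerically. The sketch below uses the C(x) and R(x) given above; the function names are illustrative, and the comparison of profit at neighbouring output levels is an added sanity check rather than part of the original argument.

```python
def cost(x):
    """Total cost in rupees: C(x) = 500 + 10x + 0.01x^2."""
    return 500 + 10 * x + 0.01 * x ** 2

def revenue(x):
    """Total revenue in rupees: R(x) = 50x - 0.02x^2."""
    return 50 * x - 0.02 * x ** 2

def marginal_cost(x):
    """C'(x) = 10 + 0.02x."""
    return 10 + 0.02 * x

def marginal_revenue(x):
    """R'(x) = 50 - 0.04x."""
    return 50 - 0.04 * x

def profit(x):
    return revenue(x) - cost(x)

# Solve 10 + 0.02x = 50 - 0.04x, i.e. 0.06x = 40
x_star = 40 / 0.06

print(f"x* = {x_star:.1f} units")
print(f"marginal cost at x*    = {marginal_cost(x_star):.2f}")
print(f"marginal revenue at x* = {marginal_revenue(x_star):.2f}")
print(f"average cost at x*     = {cost(x_star) / x_star:.2f}")
print(f"profit one unit below, at, and above x*: "
      f"{profit(x_star - 1):.2f}, {profit(x_star):.2f}, {profit(x_star + 1):.2f}")
```

The profit at x* is higher than at either neighbouring output level, and the marginal cost exceeds the average cost, as described above.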
Common confusions
- "\frac{dV}{dt} is the volume." It is the rate of change of the volume. The volume is V; the rate is \frac{dV}{dt}. Mixing them up leads to dimensional errors: V is in cm^3, but \frac{dV}{dt} is in cm^3/s.
- "I should differentiate with respect to r in the balloon problem." You differentiate with respect to time, because time is the variable with respect to which everything is changing. Differentiating V = \frac{4}{3}\pi r^3 with respect to r gives \frac{dV}{dr} = 4\pi r^2, which is a true statement (it is the surface area) but does not answer the question. The question asks about \frac{dV}{dt} and \frac{dr}{dt}.
- "Marginal cost is the cost of the last unit." Almost: it is the cost of the next unit, evaluated at the current production level. Since C(x) is treated as a continuous function, "the cost of one more unit" is approximated by the derivative C'(x). The approximation is excellent when x is large, because the function is nearly linear over one unit.
- "Related rates problems give you a constant rate." They give you a rate at a specific instant. In the ladder problem, \frac{dy}{dt} = -3/8 is the rate when x = 3, not the rate at all times. A moment later, x and y have changed, and so has \frac{dy}{dt}.
- "I don't need the chain rule — I can just differentiate each variable." You can differentiate each variable with respect to itself. But to connect \frac{dV}{dt} and \frac{dr}{dt}, you need the chain rule: \frac{dV}{dt} = \frac{dV}{dr} \cdot \frac{dr}{dt}. The chain rule is the bridge.
Going deeper
The main techniques are above. What follows goes beyond standard exam problems into the structure underneath.
Why Leibniz notation works so well here
In related-rate problems, you treat \frac{dV}{dr} and \frac{dr}{dt} as if they are fractions and multiply them:
\frac{dV}{dt} = \frac{dV}{dr} \cdot \frac{dr}{dt}
This "cancellation" of dr is not a coincidence — it is the chain rule written in Leibniz notation. Leibniz designed the notation precisely so that the chain rule looks like fraction cancellation. The notation carries the theorem inside its form. This is why Leibniz notation dominates applied mathematics and physics: it makes the chain rule invisible.
Strictly, \frac{dV}{dr} is not a fraction — the d's are not numbers. But the chain rule guarantees that the notation behaves as if it were a fraction, and in practice you can rely on this.
Rates of rates: higher-order related rates
In the ladder problem, you found \frac{dy}{dt} at a specific instant. But \frac{dy}{dt} itself is changing: the top of the ladder accelerates as it slides down. To find the acceleration of the top, differentiate the equation 2x\frac{dx}{dt} + 2y\frac{dy}{dt} = 0 with respect to t once more. Using the product rule:
2\left(\frac{dx}{dt}\right)^2 + 2x\frac{d^2x}{dt^2} + 2\left(\frac{dy}{dt}\right)^2 + 2y\frac{d^2y}{dt^2} = 0
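If the foot slides at constant speed (\frac{d^2x}{dt^2} = 0), you can solve for \frac{d^2y}{dt^2}, the vertical acceleration of the top. At x = 3, y = 4, \frac{dx}{dt} = 0.5, \frac{dy}{dt} = -3/8:
2(0.5)^2 + 0 + 2\left(-\frac{3}{8}\right)^2 + 2(4)\frac{d^2y}{dt^2} = 0, so \frac{d^2y}{dt^2} = -\frac{25}{256} \approx -0.098 m/s^2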
The top is accelerating downward (negative \frac{d^2y}{dt^2}), even though the foot moves at constant speed. The algebra is messier, but the method is identical: differentiate the connecting equation one more time with respect to t.
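The second-order result can be checked in the same spirit as the first-order one. The sketch below assumes the foot moves as x(t) = 3 + 0.5t, with t = 0 taken as the instant of interest (a parametrisation chosen here for illustration), and compares the product-rule formula with a central second difference.

```python
import math

def height(t):
    """Height of the top when the foot is at x = 3 + 0.5t (t = 0 is the instant of interest)."""
    x = 3.0 + 0.5 * t
    return math.sqrt(25.0 - x ** 2)

# From 2(dx/dt)^2 + 2(dy/dt)^2 + 2y * d2y/dt2 = 0, with d2x/dt2 = 0
y, dx_dt, dy_dt = 4.0, 0.5, -3.0 / 8.0
d2y_formula = -(dx_dt ** 2 + dy_dt ** 2) / y

# Central second difference as an independent numerical check
dt = 1e-4
d2y_numeric = (height(dt) - 2 * height(0.0) + height(-dt)) / dt ** 2

print(d2y_formula)   # -0.09765625, i.e. -25/256 m/s^2
print(d2y_numeric)   # numerical estimate, close to the value above
```

Both values are negative, confirming that the top accelerates downward even though the foot moves at constant speed.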
Sensitivity and elasticity
Economists often care not about the absolute rate \frac{dQ}{dP} (how demand Q changes per rupee change in price P) but about the percentage rate: the percentage change in demand produced by a 1% change in price. This is the elasticity:
\varepsilon = \frac{dQ}{dP} \cdot \frac{P}{Q}
Elasticity is a dimensionless number — it measures relative sensitivity. When |\varepsilon| > 1, demand is "elastic" (price-sensitive); when |\varepsilon| < 1, demand is "inelastic." The derivative provides the raw rate; the ratio P/Q scales it into a percentage. This is the same idea as percentage error, which you will see in the article on approximations.
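A short numerical illustration of elasticity follows. The linear demand curve Q = 1000 - 20P is a hypothetical example invented for this sketch, not data from the text; the elasticity is computed as \frac{dQ}{dP} \cdot \frac{P}{Q}, with the derivative estimated by a central difference.

```python
def demand(price):
    """Hypothetical linear demand curve (an assumption for illustration only)."""
    return 1000 - 20 * price

def elasticity(price, dp=1e-6):
    """Price elasticity (dQ/dP) * (P/Q), with dQ/dP from a central difference."""
    dq_dp = (demand(price + dp) - demand(price - dp)) / (2 * dp)
    return dq_dp * price / demand(price)

for p in (10, 40):
    eps = elasticity(p)
    label = "elastic" if abs(eps) > 1 else "inelastic"
    print(f"P = {p}: elasticity ~ {eps:.3f} ({label})")
```

The absolute rate \frac{dQ}{dP} is the same at both prices (-20 units per rupee), yet the elasticity differs because the ratio P/Q changes along the curve.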
Where this leads next
- Approximations — using the derivative to estimate function values near a known point, with error bounds.
- Maxima and Minima — where the rate of change is zero — the turning points that solve optimisation problems.
- Tangent and Normal — Advanced — the derivative as a slope, extended to common tangents, angles of intersection, and orthogonal curves.
- Monotonicity — using the sign of the derivative to determine where a function is increasing or decreasing.
- Differential Equations — when the relationship between a quantity and its rate is the problem, not just a step toward the answer.