Definite Integration - Introduction

In short

A definite integral \int_a^b f(x)\,dx computes the signed area between the curve y = f(x) and the x-axis, from x = a to x = b. It is defined as the limit of a sum of thin rectangular areas. In practice, you evaluate it by finding an antiderivative F and computing F(b) - F(a).

You know the area of a rectangle: base times height. You know the area of a triangle: half base times height. But what is the area of this?

Take the curve y = x^2 from x = 0 to x = 3. The region trapped between the curve, the x-axis, and the vertical lines x = 0 and x = 3 is not a rectangle, not a triangle, not any shape with a name. Its boundary is curved. How do you compute the area of something with a curved boundary?

This question — finding the area under a curve — is one of the oldest problems in mathematics. The answer is the definite integral, and the idea behind it is both simple and powerful: approximate the curved region with rectangles, then take a limit.

Rectangles that almost fit

Here is the strategy. Divide the interval [0, 3] into n equal pieces, each of width \Delta x = 3/n. In each piece, build a rectangle whose height equals the value of the function at the right endpoint of that piece.

The first rectangle sits over [0, 3/n] and has height f(3/n) = (3/n)^2 = 9/n^2. Its area is \frac{3}{n} \cdot \frac{9}{n^2} = \frac{27}{n^3}.

The second rectangle sits over [3/n, 6/n] and has height f(6/n) = (6/n)^2 = 36/n^2. Its area is \frac{3}{n} \cdot \frac{36}{n^2} = \frac{108}{n^3}.

The k-th rectangle sits over [\frac{3(k-1)}{n}, \frac{3k}{n}] and has height f\!\left(\frac{3k}{n}\right) = \frac{9k^2}{n^2}. Its area is \frac{3}{n} \cdot \frac{9k^2}{n^2} = \frac{27k^2}{n^3}.

The total area of all n rectangles is:

S_n = \sum_{k=1}^{n} \frac{27k^2}{n^3} = \frac{27}{n^3}\sum_{k=1}^{n} k^2

You know the formula \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}. Substituting:

S_n = \frac{27}{n^3} \cdot \frac{n(n+1)(2n+1)}{6} = \frac{27(n+1)(2n+1)}{6n^2}

Expand the numerator: (n+1)(2n+1) = 2n^2 + 3n + 1. So:

S_n = \frac{27(2n^2 + 3n + 1)}{6n^2} = \frac{27}{6}\left(2 + \frac{3}{n} + \frac{1}{n^2}\right) = \frac{9}{2}\left(2 + \frac{3}{n} + \frac{1}{n^2}\right)

Now watch what happens as n grows:

n	S_n
4	\frac{9}{2}(2 + 0.75 + 0.0625) = 12.656
10	\frac{9}{2}(2 + 0.3 + 0.01) = 10.395
100	\frac{9}{2}(2 + 0.03 + 0.0001) = 9.135
1000	\frac{9}{2}(2 + 0.003 + 0.000001) \approx 9.0135

As n \to \infty, the terms 3/n and 1/n^2 vanish, and S_n \to \frac{9}{2} \cdot 2 = 9.

The area under y = x^2 from 0 to 3 is exactly 9.

The rectangles overshoot (since x^2 is increasing, the right-endpoint rectangle is taller than the curve), but the overshoot shrinks as the rectangles get thinner. In the limit, the approximation becomes exact.
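The table can be reproduced numerically. The sketch below is an illustration, not part of the derivation: the helper names `right_riemann_sum` and `square` are made up here, and exact fractions are used (assuming integer endpoints) so the direct sum can be compared with the closed form for S_n without rounding.

```python
from fractions import Fraction

def right_riemann_sum(f, a, b, n):
    """Sum of n right-endpoint rectangles for f on [a, b] (a, b integers)."""
    dx = Fraction(b - a, n)                      # width of each rectangle
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))

def square(x):
    return x * x

for n in (4, 10, 100, 1000):
    s_n = right_riemann_sum(square, 0, 3, n)
    closed_form = Fraction(9, 2) * (2 + Fraction(3, n) + Fraction(1, n * n))
    assert s_n == closed_form                    # direct sum equals the formula
    print(n, float(s_n))
```

The printed values agree with the table up to rounding and creep down toward 9 as n grows.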

Left: 4 rectangles approximate the area under $y = x^2$. The rectangles overshoot the curve. Right: 10 rectangles — the overshoot is much smaller. As the number of rectangles grows toward infinity, the total area of the rectangles approaches the exact area under the curve.

The formal definition

What you just did — summing rectangular areas and taking a limit — is the definition of the definite integral.

Definite Integral (Riemann Sum)

Let f be a function defined on [a, b]. Divide [a, b] into n equal subintervals, each of width \Delta x = \frac{b-a}{n}. Choose a sample point x_k^* in the k-th subinterval. The definite integral of f from a to b is

\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{k=1}^{n} f(x_k^*)\,\Delta x

when this limit exists and is the same regardless of the choice of sample points.

Reading the notation. The symbol \int_a^b says "integrate from a to b." The a is called the lower limit and b the upper limit. The f(x) inside is the integrand, the function you are summing. The dx reminds you that \Delta x is the width of each strip, and the integral sign \int is a stretched-out S for "summa" (sum).

The sample point x_k^* can be the left endpoint, the right endpoint, the midpoint, or any point in the subinterval. The limit comes out the same for any choice, as long as the function is "well-behaved" (continuous, or bounded with at most finitely many discontinuities). This independence is what makes the definition robust.
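As a rough numerical illustration of this independence, the sketch below evaluates the sum in the definition for three choices of sample point. The function name `riemann_sum` and its `position` parameter are assumptions introduced here for illustration, and floating-point arithmetic means the results are approximate.

```python
def riemann_sum(f, a, b, n, position=0.5):
    """Riemann sum of f on [a, b] with n equal subintervals.

    position picks the sample point inside each subinterval:
    0.0 = left endpoint, 0.5 = midpoint, 1.0 = right endpoint.
    """
    dx = (b - a) / n
    return sum(f(a + (k + position) * dx) for k in range(n)) * dx

def square(x):
    return x * x

for n in (10, 100, 1000):
    left = riemann_sum(square, 0.0, 3.0, n, position=0.0)
    mid = riemann_sum(square, 0.0, 3.0, n, position=0.5)
    right = riemann_sum(square, 0.0, 3.0, n, position=1.0)
    print(n, left, mid, right)
```

For y = x^2 on [0, 3] all three columns drift toward 9 as n increases, with the midpoint column getting there fastest.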

Geometric interpretation: signed area

The definite integral computes signed area: area above the x-axis counts as positive, area below counts as negative.

If f(x) \geq 0 on [a, b], then \int_a^b f(x)\,dx is the ordinary area of the region between the curve and the x-axis.

If f(x) \leq 0 on [a, b], then \int_a^b f(x)\,dx is negative — the absolute value gives the area, but the sign tells you the region is below the axis.

If f crosses the x-axis, the integral sums the positive parts and subtracts the negative parts. The result is the net signed area, not the total area. To find the total (unsigned) area, you integrate |f(x)| instead, which means splitting the integral at each zero and taking absolute values.

For example, \int_0^{2\pi} \sin x\,dx = 0, because the positive hump from 0 to \pi exactly cancels the negative hump from \pi to 2\pi. But the total area enclosed is \int_0^{\pi} \sin x\,dx + \int_{\pi}^{2\pi} |\sin x|\,dx = 2 + 2 = 4.
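The difference between net and total area can be checked numerically. This is a minimal sketch using a midpoint sum. The helper `midpoint_sum` and the choice of 10,000 strips are assumptions for illustration, so the results are approximations rather than exact values.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n strips."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx

# Net signed area: the two humps of sin x cancel.
signed = midpoint_sum(math.sin, 0.0, 2 * math.pi)

# Total area: integrate |sin x| so the lower hump counts as positive.
total = midpoint_sum(lambda x: abs(math.sin(x)), 0.0, 2 * math.pi)

print(signed, total)
```

The first number comes out very close to 0 and the second very close to 4.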

Evaluation using antiderivatives

Computing a definite integral from the limit-of-sums definition is laborious: you saw the algebra involved even for x^2. There is a far simpler method: find an antiderivative of f, and subtract.

If F is any antiderivative of f — meaning F'(x) = f(x) — then:

\int_a^b f(x)\,dx = F(b) - F(a)

This is written as \bigl[F(x)\bigr]_a^b or F(x)\Big|_a^b — the notation means "evaluate F at the upper limit and subtract F at the lower limit."

Why this works is the content of the Fundamental Theorem of Calculus, which gets its own article. For now, take it as a computing tool and verify it on the example you already solved.

You found that the area under y = x^2 from 0 to 3 is 9, by summing rectangles. An antiderivative of x^2 is F(x) = \frac{x^3}{3}. So:

\int_0^3 x^2\,dx = F(3) - F(0) = \frac{27}{3} - \frac{0}{3} = 9 - 0 = 9

The same answer, in one line. The sum-of-rectangles approach took half a page of algebra. The antiderivative approach takes 10 seconds.
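The one-line evaluation translates directly into code. In the sketch below, `F` and `definite_integral` are hypothetical names chosen for this illustration. It also checks that shifting the antiderivative by a constant leaves the result unchanged.

```python
from fractions import Fraction

def F(x):
    """An antiderivative of x^2, namely x^3 / 3."""
    return Fraction(x) ** 3 / 3

def definite_integral(antiderivative, a, b):
    """Evaluate F(b) - F(a) for a given antiderivative F."""
    return antiderivative(b) - antiderivative(a)

value = definite_integral(F, 0, 3)

# A different antiderivative, shifted by an arbitrary constant (7 here),
# gives the same definite integral because the constant cancels.
shifted = definite_integral(lambda x: F(x) + 7, 0, 3)

print(value, shifted)
```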

This is the power of the definite integral: a difficult geometric problem (area under a curve) reduces to an algebraic one (find an antiderivative and subtract two values). The entire apparatus of integration techniques — substitution, parts, partial fractions — exists to make finding that antiderivative possible.

Example 1: Area under a straight line

Compute \displaystyle\int_1^4 (2x + 1)\,dx and interpret the result geometrically.

Step 1. Find an antiderivative of f(x) = 2x + 1.

F(x) = x^2 + x

Why: the antiderivative of 2x is x^2, and the antiderivative of 1 is x. No constant C is needed — it would cancel in F(b) - F(a).

Step 2. Evaluate at the limits.

F(4) = 16 + 4 = 20, \qquad F(1) = 1 + 1 = 2

Why: substitute the upper and lower limits into the antiderivative.

Step 3. Subtract.

\int_1^4 (2x+1)\,dx = 20 - 2 = 18

Why: F(b) - F(a) gives the definite integral.

Step 4. Geometric check. The graph of y = 2x + 1 is a straight line. The region from x = 1 to x = 4 is a trapezium with parallel sides f(1) = 3 and f(4) = 9, and width 3.

\text{Area} = \frac{1}{2}(3 + 9)(3) = 18

Why: the trapezium formula confirms the integral. For a linear function that stays above the x-axis on the interval, the region under the graph is always a trapezium, which makes a useful sanity check.

Result: \displaystyle\int_1^4 (2x+1)\,dx = 18.

The line $y = 2x + 1$ from $x = 1$ to $x = 4$. The shaded region is a trapezium with area $18$. The definite integral gives this area exactly — no surprise for a straight line, but the same method works for any curve.

Example 2: Signed area with a curve that crosses the axis

Compute \displaystyle\int_0^3 (x^2 - 2x)\,dx.

Step 1. Find an antiderivative.

F(x) = \frac{x^3}{3} - x^2

Why: integrate term by term — the antiderivative of x^2 is x^3/3, and the antiderivative of 2x is x^2.

Step 2. Evaluate at the limits.

F(3) = \frac{27}{3} - 9 = 9 - 9 = 0, \qquad F(0) = 0 - 0 = 0

Why: direct substitution.

Step 3. Subtract.

\int_0^3 (x^2 - 2x)\,dx = 0 - 0 = 0

Why: the integral is zero. But the curve is not identically zero — so what is going on?

Step 4. Investigate. The function f(x) = x^2 - 2x = x(x-2) has roots at x = 0 and x = 2. Between 0 and 2, the function is negative (the parabola dips below the axis). Between 2 and 3, it is positive.

The signed area below the axis (from 0 to 2) exactly cancels the signed area above (from 2 to 3). The definite integral, being signed area, returns zero — even though the total (unsigned) area is not zero.

To find the actual area: \int_0^2 |x^2-2x|\,dx + \int_2^3 (x^2-2x)\,dx = \frac{4}{3} + \frac{4}{3} = \frac{8}{3}.

Why: \int_0^2 (2x - x^2)\,dx = [x^2 - x^3/3]_0^2 = 4 - 8/3 = 4/3, and \int_2^3 (x^2 - 2x)\,dx = [x^3/3 - x^2]_2^3 = 0 - (-4/3) = 4/3. The two pieces have equal absolute area but opposite signs, so they cancel in the signed integral.

Result: \displaystyle\int_0^3 (x^2-2x)\,dx = 0. The signed area is zero because the region below the axis cancels the region above.
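The arithmetic in this example can be double-checked with exact fractions. The sketch below assumes nothing beyond the antiderivative found in Step 1. The variable names are chosen here for illustration.

```python
from fractions import Fraction

def F(x):
    """Antiderivative of x^2 - 2x from Step 1: x^3/3 - x^2."""
    x = Fraction(x)
    return x ** 3 / 3 - x ** 2

signed = F(3) - F(0)              # net signed area over [0, 3]
below = F(2) - F(0)               # piece below the axis (negative)
above = F(3) - F(2)               # piece above the axis (positive)
total = abs(below) + abs(above)   # unsigned area

print(signed, below, above, total)
```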

The parabola $y = x^2 - 2x$. Between $x = 0$ and $x = 2$, the curve is below the axis (negative area). Between $x = 2$ and $x = 3$, it is above (positive area). These two regions happen to have equal absolute area, so the definite integral from $0$ to $3$ is exactly $0$.

Common confusions

"The definite integral is always positive." No, it computes signed area. If the function is below the x-axis, the integral is negative. For instance, \int_0^{\pi} (-\sin x)\,dx = -2, even though the region has a perfectly real area of 2 square units.
"I need the constant C in a definite integral." You do not. The arbitrary constant cancels: (F(b) + C) - (F(a) + C) = F(b) - F(a). The +C is only needed for indefinite integrals, where no limits are given.
"The limits of integration are x-values." They are, but students sometimes confuse them with y-values. In \int_a^b f(x)\,dx, a and b are values of x, not values of f. The function values f(a) and f(b) do not appear in the limits.
"\int_a^b f(x)\,dx and \int_a^b f(t)\,dt are different." They are the same number. The variable of integration is a "dummy variable": it does not matter what letter you use. The integral depends on f, a, and b, not on the name of the variable.
"A definite integral of zero means the function is zero." Example 2 above shows otherwise. A zero integral means the signed areas cancel. The function can be far from zero and still have a zero integral over the interval, and the interval does not need to be symmetric for this to happen.

Going deeper

If you understand the Riemann sum definition, the signed-area interpretation, and the antiderivative evaluation method, you have the full picture for computing definite integrals. What follows is for readers who want to see why the limit-of-sums definition is carefully set up the way it is, and how it connects to the antiderivative.

Why the choice of sample point does not matter

The definition says "choose a sample point x_k^* in the k-th subinterval." It does not specify which point. Remarkably, the limit is the same for any consistent choice: left endpoints, right endpoints, midpoints, or even random points.

The proof of this fact requires the function to be Riemann integrable — which continuous functions always are. The key idea is that as n \to \infty, the maximum difference between f at any two points in the same subinterval goes to zero (by uniform continuity on a closed interval). So left-endpoint sums, right-endpoint sums, and midpoint sums all converge to the same number.

For the x^2 example, using left endpoints instead of right endpoints gives:

S_n^{\text{left}} = \frac{27}{n^3}\sum_{k=0}^{n-1} k^2 = \frac{27}{n^3} \cdot \frac{(n-1)n(2n-1)}{6}

As n \to \infty, this also approaches 9. The right-endpoint sum overestimates (because x^2 is increasing), the left-endpoint sum underestimates, and both converge to the same limit from opposite sides.
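The squeeze from both sides can be verified exactly for a few values of n. This is an illustrative sketch, with `left_sum` and `right_sum` as names introduced here. It confirms the closed form for the left-endpoint sum and that 9 always lies strictly between the two sums.

```python
from fractions import Fraction

def left_sum(n):
    """Left-endpoint sum for x^2 on [0, 3] with n rectangles."""
    dx = Fraction(3, n)
    return sum((k * dx) ** 2 * dx for k in range(n))

def right_sum(n):
    """Right-endpoint sum for x^2 on [0, 3] with n rectangles."""
    dx = Fraction(3, n)
    return sum((k * dx) ** 2 * dx for k in range(1, n + 1))

for n in (4, 10, 100):
    closed = Fraction(27, n ** 3) * Fraction((n - 1) * n * (2 * n - 1), 6)
    assert left_sum(n) == closed              # matches the formula above
    assert left_sum(n) < 9 < right_sum(n)     # 9 is trapped between them
    # The two sums differ only in the first and last rectangle: f(3) * dx = 27/n.
    assert right_sum(n) - left_sum(n) == Fraction(27, n)
    print(n, float(left_sum(n)), float(right_sum(n)))
```

Since the gap between the two sums is 27/n, it shrinks to zero as n grows, which forces both to the same limit.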

The definite integral as accumulated change

There is another way to think about \int_a^b f(x)\,dx beyond area. If f(x) represents a rate — speed, growth rate, flow rate — then the integral represents the total accumulated quantity over the interval.

A car moving at velocity v(t) metres per second from time t = a to t = b has a net displacement of \int_a^b v(t)\,dt metres. When v(t) \geq 0 throughout, this is also the total distance covered. A population whose size changes at rate r(t) organisms per year has a net change of \int_a^b r(t)\,dt organisms over [a, b].

This interpretation makes the Riemann sum physically natural: v(t_k) \cdot \Delta t is approximately the distance covered in a short time interval — speed times time. Summing all these small distances gives the total. The integral is the exact total.
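The "speed times time, summed" idea is easy to sketch in code. The velocity used below is hypothetical (the same formula as Example 1, read as metres per second), and the function name `accumulated` is an assumption made for this illustration. The velocity is positive on [1, 4], so the result is a distance.

```python
def accumulated(rate, a, b, n=10_000):
    """Approximate the total change as a sum of rate(t_k) * dt."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t_k = a + (k + 0.5) * dt      # midpoint of the k-th time slice
        total += rate(t_k) * dt       # small distance: speed times time
    return total

def velocity(t):
    """Hypothetical velocity in m/s: v(t) = 2t + 1."""
    return 2 * t + 1

distance = accumulated(velocity, 1.0, 4.0)
print(distance)
```

The result agrees with the value 18 found in Example 1, up to floating-point rounding.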

Connection to the antiderivative: a preview

The fact that \int_a^b f(x)\,dx = F(b) - F(a) where F' = f is the Fundamental Theorem of Calculus. It connects two seemingly different ideas: the limit of a sum (a geometric/physical concept) and the antiderivative (an algebraic concept).

The full statement and proof are in the next article. The key insight is this: define G(x) = \int_a^x f(t)\,dt. Then G is a function of x, the "area so far" as you slide the upper limit. For continuous f it turns out that G'(x) = f(x): the rate at which area accumulates is exactly the height of the curve. So G is an antiderivative of f, with G(a) = 0. Any other antiderivative F differs from G by a constant, so F(b) - F(a) = G(b) - G(a) = G(b) = \int_a^b f(x)\,dx.
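The claim G'(x) = f(x) can be probed numerically before seeing the proof. The sketch below is an illustration under stated assumptions: f(t) = t^2 with lower limit a = 0, a midpoint sum standing in for the integral, and a central difference standing in for the derivative. The names `G` and `slope_of_G`, the strip count and the step size h are all choices made here.

```python
def f(t):
    return t * t

def G(x, n=20_000):
    """Area so far: midpoint approximation of the integral of f from 0 to x."""
    dt = x / n
    return sum(f((k + 0.5) * dt) for k in range(n)) * dt

def slope_of_G(x, h=1e-3):
    """Central-difference estimate of G'(x)."""
    return (G(x + h) - G(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    # The two printed columns should nearly agree: G'(x) vs f(x).
    print(x, slope_of_G(x), f(x))
```

The estimated slope of the area function tracks the height of the curve at each x, which is the statement the Fundamental Theorem makes precise.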

That single connection — area accumulation rate equals function value — is the deepest idea in elementary calculus.

Where this leads next

Fundamental Theorem of Calculus — the theorem that justifies using antiderivatives to evaluate definite integrals, with a complete proof.
Properties of Definite Integrals — rules that let you simplify and manipulate definite integrals without computing them from scratch.
Properties - Advanced — periodic-function properties and the King's property for symmetric integrals.
Area Under Curves — applying definite integrals to compute areas of regions bounded by curves.
Partial Fractions - Review and Integration — a technique for finding antiderivatives of rational functions, which you then evaluate as definite integrals.