Integration - Introduction

In short

An antiderivative of f is any function F whose derivative is f. There is never just one — if F works, so does F + 7, F - \pi, or F plus any constant. The full collection of all antiderivatives, written \int f(x)\, dx = F(x) + C, is called the indefinite integral of f. The + C is not decoration; it captures every possible antiderivative at once.

You know that a ball thrown straight up has velocity v(t) = 20 - 10t metres per second — starting at 20 m/s upward and slowing by 10 m/s every second. Where is the ball at time t?

Differentiation answered one direction of this. If you knew the position s(t), you could find the velocity by differentiating: v(t) = s'(t). But here the problem runs the other way. You know the velocity, and you want the position. The velocity is the derivative of something, and that something is what you want to recover.

Is there a function whose derivative is 20 - 10t? Yes — s(t) = 20t - 5t^2 works, because differentiating gives s'(t) = 20 - 10t, which is v(t). But is this the only such function? No. Try s(t) = 20t - 5t^2 + 15. Its derivative is still 20 - 10t, because adding a constant doesn't change the derivative. Try s(t) = 20t - 5t^2 - \sqrt{2}. Same derivative. You can add any constant and the derivative stays the same.

That is not a quirk. It is the central feature of this operation. Any one antiderivative gives you the shape of the position, but you need an extra piece of information — the starting height — to know which of infinitely many vertically-shifted copies is the actual ball's trajectory. This is the subject of the whole article: the process of running differentiation backwards, called integration, and the extra constant that forever attaches itself to the answer.

The antiderivative

Differentiation takes a function and produces another function — its derivative. Integration is the inverse: it takes a function and asks "what function has this as its derivative?"

Antiderivative

A function F is an antiderivative of f on an interval I if

F'(x) = f(x) \quad \text{for every } x \in I.

Every antiderivative is defined by working one derivative rule backwards.

What is an antiderivative of 2x? Well, \frac{d}{dx}(x^2) = 2x, so x^2 is one.
What is an antiderivative of \cos x? Well, \frac{d}{dx}(\sin x) = \cos x, so \sin x is one.
What is an antiderivative of 3x^2? Well, \frac{d}{dx}(x^3) = 3x^2, so x^3 is one.

At this stage, finding antiderivatives is reverse-engineering. You look at the function in front of you and ask: what could I have differentiated to get this? Every differentiation rule you already know gives you a matching antiderivative rule — that is literally all you have at the start.

The infinity of antiderivatives

Look again at the three examples above, and the key observation that started this article. If F(x) = x^2 is an antiderivative of 2x, then so is G(x) = x^2 + 3, because

G'(x) = \frac{d}{dx}(x^2 + 3) = 2x + 0 = 2x = f(x).

The derivative of a constant is zero, so adding any constant to an antiderivative gives another antiderivative.

So there is no such thing as "the antiderivative." There are infinitely many antiderivatives, one for each choice of additive constant.

Four antiderivatives of $f(x) = 2x$: $x^2$, $x^2 + 2$, $x^2 - 3$, and $x^2 + 4$. All four are parabolas with the same shape; they differ only in vertical position. Each one, when differentiated, gives back the same $2x$.

Look at the picture. All four curves are identical except for vertical shifts. Differentiating any of them gives the same slope at each x — because they have the same slope at each x. Their derivatives are the same function. They are different antiderivatives of the same 2x.

The natural question: is this the only way antiderivatives can differ, or might there be some stranger kind of antiderivative of 2x that isn't of the form x^2 + C?

The answer is clean. On any interval, two antiderivatives of the same function differ by a constant — nothing else.

Two antiderivatives differ by a constant

If F and G are both antiderivatives of f on an interval I, then there is a constant C such that

G(x) = F(x) + C

for every x \in I.

Proof. Define H(x) = G(x) - F(x). Then H'(x) = G'(x) - F'(x) = f(x) - f(x) = 0 for every x \in I. So H is a function whose derivative is zero on an interval. By the mean value theorem, any function with zero derivative on an interval must be constant (take any two points a, b \in I; the MVT gives H(b) - H(a) = H'(c)(b - a) = 0, so H(b) = H(a) — every pair of values of H is equal, so H is constant). Thus G - F is a constant C, which gives G = F + C. \blacksquare

So once you find one antiderivative, you have them all: every other antiderivative is a vertical shift of the one you found. The full family of antiderivatives is just

\{F(x) + C : C \in \mathbb{R}\}

— a single curve and every vertical shift of it.

One caveat in the statement. The "differ by a constant" result is about an interval. If the domain is not an interval — if it has gaps — then two antiderivatives can differ by different constants on different pieces. Example: f(x) = \frac{1}{x^2} is defined on (-\infty, 0) \cup (0, \infty). An antiderivative is -\frac{1}{x}. But so is -\frac{1}{x} + 3 on the left and -\frac{1}{x} + 7 on the right, combined into one piecewise function. That function has derivative \frac{1}{x^2} everywhere it is defined, but it differs from -\frac{1}{x} by two different constants on the two branches. This is why the theorem is stated on an interval.

The indefinite integral

Having the whole family of antiderivatives in one place is useful enough that it gets its own symbol.

Indefinite integral

The indefinite integral of a function f is the collection of all its antiderivatives, written

\int f(x) \, dx = F(x) + C,

where F is any particular antiderivative and C is an arbitrary constant, called the constant of integration.

The symbol \int is called the integral sign. The function f(x) being integrated is the integrand. The dx indicates that the variable of integration is x.

Read the whole thing out: "the integral of f of x with respect to x." Each piece has a job:

\int — says "find the antiderivatives of what comes next."
f(x) — the function you want antiderivatives of.
dx — says "treat x as the variable; everything else is a constant." This matters when the function has more than one letter in it. \int 3ax\, dx treats a as a constant (a fixed number along for the ride), and the answer is \frac{3a x^2}{2} + C.
+ C — the constant of integration, standing in for every possible vertical shift.

The indefinite integral is not a number. It is a family of functions. When you write \int 2x\, dx = x^2 + C, you are saying: "the antiderivatives of 2x are exactly the functions x^2 + C as C ranges over the reals." Every C gives a legitimate antiderivative; every antiderivative arises for some C.

Reading off antiderivatives

You can write down the indefinite integral of any function whose derivative rule you already know — just run the derivative rule backwards.

Start with the power rule for derivatives: \frac{d}{dx}(x^n) = n x^{n-1}. Rearrange it to ask, "what is an antiderivative of x^m?" If F(x) = \frac{x^{m+1}}{m+1}, then

F'(x) = \frac{(m+1) x^m}{m+1} = x^m.

So:

\int x^m \, dx = \frac{x^{m+1}}{m+1} + C, \quad \text{valid for } m \neq -1.

The exclusion m \neq -1 is necessary: at m = -1 the formula gives \frac{x^0}{0}, which is nonsense. (The antiderivative of x^{-1} = \frac{1}{x} is \ln|x|, a separate story.)

Some examples:

\int x^2 \, dx = \frac{x^3}{3} + C
\int x^5 \, dx = \frac{x^6}{6} + C
\int \frac{1}{x^2} \, dx = \int x^{-2} \, dx = \frac{x^{-1}}{-1} + C = -\frac{1}{x} + C
\int \sqrt{x} \, dx = \int x^{1/2} \, dx = \frac{x^{3/2}}{3/2} + C = \frac{2}{3} x^{3/2} + C

The same backward-read idea works for all the standard derivative rules. Two that you will use over and over:

\frac{d}{dx}(\sin x) = \cos x \Rightarrow \int \cos x \, dx = \sin x + C
\frac{d}{dx}(e^x) = e^x \Rightarrow \int e^x \, dx = e^x + C

The next article, basic integration formulas, tabulates all these standard antiderivatives in one place. For now, the point is that you already know the machinery — it is just derivative rules read right-to-left.

Using initial conditions to pin down C

The constant C is not a nuisance. It is a genuine degree of freedom, and pinning it down requires one extra piece of information — typically the value of the antiderivative at a single point. That one piece of information is called an initial condition.

Go back to the falling-ball example. You know v(t) = 20 - 10t and you want s(t). The indefinite integral gives

s(t) = \int (20 - 10 t) \, dt = 20 t - 5 t^2 + C.

This is the whole family of possible positions: every vertical shift of the parabola 20 t - 5 t^2. To pick the right one, you need to know where the ball starts. Say the ball is thrown from ground level, so s(0) = 0. Plug in:

0 = 20 \cdot 0 - 5 \cdot 0^2 + C \implies C = 0.

So s(t) = 20 t - 5 t^2. The initial condition collapsed the infinite family down to one specific function.

Different initial conditions give different specific answers. If the ball is thrown from the top of a 5 metre ledge, s(0) = 5, and the calculation gives C = 5, so s(t) = 20 t - 5 t^2 + 5. Same velocity, different position — because the trajectories are vertical shifts of each other.

Example 1: Antiderivative by reading the power rule backward

Find \int (x^3 - 4x + 7) \, dx.

Step 1. Split the integral using linearity.

\int (x^3 - 4x + 7) \, dx = \int x^3 \, dx - 4 \int x \, dx + 7 \int 1 \, dx

Why: you can integrate term-by-term because differentiation is linear. The derivative of a sum is a sum of derivatives, so running that backwards, the antiderivative of a sum is a sum of antiderivatives. Constants come out front.

Step 2. Apply the power rule to each piece.

\int x^3 \, dx = \frac{x^4}{4}, \quad \int x \, dx = \frac{x^2}{2}, \quad \int 1 \, dx = x

Why: \int x^m \, dx = \frac{x^{m+1}}{m+1} for each exponent. For the last piece, 1 = x^0, and the formula gives \frac{x^1}{1} = x.

Step 3. Reassemble.

\int x^3 \, dx - 4 \int x \, dx + 7 \int 1 \, dx = \frac{x^4}{4} - 4 \cdot \frac{x^2}{2} + 7 x = \frac{x^4}{4} - 2 x^2 + 7 x

Step 4. Add the constant.

\int (x^3 - 4x + 7) \, dx = \frac{x^4}{4} - 2 x^2 + 7 x + C

Why: only one constant is needed, not three — adding three constants together just gives one constant. Keep a single C at the end.

Step 5. Check by differentiating.

\frac{d}{dx}\left(\frac{x^4}{4} - 2 x^2 + 7 x + C\right) = x^3 - 4 x + 7 \ \checkmark

Result: \int (x^3 - 4 x + 7) \, dx = \dfrac{x^4}{4} - 2 x^2 + 7 x + C.

Three members of the family $\frac{x^4}{4} - 2x^2 + 7x + C$, with $C = -3, 0, 3$. All three curves have the same shape, just shifted vertically. Differentiating any of them gives back the same $x^3 - 4x + 7$.

The check step at the end is not optional. It is the best way to catch arithmetic errors while integrating — differentiate your answer, compare it term-by-term with the integrand, and make sure every power, every sign, and every coefficient lines up.

Example 2: A ball thrown from a ledge

A ball is thrown upward at 12 m/s from a ledge 20 metres above the ground. Taking the acceleration due to gravity as 10 m/s² downward, find the height of the ball at time t, and the time at which it hits the ground.

Step 1. Set up the problem. Let s(t) be the height in metres and v(t) the velocity in m/s, with up as positive. Acceleration is the derivative of velocity, and is -10 (downward):

\frac{dv}{dt} = -10.

Step 2. Integrate to find velocity.

v(t) = \int (-10) \, dt = -10 t + C_1.

Use the initial condition v(0) = 12 to pin down C_1: v(0) = -10 \cdot 0 + C_1 = C_1 = 12.

v(t) = -10 t + 12.

Why: the initial condition at t = 0 picks out the specific velocity function whose graph passes through (0, 12). Without this, any vertical shift of -10t would be a possible velocity.

Step 3. Integrate velocity to find position.

s(t) = \int v(t) \, dt = \int (12 - 10 t) \, dt = 12 t - 5 t^2 + C_2.

Use the initial condition s(0) = 20 to pin down C_2: s(0) = C_2 = 20.

s(t) = 20 + 12 t - 5 t^2.

Step 4. Find when the ball hits the ground. Solve s(t) = 0:

-5 t^2 + 12 t + 20 = 0 \implies 5 t^2 - 12 t - 20 = 0.

Quadratic formula:

t = \frac{12 \pm \sqrt{144 + 400}}{10} = \frac{12 \pm \sqrt{544}}{10} = \frac{12 \pm 4\sqrt{34}}{10} = \frac{6 \pm 2\sqrt{34}}{5}.

The negative root corresponds to a time before the throw, so the physical answer is t = \frac{6 + 2\sqrt{34}}{5} \approx \frac{6 + 11.66}{5} \approx 3.53 seconds.

Result: s(t) = 20 + 12 t - 5 t^2, ground-hit at t = \frac{6 + 2\sqrt{34}}{5} s ≈ 3.53 s.

The trajectory $s(t) = 20 + 12 t - 5 t^2$, starting from $(0, 20)$, peaking at $(1.2, 27.2)$, and hitting the ground at $t \approx 3.53$ s. The dashed line at $y = 20$ marks the ledge height.

The two integrations — velocity to position, then position to when it hits the ground — correspond exactly to the two initial conditions. Each integration introduces one new constant of integration, and each initial condition pins down exactly one such constant. If you had left off either initial condition, the answer would be one constant short of specific.

Common confusions

Dropping the + C. The single most common mistake. Any answer to an indefinite integral that omits the constant is incomplete. The + C is part of the meaning, not ornament. On exams, graders subtract marks for missing it because the answer without C represents one specific antiderivative and not the full family.
"Integration is the inverse of differentiation" — therefore \int f'(x)\, dx = f(x). Almost. \int f'(x)\, dx = f(x) + C. Integration recovers the function up to a constant. The inverse statement is "\frac{d}{dx}\int f(x)\, dx = f(x)," which holds exactly (because \frac{d}{dx}(F(x) + C) = F'(x) = f(x) for any C).
Writing one C per term. \int (x + 1)\, dx = \frac{x^2}{2} + C_1 + x + C_2 is wrong. Since any sum of constants is a constant, you need only one: \frac{x^2}{2} + x + C.
Treating dx as optional. The dx tells you what variable you are integrating with respect to. \int 3xy \, dx and \int 3xy \, dy give different answers — \frac{3 x^2 y}{2} + C in the first case (treating y as constant) and \frac{3 x y^2}{2} + C in the second (treating x as constant). When there is only one letter in sight the choice is obvious, but do not leave dx off.
Thinking \int f(x)\, dx is a number. It is not. It is a family of functions. The definite integral \int_a^b f(x)\, dx — written with limits of integration — is a number, but that is a different object, and a separate article.

Going deeper

You know what an antiderivative is, you can compute simple ones by reading derivative rules backward, and you understand why there is always a + C. The rest of this section is about why this simple-looking idea is so powerful, and what happens when you try to extend it naively.

The link to the fundamental theorem of calculus

The most surprising fact in elementary calculus, which you will meet soon, is that antiderivatives are not just the inverse of differentiation in a formal sense — they are also the tool for computing areas under curves. That is, if F is an antiderivative of f, then the area under the graph of f from x = a to x = b is exactly F(b) - F(a). The same operation that "undoes differentiation" also "sums up infinitely many tiny rectangles under a curve," and these two things turn out to be the same thing. This equivalence is the fundamental theorem of calculus, and it is the reason the indefinite integral and the definite integral share a name and a symbol.

For now, you are only meeting the indefinite integral — the "undo differentiation" half of the story. The area half comes later. But when you do meet it, the notation will already feel natural, because the two halves are connected by one idea: once you have F such that F' = f, you have answered both questions.

Not every function has an elementary antiderivative

Every reasonable function is differentiable in closed form — or at least, if you give me an explicit formula, I can differentiate it mechanically. Integration is not like this. There are specific, honest-looking functions whose antiderivatives cannot be written using the standard "elementary" operations (polynomials, roots, exponentials, logs, sines, cosines, and combinations). Three famous examples:

\int e^{-x^2} \, dx, \quad \int \frac{\sin x}{x} \, dx, \quad \int \frac{1}{\ln x} \, dx.

None of these have a closed-form antiderivative. The first one comes up in probability (the normal distribution). The second one comes up in signal processing. The third one comes up in number theory (the prime counting function). These functions have antiderivatives — they exist as functions you can define — but the antiderivatives cannot be written as finite combinations of the functions you already know.

This is worth internalising early. Differentiation is mechanical. Integration is harder than differentiation, and sometimes integration in closed form is impossible. A large part of the rest of the calculus syllabus is dedicated to techniques — substitution, integration by parts, partial fractions — that let you handle a growing variety of integrals in closed form. But even with all of them, there are functions that slip through.

Why the +C sometimes looks like more than one constant

The statement "two antiderivatives differ by a constant" is about a connected interval. If you try to integrate \frac{1}{x^2} on its full domain (-\infty, 0) \cup (0, \infty), you find the answer -\frac{1}{x} + C — but strictly speaking, the C is allowed to be different on the two branches of the domain. So the full answer is something like

\int \frac{dx}{x^2} = \begin{cases} -\frac{1}{x} + C_1 & x > 0 \\ -\frac{1}{x} + C_2 & x < 0 \end{cases}

with C_1 and C_2 independent. Most textbooks sweep this under the rug and write one C, which is fine for 99% of applications. But when the domain is disconnected, the constant of integration is really one independent constant per connected piece. This is the single subtlety you should keep in mind when you see a function with a discontinuity in its domain.

Where this leads next

You now have the concept of integration. The next articles give you the formulas that let you integrate any standard function in one step, plus the techniques that extend the method to harder cases.

Basic Integration Formulas — the table of standard antiderivatives, read backwards from the corresponding derivative rules.
Integration by Substitution — the chain rule in reverse, the first and most important technique for handling non-standard integrals.
Definite Integration - Introduction — the area-under-the-curve side of integration, where the answer is a number instead of a family of functions.
Fundamental Theorem of Calculus — the theorem that links antiderivatives to areas, making this whole story one story.
Differential Equations - Introduction — the kind of problem where integration is the main tool, because the unknown is a function whose derivative you already know something about.