Integration

Integration is the reverse of differentiation. If $f'(x) = 2x$ , then $f$ could be $x^2$ — we ran the derivative backward. The same reverse operation also computes areas, total accumulated quantities, and the cumulative effect of any rate of change. This page builds up the toolkit from scratch.

The basic question

Differentiation asks: given $f$ , what is its rate of change $f'$ ?

Integration asks the inverse: given a rate $f$ , what function has that rate as its derivative? Such a function is called an antiderivative (or primitive) of $f$ , written $F$ , with $F' = f$ .

The notation for this hunt is

\int f(x) \, dx = F(x) + C

The integral sign $\int$ is a stretched “S” (for summa), $f(x)$ is the integrand, and $dx$ has two roles: it marks the variable being integrated over, and also stands for an infinitesimal slice along that variable — the role made precise by the Riemann-sum picture further down (and the reason substitution lets us replace $dx$ with $du / g'(x)$ ). The whole expression reads “the antiderivative of $f$ with respect to $x$ .”

Why the $+ C$ ?

Differentiation throws away constants: $\frac{d}{dx}[x^2] = 2x$ , but also $\frac{d}{dx}[x^2 + 7] = 2x$ , and $\frac{d}{dx}[x^2 - 100] = 2x$ . So when we run the derivative backward from $2x$ , we cannot recover which constant was originally there.

Every integration leaves behind a constant of integration $C$ : the answer is not a single function but a whole family $\{F + C : C \in \mathbb{R}\}$ , all sharing the same derivative.
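A quick numerical check can make the lost constant concrete. The sketch below is illustrative only: the helper name `central_difference` and the sample point `x0` are assumptions, not part of the text above. It estimates the slope of three members of the family $x^2 + C$ at the same point.

```python
def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Three members of the family x^2 + C (the constants used in the text).
family = {
    "x^2": lambda x: x**2,
    "x^2 + 7": lambda x: x**2 + 7,
    "x^2 - 100": lambda x: x**2 - 100,
}

x0 = 3.0  # arbitrary sample point, chosen for illustration
slopes = {name: central_difference(f, x0) for name, f in family.items()}

for name, slope in slopes.items():
    print(f"{name:>10}: slope at x = {x0} is about {slope:.6f}")
```

All three slopes agree with $2x$ at the sample point, so the derivative alone cannot tell the three functions apart.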

Building the rules

The power rule, by reversal

Differentiation: $\frac{d}{dx}[x^n] = n x^{n-1}$ — the power drops by one, and the old power comes out front as a coefficient.

To reverse it, we need to raise the power by one (so that differentiating brings it back down to $n$ ) and divide by the new power (to cancel the coefficient that the derivative would pull out):

\int x^n \, dx = \frac{x^{n+1}}{n+1} + C \quad (n \neq -1)

Verify by differentiating:

\frac{d}{dx}\!\left[\frac{x^{n+1}}{n+1}\right] = \frac{(n+1) x^n}{n+1} = x^n

The case $n = -1$ fails because we would divide by zero. That case has its own answer, $\int \frac{1}{x} \, dx = \ln \lvert x \rvert + C$ (see the table below): the natural logarithm fills exactly the gap the power rule leaves.

Linearity

Two easy rules carry over from differentiation directly:

Constants pass through: $\int c \cdot f(x) \, dx = c \int f(x) \, dx$ .
Sums split apart: $\int (f + g) \, dx = \int f \, dx + \int g \, dx$ .

Together, these mean integration is linear. You can break any integral of a sum into a sum of integrals, and pull constants out front. Both follow immediately from the corresponding differentiation rules.
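The power rule and linearity together are enough to integrate any polynomial term by term. Here is a minimal sketch, assuming a polynomial is stored as a list of coefficients where `coeffs[k]` multiplies $x^k$. The function names are hypothetical, and the constant of integration is fixed at $C = 0$.

```python
def integrate_polynomial(coeffs):
    """Antiderivative of c0 + c1*x + c2*x^2 + ... with C = 0.

    Each term c*x^k becomes c*x^(k+1)/(k+1) (power rule), and the
    terms are handled one at a time (linearity). Only non-negative
    integer powers occur, so the excluded case n = -1 never arises.
    """
    return [0.0] + [c / (k + 1) for k, c in enumerate(coeffs)]

def differentiate_polynomial(coeffs):
    """Derivative of a polynomial in the same coefficient format."""
    return [k * c for k, c in enumerate(coeffs)][1:]

f = [5.0, 0.0, 3.0]              # 5 + 3x^2
F = integrate_polynomial(f)      # 5x + x^3
print(F)
print(differentiate_polynomial(F))  # round trip back to f
```

Differentiating the result recovers the original coefficients, which is the same check the text applies to the power rule itself.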

Putting it together

The fundamental integration rules are:

Rule	Formula
Constant	$\int c \, dx = c x + C$
Power	$\int x^n \, dx = \frac{x^{n+1}}{n+1} + C \;\; (n \neq -1)$
Constant multiple	$\int c \cdot f \, dx = c \int f \, dx$
Sum	$\int (f + g) \, dx = \int f \, dx + \int g \, dx$
Substitution	$\int f(g(x)) \, g'(x) \, dx = F(g(x)) + C, \;\; F' = f$

The substitution rule is more subtle than the others — we’ll build it up in its own section.

Common Integrals

Each row pairs a function $f$ with its antiderivative. To verify any row, differentiate the right column — you should get back the left.

$f(x)$	$\int f(x) \, dx$
$0$	$C$
$x^n$ , $n \neq -1$	$\dfrac{x^{n+1}}{n+1} + C$
$\dfrac{1}{x}$	$\ln \lvert x \rvert + C$
$e^x$	$e^x + C$
$e^{ax}$	$\dfrac{1}{a} e^{ax} + C$
$\ln x$	$x \ln x - x + C$
$\sin x$	$-\cos x + C$
$\cos x$	$\sin x + C$
$\dfrac{1}{\cos^2 x}$	$\tan x + C$

Two oddities worth noting:

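$\int \frac{1}{x} \, dx = \ln \lvert x \rvert + C$ uses absolute value because $\ln$ is undefined for negative numbers, but $\frac{1}{x}$ has antiderivatives on both sides of zero. For $x < 0$ , the chain rule gives $\frac{d}{dx} \ln(-x) = \frac{-1}{-x} = \frac{1}{x}$ , so $\ln \lvert x \rvert$ works on each side.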
$\int e^{ax} \, dx = \frac{1}{a} e^{ax} + C$ (for $a \neq 0$ ): the $\frac{1}{a}$ appears because differentiating $e^{ax}$ pulls an $a$ out (chain rule), which we have to un-multiply by dividing.
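The "differentiate the right column" check can also be run numerically. The sketch below is illustrative: the helper `central_difference`, the sample point `x0`, and the choice $a = 2$ are assumptions. It compares the numerical derivative of each antiderivative with its integrand for several rows of the table, including $\ln \lvert x \rvert$ at a negative point.

```python
import math

def central_difference(F, x, h=1e-6):
    """Approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

a = 2.0  # illustrative constant for the e^(ax) row

# (label, integrand f, claimed antiderivative F with C = 0)
rows = [
    ("1/x",       lambda x: 1 / x,              lambda x: math.log(abs(x))),
    ("e^(ax)",    lambda x: math.exp(a * x),    lambda x: math.exp(a * x) / a),
    ("ln x",      lambda x: math.log(x),        lambda x: x * math.log(x) - x),
    ("sin x",     lambda x: math.sin(x),        lambda x: -math.cos(x)),
    ("1/cos^2 x", lambda x: 1 / math.cos(x)**2, lambda x: math.tan(x)),
]

x0 = 0.7  # arbitrary point where every row is defined
errors = {name: abs(central_difference(F, x0) - f(x0)) for name, f, F in rows}

# ln|x| also works on the negative side, where ln x alone is undefined.
neg_error = abs(central_difference(lambda x: math.log(abs(x)), -0.7) - 1 / (-0.7))

for name, err in errors.items():
    print(f"{name:>10}: |F'(x0) - f(x0)| = {err:.2e}")
print(f"1/x at x = -0.7: error = {neg_error:.2e}")
```

Each error is at the level of rounding noise, which is what a correct table row should produce.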

Trigonometric Integrals

Memorize the derivatives of sine and cosine, then everything else follows by reversal:

\frac{d}{dx} \sin x = \cos x \implies \int \cos x \, dx = \sin x + C

\frac{d}{dx} \cos x = -\sin x \implies \int \sin x \, dx = -\cos x + C

Be careful with the sign in the second one. The integrals of the trigonometric functions are:

$f(x)$	$\int f(x) \, dx$
$\sin x$	$-\cos x + C$
$\cos x$	$\sin x + C$
$\dfrac{1}{\cos^2 x}$	$\tan x + C$
$\dfrac{1}{\sin^2 x}$	$-\cot x + C$
$\sec x \tan x$	$\sec x + C$
$\csc x \cot x$	$-\csc x + C$

The integrands producing inverse trigonometric antiderivatives are worth recognizing too:

\int \frac{1}{\sqrt{1 - x^2}} \, dx = \arcsin x + C \qquad \int \frac{1}{1 + x^2} \, dx = \arctan x + C

Substitution: reversing the chain rule

The chain rule says

\frac{d}{dx}[F(g(x))] = F'(g(x)) \cdot g'(x)

Read right to left, this is an integral statement:

\int F'(g(x)) \cdot g'(x) \, dx = F(g(x)) + C

In words: if your integrand is a composite function $F'(g(x))$ multiplied by the derivative of its inner part $g'(x)$ , the antiderivative is $F(g(x))$ — no extra work required.

The mechanical technique for using this is called u-substitution (or just substitution): let $u = g(x)$ , compute $du = g'(x) \, dx$ , and rewrite the integral entirely in terms of $u$ :

\int F'(g(x)) \cdot g'(x) \, dx = \int F'(u) \, du = F(u) + C = F(g(x)) + C

The diagnostic: what to look for

Substitution is not a general technique — it works only when a specific pattern is present. Before trying it, run this two-part check:

Find a composite. Does your integrand contain something of the form $f(\text{inner})$ , where “inner” isn’t just $x$ by itself? Call that inner expression $g(x)$ .
Find the inner derivative. Is $g'(x)$ (or a constant multiple of it) also present as a factor in the integrand?

If both answers are yes, set $u = g(x)$ and write $du = g'(x) \, dx$ . The factor $g'(x) \, dx$ in the integrand gets replaced by $du$ , and the integral collapses into one in $u$ alone. If the second answer is no — the derivative is completely absent and not just off by a constant — substitution will not simplify the integral.

Why does the derivative have to be there? When you set $u = g(x)$ , you need to replace every piece of $x$ in the integrand, including the $dx$ . The relation $du = g'(x) \, dx$ shows that $dx = \frac{du}{g'(x)}$ — so if $g'(x)$ doesn’t cancel with something already in the integrand, it ends up stuck in the denominator and the substitution makes things worse, not better.

For example, $\int e^{x^2} \, dx$ looks like a candidate with inner function $g(x) = x^2$ , but $g'(x) = 2x$ is nowhere in the integrand. Setting $u = x^2$ gives $dx = \frac{du}{2x}$ , and the integral becomes $\int \frac{e^u}{2x} \, du$ — which still contains $x$ , leaving the substitution stuck. This particular integral has no elementary closed form.

Fixing an off-by-constant factor

If the derivative is present but multiplied by the wrong constant, fix it by multiplying and dividing. For example, $\int x^2 e^{x^3} \, dx$ : the inner function is $x^3$ with derivative $3x^2$ , but only $x^2$ appears in the integrand — off by a factor of 3.

\int x^2 e^{x^3} \, dx = \frac{1}{3} \int 3x^2 e^{x^3} \, dx = \frac{1}{3} \int e^u \, du = \frac{1}{3} e^{x^3} + C

Constants only. This multiply-and-divide trick works only when the missing factor is a constant. Linearity lets you pull constants in and out of integrals freely — but not anything involving $x$ . Writing $\int e^{x^2} \, dx = \frac{1}{2x} \int 2x \, e^{x^2} \, dx$ is wrong, because $\frac{1}{2x}$ is not a constant and cannot leave the integrand. If the missing factor depends on the variable of integration, substitution doesn’t apply — try a different technique.
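The difference between the legal constant fix and the illegal variable fix shows up as soon as the candidate answers are differentiated. The sketch below is a numerical illustration under assumed names (`central_difference`, `x0`): it tests $\frac{1}{3} e^{x^3}$ against $x^2 e^{x^3}$, and the result of the wrong manoeuvre, $\frac{1}{2x} e^{x^2}$, against $e^{x^2}$.

```python
import math

def central_difference(F, x, h=1e-6):
    """Approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

x0 = 1.0  # arbitrary sample point

# Legal: constant factor 1/3 pulled out of the integral.
good = central_difference(lambda x: math.exp(x**3) / 3, x0)
target_good = x0**2 * math.exp(x0**3)

# Illegal: the x-dependent factor 1/(2x) pulled out of the integral.
bad = central_difference(lambda x: math.exp(x**2) / (2 * x), x0)
target_bad = math.exp(x0**2)

print(f"constant fix: derivative {good:.6f}, integrand {target_good:.6f}")
print(f"variable fix: derivative {bad:.6f}, integrand {target_bad:.6f}")
```

The first pair matches. The second does not, because the product rule adds an extra term when $\frac{1}{2x}$ is differentiated.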

Examples

$\int 2x \cos(x^2) \, dx$ . The inner function of $\cos(\cdot)$ is $x^2$ , and its derivative $2x$ sits right next to it as a factor.

Let $u = x^2$ , so $du = 2x \, dx$ :

\int 2x \cos(x^2) \, dx = \int \cos u \, du = \sin u + C = \sin(x^2) + C

$\int e^{-x} \, dx$ . The inner function is $-x$ , whose derivative is the constant $-1$ . A constant factor can always be supplied by multiplying and dividing, so the pattern check passes.

Let $u = -x$ , so $du = -dx$ , meaning $dx = -du$ :

\int e^{-x} \, dx = \int e^u \cdot (-du) = -e^u + C = -e^{-x} + C

$\int \frac{2x}{1 + x^2} \, dx$ . The denominator $1 + x^2$ is the inner function, and its derivative $2x$ is the numerator.

Let $u = 1 + x^2$ , so $du = 2x \, dx$ :

\int \frac{2x}{1 + x^2} \, dx = \int \frac{1}{u} \, du = \ln \lvert u \rvert + C = \ln(1 + x^2) + C

The absolute value can be dropped in the last step because $1 + x^2 > 0$ for every $x$ .

Integration by parts: reversing the product rule

The product rule says $(fg)' = f'g + fg'$ . Integrating both sides:

fg = \int f'g \, dx + \int fg' \, dx

Rearranging to solve for one of the two integrals gives the integration by parts formula. For differentiable functions $f$ and $g$ :

\int f(x) g'(x) \, dx = f(x) g(x) - \int f'(x) g(x) \, dx

In short form with $u = f(x)$ and $v = g(x)$ (so $du = f' \, dx$ and $dv = g' \, dx$ ):

\int u \, dv = u v - \int v \, du

The idea: trade an integral you can’t do for one you can. You pick $u$ and $dv$ from the integrand, hoping that $du$ (the derivative of $u$ ) and $v = \int dv$ (the antiderivative of $dv$ ) make the resulting integral $\int v \, du$ easier than what you started with.

Rule of thumb: pick $u$ to be the part that gets simpler when differentiated — polynomials, $\ln x$ , $\arctan x$ . Pick $dv$ to be the part you can easily integrate — exponentials, $\sin x$ , $\cos x$ .

LIATE mnemonic. When in doubt about which factor should be $u$ , try them in this order — pick the first type that appears in the integrand:

Letter	Function type	Example
L	Logarithm	$\ln x$
I	Inverse trig	$\arctan x$ , $\arcsin x$
A	Algebraic (polynomials, roots)	$x^2$ , $\sqrt{x}$
T	Trigonometric	$\sin x$ , $\cos x$
E	Exponential	$e^x$ , $a^x$

The order is roughly “what gets simpler fastest under differentiation” → “what stays the same or grows worse.” For $\int x \ln x \, dx$ , L beats A, so $u = \ln x$ . For $\int x e^x \, dx$ , A beats E, so $u = x$ . The mnemonic is a guide, not a law — when it fails, try the other split.

$\int x e^x \, dx$ . Pick $u = x$ (so $du = dx$ , simpler) and $dv = e^x \, dx$ (so $v = e^x$ , easy to integrate):

\int x e^x \, dx = x e^x - \int e^x \, dx = x e^x - e^x + C = (x - 1) e^x + C

The polynomial factor $x$ disappeared after one differentiation — that’s the whole reason this works.

$\int \ln x \, dx$ — looks impossible, but the trick is to take $dv = dx$ and let the entire $\ln x$ be $u$ :

$u = \ln x$ (so $du = \frac{1}{x} \, dx$ ), $dv = dx$ (so $v = x$ ):

\int \ln x \, dx = x \ln x - \int x \cdot \frac{1}{x} \, dx = x \ln x - \int 1 \, dx = x \ln x - x + C

If the resulting integral isn’t easier, you picked $u$ and $dv$ the wrong way around — try swapping them.
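As one more worked instance of the LIATE choice (L before A), here is $\int x \ln x \, dx$ carried through, assuming $x > 0$ so that $\ln x$ is defined. Take $u = \ln x$ (so $du = \frac{1}{x} \, dx$ ) and $dv = x \, dx$ (so $v = \frac{x^2}{2}$ ):

```latex
\int x \ln x \, dx
  = \frac{x^2}{2} \ln x - \int \frac{x^2}{2} \cdot \frac{1}{x} \, dx
  = \frac{x^2}{2} \ln x - \int \frac{x}{2} \, dx
  = \frac{x^2}{2} \ln x - \frac{x^2}{4} + C
```

Differentiating the result gives $x \ln x + \frac{x}{2} - \frac{x}{2} = x \ln x$ , which confirms the answer.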

The Definite Integral

So far we’ve computed antiderivatives — families of functions defined up to a constant. The definite integral asks for a single number: the amount accumulated by $f$ between two endpoints $a$ and $b$ .

Geometrically, this number is the signed area between the graph of $f$ and the $x$ -axis from $x = a$ to $x = b$ — area above the axis counted positive, area below counted negative:

\int_a^b f(x) \, dx = \text{(area above axis)} - \text{(area below axis)}

Riemann sums: where the area definition comes from

The “signed area” idea is made precise by chopping $[a, b]$ into $n$ small slices of width $\Delta x = (b-a)/n$ , building a rectangle of height $f(x_i^*)$ over each slice (for some sample point $x_i^*$ in the slice — see remark below for what the asterisk means), and adding up their areas:

S_n = \sum_{i=1}^{n} f(x_i^*) \, \Delta x

This is a Riemann sum. As $n \to \infty$ , the rectangles shrink and the sum converges to a single number — the definite integral of $f$ over $[a, b]$ .

The definite integral of a function $f$ from $a$ to $b$ is the limit of Riemann sums as the partition is refined:

\int_a^b f(x) \, dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*) \, \Delta x

with $\Delta x = (b-a)/n$ and $x_i^*$ any sample point in the $i$ -th subinterval. The number $\int_a^b f(x) \, dx$ measures the signed area between the graph of $f$ and the $x$ -axis from $x = a$ to $x = b$ .

\sum_{i=1}^{n} f(x_i^*) \, \Delta x \;\xrightarrow{\,n \to \infty\,}\; \int_a^b f(x) \, dx

What the asterisk on $x_i^*$ means

The $i$ -th slice is an interval, not a single point — so to evaluate $f$ for the rectangle’s height, we have to pick one point from inside that interval. The asterisk is a free choice marker: it says “some point we picked from the $i$ -th subinterval,” without committing to which one. Common choices are the left endpoint, the right endpoint, or the midpoint — but any point in the slice is fair game. The whole reason this notation works is the theorem that as $n \to \infty$ , which point you pick stops mattering — every valid choice converges to the same limit. The $*$ acknowledges the ambiguity up front, then the limit erases it.
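The claim that the choice of sample point stops mattering can be watched numerically. This is a minimal sketch, assuming equal-width slices and the illustrative integrand $f(x) = x^2$ on $[0, 1]$ , whose area is $\frac{1}{3}$ . The function name `riemann_sum` and its `rule` parameter are hypothetical.

```python
def riemann_sum(f, a, b, n, rule="midpoint"):
    """Sum of f(x_i*) * dx over n equal slices of [a, b].

    `rule` picks the sample point x_i* inside each slice:
    its left endpoint, its midpoint, or its right endpoint.
    """
    dx = (b - a) / n
    offset = {"left": 0.0, "midpoint": 0.5, "right": 1.0}[rule]
    return sum(f(a + (i + offset) * dx) for i in range(n)) * dx

f = lambda x: x**2  # illustrative integrand

for n in (10, 100, 1000):
    left = riemann_sum(f, 0.0, 1.0, n, "left")
    mid = riemann_sum(f, 0.0, 1.0, n, "midpoint")
    right = riemann_sum(f, 0.0, 1.0, n, "right")
    print(f"n = {n:>4}: left {left:.6f}  midpoint {mid:.6f}  right {right:.6f}")
```

For this increasing integrand the left sums approach the limit from below and the right sums from above, and all three columns close in on the same number as $n$ grows.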

This sum-based definition is what “area” actually means — but computing the limit directly is brutal even for simple integrands. The miracle is that we never have to: the limit equals the difference of any antiderivative at the endpoints.

Fundamental theorem of calculus (often abbreviated FTC). If $f$ is continuous on $[a, b]$ and $F$ is any antiderivative of $f$ , then:

\int_a^b f(x) \, dx = F(b) - F(a)

The constant $C$ cancels in the difference $F(b) - F(a)$ and does not appear.

We don’t need to add up infinitely many tiny rectangles — we just find any antiderivative and subtract. The link between the two pictures (limit of sums on one side, antiderivative on the other) is what makes calculus the toolkit it is: a question about accumulation answered by a question about rates.

$\int_0^\pi \sin x \, dx$ . An antiderivative of $\sin x$ is $-\cos x$ :

\int_0^\pi \sin x \, dx = [-\cos x]_0^\pi = -\cos \pi - (-\cos 0) = -(-1) - (-1) = 2
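The agreement between the two pictures can be checked for this example. The sketch below is an illustration, not a proof: it uses right-endpoint sums and an assumed helper name `right_riemann_sum`, and compares the rectangle sums for $\sin x$ on $[0, \pi]$ with the value $F(\pi) - F(0)$ from the theorem.

```python
import math

def right_riemann_sum(f, a, b, n):
    """Riemann sum over n equal slices, sampling each right endpoint."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

F = lambda x: -math.cos(x)        # an antiderivative of sin x
ftc_value = F(math.pi) - F(0.0)   # value predicted by the theorem

errors = [
    abs(right_riemann_sum(math.sin, 0.0, math.pi, n) - ftc_value)
    for n in (10, 100, 1000)
]
for n, err in zip((10, 100, 1000), errors):
    print(f"n = {n:>4}: |Riemann sum - (F(pi) - F(0))| = {err:.2e}")
```

The gap shrinks as the slices get thinner, while the antiderivative route reaches the answer in one subtraction.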

The same rules — linearity, substitution, integration by parts — all apply to definite integrals. The only extra adjustment for substitution: when you change variable from $x$ to $u = g(x)$ , the limits transform too:

\int_a^b f(g(x)) \, g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du

After substituting you can either update the limits to $g(a)$ and $g(b)$ (faster) or substitute $x$ back in at the end before evaluating (safer when first learning).
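The limit-changing rule can be sanity-checked on one of the earlier examples. This is a sketch under assumptions: the interval $[0, 2]$ is picked for illustration, `midpoint_sum` is a hypothetical helper, and the integrals are approximated numerically. It evaluates $\int_0^2 2x \cos(x^2) \, dx$ in the original variable and as $\int_0^4 \cos u \, du$ after $u = x^2$ .

```python
import math

def midpoint_sum(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b] with n equal slices."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

g = lambda x: x**2  # inner function, u = g(x)

# Original variable: limits are x = 0 and x = 2.
x_form = midpoint_sum(lambda x: 2 * x * math.cos(x**2), 0.0, 2.0)

# New variable: limits become u = g(0) = 0 and u = g(2) = 4.
u_form = midpoint_sum(math.cos, g(0.0), g(2.0))

# Antiderivative route: sin(u) evaluated at the transformed limits.
exact = math.sin(g(2.0)) - math.sin(g(0.0))

print(f"x-form {x_form:.6f}   u-form {u_form:.6f}   sin(4) - sin(0) {exact:.6f}")
```

All three numbers agree to within the accuracy of the numerical sums. Keeping the old limits $0$ and $2$ on the $u$ integral would give a different, wrong value.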

Arc length: integrating speed

A definite integral doesn’t have to be the area under a graph. The same Riemann-sum machinery — chop a parameter interval, sum a quantity proportional to the slice width, take the limit — measures any quantity that accumulates linearly along a path. The first non-area example is arc length: the total length traced out by a parametrized curve.

Consider a curve in the plane (or in $\mathbb{R}^n$ ) given by a parametrization

\gamma : [a, b] \to \mathbb{R}^n, \qquad \gamma(t) = (x_1(t), \dots, x_n(t))

To approximate the length of $\gamma$ , partition $[a, b]$ into $m$ slices (using $m$ for the slice count, since $n$ already denotes the dimension) with sample points $t_0 = a < t_1 < \dots < t_m = b$ and slice widths $\Delta t = t_i - t_{i-1}$ , and add up the straight-line chord lengths between consecutive samples:

L_m = \sum_{i=1}^{m} \|\gamma(t_i) - \gamma(t_{i-1})\|

For a smooth curve, $\gamma(t_i) - \gamma(t_{i-1}) \approx \gamma'(t_{i-1}) \, \Delta t$ over a small slice (this is the linearization from the derivative), so each chord length is approximately $\|\gamma'(t_{i-1})\| \, \Delta t$ . The sum becomes a Riemann sum for $\|\gamma'(t)\|$ , and the limit is a definite integral.

Before writing it down, it helps to name what $\|\gamma'(t)\|$ actually is. Think of $t$ as time and $\gamma(t)$ as the position of a moving particle that traces out the curve as $t$ runs from $a$ to $b$ . Then:

$\gamma'(t)$ is the velocity vector at time $t$ — direction-and-rate of motion.
$\|\gamma'(t)\|$ is the speed at time $t$ — a single non-negative number, “how fast the particle is moving” with the direction stripped away. (This is the standard physics distinction between velocity and speed.)
“Speed of the parametrization” just means: how fast the parameter $t$ pushes the point along the curve.

What does “trace the same curve faster” actually mean?

“Faster” means changing the parameter, not scaling the components. Take the unit circle $\gamma(t) = (\cos t, \sin t)$ on $[0, 2\pi]$ . The reparametrization $\tilde\gamma(s) = \gamma(2s) = (\cos 2s, \sin 2s)$ on $[0, \pi]$ traces the same circle — but $s$ only needs to reach $\pi$ for one full lap. By the chain rule $\tilde\gamma'(s) = 2\gamma'(2s)$ , so the speed $\|\tilde\gamma'(s)\| = 2$ is doubled at every point. The image (the unit circle) is unchanged.

What this is not: scaling the components, like $(2\cos t, 2\sin t)$ . That’s a circle of radius 2 — a different curve (circumference $4\pi$ ), not the same circle traced faster. Scaling components scales the curve itself; reparametrizing the variable scales the speed at which it’s traced.

With that vocabulary, the arc length integral is just the total-distance formula from physics in disguise. The infinitesimal piece $\|\gamma'(t)\| \, dt$ reads as “speed $\times$ time slice = distance covered in that slice” — and integrating over $[a, b]$ adds up all those tiny distances:

The arc length of a continuously differentiable (i.e., $C^1$ ) curve $\gamma : [a, b] \to \mathbb{R}^n$ is

L(\gamma) = \int_a^b \|\gamma'(t)\| \, dt

For a 2D curve $\gamma(t) = (x(t), y(t))$ this expands to

L(\gamma) = \int_a^b \sqrt{x'(t)^2 + y'(t)^2} \, dt

For the graph of a function $y = f(x)$ on $[a, b]$ — a special case parametrized by $\gamma(x) = (x, f(x))$ — it becomes

L = \int_a^b \sqrt{1 + f'(x)^2} \, dx

Two points worth noticing:

The integrand is a scalar. Even though $\gamma$ is vector-valued, $\|\gamma'(t)\|$ collapses to a single non-negative number at each $t$ — the instantaneous speed. Arc length is a 1D integral of that scalar.
The answer doesn’t depend on the parametrization. The same physical curve traced twice as fast still has the same length: doubling the speed of $\gamma$ halves the parameter interval needed to cover it, and $\|\gamma'\|$ doubles, so the integral is unchanged. Arc length is a property of the image (the trace) of $\gamma$ , not of the parametrization.
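Parametrization independence can be tested on the unit circle and its double-speed reparametrization. The sketch below is illustrative only: the helpers `speed` and `arc_length` are hypothetical names, the speed is estimated by a central difference rather than computed exactly, and the integral is a midpoint sum.

```python
import math

def speed(curve, t, h=1e-6):
    """Estimate the speed ||curve'(t)|| of a 2D curve by a central difference."""
    (x1, y1), (x0, y0) = curve(t + h), curve(t - h)
    return math.hypot(x1 - x0, y1 - y0) / (2 * h)

def arc_length(curve, a, b, n=2000):
    """Midpoint Riemann sum of the speed over [a, b]."""
    dt = (b - a) / n
    return sum(speed(curve, a + (i + 0.5) * dt) for i in range(n)) * dt

gamma = lambda t: (math.cos(t), math.sin(t))  # unit circle on [0, 2*pi]
gamma_fast = lambda s: gamma(2 * s)           # same circle on [0, pi]

length_slow = arc_length(gamma, 0.0, 2 * math.pi)
length_fast = arc_length(gamma_fast, 0.0, math.pi)

print(f"speed of gamma at t = 0.3:      {speed(gamma, 0.3):.6f}")
print(f"speed of gamma_fast at s = 0.3: {speed(gamma_fast, 0.3):.6f}")
print(f"lengths: {length_slow:.6f} and {length_fast:.6f}")
```

The speed doubles and the parameter interval halves, so both integrals land on the same length, the circumference of the unit circle.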

Circle of radius $R$ . Parametrize $\gamma(t) = (R \cos t, R \sin t)$ for $t \in [0, 2\pi]$ :

\gamma'(t) = (-R \sin t, R \cos t), \qquad \|\gamma'(t)\| = \sqrt{R^2 \sin^2 t + R^2 \cos^2 t} = R

so the speed is constant, and

L = \int_0^{2\pi} R \, dt = 2\pi R

— the familiar circumference, recovered from the integral.
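The chord-sum approximation that motivated the integral can be run directly on this circle. This is a minimal sketch, assuming equally spaced parameter values and an illustrative radius $R = 2$ . The function name `chord_length` is hypothetical.

```python
import math

def chord_length(curve, a, b, n):
    """Total length of the polyline through n + 1 equally spaced samples."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    points = [curve(t) for t in ts]
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

R = 2.0  # illustrative radius
circle = lambda t: (R * math.cos(t), R * math.sin(t))
exact = 2 * math.pi * R

approximations = {
    n: chord_length(circle, 0.0, 2 * math.pi, n) for n in (10, 100, 1000)
}
for n, value in approximations.items():
    print(f"n = {n:>4}: chord sum {value:.6f}   (2*pi*R = {exact:.6f})")
```

Each chord cuts a corner of the circle, so the polyline lengths sit just below $2\pi R$ and climb toward it as the number of chords grows.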

Arc length is the simplest of a wider family of path integrals — quantities that accumulate along a curve. In every case the recipe is the same: identify what’s being accumulated per unit parameter, multiply by $dt$ , and integrate. This same pattern is what generalizes to surface area as a double integral over a parametrized surface, where the analog of “speed” $\|\gamma'(t)\|$ becomes the magnitude of the cross product of the two tangent vectors $\|\boldsymbol{\phi}_u \times \boldsymbol{\phi}_v\|$ — but that’s for the curves chapter to develop.
