Partial Differential Equations

This chapter is an introduction to PDEs: a tour of the vocabulary, the standard classifications, and the canonical examples that show up across physics and engineering. We won’t be deriving these equations from physical principles, and we won’t learn how to solve them; the analytical and numerical solution machinery is taken up later in the course. The goal here is recognition: by the end you should know what a PDE is, see the difference between first- and second-order forms, recognize the canonical PDEs by sight, and feel comfortable with the terminology of PDE classification (order, linear vs nonlinear, stationary vs instationary).

A differential equation is a relation between an unknown function and its own derivatives — solving it means finding the function (or family of functions) that makes the relation hold everywhere on a given domain. Differential equations split into two families based on how many independent variables the unknown depends on: the 1D case (one independent variable) gives an ordinary differential equation, and the higher-dimensional case (several independent variables) gives a partial differential equation. The same 1D-vs-higher-D split we used for differentiation in the calculus chapters carries straight over to differential equations.

An ordinary differential equation (ODE) is an equation for an unknown function $x = x(t)$ of a single variable, written in terms of $t$, $x$, and the ordinary derivatives $\dot{x}, \ddot{x}, \ldots$.

Notation: same letter for the function and its value. The phrasing $x = x(t)$ uses the same letter $x$ for two related things: on the left, the value the function produces; on the right, applied to $t$, the function itself. Read it as a typing declaration: “$x$ is a quantity that depends on $t$, and I’ll call that dependence $x$ too.” In strict math style we would invent a separate function name and write something like $x = f(t)$, with $f$ as the rule and $x$ as the output. Physics writing reuses the letter because $x$ usually names a single physical quantity (a position, an amount, a temperature) that is the function of time, so there’s no need to label the rule and its output differently.

Notation: dots over the variable. The dot form $\dot{x}, \ddot{x}, \ldots$ is Newton’s notation for derivatives, conventionally used when the independent variable is time, so $\dot{x} = \tfrac{dx}{dt}$ and $\ddot{x} = \tfrac{d^2x}{dt^2}$. It coexists with the prime notation $f'(x)$ (Lagrange’s notation) we’ve used elsewhere and the fraction notation $\tfrac{dx}{dt}$ (Leibniz’s notation); all three express the same operation. Newton’s dots dominate in physics and ODE writing precisely because $t$ is so often the variable being differentiated against.

Two familiar examples:

Exponential decay: $\dot{x}(t) = -k\, x(t)$ with $k > 0$. The rate of change is proportional to the current amount. Models radioactive decay, Newton’s law of cooling (with $x$ the temperature difference to the surroundings), and drug clearance from the bloodstream. Solution: $x(t) = x_0\, e^{-kt}$, where $x_0 = x(0)$.
Simple harmonic oscillator: $m\, \ddot{x}(t) = -k\, x(t)$, which is Newton’s second law for a mass $m$ on a spring with stiffness $k > 0$. Solutions oscillate sinusoidally about the equilibrium position with angular frequency $\sqrt{k/m}$.
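
As a quick sanity check, both examples can be tested numerically. The sketch below is only an illustration: the parameter values, the helper names `derivative` and `second_derivative`, and the particular oscillator amplitudes are assumptions chosen for the example, not part of the text. It approximates the derivatives by central differences and evaluates the residual of each ODE, which should come out close to zero.

```python
import math

def derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

# Illustrative parameter values (assumptions, not taken from the text).
K_DECAY, X0 = 0.7, 3.0            # decay rate k and initial amount x_0
MASS, K_SPRING = 2.0, 8.0         # mass m and spring constant k
OMEGA = math.sqrt(K_SPRING / MASS)  # angular frequency sqrt(k/m)

def decay(t):
    """Candidate solution of x' = -k x."""
    return X0 * math.exp(-K_DECAY * t)

def oscillator(t):
    """Candidate solution of m x'' = -k x (amplitudes chosen arbitrarily)."""
    return 1.5 * math.cos(OMEGA * t) + 0.5 * math.sin(OMEGA * t)

def decay_residual(t):
    """x'(t) + k x(t); zero for an exact solution."""
    return derivative(decay, t) + K_DECAY * decay(t)

def oscillator_residual(t):
    """m x''(t) + k x(t); zero for an exact solution."""
    return MASS * second_derivative(oscillator, t) + K_SPRING * oscillator(t)

for t in (0.0, 0.5, 2.0):
    print(t, decay_residual(t), oscillator_residual(t))
```

The printed residuals are not exactly zero because finite differences carry a small truncation and rounding error, but they are tiny compared with the individual terms of each equation.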

A partial differential equation (PDE) is an equation for an unknown function $u$ of several variables — typically space and time, e.g. $u(x, t)$ , $u(x, y)$ , or $u(x, y, z, t)$ — written in terms of those variables, $u$ , and partial derivatives of $u$ .

Most quantities of physical interest depend on more than one axis (a temperature distribution across a room, a wave traveling through a medium, a fluid filling a domain), so PDEs are the natural language for almost every model in physics and engineering that varies over space and time, or over more than one spatial axis.

Many of the most important equations in physics and engineering are PDEs:

Maxwell’s equations: relate the electric and magnetic fields to the charge and current that generate them; the foundation of classical electromagnetism.
Navier–Stokes equations: describe how the velocity and pressure of an incompressible viscous fluid (a fluid whose density stays constant and which resists internal shearing, like water at low speeds) evolve in space and time.
Schrödinger equation: governs how the wave function of a quantum particle (the complex-valued amplitude whose squared modulus gives the probability density of finding the particle at each point) evolves over time.
Heat equation: describes the diffusion of heat (or any analogous quantity) through a medium.

First-order PDEs

For the rest of the chapter we narrow the setting to functions of two variables — written either $u = u(x, y)$ when both inputs are spatial, or $u = u(x, t)$ when one is space and the other is time. This is enough to develop every concept we need without dragging in the notational baggage of $n$ -variable expressions.

One vocabulary item splits every classification we are about to make: a PDE is called linear when the unknown and its derivatives appear in the most controlled possible way.

A linear PDE is an equation in which the unknown $u$ and its partial derivatives appear only to the first power, are not multiplied by one another, and do not sit inside any nonlinear function (such as $\sin$ or $\exp$).

A PDE that doesn’t satisfy this is called nonlinear. So $u_x + u_y = 0$ is linear; $u\, u_x + u_y = 0$ is not (the coefficient $u$ in front of $u_x$ depends on the unknown), and $\sin(u) = u_x$ is not (the unknown sits inside a nonlinear function). Every classification below — first-order or second-order — splits along this axis.
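
One practical way to see the difference is the superposition test: for a linear equation, applying the left-hand side to a combination $\alpha u + \beta v$ gives the same result as combining the left-hand sides of $u$ and $v$ separately. The sketch below is a minimal numerical illustration of that idea; the test functions `u` and `v`, the weights, the evaluation point, and all helper names are assumptions made up for the example. Derivatives are approximated by central differences.

```python
import math

def d_dx(u, x, y, h=1e-5):
    """Central-difference estimate of u_x at (x, y)."""
    return (u(x + h, y) - u(x - h, y)) / (2 * h)

def d_dy(u, x, y, h=1e-5):
    """Central-difference estimate of u_y at (x, y)."""
    return (u(x, y + h) - u(x, y - h)) / (2 * h)

def linear_op(u, x, y):
    """Left-hand side of the linear equation u_x + u_y = 0."""
    return d_dx(u, x, y) + d_dy(u, x, y)

def nonlinear_op(u, x, y):
    """Left-hand side of the nonlinear equation u u_x + u_y = 0."""
    return u(x, y) * d_dx(u, x, y) + d_dy(u, x, y)

def superposition_gap(op, u, v, alpha, beta, x, y):
    """|op[alpha u + beta v] - (alpha op[u] + beta op[v])| at (x, y)."""
    def combo(s, t):
        return alpha * u(s, t) + beta * v(s, t)
    separate = alpha * op(u, x, y) + beta * op(v, x, y)
    return abs(op(combo, x, y) - separate)

# Two arbitrary smooth test functions (hypothetical, for illustration only).
def u(x, y):
    return math.sin(x) * y

def v(x, y):
    return x * x + math.exp(y)

point = (0.4, 0.9)
gap_linear = superposition_gap(linear_op, u, v, 2.0, -3.0, *point)
gap_nonlinear = superposition_gap(nonlinear_op, u, v, 2.0, -3.0, *point)
print("linear gap:   ", gap_linear)
print("nonlinear gap:", gap_nonlinear)
```

The gap for $u_x + u_y$ is at rounding-error level, while the gap for $u\, u_x + u_y$ is of order one: the product $u\, u_x$ does not distribute over sums.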

Linear with constant coefficients

The simplest first-order PDE has the most restrictive possible shape:

A 1st-order linear PDE with constant coefficients for $u = u(x, y)$ has the form

$$a\, u_x + b\, u_y = f(x, y)$$

with $a, b \in \mathbb{R}$ and a given function $f = f(x, y)$ .

Three qualifiers stack to make this the simplest case, and each name picks out one structural feature:

1st order — only first partial derivatives appear ( $u_x$ , $u_y$ ); no second derivatives like $u_{xx}$ or $u_{xy}$ , no third, etc.
linear — in the sense just defined: $u$ and its derivatives appear only to the first power, not inside any nonlinear function. No $u^2$ , no $u \cdot u_x$ , no $\sin(u)$ , no $e^{u_x}$ .
constant coefficients — the multipliers $a$ and $b$ are real numbers, not functions of $x$ , $y$ , or $u$ .

The subscripts read as partial derivatives: $u_x = \tfrac{\partial u}{\partial x}$ , $u_y = \tfrac{\partial u}{\partial y}$ , $u_{xx} = \tfrac{\partial^2 u}{\partial x^2}$ , and so on — the same subscript shorthand used earlier for $f_{x_i}$ , now specialized to a function of two variables.

Two functions, two roles

The equation involves two functions, $u$ and $f$ , and they play very different parts:

$u(x, y)$ is the unknown — the function we are solving for. We don’t know it yet; the PDE is the constraint we’ll use to pin it down.
$f(x, y)$ is given input data — a known function whose value depends on $x$ and $y$ but not on $u$ . Even though $f$ varies from point to point, it does so independently of whatever solution $u$ turns out to be. That’s what’s meant by saying $f$ is constant with respect to $u$ : not that $f$ is a flat constant function, but that $f$ contains no $u$ .

Because $f$ contains no $u$ , the equation is genuinely linear in $u$ : doubling $u$ doubles every term on the left, while $f$ on the right is unaffected. When $f$ vanishes everywhere — the equation reduces to $a u_x + b u_y = 0$ — the PDE is homogeneous; otherwise it is inhomogeneous, and $f$ is what supplies the inhomogeneity.
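
The homogeneous/inhomogeneous distinction can be made concrete with a small numerical check. The sketch below is an illustration under stated assumptions: the coefficient values, the sample functions `u_hom` and `u_part`, and the right-hand side `f` are all hypothetical choices, picked so that the answers are known in advance. It shows that `u_hom` satisfies the homogeneous equation, and that adding any multiple of it to a solution of the inhomogeneous equation gives another solution of the inhomogeneous equation.

```python
import math

A, B = 2.0, 3.0  # constant coefficients a and b (assumed values)

def lhs(u, x, y, h=1e-5):
    """a u_x + b u_y at (x, y), using central differences."""
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return A * u_x + B * u_y

def f(x, y):
    """Given right-hand side; it depends on (x, y) but contains no u."""
    return A * y + B * x

def u_hom(x, y):
    """Satisfies the homogeneous equation a u_x + b u_y = 0."""
    return math.sin(B * x - A * y)

def u_part(x, y):
    """Satisfies the inhomogeneous equation for the f chosen above."""
    return x * y

def u_general(x, y):
    """Inhomogeneous solution plus a multiple of a homogeneous one."""
    return u_part(x, y) + 5.0 * u_hom(x, y)

points = [(0.0, 0.0), (0.3, -1.2), (1.5, 0.7)]
hom_residuals = [lhs(u_hom, x, y) for x, y in points]
inhom_residuals = [lhs(u_general, x, y) - f(x, y) for x, y in points]
print(hom_residuals)
print(inhom_residuals)
```

Both lists contain values near zero, which is the linear structure at work: the homogeneous part contributes nothing to the left-hand side, so $f$ alone fixes the inhomogeneity.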

In settings where the PDE describes some quantity being added to or removed from $u$ , $f$ goes by a more evocative name: the source term where it’s positive (driving $u$ up) and the sink term where it’s negative (draining $u$ down) — collectively, the source/sink term of the equation.

Example — traffic density

Imagine cars on a single-lane highway, no overtaking. At each point $x$ along the road and each time $t$ , let $u(x, t)$ denote the traffic density — cars per unit length around that point. Suppose every car drives at the same constant velocity $v > 0$ , and entrances and exits along the highway add or remove cars at a net rate $f(x, t)$ — positive where cars are entering, negative where they’re leaving.

Tracking how the density evolves leads to the PDE

$$u_t + v\, u_x = f(x, t).$$

Reading term by term:

$u_t$: the rate at which density changes at a fixed location $x$ as time passes.
$v\, u_x$: the spatial slope of density times the cars’ velocity. If density increases in the driving direction ($u_x > 0$) and the column of cars rolls forward at speed $v$, the sparser traffic from behind moves into each fixed location, so the density there drops at rate $v\, u_x$.
$f(x, t)$: the source/sink term, the net rate of cars entering minus leaving at $(x, t)$.

With no entrances or exits ( $f \equiv 0$ ), the equation reduces to $u_t = -v\, u_x$ : the density at a fixed point changes only because the spatial profile is sliding past — every car carries its bit of density along at speed $v$ . Adding $f$ on the right injects external forcing on top of that picture: the profile still slides, but now it gains cars wherever $f > 0$ and loses them wherever $f < 0$ .

The traffic equation is given, not derived: we haven’t shown why density satisfies this particular PDE (that needs a conservation-law argument), and we haven’t solved it either. The example only demonstrates that the abstract template $a\, u_x + b\, u_y = f(x, y)$ has real physical content: with $t$ in the role of $y$, we have $a = v$, $b = 1$, the unknown $u$ is the density, and $f$ is the net inflow rate. Deriving such PDEs from physical principles and learning solution techniques are taken up later in the course.
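
Without solving anything, the sliding-profile reading of the case $f \equiv 0$ can still be checked numerically. The sketch below is a minimal illustration with made-up numbers: the speed `V`, the bump-shaped `initial_density`, and the sample points are assumptions, not data from the text. It takes a density of the form $u(x, t) = g(x - v t)$, i.e. a fixed profile $g$ shifted by $v t$, and evaluates $u_t + v\, u_x$ by central differences.

```python
import math

V = 25.0  # constant car speed v (assumed value, e.g. metres per second)

def initial_density(x):
    """Hypothetical density profile at t = 0: a smooth bump of traffic."""
    return 40.0 * math.exp(-(x / 200.0) ** 2)

def density(x, t):
    """The initial profile shifted to the right by V * t."""
    return initial_density(x - V * t)

def residual(x, t, h=1e-3):
    """u_t + v u_x at (x, t); zero when there are no entrances or exits."""
    u_t = (density(x, t + h) - density(x, t - h)) / (2 * h)
    u_x = (density(x + h, t) - density(x - h, t)) / (2 * h)
    return u_t + V * u_x

for x, t in [(150.0, 3.0), (0.0, 0.0), (400.0, 10.0)]:
    print((x, t), density(x, t), residual(x, t))
```

The residuals are close to zero, and the density seen at position $x$ and time $t$ equals the density that was at $x - v\,\Delta t$ a time $\Delta t$ earlier, which is the "every car carries its bit of density along" picture in numbers.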

The constant-coefficient case is the most restrictive 1st-order PDE we’ll meet. Loosening the conditions on the coefficients takes us through two further classes, both still “first-order” since only first partial derivatives appear.

Linear with variable coefficients

The next step up in generality keeps the linearity but allows the coefficients to depend on the location $(x, y)$ :

A linear 1st-order PDE in 2D has the form

$$a(x, y)\, u_x + b(x, y)\, u_y = 0$$

with coefficient functions $a, b$ that may depend on $(x, y)$ but not on $u$ .

Compared to the constant-coefficient class above, $a$ and $b$ are now functions of $(x, y)$ instead of fixed real numbers. Note that the form shown is homogeneous (the right-hand side is zero), so the constant-coefficient equations it contains as a special sub-case are the homogeneous ones, $a\, u_x + b\, u_y = 0$, where those functions happen to be flat.

$$x\, u_x + y\, u_y = 0$$

Reading off: $a(x, y) = x$ and $b(x, y) = y$ . They vary with position but contain no $u$ , so the equation is linear — and because they are not just constants, it sits in this class rather than the constant-coefficient one.
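
To make the example equation tangible, one can test candidate functions against it. The sketch below is an illustration only: the candidate `u_solution` (a function that is unchanged when $(x, y)$ is rescaled), the counterexample `u_other`, and the sample points are assumptions chosen for the demonstration, and no solution method is implied. It evaluates $x\, u_x + y\, u_y$ by central differences away from the origin.

```python
def u_solution(x, y):
    """Unchanged under (x, y) -> (s x, s y); expected to satisfy the PDE."""
    return (x * x - y * y) / (x * x + y * y)

def u_other(x, y):
    """A simple function that does not satisfy the PDE."""
    return x + y

def residual(u, x, y, h=1e-5):
    """x u_x + y u_y at (x, y), with position-dependent coefficients."""
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return x * u_x + y * u_y

points = [(1.0, 2.0), (-0.5, 0.8), (3.0, -1.5)]
for x, y in points:
    print((x, y), residual(u_solution, x, y), residual(u_other, x, y))
```

For `u_solution` the residual is near zero at every sample point; for `u_other` it equals $x + y$, so that function fails the equation wherever $x + y \neq 0$.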

Quasilinear

One more relaxation: let the coefficients (and the right-hand side) depend on $u$ itself.

A quasilinear 1st-order PDE in 2D has the form

$$a(x, y, u(x, y))\, u_x + b(x, y, u(x, y))\, u_y = c(x, y, u(x, y))$$

with $a, b, c$ differentiable functions that may depend on $u$ in addition to $(x, y)$ .

The third argument to each of $a$, $b$, $c$ is the value of the unknown function $u$ at the point $(x, y)$ being considered. Read $a(x, y, u(x, y))$ as: “$a$ takes three real arguments, and we plug in $u(x, y)$ for the third.” When the dependence is clear from context, this is often shortened to $a(x, y, u)$, the same letter-reuse trick as the $x = x(t)$ convention from earlier, where $u$ doubles as a name for both the function and its value.

The “quasi” in quasilinear signals that the equation is still linear in the derivatives $u_x$ and $u_y$ — each shows up only to the first power, not inside any nonlinear function of itself — but $u$ is now allowed to appear nonlinearly through the coefficients or the right-hand side. So $u\, u_x + u_y = 0$ is quasilinear (the coefficient of $u_x$ is $u$ , which is fine: $u_x$ still appears linearly), while $u_x^2 + u_y = 0$ is genuinely nonlinear (because $u_x$ shows up squared).

$$y\, u_x - x\, u_y = x\, u^2$$

Reading off: $a(x, y, u) = y$ , $b(x, y, u) = -x$ , $c(x, y, u) = x\, u^2$ . The coefficients in front of the derivatives — namely $y$ and $-x$ — don’t actually depend on $u$ here, but the right-hand side does, and that $u^2$ is enough to push the equation out of the linear class. The derivatives $u_x$ and $u_y$ still appear linearly, so the “quasi” condition holds.
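
The effect of the $u^2$ term can be seen numerically. The sketch below is a hedged illustration: the candidate function `u` is one particular function that happens to satisfy the example equation (stated here without derivation), and the helper names and sample points are assumptions. It checks the residual $y\, u_x - x\, u_y - x\, u^2$ by central differences, then repeats the check for $2u$ to show that scaling a solution no longer gives a solution.

```python
def u(x, y):
    """One particular function satisfying y u_x - x u_y = x u^2.

    The denominator y + x^2 + y^2 + 1 is positive for all real (x, y).
    """
    return 1.0 / (y + x * x + y * y + 1.0)

def doubled(x, y):
    """Twice the function above; used to probe the loss of linearity."""
    return 2.0 * u(x, y)

def residual(w, x, y, h=1e-5):
    """y w_x - x w_y - x w^2 at (x, y), using central differences."""
    w_x = (w(x + h, y) - w(x - h, y)) / (2 * h)
    w_y = (w(x, y + h) - w(x, y - h)) / (2 * h)
    return y * w_x - x * w_y - x * w(x, y) ** 2

points = [(1.0, 0.5), (-2.0, 1.0), (0.3, -0.5)]
for x, y in points:
    print((x, y), residual(u, x, y), residual(doubled, x, y))
```

The residual of `u` is near zero, while the residual of `doubled` is not: doubling $u$ doubles the left-hand side but quadruples $x\, u^2$, which is exactly what "not linear" means for this equation.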

Second-order PDEs

Most of the canonical PDEs of physics are second-order: Laplace, heat, wave, and Schrödinger all live here. This is the order at which the subject becomes the workhorse of mathematical modeling. The linear / nonlinear split from before still applies, and at second order it does most of its real work, since the canonical physics equations divide cleanly into a linear group (Laplace, heat, wave, Schrödinger, and Maxwell’s equations, which form a first-order system) and a nonlinear one (Burgers, Navier–Stokes).

Stationary vs instationary

Second-order PDEs are usually classified along a second axis: how the solution relates to time.

A PDE is stationary if its solution does not depend on time — it is determined entirely by the spatial variables.

A PDE is instationary (also called non-stationary or time-dependent) if its solution evolves over time — equivalently, time derivatives like $u_t$ or $u_{tt}$ appear in the equation.

Stationary equations describe a static state of the world (a steady-state temperature distribution, an electric field at equilibrium); instationary ones describe how the world changes in time.

Canonical linear examples

Four linear second-order PDEs do most of the work in physics. Each one is built around the Laplace operator $\Delta$, which sums the pure second-order spatial partial derivatives of $u$ (in two spatial dimensions, $\Delta u = u_{xx} + u_{yy}$), combined with some time-derivative term (or none at all):

Laplace equation: $-\Delta u = 0$. Stationary; describes a physical potential at equilibrium, like a gravitational potential in empty space or an electrostatic potential in a charge-free region. With no time derivatives in sight, the equation says nothing about how the field evolves; it only constrains the static shape the field can have.
Heat equation: $u_t - c^2\, \Delta u = 0$. The Laplace equation extended with a first time derivative $u_t$, with the Laplacian scaled by a positive constant $c^2$ (the thermal diffusivity, controlling how fast heat spreads). Describes heat transfer or any analogous diffusion process. Instationary.
Wave equation: $u_{tt} - c^2\, \Delta u = 0$. Same structural shape as the heat equation, but with a second time derivative $u_{tt}$ in place of the first. Here $c$ is the wave speed. Models the propagation of waves through a medium, such as sound in a fluid. Instationary.
Schrödinger equation: $i\, u_t + \Delta u = 0$, where $i = \sqrt{-1}$ is the imaginary unit. A first time derivative again, but multiplied by $i$: a small notational change with sweeping consequences for the kind of behavior the equation supports. Describes how the wave function of a free quantum particle evolves in time (written here in scaled units and without a potential term). Instationary.

The structural pattern is hard to miss: every one of these equations is the Laplace operator $\Delta u$ — the “spatial curvature” term, measuring how $u$ bends in space — coupled to time through some derivative expression on $u$ (or none, for Laplace itself). Reading the heat equation as “the Laplace equation with a time-evolution term added” is exactly right, and the wave and Schrödinger equations are variations on the same theme: same spatial machinery, different time coupling.
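
The shared pattern can be checked on one sample function per equation. The sketch below is an illustration under stated assumptions: the constants `C` and `K`, the four sample functions (in one or two spatial variables), and the finite-difference helpers `d1` and `d2` are all choices made for this example, and the sample functions are stated without derivation. Each residual is the left-hand side of the corresponding equation and should be close to zero.

```python
import cmath
import math

C = 1.3   # the constant c in the heat and wave equations (assumed value)
K = 2.0   # wavenumber of the sample functions (assumed value)
H = 1e-3  # finite-difference step

def d1(f, args, i, h=H):
    """Central first difference of f with respect to its i-th argument."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def d2(f, args, i, h=H):
    """Central second difference of f with respect to its i-th argument."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - 2 * f(*args) + f(*dn)) / (h * h)

def laplace_u(x, y):
    return x * x - y * y

def heat_u(x, t):
    return math.exp(-C * C * K * K * t) * math.sin(K * x)

def wave_u(x, t):
    return math.sin(K * (x - C * t))

def schrodinger_u(x, t):
    return cmath.exp(1j * (K * x - K * K * t))

def laplace_residual(x, y):
    """-(u_xx + u_yy)"""
    return -(d2(laplace_u, (x, y), 0) + d2(laplace_u, (x, y), 1))

def heat_residual(x, t):
    """u_t - c^2 u_xx"""
    return d1(heat_u, (x, t), 1) - C * C * d2(heat_u, (x, t), 0)

def wave_residual(x, t):
    """u_tt - c^2 u_xx"""
    return d2(wave_u, (x, t), 1) - C * C * d2(wave_u, (x, t), 0)

def schrodinger_residual(x, t):
    """i u_t + u_xx (complex-valued)"""
    return 1j * d1(schrodinger_u, (x, t), 1) + d2(schrodinger_u, (x, t), 0)

print(laplace_residual(0.7, -0.4))
print(heat_residual(0.5, 0.2))
print(wave_residual(0.5, 0.2))
print(abs(schrodinger_residual(0.5, 0.2)))
```

The spatial part is the same call to `d2` in every residual; only the time-derivative term changes from one equation to the next, which mirrors the "same spatial machinery, different time coupling" reading.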

Maxwell’s equations — a linear system

A fifth canonical linear example doesn’t fit the single-Laplacian-plus-time template above. Maxwell’s equations are a system of four coupled linear PDEs relating the electric field $\boldsymbol{E}$ and magnetic field $\boldsymbol{B}$ to the charge density $\rho$ and current density $\boldsymbol{j}$ that produce them:

$$\begin{aligned} \nabla \cdot \boldsymbol{E} &= \rho/\varepsilon_0, & \nabla \cdot \boldsymbol{B} &= 0, \\ \nabla \times \boldsymbol{E} &= -\partial_t \boldsymbol{B}, & \nabla \times \boldsymbol{B} &= \mu_0\, \boldsymbol{j} + \mu_0 \varepsilon_0\, \partial_t \boldsymbol{E}. \end{aligned}$$

Here $\varepsilon_0$ is the permittivity and $\mu_0$ the permeability of free space, both physical constants. Together these four equations describe how electric and magnetic fields are produced by charges and currents and how they evolve in response, and the whole edifice of classical electromagnetism rests on them.

Canonical nonlinear examples

Two nonlinear 2nd-order PDEs round out the canonical set.

Burgers’ equation

$u_t + u\, u_x = 0$ is a model conservation law that appears in simplified descriptions of gas dynamics and traffic flow, and it is the textbook example of how a nonlinear transport term produces shocks. Strictly speaking the form shown is first-order: it’s the quasilinear 1st-order pattern from earlier with $a = u$, $b = 1$, $c = 0$ (and $y \to t$). The genuinely 2nd-order viscous Burgers’ equation adds a diffusion term with viscosity $\nu > 0$:

$$u_t + u\, u_x = \nu\, u_{xx}.$$

Instationary.
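
The balance between the nonlinear transport term and the diffusion term can be observed on a single smooth profile. The sketch below is an illustration under stated assumptions: the travelling-wave profile $u = c - A \tanh\bigl(A (x - c t) / (2\nu)\bigr)$ is a standard example quoted here without derivation, and the values of `C_SPEED`, `AMP`, and `NU` as well as the function names are made up for the demonstration. Derivatives are approximated by central differences.

```python
import math

C_SPEED = 1.0  # speed c of the travelling profile (assumed value)
AMP = 0.5      # amplitude A of the smoothed step (assumed value)
NU = 0.1       # viscosity nu (assumed value)

def u(x, t):
    """Smoothed step moving to the right at speed C_SPEED."""
    return C_SPEED - AMP * math.tanh(AMP * (x - C_SPEED * t) / (2 * NU))

def transport_part(x, t, h=1e-3):
    """u_t + u u_x, the left-hand side shared by both Burgers forms."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + u(x, t) * u_x

def diffusion_part(x, t, h=1e-3):
    """nu u_xx, the right-hand side of the viscous form."""
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return NU * u_xx

def viscous_residual(x, t):
    """u_t + u u_x - nu u_xx; near zero if the profile fits the viscous form."""
    return transport_part(x, t) - diffusion_part(x, t)

for x, t in [(1.3, 1.0), (0.2, 0.5), (3.0, 2.0)]:
    print((x, t), transport_part(x, t), viscous_residual(x, t))
```

For this profile the transport part alone is visibly nonzero, so it does not satisfy the first-order form, but subtracting $\nu\, u_{xx}$ brings the residual down to discretization-error level: the steepening from $u\, u_x$ is held in check by diffusion.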

Navier–Stokes equations

$$\rho\bigl(\partial_t \boldsymbol{u} + (\boldsymbol{u} \cdot \nabla) \boldsymbol{u}\bigr) = -\nabla p + \mu\, \Delta \boldsymbol{u}, \qquad \nabla \cdot \boldsymbol{u} = 0.$$

A nonlinear system describing the instationary flow of an incompressible viscous fluid. The unknowns are the velocity field $\boldsymbol{u}$, which at each time maps the flow domain $D \subseteq \mathbb{R}^3$ to $\mathbb{R}^3$, and the pressure $p$; the density $\rho$ and the dynamic viscosity $\mu$ are given constants. The first equation is Newton’s second law applied to a fluid parcel; the second ($\nabla \cdot \boldsymbol{u} = 0$) enforces incompressibility. These equations sit at the heart of fluid-dynamics modeling, and famously, whether smooth solutions always exist in three dimensions is one of the seven Millennium Prize Problems.

Other classification schemes for second-order PDEs and methods for actually solving them — analytically and numerically — are taken up later in the course.