Partial Differential Equations

This chapter is an introduction to PDEs — a tour of the vocabulary, the standard classifications, and the canonical examples that show up across physics and engineering. We won’t be deriving these equations from physical principles, and we won’t learn how to solve them; both the analytical and numerical solution machinery is taken up later in the course. The goal here is recognition: by the end you should know what a PDE is, see the difference between first- and second-order forms, recognize the canonical PDEs by sight, and feel comfortable with the terminology of PDE classification — order, linear vs nonlinear, stationary vs instationary.

A differential equation is a relation between an unknown function and its own derivatives — solving it means finding the function (or family of functions) that makes the relation hold everywhere on a given domain. Differential equations split into two families based on how many independent variables the unknown depends on: the 1D case (one independent variable) gives an ordinary differential equation, and the higher-dimensional case (several independent variables) gives a partial differential equation. The same 1D-vs-higher-D split we used for differentiation in the calculus chapters carries straight over to differential equations.

An ordinary differential equation (ODE) is an equation for an unknown function $x = x(t)$ of a single variable, written in terms of $t$, $x$, and the ordinary derivatives $\dot{x}, \ddot{x}, \ldots$

Notation: same letter for the function and its value. The phrasing $x = x(t)$ uses the same letter $x$ for two related things: on the left, the value the function produces; on the right, applied to $t$, the function itself. Read it as a typing declaration: “$x$ is a quantity that depends on $t$, and I’ll call that dependence $x$ too.” In strict math style we would invent a separate function name and write something like $x = f(t)$, with $f$ as the rule and $x$ as the output. Physics writing reuses the letter because $x$ usually names a single physical quantity (a position, an amount, a temperature) that is the function of time, so there’s no need to label the rule and its output differently.

Notation: dots over the variable. The dot form $\dot{x}, \ddot{x}, \ldots$ is Newton’s notation for derivatives, conventionally used when the independent variable is time, so $\dot{x} = \tfrac{dx}{dt}$ and $\ddot{x} = \tfrac{d^2x}{dt^2}$. It coexists with the prime notation $f'(x)$ (Lagrange’s notation) we’ve used elsewhere and the fraction notation $\tfrac{dx}{dt}$ (Leibniz’s notation); all three express the same operation. Newton’s dots dominate in physics and ODE writing precisely because $t$ is so often the variable being differentiated against.

Two familiar examples:

  • Exponential decay: $\dot{x}(t) = -k\, x(t)$. The rate of change is proportional to the current amount. Models radioactive decay, Newton’s law of cooling, drug clearance from the bloodstream. Solution: $x(t) = x_0\, e^{-kt}$.
  • Simple harmonic oscillator: $m\, \ddot{x}(t) = -k\, x(t)$, which is Newton’s second law for a mass on a spring. Solutions oscillate sinusoidally about the equilibrium position.
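Both claims can be spot-checked numerically without any solution theory. The following is a minimal sketch, assuming illustrative parameter values (`k`, `x0`, `m`, `k_spring` are made up for the demonstration) and using finite differences in place of exact derivatives; it plugs a candidate function into each ODE and looks at the residual.

```python
import math

def derivative(f, t, h=1e-5):
    """Central finite-difference approximation of the first derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def second_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

# Exponential decay: x'(t) = -k x(t), candidate x(t) = x0 * exp(-k t).
k, x0 = 0.7, 3.0  # hypothetical decay rate and initial amount

def decay(t):
    return x0 * math.exp(-k * t)

# Harmonic oscillator: m x''(t) = -k_spring x(t), candidate cos(omega t).
m, k_spring = 2.0, 8.0  # hypothetical mass and spring constant
omega = math.sqrt(k_spring / m)

def oscillator(t):
    return math.cos(omega * t)

sample_times = [0.0, 0.5, 1.0, 2.0]

# Residual = left-hand side minus right-hand side; near zero for a solution.
max_decay_residual = max(
    abs(derivative(decay, t) + k * decay(t)) for t in sample_times
)
max_oscillator_residual = max(
    abs(m * second_derivative(oscillator, t) + k_spring * oscillator(t))
    for t in sample_times
)

print("decay residual:", max_decay_residual)
print("oscillator residual:", max_oscillator_residual)
```

The residuals are not exactly zero because the derivatives are approximated, but they should sit at the level of finite-difference error rather than at the size of the terms themselves.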

A partial differential equation (PDE) is an equation for an unknown function $u$ of several variables (typically space and time, e.g. $u(x, t)$, $u(x, y)$, or $u(x, y, z, t)$), written in terms of those variables, $u$, and partial derivatives of $u$.

Most quantities of physical interest depend on more than one axis (a temperature distribution across a room, a wave traveling through a medium, a fluid filling a domain), so PDEs are the natural language for almost every model in physics and engineering that varies over space and time, or over more than one spatial axis.

Many of the most important equations in physics and engineering are PDEs:

  • Maxwell’s equations — relate the electric and magnetic fields to the charge and current that generate them; the foundation of classical electromagnetism.
  • Navier–Stokes equations: describe how the velocity and pressure of an incompressible viscous fluid (a fluid that doesn’t change density and resists internal shearing, like water at low speeds) evolve in space and time.
  • Schrödinger equation — governs how the wave function of a quantum particle (the complex-valued amplitude whose squared modulus gives the probability of finding the particle at each point) evolves over time.
  • Heat equation — describes the diffusion of heat (or any analogous quantity) through a medium.

First-order PDEs

For the rest of the chapter we narrow the setting to functions of two variables, written either $u = u(x, y)$ when both inputs are spatial, or $u = u(x, t)$ when one is space and the other is time. This is enough to develop every concept we need without dragging in the notational baggage of $n$-variable expressions.

One vocabulary item splits every classification we are about to make: a PDE is called linear when the unknown and its derivatives appear in the most controlled possible way.

A linear PDE is an equation in which the unknown $u$ and its partial derivatives appear only to the first power, are not multiplied by one another, and do not sit inside any nonlinear function (like $\sin$, $\exp$, etc.).

A PDE that doesn’t satisfy this is called nonlinear. So $u_x + u_y = 0$ is linear; $u\, u_x + u_y = 0$ is not (the coefficient $u$ in front of $u_x$ depends on the unknown), and $\sin(u) = u_x$ is not (the unknown sits inside a nonlinear function). Every classification below, first-order or second-order, splits along this axis.
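One practical consequence of linearity is superposition: sums and multiples of solutions of a homogeneous linear PDE are again solutions, while for a nonlinear PDE this generally fails. The sketch below illustrates this for the two examples just named. The candidate functions (`u1`, `u2`, `w`) and the sample point are chosen for illustration only, and derivatives are approximated by finite differences.

```python
import math

def partial_x(u, x, y, h=1e-5):
    """Central finite-difference approximation of u_x."""
    return (u(x + h, y) - u(x - h, y)) / (2 * h)

def partial_y(u, x, y, h=1e-5):
    """Central finite-difference approximation of u_y."""
    return (u(x, y + h) - u(x, y - h)) / (2 * h)

def linear_residual(u, x, y):
    """Left-hand side of the linear PDE u_x + u_y = 0."""
    return partial_x(u, x, y) + partial_y(u, x, y)

def nonlinear_residual(u, x, y):
    """Left-hand side of the nonlinear PDE u u_x + u_y = 0."""
    return u(x, y) * partial_x(u, x, y) + partial_y(u, x, y)

# Any differentiable function of (x - y) solves u_x + u_y = 0.
def u1(x, y):
    return math.sin(x - y)

def u2(x, y):
    return (x - y) ** 2

def u_sum(x, y):
    return u1(x, y) + 3.0 * u2(x, y)  # a linear combination of solutions

# w(x, y) = x / (1 + y) solves u u_x + u_y = 0 (for y > -1).
def w(x, y):
    return x / (1.0 + y)

def w_doubled(x, y):
    return 2.0 * w(x, y)

x0, y0 = 0.8, 0.3  # arbitrary sample point
res_linear_sum = linear_residual(u_sum, x0, y0)
res_w = nonlinear_residual(w, x0, y0)
res_w_doubled = nonlinear_residual(w_doubled, x0, y0)

print("linear PDE, combination of solutions:", res_linear_sum)
print("nonlinear PDE, w:", res_w)
print("nonlinear PDE, 2w:", res_w_doubled)
```

The first two residuals should be negligible, while the third is clearly nonzero: doubling a solution of the nonlinear equation does not give another solution.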

Linear with constant coefficients

The simplest first-order PDE has the most restrictive possible shape:

A 1st-order linear PDE with constant coefficients for $u = u(x, y)$ has the form

$$a\, u_x + b\, u_y = f(x, y)$$

with $a, b \in \mathbb{R}$ and a given function $f = f(x, y)$.

Three qualifiers stack to make this the simplest case, and each name picks out one structural feature:

  • 1st order: only first partial derivatives appear ($u_x$, $u_y$); no second derivatives like $u_{xx}$ or $u_{xy}$, no third, etc.
  • linear, in the sense just defined: $u$ and its derivatives appear only to the first power, not inside any nonlinear function. No $u^2$, no $u \cdot u_x$, no $\sin(u)$, no $e^{u_x}$.
  • constant coefficients: the multipliers $a$ and $b$ are real numbers, not functions of $x$, $y$, or $u$.

The subscripts read as partial derivatives: $u_x = \tfrac{\partial u}{\partial x}$, $u_y = \tfrac{\partial u}{\partial y}$, $u_{xx} = \tfrac{\partial^2 u}{\partial x^2}$, and so on. This is the same subscript shorthand used earlier for $f_{x_i}$, now specialized to a function of two variables.

Two functions, two roles

The equation involves two functions, $u$ and $f$, and they play very different parts:

  • $u(x, y)$ is the unknown: the function we are solving for. We don’t know it yet; the PDE is the constraint we’ll use to pin it down.
  • $f(x, y)$ is given input data: a known function whose value depends on $x$ and $y$ but not on $u$. Even though $f$ varies from point to point, it does so independently of whatever solution $u$ turns out to be. That’s what’s meant by saying $f$ is constant with respect to $u$: not that $f$ is a flat constant function, but that $f$ contains no $u$.

Because $f$ contains no $u$, the equation is genuinely linear in $u$: doubling $u$ doubles every term on the left, while $f$ on the right is unaffected. When $f$ vanishes everywhere, so that the equation reduces to $a\, u_x + b\, u_y = 0$, the PDE is homogeneous; otherwise it is inhomogeneous, and $f$ is what supplies the inhomogeneity.

In settings where the PDE describes some quantity being added to or removed from $u$, $f$ goes by a more evocative name: the source term where it’s positive (driving $u$ up) and the sink term where it’s negative (draining $u$ down). Collectively, it is the source/sink term of the equation.

Example — traffic density

Imagine cars on a single-lane highway, no overtaking. At each point $x$ along the road and each time $t$, let $u(x, t)$ denote the traffic density: cars per unit length around that point. Suppose every car drives at the same constant velocity $v > 0$, and entrances and exits along the highway add or remove cars at a net rate $f(x, t)$, positive where cars are entering and negative where they’re leaving.

Tracking how the density evolves leads to the PDE

$$u_t + v\, u_x = f(x, t).$$

Reading term by term:

  • $u_t$: the rate at which density changes at a fixed location $x$ as time passes.
  • $v\, u_x$: the spatial slope of density times the cars’ velocity. If density increases in the driving direction ($u_x > 0$) and the column of cars rolls forward at speed $v$, the sparser traffic from behind slides into any fixed location, so the density there drops at rate $v\, u_x$.
  • $f(x, t)$: the source/sink term, the net rate of cars entering minus leaving at $(x, t)$.

With no entrances or exits ($f \equiv 0$), the equation reduces to $u_t = -v\, u_x$: the density at a fixed point changes only because the spatial profile is sliding past, with every car carrying its bit of density along at speed $v$. Adding $f$ on the right injects external forcing on top of that picture: the profile still slides, but now it gains cars wherever $f > 0$ and loses them wherever $f < 0$.
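The sliding-profile picture can be checked numerically. The sketch below assumes a Gaussian bump as the initial density profile and a constant net inflow rate `s0` (both hypothetical choices made only for this check), and verifies by finite differences that a rigidly shifted profile satisfies the traffic PDE with and without the source term.

```python
import math

v = 2.0    # constant car velocity (hypothetical value)
s0 = 0.25  # constant net inflow rate (hypothetical source term f = s0)

def profile(s):
    """Assumed initial density profile: a smooth bump."""
    return math.exp(-s * s)

def density_free(x, t):
    """Profile sliding to the right at speed v, no entrances or exits."""
    return profile(x - v * t)

def density_forced(x, t):
    """Sliding profile plus cars accumulated from the constant inflow."""
    return profile(x - v * t) + s0 * t

def zero_source(x, t):
    return 0.0

def constant_source(x, t):
    return s0

def residual(u, f, x, t, h=1e-5):
    """u_t + v u_x - f at (x, t), using central finite differences."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + v * u_x - f(x, t)

points = [(0.0, 0.0), (0.7, 0.2), (1.5, 0.5), (-0.4, 1.0)]
max_free = max(abs(residual(density_free, zero_source, x, t)) for x, t in points)
max_forced = max(
    abs(residual(density_forced, constant_source, x, t)) for x, t in points
)

print("max residual, f = 0:", max_free)
print("max residual, f = s0:", max_forced)
```

This only confirms that these particular functions satisfy the equation; it is not a solution method.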

The traffic equation is given, not derived: we haven’t shown why density satisfies this particular PDE (that needs a conservation-law argument), and we haven’t solved it either. The example only demonstrates that the abstract template $a\, u_x + b\, u_y = f(x, y)$ has real physical content: here, with $t$ playing the role of $y$, we have $a = v$, $b = 1$, the unknown $u$ is the density, and $f$ is the net inflow rate. Deriving such PDEs from physical principles and learning solution techniques are taken up later in the course.

The constant-coefficient case is the most restrictive 1st-order PDE we’ll meet. Loosening the conditions on the coefficients takes us through two strictly more general classes — both still “first-order” since only first partial derivatives appear.

Linear with variable coefficients

The next step up in generality keeps the linearity but allows the coefficients to depend on the location $(x, y)$:

A linear 1st-order PDE in 2D has the form

$$a(x, y)\, u_x + b(x, y)\, u_y = 0$$

with coefficient functions $a, b$ that may depend on $(x, y)$ but not on $u$.

Compared to the constant-coefficient class above, the only relaxation is that $a$ and $b$ are now functions of $(x, y)$ instead of fixed real numbers. The constant-coefficient case is the special sub-case where those functions happen to be flat.

$$x\, u_x + y\, u_y = 0$$

Reading off: $a(x, y) = x$ and $b(x, y) = y$. They vary with position but contain no $u$, so the equation is linear. Because they are not just constants, it sits in this class rather than the constant-coefficient one.
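To see the variable coefficients at work, one can plug in a candidate function by hand. The following short check assumes the candidate $u(x, y) = y/x$ on the region $x \neq 0$ (chosen here purely for illustration; it is not the only function that works):

```latex
\begin{aligned}
u(x, y) &= \frac{y}{x}, \qquad
u_x = -\frac{y}{x^2}, \qquad
u_y = \frac{1}{x}, \\
x\, u_x + y\, u_y &= x \cdot \left(-\frac{y}{x^2}\right) + y \cdot \frac{1}{x}
  = -\frac{y}{x} + \frac{y}{x} = 0 .
\end{aligned}
```

The cancellation depends on the coefficient $x$ multiplying $u_x$ and the coefficient $y$ multiplying $u_y$ at the same point, which is exactly what “coefficients that depend on position” means.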

Quasilinear

One more relaxation: let the coefficients (and the right-hand side) depend on $u$ itself.

A quasilinear 1st-order PDE in 2D has the form

$$a(x, y, u(x, y))\, u_x + b(x, y, u(x, y))\, u_y = c(x, y, u(x, y))$$

with $a, b, c$ differentiable functions that may depend on $u$ in addition to $(x, y)$.

The third argument to each of $a$, $b$, $c$ is the value of the unknown function $u$ at the point $(x, y)$ being considered. Read $a(x, y, u(x, y))$ as: “$a$ takes three real arguments, and we plug in $u(x, y)$ for the third.” When the dependence is clear from context, this is often shortened to $a(x, y, u)$. This is the same letter-reuse trick as the $x = x(t)$ convention from earlier, where $u$ doubles as a name for both the function and its value.

The “quasi” in quasilinear signals that the equation is still linear in the derivatives $u_x$ and $u_y$ (each shows up only to the first power, not inside any nonlinear function of itself), but $u$ is now allowed to appear nonlinearly through the coefficients or the right-hand side. So $u\, u_x + u_y = 0$ is quasilinear (the coefficient of $u_x$ is $u$, which is fine: $u_x$ still appears linearly), while $u_x^2 + u_y = 0$ is genuinely nonlinear (because $u_x$ shows up squared).

$$y\, u_x - x\, u_y = x\, u^2$$

Reading off: $a(x, y, u) = y$, $b(x, y, u) = -x$, $c(x, y, u) = x\, u^2$. The coefficients in front of the derivatives, namely $y$ and $-x$, don’t actually depend on $u$ here, but the right-hand side does, and that $u^2$ is enough to push the equation out of the linear class. The derivatives $u_x$ and $u_y$ still appear linearly, so the “quasi” condition holds.
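The “three arguments, with $u(x, y)$ plugged in for the third” reading translates directly into code. The sketch below writes $a$, $b$, $c$ for the example as plain three-argument functions and evaluates the residual of the PDE for one candidate function. The candidate $u(x, y) = 1/(y + x^2 + y^2 + 2)$ is an assumption picked for this illustration (it can be checked by hand to satisfy the equation); derivatives are approximated by finite differences.

```python
def a(x, y, u):
    """Coefficient of u_x; happens not to use its third argument."""
    return y

def b(x, y, u):
    """Coefficient of u_y; happens not to use its third argument."""
    return -x

def c(x, y, u):
    """Right-hand side; depends nonlinearly on the value of u."""
    return x * u * u

def candidate(x, y):
    """Illustrative candidate solution (denominator is always positive)."""
    return 1.0 / (y + x * x + y * y + 2.0)

def quasilinear_residual(u, x, y, h=1e-5):
    """a u_x + b u_y - c, with the value u(x, y) passed as third argument."""
    u_val = u(x, y)
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return a(x, y, u_val) * u_x + b(x, y, u_val) * u_y - c(x, y, u_val)

points = [(0.5, 0.2), (-1.0, 0.7), (1.3, -0.4)]
max_residual = max(abs(quasilinear_residual(candidate, x, y)) for x, y in points)
print("max residual:", max_residual)
```

Note that the derivatives `u_x` and `u_y` enter the residual only through multiplication by a coefficient, which is the structural feature the word “quasilinear” refers to.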

Second-order PDEs

Most of the canonical PDEs of classical physics are second-order: Laplace, heat, wave, and Schrödinger all live here. This is the order at which the subject becomes the workhorse of mathematical modeling. The linear / nonlinear split from before still applies — and at second order it does most of its real work, since the canonical physics equations divide cleanly into a linear group (Laplace, heat, wave, Schrödinger, Maxwell) and a nonlinear one (Burgers, Navier–Stokes).

Stationary vs instationary

Second-order PDEs are usually classified along a second axis: how the solution relates to time.

A PDE is stationary if its solution does not depend on time — it is determined entirely by the spatial variables.

A PDE is instationary (also called non-stationary or time-dependent) if its solution evolves over time; equivalently, time derivatives like $u_t$ or $u_{tt}$ appear in the equation.

Stationary equations describe a static state of the world (a steady-state temperature distribution, an electric field at equilibrium); instationary ones describe how the world changes in time.

Canonical linear examples

Four linear second-order PDEs do most of the work in classical physics. Each one is built around the Laplace operator $\Delta$, which sums the pure second-order spatial partial derivatives of $u$ (in two space dimensions, $\Delta u = u_{xx} + u_{yy}$), combined with some time-derivative term (or none at all):

  • Laplace equation: $-\Delta u = 0$. Stationary; describes a physical potential at equilibrium, like a gravitational potential in empty space or the electrostatic potential in a charge-free region. With no time derivatives in sight, the equation says nothing about how the field evolves; it characterizes the static shape the potential must have to satisfy the equation everywhere.
  • Heat equation: $u_t - c^2\, \Delta u = 0$. The Laplace equation extended with a first time derivative $u_t$, with the Laplacian scaled by a positive constant $c^2$ (the thermal diffusivity, controlling how fast heat spreads). Describes heat transfer or any analogous diffusion process. Instationary.
  • Wave equation: $u_{tt} - c^2\, \Delta u = 0$. Same structural shape as the heat equation, but with a second time derivative $u_{tt}$ in place of the first. Here $c$ is the wave speed. Models the propagation of waves through a fluid or acoustic medium. Instationary.
  • Schrödinger equation: $i\, u_t + \Delta u = 0$, where $i = \sqrt{-1}$ is the imaginary unit. A first time derivative again, but multiplied by $i$: a small-looking change with sweeping consequences for the kind of behavior the equation supports. Describes how the wave function of a free quantum particle evolves in time (written here in rescaled units). Instationary.

The structural pattern is hard to miss: every one of these equations couples the Laplace operator term $\Delta u$ (the “spatial curvature” term, measuring how $u$ bends in space) to time through some derivative expression on $u$ (or none, for Laplace itself). Reading the heat equation as “the Laplace equation with a time-evolution term added” is exactly right, and the wave and Schrödinger equations are variations on the same theme: same spatial machinery, different time coupling.
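The shared structure shows up clearly when each equation is tested against a simple candidate function. The sketch below is a numerical spot check, not a solution method: it assumes one space dimension for the three time-dependent equations (so $\Delta u = u_{xx}$), two dimensions for Laplace, and hand-picked candidates and parameter values (`c`, `k`, and the sample point are illustrative assumptions).

```python
import cmath
import math

def d1(f, s, h=1e-5):
    """Central finite-difference first derivative of a one-variable function."""
    return (f(s + h) - f(s - h)) / (2 * h)

def d2(f, s, h=1e-4):
    """Central finite-difference second derivative of a one-variable function."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / (h * h)

c, k = 1.5, 2.0               # hypothetical constant c and wave number k
x0, y0, t0 = 0.4, 0.9, 0.3    # arbitrary sample point

# Laplace in 2D: u = x^2 - y^2, residual of -Delta u = 0.
def laplace_u(x, y):
    return x * x - y * y

laplace_res = -(d2(lambda x: laplace_u(x, y0), x0)
                + d2(lambda y: laplace_u(x0, y), y0))

# Heat: u = exp(-(c k)^2 t) sin(k x), residual of u_t - c^2 u_xx = 0.
def heat_u(x, t):
    return math.exp(-((c * k) ** 2) * t) * math.sin(k * x)

heat_res = d1(lambda t: heat_u(x0, t), t0) - c ** 2 * d2(lambda x: heat_u(x, t0), x0)

# Wave: u = sin(k (x - c t)), residual of u_tt - c^2 u_xx = 0.
def wave_u(x, t):
    return math.sin(k * (x - c * t))

wave_res = d2(lambda t: wave_u(x0, t), t0) - c ** 2 * d2(lambda x: wave_u(x, t0), x0)

# Schroedinger: u = exp(i (k x - k^2 t)), residual of i u_t + u_xx = 0.
def schrodinger_u(x, t):
    return cmath.exp(1j * (k * x - k * k * t))

schrodinger_res = (1j * d1(lambda t: schrodinger_u(x0, t), t0)
                   + d2(lambda x: schrodinger_u(x, t0), x0))

print("Laplace:", laplace_res)
print("heat:", heat_res)
print("wave:", wave_res)
print("Schroedinger:", abs(schrodinger_res))
```

The candidates already hint at the different behaviors: the heat candidate decays in time, the wave candidate travels without changing shape, and the Schrödinger candidate oscillates with constant modulus.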

Maxwell’s equations — a linear system

A fifth canonical linear example doesn’t fit the single-Laplacian-plus-time template above. Maxwell’s equations are a system of four coupled linear PDEs relating the electric field $\boldsymbol{E}$ and magnetic field $\boldsymbol{B}$ to the charge density $\rho$ and current density $\boldsymbol{j}$ that produce them:

$$\begin{aligned}
\nabla \cdot \boldsymbol{E} &= \rho/\varepsilon_0, & \nabla \cdot \boldsymbol{B} &= 0, \\
\nabla \times \boldsymbol{E} &= -\partial_t \boldsymbol{B}, & \nabla \times \boldsymbol{B} &= \mu_0\, \boldsymbol{j} + \mu_0 \varepsilon_0\, \partial_t \boldsymbol{E}.
\end{aligned}$$

Here $\varepsilon_0$ is the permittivity of free space and $\mu_0$ is the permeability of free space; both are physical constants. Together these four equations describe how electric and magnetic fields are produced by charges and currents and how they evolve in response, and the whole edifice of classical electromagnetism rests on them.

Canonical nonlinear examples

Two nonlinear 2nd-order PDEs round out the canonical set.

Burgers’ equation

$u_t + u\, u_x = 0$ appears in the study of conservation laws (gas dynamics, traffic flow) and is the textbook example of how a nonlinear transport term produces shocks. Strictly speaking the form shown is first-order: it’s the quasilinear 1st-order pattern from earlier with $a = u$, $b = 1$, $c = 0$ (and $y \to t$). The genuinely 2nd-order viscous Burgers’ equation adds a diffusion term:

$$u_t + u\, u_x = \nu\, u_{xx},$$

where $\nu > 0$ is the viscosity coefficient.

Instationary.
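Both forms of Burgers’ equation can be spot-checked against known closed-form candidates. The sketch below assumes two such candidates: a spreading profile $x/(1+t)$ for the inviscid form and a traveling $\tanh$ front for the viscous form. The parameter values (`nu`, `amp`, `speed`) are illustrative assumptions, and derivatives are approximated by finite differences.

```python
import math

nu = 0.5     # viscosity coefficient (hypothetical value)
amp = 1.0    # half-height of the front (hypothetical value)
speed = 0.3  # propagation speed of the front (hypothetical value)

def fan(x, t):
    """Candidate for the inviscid equation u_t + u u_x = 0 (t > -1)."""
    return x / (1.0 + t)

def front(x, t):
    """Traveling-front candidate for the viscous equation."""
    return speed - amp * math.tanh(amp * (x - speed * t) / (2.0 * nu))

def burgers_residual(u, x, t, viscosity, h=1e-4):
    """u_t + u u_x - viscosity * u_xx, by central finite differences."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t + u(x, t) * u_x - viscosity * u_xx

points = [(0.0, 0.0), (0.6, 0.4), (-1.2, 1.0), (2.0, 0.7)]

# viscosity = 0 recovers the first-order (inviscid) form.
max_inviscid = max(abs(burgers_residual(fan, x, t, 0.0)) for x, t in points)
max_viscous = max(abs(burgers_residual(front, x, t, nu)) for x, t in points)

print("max residual, inviscid:", max_inviscid)
print("max residual, viscous:", max_viscous)
```

In the viscous candidate the diffusion term balances the steepening caused by $u\, u_x$, which is why the front keeps a fixed smooth shape as it moves.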

Navier–Stokes equations

$$\rho\bigl(\partial_t \boldsymbol{u} + (\boldsymbol{u} \cdot \nabla) \boldsymbol{u}\bigr) = -\nabla p + \mu\, \Delta \boldsymbol{u}, \qquad \nabla \cdot \boldsymbol{u} = 0.$$

A nonlinear system describing the instationary flow of an incompressible viscous fluid. The unknowns are the velocity field $\boldsymbol{u} : D \subseteq \mathbb{R}^3 \to \mathbb{R}^3$ and the pressure $p$; the density $\rho$ and the dynamic viscosity $\mu$ are constants in this incompressible form. The first equation is Newton’s second law applied to a fluid parcel; the second ($\nabla \cdot \boldsymbol{u} = 0$) enforces incompressibility. These equations sit at the heart of fluid-dynamics modeling, and whether smooth solutions always exist in three dimensions is, famously, one of the seven Millennium Prize Problems. Instationary.

Other classification schemes for second-order PDEs and methods for actually solving them — analytically and numerically — are taken up later in the course.