Analysis of Models

Suppose a model has been derived and is sitting in front of us — equations, constraints, the works. Before we hand it to a numerical solver and start trusting whatever number comes back, a more basic question has to be answered: can we actually work with this model at all? The two umbrella concerns are manageability (is the model amenable to systematic analysis?) and solvability (does any concrete query against it return an answer?). Three properties together decide both, and they form the standard checklist for analyzing any newly-derived model.

Existence of solutions — given a query, is there any answer at all?
Uniqueness of solutions — when an answer exists, is it the only one?
Continuous dependency on input data — do small perturbations of the input produce small perturbations of the solution?

The remaining sections walk through each in turn. Failing any one of them is a serious finding about the model — sometimes a reason to revise the model, sometimes a reason to reformulate the question, occasionally a reason to abandon the approach altogether.

Existence of solutions

The bluntest of the three: given the model and a concrete query, is there any solution? If the answer is no, every downstream stage of the simulation pipeline is wasted effort — the numerics will spin, the implementation will run, and whatever comes out will be either nonsense or a numerical artefact masquerading as an answer. A model whose typical queries have no solution is not a tool but a trap.

A solution can fail to exist for reasons that range from a quiet contradiction in the equations to a structural impossibility baked into the problem itself. Three flavors come up repeatedly across the kinds of models we have already met.

Population dynamics: does a stationary limit state exist? A population model evolved over time might be expected to settle into an equilibrium where births and deaths balance and the size stops changing. Whether such a stationary limit exists is a substantive question about the model: with the wrong choice of growth and interaction terms, the population grows without bound or oscillates perpetually, so that no fixed end state is reached, or it collapses to zero, so that the only end state is the trivial one of extinction. A second, sharper question follows: even if a stationary limit exists in principle, does the trajectory actually converge to it from the initial condition we are starting from?
Ordering problems: does the precedence graph contain cycles? When the sub-tasks of a larger job have precedence constraints, the structure is the directed graph from the modeling step: vertices are tasks, an arrow $A \to B$ means $A$ must finish before $B$ may start. A valid execution order is a linear ordering of the tasks (a topological order of the graph) in which no task starts before all of its predecessors have finished. If the graph contains a cycle ($A$ before $B$, $B$ before $C$, $C$ before $A$), no such ordering exists. The model has been written down cleanly and asks for something logically impossible; the only honest answer is no solution.
Minimization problems: actual minimum or only saddle points? A query of the form “find the minimum of this objective” presupposes that a minimum exists in the search region. If the objective has only saddle points there (points where the gradient vanishes but the function still decreases in some direction), an iterative minimizer that merely hunts for a vanishing gradient can land on one and report it, but the answer is not actually a minimum. The query has been issued; the geometry of the model says no such point exists.
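The existence check for ordering problems can be made concrete. The sketch below is one common way to do it (Kahn's algorithm, which is not named in the text and is swapped in here for illustration): it repeatedly releases tasks whose predecessors are all finished and reports failure if some task can never be released. The function name `execution_order` and the task labels are hypothetical.

```python
from collections import deque

def execution_order(tasks, precedences):
    """Return a valid task order, or None if the precedence graph has a cycle.

    precedences is a list of (a, b) pairs meaning "a must finish before b".
    """
    successors = {t: [] for t in tasks}
    open_predecessors = {t: 0 for t in tasks}
    for a, b in precedences:
        successors[a].append(b)
        open_predecessors[b] += 1

    # Tasks with no unfinished predecessor can start right away.
    ready = deque(t for t in tasks if open_predecessors[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            open_predecessors[nxt] -= 1
            if open_predecessors[nxt] == 0:
                ready.append(nxt)

    # A task that was never released sits on a cycle or behind one.
    return order if len(order) == len(tasks) else None

acyclic = execution_order(["A", "B", "C"], [("A", "B"), ("B", "C")])
cyclic = execution_order(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
```

For the chain the function returns an order; for the three-task cycle from the text it returns `None`, which is the "no solution" answer in code form.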

The pattern is consistent across all three: existence is not a technicality, it is the question of whether the model and the query are compatible at all.

Uniqueness of solutions

Even when a solution exists, it might not be the only one — and that creates its own problem. A model that quietly admits several solutions for the same input gives the modeler no principled way to pick which one to report. Some queries make sense only when the answer is unique; others tolerate multiplicity but then require a separate criterion to choose between the candidates.

Local versus global minimum. An iterative optimizer that converges to a local minimum has solved the local problem cleanly — the objective increases in every direction nearby. But the question the modeler asked was “find the global minimum,” and a different valley far away may sit lower. Local methods have only local uniqueness; the global picture has to be argued separately, by exhausting starts, by analyzing the geometry, or by choosing a method that is not purely local.
Stable limit state versus oscillations. A dynamical system does not always pick one configuration to settle into. Molecular dynamics — simulating the motion of atoms in a molecule or solid — is a textbook case: the trajectory keeps oscillating between several equally valid configurations rather than converging to a single one. The “solution” is then not a single state but a whole family of states that the system keeps visiting, and the model has to be queried differently — averages over long times, distributions over configurations — for any single number to come back.
Preferences among solutions. When a problem genuinely has many valid solutions (several feasible schedules, several routes satisfying the same constraints), uniqueness has to be supplied from the outside. The model is augmented with a preference: a tie-breaking rule, a secondary objective, a cost function. Without it, the model’s answer to “give me the solution” is “which one?” — and the question has to go back to the user.
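The gap between local and global minima is easy to reproduce. The following is a minimal sketch, assuming a hypothetical one-dimensional objective with two valleys and plain fixed-step gradient descent as the local method; neither the function nor the step size comes from the text.

```python
def f(x):
    # Hypothetical objective with two valleys of different depth.
    return x**4 - 3 * x**2 + x

def df(x):
    # Derivative of f, written out by hand.
    return 4 * x**3 - 6 * x + 1

def local_descent(x0, step=0.01, iterations=2000):
    """Fixed-step gradient descent: a purely local method."""
    x = x0
    for _ in range(iterations):
        x -= step * df(x)
    return x

# Two starting points, one on each side of the hump between the valleys.
starts = [-2.0, 2.0]
minima = [local_descent(x0) for x0 in starts]

# Choosing among the candidates is a separate, explicit step.
best = min(minima, key=f)
```

Each run ends at a point where the derivative vanishes, but the two runs end in different valleys. Only the comparison in the last line, which uses information from both starts, identifies the lower one.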

The point of the uniqueness check is not to demand that every model produce a unique answer — many useful models do not. It is to make the modeler aware up front that picking among candidates is part of the problem, so the picking happens by deliberate choice rather than by accident of which solver was used.

Continuous dependency on input data

The third criterion is about robustness. A model is fed inputs — initial values for an ODE, boundary values for a PDE, starting conditions for an iterative solver, parameters set by measurement or by hand. Real inputs are never exact: they carry rounding, measurement noise, and unstated approximation. The question is whether small perturbations of the input produce small perturbations of the solution, or whether tiny disturbances change the answer beyond recognition.

If the dependence is continuous — small in, small out — the model is usable in practice, because the imperfect inputs we actually have are close enough to the ideal inputs we wish we had. If the dependence is discontinuous, the practical input we feed in might land arbitrarily far from the answer the “true” input would have produced, and we have no way of telling from the output alone which case we are in.

This property is exactly what numerics calls the sensitivity or conditioning of a problem: how strongly the output reacts when the input is jiggled. A model that is mathematically valid but badly conditioned is technically solvable and practically useless — every digit of noise in the input gets amplified into the output, and the result is dominated by the perturbations rather than by the structure of the problem.
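A small linear system is enough to see bad conditioning at work. The sketch below assumes a hypothetical pair of nearly parallel lines and solves it with Cramer's rule; the numbers are chosen only for illustration.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Two nearly parallel lines: x + y = 2 and x + 1.0001*y = b2.
exact = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)      # close to (1, 1)
perturbed = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0002)  # close to (0, 2)

# The right-hand side moved by 0.0001; compare with how far the solution moved.
input_change = 0.0001
output_change = max(abs(p - e) for p, e in zip(perturbed, exact))
amplification = output_change / input_change
```

A change in the fourth decimal place of one input moves the solution by about one full unit, so the perturbation is amplified by roughly four orders of magnitude.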

The check is not “does the input ever matter” — of course it does, that’s the point of the model. The check is whether the response to input changes is proportional and predictable, so that the imperfect inputs of the real world translate into imperfect-but-meaningful outputs rather than into noise.

Well-posed and ill-posed problems

The three criteria above bundle into a single classical condition with a name and an attribution.

A problem is well-posed (Hadamard, 1923) if it satisfies all three of:

existence of a solution,
uniqueness of the solution, and
stability of the solution — i.e. continuous dependence on the input data.

The three criteria apply to the problem — the combination of the model’s equations with a particular query and a particular set of inputs — not to the model in isolation. The same model can host one well-posed problem and one ill-posed problem at the same time, depending on which direction we are asking the question in. This matters because a clean, well-derived model is no guarantee that the question we want to answer is well-posed.

A problem that fails one or more of the well-posedness conditions (existence, uniqueness, or stability) is ill-posed, a notion studied systematically by John and Tikhonov. Most modeling problems that come up in practice fall into this category.

The most common reason a practical problem is ill-posed is that the question we actually care about is the inverse of a forward process the model can describe well.

An inverse problem asks for the initial configuration — parameters, controls, sources, designs — that would produce a given outcome. It is the reverse of a forward problem, which takes a known initial configuration and computes the outcome.

Inverse problems lose well-posedness for an intuitive reason. Forward processes routinely smooth, average, or compress their inputs: many different inputs end up looking similar on the output side, and small input changes produce only small output changes. The forward direction is continuous and well-behaved.

Run the same map backward, though, and that smoothing becomes the problem. Take the economy: a small change in today’s tax policy produces a correspondingly small change in next year’s growth — that’s the (well-behaved) forward direction. Now reverse the question. What policy adjustments today would lift next year’s growth from 1.0% to 1.1%? The required policy shift can be disproportionately large, several different shifts might land at the same target, or no realistic shift might land there at all. Continuity in the forward direction does not imply continuity in the inverse — that is exactly stability failing. Likewise, a forward map that compresses many inputs to similar outputs has a non-unique inverse, and a forward map that hits only a subset of possible outcomes immediately makes any target outside that subset unreachable. All three Hadamard criteria can collapse at once when the arrow is reversed.

The same pattern shows up across applied fields:

Economics. Given a target on next year’s unemployment in Germany — say below 3.5 million — what policies should be set today to land there? The forward map (policy → outcome) is the model’s natural direction; the inverse direction (target → required policy) is the actual question.
Engineering. A pressing tool stamps a flat metal sheet into a shaped one. The forward problem maps a tool configuration to the resulting sheet shape and is straightforward to simulate. The interesting question is the inverse: given a desired sheet shape, which tool configuration produces it?
Computer networks. A given configuration of components has some throughput, and that’s a forward computation. The deployment question is the reverse: how should the components be configured to guarantee a target minimum throughput?

In every case, the practical question lives on the harder side of the arrow. Two strategies are commonly used to make headway anyway.

Trial and adjustment via forward problems. Pick a candidate configuration, run the (well-posed) forward problem, compare the outcome to the target, adjust the configuration, and iterate. Each step is a forward problem we know how to solve; the loop wraps a sequence of forward problems into an inverse search. Meaningful adjustment matters here — the rule for how to update the candidate carries the load, since random guessing converges only by accident.
Solve a related, regularized problem. Replace the ill-posed problem with a nearby one that is well-posed by construction, and accept its solution as an approximation to what was originally wanted. The standard technique is regularization — augmenting the formulation with extra penalty terms (typically penalizing candidate solutions that are too large or too oscillatory) that pin down a unique, stable solution out of the many (or none) the original problem offered. The price is that we are no longer answering exactly the original question, but a controlled approximation of it; the gain is that there is now an answer at all, and it does not amplify input noise.
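The first strategy, trial and adjustment via forward problems, can be sketched with bisection as the update rule. This is an illustration under strong assumptions: the forward model `forward` is hypothetical, and it is taken to be continuous and increasing on the bracket, which real forward models need not be.

```python
def forward(config):
    """Hypothetical forward model: maps a configuration value to an outcome."""
    return config**3 + config

def invert_by_bisection(target, low, high, tolerance=1e-9):
    """Find a config with forward(config) close to target.

    Assumes forward is continuous and increasing on [low, high].
    """
    if not forward(low) <= target <= forward(high):
        # Existence fails: the target lies outside what the bracket can reach.
        raise ValueError("target is not reachable inside the bracket")
    while high - low > tolerance:
        candidate = 0.5 * (low + high)
        # Each pass is one forward solve followed by a meaningful adjustment.
        if forward(candidate) < target:
            low = candidate
        else:
            high = candidate
    return 0.5 * (low + high)

config = invert_by_bisection(target=10.0, low=0.0, high=5.0)
```

Every iteration solves only a forward problem; the inverse answer emerges from the loop. The monotonicity assumption is what guarantees uniqueness here, and the bracket check is the existence test.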

The takeaway is that having a model is not the end of the story. A clean, well-derived model can still pose ill-posed problems for the queries we actually care about — and recognizing this before pouring effort into a solver that will only amplify noise or wander among non-existent solutions is the whole point of the analysis stage.
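The regularization strategy described above can be illustrated on a tiny, nearly singular linear system. The sketch uses Tikhonov regularization, one standard choice of penalty, and assumes a hypothetical 2x2 matrix, a hypothetical noise level, and a hand-picked weight `lam`; choosing that weight well is a problem of its own and is not addressed here.

```python
def solve_2x2(m11, m12, m21, m22, r1, r2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m11 * m22 - m12 * m21
    return ((r1 * m22 - m12 * r2) / det, (m11 * r2 - m21 * r1) / det)

def tikhonov_2x2(a, b, lam):
    """Minimize |A x - b|^2 + lam * |x|^2 via (A^T A + lam I) x = A^T b."""
    (a11, a12), (a21, a22) = a
    m11 = a11 * a11 + a21 * a21 + lam   # entries of A^T A + lam * I
    m12 = a11 * a12 + a21 * a22
    m22 = a12 * a12 + a22 * a22 + lam
    r1 = a11 * b[0] + a21 * b[1]        # entries of A^T b
    r2 = a12 * b[0] + a22 * b[1]
    return solve_2x2(m11, m12, m12, m22, r1, r2)

a = ((1.0, 1.0), (1.0, 1.0001))
b_noisy = (2.0, 2.0002)  # the noise-free right-hand side (2, 2.0001) gives x = (1, 1)

naive = solve_2x2(1.0, 1.0, 1.0, 1.0001, b_noisy[0], b_noisy[1])
regularized = tikhonov_2x2(a, b_noisy, lam=1e-4)
```

The naive solve turns a disturbance of 0.0001 into an answer near (0, 2), while the regularized solve stays close to (1, 1). The penalty favors small solutions, which happens to suit this example; in general the regularized answer is a controlled approximation, not the exact solution of the original problem.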

Applicability of the model

Existence, uniqueness, and stability ask whether the problem posed by the model admits a solution in principle. Applicability asks the next question: even when it does, can the solution actually be obtained — by a computer, on a useful timescale, with the data we actually have? Mathematical well-posedness is necessary but nowhere near sufficient. Five practical conditions sit alongside the mathematical ones, and any of them can disqualify a model before a single line of solver code is written.

Input data: availability and quality. A model that depends on inputs we cannot measure, cannot collect in time, or can only measure with poor accuracy is unusable in production no matter how clean its equations. The check is whether the inputs the model assumes are realistically obtainable at the precision the model needs. A formula calibrated to three-decimal-place data but fed one-decimal-place measurements still prints three decimals, yet the extra digits are hollow: the result is no more accurate than its inputs.
Implementation effort. Once the algorithms are chosen, somebody has to write the code. Sometimes the model fits inside an existing toolkit — well-tested libraries, standard solvers, mature simulation packages — and the only work is integration. Sometimes new code, new data structures, new infrastructure are needed, and the engineering effort can rival the mathematics. Implementation is rarely the headline cost, but it routinely tips the choice toward a less ambitious model that is already implemented.
Absolute computing and memory requirements. Even when an algorithm exists, the resources it demands may simply exceed what is available. Two flavors come up repeatedly. Combinatorial blow-up: the canonical example is the family of NP-complete problems, whose worst-case run time appears to grow exponentially in the input size, putting even modest instances out of reach. Real-time deadlines: a weather forecast is only valuable if tomorrow’s prediction is delivered before tomorrow arrives. If simulating the next 24 hours of atmosphere takes 48 hours of compute, the result is no longer a forecast — it is a reconstruction of yesterday. Here the compute budget is bounded by the very thing we are trying to predict.
Relative computing and memory requirements. Even when the absolute cost fits, this model has to compete with other candidate approaches for the same problem — a different formulation, a coarser model, a simpler heuristic. The question is the cost-benefit ratio: does the extra accuracy or detail this model offers justify the extra compute, memory, and engineering compared to a cheaper alternative that does almost as well? A model that is technically applicable but uncompetitive against simpler rivals tends not to be the one that gets deployed.
Sensitivity in practice. When the model is ill-posed, the stability failure that was a mathematical concern in the previous section becomes a practical one: small disturbances in the input — measurement noise, rounding, missing significant figures — can falsify the output completely. The same input-amplification pathology shows up in dynamical systems as chaos, popularly pictured by the butterfly effect — a butterfly flapping its wings in one place could, weeks later, shift the weather in another. The practical consequence is the same in either case: the model spends its computational budget amplifying input noise rather than resolving the structure we wanted.
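The input amplification behind the butterfly effect can be reproduced in a few lines. The sketch below uses the logistic map with parameter 4, a standard toy example of chaos that is not taken from the text, and follows two orbits whose starting values differ by a hypothetical disturbance of 1e-10.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)

# Distance between the two orbits at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]
largest_gap = max(gaps)
```

The two orbits start indistinguishably close, yet the gap grows until it is of the same order as the values themselves. After that point the computed orbit reflects the disturbance more than the chosen initial value.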

The first four conditions are about resources and engineering: data, code, hardware, competitive position. The fifth wraps back to ill-posedness and names its practical face. Together with the mathematical criteria from the earlier sections, applicability forms the second half of model analysis: a model that survives both halves, well-posed in principle and applicable in practice, is the rare combination simulations actually need.

Solution approaches

Once a model has been analyzed and judged tractable, the next question is how the solution is actually obtained. Several structurally different approaches sit on the modeler’s bench, and the right one depends on what the model permits — whether the equations admit a closed form, whether the search space is continuous or combinatorial, and how much accuracy and computation we are willing to trade for each other.

Analytic

An analytic solution is a solution obtained in closed form by direct mathematical manipulation, accompanied by a proof of existence and uniqueness. The construction is formal, analytic, and direct — no discretization, no approximation, no iteration. When available, this is the optimum: exact, complete, and as good as the answer can be.

The catch is that analytic solutions are available almost exclusively for very simple special cases. The moment the equations stop having a clean structure — nonlinearity, irregular boundary, coupling between many quantities — the closed form vanishes, and another approach has to take over. The cases where it does work are still worth knowing: they appear as test problems for numerical methods, and they pin down what an “ideal” answer looks like.

Exponential growth. The simplest first-order ODE, $\dot y(t) = y(t)$, has the closed-form solution $y(t) = c \cdot e^{t}$, parameterized by the initial value $y(0) = c$. The rate of change equals the value; up to the constant factor $c$, the function whose derivative is itself is exactly $e^t$.
1D heat equation. The PDE $u_{xx}(x,t) = u_t(x,t)$, which describes how heat diffuses along a thin rod, admits the closed-form solution $u(x,t) = \sin(cx) \cdot e^{-c^2 t}$ for each constant $c$, under simple boundary conditions.
Shortest path in a small graph. When the graph has few enough vertices to enumerate by hand, the shortest path is found by direct case analysis — list the candidate routes, compute their lengths, take the minimum. For larger graphs the analytic approach gives way to algorithms — Dijkstra and its relatives, which grow a frontier outward from the start vertex, locking in the shortest known path to each newly reached vertex — but the small case is genuinely closed in the same sense as the differential examples above.
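As a quick check of the two differential examples above, the closed forms can be substituted back into their equations. This is a routine verification, not a derivation, and it says nothing about boundary conditions beyond what the text already states. For exponential growth:

$$y(t) = c\,e^{t} \;\Rightarrow\; \dot y(t) = c\,e^{t} = y(t), \qquad y(0) = c\,e^{0} = c.$$

For the heat equation:

$$u(x,t) = \sin(cx)\,e^{-c^2 t} \;\Rightarrow\; u_t = -c^2 \sin(cx)\,e^{-c^2 t}, \quad u_x = c \cos(cx)\,e^{-c^2 t}, \quad u_{xx} = -c^2 \sin(cx)\,e^{-c^2 t} = u_t.$$

Both time and space derivatives produce the same factor $-c^2$, which is why the product of a sine in $x$ and a decaying exponential in $t$ solves the equation for every constant $c$.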

Heuristic

A heuristic is a trial-and-error strategy guided by problem-specific rules of thumb, used to search for a solution when no efficient analytical or exact algorithmic route is available. Heuristics typically do not guarantee the optimum, and the central practical questions are whether the search converges to a good answer at all, and how fast it does so.

Heuristics are especially common in discrete optimization, where the search space is combinatorial — a finite but enormous set of candidates — and there is no continuous gradient to follow. The strategy is often local: at each step, look at the choices immediately available and pick the one that looks best by some myopic criterion.

The textbook example is a greedy heuristic for the knapsack problem — given a set of items, each with a value and a weight, and a fixed weight capacity, pick a subset of maximum total value that still fits in the budget. The greedy rule is to take the best local alternative at each step — concretely, rank the items by value-to-weight ratio and accept them in that order, taking each one that still fits until none of the remaining items does. Fast, intuitive, and entirely driven by local comparisons.

The natural question is whether such a strategy always lands on the true best answer. It does not. There are knapsack instances where the greedy choice early on uses up budget that a different combination could have spent more profitably, and the procedure finishes with a feasible but strictly sub-optimal pack. This is the opening of a much broader pair of concerns that haunts every heuristic: convergence — does the procedure even produce a stable answer in finite time? — and speed of convergence — how quickly does it close in on a good one? Both are open questions in general, and either can be the reason a heuristic is rejected for a given problem.
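A small instance makes the sub-optimality concrete. The sketch below implements the ratio-based greedy rule described above and compares it with an exhaustive search over all subsets; the three items and the capacity are hypothetical numbers chosen for illustration, and the exhaustive search is only feasible because the instance is tiny.

```python
from itertools import combinations

def greedy_knapsack(items, capacity):
    """Take items in order of value-to-weight ratio while they still fit."""
    total_value, remaining = 0, capacity
    by_ratio = sorted(items, key=lambda item: item[0] / item[1], reverse=True)
    for value, weight in by_ratio:
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value

def optimal_knapsack(items, capacity):
    """Check every subset; exponential cost, so only for tiny instances."""
    best = 0
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            if sum(weight for _, weight in subset) <= capacity:
                best = max(best, sum(value for value, _ in subset))
    return best

# Items are (value, weight) pairs.
items = [(60, 10), (100, 20), (120, 30)]
capacity = 50

greedy_value = greedy_knapsack(items, capacity)
optimal_value = optimal_knapsack(items, capacity)
```

Greedy takes the two items with the best ratios, after which the third no longer fits, and ends with a value of 160. The best pack skips the item with the highest ratio, takes the other two, and reaches 220.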

Direct-numerical

A direct-numerical method is a concrete algorithm that, when run to completion, returns the exact solution to the problem — exact in the mathematical sense, modulo the rounding errors inherent in floating-point arithmetic (the finite-precision number format computers actually use, which can only represent values to a fixed number of significant digits). There is no heuristic component and no iterative approximation to tighten; the algorithm runs in a known, finite number of steps and is guaranteed to deliver a result.

A direct-numerical method occupies a different niche from both the analytic and the heuristic approach. Like an analytic solution, it returns the answer — not a “good guess” — but the answer is delivered as a concrete numerical output rather than a closed-form formula, and the path to it is an algorithm rather than a derivation. Like a heuristic, it lives in the discrete world of computer arithmetic, but unlike a heuristic, it is built on a deterministic procedure with a provable termination and a provable correctness.

The textbook example is the simplex algorithm for linear optimization (also called linear programming) — the problem of maximizing a linear objective subject to linear inequality constraints:

$$\max_{\mathbf{x}} \, \mathbf{c}^\top \mathbf{x} \quad \text{subject to} \quad A \mathbf{x} \le \mathbf{b}.$$

The objective vector $\mathbf{c}$, constraint matrix $A$, and right-hand side $\mathbf{b}$ are given; the variable $\mathbf{x}$ ranges over the feasible region, that is, the points in $\mathbb{R}^n$ satisfying every inequality. The simplex algorithm walks from corner to corner of that region until no edge improves the objective further. Provided the problem is feasible and bounded and the pivoting rule prevents cycling, it terminates at the optimum in finitely many steps.
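The corner property that the simplex algorithm exploits can be seen on a two-variable problem. The sketch below is not the simplex algorithm: it simply enumerates every corner of the feasible region and keeps the best one, which is only workable for tiny problems. It assumes a bounded feasible region, and the function name `solve_lp_2d` and the constraint data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve_lp_2d(c, constraints):
    """Maximize c[0]*x + c[1]*y subject to a1*x + a2*y <= b for each (a1, a2, b).

    Brute-force corner enumeration with exact rational arithmetic.
    Returns (best_value, best_point), or None if no feasible corner exists.
    """
    best = None
    for (a1, a2, b1), (a3, a4, b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if det == 0:
            continue  # parallel boundary lines never meet in a corner
        # Intersection of the two boundary lines (Cramer's rule).
        x = Fraction(b1 * a4 - a2 * b2, det)
        y = Fraction(a1 * b2 - a3 * b1, det)
        # Keep the point only if it satisfies every constraint.
        if all(p * x + q * y <= r for p, q, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best

# Maximize x + 2y subject to x + y <= 4, x + 3y <= 6, x >= 0, y >= 0.
constraints = [(1, 1, 4), (1, 3, 6), (-1, 0, 0), (0, -1, 0)]
best_value, best_point = solve_lp_2d((1, 2), constraints)
```

The feasible corners are (0, 0), (4, 0), (3, 1) and (0, 2), and the objective is largest at (3, 1). Simplex reaches the same corner without visiting all of them, which is what makes it usable beyond toy sizes.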

Approximate-numerical

An approximate-numerical method is an iterative algorithm that produces a sequence of approximations converging to the true solution. Given enough iterations, the approximation can be made as accurate as desired, up to the limits of floating-point arithmetic: there is a quantitative guarantee, not just a hope. These methods typically operate on a discretized version of the original equations rather than on the equations themselves, and they are the main workhorse for problems in numerical simulation.

Two layers of approximation are at play here, and it helps to keep them separate. First, the equations themselves are replaced by a discretized version — the continuous unknown becomes a finite list of numbers, and the equation becomes a matrix equation among them. Second, that discretized problem is solved iteratively, each step tightening the approximation. The two questions worth asking are how accurate and how fast: dialling accuracy up costs more iterations or a finer discretization. Quantifying that trade-off is what numerical analysis as a subject is about.

Two examples appear over and over.

Iterative methods for systems of linear equations. Solving $A\mathbf{x} = \mathbf{b}$ for large, sparse $A$ (most entries zero) is typically done by iteration: start with a guess $\mathbf{x}^{(0)}$, apply a fixed update rule, and watch the residual $\|A\mathbf{x}^{(k)} - \mathbf{b}\|$ (the leftover discrepancy when the current guess is plugged back in) shrink toward zero. Each iteration is cheap; the total cost depends on how fast the shrinkage proceeds.
Newton’s method for roots of functions. Given $f(\mathbf{x}) = \mathbf{0}$ and a guess $\mathbf{x}^{(0)}$, follow the local linear approximation (the tangent line in 1D) to where it hits zero, and repeat. The iteration converges quickly when started sufficiently near a root, but can fail from a poor starting guess.
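Both examples fit in a few lines. The sketch below uses the Jacobi iteration as the fixed update rule for the linear system (one classical choice among several; the text does not name a specific rule) and the scalar form of Newton's method. The matrix, right-hand side, and test function are hypothetical.

```python
def jacobi(a, b, iterations=50):
    """Jacobi iteration: solve each equation for its own unknown, then repeat."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x

def residual_norm(a, b, x):
    """Largest absolute entry of A x - b."""
    n = len(b)
    return max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))

def newton(f, df, x0, tolerance=1e-12, max_iterations=50):
    """Scalar Newton iteration: follow the tangent line to its zero."""
    x = x0
    for _ in range(max_iterations):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tolerance:
            break
    return x

# Diagonally dominant system with exact solution (1, 2).
a = [[4.0, 1.0], [2.0, 5.0]]
b = [6.0, 12.0]
x = jacobi(a, b)
residual = residual_norm(a, b, x)

# Root of x^2 - 2, started at 1.
root = newton(lambda t: t * t - 2.0, lambda t: 2.0 * t, x0=1.0)
```

The Jacobi iteration is guaranteed to converge here because the matrix is strictly diagonally dominant; for general matrices it may not. Newton's method needs only a handful of steps for this function, but it would break down at a starting point where the derivative is zero.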

Across the four approaches, a rough division of labor emerges: analytic for the rare cases where the math is clean enough to admit a closed form; direct-numerical for structured problems whose exact answers can be reached algorithmically in finitely many steps; approximate-numerical for the bulk of numerical simulation, where iteration tightens accuracy at the cost of compute; and heuristic as the fallback for combinatorial problems where no better guarantee is available. Most realistic solvers in fact use more than one — a discretization stage turns a continuous problem into a linear system, an approximate-numerical iteration solves the linear system, a heuristic chooses the discretization in the first place — and the modeler’s job includes recognizing which combination is appropriate for the problem at hand.

Model assessment

Even when a model is well-posed, applicable, and yields cleanly to one of the solution approaches above, something is still missing. Solving a model’s equations correctly is not the same as the equations being correct. The model is a simplified image of reality, and the simplification might have dropped the wrong details, kept terms that don’t actually drive the behavior, or fixed a coefficient at a plausible-but-wrong value. Two complementary checks make up model assessment: validation asks whether the model is correct, that is, whether it produces an answer that corresponds to the real system at all, and accuracy asks how precisely it does so. Validation is taken up first, accuracy afterwards.

Validation

Validation is the check that asks: is the model correct? — that is, do the outputs of the model agree with the behavior of the real system it is supposed to represent? Validation is performed against an external reference, and which reference is available depends on the system being modeled.

The awkwardness of validation is that “the real system” is rarely a single clean reference one can compare against. Physical experiments have their own measurement error and bias; observational records may be incomplete; established theory is itself a model. Four families of techniques cover the practical possibilities, each with a different blind spot.

Comparison with experimental tests

The most direct approach: run a physical experiment, run the simulation of it, and compare. Two flavors are common.

1-to-1 experiments. Full-scale physical setups — a car driven into a barrier in a crash test, an aircraft in a wind tunnel at its actual operating conditions. The simulation and the experiment cover the same situation, and the comparison is direct.
Laboratory experiments with downsized prototypes. Full-scale tests are sometimes too expensive or infeasible — a wind tunnel large enough for a real airliner, structural tests on a finished bridge. The standard alternative is a scaled-down prototype in a controlled lab. The catch: physics doesn’t always scale cleanly. Effects that were negligible at full size can dominate in miniature, and whether the small-scale setup behaves like the real one is a substantive question — a “no” disqualifies the comparison entirely.

A subtler caveat applies to both flavors: experimental data isn’t ground truth either. Sensors have noise, conditions aren’t perfectly controlled — validation against experiment is a comparison of two imperfect representations of the real system. The careful question is whether the disagreement falls within the combined uncertainty of both, not whether the simulation reproduces the experiment to the last digit.

A-posteriori observations

Some systems cannot be experimented on at all — there is no second copy of next week’s weather, of last quarter’s stock market, or of a particular geopolitical scenario. Validation then waits for the system to play itself out and compares the model’s prediction with what actually happened.

Reality tests. A weather model issues a forecast for tomorrow, and tomorrow eventually arrives; over many forecasts a track record builds up. Stock-market models, election models, and military-scenario simulations are evaluated the same way — the model gets one shot per real-world event, and a score accumulates over many shots.
Satisfaction tests. Some systems do not even have a sharp “right answer” to compare against; they have a user whose acceptance is the criterion. A traffic-control simulation is validated when the flows it produces look reasonable to operators on the ground; an illumination model in computer graphics is validated when the rendered scene is convincing to the human eye, whether or not its photons travel along the same paths as in reality. The reference here is not physics but human judgment downstream of the simulation.

Plausibility tests

When neither a direct experiment nor an a-posteriori observation is feasible — typically because the regime is too remote (early-universe astrophysics) or too small (quantum-scale interactions) — validation falls back on consistency with previously verified theory. The simulation’s outputs should reproduce the predictions of established physics where the two overlap, and any disagreement is evidence either against the simulation or, more rarely, against the previously held theory. Plausibility tests do not establish correctness, but they rule out the larger class of clearly-wrong models.

Model comparison

A final technique sidesteps the absence of a clean external reference altogether: run several independently-derived models of the same system and compare their outputs. Agreement among different models is circumstantial evidence that they are all picking up the same underlying behavior; sustained disagreement is evidence that at least one of them is missing something. The check is weaker than experimental validation — multiple wrong models can quietly share the same wrong assumption — but it is often the only one available when the system itself is out of reach.

None of these four techniques is sufficient on its own, and each has a blind spot the others can partly fill. A serious validation effort layers them: experimental comparison where possible, a-posteriori tracking of live predictions where the system runs on its own, plausibility checks against theory in regimes where neither is available, and cross-comparison with alternative models as a sanity floor underneath the lot. Validation in this sense is less a single test than a portfolio.

Accuracy

Accuracy is the second component of model assessment, asking: how precise is the model? — once the outputs broadly correspond to the real system (validation), how close are they, quantitatively, to what the system actually does?

The single most useful thing to remember about accuracy is that it is not an absolute value. A figure like “the model is accurate to ±2%”, taken on its own, says very little; the same number is generous in one context and disqualifying in another. Three relative comparisons sit under any honest accuracy claim, and a model can pass any one of them and fail another.

Accuracy versus the quality of the input data

A simulation cannot manufacture precision its inputs do not contain. If the inputs are accurate to three decimal places, the output is at best accurate to three decimal places — pushing the result to eight decimal places makes it look sharper, but the extra digits are just noise dressed up as precision. The output can’t be more accurate than the inputs allow, and the limiting factor is usually the messiest input rather than the model itself.

Accuracy versus the problem at hand

Even when the absolute precision is high, it can still be useless if the question being asked of the model needs a sharper resolution. The classic case is a decision question whose answer flips at a margin smaller than the model’s error.

A simulation forecasts vote shares in a German Bundestag (federal parliament) election. The decision-maker’s question is not what are the vote shares? — they want to know which coalition will form the government? Two coalitions are on the table:

Red-Green, made up of the SPD (Social Democrats) and the Grüne (Greens).
Yellow-Black, made up of the FDP (Free Democrats) and the Union (the CDU/CSU bloc of conservatives).

The model returns: FDP $4\%$, Grüne $6\%$, Union $45\%$, SPD $45\%$, with an accuracy of $\pm 2\%$ on each share.

Adding up the coalitions, Red-Green sits at $45\% + 6\% = 51\%$ and Yellow-Black at $45\% + 4\% = 49\%$. Taken at face value, Red-Green wins by a 2-point margin. But once the $\pm 2\%$ uncertainty is applied to those totals, the intervals (roughly $[49\%, 53\%]$ for Red-Green and $[47\%, 51\%]$ for Yellow-Black) overlap, and since each total is a sum of two uncertain shares, the true intervals may be wider still. The forecast cannot reliably distinguish a Red-Green win from a Yellow-Black win, even though every individual party’s share has been estimated to a perfectly reasonable precision.
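The interval arithmetic above can be written out directly. The sketch follows the text in applying the uncertainty of 2 percentage points to each coalition total; the variable names are hypothetical.

```python
shares = {"FDP": 4.0, "Grüne": 6.0, "Union": 45.0, "SPD": 45.0}
error = 2.0  # percentage points, applied to each coalition total as in the text

def interval(parties):
    """Return the (low, high) range for the summed share of a coalition."""
    total = sum(shares[party] for party in parties)
    return (total - error, total + error)

red_green = interval(["SPD", "Grüne"])
yellow_black = interval(["FDP", "Union"])

# Two intervals overlap when each one starts before the other ends.
overlap = red_green[0] <= yellow_black[1] and yellow_black[0] <= red_green[1]
```

The two ranges share the stretch from 49 to 51, so `overlap` is true and the forecast cannot separate the two outcomes.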

The model is not wrong, in any obvious sense. It is simply not accurate enough for this particular question. For a different question — what fraction of the Bundestag will the Union hold? — the same $\pm 2\%$ would be plenty. For the coalition question, the model is basically unsuited.

The lesson generalizes past elections. Every decision question has a decision boundary — the value of the underlying quantity at which the answer flips from one option to the other — and the model’s accuracy has to be sharper than the distance between the true answer and that boundary, or the model cannot resolve which side of the boundary the answer lies on. A model accurate enough for one question can be unsuited for another, with no change to the model itself.

Worst case versus average case

A claim like “the error is at most $\delta$” is meaningful only after the kind of error has been pinned down. Two natural choices answer different questions.

Worst case. The largest possible error across all admissible inputs — the strongest guarantee on offer. The right framing for safety-critical applications (bridges, aircraft, medical devices), where what matters is the worst the model could ever produce, not what it usually produces. A worst-case bound that is loose still gives a hard guarantee; an average-case bound that is tight does not.
Average case. The typical error over the distribution of inputs the model is actually expected to see, which is often much smaller than the worst case. The right framing for statistical applications (financial risk, throughput optimization, large-scale forecasting), where rare extremes average out over many cases and what matters is the bulk performance.

A model with a generous worst-case error bound and a tight average-case error bound is normal, not a contradiction. Reporting either one without naming which it is obscures more than it reveals.
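The difference between the two error measures shows up even for a very simple model. The sketch below assumes a hypothetical model that approximates sin(x) by x on the interval from 0 to 1, and it takes the inputs to be spread uniformly over that interval; a different input distribution would give a different average.

```python
import math

def model(x):
    # Hypothetical crude model: approximate sin(x) by x.
    return x

# Evenly spaced inputs on [0, 1], standing in for a uniform input distribution.
inputs = [i / 1000 for i in range(1001)]
errors = [abs(model(x) - math.sin(x)) for x in inputs]

worst_case = max(errors)                  # largest error over all inputs
average_case = sum(errors) / len(errors)  # mean error over the assumed distribution
```

The worst-case error is attained at the right end of the interval and is close to 0.16, while the average error is close to 0.04. Quoting only one of the two figures would describe the same model very differently.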

The three relative comparisons together make precise what “the model is accurate” actually has to mean: accurate enough for the question, given inputs of known quality, with the kind of error guarantee the application requires. Failing any of those three is a different way for a model that “looked accurate” to fall apart in use. With validation and accuracy in hand alongside the well-posedness and applicability checks earlier in the chapter, the analyst’s part of the simulation pipeline is complete: the model has been examined for whether it can be solved, how it should be solved, and how much to trust the answer once it comes back.