Verification vs Validation in Simulations

You need to know two different things about a simulation: whether the code solves the equations correctly, and whether the model is accurate enough for the real-world problem. The first question is verification. The second question is validation.

But that is not enough. You also need to understand how much you can trust the numbers your simulation produces. That is where uncertainty quantification enters the workflow.

This guide explains the complete VVUQ framework: verification, validation, and uncertainty quantification. It also includes practical Python and FiPy examples that can help you apply these ideas in your own simulation work.

Key Takeaways

Verification answers the question: did we solve the equations right? It checks code correctness, numerical accuracy, and implementation bugs.
Validation answers the question: did we solve the right equations? It compares simulation output against experimental data, benchmark data, or trusted reference results.
Uncertainty Quantification answers the question: how much can we trust these results? It propagates input uncertainties through the model and reports confidence bounds on predictions.
UQ is often mentioned in V&V discussions but rarely implemented with practical Python examples. This guide gives concrete starting points.
The ASME VVUQ standards portfolio provides a widely adopted structure for VVUQ workflows across computational disciplines.

Verification, Validation, and Uncertainty Quantification: Why They Belong Together

If you have ever run a simulation and asked whether you can trust the results, you have already encountered the problem that VVUQ addresses.

Verification, validation, and uncertainty quantification are not separate tasks that you complete independently. They form a credibility pipeline:

Verification proves that the code is mathematically and computationally correct.
Validation checks whether the model represents the real system well enough for the intended use.
Uncertainty Quantification tells you how confident you can be in the predictions.

The ASME VVUQ framework formalizes this pipeline into a structured workflow used across computational solid mechanics, fluid dynamics, medical devices, and other simulation-heavy fields.

The distinction between verification and validation is often summarized through two questions.

Question	Activity	What You Are Checking
Are we solving the equations right?	Verification	Coding bugs, numerical errors, and discretization mistakes
Are we solving the right equations?	Validation	Physical model accuracy, boundary conditions, and assumptions

The third question is: how confident can we be? That is where uncertainty quantification enters the process.

A simulation can be verified and validated but still produce predictions with wide uncertainty intervals. If those intervals are too wide, the result may not be useful for design, regulation, or decision-making.

1. Verification: Proving Your Code Is Correct

Verification is mathematical and computational. It does not test the real world directly. Instead, it assumes the governing equations are correct and asks whether the computer program solves them without numerical or implementation errors.

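There are two main aspects to verification: code verification, which checks that the implementation matches the mathematical model, and solution verification, which estimates the numerical error in a specific run. This section focuses on code verification.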

Code Verification: Did the Implementation Match the Math?

Code verification demonstrates that the solver correctly implements the mathematical model. Two widely used methods are the Method of Manufactured Solutions and order of accuracy testing.

Method of Manufactured Solutions

The Method of Manufactured Solutions is one of the strongest tools for code verification in computational science. It works by creating a known analytical solution, computing the required source term, and then checking whether the code reproduces that solution.

The procedure is:

Choose a smooth analytical solution, such as u_m = sin(x) * cos(y) * exp(-t).
Substitute the manufactured solution into the PDE operator to compute the required source term.
Run the simulation with the manufactured source term and matching boundary or initial conditions.
Compare the numerical solution against the exact manufactured solution.
Run mesh refinement studies to verify that the observed convergence order matches the theoretical order.
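
The source-term step above can be automated with a symbolic algebra package. The following is a minimal sketch, assuming the governing equation is the diffusion equation du/dt - D * Laplacian(u) = s with D = 1 (the procedure itself does not fix a PDE), and using SymPy to derive s for the manufactured solution u_m listed above. The names `source` and `source_fn` are illustrative.

```python
import sympy as sp

x, y, t = sp.symbols("x y t")
D = sp.Integer(1)  # assumed diffusion coefficient

# Manufactured solution from the procedure above
u_m = sp.sin(x) * sp.cos(y) * sp.exp(-t)

# Insert u_m into du/dt - D * Laplacian(u) = s to obtain the source term s
laplacian = sp.diff(u_m, x, 2) + sp.diff(u_m, y, 2)
source = sp.simplify(sp.diff(u_m, t) - D * laplacian)
print(source)

# Convert the symbolic source into a NumPy-callable function for a solver
source_fn = sp.lambdify((x, y, t), source, "numpy")
```

For this choice the source term equals u_m itself, because du/dt = -u_m and Laplacian(u_m) = -2 * u_m.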

Here is a simplified FiPy-style example for a transient diffusion problem:

import numpy as np
from fipy import Grid2D, CellVariable, TransientTerm, DiffusionTerm

# Manufactured solution:
# u(x, y, t) = sin(pi*x) * sin(pi*y) * exp(-2*pi^2*t)

nx, ny = 32, 32
Lx, Ly = 1.0, 1.0

dx = Lx / nx
dy = Ly / ny

mesh = Grid2D(nx=nx, ny=ny, dx=dx, dy=dy)

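# Cell-center coordinates as plain NumPy arrays
x, y = mesh.cellCenters.value

u = CellVariable(name="u", mesh=mesh, hasOld=True)

# The exact solution vanishes on the boundary of the unit square.
# FiPy defaults to no-flux boundaries, so impose u = 0 explicitly.
u.constrain(0.0, where=mesh.exteriorFaces)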

def exact_solution(t):
    return np.sin(np.pi * x) * np.sin(np.pi * y) * np.exp(-2 * np.pi**2 * t)

# Initial condition
u.setValue(exact_solution(0.0))

# For this manufactured solution and diffusion coefficient 1:
# du/dt = Laplacian(u), so the source term is zero.
source = CellVariable(name="source", mesh=mesh, value=0.0)

eq = TransientTerm(var=u) == DiffusionTerm(coeff=1.0, var=u) + source

dt = 0.001
nt = 100

for step in range(nt):
    u.updateOld()
    eq.solve(var=u, dt=dt)

t_final = dt * nt
exact = exact_solution(t_final)

l2_error = np.sqrt(np.mean((u.value - exact) ** 2))

print(f"L2 error at t={t_final:.3f}: {l2_error:.6e}")

If you halve the mesh spacing and the error drops by about a factor of four, a second-order scheme is behaving as expected, provided the time step is small enough that the temporal error does not dominate. If the error does not drop at that rate, the implementation may contain a bug or the boundary treatment may reduce the observed order.

Order of Accuracy Testing

Order of accuracy testing verifies that the code achieves the expected convergence rate under mesh refinement.

For a second-order finite volume scheme, halving the mesh spacing should reduce the error by roughly a factor of four.

The procedure is:

Choose a problem with a known exact solution, either from MMS or a textbook benchmark.
Solve the problem on a sequence of refined meshes, such as 32×32, 64×64, and 128×128.
Compute an error norm at each refinement level, such as L1, L2, or L∞.
Plot error versus mesh size on a log-log plot.
Compute the observed order with order = log(e_coarse / e_fine) / log(h_coarse / h_fine).
Check that the observed order matches the theoretical discretization order within a reasonable tolerance.
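
The procedure above can be sketched without a full PDE framework. The example below is an illustration under simple assumptions: a 1D Poisson problem, -u'' = pi^2 sin(pi x) with u(0) = u(1) = 0, whose exact solution is sin(pi x), discretized with second-order central differences. The function name `solve_poisson_1d` is hypothetical.

```python
import numpy as np

def solve_poisson_1d(n):
    """Solve -u'' = pi^2 sin(pi x) on (0, 1), u(0) = u(1) = 0,
    with central differences on n intervals. Returns (h, max error)."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]  # interior nodes only
    m = n - 1
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    f = np.pi**2 * np.sin(np.pi * x)
    u = np.linalg.solve(A, f)
    error = np.max(np.abs(u - np.sin(np.pi * x)))  # L-infinity norm
    return h, error

# Sequence of refined meshes
results = [solve_poisson_1d(n) for n in (32, 64, 128)]

# Observed order between each pair of consecutive meshes
observed_orders = []
for (h_c, e_c), (h_f, e_f) in zip(results, results[1:]):
    order = np.log(e_c / e_f) / np.log(h_c / h_f)
    observed_orders.append(order)
    print(f"h = {h_c:.5f} -> {h_f:.5f}: observed order = {order:.3f}")
```

The observed orders should sit close to 2 for this scheme. A value that drifts toward 1 or lower on finer meshes is the kind of signal that warrants a closer look at the implementation.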

This is a minimum requirement for any PDE code that claims numerical correctness. Without it, error estimates and mesh-convergence conclusions are weak.

Important Warning About Cross-Code Comparison

Comparing two different codes can be useful as a sanity check, but it is not a substitute for verification against analytical or manufactured solutions.

Two codes can agree and still be wrong if both share the same systematic mistake. Cross-code comparison should be used as a supplementary check after proper MMS or order testing.

2. Validation: Comparing Simulations to Reality

Validation assesses whether the simulation model is sufficiently accurate for its intended use. It does this by comparing predictions with independent experimental data, benchmark data, or trusted reference results.

A key rule is that calibration and validation must be separate. Calibration adjusts model parameters to match data. Validation tests predictive power on independent data. Using the same dataset for both creates artificial confidence.
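
The separation can be made concrete with a small sketch. It uses synthetic data generated purely for illustration (not measurements) and a hypothetical one-parameter model y = a * x. The point is only that the data split happens before any fitting, and that the reported error comes from data the fit never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "measurements" for illustration only: y = 3 x plus noise
x = np.linspace(0.0, 1.0, 40)
y = 3.0 * x + rng.normal(0.0, 0.05, size=x.size)

# Split once, up front: even indices calibrate, odd indices validate
calib = np.arange(x.size) % 2 == 0
valid = ~calib

# Calibration: least-squares estimate of a in y = a x
a_fit = np.sum(x[calib] * y[calib]) / np.sum(x[calib] ** 2)

# Validation: prediction error on the held-out points
rmse_valid = np.sqrt(np.mean((a_fit * x[valid] - y[valid]) ** 2))
print(f"calibrated a = {a_fit:.3f}, held-out RMSE = {rmse_valid:.3f}")
```

In a real study the validation set would ideally come from a separate experiment rather than from interleaved points of the same one. The interleaved split here just keeps the sketch short.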

Benchmark Problems as Validation Targets

Benchmark problems are standardized test cases with well-characterized experimental or high-fidelity reference data. They provide objective validation targets.

Common benchmark categories include:

Fluid dynamics, such as flow past a cylinder or lid-driven cavity flow.
Transport equations, such as 1D advection-diffusion with known analytical behavior.
Phase-field models, such as Allen-Cahn or Cahn-Hilliard patterns compared against reference behavior.
Diffusion-reaction systems, such as Fisher-KPP wave speed validation.

For computational materials modeling and PDE-based machine learning, benchmark datasets such as PDEBench can provide standardized reference problems.

When Experimental Data Is Unavailable

High-quality experimental data is not always available. In that case, you can still build validation evidence by using the best available alternatives.

Use high-fidelity reference solutions, such as DNS for turbulent flows, when available.
Compare with analytical solutions for simplified cases.
Perform cross-code comparisons with independent, well-verified codes.
Be transparent about the limitation and characterize predictive uncertainty through sensitivity analysis.

Hierarchical Validation Strategy

A practical validation strategy should be hierarchical.

Start with simple benchmark problems that isolate specific physics.
Build complexity through system-level tests that combine several phenomena.
Document the validation problems, results, error metrics, and conclusions about adequacy for the intended use.

A model is rarely validated perfectly. It is validated to a certain level of accuracy for a specific intended use.

3. Uncertainty Quantification: Gauging Confidence in Predictions

Uncertainty Quantification is the process of characterizing, quantifying, and propagating uncertainties in model inputs, parameters, and approximations. It helps assess how uncertainty affects model outputs and prediction confidence.

No model is perfect, and every real system includes variability and incomplete knowledge. UQ helps you understand not only what the model predicts, but how certain that prediction is.

Types of Uncertainty

UQ distinguishes two main categories of uncertainty.

Type	What It Means	Can It Be Reduced?	Examples
Aleatory	Inherent randomness in a system	No, it is irreducible	Manufacturing tolerances and environmental fluctuations
Epistemic	Lack of knowledge about the system	Yes, through more data or better models	Unknown material properties and unmeasured initial conditions

This distinction matters because it affects the choice of UQ method. Aleatory uncertainty is usually described with probability distributions. Epistemic uncertainty is often represented with intervals or with distributions that are updated as evidence accumulates, and it can be reduced through more measurements, better models, or improved calibration.

UQ Methods: From Simple to Advanced

Monte Carlo Simulation

Monte Carlo simulation is the most straightforward UQ method. You repeatedly run the model with randomly sampled input values drawn from defined distributions. The result is a distribution of output values.

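import numpy as np

rng = np.random.default_rng(seed=0)  # seeded for reproducibility

# Example: propagate uncertainty in thermal conductivity.
# Assume k = 200 W/(m K) +/- 10%, represented as a uniform distribution.
k_samples = rng.uniform(180.0, 220.0, size=10_000)

# Simplified steady-state conduction through a slab:
# delta_T = q * L / (k * A)
q = 1000.0  # heat flow, W
L = 0.01    # slab thickness, m
A = 1.0     # cross-sectional area, m^2

delta_T = (q * L) / (k_samples * A)

# The percentile range describes the spread of the output distribution.
print(f"Mean temperature difference: {np.mean(delta_T):.4f} K")
print(
    "Central 95% interval: "
    f"[{np.percentile(delta_T, 2.5):.4f}, {np.percentile(delta_T, 97.5):.4f}] K"
)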

Monte Carlo is easy to understand and implement. Its main drawback is cost. If each model run is expensive, thousands of samples may be impractical.
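
The cost issue follows from how slowly the sampling error shrinks: the standard error of a Monte Carlo mean falls roughly like 1/sqrt(N). The sketch below reuses the values q, L, and A from the example above and compares two sample sizes. The dictionary name `std_errors` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
q, L, A = 1000.0, 0.01, 1.0  # same values as in the example above

std_errors = {}
for n in (100, 10_000):
    k = rng.uniform(180.0, 220.0, size=n)
    delta_T = q * L / (k * A)
    # Standard error of the estimated mean
    std_errors[n] = delta_T.std(ddof=1) / np.sqrt(n)
    print(f"N = {n:>6}: mean = {delta_T.mean():.5f} K, "
          f"standard error = {std_errors[n]:.2e} K")
```

Using 100 times more samples reduces the standard error by only about a factor of 10, which is why brute-force sampling becomes impractical when a single model run takes hours.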

Polynomial Chaos Expansion

Polynomial Chaos Expansion builds a surrogate model that represents the output as a polynomial function of uncertain inputs. It can be much more efficient than brute-force Monte Carlo when each simulation run is expensive.

Python tools such as EasyVVUQ and EasySurrogate can support polynomial chaos and surrogate-based UQ workflows.
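
To show the idea without depending on any particular UQ library, here is a minimal NumPy-only sketch of a one-dimensional polynomial chaos expansion. It is not the EasyVVUQ or EasySurrogate API. It assumes the same conduction model as the Monte Carlo example, with k = 200 + 20 * xi and xi uniform on [-1, 1], so Legendre polynomials are the matching orthogonal basis. The names `model`, `pce_mean`, and `pce_var` are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

q, L, A = 1000.0, 0.01, 1.0

def model(xi):
    """Temperature difference for k = 200 + 20*xi, xi uniform on [-1, 1]."""
    k = 200.0 + 20.0 * xi
    return q * L / (k * A)

degree = 4
# Gauss-Legendre quadrature: only degree + 1 model evaluations are needed
nodes, weights = legendre.leggauss(degree + 1)

# Project the model onto Legendre polynomials P_0 ... P_degree
P = legendre.legvander(nodes, degree)            # P[i, j] = P_j(nodes[i])
norms = 1.0 / (2.0 * np.arange(degree + 1) + 1)  # E[P_j^2] for xi ~ U(-1, 1)
coeffs = (P * (weights * model(nodes))[:, None]).sum(axis=0) / 2.0 / norms

# Output statistics follow directly from the coefficients
pce_mean = coeffs[0]
pce_var = np.sum(coeffs[1:] ** 2 * norms[1:])
print(f"PCE mean = {pce_mean:.6f} K, PCE std = {np.sqrt(pce_var):.6f} K")
```

Five model evaluations stand in for the thousands of samples used in the Monte Carlo example. That saving depends on the output being a smooth function of a small number of inputs, and the number of evaluations grows quickly as inputs are added.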

Sensitivity Analysis

Sensitivity analysis investigates how variation in model output can be attributed to variation in model inputs. It helps identify which parameters influence predictions most.

Two major categories are:

Local Sensitivity Analysis. This changes one input at a time while holding others fixed. It is simple but does not capture parameter interactions.
Global Sensitivity Analysis. This varies all inputs together and can account for interactions. Methods include Sobol indices, regression-based methods, and derivative-based methods.

Sensitivity analysis is useful for:

Prioritizing data collection by identifying which parameters matter most.
Simplifying models by fixing parameters that have negligible impact.
Understanding which mechanisms drive predictions.
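
As an illustration of the global approach, the sketch below estimates first-order Sobol indices with a pick-and-freeze sampling scheme (the estimator form attributed to Saltelli and co-authors). The test model Y = X1 + 2 * X2 with independent uniform inputs is hypothetical and chosen because its exact indices are known: 0.2 and 0.8.

```python
import numpy as np

rng = np.random.default_rng(1)

def model(X):
    """Hypothetical test model: Y = X1 + 2*X2, inputs independent U(0, 1)."""
    return X[:, 0] + 2.0 * X[:, 1]

n, d = 100_000, 2
A = rng.uniform(size=(n, d))  # first independent sample matrix
B = rng.uniform(size=(n, d))  # second independent sample matrix
fA, fB = model(A), model(B)
var_y = np.var(np.concatenate([fA, fB]))

first_order = np.empty(d)
for i in range(d):
    AB = A.copy()
    AB[:, i] = B[:, i]  # matrix A with column i taken from B
    first_order[i] = np.mean(fB * (model(AB) - fA)) / var_y

for i, s in enumerate(first_order, start=1):
    print(f"S{i} = {s:.3f}")
```

The estimator needs n * (d + 2) model runs, so for expensive simulations it is usually applied to a surrogate rather than to the full model.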

The Integrated VVUQ Pipeline: Putting It All Together

An integrated VVUQ pipeline follows a systematic workflow.

Identify and characterize uncertainties in inputs, physical parameters, and numerical approximations.
Perform code and solution verification to reduce numerical errors, coding bugs, and discretization uncertainty.
Propagate input uncertainties through the computational model to generate output distributions.
Validate model predictions against controlled experimental data or trusted benchmark results.
Support decision-making with probabilistic, risk-informed results rather than single deterministic predictions.

In practice, this workflow can be supported by tools such as EasyVVUQ, FabSim3, surrogate modeling libraries, and HPC-oriented workflow systems.

Common V&V Mistakes and How to Avoid Them

Verification Mistakes

Assuming the code is bug-free. Even widely used codes can contain undetected bugs. Regression testing with MMS cases can catch new errors.
Neglecting order of accuracy testing. Without confirming that observed convergence rates match the theoretical ones, error estimates are weak.
Using cross-code comparison as the only verification method. Two wrong codes can agree.
Treating verification as a one-time activity. Every meaningful code change should trigger relevant re-verification.

Validation Mistakes

Confusing calibration with validation. Tuning parameters to data and then validating against the same data inflates confidence.
Ignoring experimental uncertainty. A discrepancy may be acceptable if experimental uncertainty is larger than the difference.
Extrapolating beyond the validated regime. A model validated in one regime should not be trusted in a very different regime without additional evidence.
Poor documentation. Without detailed validation records, simulation credibility is difficult to assess.
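
The point about experimental uncertainty can be made quantitative. The sketch below follows the comparison-error idea used in ASME V&V 20 style assessments: compare the simulation-minus-experiment difference with a combined validation uncertainty. All numbers are hypothetical, and the root-sum-square combination assumes the three uncertainty sources are independent.

```python
import math

# Hypothetical values for illustration only (units: K)
simulation = 351.2    # model prediction
experiment = 349.0    # measured value
u_numerical = 0.8     # numerical uncertainty from solution verification
u_input = 1.5         # uncertainty propagated from model inputs
u_experiment = 2.0    # reported measurement uncertainty

comparison_error = simulation - experiment
u_validation = math.sqrt(u_numerical**2 + u_input**2 + u_experiment**2)

# If |E| is below the combined uncertainty, the data cannot resolve model error
within_noise = abs(comparison_error) <= u_validation
print(f"E = {comparison_error:.2f} K, u_val = {u_validation:.2f} K, "
      f"within noise: {within_noise}")
```

When the comparison error is smaller than the validation uncertainty, the discrepancy cannot be attributed to the model. Tightening the conclusion then requires better measurements or smaller numerical and input uncertainties, not more model tuning.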

UQ-Specific Mistakes

Assuming UQ is optional. Without uncertainty bounds, even a validated simulation yields a single number with no indication of how far the true value might lie from it.
Ignoring parameter interactions. One-at-a-time sensitivity analysis can miss interaction effects that global methods capture.
Treating all uncertainty as aleatory. Some uncertainty is epistemic and can be reduced through better data or better models.

Decision Guide: How Much VVUQ Rigor Do You Need?

The level of VVUQ rigor should match the consequences of model failure.

Situation	Recommended Rigor	Why
Academic research code	Basic verification with MMS and mesh convergence	Minimum acceptable level for credibility
Published simulation results	Verification plus validation against benchmarks	Supports reproducibility and peer review
Industrial design decisions	Full VVUQ with UQ bounds	Design choices and compliance can depend on the model
Safety-critical applications	ASME-style framework with independent review	Regulatory and liability risks are high
Machine learning surrogate	Surrogate accuracy verification plus UQ of prediction bounds	Surrogate error must be quantified
Code maintenance or bug fixes	Targeted MMS tests for affected modules	Cost-effective verification for changed code paths

The bottom line: academic code needs basic verification, published results need stronger validation, and industrial and safety-critical applications require comprehensive, documented VVUQ.

Summary and Next Steps

Verification, validation, and uncertainty quantification are not optional extras. They are core parts of credible scientific simulation.

A practical VVUQ pipeline should follow this sequence:

Start with code verification using MMS and order of accuracy testing.
Quantify numerical errors through solution verification for production runs.
Build validation evidence with benchmark problems and independent data.
Add uncertainty quantification to report confidence bounds on predictions.
Follow standards such as ASME VVUQ to structure the process and communicate rigor.

Even a basic VVUQ program, such as MMS tests, mesh convergence, and one UQ method, can dramatically increase confidence in simulation results and catch errors early.

References and Further Reading

Roy, C. J. (2005). Review of code and solution verification procedures for computational simulation. Journal of Computational Physics.
Oberkampf, W. L., & Roy, C. J. (2010). Verification and validation in scientific computing. Cambridge University Press.
ASME VVUQ Standards Portfolio: VVUQ 1, V&V 10, V&V 20
OSTI: Introduction to Verification, Validation, and Uncertainty Quantification