TL;DR
In-situ visualization processes and analyzes simulation data while the simulation runs, directly in memory, instead of writing raw results to disk for later post-processing. This approach is essential for exascale computing where I/O bandwidth cannot keep up with data generation. By integrating visualization during computation, you can:
- Avoid I/O bottlenecks that cripple traditional workflows
- Monitor simulations in real-time and steer parameters on-the-fly
- Reduce storage needs by orders of magnitude
- Capture transient phenomena that would be missed with infrequent snapshots
The trade-off is added complexity and resource contention—you must decide what to visualize before the run, and visualization competes with the simulation for CPU/GPU cycles.
Introduction: Why In-Situ Visualization Matters
Scientific simulations generate vast amounts of data. A single exascale run can produce petabytes of output, but writing that data to disk becomes a fundamental bottleneck: storage systems cannot keep pace with computational throughput. The result is a widening gap between how fast simulations can produce data and how fast that data can be saved and analyzed.
In-situ visualization (from the Latin "in situ", meaning "in place") addresses this by performing visualization and analysis tasks concurrently with the simulation, accessing data directly from memory before it is written to disk. This paradigm shifts the workflow from:
Simulation → Storage → Visualization
to:
Simulation + Visualization → Storage of images/reduced data
As high-performance computing (HPC) scales toward exascale, in-situ techniques are no longer optional—they are becoming a necessity for scientific discovery. This guide explains how in-situ visualization works, compares it to traditional post-processing, surveys available tools, and provides practical advice for adopting it in your research projects.
How In-Situ Visualization Works: The Core Workflow
In-situ visualization leverages the same computational resources that run the simulation to perform visual analysis. There are two primary coupling strategies:
Tightly Coupled (Inline) In-Situ Visualization
The visualization code shares the same memory space and executable as the simulation. At specified intervals (e.g., every time step or when certain criteria are met), the simulation pauses, data is passed to visualization algorithms, and images or derived data are produced.
Pros:
- Lowest latency access to simulation data
- No network communication overhead
- Simplest data mapping (same memory layout)
Cons:
- Visualization competes directly for CPU/GPU cycles, slowing the simulation
- Increased memory footprint per node
- A crash in the visualization code can crash the entire simulation
Example tool: ParaView Catalyst, integrated directly into the simulation binary.
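To make the inline pattern concrete, here is a minimal sketch in Python. It is not the Catalyst API; the names (`render_slice`, `run_inline`, `viz_every`) are hypothetical, and the "rendering" is just a mid-plane slice extraction, but the structure matches the description above: the solver and the visualization hook share one in-memory array, and the solver pauses at fixed intervals while the hook reads it.

```python
import numpy as np

def render_slice(field, step):
    """Stand-in for a real in-situ render call (e.g., a Catalyst pipeline);
    here we just extract a mid-plane slice as the 'image'."""
    return field[field.shape[0] // 2].copy()

def run_inline(n_steps=100, viz_every=10):
    # Simulation state lives in memory shared with the viz hook: no copy, no disk I/O.
    field = np.zeros((16, 16, 16))
    images = []
    for step in range(n_steps):
        field += 0.01  # placeholder for the real solver update
        if step % viz_every == 0:
            # Tightly coupled: the solver pauses while viz reads the live array.
            images.append(render_slice(field, step))
    return images

images = run_inline()
```

Note that because the hook runs in the solver's address space, its cost and memory use come straight out of the simulation's budget, which is exactly the trade-off listed above.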
Loosely Coupled (In-Transit) In-Situ Visualization
The simulation runs on one set of compute nodes while visualization runs on a separate, co-allocated set of nodes. Data is transferred over the network (often via MPI) to the visualization processes.
Pros:
- Visualization does not directly interfere with simulation computation
- Can scale visualization independently
- Better fault isolation
Cons:
- Network transfer overhead
- Requires more total resources (separate nodes)
- Data may need to be repartitioned for visualization
Example frameworks: Damaris, which uses dedicated cores for asynchronous analysis.
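The in-transit pattern can be sketched without MPI by using a queue and a worker thread to stand in for the network link and the dedicated visualization nodes. All names here are hypothetical; a real deployment would ship the data between node sets (e.g., over MPI), but the key behaviors are the same: the simulation copies data off and continues immediately, while analysis runs concurrently elsewhere.

```python
import queue
import threading
import numpy as np

def run_in_transit(n_steps=50, viz_every=5):
    staging = queue.Queue()          # stands in for the network link to viz nodes
    results = []

    def viz_worker():
        # Runs concurrently, like a dedicated set of visualization nodes.
        while True:
            item = staging.get()
            if item is None:         # sentinel: simulation finished
                break
            step, data = item
            results.append((step, float(data.mean())))  # cheap stand-in analysis

    worker = threading.Thread(target=viz_worker)
    worker.start()
    field = np.zeros(1024)
    for step in range(n_steps):
        field += 1.0                 # placeholder solver update
        if step % viz_every == 0:
            # Ship a copy "off-node"; the simulation continues immediately.
            staging.put((step, field.copy()))
    staging.put(None)
    worker.join()
    return results

stats = run_in_transit()
```

The `field.copy()` illustrates the cost column of this mode: data must leave the solver's memory, so you pay transfer overhead in exchange for isolation.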
Hybrid Approaches
Many real-world deployments use a hybrid strategy: tightly coupled within a node (using the same CPU/GPU) but loosely coupled across nodes, or employing asynchronous pipelines in which visualization of one time step overlaps with the simulation's subsequent steps on the same resources.
In-Situ vs. Post-Processing: A Comparative Analysis
| Feature | In-Situ Visualization | Post-Processing Visualization |
|---|---|---|
| Timing | Real-time, during simulation | Delayed (after simulation) |
| Data Flow | Processes data in-memory | Writes data to disk, then reads it |
| I/O Overhead | Very low (minimizes disk writes) | Very high (disk bottleneck) |
| Storage Needs | Low (only images/summaries saved) | Extremely high (full raw data) |
| Temporal Resolution | High (every time step possible) | Low (limited by storage capacity) |
| Flexibility | Low (decisions fixed before run) | High (can re-analyze saved data) |
| Compute Impact | Shares resources with simulation | No impact on simulation runtime |
When to Use In-Situ Visualization
- Exascale simulations where I/O bandwidth is the limiting factor
- Transient phenomena that require high temporal fidelity (e.g., fast-moving fronts, turbulence)
- Debugging—you need to catch errors early before wasting millions of core hours
- Simulation steering—you want to adjust parameters on-the-fly based on intermediate results
- Data reduction—you need to extract features, statistics, or images and discard raw data
When Post-Processing Still Makes Sense
- Small to moderate datasets that can be stored affordably
- Exploratory analysis where you need to try many different visualization techniques after the fact
- High-fidelity rendering that requires the full dataset and cannot be done with reduced representations
- Archival purposes—preserving raw data for future reproducibility
In practice, many teams adopt a hybrid strategy: use in-situ to produce reduced datasets (e.g., Cinema databases, statistics, isosurfaces) and retain just enough raw data for select time steps, then perform detailed post-processing on the reduced set.
Popular Tools and Frameworks for In-Situ Visualization
Several mature open-source frameworks support in-situ integration:
ParaView Catalyst
ParaView Catalyst is a lightweight library that enables in-situ analysis using the VTK pipeline. It provides a small API that simulations in C++, C, Fortran, or Python can call to pass data to ParaView’s visualization algorithms. Catalyst can generate images, extract features, or produce Cinema databases for later exploration.
Key strengths:
- Leverages the full power of VTK and ParaView
- Supports both tightly and loosely coupled modes
- Active development and extensive documentation
VisIt LibSim
VisIt offers LibSim, a set of libraries that allow simulations to connect to VisIt’s visualization engine. Like Catalyst, it supports inline and in-transit modes. VisIt is particularly strong for large-scale parallel visualization and has been used on many leadership-class systems.
Key strengths:
- Highly scalable to hundreds of thousands of cores
- Rich set of visualization operators
- Strong support for adaptive mesh refinement (AMR) data
Ascent
Ascent is a lightweight, many-core capable in-situ library designed for resource-constrained environments. It is especially effective for GPU-accelerated applications and can run efficiently on a small subset of nodes.
Key strengths:
- Minimal overhead
- Good GPU support
- Part of the U.S. Exascale Computing Project
Damaris
Damaris focuses on using dedicated cores for asynchronous in-situ analysis, isolating visualization from the main simulation to minimize performance impact. It supports both VisIt and ParaView backends.
Key strengths:
- Asynchronous execution reduces simulation interference
- Handles data movement and repartitioning
- Suitable for long-running, steady-state simulations
SENSEI and Conduit
SENSEI provides a generic interface that decouples simulations from specific in-situ tools. Conduit handles data description and exchange. This abstraction allows the same simulation code to work with Catalyst, Ascent, or other backends without code changes.
When to choose: If you want to avoid vendor lock-in and support multiple in-situ frameworks.
Implementation Challenges: What Can Go Wrong
Integrating in-situ visualization into an existing simulation codebase is not trivial. Be prepared for these common challenges:
Resource Contention and Performance Overhead
Visualization consumes CPU cycles, GPU time, and memory that would otherwise be available to the simulation. Studies have shown that poorly optimized in-situ pipelines can slow simulations by 20–50% or more. The key is to measure and bound the overhead.
Mitigation strategies:
- Use asynchronous or in-transit modes to separate resources
- Limit visualization frequency (e.g., every 10th time step)
- Employ data reduction algorithms to reduce processing volume
- Dedicate a subset of cores/GPUs exclusively to visualization
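The "limit visualization frequency" strategy is often combined with trigger criteria, so the expensive pipeline runs on a fixed cadence or when something interesting happens, and the common case costs only a cheap check. A minimal sketch (the function name and parameters are hypothetical):

```python
def should_visualize(step, field_max, every=10, threshold=0.9):
    """Gate in-situ work: run on a fixed cadence OR when a trigger fires,
    so most time steps pay only the cost of this comparison."""
    on_cadence = step % every == 0
    triggered = field_max > threshold   # e.g., a shock or instability indicator
    return on_cadence or triggered

# With a quiet field, only the cadence steps fire.
fired = [s for s in range(100) if should_visualize(s, field_max=0.5)]
```

Real frameworks expose similar controls (e.g., per-pipeline output frequency), but bounding overhead this way is something you can also do entirely on the simulation side.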
Memory Footprint
Loading visualization libraries alongside the simulation increases per-node memory usage. On memory-bound systems, this can force a reduction in problem size per node.
Mitigation:
- Choose lightweight frameworks (Ascent is designed for this)
- Release visualization memory promptly after use
- Consider in-transit modes where visualization runs on separate nodes
Data Conversion and Interoperability
Simulation codes often use custom data structures optimized for PDE solvers. In-situ tools built on VTK expect data in specific formats (e.g., structured grids, unstructured meshes). Writing adapters to translate between these representations can be complex and may involve expensive deep copies.
Solution:
- Use abstraction layers like SENSEI/Conduit to define a common data model
- Explore zero-copy interfaces if your simulation and visualization share the same underlying layout
- Factor adapter code into a separate module to isolate complexity
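The zero-copy idea can be illustrated with NumPy: if the solver's buffer already has a layout the visualization side understands, the adapter only reinterprets memory instead of copying it. This is a sketch with hypothetical names (`as_viz_array`), not any particular library's interface.

```python
import struct
import numpy as np

def as_viz_array(sim_buffer, shape):
    """Wrap the simulation's flat buffer as a grid-shaped array without
    copying; a real adapter would hand this view to the viz library."""
    return np.frombuffer(sim_buffer, dtype=np.float64).reshape(shape)

# The solver owns one contiguous buffer; the adapter only reinterprets it.
raw = bytearray(8 * 4 * 4)           # 4x4 grid of float64
view = as_viz_array(raw, (4, 4))

# Writing through the simulation's buffer is visible in the viz view: no copy.
struct.pack_into("d", raw, 0, 3.5)
zero_copy_ok = view[0, 0] == 3.5
```

When layouts do not match (e.g., array-of-structs vs. struct-of-arrays), a deep copy is unavoidable, which is why it pays to check layout compatibility before choosing an adapter strategy.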
Parallel Scalability
Visualization algorithms must scale to the same size as the simulation (potentially millions of MPI ranks). Global operations like volumetric rendering or isosurface extraction across all data can become communication bottlenecks.
Solution:
- Use distributed visualization algorithms that minimize cross-node communication
- Employ data reduction or sampling to lower the effective data volume
- Consider hierarchical approaches: local analysis first, then aggregate results
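The hierarchical approach in the last bullet can be sketched as two stages: each rank summarizes its own subdomain locally, then only the tiny summaries are combined. The function names are hypothetical, and plain lists stand in for MPI ranks and a reduction, but the communication-saving structure is the point.

```python
import numpy as np

def local_stats(subdomain):
    # Stage 1: each rank summarizes its own data; no raw data leaves the node.
    return {"max": float(subdomain.max()),
            "sum": float(subdomain.sum()),
            "n": subdomain.size}

def aggregate(stats_list):
    # Stage 2: only small summaries cross the network (e.g., via an MPI reduce).
    total_n = sum(s["n"] for s in stats_list)
    return {"max": max(s["max"] for s in stats_list),
            "mean": sum(s["sum"] for s in stats_list) / total_n}

subdomains = [np.full(100, float(r)) for r in range(4)]   # 4 mock ranks
global_stats = aggregate([local_stats(d) for d in subdomains])
```

The same two-stage shape works for histograms, feature counts, or per-block isosurface fragments: compute locally, exchange only reduced results.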
The ‘A Priori’ Problem
In-situ visualization requires you to decide before the run what to visualize. If you later realize you needed to examine a different variable or apply a different filter, the raw data is gone. This is the fundamental inflexibility of in-situ.
Workarounds:
- Save a reduced but flexible dataset (e.g., a Cinema database) that supports later re-rendering from different viewpoints
- Periodically save small raw data snapshots at key time steps
- Use simulation steering to dynamically adjust visualization parameters as the run progresses
Code Complexity and Fragility
Tight coupling means the visualization library becomes part of the simulation executable. A bug in the visualization code can crash the entire simulation. The integration also increases code complexity and maintenance burden.
Best practices:
- Encapsulate all in-situ calls behind a clean interface
- Keep visualization code separate from core solvers
- Provide runtime configuration to enable/disable in-situ easily
- Test with visualization disabled to ensure simulation stability
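The best practices above can be combined in one small wrapper: a single interface the solver calls, a runtime enable flag, and failure handling that degrades instead of aborting. This is an illustrative sketch (the class and its policy are hypothetical), not a pattern mandated by any specific framework.

```python
class InSituAdapter:
    """Thin wrapper so the solver never calls the viz library directly.
    'enabled' comes from a runtime flag; failures are contained, not fatal."""
    def __init__(self, enabled, backend=None):
        self.enabled = enabled
        self.backend = backend        # real call into Catalyst/Ascent/... goes here
        self.errors = 0

    def process(self, step, fields):
        if not self.enabled:
            return
        try:
            self.backend(step, fields)
        except Exception:
            self.errors += 1                 # degrade gracefully, keep simulating
            self.enabled = self.errors < 3   # give up after repeated failures

# A healthy backend: no errors accumulate.
adapter = InSituAdapter(enabled=True, backend=lambda s, f: None)
for step in range(5):
    adapter.process(step, fields={})

# A buggy backend: the simulation loop survives every failure.
def _boom(step, fields):
    raise RuntimeError("viz bug")

flaky = InSituAdapter(enabled=True, backend=_boom)
for step in range(5):
    flaky.process(step, fields={})
```

With this shape, "test with visualization disabled" is just constructing the adapter with `enabled=False`.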
Best Practices for Successful In-Situ Adoption
Based on experience from the HPC community, follow these guidelines to maximize success:
1. Start with a Prototype
Before committing to a full integration, build a small prototype that exercises the basic pipeline: simulation → in-situ tool → output image. This validates the toolchain and reveals integration hurdles early.
2. Adopt Asynchronous or Hybrid Architectures
Avoid tightly coupled modes unless you have strong evidence that overhead will be acceptable. Prefer in-transit or asynchronous approaches where visualization runs on dedicated cores or nodes. This isolates performance interference.
3. Define Clear Data Reduction Goals
In-situ is most effective when you reduce data before saving it. Decide upfront:
- What images or movies do you need?
- Which derived quantities (e.g., integrated fluxes, maximum values) are essential?
- Can you use Cinema databases for flexible post-hoc exploration?
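Derived quantities like "maximum pressure over time" are the cheapest form of in-situ reduction: a full 3D field collapses to a few floats per step. A minimal sketch of such an accumulator (the class name is hypothetical):

```python
import numpy as np

class RunningReduction:
    """Accumulate tiny derived quantities instead of saving raw fields."""
    def __init__(self):
        self.max_pressure = []
        self.mean_pressure = []

    def update(self, step, pressure):
        # A 3D field of millions of cells reduces to two floats per step.
        self.max_pressure.append(float(pressure.max()))
        self.mean_pressure.append(float(pressure.mean()))

red = RunningReduction()
for step in range(3):
    red.update(step, np.full(1000, step * 2.0))   # mock pressure field
```

Deciding these quantities upfront is exactly the "a priori" commitment discussed earlier, so choose reductions you are confident will answer the scientific question.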
4. Measure Everything
Instrument your code to measure:
- Simulation time with and without in-situ enabled
- Memory usage per node
- Frequency and duration of visualization pauses
- Quality and utility of produced outputs
Without metrics, you cannot tell whether in-situ is helping or hurting.
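A lightweight way to get the first two metrics is to time the in-situ calls separately from the rest of the loop, then report overhead as a fraction of total runtime. The class below is a hypothetical sketch using only the standard library; real deployments often use profilers such as Darshan or the framework's own timers instead.

```python
import time
from contextlib import contextmanager

class InSituTimer:
    """Accumulate time spent inside in-situ calls, separate from solver time."""
    def __init__(self):
        self.insitu_seconds = 0.0
        self.total_seconds = 0.0

    @contextmanager
    def insitu(self):
        t0 = time.perf_counter()
        try:
            yield
        finally:
            self.insitu_seconds += time.perf_counter() - t0

    def overhead_fraction(self):
        return self.insitu_seconds / self.total_seconds if self.total_seconds else 0.0

timer = InSituTimer()
t_start = time.perf_counter()
for step in range(3):
    # ... solver work would go here ...
    with timer.insitu():
        time.sleep(0.001)     # stand-in for a visualization call
timer.total_seconds = time.perf_counter() - t_start
```

Comparing `overhead_fraction()` against a target budget (the checklist later suggests under 10-15%) turns "is in-situ hurting?" into a number you can track per run.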
5. Leverage Existing Infrastructure
Don’t reinvent the wheel. Use established frameworks (Catalyst, Ascent, Damaris) rather than writing custom in-situ code from scratch. They handle many low-level details (MPI communication, data repartitioning, ghost cells) that are error-prone to implement yourself.
6. Plan for Simulation Steering
If you want to steer simulations interactively, design your in-situ pipeline to accept external control messages. Tools like ICARUS (for ParaView) provide steering capabilities with minimal code changes.
7. Use Ghost Cells for Parallel Boundaries
When your simulation domain is decomposed across MPI ranks, visualization algorithms that need neighbor data (e.g., gradients, isosurfaces) must communicate across subdomain boundaries. Precompute and store “ghost cells” to avoid repeated communication during visualization.
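The ghost-cell exchange can be illustrated in 1D: each subdomain is padded with copies of its neighbors' boundary values once per step, so stencil-based visualization never has to fetch neighbor data mid-algorithm. This sketch uses plain arrays in place of MPI ranks, and the function name is hypothetical.

```python
import numpy as np

def exchange_ghosts(subdomains, width=1):
    """Fill each subdomain's ghost layers from its neighbors once per step,
    so stencil-based viz (gradients, isosurfaces) needs no extra messages."""
    padded = []
    for i, local in enumerate(subdomains):
        # Domain boundaries get zeros here; a real code would apply its BCs.
        left = subdomains[i - 1][-width:] if i > 0 else np.zeros(width)
        right = subdomains[i + 1][:width] if i < len(subdomains) - 1 else np.zeros(width)
        padded.append(np.concatenate([left, local, right]))
    return padded

parts = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
with_ghosts = exchange_ghosts(parts)
```

In a real 3D decomposition the same idea applies per face (and possibly per edge and corner), which is part of what established frameworks handle for you.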
8. Keep Visualization Optional
Wrap all in-situ calls behind a runtime flag. The simulation should run normally with in-situ disabled. This is essential for debugging and for scenarios where you want to run without the overhead.
Real-World Applications: Where In-Situ Shines
In-situ visualization has become indispensable in several scientific domains:
Computational Fluid Dynamics (CFD)
Turbulent flow simulations generate massive 3D fields of velocity and pressure. In-situ visualization can extract 2D slices, streamlines, or vortex cores in real-time, allowing engineers to monitor flow separation, shock waves, or combustion instability as they develop.
Astrophysics and Cosmology
Cosmological simulations with adaptive mesh refinement (AMR) track dark matter halos and gas densities over billions of years. In-situ analysis identifies halos, classifies structures, and tracks mergers at high temporal resolution—tasks that would be impossible with infrequent saved snapshots.
Molecular Dynamics and Materials Science
Ensembles of molecular dynamics runs can be monitored in-situ to detect rare events like nucleation, phase transitions, or dislocation nucleation. Researchers can stop simulations early when an event occurs, saving computational resources.
Weather and Urgent Computing
Ensemble weather forecasts for hurricanes or wildfires require running dozens of simulations in parallel. In-situ topological analysis identifies the most probable outcome from the ensemble, providing critical information for emergency decisions before the full dataset is written.
Performance Visualization
Ironically, in-situ techniques are also used to visualize the performance of the HPC system itself—monitoring memory bandwidth, communication patterns, and load balance during execution to guide optimization.
Getting Started: A Practical Checklist
If you’re considering adding in-situ visualization to your simulation code, follow this step-by-step plan:
Phase 1: Evaluation (1–2 weeks)
- [ ] Quantify your I/O problem: How much data do you generate? How long does writing take? Would reducing output save significant time?
- [ ] Identify visualization goals: What specific images or analyses do you need during the run? (e.g., “2D slice of temperature every 100 steps”, “maximum pressure over time”)
- [ ] Survey tools: Review ParaView Catalyst, VisIt LibSim, Ascent, and Damaris. Which fits your programming language and HPC environment?
- [ ] Check system support: Does your supercomputing center support in-situ workflows? Are the required libraries installed?
Phase 2: Prototype (2–4 weeks)
- [ ] Set up a minimal example: Take a small test case (e.g., a simple heat equation solver) and integrate the chosen in-situ framework to produce a single image.
- [ ] Measure baseline overhead: Compare simulation runtime with and without in-situ enabled. Is the overhead acceptable (< 10–15%)?
- [ ] Test data reduction: Experiment with different reduction strategies (subsampling, isosurfaces, statistical aggregates). Does the output meet your scientific needs?
- [ ] Verify scalability: Run the prototype on 1, 10, 100 ranks. Does overhead scale linearly or does it explode at larger scales?
Phase 3: Integration (4–8 weeks)
- [ ] Design an abstraction layer: Create a module that encapsulates all in-situ calls behind a simple API (e.g., beginTimestep(), addData(field), execute(), endTimestep()). This makes it easy to swap frameworks later.
- [ ] Add configuration: Use runtime flags or environment variables to control in-situ frequency, output formats, and which variables to capture.
- [ ] Implement graceful degradation: Ensure the simulation continues even if in-situ fails (e.g., write a warning but don’t abort).
- [ ] Integrate with existing I/O: Decide how in-situ outputs (images, databases) will coexist with any existing post-processing data dumps.
- [ ] Add steering if needed: If interactive steering is a goal, implement callbacks or message passing to allow external parameter changes.
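The abstraction-layer checklist item can be made concrete with a skeleton implementing the four calls it names. This is a sketch, not any framework's API: `execute()` is where a real backend (Catalyst, Ascent, ...) would be invoked, and the recording of executed steps is only there to make the flow visible.

```python
class InSituPipeline:
    """Minimal version of the checklist's four-call API."""
    def __init__(self, backend=None):
        self.backend = backend or (lambda step, data: None)
        self._step = None
        self._data = {}
        self.executed_steps = []

    def beginTimestep(self, step):
        self._step, self._data = step, {}

    def addData(self, name, field):
        # Register fields by name; a real adapter would wrap, not copy, them.
        self._data[name] = field

    def execute(self):
        self.backend(self._step, self._data)   # hand off to the in-situ backend
        self.executed_steps.append(self._step)

    def endTimestep(self):
        self._step, self._data = None, {}

pipe = InSituPipeline()
for step in range(3):
    pipe.beginTimestep(step)
    pipe.addData("temperature", [0.0] * 8)
    pipe.execute()
    pipe.endTimestep()
```

Because the solver only ever sees these four calls, swapping Catalyst for Ascent (or disabling in-situ entirely) becomes a change to the backend, not to the solver.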
Phase 4: Validation and Production (2–4 weeks)
- [ ] Run end-to-end tests: Execute a full scientific simulation with in-situ enabled. Verify that outputs are correct and that the simulation completes within acceptable time.
- [ ] Stress test: Push the limits—run on the full intended core count, with maximum problem size. Monitor memory, communication, and stability.
- [ ] Document the workflow: Write clear instructions for users: how to enable in-situ, what options are available, where outputs are written, how to interpret them.
- [ ] Train users: Ensure team members understand the in-situ workflow and can troubleshoot common issues.
Common Pitfalls and How to Avoid Them
| Pitfall | Symptom | Prevention / Fix |
|---|---|---|
| Unbounded overhead | Simulation slows by > 30% | Limit in-situ frequency; use in-transit; profile to find bottlenecks |
| Memory exhaustion | Jobs killed by OOM killer | Reduce data volume passed; use fewer dedicated cores; switch to loosely coupled mode |
| Missing variables | Later realize you needed to save quantity X | Save small raw snapshots periodically; use Cinema format for flexibility |
| Poor scaling | In-situ time increases with core count | Use distributed visualization algorithms; repartition data; aggregate locally first |
| Visualization crashes simulation | One bad visualization call aborts entire run | Wrap in-situ calls in try-catch; disable in-situ on failure; keep simulation code separate |
| Output files overwrite each other | Multiple runs write to same images | Include timestamps or job IDs in output filenames; use separate output directories |
Conclusion: Is In-Situ Visualization Right for You?
In-situ visualization is a powerful paradigm for tackling the data deluge of modern scientific computing. By processing data in memory during simulation runs, you can overcome I/O bottlenecks, reduce storage costs, and gain real-time insight into your models. However, it is not a silver bullet: it introduces complexity, requires upfront design decisions, and can impact simulation performance if implemented poorly.
For research groups running large-scale PDE simulations on HPC systems—especially those approaching exascale—learning and adopting in-situ techniques is increasingly essential. Start small with a prototype, measure rigorously, and scale up deliberately. The investment pays off in faster iteration cycles, richer scientific insight, and the ability to run simulations that would otherwise be impossible due to data volume.
Related Guides
- Visualizing Simulation Results Effectively – General principles for turning simulation data into actionable visuals.
- Illuminator Distributed Visualization Library – Overview of a specialized tool for parallel rendering and storage.
- Managing Large-Scale PDE Problems – Strategies for solving massive systems efficiently.
- Understanding Phase-Field Models – A common application area for PDE simulations.
- Using FiPy for Phase-Field Modeling – Practical implementation in Python.
Next Steps
Ready to try in-situ visualization? Begin by:
- Assessing your current I/O bottlenecks: Run a profiling tool (e.g., Darshan) on an existing simulation to quantify time spent writing data.
- Installing ParaView Catalyst or Ascent on your development system and running the tutorials.
- Building a minimal prototype with a simple test case to measure overhead and validate the toolchain.
- Reaching out to your institution’s HPC support team—they often have expertise and example codes to get you started.
If you need help integrating in-situ visualization into your research software, contact us for consultation and training tailored to scientific computing teams.