You just finished a simulation pipeline. Your results are publication-ready. Your code works. So why do you still need to document it?
Because without documentation, your software is harder to cite, harder to reproduce, and harder to maintain. Collaborators may not understand how to run it. Future students may not know which configuration file matters. Even you may forget key design decisions after several months.
Documentation is not a nice-to-have add-on for research software. It is the bridge between a working tool and a reproducible research asset. You do not need to be a technical writer to do it well. You need a structure.
Key Takeaways
- Documentation is part of reproducible research. Code without documentation becomes a black box that only its original author can maintain.
- Four documentation types serve four different user needs: tutorials for learning, how-to guides for doing, reference for describing, and explanation for understanding.
- The Diátaxis framework is a practical way to organize research software documentation.
- The Ten Simple Rules for documenting scientific software provide a useful checklist for writing better documentation.
- Templates and tools already exist. README structures, CITATION.cff files, Sphinx, MkDocs, and Read the Docs make the process more efficient.
Why Documentation Matters
Research software sits between science and engineering. Unlike traditional lab equipment, software can be shared, modified, reused, and cited by anyone with the right environment. But that potential means little if nobody understands how the software works.
The stakes are practical:
- Reproducibility. Without documentation, other researchers cannot reliably verify or reuse your work.
- Citation. Software that lacks citation guidance and usage instructions is less likely to receive proper credit.
- Maintenance. When a researcher leaves a lab, undocumented code often becomes technical debt for the next person.
Good documentation makes software easier to use, extend, review, and preserve. It also reduces the number of repeated questions from collaborators and future users.
This guide gives you a practical framework and templates for documenting research software effectively.
The Diátaxis Framework: Four Types of Documentation
If your documentation currently lives in one long README file, it may feel disorganized. Tutorials, installation notes, API details, theory, examples, and troubleshooting can easily get mixed together.
The Diátaxis framework solves this by separating documentation into four distinct types. Each type serves a different user need.
1. Tutorials
A tutorial is learning-oriented. It takes a beginner through a guided path and helps them achieve a concrete outcome.
Example: “Set up FiPy for your first phase-field simulation.”
A tutorial answers: how do I learn to use this software?
2. How-To Guides
A how-to guide is action-oriented. It helps someone complete a specific task after they already understand the basics.
Examples include:
- How to run a Monte Carlo simulation with FiPy.
- How to troubleshoot convergence errors.
- How to export simulation results to CSV.
A how-to guide answers: how do I accomplish a specific task?
3. Reference
Reference documentation is information-oriented. It is factual, neutral, and complete. It describes what the software provides without teaching or persuading.
Examples include:
- API documentation.
- Function signatures.
- Parameter specifications.
- Class definitions.
Reference documentation answers: what does this do?
4. Explanation
Explanation is understanding-oriented. It provides background, context, reasoning, and design rationale.
Examples include:
- Why the solver uses implicit time stepping.
- The mathematical model behind the implementation.
- Why one mesh strategy was chosen over another.
Explanation answers: why does this work this way?
The Ten Simple Rules for Documenting Research Software
The Ten Simple Rules for documenting scientific software are a practical checklist for documentation quality. They are useful because they focus on habits that researchers can apply without building a full documentation department.
Rule 1: Write Comments as You Code
Comments should explain the ideas and reasoning behind the algorithm, not merely repeat what the code already says.
# Good: explains why this approach is used
# Use implicit time stepping for stiff reaction terms to avoid
# timestep restrictions that would make the simulation impractical.
solver = ImplicitTimeStepping(reaction_terms)

# Bad: repeats the line without explaining intent
solver = ImplicitTimeStepping(reaction_terms)  # creates solver
Rule 2: Include Lots of Examples
Examples show users how the software works in practice. Provide executable examples that demonstrate the main workflow.
If the documentation becomes too crowded with examples, move them into a dedicated examples/ directory and link to them from the main documentation.
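One lightweight way to keep examples executable is to embed them in docstrings and check them with Python's standard `doctest` module. The sketch below uses a hypothetical `celsius_to_kelvin` helper that does not belong to any package mentioned in this guide; it only illustrates the pattern.

```python
import doctest


def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin.

    Hypothetical helper, used only to illustrate an executable example.

    >>> celsius_to_kelvin(25.0)
    298.15
    >>> celsius_to_kelvin(-273.15)
    0.0
    """
    return temp_c + 273.15


if __name__ == "__main__":
    # Run every example embedded in this module's docstrings.
    results = doctest.testmod()
    print(f"{results.attempted} examples run, {results.failed} failed")
```

Running `python -m doctest -v` on the file performs the same check, so the examples can be verified in continuous integration and are less likely to drift out of date.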
Rule 3: Include a Quickstart Guide
A quickstart guide should let someone use the software within a few minutes of downloading it. It should include installation, a minimal example, and expected output.
Without a quickstart, many users assume the software is too difficult to use and leave before testing it.
Rule 4: Write a Comprehensive README
Assume the README will be the only documentation many users read. It should cover the essentials clearly.
A strong README should include:
- A short project description.
- Installation instructions and dependencies.
- A quickstart example.
- License information.
- Citation instructions.
- A link to full documentation.
README template:
# Project Name
Short description: one sentence explaining what the software does.
## Installation
1. Clone this repository.
2. Run `pip install -e .` or your preferred install command.
3. Verify the installation:
```python
import mypackage
print(mypackage.__version__)
```
## Quickstart
```python
from mypackage import MySimulator
sim = MySimulator(config="default.yaml")
results = sim.run()
```
## Documentation
Full documentation: [Read the Docs link]
## Citation
Please cite this software using the CITATION.cff file.
## License
MIT License
Rule 5: Include a Help Command for CLIs
If your software has a command-line interface, include a clear --help flag. It should explain commands, required arguments, optional parameters, and examples.
Python tools such as argparse make this straightforward.
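As a minimal sketch, the parser below shows how `argparse` builds the `--help` output from the descriptions you supply. The `simulate` program name, the `config` argument, and the `--steps` option are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser():
    """Build the command-line parser for a hypothetical `simulate` tool."""
    parser = argparse.ArgumentParser(
        prog="simulate",
        description="Run a simulation from a YAML configuration file.",
        epilog="Example: simulate config.yaml --steps 500",
    )
    parser.add_argument("config", help="path to the YAML configuration file")
    parser.add_argument(
        "--steps",
        type=int,
        default=100,
        help="number of time steps to run (default: %(default)s)",
    )
    return parser


parser = build_parser()
# An explicit argument list is parsed here so the sketch runs anywhere;
# a real entry point would call parser.parse_args() with no arguments.
args = parser.parse_args(["config.yaml", "--steps", "500"])
print(f"Would run {args.config} for {args.steps} steps")
```

With a parser like this, `simulate --help` prints the description, both arguments with their help text, and the example from the epilog without any extra code.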
Rule 6: Version Control Your Documentation
Keep documentation alongside the code in version control. Users of older software versions need access to documentation that matches those versions.
Use versioned documentation hosting when possible so users can switch between releases.
Rule 7: Document Your API
Document public functions, classes, arguments, return values, and exceptions. Use a consistent docstring style so automated tools can generate readable API documentation.
Example:
def run_simulation(config_path, steps):
    """Run the simulation from a configuration file.

    Args:
        config_path: Path to the YAML configuration file.
        steps: Number of time steps to run.

    Returns:
        Simulation results object with fields, metadata, and diagnostics.

    Raises:
        ValueError: If the configuration file is invalid.
    """
    ...
Rule 8: Use Automated Documentation Tools
Do not write everything manually if tools can generate part of it. Automated documentation tools reduce repetitive work and keep documentation closer to the code.
Useful tools include:
- Sphinx for Python packages with complex APIs.
- MkDocs for simple Markdown-based documentation sites.
- Read the Docs for hosting and automatic documentation builds.
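For a Python package, a Sphinx setup can start from a very small `conf.py`. The sketch below assumes a `docs/conf.py` location and uses placeholder project metadata; the three extensions listed ship with Sphinx itself.

```python
# docs/conf.py: minimal Sphinx configuration (sketch).
# The project metadata below is placeholder text, not a real package.
project = "mypackage"
author = "Research Group Name"
release = "1.0.0"

extensions = [
    "sphinx.ext.autodoc",   # pull API documentation from docstrings
    "sphinx.ext.napoleon",  # parse Google- and NumPy-style docstrings
    "sphinx.ext.viewcode",  # link documented objects to highlighted source
]

html_theme = "alabaster"  # Sphinx's built-in default theme
```

With a configuration along these lines, an `automodule` directive in a reStructuredText page can render the docstrings of any package that is importable at build time.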
Rule 9: Write Actionable Error Messages
Good error messages tell users what went wrong, why it happened, and how to fix it.
# Bad
raise ValueError("Invalid input")

# Good
raise ValueError(
    "Input parameter 'temperature' must be between 0 and 3000 K. "
    f"Received {temperature} K. Check your simulation config file."
)
This saves debugging time and reduces support requests.
Rule 10: Tell People How to Cite Your Software
If you want your research software to receive credit, provide citation instructions. Include a DOI, BibTeX entry, and CITATION.cff file.
If the software does not have a journal publication, use Zenodo to mint a DOI for releases. Submitting to the Journal of Open Source Software can also make software easier to cite.
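A BibTeX entry for the README can be assembled from the same metadata you would put in a citation file. The sketch below is one possible approach, not a standard tool: `format_bibtex` is a hypothetical helper, and every metadata value, including the year, is a placeholder. The `@software` entry type is understood by biblatex; some classic BibTeX styles may treat it like `@misc`.

```python
def format_bibtex(key, meta):
    """Format a BibTeX @software entry from a metadata dictionary."""
    fields = {
        "author": " and ".join(meta["authors"]),
        "title": meta["title"],
        "version": meta["version"],
        "year": meta["year"],
        "doi": meta["doi"],
        "url": meta["url"],
    }
    # Each field becomes one indented "name = {value}," line.
    lines = [f"  {name} = {{{value}}}," for name, value in fields.items()]
    return "@software{" + key + ",\n" + "\n".join(lines) + "\n}"


# Placeholder metadata; replace every value with your project's details.
metadata = {
    "authors": ["Surname, First Name"],
    "title": "Project Name",
    "version": "1.0.0",
    "year": "2024",
    "doi": "10.5281/zenodo.xxxxxxx",
    "url": "https://github.com/username/project-name",
}

entry = format_bibtex("project_name", metadata)
print(entry)
```

Generating the entry from one metadata source keeps the README snippet consistent with the version and DOI of each release.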
Documentation Templates You Can Use Today
The Documentation Decision Tree
Before writing documentation, ask three questions:
- Who is it for? Users, developers, maintainers, reviewers, or collaborators?
- What do they want? Run the software, modify it, understand the model, or cite it?
- What format fits the need? Tutorial, how-to guide, reference, explanation, inline comment, or API page?
These questions help prevent a common mistake: writing one overloaded document for every audience.
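The last question can be reduced to a simple lookup from the reader's need to one of the four Diátaxis types. The sketch below is purely illustrative: the need labels and the `choose_doc_type` helper are assumptions made for this example, not part of the Diátaxis framework itself.

```python
# Map what a reader is trying to do onto the four Diátaxis documentation types.
DOC_TYPE_BY_NEED = {
    "learn": "tutorial",
    "do": "how-to guide",
    "look up": "reference",
    "understand": "explanation",
}


def choose_doc_type(need):
    """Return the documentation type that fits a reader's need."""
    try:
        return DOC_TYPE_BY_NEED[need]
    except KeyError:
        options = ", ".join(sorted(DOC_TYPE_BY_NEED))
        raise ValueError(
            f"Unknown need {need!r}. Expected one of: {options}."
        ) from None


print(choose_doc_type("do"))  # prints "how-to guide"
```

If a planned page seems to fit more than one entry in the table, that is usually a sign it should be split into separate pages.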
The CITATION File Format
The CITATION.cff file is a machine-readable and human-readable file for software citation. It can include:
- Software name and version.
- Authors and affiliations.
- DOI for the software.
- Citation metadata that tools can convert to BibTeX (the file itself is YAML and does not store BibTeX).
- Repository URL.
When paired with a Zenodo DOI, CITATION.cff becomes a stable citation record for your software.
Example CITATION.cff Template
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Project Name"
version: "1.0.0"
doi: "10.5281/zenodo.xxxxxxx"
authors:
  - family-names: "Surname"
    given-names: "First Name"
    affiliation: "Research Institution"
repository-code: "https://github.com/username/project-name"
license: "MIT"
Tools of the Trade
| Tool | Purpose | Best For |
|---|---|---|
| Sphinx | Generates documentation from docstrings and reStructuredText or Markdown | Python packages with complex APIs |
| MkDocs | Builds Markdown-based documentation sites | Lightweight projects needing a simple documentation site |
| Read the Docs | Hosts and auto-builds versioned documentation | Projects that need automatic documentation deployment |
| Doxygen | Generates documentation for C, C++, Python, and mixed-language projects | Projects with C++ or mixed scientific codebases |
| Zenodo | Mints DOIs and archives software releases | Long-term citation and reproducibility |
Common Mistakes and How to Avoid Them
Mistake 1: Treating Everything as a README
A README cannot do every job well. If you combine tutorials, reference, explanation, API details, and troubleshooting into one file, users struggle to find what they need.
Separate documentation by purpose using the Diátaxis quadrants.
Mistake 2: Writing Explanation Into Tutorials
Tutorials should be short, practical, and linear. If a beginner must understand the mathematical model before using the software, put that explanation on a separate page and link to it.
Mistake 3: Not Version-Controlling Documentation
If you change a default parameter in version 2.0, users of version 1.5 need the old documentation. Versioned docs prevent confusion and make older releases more usable.
Mistake 4: Assuming Future You Will Remember Everything
You may not remember your design decisions six months later. Comments and explanation pages serve as a lab notebook for your implementation choices.
Mistake 5: Forgetting Citation Instructions
Software without citation guidance often receives less credit. Include a DOI, BibTeX, and CITATION.cff file so users know exactly how to cite your work.
A Practical Documentation Workflow
You can implement documentation in stages. The goal is not to write everything at once, but to build the structure early and improve it as the project matures.
Phase 1: Before Writing Any Code
- Create a draft CITATION.cff file.
- Draft a skeleton README with project purpose, installation placeholder, and license.
- Decide whether the main audience is users, developers, maintainers, or all three.
Phase 2: During Development
- Write comments as you code, especially for algorithmic choices.
- Add docstrings for every public function and class.
- Set up Sphinx or MkDocs early so documentation builds alongside the code.
- Add examples when features become stable.
Phase 3: After Development
- Write a quickstart guide.
- Complete the README with installation, usage, license, and citation instructions.
- Write at least one how-to guide for the most common use case.
- Create or update the DOI through Zenodo, JOSS, or another appropriate publication route.
Related Guides
For related topics in scientific simulation workflows:
- Documentation Best Practices for Scientific Python Packages: Python-specific tooling and documentation structure.
- Continuous Integration for Research Software: CI/CD for automated testing and validation.
- Reading and Understanding FiPy Documentation: practical FiPy documentation patterns.
- Reproducible Research Workflows: Docker and Conda (environment reproducibility).
- Best Practices for Maintaining Scientific Code: long-term project maintenance.
Summary and Next Steps
Documentation transforms code from a fragile experimental artifact into a durable research asset. The Diátaxis framework gives structure. The Ten Simple Rules give a practical checklist. Templates and tools provide a fast starting point.
Start with one small step: create a CITATION.cff file and add citation guidance. This makes the software easier to cite and credit.
Then add a README with installation, quickstart, license, and documentation links. This is the most impactful document for usability.
Next, document the API with consistent docstrings and set up Sphinx or MkDocs. This helps users and future maintainers understand how the code works.
Finally, write at least one how-to guide for the most common use case. That is often the page collaborators actually need.
Every piece of documentation makes your software one step closer to reproducible research.
References and Further Reading
- Lee, B. D. (2018). Ten simple rules for documenting scientific software. PLOS Computational Biology, 14(12): e1006561. DOI: 10.1371/journal.pcbi.1006561
- Software Sustainability Institute. What are best practices for research software documentation?
- Procida, D. Diátaxis: A systematic approach to technical documentation authoring.
- Wilson, G., et al. (2014). Best practices for scientific computing. PLOS Biology, 12(1): e1001745. DOI: 10.1371/journal.pbio.1001745
- Journal of Open Source Software.
- Read the Docs.