
Excellent documentation transforms scientific Python packages from unusable code to reproducible research assets. Adopt a Documentation-as-Code approach: store docs alongside code, use Sphinx with NumPy or Google-style docstrings, automate builds with Read the Docs, and integrate documentation updates into every code review. Include a clear README, maintain a CHANGELOG, and test examples with doctest. Treat documentation as a first-class deliverable, not an afterthought.

Why Documentation Matters in Scientific Python

Scientific software often fails to achieve impact not because of flawed algorithms, but because others (or even the original authors, months later) cannot understand or reproduce the work. According to a study of scientific software best practices, clear documentation is essential for reproducibility, maintainability, and peer validation. Even more than commercial software, research code demands careful documentation so that computational results can be trusted, verified, and extended.

The consequences of poor documentation in scientific contexts include:

  • Irreproducible results due to unclear configuration
  • Wasted time reverse-engineering one’s own code months later
  • Inability to build upon others’ work
  • Failed peer review of computational methods
  • Abandoned projects when original developers leave

Good documentation bridges the gap between mathematical formulation and working simulation—the very gap MatForge aims to close.

The Documentation-as-Code Philosophy

The most effective approach to documentation in scientific Python projects is Documentation-as-Code (DaC): treat documentation with the same rigor as source code. This means:

  1. Version documentation alongside code – Store Markdown or reStructuredText files in a docs/ directory within the same repository as your source code. This ensures that documentation always matches the corresponding code version.
  2. Review documentation in pull requests – Make documentation updates mandatory for any code change that alters functionality. A code review is incomplete if the documentation is not updated.
  3. Automate building and deployment – Use GitHub Actions or GitLab CI to build documentation automatically on each push and deploy to hosting services like Read the Docs.
  4. Apply the same quality standards – Lint your Markdown, check for broken links, and treat documentation bugs with the same seriousness as code bugs.

This approach prevents the most common documentation failure: docs that drift out of sync with the code they describe.

The Diátaxis Framework: Four Documentation Types

Effective documentation serves distinct purposes. The Diátaxis framework divides documentation into four categories:

1. Tutorials (Learning-Oriented)

Tutorials are step-by-step lessons that guide newcomers through a complete, meaningful task. They should be concrete, hands-on, and result in a working outcome. For scientific Python packages, tutorials might include:

  • Setting up FiPy for a simple diffusion problem
  • Running your first phase-field simulation
  • Validating a PDE solver against an analytical solution

Key principle: Tutorials teach by doing. Avoid abstract concepts; focus on practical steps with immediate feedback.

2. How-to Guides (Goal-Oriented)

How-to guides provide recipes for specific tasks. Unlike tutorials, they assume basic familiarity and target a clear objective. Examples:

  • How to implement custom boundary conditions in FiPy
  • How to parallelize your simulation with MPI
  • How to profile and optimize a PDE solver

Structure: Present a clear goal, then provide numbered steps or code snippets that achieve it.

3. Technical Reference (Information-Oriented)

API reference documentation describes what each function, class, and module does. This is where comprehensive docstrings become critical. Reference documentation should be exhaustive and precise, allowing experienced users to look up details quickly.

4. Explanation (Understanding-Oriented)

Explanations discuss background, design decisions, and conceptual models. They answer “why” questions that tutorials and reference docs cannot. Examples:

  • Why choose finite volume over finite element methods?
  • Understanding numerical stability in time-stepping
  • The mathematics behind phase-field models

A well-structured documentation set includes all four types, each in its proper place.

Setting Up Your Documentation Stack

For scientific Python packages, the de facto standard toolchain is Sphinx with Read the Docs hosting.

Sphinx: The Documentation Engine

Sphinx is a powerful documentation generator that transforms reStructuredText or Markdown into professional websites, PDFs, and e-books. Its key features for scientific software:

  • Automatic API documentation – Sphinx can extract docstrings from your Python code and generate API reference pages automatically via the autodoc extension.
  • Cross-references – Link between documentation pages and to external projects easily.
  • Mathematical notation – Support for LaTeX equations rendered with MathJax, essential for scientific content.
  • Extensible – Hundreds of extensions for custom functionality.

To get started:

pip install sphinx sphinx-rtd-theme
sphinx-quickstart

Configure conf.py to include your package’s path and enable extensions like sphinx.ext.autodoc, sphinx.ext.napoleon (for Google/NumPy docstrings), and sphinx.ext.mathjax.
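A minimal conf.py along those lines might look like the sketch below (the project name and relative path are placeholders to adapt to your layout):

```python
# docs/conf.py -- minimal Sphinx configuration sketch
import os
import sys

# Make the package importable so autodoc can find it
# (assumes the package lives one directory above docs/)
sys.path.insert(0, os.path.abspath(".."))

project = "mypackage"
author = "Your Name"

extensions = [
    "sphinx.ext.autodoc",   # extract API docs from docstrings
    "sphinx.ext.napoleon",  # parse Google/NumPy docstring styles
    "sphinx.ext.mathjax",   # render LaTeX math in HTML output
]

html_theme = "sphinx_rtd_theme"
```

With this in place, `sphinx-build -b html docs/ docs/_build/html` produces the HTML site.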

Read the Docs: Free Hosting with Automation

Read the Docs is a free hosting platform for Sphinx documentation. It integrates seamlessly with GitHub:

  • Connect your repository
  • Read the Docs automatically builds documentation on each push
  • Custom domains, version selection, and PDF downloads available
  • Supports multiple versions (stable, latest, tagged releases)

This automation ensures your documentation is always up-to-date with your code.
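Read the Docs reads its build settings from a .readthedocs.yaml file at the repository root. A minimal sketch (the Python version and requirements path are assumptions to adjust for your project):

```yaml
# .readthedocs.yaml
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.11"

sphinx:
  configuration: docs/conf.py

python:
  install:
    - requirements: docs/requirements.txt
```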

Choosing a Docstring Format: NumPy vs Google

Docstrings are the foundation of API documentation. Three formats dominate Python:

Format | Characteristics                                                       | Typically preferred by
reST   | Original Sphinx format; uses :param name: description syntax         | Legacy projects
Google | Clean, minimal markup; sections with simple headers                   | Modern general-purpose Python
NumPy  | Structured sections with underlined headers; handles complex signatures well | Scientific Python

The NumPy style is most common in scientific packages because its structured format handles multiple parameters, returns, and complex type annotations clearly. The Scientific Python development guide recommends NumPy-style for its clarity.

Example: NumPy-style docstring

def solve_poisson(potential, conductivity, tolerance=1e-6):
    """
    Solve the Poisson equation ∇·(σ∇φ) = 0 using finite volumes.

    Parameters
    ----------
    potential : ndarray
        Initial guess for potential field (will be overwritten).
    conductivity : ndarray
        Conductivity array on cell centers.
    tolerance : float, optional
        Convergence criterion for residual (default: 1e-6).

    Returns
    -------
    residual : float
        Final residual after convergence.

    Notes
    -----
    Uses a conjugate gradient solver with Jacobi preconditioner.
    Boundary conditions must be applied before calling.

    Examples
    --------
    >>> import numpy as np
    >>> phi = np.zeros((64, 64))
    >>> sigma = np.ones((64, 64))
    >>> residual = solve_poisson(phi, sigma)
    """

The napoleon Sphinx extension parses both Google and NumPy styles, so choose based on your team’s preference.
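For comparison, here is the same signature documented in Google style — a re-sketch of the example above, not a different function:

```python
def solve_poisson(potential, conductivity, tolerance=1e-6):
    """Solve the Poisson equation div(sigma grad phi) = 0 using finite volumes.

    Args:
        potential (ndarray): Initial guess for potential field (overwritten).
        conductivity (ndarray): Conductivity array on cell centers.
        tolerance (float, optional): Convergence criterion for the residual.
            Defaults to 1e-6.

    Returns:
        float: Final residual after convergence.
    """
```

Both render identically in Sphinx once napoleon is enabled; the difference is purely one of source readability.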

Writing Effective Docstrings

Effective docstrings follow consistent conventions and provide complete information. The pyOpenSci documentation guide outlines essential sections:

Required Sections

  • Summary line – One sentence describing what the function does.
  • Parameters – Name, type, and description for each argument.
  • Returns – Type and description of return value(s).
  • Raises – Exceptions that may be raised and the conditions that trigger them.

Optional but Valuable Sections

  • Examples – Concrete usage snippets; these can be tested with doctest.
  • Notes – Implementation details, algorithm references, performance characteristics.
  • References – Citations to papers or external documentation.
  • See Also – Links to related functions or classes.
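As a sketch of how the Raises and See Also sections slot into a NumPy-style docstring (load_mesh and save_mesh here are hypothetical names, not part of any real API):

```python
def load_mesh(path):
    """Load a mesh definition from disk.

    Parameters
    ----------
    path : str
        Path to the mesh file.

    Returns
    -------
    dict
        Mesh data keyed by field name.

    Raises
    ------
    FileNotFoundError
        If `path` does not exist.
    ValueError
        If the file is not a recognized mesh format.

    See Also
    --------
    save_mesh : Write a mesh back to disk.
    """
```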

The Power of Examples

Examples serve dual purposes:

  1. They show users how to apply your code.
  2. They become executable tests via doctest.

When examples are written as interactive Python sessions, both users and automated tools can verify they work correctly. This guards against documentation rot.

Testing Documentation with Doctest

Doctest is a Python module that verifies code examples in docstrings actually run and produce the expected output. This creates living documentation that cannot silently become incorrect.

How it works: You write an example as if entered at a Python prompt:

>>> from mypackage import compute_diffusion
>>> result = compute_diffusion(concentration=1.0, D=0.01)
>>> round(result, 4)
0.1234

Running pytest --doctest-modules or python -m doctest -v your_module.py executes these examples and fails if the output differs.

For scientific packages, doctest is particularly valuable because:

  • Numerical code can easily produce wrong results without raising errors; doctest catches silent inaccuracies.
  • Examples demonstrate proper usage patterns (units, boundary conditions, etc.).
  • They serve as minimal regression tests for core functionality.

The pytest-doctestplus plugin from the Scientific Python ecosystem provides enhanced doctest features, such as approximate floating-point comparison, that are especially useful for numerical output.
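To see the full loop in one self-contained sketch (the function and values are invented for illustration): the docstring example doubles as a test that doctest executes.

```python
import doctest


def diffusion_length(D, t):
    """Return the characteristic diffusion length sqrt(D * t).

    Examples
    --------
    >>> diffusion_length(4.0, 9.0)
    6.0
    """
    return (D * t) ** 0.5


# Execute every docstring example in this module; a mismatch between
# the expected and actual output is reported as a failure.
results = doctest.testmod()
```

If someone later changes `diffusion_length` so that the example no longer produces 6.0, the test suite fails — the documentation cannot silently rot.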

The README: Your Project’s Front Door

The README is often the first—and sometimes only—documentation users encounter. A well-crafted README should appear at the root of your repository and on PyPI.

Essential README sections:

  1. Project description – 1-3 sentences explaining what the package does and its domain.
  2. Installation instructions – How to install, including dependencies and platform requirements.
  3. Quick example – Minimal code snippet showing a typical use case.
  4. Links to full documentation – Direct users to comprehensive docs hosted elsewhere.
  5. Citation information – How to cite the software in academic work.
  6. License – Clearly state the license (e.g., MIT, BSD, GPL).
  7. Badges – Build status, coverage, PyPI version, etc.

The pyOpenSci README guide provides detailed recommendations.

Pro tip: Write your README before writing any code. This clarifies your project’s goals and audience.

Maintaining a CHANGELOG

A CHANGELOG is a chronological list of notable changes for each version. It answers “what changed between version X and Y?” for both users and developers.

Best practices:

  • Follow Keep a Changelog conventions.
  • Use Semantic Versioning to communicate compatibility.
  • Group changes by type: Added, Changed, Deprecated, Removed, Fixed, Security.
  • Write for humans: explain why a change matters, not just that it happened.
  • Date each released version; keep pending entries under an [Unreleased] heading.
  • Never automate from git commit messages alone—curate the entries.

Example format:

## [Unreleased]
### Added
- New `adaptive_mesh` module for dynamic refinement.
- Support for HDF5 output with compression.

### Changed
- `solve()` now returns residual history (breaking change).

### Fixed
- Memory leak in sparse matrix assembly (#123).

A good changelog builds trust by showing active maintenance and transparency about breaking changes.

Common Documentation Pitfalls (And How to Avoid Them)

Based on the literature and community experience, here are frequent mistakes:

1. Outdated Documentation

Documentation that contradicts actual behavior is worse than no documentation. Solution: Integrate documentation updates into code reviews. If a PR changes functionality, the corresponding docs must be updated in the same commit.

2. Missing Examples

Abstract descriptions without concrete usage examples leave users guessing. Solution: Every public function and class should include at least one runnable example.

3. Explaining “What” but Not “Why”

Documentation often describes mechanics but omits the reasoning. Users need to understand the context to make correct decisions. Solution: Include sections explaining when to use a function, trade-offs, and alternatives.

4. Audience Mismatch

Writing for experts when beginners are the primary audience (or vice versa). Solution: Structure your docs using the Diátaxis framework to serve different needs separately.

5. Inconsistent Style

Mixed docstring formats, varying heading levels, and ad hoc organization. Solution: Adopt a style guide and enforce it with linters (markdownlint, doc8).

6. No Testing

Untested examples eventually break. Solution: Use doctest or pytest-doctestplus to verify all examples work.

7. Neglecting the README

Assuming users will read extensive guides before trying the package. Solution: Make the README compelling and actionable; include a quick-start section.

Documentation Workflow Integration

Documentation should flow naturally with your development process:

Pre-commit Hooks

Use pre-commit hooks to lint Markdown and check for common issues before allowing commits:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/markdownlint/markdownlint
    rev: v0.11.0
    hooks:
      - id: markdownlint
  - repo: https://github.com/tcort/markdown-link-check
    rev: v3.11.2
    hooks:
      - id: markdown-link-check

CI/CD Pipelines

Configure GitHub Actions to:

  • Build documentation on every push to main
  • Trigger Read the Docs builds automatically via its webhook integration
  • Run doctest as part of the test suite
  • Check for broken links in the built HTML

Example workflow:

name: Documentation
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Build docs
        run: |
          pip install -e ".[docs]"
          sphinx-build -b html docs/ docs/_build/html

Code Reviews

Make documentation review a checklist item:

  • New/changed functions have docstrings
  • Examples are included and tested
  • README is updated if user-facing changes occurred
  • CHANGELOG entry added for version bump

Making Your Documentation Citable

Scientific software should be citable as a research artifact. Include:

  • CITATION.cff – A standard CITATION.cff file in the repository root with citation metadata (authors, title, version, DOI).
  • Zenodo integration – Connect your GitHub repository to Zenodo to automatically assign DOIs for each release.
  • Software citation instructions – Add a “Citation” section to your README and documentation showing BibTeX entries.
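A minimal CITATION.cff sketch (names, version, date, and DOI below are placeholders to replace with your own metadata):

```yaml
# CITATION.cff
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "mypackage"
version: "1.0.0"
doi: "10.5281/zenodo.0000000"
date-released: "2024-01-01"
authors:
  - family-names: "Doe"
    given-names: "Jane"
```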

This ensures your work receives academic credit and meets reproducibility requirements from journals and funding agencies.


Conclusion and Next Steps

Documentation is not a secondary task—it is the vehicle through which your scientific Python package achieves impact. By adopting Documentation-as-Code, using the right toolchain (Sphinx + Read the Docs), following structured frameworks like Diátaxis, and integrating documentation into your development workflow, you create software that is truly reusable and reproducible.

Action items to implement today:

  1. Ensure every public function and class has a docstring in NumPy or Google style.
  2. Set up a docs/ directory with Sphinx configuration.
  3. Connect your repository to Read the Docs for automated builds.
  4. Add doctest to your CI pipeline to verify examples.
  5. Write or improve your README with a clear description and quick example.
  6. Start a CHANGELOG if you don’t have one.

Treat documentation as an investment: the time you spend writing clear docs will pay dividends in reduced support burden, broader adoption, and long-term maintainability of your scientific software.

