Extending the Analysis Framework

This guide shows you how to add a new analysis type to PolyzyMD. By the end, you will have a working analysis plugin that can compute metrics per replicate, aggregate across replicates, compare conditions, generate figures, and display results on the CLI.

Note

This guide covers the plugin system in analyses/. Each plugin is a self-contained package — you implement the science (compute, aggregate) and the framework handles discovery, iteration, result saving, comparison statistics, and CLI wiring.

Prerequisites

A working pixi environment: pixi install -e build
Basic familiarity with Pydantic v2 BaseModel
Understanding of the PolyzyMD chain convention (A=protein, B=substrate, C=polymer)

How the Plugin System Works

Every analysis in PolyzyMD is a single class that subclasses Analysis and lives in a package under src/polyzymd/analyses/<name>/. The framework discovers plugins automatically — no registry edits, no imports, no boilerplate. You drop a package, and the CLI can run it.

The lifecycle for each analysis is:

For each condition:
  For each replicate:
    result = compute_replicate(ctx, replicate)       # your code
  aggregated = aggregate(ctx, [replicate_results])   # your code

filter_conditions(conditions)  →  filtered list
compare(ctx)                   →  ComparisonResult
plot(ctx)                      →  [figure paths]
format(result, "text")         →  CLI output string

You implement the science (compute, aggregate). The framework provides discovery, replicate iteration, result saving, comparison statistics, and CLI wiring.

Anatomy of a Plugin

Before seeing the full code, here is what each piece does and why it exists.

`name` (required)

A unique, lowercase string identifier. Used in CLI commands (polyzymd compare run solvent_contacts), config files, and output directories.

`Settings` (required)

A Pydantic v2 BaseModel inner class. Users configure your analysis through the plugins: section of comparison.yaml, and the framework deserializes those settings into your Settings model. Provide sensible defaults for every field so the analysis works out of the box.

`compute_replicate(ctx, replicate)` (required)

Runs once per replicate per condition. The framework passes a ReplicateContext containing everything you need: the loaded SimulationConfig, your Settings, output paths, and equilibration time. Use TrajectoryLoader from analyses/shared/ to load trajectories — it handles topology discovery, trajectory segment daisy-chaining, timestep detection, and equilibration skipping.

Return a dict or Pydantic model. The framework collects these for aggregate().

`aggregate(ctx, results)` (required)

Runs once per condition after all replicates complete. Receives the list of objects your compute_replicate() returned. Compute summary statistics (mean, SEM) and return the aggregated result. The framework auto-saves this to ctx.result_path.

`extract_metrics(summary)` (optional, but recommended)

Exposes scalar metrics for the default comparison path. Return a dict of MetricValue objects, and the framework automatically runs t-tests, computes Cohen’s d, runs ANOVA (≥3 conditions), and ranks conditions. This is the simplest way to get a fully statistical comparison — no need to implement compare().

`compare(ctx)` (optional override)

Override only for complex multi-metric or entry-table comparisons. If you implement extract_metrics(), you typically do not need to override compare().

`plot(ctx)` (optional)

Generate matplotlib figures. Use self._build_plot_data(ctx) to collect per-condition paths, then self._load_aggregated_result(agg_dir) to load each condition’s data. Use shared plotting helpers (get_theme, apply_axis_style, save_figure, get_colors) for consistent theming. Return a list of figure paths. The scaffold generates a working plot() method automatically.

`format(result, output_format)` (optional)

Custom CLI display. Use format_scalar_comparison() from analyses.stats for formatted tables with rankings, effect sizes, and significance stars.

`filter_conditions(conditions, settings)` (optional)

Exclude conditions where the analysis cannot run at all. The framework calls this before dispatching any compute_replicate() or aggregate() jobs, so excluded conditions have zero work done.

When to use it: Your analysis requires a molecular component that may not be present in all conditions. The canonical case is polymer-specific analyses — if the comparison includes a “No Polymer (Control)” condition, analyses like contacts, exposure, polymer_affinity, and polymer_bridging must skip it because there are no polymer atoms to analyse.

Canonical pattern (check sim_config.polymers, fail-open on errors):

from polyzymd.analyses.base import Condition


def filter_conditions(
    self,
    conditions: list[Condition],
    settings: "BaseModel | None" = None,
) -> list[Condition]:
    """Exclude conditions without polymer."""
    valid: list[Condition] = []
    for cond in conditions:
        try:
            sim_config = cond.sim_config
            polymers = getattr(sim_config, "polymers", None)
            has_polymer = polymers is not None and getattr(polymers, "enabled", True)
            if not has_polymer:
                logger.info("Excluding '%s': no polymer configured", cond.label)
                continue
            valid.append(cond)
        except (AttributeError, ValueError, KeyError, OSError) as e:
            logger.warning("Error checking '%s': %s — including anyway", cond.label, e)
            valid.append(cond)  # fail-open
    return valid

Key rules:

Always fail-open on unexpected errors — include the condition rather than crashing the entire comparison.
Log excluded conditions at INFO level so users can see what was skipped.
The settings parameter carries the resolved plugin Settings instance. Use it if your exclusion logic depends on user-configurable selections (e.g., a custom polymer selection string).

Existing implementations: contacts, exposure, polymer_affinity, and polymer_bridging all implement filter_conditions(). binding_free_energy has a no-op override (intentionally keeps all conditions).

Tip

If only some sub-computations within your analysis are inapplicable — not the entire analysis — use graceful empty-selection handling inside compute_replicate() instead. See the next section.

Handling partially-applicable conditions in `compute_replicate()`

Sometimes an analysis is valid for a condition overall, but specific sub-computations within it are not. In that case, don’t exclude the entire condition — instead, handle empty AtomGroup selections gracefully inside compute_replicate().

Example: The hydrogen_bonds plugin computes H-bonds between multiple group pairs (protein–protein, protein–polymer, protein–substrate). For a “No Polymer” control, protein–protein H-bonds are still meaningful, but protein–polymer H-bonds are not because there are no polymer atoms. Rather than excluding the condition entirely (which would lose the protein–protein data), the plugin:

Detects that the “polymer” group selection matched zero atoms.
Logs a warning: "Group 'polymer' selection 'chainid C' matched no atoms — will skip summaries using it".
Produces zero-filled placeholder results for summaries involving that empty group (e.g., 0 H-bonds, 0.0 occupancy). This is scientifically correct — zero protein–polymer H-bonds is the ground truth when no polymer is present.
Computes normally for summaries that don’t involve the empty group (protein–protein H-bonds still run).

# Pattern: graceful empty-selection handling (from hydrogen_bonds)
for group_name, selection_str in settings.groups.items():
    atom_group = u.select_atoms(selection_str)
    if len(atom_group) == 0:
        if settings.allow_empty_groups:
            logger.warning(
                "Group '%s' selection '%s' matched no atoms — will skip summaries using it",
                group_name,
                selection_str,
            )
        else:
            raise ValueError(
                f"Group '{group_name}' selection '{selection_str}' matched no atoms."
            )

When to use which pattern:

Situation	Pattern	Example
Entire analysis is meaningless for a condition	`filter_conditions()`	Polymer–protein contacts with no polymer
Some sub-computations are invalid, others are valid	Graceful empty-selection handling	H-bonds: protein–protein valid, protein–polymer not
Analysis works for all conditions	Neither (use defaults)	RMSF, RMSD, secondary structure

The Complete Example

Here is a single, complete plugin file that uses real framework utilities. This is the pattern you should follow for new analyses.

Important

The SolventContactsAnalysis implementation below is a walkthrough example for a hypothetical solvent_contacts plugin name that does not collide with built-in plugins.

Create src/polyzymd/analyses/solvent_contacts/ as a package in your own branch or downstream project, or use the scaffold command to generate the boilerplate automatically (the scaffold includes compute, aggregate, comparison, plotting, and tests):

# Default: dict-based results (simplest starting point)
polyzymd new-analysis solvent_contacts

# Typed Pydantic result models (better for complex analyses)
polyzymd new-analysis solvent_contacts --style pydantic

The --style flag controls how results are structured:

dict (default) — Uses plain dicts for replicate/aggregated results. No result model classes needed. Best for getting started quickly.
pydantic — Generates typed ReplicateResult and AggregatedResult Pydantic models with AggregatedResultClass. Better when you need validation, IDE autocomplete, or NPZ sidecar storage.

If creating manually:

"""Solvent contacts analysis plugin."""
from __future__ import annotations

import logging
from pathlib import Path
from typing import Any, ClassVar, Sequence

import numpy as np
from pydantic import BaseModel, Field

from polyzymd.analyses.base import (
    AggregateContext,
    Analysis,
    MetricValue,
    PlotContext,
    ReplicateContext,
)
from polyzymd.analyses.shared import TrajectoryLoader, convert_time, parse_time_string
from polyzymd.analyses.stats import format_scalar_comparison

logger = logging.getLogger(__name__)


class SolventContactsAnalysis(Analysis):
    """Solvent contacts around a selected protein region."""

    name: ClassVar[str] = "solvent_contacts"

    class Settings(BaseModel):
        """Settings for solvent contact-fraction analysis.

        Attributes
        ----------
        selection : str
            MDAnalysis selection string for target atoms
        solvent_selection : str
            MDAnalysis selection for solvent atoms used in contacts
        cutoff : float
            Contact cutoff distance in Angstrom
        """

        selection: str = Field(
            default="protein and name CA",
            description="MDAnalysis atom selection for target atoms",
        )
        solvent_selection: str = Field(
            default="resname HOH and name O",
            description="MDAnalysis atom selection for solvent contact atoms",
        )
        cutoff: float = Field(
            default=4.5,
            gt=0,
            description="Distance cutoff in Angstrom for a contact",
        )

    def compute_replicate(self, ctx: ReplicateContext, replicate: int) -> dict[str, Any]:
        """Compute solvent contact fraction for one replicate.

        Uses TrajectoryLoader for topology discovery, trajectory segment
        daisy-chaining, and equilibration frame skipping
        """
        # Lazy-import heavy third-party dependency
        import MDAnalysis as mda  # noqa: F401
        from MDAnalysis.analysis.distances import distance_array

        loader = TrajectoryLoader(ctx.sim_config)
        u = loader.load_universe(replicate)

        eq_value, eq_unit = parse_time_string(ctx.equilibration)
        timestep_ps = loader.get_timestep(replicate, unit="ps")
        eq_time_ps = convert_time(eq_value, eq_unit, "ps")
        start_frame = int(eq_time_ps / timestep_ps)

        target_atoms = u.select_atoms(ctx.settings.selection)
        solvent_atoms = u.select_atoms(ctx.settings.solvent_selection)
        contact_fractions = []
        for _ts in u.trajectory[start_frame:]:
            if len(target_atoms) == 0 or len(solvent_atoms) == 0:
                contact_fractions.append(0.0)
                continue
            dists = distance_array(target_atoms.positions, solvent_atoms.positions)
            contact_pairs = np.sum(dists <= ctx.settings.cutoff)
            normalizer = len(target_atoms) * len(solvent_atoms)
            contact_fractions.append(float(contact_pairs / normalizer))

        return {
            "mean_contact_fraction": float(np.mean(contact_fractions)),
            "std_contact_fraction": float(np.std(contact_fractions)),
            "n_frames": len(contact_fractions),
            "replicate": replicate,
        }

    def aggregate(self, ctx: AggregateContext, results: Sequence[Any]) -> dict[str, Any]:
        """Average contact fraction across replicates with SEM."""
        values = [r["mean_contact_fraction"] for r in results]
        return {
            "mean_contact_fraction": float(np.mean(values)),
            "sem_contact_fraction": float(np.std(values, ddof=1) / np.sqrt(len(values))),
            "replicate_values": values,
        }

    def extract_metrics(self, summary: Any) -> dict[str, MetricValue]:
        """Expose contact fraction as scalar metric for default comparison."""
        return {
            "mean_contact_fraction": MetricValue(
                name="mean_contact_fraction",
                mean=summary["mean_contact_fraction"],
                sem=summary["sem_contact_fraction"],
                replicate_values=summary["replicate_values"],
                higher_is_better=False,
                direction_labels=("more exposed", "unchanged", "more shielded"),
            )
        }

    def plot(self, ctx: PlotContext) -> list[Path]:
        """Generate a contact-fraction bar chart comparing conditions."""
        import matplotlib.pyplot as plt

        from polyzymd.analyses.shared import apply_axis_style, get_colors, get_theme, save_figure

        # Use the base class helper to gather per-condition paths
        data, labels = self._build_plot_data(ctx)
        if not labels:
            return []

        # Load aggregated results for each condition
        plot_labels = []
        means = []
        sems = []
        for label in labels:
            if label not in data or label == "__meta__":
                continue
            agg_dir = data[label]["aggregated_dir"]
            summary = self._load_aggregated_result(agg_dir)
            if summary is None:
                continue
            plot_labels.append(label)
            means.append(summary["mean_contact_fraction"])
            sems.append(summary["sem_contact_fraction"])

        if not plot_labels:
            return []

        # Use shared plotting utilities for consistent theming
        plot_settings = ctx.plot_settings
        theme = get_theme(plot_settings)
        colors = get_colors(len(plot_labels), plot_settings)

        fig, ax = plt.subplots(figsize=(8, 5))
        ax.bar(plot_labels, means, yerr=sems, capsize=5, color=colors[: len(plot_labels)])
        apply_axis_style(
            ax,
            plot_settings,
            title="Solvent Contact Fraction Across Conditions",
            ylabel="Contact fraction",
        )

        output_path = ctx.output_dir / "solvent_contacts_comparison.png"
        output_path.parent.mkdir(parents=True, exist_ok=True)
        save_figure(fig, output_path, plot_settings)
        plt.close(fig)
        return [output_path]

    def format(self, result: Any, output_format: str = "text") -> str:
        """Format comparison result for CLI display."""
        from polyzymd.analyses.base import ComparisonResult

        if isinstance(result, ComparisonResult):
            return format_scalar_comparison(
                result,
                title="Solvent Contacts Comparison",
                metric_label="Mean contact fraction",
                metric_unit="",
                metric_key="mean_contact_fraction",
                output_format=output_format,
                higher_is_better=False,
            )
        return super().format(result, output_format)

That file is now discoverable, runnable through compare workflows, and compatible with default framework formatting/statistics.

Base Class Helpers

The Analysis base class provides three utility methods that reduce boilerplate across the lifecycle. The SolventContactsAnalysis example above uses _build_plot_data() in its plot() method, but all three are available to every plugin.

`_build_plot_data(ctx, *, include_replicates=False)`

Collects per-condition paths from a PlotContext into a dict ready for plotter functions. Returns (data, labels) where data[label] contains "analysis_dir" and "aggregated_dir" keys:

def plot(self, ctx: PlotContext) -> list[Path]:
    # One line replaces the 8-12 line loop through ctx.conditions
    data, labels = self._build_plot_data(ctx, include_replicates=True)
    if not labels:
        return []

    for label in labels:
        agg_dir = data[label]["aggregated_dir"]
        summary = self._load_aggregated_result(agg_dir)
        # ... plot summary data ...

Manual alternative

Without _build_plot_data(), you write the loop yourself — which is fine for understanding the data flow:

def plot(self, ctx: PlotContext) -> list[Path]:
    for cond in ctx.conditions:
        agg_dir = ctx.analysis_dirs[cond.label] / "aggregated"
        summary = self._load_aggregated_result(agg_dir)
        if summary is None:
            continue
        # ... plot summary data ...

`_check_cache(result_cls, cache_path, *, recompute, sim_config=None)`

Consolidates the cache-checking pattern used in compute_replicate(). Loads a Pydantic result from disk if the cache file exists and recompute is False, optionally validating the config hash:

def compute_replicate(self, ctx: ReplicateContext, replicate: int) -> dict:
    result_file = ctx.output_dir / "solvent_contacts_result.json"

    # One line replaces the 5-line cache check pattern
    cached = self._check_cache(
        SolventContactsReplicateResult, result_file,
        recompute=ctx.recompute, sim_config=ctx.sim_config,
    )
    if cached is not None:
        return cached

    # ... expensive computation ...

Note

_check_cache() requires a Pydantic model class with a .load(path) method (i.e., a BaseAnalysisResult subclass). Dict-based plugins that want caching should use explicit json.loads() / Path.exists() checks instead.

`_format_replicate_range(replicates)`

Formats replicate tuples into compact strings for log messages and filenames. Contiguous ranges are collapsed:

cls._format_replicate_range((1, 2, 3))    # → "reps1-3"
cls._format_replicate_range((1, 3, 5))    # → "reps1_3_5"

Useful in aggregate() for logging or naming output files:

def aggregate(self, ctx: AggregateContext, results):
    rep_str = self._format_replicate_range(ctx.replicates)
    logger.info("Aggregating %s across %s", self.name, rep_str)
    # ...

Summary of helpers

Helper	Used in	Purpose
`_build_plot_data()`	`plot()`	Collect per-condition paths from `PlotContext`
`_check_cache()`	`compute_replicate()`	Load cached Pydantic result if valid
`_format_replicate_range()`	`aggregate()`, logging	Compact replicate string for messages/filenames

Import Rules

The import structure in the example above is intentional. Getting imports wrong causes subtle test failures, so follow these rules:

Module-level imports (top of file):

Standard library (logging, pathlib)
NumPy — imported at module level because it is available in project environments
Framework utilities from analyses/shared/ (TrajectoryLoader, parse_time_string, convert_time, AlignmentConfig, etc.)
Framework base classes from analyses/base and analyses/stats

Lazy imports inside methods (inside compute_replicate, plot, etc.):

MDAnalysis / mdtraj — heavy simulation libraries
matplotlib / seaborn — plotting libraries
Any package that may not be installed in all environments

# At module level — always available, needed for patch targets in tests
from polyzymd.analyses.shared import TrajectoryLoader, convert_time, parse_time_string


# Inside methods — heavy deps that may not be installed
def compute_replicate(self, ctx, replicate):
    import MDAnalysis as mda
    ...


def plot(self, ctx):
    import matplotlib.pyplot as plt
    ...

Why this matters for testing: when you mock TrajectoryLoader in tests, you write @patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader"). This replaces the module-level name. If TrajectoryLoader were imported lazily inside the method, the patch target would be different and easier to get wrong.

Context Objects Reference

The framework provides context objects so plugins never need to load configs or discover paths themselves:

Context	Passed To	Key Attributes
`ReplicateContext`	`compute_replicate()`	`.sim_config`, `.settings`, `.output_dir`, `.replicate`, `.recompute`, `.equilibration`, `.result_path`
`AggregateContext`	`aggregate()`	`.condition`, `.replicates`, `.output_dir`, `.settings`, `.equilibration`, `.result_path`
`ComparisonContext`	`compare()`	`.name`, `.conditions`, `.analysis_dirs`, `.results_dir`, `.effective_control`, `.control_label`, `.settings`, `.equilibration`, `.recompute`, `.fdr_alpha`, `.result_path`, `.aggregated_results`
`PlotContext`	`plot()`	`.conditions`, `.analysis_dirs`, `.output_dir`, `.results_dir`, `.settings`, `.plot_settings`, `.comparison_path`, `.control_label`

`PlotContext.plot_settings`

PlotContext.plot_settings carries global plotting configuration from polyzymd.config.comparison.PlotSettings. Use this object instead of hard-coding figure style, DPI, or output behavior.

Type and global fields

ctx.plot_settings is always an instance of PlotSettings (the orchestrator guarantees this — it is never None).

Global fields available on PlotSettings include:

output_dir
format
dpi
style
color_palette
theme

Theme system

PlotSettings.theme uses a PlotTheme model that includes 28 visual style fields (for example: title_fontsize, label_fontsize, tick_fontsize, dot_size, line widths, transparency, and grid controls). The theme system has three presets:

PlotTheme.publication()
PlotTheme.presentation()
PlotTheme.minimal()

Per-analysis settings

Per-analysis settings are accessed as attributes on PlotSettings, for example:

plot_settings.contacts
plot_settings.rmsf

These fields use a __getattr__ fallback pattern, so access is stable even when a specific analysis block is not configured; defaults are returned.

Standard plugin pattern

The orchestrator guarantees that ctx.plot_settings is always a valid PlotSettings instance — never None. Use it directly:

plot_settings = ctx.plot_settings

Example using shared plotting helpers

from pathlib import Path

from polyzymd.analyses.base import PlotContext
from polyzymd.analyses.shared import (
    apply_axis_style,
    get_colors,
    get_theme,
    save_figure,
)


def plot(self, ctx: PlotContext) -> list[Path]:
    import matplotlib.pyplot as plt

    plot_settings = ctx.plot_settings
    theme = get_theme(plot_settings)
    colors = get_colors(len(ctx.conditions), plot_settings)

    fig, ax = plt.subplots(figsize=(8, 5))
    _ = theme
    _ = colors
    # ... plotting code ...
    apply_axis_style(ax, plot_settings, title="My Plot", ylabel="Value (Å)")
    out = ctx.output_dir / "my_plot.png"
    save_figure(fig, out, plot_settings)
    plt.close(fig)
    return [out]

Tip

Prefer shared plotting helpers over custom style code in each plugin. This keeps visual output consistent across analyses and makes global theming work.

Choosing Your Comparison Path

Path A: Default Scalar Comparison (recommended)

Implement extract_metrics() and you get t-tests, Cohen’s d, ANOVA, and ranking for free.

The framework loads aggregated results from disk automatically using:

json.loads() for dict-based plugins
your AggregatedResultClass.load(path) when AggregatedResultClass is set

You generally do not need to override _deserialize_result().

Path B: Custom Comparison

Override compare() for multi-metric or entry-table analyses:

from typing import Any

from polyzymd.analyses.base import ComparisonContext


def compare(self, ctx: ComparisonContext) -> Any:
    # Load results, compute custom statistics, return your result model
    # The returned object must support save(path)
    ...

See src/polyzymd/analyses/contacts/ or src/polyzymd/analyses/distances/ for full examples.

Loading Results in `plot()`

PlotContext provides paths and conditions but does not carry pre-loaded aggregated results. The recommended approach is to use _build_plot_data() to collect per-condition paths, then _load_aggregated_result() to load each condition’s data:

from pathlib import Path

from polyzymd.analyses.base import PlotContext


def plot(self, ctx: PlotContext) -> list[Path]:
    import matplotlib.pyplot as plt

    data, labels = self._build_plot_data(ctx)
    for label in labels:
        if label not in data or label == "__meta__":
            continue
        agg_dir = data[label]["aggregated_dir"]
        summary = self._load_aggregated_result(agg_dir)
        if summary is not None:
            # summary is a dict or AggregatedResultClass instance
            # ... plot data from summary ...
            pass

    return []

You can also loop through ctx.conditions directly for fine-grained control:

def plot(self, ctx: PlotContext) -> list[Path]:
    import matplotlib.pyplot as plt

    for cond in ctx.conditions:
        agg_dir = ctx.analysis_dirs[cond.label] / "aggregated"
        summary = self._load_aggregated_result(agg_dir)
        if summary is not None:
            # ... plot data from summary ...
            pass

    return []

_load_aggregated_result() looks for result.json in the aggregated directory, falling back to the most recent *.json file. It deserializes using AggregatedResultClass if set, otherwise json.loads(). Returns None if no result file is found.

Warning

Do not bypass _load_aggregated_result() with custom file parsing unless your plugin requires a non-standard storage format. The helper already handles both dict and model-based deserialization behavior.

Formatting with `format_scalar_comparison()`

For analyses using the default comparison path, format_scalar_comparison() from analyses.stats produces formatted tables with rankings, effect sizes, and significance stars:

from typing import Any

from polyzymd.analyses.base import ComparisonResult
from polyzymd.analyses.stats import format_scalar_comparison


def format(self, result: Any, output_format: str = "text") -> str:
    if isinstance(result, ComparisonResult):
        return format_scalar_comparison(
            result,
            title="My Analysis Comparison",
            metric_label="My Metric",
            metric_unit="Å",
            metric_key="my_metric",
            output_format=output_format,
            higher_is_better=True,
        )
    return super().format(result, output_format)

Parameters:

Parameter	Type	Description
`result`	`ComparisonResult`	The result to format
`title`	`str`	Display title (for example, `"RMSF Comparison"`)
`metric_label`	`str`	Human-readable metric name for table headers
`metric_unit`	`str`	Unit string appended to values (for example, `"Å"`)
`metric_key`	`str \| None`	Key prefix in `ConditionSummary` extra fields; auto-detected if `None`
`output_format`	`str`	`"text"`, `"markdown"`, or `"json"`
`higher_is_better`	`bool`	Affects interpretation wording

Return Types: Dicts vs Pydantic Models

Plugins can return either plain dicts or typed Pydantic models from compute_replicate() and aggregate(). Both paths are fully supported.

For new plugins, start with dicts. They require no extra imports, no extra classes, and they work with the default comparison pipeline out of the box. You can always migrate to Pydantic models later if your plugin grows in complexity.

Dict path (recommended starting point)

Dict plugins are the simplest way to get a working analysis:

compute_replicate() returns dict
aggregate() returns dict
No AggregatedResultClass class variable needed
Framework saves/loads JSON using standard json.dumps() / json.loads()

class MyAnalysis(Analysis):
    name = "my_analysis"

    class Settings(BaseModel):
        selection: str = "protein and name CA"

    def compute_replicate(self, ctx, replicate):
        return {"value": 1.23, "replicate": replicate}

    def aggregate(self, ctx, results):
        vals = [r["value"] for r in results]
        return {"mean_value": sum(vals) / len(vals), "replicate_values": vals}

    def extract_metrics(self, summary):
        return {
            "value": MetricValue(
                name="value",
                mean=summary["mean_value"],
                sem=0.1,
                replicate_values=summary["replicate_values"],
            )
        }

This is the path used in the complete SolventContactsAnalysis example earlier in this tutorial. The framework handles serialization, deserialization, and the full comparison lifecycle without any Pydantic result models.

What is Pydantic, and when would you use it?

Pydantic is a Python data validation library. A Pydantic BaseModel subclass is like a dataclass with automatic type checking: if you declare a field as float, Pydantic raises a validation error if someone passes a string. PolyzyMD already uses Pydantic for its config system (Settings classes), so it is always available in the environment.

For result models, Pydantic adds:

Validation on construction — catches type errors early (e.g., passing a string where a float is expected)
Typed attribute access — result.mean_rg instead of result["mean_rg"], with IDE autocomplete
Structured extension path — start with scaffold-generated BaseModel classes, then upgrade to BaseAnalysisResult if you need helper I/O methods or NPZ sidecar storage
Nested structure — complex result hierarchies with validated sub-models

Pydantic model path

Use models when you need validation, nested structure, strict typing, or more explicit result contracts.

Key pieces:

The scaffold --style pydantic path generates result models inheriting from Pydantic BaseModel
Plugin class sets AggregatedResultClass to the aggregated model class
Framework deserializes aggregated files through AggregatedResultClass.model_validate_json(...) (or .load(path) when the class provides it)
BaseModel classes are the simple default and do not include .save() / .load() helpers
If you need NPZ sidecars or model-level I/O helpers, manually upgrade your result models to polyzymd.analyses._results_base.BaseAnalysisResult

from pathlib import Path
from typing import ClassVar

from pydantic import BaseModel

from polyzymd.analyses.base import Analysis


class ReplicateResult(BaseModel):
    mean_rg: float
    replicate: int


class AggregatedResult(BaseModel):
    mean_rg: float
    sem_rg: float
    replicate_values: list[float]


class SolventContactsAnalysis(Analysis):
    name: ClassVar[str] = "solvent_contacts"
    AggregatedResultClass = AggregatedResult

    class Settings(BaseModel):
        selection: str = "protein and name CA"

    def compute_replicate(self, ctx, replicate) -> ReplicateResult:
        return ReplicateResult(mean_rg=15.0, replicate=replicate)

    def aggregate(self, ctx, results) -> AggregatedResult:
        values = [r.mean_rg for r in results]
        return AggregatedResult(
            mean_rg=sum(values) / len(values),
            sem_rg=0.3,
            replicate_values=values,
        )

If you later need NPZ sidecar storage or .save() / .load() helper methods, replace BaseModel with BaseAnalysisResult in those result classes.

Important

Gotcha: AggregatedResultClass requires model returns

When you set AggregatedResultClass, your compute_replicate() and aggregate() should return instances of the corresponding Pydantic result models. Do not return plain dicts in this mode.

Why: _load_aggregated_result() / _deserialize_result() uses the declared class for deserialization. If on-disk JSON represents dict-shaped data that does not match the model contract expected by AggregatedResultClass.load(path), deserialization fails.

Comparison: dicts vs Pydantic models

Aspect	Dicts	Pydantic models
Setup effort	None — just return `{}`	Define result classes (scaffold uses `BaseModel` by default)
Validation	None — typos in keys are silent	Automatic type checking on construction
Attribute access	`result["mean_rg"]`	`result.mean_rg` with IDE autocomplete
Large array support	Manual — save NumPy arrays yourself	BaseModel: manual; BaseAnalysisResult: NPZ sidecar helpers
Nested structure	Manual — dicts of dicts	Validated sub-models with typed fields
Serialization	`json.dumps()` / `json.loads()`	BaseModel JSON by framework; optional `BaseAnalysisResult.save()` / `.load()`
Framework wiring	No `AggregatedResultClass` needed	Set `AggregatedResultClass` on plugin class
Best for	Scalar analyses, quick prototypes, new contributors	Complex results, per-residue arrays, production plugins

Rule of thumb: Start with dicts. If you find yourself writing defensive result.get("key", default) checks, duplicating key strings, or needing to store large NumPy arrays alongside JSON, migrate to Pydantic models.

A Note on Comparison Infrastructure

Shared comparison infrastructure now lives across the core packages:

polyzymd.config.comparison for comparison config and plot settings
polyzymd.cli.compare for polyzymd compare commands
polyzymd.analyses.shared for inferential statistics and result/path helpers

You do NOT need to create new top-level comparison modules for a plugin. Keep your plotting logic in your plugin’s plot() method and your formatting in format(). Start with all logic in __init__.py; as your plugin grows, extract plotting functions into a _plotters.py module within the package (see Extracting Plotting to _plotters.py below).

Canonical comparison base classes

If your plugin needs a custom comparison result (Path B), import base classes from analyses.base — this is the only canonical import path:

from polyzymd.analyses.base import (
    ANOVAResult,
    BaseComparisonResult,
    BaseConditionSummary,
    PairwiseResult,
)

Class	Purpose
`BaseComparisonResult`	Abstract base for custom plugin comparison results (generic over summary/pairwise types)
`BaseConditionSummary`	Abstract base for per-condition summaries (requires `primary_metric_value` / `primary_metric_sem` properties)
`PairwiseResult`	Standard pairwise t-test result with FDR fields
`ANOVAResult`	Standard ANOVA result

These classes provide .save() / .load() helpers, FDR-adjusted p-value fields, and serialization with NaN/Inf handling. The RMSD, Rg, and SASA plugins all subclass them for their multi-run comparison results.

Shared Utilities (`analyses/shared/`)

The analyses/shared/ package provides reusable infrastructure for plugin authors. The example above uses TrajectoryLoader, parse_time_string, and convert_time. The package now also re-exports plotting and config-hash helper symbols for direct plugin use.

TrajectoryLoader

The most important shared utility. Handles topology discovery, daisy-chain trajectory segments, timestep detection, and equilibration frame skipping:

from polyzymd.analyses.shared import TrajectoryLoader, convert_time, parse_time_string

loader = TrajectoryLoader(ctx.sim_config)
u = loader.load_universe(replicate)

# Convert equilibration time to frame offset
eq_value, eq_unit = parse_time_string(ctx.equilibration)
timestep_ps = loader.get_timestep(replicate, unit="ps")
eq_time_ps = convert_time(eq_value, eq_unit, "ps")
start_frame = int(eq_time_ps / timestep_ps)

Available re-exports

The following symbols are re-exported from polyzymd.analyses.shared for convenient one-line imports:

Source module	Re-exported symbols
`shared.loader`	`TrajectoryLoader`, `TrajectoryInfo`, `parse_time_string`, `convert_time`, `time_to_frame`
`shared.alignment`	`AlignmentConfig`, `ReferenceMode`, `align_trajectory`, `get_alignment_description`
`shared.statistics`	`compute_sem`, `aggregate_per_residue_stats`, `aggregate_region_stats`, `weighted_mean_with_sem`, `StatResult`, `PerResidueStats`
`shared.autocorrelation`	`compute_acf`, `estimate_correlation_time`, `statistical_inefficiency`, `statistical_inefficiency_multiple`, `n_effective`, `get_independent_indices`, `check_statistical_reliability`, `ACFResult`, `CorrelationTimeResult`
`shared.pbc`	`minimum_image_distance`, `pairwise_distances_pbc`
`shared.constants`	`DEFAULT_CONTACT_CUTOFF`, `DEFAULT_DISTANCE_THRESHOLD`, `DEFAULT_SURFACE_EXPOSURE_THRESHOLD`
`shared.defaults`	`AnalysisDefaults`
`shared.config_hash`	`compute_config_hash`, `validate_config_hash`
`shared.plotting`	`get_theme`, `apply_axis_style`, `apply_legend`, `get_colors`, `get_output_path`, `save_figure`, `grouped_bars`

Convergence detection

analyses/shared/convergence.py provides find_convergence_time(), a sliding-window slope heuristic for detecting sustained convergence in any timeseries. The RMSD plugin uses it to annotate per-run convergence diagnostics, but it is generic enough for any signal (e.g., Rg, energy).

from polyzymd.analyses.shared.convergence import find_convergence_time

result = find_convergence_time(time_ns, rmsd_values)
if result.converged:
    print(f"Converged at {result.convergence_time_ns:.1f} ns")

Multi-run comparison helpers

Some plugins track multiple named runs per condition — for example, per-chain RMSD, per-domain Rg, or per-target SASA. These plugins share a common compare loop: filter conditions → build pairs → run pairwise t-tests → run ANOVA → apply FDR correction. The multi_run_comparison module extracts that orchestration logic so plugins do not reimplement it.

from polyzymd.analyses.shared.multi_run_comparison import (
    apply_fdr_correction,
    build_condition_pairs,
    filter_summaries_with_run,
)

Function	Purpose
`filter_summaries_with_run()`	Keep only conditions that contain a specific named run
`build_condition_pairs()`	Build control-vs-all or all-vs-all condition pairs
`apply_fdr_correction()`	Benjamini-Hochberg FDR correction across pairwise and ANOVA results

Usage sketch inside a custom compare() override:

def compare(self, ctx):
    summaries = self._load_all_summaries(ctx)
    for run_label in run_labels:
        # Keep only conditions that have this run
        filtered = filter_summaries_with_run(summaries, run_label, get_run_fn)
        # Build pairwise pairs (control-vs-all or all-vs-all)
        pairs = build_condition_pairs(
            list(filtered.keys()), ctx.control_label
        )
        # ... run t-tests, ANOVA ...
    # Correct p-values across all runs
    apply_fdr_correction(all_pairwise, all_anova, fdr_alpha=ctx.fdr_alpha)

The RMSD, Rg, and SASA plugins all use this pattern. Study src/polyzymd/analyses/rmsd/__init__.py for a complete example.

Multi-run formatting helpers

After computing multi-run statistics, you need to format them for the CLI. The multi_run_formatting module provides reusable building blocks for ranked tables, pairwise comparison lines, and ANOVA summaries:

from polyzymd.analyses.shared.multi_run_formatting import (
    format_anova_line,
    format_pairwise_line,
    make_ranked_table_header,
    make_section_title,
)

Function	Purpose
`make_ranked_table_header()`	Build a ranked-table header with separator
`format_pairwise_line()`	Format one pairwise comparison as a text line
`format_anova_line()`	Format ANOVA results as a text line
`make_section_title()`	Build a section title with separator
`make_ranked_rows()`	Build `(label, mean, sem, rank)` tuples for table rows
`make_ranked_markdown_header()`	Build ranked-table headers for markdown output

These helpers keep CLI output visually consistent across multi-run plugins. See src/polyzymd/analyses/rmsd/_formatters.py for a complete formatting example.

Tip

If your plugin tracks multiple named runs (like per-chain or per-domain metrics), use the shared multi_run_comparison and multi_run_formatting helpers instead of reimplementing the compare → rank → pairwise → ANOVA → FDR loop. This keeps the statistical pipeline consistent and avoids subtle bugs in p-value correction.

Other submodules

Other submodules (selections, aggregation, diagnostics, aa_classification, logging_utils) are available via direct import but are not re-exported from the package root:

The metric_type classification is a planned future enhancement and is not yet implemented in the current codebase.

# Direct submodule import (not re-exported)
from polyzymd.analyses.shared.selections import get_positions
from polyzymd.analyses.shared.aggregation import collect_replicate_results

Plugins can now import plotting and config hash utilities directly from the package root:

from polyzymd.analyses.shared import (
    compute_config_hash,
    get_output_path,
    save_figure,
)

Note

Re-export imports are preferred for plugin code because they keep import paths stable even if internal module locations change.

What a Good Community PR Looks Like

A strong contribution usually includes:

one new plugin package in src/polyzymd/analyses/<name>/
one focused test file
one short docs update showing the plugins: config block
figures only if they add scientific value

Aim for a plugin that another MDAnalysis user can understand in one read.

Extracting Plotting to `_plotters.py`

As a plugin grows, its plot() method may accumulate many helper functions. Established plugins extract these into a _plotters.py module within the plugin package:

analyses/rmsf/
├── __init__.py      # Analysis lifecycle (compute, aggregate, compare, plot)
├── _plotters.py     # Plotting functions called by plot()
├── _results.py      # Result models
└── ...

The plot() method in __init__.py delegates to functions in _plotters.py:

from ._plotters import plot_comparison_bars, plot_per_residue

def plot(self, ctx: PlotContext) -> list[Path]:
    data, labels = self._build_plot_data(ctx)
    paths = []
    paths.append(plot_comparison_bars(data, labels, ctx.output_dir, ctx.plot_settings))
    paths.append(plot_per_residue(data, labels, ctx.output_dir, ctx.plot_settings))
    return [p for p in paths if p is not None]

When to extract: Consider extracting when your plugin has 3+ plot functions or __init__.py exceeds ~500 lines. For simple plugins (like catalytic_triad/), keeping plotting inline is fine.

All 6 established plugins have _plotters.py: rmsf, secondary_structure, distances, contacts, catalytic_triad, hydrogen_bonds.

Anti-Patterns to Avoid

Do not bypass the context

# Wrong
def compute_replicate(self, ctx, replicate):
    config = SimulationConfig.from_yaml("/my/path/config.yaml")

# Right
def compute_replicate(self, ctx, replicate):
    sim_config = ctx.sim_config  # already loaded by framework

Understand the result-saving convention

The framework auto-saves return values from compute_replicate() and aggregate() to ctx.result_path. For dict plugins, returning your result is usually sufficient.

For caching with explicit target naming, save explicitly:

import json


def aggregate(self, ctx, results):
    agg = {"mean": 15.0}
    target_path = ctx.result_path
    if target_path is None:
        target_path = ctx.output_dir / "result.json"
    target_path.parent.mkdir(parents=True, exist_ok=True)
    target_path.write_text(json.dumps(agg, indent=2), encoding="utf-8")
    return agg

Do not import heavy deps at module level

# Wrong — breaks environments without MDAnalysis
import MDAnalysis as mda


# Right — lazy import inside method
def compute_replicate(self, ctx, replicate):
    import MDAnalysis as mda

Framework utilities from analyses/shared/ are imported at module level — they are always available and patchable in tests. Only heavy third-party dependencies (MDAnalysis, mdtraj, matplotlib) should be imported inside methods.

Do not create top-level comparison packages

New plugins should keep all logic in the plugin package. Use the canonical base classes from analyses.base (BaseComparisonResult, BaseConditionSummary, PairwiseResult, ANOVAResult) for custom comparison results, and the shared helpers from analyses/shared/ for multi-run orchestration and formatting.

Existing Plugins to Study

Plugin	Comparison path	Good example of
`catalytic_triad/`	Default (`extract_metrics`)	Simplest lifecycle — minimal compute logic with `_plotters.py` for KDE/bar plots
`rmsf/`	Default (`extract_metrics`)	Default scalar comparison with `_plotters.py` extraction
`secondary_structure/`	Default (`extract_metrics`)	Wrapping existing calculators with `_plotters.py` and typed result classes
`rmsd/`	Custom (multi-run)	Multi-run comparison with `multi_run_comparison` + `multi_run_formatting` shared helpers
`rg/`	Custom (multi-run)	Same multi-run pattern as RMSD, applied to radius of gyration
`contacts/`	Custom (`compare` override)	Multi-metric comparison, condition filtering, large `_plotters.py`
`distances/`	Custom (`compare` override)	Entry-table style comparison with `_plotters.py` extraction

Important

Do not interpret plugin differences as “simple” versus “complex” quality. Several existing plugins reflect earlier architectural phases. Use them as pattern references for specific tasks (result modeling, comparison style, plotting), not as maturity rankings.

Testing Your Plugin

Test file naming

Name your test file tests/analyses/plugins/test_<name>.py (for example, tests/analyses/plugins/test_solvent_contacts.py).

Testing strategy

A reliable test suite for plugins usually combines:

discovery and class-contract tests (name, Settings, discovery)
pure logic tests (extract_metrics, aggregate)
isolated compute tests (compute_replicate with mocked trajectory loader)
filesystem-backed integration tests for compare() and plot()

This layering keeps tests fast while still validating framework integration.

Basic tests (discovery, settings, metrics)

These tests use no trajectory data and verify the plugin integrates with the framework:

from polyzymd.analyses import get_analysis, list_analyses
from polyzymd.analyses.base import MetricValue


class TestSolventContactsDiscovery:
    """Test that the plugin is discoverable."""

    def test_discovered(self):
        assert "solvent_contacts" in list_analyses()

    def test_class_variables(self):
        cls = get_analysis("solvent_contacts")
        assert cls.name == "solvent_contacts"
        assert hasattr(cls, "Settings")

    def test_settings_defaults(self):
        cls = get_analysis("solvent_contacts")
        settings = cls.Settings()
        assert settings.selection == "protein and name CA"


class TestSolventContactsMetrics:
    """Test metric extraction for default comparison."""

    def test_extract_metrics(self):
        cls = get_analysis("solvent_contacts")
        analysis = cls()
        summary = {
            "mean_contact_fraction": 0.142,
            "sem_contact_fraction": 0.010,
            "replicate_values": [0.136, 0.148],
        }
        metrics = analysis.extract_metrics(summary)
        assert "mean_contact_fraction" in metrics
        assert isinstance(metrics["mean_contact_fraction"], MetricValue)
        assert metrics["mean_contact_fraction"].higher_is_better is False

Testing `compute_replicate()` with mocked trajectories

compute_replicate() requires MDAnalysis and trajectory files. Mock TrajectoryLoader at the module level to test compute logic without real trajectory data:

from pathlib import Path
from unittest.mock import MagicMock, patch

from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import Condition, ReplicateContext


class TestSolventContactsCompute:
    """Test compute_replicate with mocked trajectories."""

    @patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader")
    def test_computes_metric(self, mock_loader_cls, tmp_path):
        cls = get_analysis("solvent_contacts")
        analysis = cls()
        settings = cls.Settings()

        mock_loader = MagicMock()
        mock_loader_cls.return_value = mock_loader

        mock_universe = MagicMock()
        mock_target = MagicMock()
        mock_target.__len__.return_value = 10
        mock_target.positions = [[0.0, 0.0, 0.0]] * 10
        mock_solvent = MagicMock()
        mock_solvent.__len__.return_value = 20
        mock_solvent.positions = [[1.0, 1.0, 1.0]] * 20
        mock_universe.select_atoms.side_effect = [mock_target, mock_solvent]

        mock_trajectory = MagicMock()
        mock_trajectory.__getitem__.return_value = range(50)
        mock_universe.trajectory = mock_trajectory

        mock_loader.load_universe.return_value = mock_universe
        mock_loader.get_timestep.return_value = 10.0

        condition = Condition(
            label="Test",
            config_path=Path("/fake/config.yaml"),
            replicates=(1,),
            sim_config=MagicMock(),
        )
        ctx = ReplicateContext(
            condition=condition,
            replicate=1,
            sim_config=condition.sim_config,
            output_dir=tmp_path / "run_1",
            equilibration="0ns",
            recompute=True,
            settings=settings,
        )

        result = analysis.compute_replicate(ctx, replicate=1)
        assert "mean_contact_fraction" in result
        assert result["mean_contact_fraction"] >= 0
        assert result["replicate"] == 1

Note the patch target: @patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader"). This works because TrajectoryLoader is imported at module level in solvent_contacts/__init__.py.

Testing `aggregate()`

Aggregation takes plain data and needs no trajectory mocks:

from pathlib import Path
from unittest.mock import MagicMock

import numpy as np

from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import AggregateContext, Condition


class TestSolventContactsAggregate:
    """Test aggregation across replicates."""

    def test_aggregate(self, tmp_path):
        cls = get_analysis("solvent_contacts")
        analysis = cls()
        settings = cls.Settings()

        condition = Condition(
            label="Test",
            config_path=Path("/fake/config.yaml"),
            replicates=(1, 2, 3),
            sim_config=MagicMock(),
        )
        ctx = AggregateContext(
            condition=condition,
            replicates=(1, 2, 3),
            output_dir=tmp_path / "aggregated",
            equilibration="10ns",
            settings=settings,
        )

        results = [
            {"mean_contact_fraction": 0.12, "replicate": 1},
            {"mean_contact_fraction": 0.10, "replicate": 2},
            {"mean_contact_fraction": 0.11, "replicate": 3},
        ]

        agg = analysis.aggregate(ctx, results)
        expected_mean = np.mean([0.12, 0.10, 0.11])
        assert abs(agg["mean_contact_fraction"] - expected_mean) < 1e-10
        assert agg["replicate_values"] == [0.12, 0.10, 0.11]

Testing `compare()` with filesystem-backed aggregated results

compare() reads condition-specific result files from analysis directories, so tests should construct realistic filesystem layout with result.json files and a ComparisonContext:

import json
from pathlib import Path
from unittest.mock import MagicMock

from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import ComparisonContext, Condition


class TestCompare:
    def test_compare_two_conditions(self, tmp_path):
        cls = get_analysis("solvent_contacts")
        analysis = cls()

        # Set up filesystem with aggregated results
        for label, value in [("WT", 0.12), ("Mutant", 0.18)]:
            agg_dir = tmp_path / label / "solvent_contacts" / "aggregated"
            agg_dir.mkdir(parents=True)
            (agg_dir / "result.json").write_text(
                json.dumps(
                    {
                        "mean_contact_fraction": value,
                        "sem_contact_fraction": 0.01,
                        "replicate_values": [value - 0.01, value + 0.01],
                    }
                ),
                encoding="utf-8",
            )

        conditions = [
            Condition(
                label="WT",
                config_path=Path("/fake"),
                replicates=(1, 2),
                sim_config=MagicMock(),
            ),
            Condition(
                label="Mutant",
                config_path=Path("/fake"),
                replicates=(1, 2),
                sim_config=MagicMock(),
            ),
        ]

        ctx = ComparisonContext(
            name="test",
            conditions=conditions,
            excluded_conditions=[],
            control_label="WT",
            analysis_dirs={
                "WT": tmp_path / "WT" / "solvent_contacts",
                "Mutant": tmp_path / "Mutant" / "solvent_contacts",
            },
            results_dir=tmp_path / "results",
            equilibration="10ns",
            settings=cls.Settings(),
            recompute=True,
        )

        result = analysis.compare(ctx)
        assert result is not None

Tip

This style of test is robust because it validates the same on-disk contract used by real compare runs (<condition>/<analysis>/aggregated/result.json).

Matplotlib backend in CI/headless tests

When testing plot code in CI or headless environments, force a non-interactive backend before importing matplotlib.pyplot:

import matplotlib

matplotlib.use("Agg")  # headless backend for CI / testing
import matplotlib.pyplot as plt

This avoids backend/display errors and keeps plotting tests deterministic.

Run tests with:

pixi run -e build pytest tests/analyses/plugins/test_solvent_contacts.py -v

Appendix: The `comparison.yaml` File

The comparison.yaml file is the single configuration file that drives the entire comparison workflow. It defines which simulation conditions to compare, which analyses to run, and how to generate plots. Understanding this file is important for plugin authors because your Settings class maps directly to the plugins: section.

Generating a template

The fastest way to create a comparison.yaml is with the scaffold command:

polyzymd compare init -n my_study

This creates a project directory with a fully commented template. You can also specify a custom equilibration time:

polyzymd compare init -n my_study --eq-time 20ns

File structure

A comparison.yaml has four main sections:

# Project metadata
name: "my_study"
description: "Comparison of polymer conditions"
control: "Condition A"  # or null if no control

# Conditions — each points to a simulation's config.yaml
conditions:
  - label: "Condition A"
    config: "../path/to/condition_a/config.yaml"
    replicates: [1, 2, 3]

  - label: "Condition B"
    config: "../path/to/condition_b/config.yaml"
    replicates: [1, 2, 3]

# Defaults
defaults:
  equilibration_time: "10ns"

# Plugins — which analyses to run and their settings
plugins:
  rmsf:
    selection: "protein and name CA"
    reference_mode: "centroid"

  # Your plugin's Settings fields go here:
  my_analysis:
    example_parameter: "value"

# Plot settings — visual customization
plot_settings:
  output_dir: "figures/"
  format: "png"
  dpi: 300
  style: "publication"

How `plugins:` connects to your `Settings` class

When the framework runs your plugin, it reads the plugins: section of comparison.yaml and deserializes the matching key into your Settings model. For example, if your plugin has:

class MySettings(BaseModel):
    cutoff: float = 4.5
    selection: str = "protein and name CA"

Then users configure it in comparison.yaml as:

plugins:
  my_analysis:
    cutoff: 5.0
    selection: "chainID A"

Any fields not specified in the YAML use the defaults from your Settings class. This is why providing sensible defaults for every field is important — it means the analysis works without any explicit configuration.

Running comparisons

After editing comparison.yaml:

# Validate the configuration
polyzymd compare validate

# Run a specific analysis
polyzymd compare run rmsf

# Run all enabled analyses
polyzymd compare run-all

# Generate plots
polyzymd compare plot-all

See How to Compare Simulation Conditions for a complete user guide on running comparisons.

Extending the Analysis Framework

Prerequisites

How the Plugin System Works

Anatomy of a Plugin

name (required)

Settings (required)

compute_replicate(ctx, replicate) (required)

aggregate(ctx, results) (required)

extract_metrics(summary) (optional, but recommended)

compare(ctx) (optional override)

plot(ctx) (optional)

format(result, output_format) (optional)

filter_conditions(conditions, settings) (optional)

Handling partially-applicable conditions in compute_replicate()

The Complete Example

Base Class Helpers

_build_plot_data(ctx, *, include_replicates=False)

_check_cache(result_cls, cache_path, *, recompute, sim_config=None)

_format_replicate_range(replicates)

Summary of helpers

Import Rules

Context Objects Reference

PlotContext.plot_settings

Type and global fields

Theme system

Per-analysis settings

Standard plugin pattern

Example using shared plotting helpers

Choosing Your Comparison Path

Path A: Default Scalar Comparison (recommended)

Path B: Custom Comparison

Loading Results in plot()

Formatting with format_scalar_comparison()

Return Types: Dicts vs Pydantic Models

Dict path (recommended starting point)

What is Pydantic, and when would you use it?

Pydantic model path

Comparison: dicts vs Pydantic models

A Note on Comparison Infrastructure

Canonical comparison base classes

Shared Utilities (analyses/shared/)

TrajectoryLoader

Available re-exports

Convergence detection

Multi-run comparison helpers

Multi-run formatting helpers

Other submodules

Contributor Checklist

What a Good Community PR Looks Like

Extracting Plotting to _plotters.py

Anti-Patterns to Avoid

Do not bypass the context

Understand the result-saving convention

Do not import heavy deps at module level

Do not create top-level comparison packages

Existing Plugins to Study

Testing Your Plugin

Test file naming

Testing strategy

Basic tests (discovery, settings, metrics)

Testing compute_replicate() with mocked trajectories

Testing aggregate()

Testing compare() with filesystem-backed aggregated results

Matplotlib backend in CI/headless tests

See Also

Appendix: The comparison.yaml File

Generating a template

File structure

How plugins: connects to your Settings class

Running comparisons

`name` (required)

`Settings` (required)

`compute_replicate(ctx, replicate)` (required)

`aggregate(ctx, results)` (required)

`extract_metrics(summary)` (optional, but recommended)

`compare(ctx)` (optional override)

`plot(ctx)` (optional)

`format(result, output_format)` (optional)

`filter_conditions(conditions, settings)` (optional)

Handling partially-applicable conditions in `compute_replicate()`

`_build_plot_data(ctx, *, include_replicates=False)`

`_check_cache(result_cls, cache_path, *, recompute, sim_config=None)`

`_format_replicate_range(replicates)`

`PlotContext.plot_settings`

Loading Results in `plot()`

Formatting with `format_scalar_comparison()`

Shared Utilities (`analyses/shared/`)

Extracting Plotting to `_plotters.py`

Testing `compute_replicate()` with mocked trajectories

Testing `aggregate()`

Testing `compare()` with filesystem-backed aggregated results

Appendix: The `comparison.yaml` File

How `plugins:` connects to your `Settings` class