Extending the Analysis Framework
This guide shows you how to add a new analysis type to PolyzyMD. By the end, you will have a working analysis plugin that can compute metrics per replicate, aggregate across replicates, compare conditions, generate figures, and display results on the CLI.
Note
This guide covers the plugin system in analyses/.
Each plugin is a self-contained package — you implement the science
(compute, aggregate) and the framework handles discovery, iteration,
result saving, comparison statistics, and CLI wiring.
Prerequisites
A working pixi environment:
pixi install -e buildBasic familiarity with Pydantic v2
BaseModelUnderstanding of the PolyzyMD chain convention (A=protein, B=substrate, C=polymer)
How the Plugin System Works
Every analysis in PolyzyMD is a single class that subclasses Analysis and
lives in a package under src/polyzymd/analyses/<name>/. The framework discovers
plugins automatically — no registry edits, no imports, no boilerplate. You drop
a package, and the CLI can run it.
The lifecycle for each analysis is:
For each condition:
For each replicate:
result = compute_replicate(ctx, replicate) # your code
aggregated = aggregate(ctx, [replicate_results]) # your code
filter_conditions(conditions) → filtered list
compare(ctx) → ComparisonResult
plot(ctx) → [figure paths]
format(result, "text") → CLI output string
You implement the science (compute, aggregate). The framework provides discovery, replicate iteration, result saving, comparison statistics, and CLI wiring.
Anatomy of a Plugin
Before seeing the full code, here is what each piece does and why it exists.
name (required)
A unique, lowercase string identifier. Used in CLI commands
(polyzymd compare run solvent_contacts), config files, and output directories.
Settings (required)
A Pydantic v2 BaseModel inner class. Users configure your analysis through
the plugins: section of comparison.yaml, and the framework deserializes
those settings into your Settings model. Provide sensible defaults for every
field so the analysis works out of the box.
compute_replicate(ctx, replicate) (required)
Runs once per replicate per condition. The framework passes a
ReplicateContext containing everything you need: the loaded
SimulationConfig, your Settings, output paths, and equilibration time. Use
TrajectoryLoader from analyses/shared/ to load trajectories — it handles
topology discovery, trajectory segment daisy-chaining, timestep detection, and
equilibration skipping.
Return a dict or Pydantic model. The framework collects these for aggregate().
aggregate(ctx, results) (required)
Runs once per condition after all replicates complete. Receives the list of
objects your compute_replicate() returned. Compute summary statistics
(mean, SEM) and return the aggregated result. The framework auto-saves this
to ctx.result_path.
extract_metrics(summary) (optional, but recommended)
Exposes scalar metrics for the default comparison path. Return a dict of
MetricValue objects, and the framework automatically runs t-tests, computes
Cohen’s d, runs ANOVA (≥3 conditions), and ranks conditions. This is the
simplest way to get a fully statistical comparison — no need to implement
compare().
compare(ctx) (optional override)
Override only for complex multi-metric or entry-table comparisons. If you
implement extract_metrics(), you typically do not need to override
compare().
plot(ctx) (optional)
Generate matplotlib figures. Use self._build_plot_data(ctx) to collect
per-condition paths, then self._load_aggregated_result(agg_dir) to load each
condition’s data. Use shared plotting helpers (get_theme, apply_axis_style,
save_figure, get_colors) for consistent theming. Return a list of figure
paths. The scaffold generates a working plot() method automatically.
format(result, output_format) (optional)
Custom CLI display. Use format_scalar_comparison() from analyses.stats
for formatted tables with rankings, effect sizes, and significance stars.
filter_conditions(conditions, settings) (optional)
Exclude conditions where the analysis cannot run at all. The framework calls
this before dispatching any compute_replicate() or aggregate() jobs,
so excluded conditions have zero work done.
When to use it: Your analysis requires a molecular component that may not be present in all conditions. The canonical case is polymer-specific analyses — if the comparison includes a “No Polymer (Control)” condition, analyses like contacts, exposure, polymer_affinity, and polymer_bridging must skip it because there are no polymer atoms to analyse.
Canonical pattern (check sim_config.polymers, fail-open on errors):
from polyzymd.analyses.base import Condition
def filter_conditions(
self,
conditions: list[Condition],
settings: "BaseModel | None" = None,
) -> list[Condition]:
"""Exclude conditions without polymer."""
valid: list[Condition] = []
for cond in conditions:
try:
sim_config = cond.sim_config
polymers = getattr(sim_config, "polymers", None)
has_polymer = polymers is not None and getattr(polymers, "enabled", True)
if not has_polymer:
logger.info("Excluding '%s': no polymer configured", cond.label)
continue
valid.append(cond)
except (AttributeError, ValueError, KeyError, OSError) as e:
logger.warning("Error checking '%s': %s — including anyway", cond.label, e)
valid.append(cond) # fail-open
return valid
Key rules:
Always fail-open on unexpected errors — include the condition rather than crashing the entire comparison.
Log excluded conditions at
INFOlevel so users can see what was skipped.The
settingsparameter carries the resolved pluginSettingsinstance. Use it if your exclusion logic depends on user-configurable selections (e.g., a custom polymer selection string).
Existing implementations: contacts, exposure, polymer_affinity,
and polymer_bridging all implement filter_conditions().
binding_free_energy has a no-op override (intentionally keeps all
conditions).
Tip
If only some sub-computations within your analysis are inapplicable — not
the entire analysis — use graceful empty-selection handling inside
compute_replicate() instead. See the next section.
Handling partially-applicable conditions in compute_replicate()
Sometimes an analysis is valid for a condition overall, but specific
sub-computations within it are not. In that case, don’t exclude the entire
condition — instead, handle empty AtomGroup selections gracefully inside
compute_replicate().
Example: The hydrogen_bonds plugin computes H-bonds between multiple
group pairs (protein–protein, protein–polymer, protein–substrate). For a
“No Polymer” control, protein–protein H-bonds are still meaningful, but
protein–polymer H-bonds are not because there are no polymer atoms. Rather
than excluding the condition entirely (which would lose the protein–protein
data), the plugin:
Detects that the “polymer” group selection matched zero atoms.
Logs a warning:
"Group 'polymer' selection 'chainid C' matched no atoms — will skip summaries using it".Produces zero-filled placeholder results for summaries involving that empty group (e.g., 0 H-bonds, 0.0 occupancy). This is scientifically correct — zero protein–polymer H-bonds is the ground truth when no polymer is present.
Computes normally for summaries that don’t involve the empty group (protein–protein H-bonds still run).
# Pattern: graceful empty-selection handling (from hydrogen_bonds)
for group_name, selection_str in settings.groups.items():
atom_group = u.select_atoms(selection_str)
if len(atom_group) == 0:
if settings.allow_empty_groups:
logger.warning(
"Group '%s' selection '%s' matched no atoms — will skip summaries using it",
group_name,
selection_str,
)
else:
raise ValueError(
f"Group '{group_name}' selection '{selection_str}' matched no atoms."
)
When to use which pattern:
Situation |
Pattern |
Example |
|---|---|---|
Entire analysis is meaningless for a condition |
|
Polymer–protein contacts with no polymer |
Some sub-computations are invalid, others are valid |
Graceful empty-selection handling |
H-bonds: protein–protein valid, protein–polymer not |
Analysis works for all conditions |
Neither (use defaults) |
RMSF, RMSD, secondary structure |
The Complete Example
Here is a single, complete plugin file that uses real framework utilities. This is the pattern you should follow for new analyses.
Important
The SolventContactsAnalysis implementation below is a walkthrough example for
a hypothetical solvent_contacts plugin name that does not collide with
built-in plugins.
Create src/polyzymd/analyses/solvent_contacts/ as a package in your own branch or downstream
project, or use the scaffold command to generate the boilerplate automatically
(the scaffold includes compute, aggregate, comparison, plotting, and tests):
# Default: dict-based results (simplest starting point)
polyzymd new-analysis solvent_contacts
# Typed Pydantic result models (better for complex analyses)
polyzymd new-analysis solvent_contacts --style pydantic
The --style flag controls how results are structured:
dict(default) — Uses plain dicts for replicate/aggregated results. No result model classes needed. Best for getting started quickly.pydantic— Generates typedReplicateResultandAggregatedResultPydantic models withAggregatedResultClass. Better when you need validation, IDE autocomplete, or NPZ sidecar storage.
If creating manually:
"""Solvent contacts analysis plugin."""
from __future__ import annotations
import logging
from pathlib import Path
from typing import Any, ClassVar, Sequence
import numpy as np
from pydantic import BaseModel, Field
from polyzymd.analyses.base import (
AggregateContext,
Analysis,
MetricValue,
PlotContext,
ReplicateContext,
)
from polyzymd.analyses.shared import TrajectoryLoader, convert_time, parse_time_string
from polyzymd.analyses.stats import format_scalar_comparison
logger = logging.getLogger(__name__)
class SolventContactsAnalysis(Analysis):
"""Solvent contacts around a selected protein region."""
name: ClassVar[str] = "solvent_contacts"
class Settings(BaseModel):
"""Settings for solvent contact-fraction analysis.
Attributes
----------
selection : str
MDAnalysis selection string for target atoms
solvent_selection : str
MDAnalysis selection for solvent atoms used in contacts
cutoff : float
Contact cutoff distance in Angstrom
"""
selection: str = Field(
default="protein and name CA",
description="MDAnalysis atom selection for target atoms",
)
solvent_selection: str = Field(
default="resname HOH and name O",
description="MDAnalysis atom selection for solvent contact atoms",
)
cutoff: float = Field(
default=4.5,
gt=0,
description="Distance cutoff in Angstrom for a contact",
)
def compute_replicate(self, ctx: ReplicateContext, replicate: int) -> dict[str, Any]:
"""Compute solvent contact fraction for one replicate.
Uses TrajectoryLoader for topology discovery, trajectory segment
daisy-chaining, and equilibration frame skipping
"""
# Lazy-import heavy third-party dependency
import MDAnalysis as mda # noqa: F401
from MDAnalysis.analysis.distances import distance_array
loader = TrajectoryLoader(ctx.sim_config)
u = loader.load_universe(replicate)
eq_value, eq_unit = parse_time_string(ctx.equilibration)
timestep_ps = loader.get_timestep(replicate, unit="ps")
eq_time_ps = convert_time(eq_value, eq_unit, "ps")
start_frame = int(eq_time_ps / timestep_ps)
target_atoms = u.select_atoms(ctx.settings.selection)
solvent_atoms = u.select_atoms(ctx.settings.solvent_selection)
contact_fractions = []
for _ts in u.trajectory[start_frame:]:
if len(target_atoms) == 0 or len(solvent_atoms) == 0:
contact_fractions.append(0.0)
continue
dists = distance_array(target_atoms.positions, solvent_atoms.positions)
contact_pairs = np.sum(dists <= ctx.settings.cutoff)
normalizer = len(target_atoms) * len(solvent_atoms)
contact_fractions.append(float(contact_pairs / normalizer))
return {
"mean_contact_fraction": float(np.mean(contact_fractions)),
"std_contact_fraction": float(np.std(contact_fractions)),
"n_frames": len(contact_fractions),
"replicate": replicate,
}
def aggregate(self, ctx: AggregateContext, results: Sequence[Any]) -> dict[str, Any]:
"""Average contact fraction across replicates with SEM."""
values = [r["mean_contact_fraction"] for r in results]
return {
"mean_contact_fraction": float(np.mean(values)),
"sem_contact_fraction": float(np.std(values, ddof=1) / np.sqrt(len(values))),
"replicate_values": values,
}
def extract_metrics(self, summary: Any) -> dict[str, MetricValue]:
"""Expose contact fraction as scalar metric for default comparison."""
return {
"mean_contact_fraction": MetricValue(
name="mean_contact_fraction",
mean=summary["mean_contact_fraction"],
sem=summary["sem_contact_fraction"],
replicate_values=summary["replicate_values"],
higher_is_better=False,
direction_labels=("more exposed", "unchanged", "more shielded"),
)
}
def plot(self, ctx: PlotContext) -> list[Path]:
"""Generate a contact-fraction bar chart comparing conditions."""
import matplotlib.pyplot as plt
from polyzymd.analyses.shared import apply_axis_style, get_colors, get_theme, save_figure
# Use the base class helper to gather per-condition paths
data, labels = self._build_plot_data(ctx)
if not labels:
return []
# Load aggregated results for each condition
plot_labels = []
means = []
sems = []
for label in labels:
if label not in data or label == "__meta__":
continue
agg_dir = data[label]["aggregated_dir"]
summary = self._load_aggregated_result(agg_dir)
if summary is None:
continue
plot_labels.append(label)
means.append(summary["mean_contact_fraction"])
sems.append(summary["sem_contact_fraction"])
if not plot_labels:
return []
# Use shared plotting utilities for consistent theming
plot_settings = ctx.plot_settings
theme = get_theme(plot_settings)
colors = get_colors(len(plot_labels), plot_settings)
fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(plot_labels, means, yerr=sems, capsize=5, color=colors[: len(plot_labels)])
apply_axis_style(
ax,
plot_settings,
title="Solvent Contact Fraction Across Conditions",
ylabel="Contact fraction",
)
output_path = ctx.output_dir / "solvent_contacts_comparison.png"
output_path.parent.mkdir(parents=True, exist_ok=True)
save_figure(fig, output_path, plot_settings)
plt.close(fig)
return [output_path]
def format(self, result: Any, output_format: str = "text") -> str:
"""Format comparison result for CLI display."""
from polyzymd.analyses.base import ComparisonResult
if isinstance(result, ComparisonResult):
return format_scalar_comparison(
result,
title="Solvent Contacts Comparison",
metric_label="Mean contact fraction",
metric_unit="",
metric_key="mean_contact_fraction",
output_format=output_format,
higher_is_better=False,
)
return super().format(result, output_format)
That file is now discoverable, runnable through compare workflows, and compatible with default framework formatting/statistics.
Base Class Helpers
The Analysis base class provides three utility methods that reduce
boilerplate across the lifecycle. The SolventContactsAnalysis example above uses
_build_plot_data() in its plot() method, but all three are available
to every plugin.
_build_plot_data(ctx, *, include_replicates=False)
Collects per-condition paths from a PlotContext into a dict ready for
plotter functions. Returns (data, labels) where data[label] contains
"analysis_dir" and "aggregated_dir" keys:
def plot(self, ctx: PlotContext) -> list[Path]:
# One line replaces the 8-12 line loop through ctx.conditions
data, labels = self._build_plot_data(ctx, include_replicates=True)
if not labels:
return []
for label in labels:
agg_dir = data[label]["aggregated_dir"]
summary = self._load_aggregated_result(agg_dir)
# ... plot summary data ...
Manual alternative
Without _build_plot_data(), you write the loop yourself — which is
fine for understanding the data flow:
def plot(self, ctx: PlotContext) -> list[Path]:
for cond in ctx.conditions:
agg_dir = ctx.analysis_dirs[cond.label] / "aggregated"
summary = self._load_aggregated_result(agg_dir)
if summary is None:
continue
# ... plot summary data ...
_check_cache(result_cls, cache_path, *, recompute, sim_config=None)
Consolidates the cache-checking pattern used in compute_replicate().
Loads a Pydantic result from disk if the cache file exists and
recompute is False, optionally validating the config hash:
def compute_replicate(self, ctx: ReplicateContext, replicate: int) -> dict:
result_file = ctx.output_dir / "solvent_contacts_result.json"
# One line replaces the 5-line cache check pattern
cached = self._check_cache(
SolventContactsReplicateResult, result_file,
recompute=ctx.recompute, sim_config=ctx.sim_config,
)
if cached is not None:
return cached
# ... expensive computation ...
Note
_check_cache() requires a Pydantic model class with a .load(path)
method (i.e., a BaseAnalysisResult subclass). Dict-based plugins that
want caching should use explicit json.loads() / Path.exists() checks
instead.
_format_replicate_range(replicates)
Formats replicate tuples into compact strings for log messages and filenames. Contiguous ranges are collapsed:
cls._format_replicate_range((1, 2, 3)) # → "reps1-3"
cls._format_replicate_range((1, 3, 5)) # → "reps1_3_5"
Useful in aggregate() for logging or naming output files:
def aggregate(self, ctx: AggregateContext, results):
rep_str = self._format_replicate_range(ctx.replicates)
logger.info("Aggregating %s across %s", self.name, rep_str)
# ...
Summary of helpers
Helper |
Used in |
Purpose |
|---|---|---|
|
|
Collect per-condition paths from |
|
|
Load cached Pydantic result if valid |
|
|
Compact replicate string for messages/filenames |
Import Rules
The import structure in the example above is intentional. Getting imports wrong causes subtle test failures, so follow these rules:
Module-level imports (top of file):
Standard library (
logging,pathlib)NumPy — imported at module level because it is available in project environments
Framework utilities from
analyses/shared/(TrajectoryLoader,parse_time_string,convert_time,AlignmentConfig, etc.)Framework base classes from
analyses/baseandanalyses/stats
Lazy imports inside methods (inside compute_replicate, plot, etc.):
MDAnalysis/mdtraj— heavy simulation librariesmatplotlib/seaborn— plotting librariesAny package that may not be installed in all environments
# At module level — always available, needed for patch targets in tests
from polyzymd.analyses.shared import TrajectoryLoader, convert_time, parse_time_string
# Inside methods — heavy deps that may not be installed
def compute_replicate(self, ctx, replicate):
import MDAnalysis as mda
...
def plot(self, ctx):
import matplotlib.pyplot as plt
...
Why this matters for testing: when you mock TrajectoryLoader in tests,
you write @patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader"). This replaces
the module-level name. If TrajectoryLoader were imported lazily inside the
method, the patch target would be different and easier to get wrong.
Context Objects Reference
The framework provides context objects so plugins never need to load configs or discover paths themselves:
Context |
Passed To |
Key Attributes |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
PlotContext.plot_settings
PlotContext.plot_settings carries global plotting configuration from
polyzymd.config.comparison.PlotSettings. Use this object instead of hard-coding
figure style, DPI, or output behavior.
Type and global fields
ctx.plot_settings is always an instance of PlotSettings (the orchestrator
guarantees this — it is never None).
Global fields available on PlotSettings include:
output_dirformatdpistylecolor_palettetheme
Theme system
PlotSettings.theme uses a PlotTheme model that includes 28 visual style
fields (for example: title_fontsize, label_fontsize, tick_fontsize,
dot_size, line widths, transparency, and grid controls). The theme system has
three presets:
PlotTheme.publication()PlotTheme.presentation()PlotTheme.minimal()
Per-analysis settings
Per-analysis settings are accessed as attributes on PlotSettings, for example:
plot_settings.contactsplot_settings.rmsf
These fields use a __getattr__ fallback pattern, so access is stable even
when a specific analysis block is not configured; defaults are returned.
Standard plugin pattern
The orchestrator guarantees that ctx.plot_settings is always a valid
PlotSettings instance — never None. Use it directly:
plot_settings = ctx.plot_settings
Choosing Your Comparison Path
Path A: Default Scalar Comparison (recommended)
Implement extract_metrics() and you get t-tests, Cohen’s d, ANOVA, and
ranking for free.
The framework loads aggregated results from disk automatically using:
json.loads()for dict-based pluginsyour
AggregatedResultClass.load(path)whenAggregatedResultClassis set
You generally do not need to override _deserialize_result().
Path B: Custom Comparison
Override compare() for multi-metric or entry-table analyses:
from typing import Any
from polyzymd.analyses.base import ComparisonContext
def compare(self, ctx: ComparisonContext) -> Any:
# Load results, compute custom statistics, return your result model
# The returned object must support save(path)
...
See src/polyzymd/analyses/contacts/ or src/polyzymd/analyses/distances/
for full examples.
Loading Results in plot()
PlotContext provides paths and conditions but does not carry pre-loaded
aggregated results. The recommended approach is to use _build_plot_data() to
collect per-condition paths, then _load_aggregated_result() to load each
condition’s data:
from pathlib import Path
from polyzymd.analyses.base import PlotContext
def plot(self, ctx: PlotContext) -> list[Path]:
import matplotlib.pyplot as plt
data, labels = self._build_plot_data(ctx)
for label in labels:
if label not in data or label == "__meta__":
continue
agg_dir = data[label]["aggregated_dir"]
summary = self._load_aggregated_result(agg_dir)
if summary is not None:
# summary is a dict or AggregatedResultClass instance
# ... plot data from summary ...
pass
return []
You can also loop through ctx.conditions directly for fine-grained control:
def plot(self, ctx: PlotContext) -> list[Path]:
import matplotlib.pyplot as plt
for cond in ctx.conditions:
agg_dir = ctx.analysis_dirs[cond.label] / "aggregated"
summary = self._load_aggregated_result(agg_dir)
if summary is not None:
# ... plot data from summary ...
pass
return []
_load_aggregated_result() looks for result.json in the aggregated
directory, falling back to the most recent *.json file. It deserializes
using AggregatedResultClass if set, otherwise json.loads(). Returns None
if no result file is found.
Warning
Do not bypass _load_aggregated_result() with custom file parsing unless your
plugin requires a non-standard storage format. The helper already handles both
dict and model-based deserialization behavior.
Formatting with format_scalar_comparison()
For analyses using the default comparison path, format_scalar_comparison()
from analyses.stats produces formatted tables with rankings, effect sizes,
and significance stars:
from typing import Any
from polyzymd.analyses.base import ComparisonResult
from polyzymd.analyses.stats import format_scalar_comparison
def format(self, result: Any, output_format: str = "text") -> str:
if isinstance(result, ComparisonResult):
return format_scalar_comparison(
result,
title="My Analysis Comparison",
metric_label="My Metric",
metric_unit="Å",
metric_key="my_metric",
output_format=output_format,
higher_is_better=True,
)
return super().format(result, output_format)
Parameters:
Parameter |
Type |
Description |
|---|---|---|
|
|
The result to format |
|
|
Display title (for example, |
|
|
Human-readable metric name for table headers |
|
|
Unit string appended to values (for example, |
|
|
Key prefix in |
|
|
|
|
|
Affects interpretation wording |
Return Types: Dicts vs Pydantic Models
Plugins can return either plain dicts or typed Pydantic models from
compute_replicate() and aggregate(). Both paths are fully supported.
For new plugins, start with dicts. They require no extra imports, no extra classes, and they work with the default comparison pipeline out of the box. You can always migrate to Pydantic models later if your plugin grows in complexity.
Dict path (recommended starting point)
Dict plugins are the simplest way to get a working analysis:
compute_replicate()returnsdictaggregate()returnsdictNo
AggregatedResultClassclass variable neededFramework saves/loads JSON using standard
json.dumps()/json.loads()
class MyAnalysis(Analysis):
name = "my_analysis"
class Settings(BaseModel):
selection: str = "protein and name CA"
def compute_replicate(self, ctx, replicate):
return {"value": 1.23, "replicate": replicate}
def aggregate(self, ctx, results):
vals = [r["value"] for r in results]
return {"mean_value": sum(vals) / len(vals), "replicate_values": vals}
def extract_metrics(self, summary):
return {
"value": MetricValue(
name="value",
mean=summary["mean_value"],
sem=0.1,
replicate_values=summary["replicate_values"],
)
}
This is the path used in the complete SolventContactsAnalysis example earlier in this
tutorial. The framework handles serialization, deserialization, and the full
comparison lifecycle without any Pydantic result models.
What is Pydantic, and when would you use it?
Pydantic is a Python data validation library.
A Pydantic BaseModel subclass is like a dataclass with automatic type
checking: if you declare a field as float, Pydantic raises a validation
error if someone passes a string. PolyzyMD already uses Pydantic for its
config system (Settings classes), so it is always available in the
environment.
For result models, Pydantic adds:
Validation on construction — catches type errors early (e.g., passing a string where a float is expected)
Typed attribute access —
result.mean_rginstead ofresult["mean_rg"], with IDE autocompleteStructured extension path — start with scaffold-generated
BaseModelclasses, then upgrade toBaseAnalysisResultif you need helper I/O methods or NPZ sidecar storageNested structure — complex result hierarchies with validated sub-models
Pydantic model path
Use models when you need validation, nested structure, strict typing, or more explicit result contracts.
Key pieces:
The scaffold
--style pydanticpath generates result models inheriting from PydanticBaseModelPlugin class sets
AggregatedResultClassto the aggregated model classFramework deserializes aggregated files through
AggregatedResultClass.model_validate_json(...)(or.load(path)when the class provides it)BaseModelclasses are the simple default and do not include.save()/.load()helpersIf you need NPZ sidecars or model-level I/O helpers, manually upgrade your result models to
polyzymd.analyses._results_base.BaseAnalysisResult
from pathlib import Path
from typing import ClassVar
from pydantic import BaseModel
from polyzymd.analyses.base import Analysis
class ReplicateResult(BaseModel):
mean_rg: float
replicate: int
class AggregatedResult(BaseModel):
mean_rg: float
sem_rg: float
replicate_values: list[float]
class SolventContactsAnalysis(Analysis):
name: ClassVar[str] = "solvent_contacts"
AggregatedResultClass = AggregatedResult
class Settings(BaseModel):
selection: str = "protein and name CA"
def compute_replicate(self, ctx, replicate) -> ReplicateResult:
return ReplicateResult(mean_rg=15.0, replicate=replicate)
def aggregate(self, ctx, results) -> AggregatedResult:
values = [r.mean_rg for r in results]
return AggregatedResult(
mean_rg=sum(values) / len(values),
sem_rg=0.3,
replicate_values=values,
)
If you later need NPZ sidecar storage or .save() / .load() helper methods,
replace BaseModel with BaseAnalysisResult in those result classes.
Important
Gotcha: AggregatedResultClass requires model returns
When you set AggregatedResultClass, your compute_replicate() and
aggregate() should return instances of the corresponding Pydantic result
models. Do not return plain dicts in this mode.
Why: _load_aggregated_result() / _deserialize_result() uses the declared
class for deserialization. If on-disk JSON represents dict-shaped data that does
not match the model contract expected by AggregatedResultClass.load(path),
deserialization fails.
Comparison: dicts vs Pydantic models
Aspect |
Dicts |
Pydantic models |
|---|---|---|
Setup effort |
None — just return |
Define result classes (scaffold uses |
Validation |
None — typos in keys are silent |
Automatic type checking on construction |
Attribute access |
|
|
Large array support |
Manual — save NumPy arrays yourself |
BaseModel: manual; BaseAnalysisResult: NPZ sidecar helpers |
Nested structure |
Manual — dicts of dicts |
Validated sub-models with typed fields |
Serialization |
|
BaseModel JSON by framework; optional |
Framework wiring |
No |
Set |
Best for |
Scalar analyses, quick prototypes, new contributors |
Complex results, per-residue arrays, production plugins |
Rule of thumb: Start with dicts. If you find yourself writing defensive
result.get("key", default) checks, duplicating key strings, or needing to
store large NumPy arrays alongside JSON, migrate to Pydantic models.
A Note on Comparison Infrastructure
Shared comparison infrastructure now lives across the core packages:
polyzymd.config.comparisonfor comparison config and plot settingspolyzymd.cli.compareforpolyzymd comparecommandspolyzymd.analyses.sharedfor inferential statistics and result/path helpers
You do NOT need to create new top-level comparison modules for a plugin. Keep your
plotting logic in your plugin’s plot() method and your formatting in
format(). Start with all logic in __init__.py; as your plugin grows, extract
plotting functions into a _plotters.py module within the package (see
Extracting Plotting to _plotters.py below).
Canonical comparison base classes
If your plugin needs a custom comparison result (Path B), import base classes
from analyses.base — this is the only canonical import path:
from polyzymd.analyses.base import (
ANOVAResult,
BaseComparisonResult,
BaseConditionSummary,
PairwiseResult,
)
Class |
Purpose |
|---|---|
|
Abstract base for custom plugin comparison results (generic over summary/pairwise types) |
|
Abstract base for per-condition summaries (requires |
|
Standard pairwise t-test result with FDR fields |
|
Standard ANOVA result |
These classes provide .save() / .load() helpers, FDR-adjusted p-value
fields, and serialization with NaN/Inf handling. The RMSD, Rg, and SASA
plugins all subclass them for their multi-run comparison results.
Contributor Checklist
When creating a new analysis plugin:
File:
src/polyzymd/analyses/<name>/(usepolyzymd new-analysis <name>to scaffold; add--style pydanticfor typed result models)name: ClassVar[str]— unique, lowercase, used in CLISettingsclass — Pydantic v2 BaseModel with defaultscompute_replicate()— usesTrajectoryLoader, returns dict or Pydantic modelaggregate()— returns dict or Pydantic model; framework auto-saves toctx.result_pathChoose comparison path:
Default: implement
extract_metrics()Custom: override
compare()returning a model with.save()Multi-run custom: use
multi_run_comparison+multi_run_formattingshared helpers
plot()— use_build_plot_data()+ shared plotting helpers (get_theme,apply_axis_style,save_figure,get_colors)(Optional)
format()— CLI display viaformat_scalar_comparison()(Optional) Use
_check_cache()incompute_replicate()for result caching(Optional) Use
_format_replicate_range()for compact log messagesIf your analysis requires polymer (or other optional molecular groups), implement
filter_conditions()to exclude inapplicable conditions — or handle empty selections gracefully incompute_replicate()If using
AggregatedResultClass, return matching result model instancesTests:
tests/analyses/plugins/test_<name>.py(scaffold generates plot tests too)Verify:
pixi run -e build pytest tests/analyses/plugins/test_<name>.py -v
What a Good Community PR Looks Like
A strong contribution usually includes:
one new plugin package in
src/polyzymd/analyses/<name>/one focused test file
one short docs update showing the
plugins:config blockfigures only if they add scientific value
Aim for a plugin that another MDAnalysis user can understand in one read.
Extracting Plotting to _plotters.py
As a plugin grows, its plot() method may accumulate many helper functions.
Established plugins extract these into a _plotters.py module within the
plugin package:
analyses/rmsf/
├── __init__.py # Analysis lifecycle (compute, aggregate, compare, plot)
├── _plotters.py # Plotting functions called by plot()
├── _results.py # Result models
└── ...
The plot() method in __init__.py delegates to functions in _plotters.py:
from ._plotters import plot_comparison_bars, plot_per_residue
def plot(self, ctx: PlotContext) -> list[Path]:
data, labels = self._build_plot_data(ctx)
paths = []
paths.append(plot_comparison_bars(data, labels, ctx.output_dir, ctx.plot_settings))
paths.append(plot_per_residue(data, labels, ctx.output_dir, ctx.plot_settings))
return [p for p in paths if p is not None]
When to extract: Consider extracting when your plugin has 3+ plot
functions or __init__.py exceeds ~500 lines. For simple plugins (like
catalytic_triad/), keeping plotting inline is fine.
All 6 established plugins have _plotters.py: rmsf, secondary_structure,
distances, contacts, catalytic_triad, hydrogen_bonds.
Anti-Patterns to Avoid
Do not bypass the context
# Wrong
def compute_replicate(self, ctx, replicate):
config = SimulationConfig.from_yaml("/my/path/config.yaml")
# Right
def compute_replicate(self, ctx, replicate):
sim_config = ctx.sim_config # already loaded by framework
Understand the result-saving convention
The framework auto-saves return values from compute_replicate() and
aggregate() to ctx.result_path. For dict plugins, returning your result is
usually sufficient.
For caching with explicit target naming, save explicitly:
import json
def aggregate(self, ctx, results):
agg = {"mean": 15.0}
target_path = ctx.result_path
if target_path is None:
target_path = ctx.output_dir / "result.json"
target_path.parent.mkdir(parents=True, exist_ok=True)
target_path.write_text(json.dumps(agg, indent=2), encoding="utf-8")
return agg
Do not import heavy deps at module level
# Wrong — breaks environments without MDAnalysis
import MDAnalysis as mda
# Right — lazy import inside method
def compute_replicate(self, ctx, replicate):
import MDAnalysis as mda
Framework utilities from analyses/shared/ are imported at module level —
they are always available and patchable in tests. Only heavy third-party
dependencies (MDAnalysis, mdtraj, matplotlib) should be imported inside
methods.
Do not create top-level comparison packages
New plugins should keep all logic in the plugin package. Use the canonical
base classes from analyses.base (BaseComparisonResult,
BaseConditionSummary, PairwiseResult, ANOVAResult) for custom comparison
results, and the shared helpers from analyses/shared/ for multi-run
orchestration and formatting.
Existing Plugins to Study
Plugin |
Comparison path |
Good example of |
|---|---|---|
|
Default ( |
Simplest lifecycle — minimal compute logic with |
|
Default ( |
Default scalar comparison with |
|
Default ( |
Wrapping existing calculators with |
|
Custom (multi-run) |
Multi-run comparison with |
|
Custom (multi-run) |
Same multi-run pattern as RMSD, applied to radius of gyration |
|
Custom ( |
Multi-metric comparison, condition filtering, large |
|
Custom ( |
Entry-table style comparison with |
Important
Do not interpret plugin differences as “simple” versus “complex” quality. Several existing plugins reflect earlier architectural phases. Use them as pattern references for specific tasks (result modeling, comparison style, plotting), not as maturity rankings.
Testing Your Plugin
Test file naming
Name your test file tests/analyses/plugins/test_<name>.py (for example,
tests/analyses/plugins/test_solvent_contacts.py).
Testing strategy
A reliable test suite for plugins usually combines:
discovery and class-contract tests (
name,Settings, discovery)pure logic tests (
extract_metrics,aggregate)isolated compute tests (
compute_replicatewith mocked trajectory loader)filesystem-backed integration tests for
compare()andplot()
This layering keeps tests fast while still validating framework integration.
Basic tests (discovery, settings, metrics)
These tests use no trajectory data and verify the plugin integrates with the framework:
from polyzymd.analyses import get_analysis, list_analyses
from polyzymd.analyses.base import MetricValue
class TestSolventContactsDiscovery:
"""Test that the plugin is discoverable."""
def test_discovered(self):
assert "solvent_contacts" in list_analyses()
def test_class_variables(self):
cls = get_analysis("solvent_contacts")
assert cls.name == "solvent_contacts"
assert hasattr(cls, "Settings")
def test_settings_defaults(self):
cls = get_analysis("solvent_contacts")
settings = cls.Settings()
assert settings.selection == "protein and name CA"
class TestSolventContactsMetrics:
"""Test metric extraction for default comparison."""
def test_extract_metrics(self):
cls = get_analysis("solvent_contacts")
analysis = cls()
summary = {
"mean_contact_fraction": 0.142,
"sem_contact_fraction": 0.010,
"replicate_values": [0.136, 0.148],
}
metrics = analysis.extract_metrics(summary)
assert "mean_contact_fraction" in metrics
assert isinstance(metrics["mean_contact_fraction"], MetricValue)
assert metrics["mean_contact_fraction"].higher_is_better is False
Testing compute_replicate() with mocked trajectories
compute_replicate() requires MDAnalysis and trajectory files. Mock
TrajectoryLoader at the module level to test compute logic without real
trajectory data:
from pathlib import Path
from unittest.mock import MagicMock, patch
from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import Condition, ReplicateContext
class TestSolventContactsCompute:
"""Test compute_replicate with mocked trajectories."""
@patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader")
def test_computes_metric(self, mock_loader_cls, tmp_path):
cls = get_analysis("solvent_contacts")
analysis = cls()
settings = cls.Settings()
mock_loader = MagicMock()
mock_loader_cls.return_value = mock_loader
mock_universe = MagicMock()
mock_target = MagicMock()
mock_target.__len__.return_value = 10
mock_target.positions = [[0.0, 0.0, 0.0]] * 10
mock_solvent = MagicMock()
mock_solvent.__len__.return_value = 20
mock_solvent.positions = [[1.0, 1.0, 1.0]] * 20
mock_universe.select_atoms.side_effect = [mock_target, mock_solvent]
mock_trajectory = MagicMock()
mock_trajectory.__getitem__.return_value = range(50)
mock_universe.trajectory = mock_trajectory
mock_loader.load_universe.return_value = mock_universe
mock_loader.get_timestep.return_value = 10.0
condition = Condition(
label="Test",
config_path=Path("/fake/config.yaml"),
replicates=(1,),
sim_config=MagicMock(),
)
ctx = ReplicateContext(
condition=condition,
replicate=1,
sim_config=condition.sim_config,
output_dir=tmp_path / "run_1",
equilibration="0ns",
recompute=True,
settings=settings,
)
result = analysis.compute_replicate(ctx, replicate=1)
assert "mean_contact_fraction" in result
assert result["mean_contact_fraction"] >= 0
assert result["replicate"] == 1
Note the patch target:
@patch("polyzymd.analyses.solvent_contacts.TrajectoryLoader"). This works
because TrajectoryLoader is imported at module level in
solvent_contacts/__init__.py.
Testing aggregate()
Aggregation takes plain data and needs no trajectory mocks:
from pathlib import Path
from unittest.mock import MagicMock
import numpy as np
from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import AggregateContext, Condition
class TestSolventContactsAggregate:
"""Test aggregation across replicates."""
def test_aggregate(self, tmp_path):
cls = get_analysis("solvent_contacts")
analysis = cls()
settings = cls.Settings()
condition = Condition(
label="Test",
config_path=Path("/fake/config.yaml"),
replicates=(1, 2, 3),
sim_config=MagicMock(),
)
ctx = AggregateContext(
condition=condition,
replicates=(1, 2, 3),
output_dir=tmp_path / "aggregated",
equilibration="10ns",
settings=settings,
)
results = [
{"mean_contact_fraction": 0.12, "replicate": 1},
{"mean_contact_fraction": 0.10, "replicate": 2},
{"mean_contact_fraction": 0.11, "replicate": 3},
]
agg = analysis.aggregate(ctx, results)
expected_mean = np.mean([0.12, 0.10, 0.11])
assert abs(agg["mean_contact_fraction"] - expected_mean) < 1e-10
assert agg["replicate_values"] == [0.12, 0.10, 0.11]
Testing compare() with filesystem-backed aggregated results
compare() reads condition-specific result files from analysis directories, so
tests should construct realistic filesystem layout with result.json files and
a ComparisonContext:
import json
from pathlib import Path
from unittest.mock import MagicMock
from polyzymd.analyses import get_analysis
from polyzymd.analyses.base import ComparisonContext, Condition
class TestCompare:
def test_compare_two_conditions(self, tmp_path):
cls = get_analysis("solvent_contacts")
analysis = cls()
# Set up filesystem with aggregated results
for label, value in [("WT", 0.12), ("Mutant", 0.18)]:
agg_dir = tmp_path / label / "solvent_contacts" / "aggregated"
agg_dir.mkdir(parents=True)
(agg_dir / "result.json").write_text(
json.dumps(
{
"mean_contact_fraction": value,
"sem_contact_fraction": 0.01,
"replicate_values": [value - 0.01, value + 0.01],
}
),
encoding="utf-8",
)
conditions = [
Condition(
label="WT",
config_path=Path("/fake"),
replicates=(1, 2),
sim_config=MagicMock(),
),
Condition(
label="Mutant",
config_path=Path("/fake"),
replicates=(1, 2),
sim_config=MagicMock(),
),
]
ctx = ComparisonContext(
name="test",
conditions=conditions,
excluded_conditions=[],
control_label="WT",
analysis_dirs={
"WT": tmp_path / "WT" / "solvent_contacts",
"Mutant": tmp_path / "Mutant" / "solvent_contacts",
},
results_dir=tmp_path / "results",
equilibration="10ns",
settings=cls.Settings(),
recompute=True,
)
result = analysis.compare(ctx)
assert result is not None
Tip
This style of test is robust because it validates the same on-disk contract
used by real compare runs (<condition>/<analysis>/aggregated/result.json).
Matplotlib backend in CI/headless tests
When testing plot code in CI or headless environments, force a non-interactive
backend before importing matplotlib.pyplot:
import matplotlib
matplotlib.use("Agg") # headless backend for CI / testing
import matplotlib.pyplot as plt
This avoids backend/display errors and keeps plotting tests deterministic.
Run tests with:
pixi run -e build pytest tests/analyses/plugins/test_solvent_contacts.py -v
See Also
Architecture — Overall system architecture
Statistics Best Practices for MD Analysis — Autocorrelation and uncertainty
How to Compare Simulation Conditions — User guide for running comparisons
API reference:
polyzymd.analyses.base.Analysis
Appendix: The comparison.yaml File
The comparison.yaml file is the single configuration file that drives the
entire comparison workflow. It defines which simulation conditions to compare,
which analyses to run, and how to generate plots. Understanding this file is
important for plugin authors because your Settings class maps directly to
the plugins: section.
Generating a template
The fastest way to create a comparison.yaml is with the scaffold command:
polyzymd compare init -n my_study
This creates a project directory with a fully commented template. You can also specify a custom equilibration time:
polyzymd compare init -n my_study --eq-time 20ns
File structure
A comparison.yaml has four main sections:
# Project metadata
name: "my_study"
description: "Comparison of polymer conditions"
control: "Condition A" # or null if no control
# Conditions — each points to a simulation's config.yaml
conditions:
- label: "Condition A"
config: "../path/to/condition_a/config.yaml"
replicates: [1, 2, 3]
- label: "Condition B"
config: "../path/to/condition_b/config.yaml"
replicates: [1, 2, 3]
# Defaults
defaults:
equilibration_time: "10ns"
# Plugins — which analyses to run and their settings
plugins:
rmsf:
selection: "protein and name CA"
reference_mode: "centroid"
# Your plugin's Settings fields go here:
my_analysis:
example_parameter: "value"
# Plot settings — visual customization
plot_settings:
output_dir: "figures/"
format: "png"
dpi: 300
style: "publication"
How plugins: connects to your Settings class
When the framework runs your plugin, it reads the plugins: section of
comparison.yaml and deserializes the matching key into your Settings
model. For example, if your plugin has:
class MySettings(BaseModel):
cutoff: float = 4.5
selection: str = "protein and name CA"
Then users configure it in comparison.yaml as:
plugins:
my_analysis:
cutoff: 5.0
selection: "chainID A"
Any fields not specified in the YAML use the defaults from your Settings
class. This is why providing sensible defaults for every field is important —
it means the analysis works without any explicit configuration.
Running comparisons
After editing comparison.yaml:
# Validate the configuration
polyzymd compare validate
# Run a specific analysis
polyzymd compare run rmsf
# Run all enabled analyses
polyzymd compare run-all
# Generate plots
polyzymd compare plot-all
See How to Compare Simulation Conditions for a complete user guide on running comparisons.