How to Compare Simulation Conditions

Use this guide when you already have completed PolyzyMD simulations and want to run the current polyzymd compare workflow.

You will:

create a comparison workspace
configure one or more analysis plugins under plugins:
run polyzymd compare run or polyzymd compare run-all
generate figures with polyzymd compare plot-all

Important

For the v1.3.0 release, the stable comparison stack is RMSD, Rg, RMSF, contacts, distances, catalytic triad, secondary structure, SASA, and hydrogen bonds.

Note

If you have not yet run a full analysis/comparison workflow, start with Tutorial: Analyze a Study from Finished Simulations.

Environment Setup

All commands below assume you have activated the PolyzyMD pixi environment:

pixi shell -e build

Alternatively, prefix each command with pixi run -e build.

Resource requirements

Validation, status, and help commands are lightweight. In contrast, polyzymd compare run, polyzymd compare run-all, and plotting from large cached results may load trajectories and can require substantial RAM, CPU/GPU time, and scratch I/O. On shared HPC systems, run these commands inside an allocated job or interactive compute session, not on a login node. If a command is killed or runs out of memory, request more resources or use polyzymd compare submit.

Before You Start

Make sure each condition already has:

a simulation config.yaml
finished trajectories for the replicates you want to compare
any shared inputs needed by the plugin you plan to run

The comparison pipeline can reuse cached analysis data when it exists, but it can also compute missing per-condition results during polyzymd compare run.

Step 1: Create a Comparison Workspace

polyzymd compare init -n polymer_stability_study
cd polymer_stability_study

This creates:

polymer_stability_study/
├── comparison.yaml
├── comparison/
├── figures/
└── structures/

comparison.yaml defines the conditions and enabled plugins
comparison/ stores cached comparison JSON, one subdirectory per analysis
figures/ stores generated plots
structures/ holds shared reference files such as an enzyme PDB for SASA
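
As a quick check after init, the layout above can be verified from Python. This is a minimal sketch using only the standard library; the helper name missing_workspace_entries is hypothetical and not part of PolyzyMD, and the expected entries are simply copied from the listing above.

```python
from pathlib import Path

# Entries that the listing above shows in a fresh comparison workspace.
EXPECTED_ENTRIES = ("comparison.yaml", "comparison", "figures", "structures")


def missing_workspace_entries(workspace: Path) -> list[str]:
    """Return the expected entries that are absent from the workspace.

    Hypothetical helper for illustration; it only checks that each
    file or directory exists, not what it contains.
    """
    return [name for name in EXPECTED_ENTRIES if not (workspace / name).exists()]
```

Calling missing_workspace_entries(Path("polymer_stability_study")) should return an empty list for a freshly initialized workspace, assuming the layout matches the listing.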

Step 2: Define a Minimal `comparison.yaml`

Start with one stable analysis. RMSF is a good first comparison because it has few extra inputs.

name: "polymer_stability_study"
description: "Effect of polymer composition on enzyme flexibility"
control: "No Polymer"

conditions:
  - label: "No Polymer"
    config: "../noPoly_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

  - label: "100% SBMA"
    config: "../SBMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

  - label: "100% EGMA"
    config: "../EGMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

defaults:
  equilibration_time: "10ns"

plugins:
  rmsf:
    selection: "protein and name CA"
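
Some structural mistakes in a config like the one above are easy to catch by hand before running anything. The sketch below works on an already-parsed config (a plain dict standing in for the YAML) and flags a few likely problems. The function check_study and the rules it applies are assumptions made for illustration, not PolyzyMD's own validation logic; polyzymd compare validate remains the authoritative check.

```python
def check_study(study: dict) -> list[str]:
    """Return human-readable problems found in a parsed comparison config.

    Hypothetical pre-check: the rules here are guesses at common
    mistakes, not the checks that PolyzyMD itself performs.
    """
    problems = []
    conditions = study.get("conditions") or []
    labels = [c.get("label") for c in conditions]

    if not conditions:
        problems.append("no conditions defined")

    # The same label appearing twice would make results ambiguous.
    duplicates = sorted(
        {label for label in labels if label is not None and labels.count(label) > 1}
    )
    if duplicates:
        problems.append(f"duplicate condition labels: {duplicates}")

    # Assumption: the control must name one of the condition labels.
    control = study.get("control")
    if control is not None and control not in labels:
        problems.append(f"control {control!r} does not match any condition label")

    for c in conditions:
        if not c.get("replicates"):
            problems.append(f"condition {c.get('label')!r} lists no replicates")

    if not study.get("plugins"):
        problems.append("no sections under plugins:")
    return problems


# A dict mirroring part of the minimal config above.
study = {
    "name": "polymer_stability_study",
    "control": "No Polymer",
    "conditions": [
        {"label": "No Polymer", "config": "../noPoly_enzyme_DMSO/config.yaml", "replicates": [1, 2, 3]},
        {"label": "100% SBMA", "config": "../SBMA_100_enzyme_DMSO/config.yaml", "replicates": [1, 2, 3]},
    ],
    "plugins": {"rmsf": {"selection": "protein and name CA"}},
}
print(check_study(study))  # prints [] because none of the rules is violated
```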

To enable more analyses, add more sections under plugins:

plugins:
  rmsf:
    selection: "protein and name CA"

  contacts:
    polymer_selection: "chainid C"
    protein_selection: "chainid A"
    cutoff: 4.5

  catalytic_triad:
    name: "Ser-His-Asp"
    threshold: 3.5
    pairs:
      - label: "Ser77-His156"
        selection_a: "protein and resid 77 and name OG"
        selection_b: "protein and resid 156 and name NE2"

  distances:
    pairs:
      - label: "Substrate-Ser77"
        selection_a: "resname SUB and name C1"
        selection_b: "protein and resid 77 and name OG"

  rmsd:
    runs:
      - label: "Protein Backbone"
        selection: "protein and name CA"
        alignment_selection: "protein and name CA"
        reference_mode: "centroid"
      - label: "Active Site"
        selection: "protein and (resid 77 or resid 133 or resid 156) and name CA"
        alignment_selection: "protein and name CA"
        reference_mode: "centroid"

  rg:
    runs:
      - label: "Whole Protein"
        selection: "protein"
      - label: "Protein Backbone"
        selection: "protein and name CA"

Statistical settings for pairwise comparisons

Plugins that perform cross-condition statistical tests support per-plugin settings in the plugins: block. For example, contacts supports fdr_alpha, min_effect_size, and top_residues. See the Comparison Reference for the full settings table. For post-hoc method details (BH t-tests, Tukey HSD, Cohen’s d, and significance markers), see the Post-Hoc Testing Reference.
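
To make two of the quantities named above concrete, here is a small standard-library sketch of Cohen's d (with a pooled standard deviation) and the Benjamini-Hochberg step-up procedure. It illustrates the textbook definitions only; it is not PolyzyMD's implementation, and how settings such as fdr_alpha or min_effect_size are applied internally is covered by the references above. The input numbers are made up for the example.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled sample std dev."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one flag per p-value: True if rejected at FDR level alpha.

    Step-up rule: find the largest rank k (1-based, ascending p-values)
    with p_(k) <= k / m * alpha, then reject every hypothesis up to rank k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Illustrative numbers only (not simulation output).
print(cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))            # 0.5
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20], 0.05))    # [True, False, False, False]
```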

Step 3: Validate the Config

polyzymd compare validate

You should see a passing summary with the study name, condition count, and the enabled plugin sections.

Step 4: Run One Comparison

polyzymd compare run rmsf

This command:

resolves plugins.rmsf from comparison.yaml
computes or reloads per-condition RMSF data
performs the cross-condition comparison
writes the canonical cache file to comparison/rmsf/result.json
prints a formatted summary to the terminal

Running on an HPC cluster?

For expensive analyses (SASA, contacts, hydrogen bonds) or large studies with many conditions and replicates, use polyzymd compare submit to dispatch analysis as SLURM jobs instead of running interactively:

polyzymd compare submit sasa --partition <part> --mem 8G --time 02:00:00
polyzymd compare status sasa       # monitor progress
polyzymd compare finalize sasa     # (if needed) re-run compare + plot

Each replicate runs as an independent job, with automatic dependency wiring for aggregation and finalization. See How To: Submit Analysis Jobs to a SLURM Cluster for the full workflow, including dry-run previews and job arrays.
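
If you submit several analyses, it can be convenient to generate the three command lines per analysis rather than typing them. The sketch below only builds strings from the subcommands and flags shown above; the helper name slurm_commands and the partition name "gpu" are placeholders, and the resource values should be chosen for your own cluster.

```python
import shlex


def slurm_commands(analysis: str, partition: str, mem: str, time: str) -> list[str]:
    """Build the submit, status, and finalize command lines for one analysis.

    Hypothetical helper: it only formats strings and does not run anything.
    """
    base = ["polyzymd", "compare"]
    submit = base + ["submit", analysis, "--partition", partition, "--mem", mem, "--time", time]
    status = base + ["status", analysis]
    finalize = base + ["finalize", analysis]
    return [shlex.join(argv) for argv in (submit, status, finalize)]


# "gpu" is a placeholder partition name.
for line in slurm_commands("sasa", partition="gpu", mem="8G", time="02:00:00"):
    print(line)
```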

You can also write the formatted report to a file with -o, choosing the output format with --format:

polyzymd compare run rmsf --format markdown -o reports/rmsf.md

Step 5: Run All Enabled Comparisons

Once you have multiple plugin sections configured, run them together:

polyzymd compare run-all

Or run them and generate plots in one pass:

polyzymd compare run-all --plot

Step 6: Generate Figures

For a plotting smoke test:

polyzymd compare plot-all --list-available
polyzymd compare plot-all

--list-available shows which plot types are available for the currently enabled plugins and which of them are experimental.
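
The validate, run-all, and plot-all commands can also be chained from a small driver script that stops at the first failure. The following is a minimal sketch that assumes polyzymd is on PATH and that the script runs from the workspace directory; the function run_steps and its runner parameter are hypothetical conveniences (the latter makes the logic testable without running anything).

```python
import subprocess
import sys

# Commands taken from the steps above, in the order they are run.
STEPS = [
    ["polyzymd", "compare", "validate"],
    ["polyzymd", "compare", "run-all"],
    ["polyzymd", "compare", "plot-all"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    """
    completed = []
    for argv in steps:
        result = runner(argv)
        if result.returncode != 0:
            print(
                f"stopped: {' '.join(argv)} exited with {result.returncode}",
                file=sys.stderr,
            )
            break
        completed.append(argv)
    return completed


# Typical call, from inside the comparison workspace:
# run_steps(STEPS)
```

Stopping early keeps plot-all from rendering figures from stale cache files when a comparison step has failed.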

Step 7: Check the Outputs

After a successful run, expect files like these:

polymer_stability_study/
├── comparison.yaml
├── analysis/
│   ├── no_polymer/
│   │   └── rmsf/
│   │       ├── run_1/
│   │       │   └── result.json
│   │       ├── run_2/
│   │       │   └── result.json
│   │       └── aggregated/
│   │           └── result.json
│   └── 100_sbma/
│       └── rmsf/
│           └── ...
├── comparison/
│   ├── rmsf/
│   │   └── result.json
│   ├── contacts/
│   │   └── result.json
│   ├── distances/
│   │   └── result.json
│   └── catalytic_triad/
│       └── result.json
└── figures/
    ├── rmsf/
    │   ├── rmsf_comparison.png
    │   └── rmsf_profile.png
    └── ...

If your smoke test is polyzymd compare plot-all, success means:

the command completes without error
stable plots render normally
experimental plots, if enabled, render with explicit experimental labeling
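
To confirm the outputs without browsing the directories by hand, a short script can report which analyses have a cached comparison result and which figures were written. This sketch assumes the layout shown in the tree above (comparison/<analysis>/result.json and PNG files under figures/<analysis>/); the function summarize_outputs is a hypothetical helper, not a PolyzyMD API.

```python
from pathlib import Path


def summarize_outputs(workspace: Path, analyses: list[str]) -> dict[str, dict]:
    """Report, per analysis, whether result.json exists and which PNGs exist.

    Assumes the directory layout from the example tree above.
    """
    summary = {}
    for name in analyses:
        result = workspace / "comparison" / name / "result.json"
        fig_dir = workspace / "figures" / name
        figures = sorted(p.name for p in fig_dir.glob("*.png")) if fig_dir.is_dir() else []
        summary[name] = {"has_result": result.is_file(), "figures": figures}
    return summary
```

An analysis with has_result set to False has not been compared yet, which is also the usual reason its figures are missing.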

Programmatic Use

If you need to run the comparison pipeline from Python, use the plugin orchestrator directly:

from pathlib import Path

from polyzymd.analyses.discovery import get_analysis
from polyzymd.analyses.orchestrator import run_comparison
from polyzymd.config.comparison import ComparisonConfig

config = ComparisonConfig.from_yaml(Path("comparison.yaml"))
analysis = get_analysis("rmsf")()

pipeline_result = run_comparison(
    analysis,
    config,
    equilibration="10ns",
)

result = pipeline_result["comparison"]
print(result.ranking)
print(pipeline_result["comparison_path"])

Adding More Stable Analyses

Common next additions to comparison.yaml are:

rmsd for RMSD timeseries and structural stability comparison
rg for Radius of Gyration and structural compactness comparison
contacts for polymer coverage and contact fraction
distances for custom atom-pair distances
catalytic_triad for active-site geometry
secondary_structure for helix/strand persistence and content
hydrogen_bonds for hydrogen-bond occupancy and lifetime summaries

For end-to-end examples, see:

Archived experimental analyses are not active v1.3 plugins. See Experimental analyses for historical access details.

Troubleshooting

`config` path not found

Paths in comparison.yaml are resolved relative to the location of comparison.yaml, not your current shell directory.
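
The resolution rule can be reproduced with pathlib to see where each config entry actually points. This is a minimal sketch of resolving relative to the directory containing comparison.yaml, as described above; the function name resolve_config_paths is hypothetical.

```python
from pathlib import Path


def resolve_config_paths(comparison_yaml: Path, config_entries: list[str]) -> dict[str, Path]:
    """Resolve each config entry against the directory holding comparison.yaml.

    The current working directory plays no role, which mirrors the rule
    stated above. Absolute entries are returned unchanged (resolved).
    """
    base = comparison_yaml.resolve().parent
    return {entry: (base / entry).resolve() for entry in config_entries}


def missing_configs(comparison_yaml: Path, config_entries: list[str]) -> list[str]:
    """Return the entries whose resolved path is not an existing file."""
    resolved = resolve_config_paths(comparison_yaml, config_entries)
    return [entry for entry, path in resolved.items() if not path.is_file()]
```

For example, an entry of ../noPoly_enzyme_DMSO/config.yaml in polymer_stability_study/comparison.yaml resolves to a sibling directory of polymer_stability_study, regardless of where the command is launched from.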

`No analyses are enabled`

You need at least one configured section under plugins:.

`plot-all` runs but expected figures are missing

Check that the corresponding comparison JSON file already exists at comparison/<analysis>/result.json, then run polyzymd compare plot-all --list-available to verify which plot types are enabled.