How to Compare Simulation Conditions
Use this guide when you already have completed PolyzyMD simulations and want to
run the current polyzymd compare workflow.
You will:
create a comparison workspace
configure one or more analysis plugins under plugins:
run polyzymd compare run or polyzymd compare run-all
generate figures with polyzymd compare plot-all
Important
For the v1.3.0 release, the stable comparison stack is RMSD, Rg, RMSF,
contacts, distances, catalytic triad, secondary structure, SASA, and hydrogen
bonds.
Note
If you have not yet run a full analysis/comparison workflow, start with Tutorial: Analyze a Study from Finished Simulations.
Environment Setup
All commands below assume you have activated the PolyzyMD pixi environment:
pixi shell -e build
Alternatively, prefix each command with pixi run -e build.
Resource requirements
Validation, status, and help commands are lightweight. polyzymd compare run,
run-all, and plotting over large cached results may load trajectories and can
require substantial RAM, CPU/GPU time, and scratch I/O. On shared HPC systems,
run these commands inside an allocated job or interactive compute session, not
on a login node. If a command is killed or runs out of memory, request more
resources or use polyzymd compare submit.
Before You Start
Make sure each condition already has:
a simulation config.yaml
finished trajectories for the replicates you want to compare
any shared inputs needed by the plugin you plan to run
The comparison pipeline can reuse cached analysis data when it exists, but it
can also compute missing per-condition results during polyzymd compare run.
Step 1: Create a Comparison Workspace
polyzymd compare init -n polymer_stability_study
cd polymer_stability_study
This creates:
polymer_stability_study/
├── comparison.yaml
├── comparison/
├── figures/
└── structures/
comparison.yaml defines the conditions and enabled plugins
comparison/ stores cached comparison JSON, one subdirectory per analysis
figures/ stores generated plots
structures/ holds shared reference files such as an enzyme PDB for SASA
Step 2: Define a Minimal comparison.yaml
Start with one stable analysis. RMSF is a good first comparison because it has few extra inputs.
name: "polymer_stability_study"
description: "Effect of polymer composition on enzyme flexibility"
control: "No Polymer"
conditions:
  - label: "No Polymer"
    config: "../noPoly_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]
  - label: "100% SBMA"
    config: "../SBMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]
  - label: "100% EGMA"
    config: "../EGMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]
defaults:
  equilibration_time: "10ns"
plugins:
  rmsf:
    selection: "protein and name CA"
To enable more analyses, add more sections under plugins:
plugins:
  rmsf:
    selection: "protein and name CA"
  contacts:
    polymer_selection: "chainid C"
    protein_selection: "chainid A"
    cutoff: 4.5
  catalytic_triad:
    name: "Ser-His-Asp"
    threshold: 3.5
    pairs:
      - label: "Ser77-His156"
        selection_a: "protein and resid 77 and name OG"
        selection_b: "protein and resid 156 and name NE2"
  distances:
    pairs:
      - label: "Substrate-Ser77"
        selection_a: "resname SUB and name C1"
        selection_b: "protein and resid 77 and name OG"
  rmsd:
    runs:
      - label: "Protein Backbone"
        selection: "protein and name CA"
        alignment_selection: "protein and name CA"
        reference_mode: "centroid"
      - label: "Active Site"
        selection: "protein and (resid 77 or resid 133 or resid 156) and name CA"
        alignment_selection: "protein and name CA"
        reference_mode: "centroid"
  rg:
    runs:
      - label: "Whole Protein"
        selection: "protein"
      - label: "Protein Backbone"
        selection: "protein and name CA"
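Selection strings such as the "Active Site" entry above are easy to mistype when a residue list grows. The following is a minimal sketch of a helper that assembles one from residue numbers. The function name residue_selection is hypothetical and is not part of PolyzyMD; it only produces a string in the same form as the example config.

```python
def residue_selection(resids, atom_name="CA"):
    """Build a selection string for specific residues (hypothetical helper)."""
    if not resids:
        raise ValueError("at least one residue number is required")
    # Join the residues with "or" so the parentheses group them as one term
    residues = " or ".join(f"resid {r}" for r in resids)
    return f"protein and ({residues}) and name {atom_name}"


# Residue numbers taken from the "Active Site" run in the example config
print(residue_selection([77, 133, 156]))
```

The printed string can be pasted into a selection: field under rmsd.runs or any other plugin that accepts a selection.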
Statistical settings for pairwise comparisons
Plugins that perform cross-condition statistical tests support per-plugin
settings in the plugins: block. For example, contacts supports fdr_alpha,
min_effect_size, and top_residues. See the Comparison Reference for the full
settings table. For post-hoc method details (BH t-tests, Tukey HSD, Cohen’s d,
and significance markers), see the Post-Hoc Testing Reference.
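To make the terms above concrete, here is a textbook sketch of Cohen's d (with a pooled standard deviation) and the Benjamini-Hochberg adjustment, written with the Python standard library only. It is not PolyzyMD's implementation, and the sample values, p-values, and the 0.05 threshold are made up for illustration; how fdr_alpha and min_effect_size are actually applied is described in the references named above.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative numbers only, not simulation output
effect = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
alpha = 0.05  # plays the role of an FDR threshold in this sketch
print(effect, [q <= alpha for q in q_values])
```

Adjusting p-values before thresholding matters when many residues or condition pairs are tested at once, because unadjusted tests would flag some differences by chance alone.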
Step 3: Validate the Config
polyzymd compare validate
You should see a passing summary with the study name, condition count, and the enabled plugin sections.
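If you generate comparison.yaml from a script, a few consistency checks on the data before writing it can catch mistakes early. The sketch below works on a plain dict that mirrors the YAML from Step 2. The function summarize_study and the specific checks are assumptions for illustration; they do not reproduce what polyzymd compare validate checks, and the rule that control must match a condition label is assumed here rather than confirmed.

```python
def summarize_study(study):
    """Summarize a study dict and list obvious problems (hypothetical helper)."""
    labels = [c["label"] for c in study.get("conditions", [])]
    problems = []
    if len(labels) != len(set(labels)):
        problems.append("duplicate condition labels")
    control = study.get("control")
    # Assumption: the control should name one of the condition labels
    if control is not None and control not in labels:
        problems.append(f"control {control!r} is not a condition label")
    plugins = sorted(study.get("plugins") or {})
    if not plugins:
        problems.append("no plugin sections configured")
    return {
        "name": study.get("name"),
        "n_conditions": len(labels),
        "plugins": plugins,
        "problems": problems,
    }


study = {
    "name": "polymer_stability_study",
    "control": "No Polymer",
    "conditions": [
        {"label": "No Polymer", "replicates": [1, 2, 3]},
        {"label": "100% SBMA", "replicates": [1, 2, 3]},
    ],
    "plugins": {"rmsf": {"selection": "protein and name CA"}},
}
print(summarize_study(study))
```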
Step 4: Run One Comparison
polyzymd compare run rmsf
This command:
resolves plugins.rmsf from comparison.yaml
computes or reloads per-condition RMSF data
performs the cross-condition comparison
writes the canonical cache file to comparison/rmsf/result.json
prints a formatted summary to the terminal
Running on an HPC cluster?
For expensive analyses (SASA, contacts, hydrogen bonds) or large studies with
many conditions and replicates, use polyzymd compare submit to dispatch
analysis as SLURM jobs instead of running interactively:
polyzymd compare submit sasa --partition <part> --mem 8G --time 02:00:00
polyzymd compare status sasa # monitor progress
polyzymd compare finalize sasa # (if needed) re-run compare + plot
Each replicate runs as an independent job, with automatic dependency wiring for aggregation and finalization. See How To: Submit Analysis Jobs to a SLURM Cluster for the full workflow, including dry-run previews and job arrays.
You can save the formatted report separately with -o:
polyzymd compare run rmsf --format markdown -o reports/rmsf.md
Step 5: Run All Enabled Comparisons
Once you have multiple plugin sections configured, run them together:
polyzymd compare run-all
Or run them and generate plots in one pass:
polyzymd compare run-all --plot
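If you drive these commands from a Python script, for example inside a batch job, a thin subprocess wrapper is usually enough. The sketch below uses only the commands shown in this guide; the helper names compare_command and run_steps are hypothetical, and the default is a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def compare_command(*args, pixi_env=None):
    """Build the argument list for one polyzymd compare invocation."""
    cmd = ["polyzymd", "compare", *args]
    if pixi_env is not None:
        # Equivalent to prefixing the command with: pixi run -e <env>
        cmd = ["pixi", "run", "-e", pixi_env, *cmd]
    return cmd


def run_steps(steps, dry_run=True):
    """Print each command; execute it unless dry_run is set."""
    for cmd in steps:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command
            subprocess.run(cmd, check=True)


steps = [
    compare_command("validate", pixi_env="build"),
    compare_command("run-all", "--plot", pixi_env="build"),
]
run_steps(steps)  # dry run; pass dry_run=False to execute
```

Validating first means a configuration error surfaces before any expensive trajectory analysis starts.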
Step 6: Generate Figures
For a plotting smoke test:
polyzymd compare plot-all --list-available
polyzymd compare plot-all
--list-available is useful because it shows which plot types are available
for the currently enabled plugins and which are experimental.
Step 7: Check the Outputs
After a successful run, expect files like these:
polymer_stability_study/
├── comparison.yaml
├── analysis/
│   ├── no_polymer/
│   │   └── rmsf/
│   │       ├── run_1/
│   │       │   └── result.json
│   │       ├── run_2/
│   │       │   └── result.json
│   │       └── aggregated/
│   │           └── result.json
│   └── 100_sbma/
│       └── rmsf/
│           └── ...
├── comparison/
│   ├── rmsf/
│   │   └── result.json
│   ├── contacts/
│   │   └── result.json
│   ├── distances/
│   │   └── result.json
│   └── catalytic_triad/
│       └── result.json
└── figures/
    ├── rmsf/
    │   ├── rmsf_comparison.png
    │   └── rmsf_profile.png
    └── ...
If your smoke test is polyzymd compare plot-all, success means:
the command completes without error
stable plots render normally
experimental plots, if enabled, render with explicit experimental labeling
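To confirm that every replicate produced a per-condition result, you can walk the analysis/ tree. This is a minimal sketch that assumes the directory layout shown in the example tree (analysis/<condition>/<analysis>/run_N/result.json); the function name replicate_results is hypothetical, and the layout may differ in your version.

```python
from collections import defaultdict
from pathlib import Path


def replicate_results(workspace, analysis):
    """Map each condition directory to the run_* folders holding a result.json."""
    found = defaultdict(list)
    pattern = f"analysis/*/{analysis}/run_*/result.json"
    for path in sorted(Path(workspace).glob(pattern)):
        # .../analysis/<condition>/<analysis>/run_N/result.json
        condition = path.parents[2].name
        found[condition].append(path.parent.name)
    return dict(found)


# Run from inside the comparison workspace
for condition, runs in replicate_results(".", "rmsf").items():
    print(f"{condition}: {', '.join(runs)}")
```

A condition that lists fewer runs than its replicates: entry is the first place to look when an aggregated result seems off.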
Programmatic Use
If you need to run the comparison pipeline from Python, use the plugin orchestrator directly:
from pathlib import Path

from polyzymd.analyses.discovery import get_analysis
from polyzymd.analyses.orchestrator import run_comparison
from polyzymd.config.comparison import ComparisonConfig

# Load the study definition and instantiate the RMSF plugin
config = ComparisonConfig.from_yaml(Path("comparison.yaml"))
analysis = get_analysis("rmsf")()

# Run the per-condition analysis and the cross-condition comparison
pipeline_result = run_comparison(
    analysis,
    config,
    equilibration="10ns",
)

result = pipeline_result["comparison"]
print(result.ranking)
print(pipeline_result["comparison_path"])
Adding More Stable Analyses
Common next additions to comparison.yaml are:
rmsd for RMSD timeseries and structural stability comparison
rg for radius of gyration and structural compactness comparison
contacts for polymer coverage and contact fraction
distances for custom atom-pair distances
catalytic_triad for active-site geometry
secondary_structure for helix/strand persistence and content
hydrogen_bonds for hydrogen-bond occupancy and lifetime summaries
For end-to-end examples, see:
Archived experimental analyses are not active v1.3 plugins. See Experimental analyses for historical access details.
Troubleshooting
config path not found
Paths in comparison.yaml are resolved relative to the location of
comparison.yaml, not your current shell directory.
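The resolution rule can be reproduced with pathlib to see where a given config: entry points. This is a sketch of the rule as described above, not PolyzyMD's own code, and the function name resolve_config is hypothetical.

```python
from pathlib import Path


def resolve_config(comparison_yaml, config_entry):
    """Resolve a config: entry against the directory holding comparison.yaml."""
    base = Path(comparison_yaml).resolve().parent
    # The shell's working directory plays no role once base is absolute
    return (base / config_entry).resolve()


target = resolve_config("comparison.yaml", "../noPoly_enzyme_DMSO/config.yaml")
print(target, "exists" if target.is_file() else "NOT FOUND")
```

Running this from inside the workspace for each condition shows quickly which entry is wrong.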
No analyses are enabled
You need at least one configured section under plugins:.
plot-all runs but expected figures are missing
Check that the corresponding comparison JSON files already exist under
comparison/<analysis>/result.json and use
polyzymd compare plot-all --list-available to verify the enabled plot types.
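A short script can report which analyses have no cached comparison result yet. It assumes the comparison/<analysis>/result.json layout described above; the function name missing_results is hypothetical, and the analysis names in the example are the plugin sections used earlier in this guide.

```python
from pathlib import Path


def missing_results(workspace, analyses):
    """Return the analyses that lack comparison/<analysis>/result.json."""
    root = Path(workspace) / "comparison"
    return [name for name in analyses if not (root / name / "result.json").is_file()]


# Run from inside the comparison workspace
enabled = ["rmsf", "contacts", "distances", "catalytic_triad"]
for name in missing_results(".", enabled):
    print(f"no cached result for {name}; run: polyzymd compare run {name}")
```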