comparison.yaml Schema Reference
The comparison.yaml file defines a cross-condition analysis project. It
specifies which simulation conditions to compare, which analysis plugins to
run, and how to visualize results. Create one with polyzymd compare init -n <name> and place it at the root of your comparison project directory.
Source of truth: polyzymd.config.comparison.ComparisonConfig() in
src/polyzymd/config/comparison.py.
Important
Plugin settings path fields are resolved relative to the directory containing
comparison.yaml.
For example, in plugins.rmsf.reference_file, plugins.contacts.enzyme_pdb_for_sasa,
plugins.binding_free_energy.enzyme_pdb_for_sasa, and other plugin-declared
path fields, a relative path like structures/enzyme.pdb is interpreted as:
<comparison_yaml_parent>/structures/enzyme.pdb
For CLI commands that consume this file, see Comparison and Plotting Reference. For directory layout and data expectations, see Data Requirements & Directory Layout.
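As a small illustration of the path rule above, the fragment below uses two of the path fields named in this section. The project location /projects/study/ is a hypothetical example, and the fragment is not a complete comparison.yaml (name and conditions are omitted).

```yaml
# Assumed location of this file: /projects/study/comparison.yaml (hypothetical)
plugins:
  contacts:
    # Relative path: interpreted as /projects/study/structures/enzyme.pdb
    enzyme_pdb_for_sasa: "structures/enzyme.pdb"
  binding_free_energy:
    # Same rule: resolved against the directory containing comparison.yaml
    enzyme_pdb_for_sasa: "structures/enzyme.pdb"
```

Because resolution is anchored to the file's own directory, the result should not depend on the working directory from which the CLI is invoked.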
Minimal Working Example
name: "polymer_stability_study"
conditions:
  - label: "No Polymer"
    config: "../no_polymer/config.yaml"
    replicates: [1, 2, 3]
  - label: "100% SBMA"
    config: "../sbma_100/config.yaml"
    replicates: [1, 2, 3]
defaults:
  equilibration_time: "10ns"
plugins:
  rmsf:
    selection: "protein and name CA"
Top-Level Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | yes | — | Human-readable project name |
| description | string | no |  | Description of what is being compared |
| control | string | no |  | Label of the control condition. Must match one of the condition labels |
| conditions | list | yes | — | List of condition entries (at least one required) |
| defaults | mapping | no | see below | Default analysis parameters |
| plugins | mapping | no |  | Analysis plugin settings: what to compute |
| plot_settings | mapping | no | see below | Plot customization: how to visualize |
Legacy key handling:
- analysis_settings: is accepted as a backward-compatible alias for plugins: (emits a deprecation warning).
- Unknown top-level keys raise a ValueError listing the invalid keys and valid alternatives.
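The alias behavior described above can be pictured with the fragment below. It is a sketch that reuses the rmsf settings from the minimal example; the commented-out block shows the legacy spelling and is not meant to be combined with the preferred one in the same file.

```yaml
# Preferred spelling
plugins:
  rmsf:
    selection: "protein and name CA"

# Legacy spelling: accepted as an alias for plugins:, but emits a deprecation warning
# analysis_settings:
#   rmsf:
#     selection: "protein and name CA"
```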
conditions[*]
Each entry describes one simulation condition to include in the comparison.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| label | string | yes | — | Display name (must be unique across all conditions) |
| config | path | yes | — | Path to the simulation’s |
| replicates | list of int | yes | — | Replicate numbers to include. A single |
defaults
| Field | Type | Default | Description |
|---|---|---|---|
| equilibration_time | string |  | Time to discard as equilibration (e.g., |
|  | float (0, 1] |  | Significance threshold for pairwise comparisons and ANOVA. Used as the Benjamini-Hochberg FDR threshold when |
|  |  |  | Post-hoc pairwise comparison method. See Post-Hoc Testing Reference for details. |
|  |  |  | Two-sample t-test variance assumption. Only used when |
plugins
Presence of a key enables that analysis. The value is a mapping of that
plugin’s settings. An empty mapping (rmsf: {}) enables the plugin with all
defaults.
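A minimal sketch of the enable-by-presence rule, using only the rmsf keys shown in the minimal example. Which other plugins accept an empty mapping depends on whether they declare required fields, so only rmsf is shown here.

```yaml
plugins:
  # Key present with an empty mapping: RMSF runs with all default settings
  rmsf: {}

# Alternative: override a single field and keep the remaining defaults
# plugins:
#   rmsf:
#     selection: "protein and name CA"
```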
plugins.rmsf
| Field | Type | Default | Description |
|---|---|---|---|
| selection | string |  | MDAnalysis selection string for RMSF computation |
|  | string |  | Reference structure: |
|  | int |  | Required when |
| reference_file | path |  | Path to external PDB reference structure. Required when |
|  | string |  | MDAnalysis selection used for trajectory alignment before RMSF calculation |
|  | string |  | MDAnalysis selection used to compute the centroid reference structure when |
plugins.secondary_structure
| Field | Type | Default | Description |
|---|---|---|---|
|  | string |  | Chain letter for the protein to analyze via DSSP |
plugins.sasa
| Field | Type | Default | Description |
|---|---|---|---|
| runs | list | (required) | List of SASA run definitions (see sub-fields) |
|  | float |  | SASA probe radius in nanometers |
|  | int |  | Number of sphere points for Shrake-Rupley SASA |
|  | int |  | Frames per chunk for memory management |
Each entry in runs:
| Field | Type | Default | Description |
|---|---|---|---|
|  | string | (required) | Name for this SASA computation |
|  | string | (required) | MDAnalysis selection for the target surface |
|  | string | same as | Atoms to include in SASA context (affects shadowing) |
|  | int |  | Frame stride |
plugins.catalytic_triad
| Field | Type | Default | Description |
|---|---|---|---|
|  | string |  | Display name for the triad analysis |
|  | string |  | Optional description of the triad (e.g., |
|  | float |  | Distance threshold in Angstroms (H-bond cutoff) |
| pairs | list | (required) | List of atom pair definitions |
Each entry in pairs:
| Field | Type | Default | Description |
|---|---|---|---|
|  | string | (required) | Display label (e.g., |
|  | string | (required) | MDAnalysis selection for atom/group A. Supports |
|  | string | (required) | MDAnalysis selection for atom/group B |
plugins.distances
| Field | Type | Default | Description |
|---|---|---|---|
|  | float |  | Global default threshold in Angstroms |
| pairs | list | (required) | List of distance pair definitions |
|  | bool |  | Apply periodic boundary conditions to distance calculations |
|  | bool |  | Align trajectory before computing distances |
|  | string |  | MDAnalysis selection used for trajectory alignment |
|  | string |  | Alignment reference mode: |
|  | int |  | Frame index to use as reference when |
Each entry in pairs:
| Field | Type | Default | Description |
|---|---|---|---|
|  | string | (required) | Display label (e.g., |
|  | string | (required) | MDAnalysis selection for group A. Supports |
|  | string | (required) | MDAnalysis selection for group B |
|  | float | global | Per-pair threshold override |
|  | string |  | Display text for d ≤ threshold |
|  | string |  | Display text for d > threshold |
plugins.contacts
| Field | Type | Default | Description |
|---|---|---|---|
|  | string |  | MDAnalysis selection for polymer atoms |
|  | string |  | MDAnalysis selection for protein atoms |
|  | float |  | Contact distance cutoff in Angstroms |
|  | string |  | Residue grouping: |
|  | bool |  | Whether to compute contact residence times |
|  | bool |  | Experimental. Enable enrichment by residue group |
|  | float |  | Relative SASA cutoff defining “surface exposed” (for binding preference) |
| enzyme_pdb_for_sasa | path |  | Path to enzyme PDB for standalone SASA computation (relative to |
|  | bool |  | Include built-in amino acid groups (aromatic, polar, nonpolar, charged) |
|  | mapping |  | Custom residue groups: |
|  | mapping |  | Mutually exclusive partitions for coverage plots: |
|  | list of string |  | Explicit polymer type labels. If |
|  | mapping |  | Custom MDAnalysis selections per polymer type: |
|  | string |  | Chain ID used for polymer auto-detection |
|  | float |  | Per-plugin FDR threshold |
|  | float |  | Minimum Cohen’s d for practical significance |
|  | int |  | Max residues shown per condition in formatted output |
plugins.rmsd
| Field | Type | Default | Description |
|---|---|---|---|
| runs | list | (required) | List of RMSD run definitions |
Each entry in runs:
| Field | Type | Default | Description |
|---|---|---|---|
|  | string | (required) | Name for this RMSD computation (e.g., |
|  | string | (required) | MDAnalysis selection for RMSD atoms |
|  | string | same as | MDAnalysis selection for alignment |
|  | string |  | Reference structure mode: |
|  | int |  | Frame index to use as reference when |
|  | path |  | Path to external PDB reference structure |
|  | string |  | MDAnalysis selection for centroid computation. If |
|  | float |  | Rolling window size in nanoseconds for convergence detection |
|  | float |  | Step size in nanoseconds between convergence windows |
|  | float |  | Maximum slope (Å/ns) for a window to be considered converged |
|  | float |  | Duration in nanoseconds that convergence must be sustained |
plugins.rg
| Field | Type | Default | Description |
|---|---|---|---|
| runs | list | (required) | List of Rg run definitions |
Each entry in runs:
| Field | Type | Default | Description |
|---|---|---|---|
|  | string | (required) | Name for this Rg computation |
|  | string | (required) | MDAnalysis selection for Rg atoms |
|  | string |  | Computation mode: |
|  | string |  | How to weight fragments when |
|  | bool |  | Save per-frame fragment Rg distributions |
|  | int |  | Number of bins for Rg distribution histograms |
plugins.hydrogen_bonds
| Field | Type | Default | Description |
|---|---|---|---|
|  | mapping |  | Named atom groups: |
| summaries | list or mapping | one default summary ( | Named H-bond summaries (see below) |
|  | float |  | H-bond distance cutoff in Angstroms |
|  | float |  | H-bond angle cutoff in degrees |
|  | bool |  | Update atom selections every frame |
|  | int |  | Number of top residue pairs to report |
|  | bool |  | Allow empty group selections: |
|  | bool |  | Whether overlapping composition partitions are allowed |
| composition | mapping |  | Composition analysis settings |
|  | float |  | Override trajectory timestep in picoseconds for time-axis plots |
Each summary entry in summaries has:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Unique summary name |
|  |  | exactly one of | Inter-group H-bonds |
|  |  | exactly one of | Intra-group H-bonds |
For mapping-form input, keys are treated as name values.
composition sub-fields:
| Field | Type | Default | Description |
|---|---|---|---|
|  | mapping | — | Named partitions: |
plugins.exposure
Experimental
Exposure dynamics is an experimental analysis. Results should be interpreted with caution and are subject to change.
| Field | Type | Default | Description |
|---|---|---|---|
|  | float |  | Fraction SASA defining “exposed” |
|  | float |  | Lower bound for transient classification |
|  | float |  | Upper bound for transient classification |
|  | int |  | Minimum consecutive frames for an event |
|  | string |  | Chain ID for protein |
|  | string |  | MDAnalysis selection for protein |
|  | string |  | MDAnalysis selection for polymer |
|  | list of string |  | Residue names for enrichment analysis |
|  | float |  | SASA probe radius (nm) |
|  | int |  | Number of sphere points |
plugins.binding_free_energy
Experimental
Binding free energy decomposition is experimental and under active development.
| Field | Type | Default | Description |
|---|---|---|---|
|  | string |  | Energy units: |
|  | bool |  | Recompute binding preference from contacts if no cache is available |
|  | float |  | Minimum relative SASA for surface-exposed |
| enzyme_pdb_for_sasa | path |  | Enzyme PDB for SASA computation |
|  | bool |  | Include built-in amino acid class groups |
|  | mapping |  | Custom residue groups: |
|  | mapping |  | Mutually exclusive protein-group partitions |
|  | mapping |  | Custom MDAnalysis selections per polymer type |
|  | string |  | Chain ID used for polymer auto-detection |
|  | float |  | FDR threshold |
plugins.polymer_affinity
Experimental
Polymer affinity scoring is experimental and under active development.
| Field | Type | Default | Description |
|---|---|---|---|
|  | bool |  | Recompute binding preference from contacts if no cache is available |
|  | float |  | Minimum relative SASA |
|  | path |  | Enzyme PDB for SASA computation |
|  | bool |  | Use built-in AA groups |
|  | mapping |  | Custom residue groups |
|  | mapping |  | Mutually exclusive partitions |
|  | mapping |  | Custom MDAnalysis selections per polymer type |
|  | string |  | Chain ID used for polymer auto-detection |
|  | float |  | FDR threshold |
plugins.polymer_bridging
Experimental
Polymer bridging detection is experimental and under active development.
| Field | Type | Default | Description |
|---|---|---|---|
|  | string |  | MDAnalysis selection for protein |
|  | string |  | MDAnalysis selection for polymer |
|  | float |  | Contact distance cutoff in Angstroms for oligomer-protein contact detection |
|  | float |  | Minimum frame-wise CA-CA distance to count as multisite ( |
plot_settings
| Field | Type | Default | Description |
|---|---|---|---|
|  | path |  | Directory for generated plots (relative to |
|  | string |  | Image format: |
| dpi | int |  | Resolution for raster formats. Range: 50–600. |
| style | string |  | Style preset: |
|  | string |  | Seaborn/matplotlib color palette name |
| theme | mapping | from style preset | Visual theme overrides (see below) |
plot_settings.theme
All fields are optional — defaults are drawn from the selected style preset.
Font sizes:
| Field | publication | presentation | Description |
|---|---|---|---|
|  | 13 | 18 | Axes title font size |
|  | 14 | 20 | Figure suptitle font size |
|  | 11 | 15 | Axis label font size |
|  | 9 | 12 | Tick label font size |
|  | 9 | 12 | Legend entry font size |
|  | 9 | 12 | Heatmap annotation font size |
|  | 8 | 10 | Secondary annotation font size |
|  | 7 | 9 | Fine-grained annotation font size |
Bar chart:
| Field | Default | Description |
|---|---|---|
|  |  | Bar fill opacity |
|  |  | Bar edge color |
|  |  | Bar edge line width |
|  |  | Error bar cap size in points |
Replicate dots:
| Field | Default | Description |
|---|---|---|
|  |  | Scatter marker size |
|  |  | Dot opacity |
|  |  | Dot color |
Lines:
| Field | Default | Description |
|---|---|---|
|  |  | Line plot opacity |
|  |  | fill_between band opacity |
|  |  | Reference line color |
|  |  | Reference line style |
|  |  | Reference line width |
|  |  | Vertical highlight line opacity |
Axes chrome:
| Field | Default | Description |
|---|---|---|
|  |  | Hide top axis spine |
|  |  | Hide right axis spine |
Title & legend:
| Field | Default | Description |
|---|---|---|
|  |  | Title font weight |
|  |  | Matplotlib legend location |
|  |  | bbox_to_anchor for legend placement |
|  |  | Render “Made by PolyzyMD” watermark |
Per-Analysis Plot Settings
Per-analysis plot customization keys go under plot_settings: at the same
level as style, dpi, etc.
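The nesting described above might look like the sketch below. The style value "publication" is taken from the preset names in the font-size table, the dpi value 300 is an arbitrary choice inside the documented 50–600 range, and the empty mappings are placeholders for the per-analysis fields listed in the tables that follow. Whether an empty per-analysis mapping is accepted is an assumption.

```yaml
plot_settings:
  style: "publication"   # preset name, taken from the theme table columns
  dpi: 300               # illustrative value within the 50-600 range
  theme: {}              # optional overrides; defaults come from the style preset
  # Per-analysis blocks sit at the same level as style and dpi
  rmsf: {}               # placeholder; see plot_settings.rmsf
  contacts: {}           # placeholder; see plot_settings.contacts
```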
plot_settings.rmsf:
| Field | Default | Description |
|---|---|---|
|  |  | Show SEM fill_between bands |
|  |  | Residue IDs for vertical reference lines |
|  |  | Per-residue profile figure size |
|  |  | Bar comparison figure size |
plot_settings.catalytic_triad:
| Field | Default | Description |
|---|---|---|
|  |  | Multi-row KDE panel |
|  |  | Threshold bar chart |
|  |  | 2D joint KDE |
|  |  | X-axis range for KDE (Angstroms) |
plot_settings.distances:
| Field | Default | Description |
|---|---|---|
|  |  | Threshold line on distributions |
|  |  | KDE vs histogram |
|  |  | Above/below threshold bars |
plot_settings.contacts:
| Field | Default | Description |
|---|---|---|
|  |  | Binding preference heatmap |
|  |  | Enrichment bar chart |
|  |  | System coverage heatmap |
|  |  | System coverage bar chart |
|  |  | Per-residue contact fraction profile |
|  |  | Per-residue residence time profile |
plot_settings.binding_free_energy:
| Field | Default | Description |
|---|---|---|
|  |  | ΔG_sel heatmap |
|  |  | ΔG_sel bar chart |
|  |  | Diverging colormap for heatmap |
plot_settings.polymer_affinity:
| Field | Default | Description |
|---|---|---|
|  |  | Total score by condition |
|  |  | Per-group contributions |
plot_settings.secondary_structure:
| Field | Default | Description |
|---|---|---|
|  |  | Residue × time SS heatmap |
|  |  | Helix/strand/coil fraction bars |
|  |  | One bar chart per SS type |
|  |  | Δ(helix persistence) vs control |
|  |  | Diverging colormap for diff heatmap |
Tip
Common tips:
- Run polyzymd compare validate to check your comparison.yaml for errors before launching a full analysis run.
- Relative paths in config: are resolved from the directory containing comparison.yaml, not from your working directory.
- An empty plugin mapping (e.g., rmsf: {}) enables the analysis with all default settings; you only need to specify fields you want to override.
- Set control: to match one of your condition labels to get Δ-from-control columns in comparison tables and plots.
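To make the last tip concrete, the sketch below extends the minimal working example with a control: entry. The labels and paths are the ones used in that example; the only addition is the control key, whose value has to repeat one of the condition labels exactly.

```yaml
name: "polymer_stability_study"
control: "No Polymer"        # must equal one of the labels below
conditions:
  - label: "No Polymer"
    config: "../no_polymer/config.yaml"
    replicates: [1, 2, 3]
  - label: "100% SBMA"
    config: "../sbma_100/config.yaml"
    replicates: [1, 2, 3]
plugins:
  rmsf: {}
```

Running polyzymd compare validate on a file like this is a quick way to confirm that the control label and the config: paths resolve as intended.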