RMSD Plugin Reference
For a step-by-step guide to running RMSD analysis, see RMSD Analysis: Quick Start.
Configuration Reference
All fields for RMSDRunSettings:
| Field | Type | Default | Description |
|---|---|---|---|
|  |  | required | Human-readable run label (must be unique) |
|  |  |  | MDAnalysis selection for RMSD calculation |
|  |  |  | MDAnalysis selection for trajectory alignment |
|  |  |  | Reference mode: |
|  |  |  | 0-indexed frame for |
|  |  |  | Path to external PDB for |
|  |  |  | Selection for centroid finding; defaults to |
|  |  |  | Sliding window size for convergence detection (ns) |
|  |  |  | Step between successive windows (ns) |
|  |  |  | Max absolute slope to qualify as “flat” (Å/ns) |
|  |  |  | Required sustained duration below threshold (ns) |
Top-level RMSDSettings contains a single field:
| Field | Type | Default | Description |
|---|---|---|---|
|  |  | required | One or more named RMSD runs (at least one required) |
Note
Run labels must be unique within a single comparison.yaml. Duplicate labels
raise a validation error.
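The uniqueness rule can be checked before launching an analysis. The following is a minimal sketch, not the plugin's own validator: `find_duplicate_labels` is a hypothetical helper, and the `runs` list only mirrors the shape of the entries under `plugins.rmsd.runs`.

```python
from collections import Counter


def find_duplicate_labels(runs):
    """Return run labels that appear more than once, in first-seen order."""
    counts = Counter(run["label"] for run in runs)
    return [label for label, n in counts.items() if n > 1]


# Hypothetical run entries, shaped like the items under plugins.rmsd.runs.
runs = [
    {"label": "Protein Backbone"},
    {"label": "Active Site"},
    {"label": "Protein Backbone"},
]
duplicates = find_duplicate_labels(runs)
print(duplicates)
```

An empty list means every label is unique.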
Output Files
Results are saved in your project’s analysis directory:
<projects_directory>/
└── analysis/
    └── rmsd/
        ├── run_1/
        │   ├── rmsd_eq10ns.json
        │   ├── rmsd_Protein Backbone_timeseries.npz
        │   └── rmsd_Active Site_timeseries.npz
        ├── run_2/
        │   ├── rmsd_eq10ns.json
        │   ├── rmsd_Protein Backbone_timeseries.npz
        │   └── rmsd_Active Site_timeseries.npz
        ├── run_3/
        │   └── ...
        └── aggregated/
            └── rmsd_reps1-3_eq10ns.json
Each replicate directory contains:
JSON result — summary statistics for all configured runs
NPZ sidecar(s) — raw per-frame RMSD timeseries (one per run)
JSON Result Structure
Per-replicate result (RMSDResult):
{
  "config_hash": "abc123...",
  "replicate": 1,
  "equilibration_time": 10.0,
  "equilibration_unit": "ns",
  "selection_string": "protein and name CA; ...",
  "n_frames_total": 10000,
  "n_frames_used": 9000,
  "trajectory_files": ["..."],
  "run_results": [
    {
      "run_label": "Protein Backbone",
      "selection": "protein and name CA",
      "alignment_selection": "protein and name CA",
      "reference_mode": "centroid",
      "mean_rmsd": 1.823,
      "std_rmsd": 0.312,
      "median_rmsd": 1.791,
      "min_rmsd": 0.987,
      "max_rmsd": 3.104,
      "final_rmsd": 1.956,
      "sem_rmsd": 0.078,
      "correlation_time": 4521.3,
      "correlation_time_unit": "ps",
      "n_independent_frames": 16,
      "statistical_inefficiency": 562.7,
      "n_frames_total": 10000,
      "n_frames_used": 9000,
      "npz_path": ".../rmsd_Protein Backbone_timeseries.npz",
      "time_unit": "ns",
      "timestep_ps": 10.0,
      "converged": true,
      "convergence_assessable": true,
      "convergence_time_ns": 12.5,
      "convergence_message": "Converged at 12.500 ns"
    }
  ]
}
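A per-replicate result can be condensed into one line per run with the standard library alone. This is a minimal sketch: `summarize_runs` is a hypothetical helper, not part of the plugin, and it relies only on field names that appear in the example above.

```python
import json


def summarize_runs(result):
    """Return (label, 'mean ± SEM', converged) for each run in a result dict."""
    rows = []
    for run in result["run_results"]:
        spread = f"{run['mean_rmsd']:.3f} ± {run['sem_rmsd']:.3f}"
        rows.append((run["run_label"], spread, run["converged"]))
    return rows


# Trimmed copy of the example result; a real file would be read with
# json.load() from the replicate's directory under analysis/rmsd/.
example = json.loads("""
{
  "replicate": 1,
  "run_results": [
    {"run_label": "Protein Backbone", "mean_rmsd": 1.823,
     "sem_rmsd": 0.078, "converged": true}
  ]
}
""")
rows = summarize_runs(example)
print(rows)
```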
Aggregated result (RMSDAggregatedResult):
{
  "replicates": [1, 2, 3],
  "n_replicates": 3,
  "run_results": [
    {
      "run_label": "Protein Backbone",
      "selection": "protein and name CA",
      "overall_mean": 1.856,
      "overall_sem": 0.034,
      "overall_median": 1.823,
      "per_replicate_means": [1.823, 1.891, 1.854],
      "per_replicate_stds": [0.312, 0.298, 0.324],
      "per_replicate_medians": [1.791, 1.862, 1.816],
      "n_converged_replicates": 3,
      "convergence_fraction": 1.0,
      "mean_convergence_time_ns": 13.2,
      "median_convergence_time_ns": 12.5
    }
  ]
}
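The example values are consistent with `overall_mean` being the mean of the per-replicate means, and `convergence_fraction` being `n_converged_replicates / n_replicates`. These relationships are inferred from the numbers shown here, not from the plugin's source, so treat the sketch below as a sanity check under that assumption.

```python
from statistics import mean

# Values copied from the aggregated example above.
aggregated_run = {
    "per_replicate_means": [1.823, 1.891, 1.854],
    "n_converged_replicates": 3,
}
n_replicates = 3

# Assumed definitions (inferred from the example, not confirmed by the plugin).
overall_mean = mean(aggregated_run["per_replicate_means"])
convergence_fraction = aggregated_run["n_converged_replicates"] / n_replicates
print(round(overall_mean, 3), convergence_fraction)
```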
Plot Types
The RMSD plugin generates figures through polyzymd compare plot-all:
| Plot output | Description |
|---|---|
|  | Mean RMSD vs time with SEM shading, one per run |
|  | Grouped bar chart of mean RMSD across conditions, one per run |
|  | Dual-axis plot: RMSD timeseries with sliding-window slope and convergence marker (requires |
Timeseries plot features:
Mean RMSD curve per condition with SEM shading
Legend placed outside the plot area (bbox_to_anchor=(1.02, 0.5))
Optional per-replicate traces via show_per_replicate: true
RMSD plot behavior can be customized in comparison.yaml:
plot_settings:
  rmsd:
    show_per_replicate: false      # Overlay individual replicate traces
    figsize: [10, 6]               # Default figure size (bar charts)
    timeseries_figsize: [12, 5]    # Timeseries figure size (wider)
    show_convergence_plots: false  # Generate per-replicate convergence diagnostics
    convergence_figsize: [12, 5]   # Convergence panel figure size
Convergence Detection
Convergence detection is always on — every RMSD run automatically applies a
sliding-window slope heuristic to determine whether the RMSD timeseries has
plateaued. This is a purely additive diagnostic: it does not affect ranking,
statistical tests, or any other comparison output. Convergence results appear
as additional fields in per-replicate and aggregated JSON files, and optional
convergence plots can be enabled via show_convergence_plots: true.
For a conceptual explanation of the algorithm, its parameters, and its limitations, see Establishing Convergence in MD Simulations.
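To make the heuristic concrete, here is a minimal sketch of a sliding-window slope test. It is an illustration, not the plugin's implementation. The function and parameter names (`detect_convergence`, `window_ns`, `step_ns`, `max_slope`, `sustain_ns`) are hypothetical stand-ins for the four convergence settings in the configuration table, and the rule for "sustained" (consecutive flat windows spanning at least `sustain_ns`) is an assumption.

```python
def window_slope(points):
    """Least-squares slope of value against time for (time, value) pairs."""
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    v_mean = sum(v for _, v in points) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in points)
    den = sum((t - t_mean) ** 2 for t, _ in points)
    return num / den


def detect_convergence(times_ns, rmsd, window_ns, step_ns, max_slope, sustain_ns):
    """Return the start (ns) of the first sustained flat stretch, or None."""
    flat_since = None  # start time of the current run of flat windows
    start = times_ns[0]
    while start + window_ns <= times_ns[-1]:
        window = [(t, v) for t, v in zip(times_ns, rmsd)
                  if start <= t <= start + window_ns]
        # A window needs at least two points before a slope can be fitted.
        flat = len(window) >= 2 and abs(window_slope(window)) <= max_slope
        if flat:
            if flat_since is None:
                flat_since = start
            # The flat stretch runs from its first window start to this window end.
            if start + window_ns - flat_since >= sustain_ns:
                return flat_since
        else:
            flat_since = None
        start += step_ns
    return None


# Synthetic trace: linear rise for 10 ns, then a plateau at 2.0 Å.
times_ns = [float(t) for t in range(31)]
rmsd = [0.2 * t if t < 10 else 2.0 for t in times_ns]
t_conv = detect_convergence(times_ns, rmsd, window_ns=5.0, step_ns=1.0,
                            max_slope=0.01, sustain_ns=10.0)
print(t_conv)
```

A trace that keeps drifting never produces a flat window, so the function returns `None` for it.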
Common CLI Options
| Option | Default | Description |
|---|---|---|
|  |  | Path to comparison configuration |
|  |  | Equilibration time to skip |
|  | off | Ignore cached results and recompute |
|  |  | Output format ( |
|  | (none) | Save formatted output to file |
|  | off | Suppress INFO messages |
|  | off | Enable DEBUG logging |
Troubleshooting
“Selection matched no atoms”
Cause: MDAnalysis selection doesn’t match any atoms in your topology.
Fix:
Check numbering conventions: in MDAnalysis selections, resid uses the residue numbers stored in your topology (as written in the PDB), index is 0-based, and bynum is 1-based
Verify atom names match your topology
Use polyzymd --debug compare run rmsd -f comparison.yaml ... for detailed diagnostics
“At least one RMSD run must be defined”
Cause: The runs list in plugins.rmsd is empty or missing.
Fix: Add at least one run entry with a label field:
plugins:
  rmsd:
    runs:
      - label: "Protein Backbone"
“reference_file does not exist”
Cause: Using reference_mode: external but the PDB path is invalid.
Fix: Provide an absolute path or a path relative to the working directory:
reference_mode: "external"
reference_file: "/absolute/path/to/crystal.pdb"
“atom count mismatch between trajectory and external PDB”
Cause: The selection string matches different numbers of atoms in the
trajectory vs. the external reference PDB.
Fix:
Ensure both systems use the same atom naming convention
Check that the external PDB contains the same residues as your simulation
Use a more specific selection if topologies differ
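One way to locate the mismatch is to apply the same selection to both structures and compare the atom counts. The sketch below assumes MDAnalysis is installed. `selection_counts` and `compare_files` are hypothetical helpers and the file paths are placeholders; only `Universe`, `select_atoms`, and `n_atoms` are MDAnalysis API.

```python
def selection_counts(trajectory_universe, reference_universe, selection):
    """Return (n_trajectory, n_reference) atoms matched by one selection."""
    n_traj = trajectory_universe.select_atoms(selection).n_atoms
    n_ref = reference_universe.select_atoms(selection).n_atoms
    return n_traj, n_ref


def compare_files(topology_path, reference_pdb_path, selection):
    """Load both structures with MDAnalysis and compare selection sizes."""
    import MDAnalysis as mda  # imported lazily; requires MDAnalysis

    return selection_counts(
        mda.Universe(topology_path), mda.Universe(reference_pdb_path), selection
    )
```

For example, `compare_files("system.pdb", "crystal.pdb", "protein and name CA")` returns a pair of counts. If the two numbers differ, narrow the selection until they agree.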
Very high RMSD values (> 10 Å)
Cause: Usually indicates alignment issues, wrong selection, or unfolding.
Fix:
Check that alignment_selection matches atoms in your system
Try reference_mode: "average" to compare
Verify trajectory files are complete
Check for protein unfolding or large conformational changes
“Low statistical reliability” warning
Cause: Long correlation time relative to trajectory length.
This is informational, not an error. Results are still valid but uncertainties may be underestimated.
Mitigation:
Use multiple replicates (aggregated SEM is more reliable)
Run longer simulations
Results are still useful for qualitative comparisons
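The effect of a long correlation time can be seen in the example per-replicate result. Its values are consistent with dividing the frame count by the statistical inefficiency to get the number of independent frames, then dividing the standard deviation by the square root of that number to get the SEM. That reading is inferred from the example numbers, so the sketch below is an assumption about the calculation rather than a statement of it.

```python
import math

# Values copied from the example per-replicate result in this reference.
std_rmsd = 0.312
n_frames_used = 9000
statistical_inefficiency = 562.7

# Assumption: correlated frames count as n_frames_used / g independent samples.
n_independent = n_frames_used / statistical_inefficiency
sem = std_rmsd / math.sqrt(n_independent)
print(round(n_independent), round(sem, 3))
```

With only about 16 effectively independent frames out of 9000, the uncertainty estimate rests on a small sample, which is why aggregating across replicates is the more reliable route.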
Missing replicate data
Message: Skipping replicate N: trajectory data not found
Cause: The requested replicate hasn’t completed, or the path is incorrect.
Fix: This is informational — analysis continues with available replicates. Check simulation status if unexpected.
RMSD vs RMSF Comparison
| Feature | RMSD | RMSF |
|---|---|---|
| Measures | Global deviation from reference | Per-residue fluctuation |
| Output | One value per frame (timeseries) | One value per residue (profile) |
| Reference | Fixed structure (centroid/average/external) | Time-averaged position |
| Detects | Conformational drift, unfolding | Flexible loops, rigid core |
| Multi-run | Yes ( | Single selection |
| Best for | Equilibration assessment, stability comparison | Flexibility mapping |
Tip
Use RMSD first to assess overall stability and choose equilibration time, then use RMSF to identify which regions drive flexibility differences.