RMSF Plugin Reference

For a step-by-step guide to running RMSF analysis, see RMSF Analysis: Quick Start.

Configuration Reference

RMSF settings live under plugins.rmsf in comparison.yaml.

Field                Type        Default                Description
-------------------  ----------  ---------------------  -----------------------------------------------------------
enabled              bool        true                   Enable or disable RMSF analysis
selection            str         "protein and name CA"  MDAnalysis selection used for RMSF calculation
reference_mode       str         "centroid"             Alignment reference mode: centroid, average, frame, external
reference_frame      int | null  null                   1-indexed frame when reference_mode: frame
reference_file       str | null  null                   Path to external PDB when reference_mode: external
alignment_selection  str         "protein and name CA"  Selection used for trajectory alignment
centroid_selection   str         "protein"              Selection used to find centroid reference frame

Note

Validation rules:

  • reference_mode: frame requires reference_frame

  • reference_mode: external requires reference_file

  • reference_file must point to an existing PDB file
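The rules above can be sketched as a small standalone check. This is only an illustration: the function name validate_rmsf_reference and its error strings are hypothetical and are not taken from the polyzymd code base.

```python
from pathlib import Path
from typing import Optional


def validate_rmsf_reference(
    reference_mode: str,
    reference_frame: Optional[int] = None,
    reference_file: Optional[str] = None,
) -> list[str]:
    """Return a list of validation errors (empty if the settings are consistent)."""
    errors = []
    if reference_mode not in {"centroid", "average", "frame", "external"}:
        errors.append(f"unknown reference_mode: {reference_mode!r}")
    # Rule 1: frame mode needs an explicit frame number.
    if reference_mode == "frame" and reference_frame is None:
        errors.append("reference_mode: frame requires reference_frame")
    # Rule 2: external mode needs a file path.
    if reference_mode == "external" and reference_file is None:
        errors.append("reference_mode: external requires reference_file")
    # Rule 3: any configured reference_file must exist on disk.
    if reference_file is not None and not Path(reference_file).is_file():
        errors.append(f"reference_file does not exist: {reference_file}")
    return errors


print(validate_rmsf_reference("centroid"))  # no errors expected
print(validate_rmsf_reference("frame"))     # missing reference_frame
```

Returning a list rather than raising on the first problem lets a caller report every misconfiguration at once.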

Minimal plugin block

plugins:
  rmsf:
    enabled: true
    selection: "protein and name CA"
    reference_mode: "centroid"

External reference example

plugins:
  rmsf:
    selection: "protein and name CA"
    reference_mode: "external"
    reference_file: "/path/to/crystal_structure.pdb"

Output Files

RMSF writes per-replicate and aggregated JSON files under each condition’s analysis directory:

<projects_directory>/
└── analysis/
    └── rmsf/
        ├── run_1/
        │   └── rmsf_eq10ns.json
        ├── run_2/
        │   └── rmsf_eq10ns.json
        ├── run_3/
        │   └── rmsf_eq10ns.json
        └── aggregated/
            └── rmsf_reps1-3_eq10ns.json

Comparison-level output is written separately in the comparison workspace:

<comparison_workspace>/
└── comparison/
    └── rmsf/
        └── result.json

Per-replicate JSON (RMSFResult)

{
    "config_hash": "abc123...",
    "replicate": 1,
    "equilibration_time": 10.0,
    "equilibration_unit": "ns",
    "selection_string": "protein and name CA",
    "correlation_time": 15394.5,
    "correlation_time_unit": "ps",
    "n_independent_frames": 6,
    "residue_ids": [1, 2, 3],
    "residue_names": ["MET", "ALA", "SER"],
    "rmsf_values": [0.45, 0.52, 0.49],
    "mean_rmsf": 0.621,
    "std_rmsf": 0.215,
    "min_rmsf": 0.248,
    "max_rmsf": 3.160,
    "reference_mode": "centroid",
    "reference_frame": 401,
    "alignment_selection": "protein and name CA",
    "reference_file": null,
    "n_frames_total": 10000,
    "n_frames_used": 9000,
    "trajectory_files": [".../prod_1.xtc"]
}
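As a hedged example of consuming this file, the sketch below parses a trimmed copy of the JSON above and lists the most flexible residues. The helper top_flexible_residues is hypothetical; only the field names come from the example.

```python
import json

# Trimmed copy of the example above. A real result would be read from
# the per-replicate rmsf_eq10ns.json file instead of an inline string.
raw = """
{
    "replicate": 1,
    "residue_ids": [1, 2, 3],
    "residue_names": ["MET", "ALA", "SER"],
    "rmsf_values": [0.45, 0.52, 0.49],
    "n_frames_total": 10000,
    "n_frames_used": 9000
}
"""


def top_flexible_residues(result: dict, n: int = 2) -> list:
    """Return the n residues with the highest RMSF as (name, id, rmsf) tuples."""
    rows = zip(result["residue_names"], result["residue_ids"], result["rmsf_values"])
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


result = json.loads(raw)
top = top_flexible_residues(result)
fraction_used = result["n_frames_used"] / result["n_frames_total"]

print(top)  # [('ALA', 2, 0.52), ('SER', 3, 0.49)]
print(f"{fraction_used:.0%} of frames used after equilibration")  # 90% ...
```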

Aggregated JSON (RMSFAggregatedResult)

{
    "replicates": [1, 2, 3],
    "n_replicates": 3,
    "residue_ids": [1, 2, 3],
    "residue_names": ["MET", "ALA", "SER"],
    "mean_rmsf_per_residue": [0.46, 0.50, 0.47],
    "sem_rmsf_per_residue": [0.02, 0.03, 0.02],
    "per_replicate_mean_rmsf": [0.64, 0.59, 0.63],
    "overall_mean_rmsf": 0.62,
    "overall_sem_rmsf": 0.02,
    "overall_min_rmsf": 0.30,
    "overall_max_rmsf": 4.21
}
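The overall fields in this example are consistent with a mean and standard error taken over the per-replicate means. The sketch below reproduces them under that assumption (SEM as sample standard deviation divided by sqrt(n)); the estimator polyzymd actually uses is not stated here, and the per-residue profiles are hypothetical numbers.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Overall statistics from the per_replicate_mean_rmsf values in the example above.
overall_mean, overall_sem = mean_and_sem([0.64, 0.59, 0.63])
print(round(overall_mean, 2), round(overall_sem, 2))  # 0.62 0.02

# Per-residue statistics: one RMSF profile per replicate (hypothetical numbers).
profiles = [
    [0.45, 0.52, 0.49],
    [0.50, 0.47, 0.44],
    [0.43, 0.51, 0.48],
]
# zip(*profiles) regroups the values so each tuple holds one residue across replicates.
per_residue = [mean_and_sem(column) for column in zip(*profiles)]
for residue_mean, residue_sem in per_residue:
    print(f"{residue_mean:.3f} +/- {residue_sem:.3f}")
```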

Comparison JSON (result.json)

{
    "metric": "rmsf",
    "conditions": [
        {
            "label": "No Polymer",
            "n_replicates": 3,
            "mean_rmsf": 0.715,
            "sem_rmsf": 0.020,
            "replicate_values": [0.755, 0.693, 0.696]
        },
        {
            "label": "With Polymer",
            "n_replicates": 3,
            "mean_rmsf": 0.551,
            "sem_rmsf": 0.034,
            "replicate_values": [0.590, 0.520, 0.542]
        }
    ],
    "pairwise_comparisons": [
        {
            "condition_a": "No Polymer",
            "condition_b": "With Polymer",
            "percent_change": -22.9,
            "p_value": 0.0211,
            "cohens_d": 4.06,
            "significant": true,
            "direction": "stabilizing"
        }
    ],
    "ranking": ["With Polymer", "No Polymer"]
}
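Two of the derived fields can be checked by hand. The sketch below assumes that percent_change is measured relative to condition_a and that ranking orders conditions by ascending mean RMSF; both assumptions match the example values but are not confirmed by this page.

```python
# Condition-level means copied from the example above.
conditions = [
    {"label": "No Polymer", "mean_rmsf": 0.715},
    {"label": "With Polymer", "mean_rmsf": 0.551},
]


def percent_change(mean_a: float, mean_b: float) -> float:
    """Change of condition B relative to condition A, in percent."""
    return (mean_b - mean_a) / mean_a * 100.0


means = {c["label"]: c["mean_rmsf"] for c in conditions}
change = percent_change(means["No Polymer"], means["With Polymer"])

# Lowest mean RMSF first, i.e. the least flexible condition ranks highest.
ranking = sorted(means, key=means.get)

print(round(change, 1))  # -22.9
print(ranking)           # ['With Polymer', 'No Polymer']
```

A negative change corresponds to lower RMSF in condition_b, which the example labels "stabilizing".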

Plot Types

RMSF plots are generated by polyzymd compare plot-all -f comparison.yaml.

Plot output          Description
-------------------  -----------------------------------------------------------
rmsf_profile.png     Per-residue RMSF profile by condition; optional SEM shading
rmsf_comparison.png  Horizontal bar chart of condition-level mean RMSF with SEM

The profile plot can include a reference secondary-structure annotation row when reference_file is set and readable.

RMSF plot settings

plot_settings:
  rmsf:
    show_error: true                # Show SEM band/bars
    highlight_residues: [77, 133]   # Vertical guide lines in profile plot
    figsize_profile: [14, 4]        # Profile figure size
    figsize_comparison: [8, 6]      # Comparison figure size

Field               Type                 Default  Description
------------------  -------------------  -------  ---------------------------------------
show_error          bool                 true     Show SEM shading/bars
highlight_residues  list[int]            []       Residue numbers to mark on profile plot
figsize_profile     tuple[float, float]  [14, 4]  Profile figure size
figsize_comparison  tuple[float, float]  [8, 6]   Comparison figure size

Common CLI Options

Option        Default          Description
------------  ---------------  -----------------------------------
-f, --file    comparison.yaml  Path to comparison configuration
--eq-time     0ns              Equilibration time to skip
--recompute   off              Ignore cached results and recompute
--format      table            Output format (table or json)
-o, --output  (none)           Save formatted output to file
-q, --quiet   off              Suppress INFO messages
--debug       off              Enable DEBUG logging

Troubleshooting

“Selection matched no atoms”

Cause: The MDAnalysis selection does not match atoms in the topology.

Fix:

  • Check residue numbering and atom names in your input structure

  • Start with selection: "protein and name CA"

  • Re-run with --debug for detailed selection diagnostics

“reference_file does not exist”

Cause: reference_mode: external is set, but the path is invalid.

Fix: Use an absolute path or a path relative to your working directory.

“External PDB atom count does not match trajectory selection”

Cause: The selection string resolves to a different number of atoms in the trajectory than in the external reference.

Fix:

  • Ensure both structures use compatible atom naming

  • Use a stricter selection that matches in both systems

  • Confirm the external PDB contains the same residue set
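To locate the mismatch, it can help to run the same selection against both structures and compare the results. The helper below is a hypothetical diagnostic, not part of polyzymd; it relies only on select_atoms, n_atoms, and resids, which MDAnalysis Universe and AtomGroup objects provide.

```python
def selection_mismatch(traj_universe, ref_universe, selection):
    """Compare what one selection string matches in two MDAnalysis Universes."""
    traj_atoms = traj_universe.select_atoms(selection)
    ref_atoms = ref_universe.select_atoms(selection)
    # resids holds one residue id per selected atom; sets expose the differences.
    traj_resids = {int(r) for r in traj_atoms.resids}
    ref_resids = {int(r) for r in ref_atoms.resids}
    return {
        "n_traj": traj_atoms.n_atoms,
        "n_ref": ref_atoms.n_atoms,
        "only_in_traj": sorted(traj_resids - ref_resids),
        "only_in_ref": sorted(ref_resids - traj_resids),
    }


# Usage (requires MDAnalysis; the topology and trajectory names are placeholders):
# import MDAnalysis as mda
# traj = mda.Universe("topology.pdb", "prod_1.xtc")
# ref = mda.Universe("/path/to/crystal_structure.pdb")
# print(selection_mismatch(traj, ref, "protein and name CA"))
```

Residue ids listed under only_in_traj or only_in_ref point to the residues that a stricter selection would need to exclude.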

Very high RMSF values (> 10 Å)

Cause: Usually alignment mismatch, overly broad selection, or genuine structural instability.

Fix:

  • Verify alignment_selection and selection

  • Try reference_mode: "average" as a cross-check

  • Confirm trajectory files are complete

“Low statistical reliability” warning

Cause: Correlation time is large relative to available production data.

Fix:

  • Use more replicates

  • Extend simulation length

  • Treat the result as qualitative if uncertainty is large

For interpretation guidance, see RMSF Analysis: Statistical Best Practices.
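As a rough rule of thumb, the number of statistically independent samples scales with production time divided by correlation time. The sketch below applies that ratio to the per-replicate example, assuming evenly spaced frames so that 9000 of 10000 frames after 10 ns of equilibration corresponds to about 90 ns of production. The exact estimator behind n_independent_frames is not documented here; the helper names are hypothetical.

```python
def approx_independent_frames(production_time_ps: float, correlation_time_ps: float) -> float:
    """Rough count of independent samples: production time / correlation time."""
    return production_time_ps / correlation_time_ps


def production_time_needed_ps(target_samples: float, correlation_time_ps: float) -> float:
    """Production time needed to reach a target number of independent samples."""
    return target_samples * correlation_time_ps


# Values from the per-replicate example: correlation_time = 15394.5 ps.
# The 90 ns production time is inferred, not read from the file.
n_independent = approx_independent_frames(90_000.0, 15394.5)
print(f"{n_independent:.2f}")  # 5.85, in line with the 6 reported in the example

# Production time per replicate for about 20 independent samples, in ns.
print(production_time_needed_ps(20, 15394.5) / 1000.0)
```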

Missing replicate data

Message: Skipping replicate N: trajectory data not found

Cause: Replicate output is missing or path configuration is incorrect.

Fix: No action is needed if the gap is expected; analysis continues with the available replicates. If a missing replicate is unexpected, verify that the simulation completed and that the file paths are correct.
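After a run, one way to see which replicates were skipped is to check for the per-replicate result files described under Output Files. This sketch assumes the run_N/rmsf_eq10ns.json layout shown there; the function name and the example directory are hypothetical.

```python
from pathlib import Path


def missing_replicates(rmsf_dir, replicates, filename="rmsf_eq10ns.json"):
    """Return replicate numbers whose per-replicate result file is absent."""
    root = Path(rmsf_dir)
    return [r for r in replicates if not (root / f"run_{r}" / filename).is_file()]


# Example call against a placeholder directory; adjust the path and the
# equilibration suffix in the file name to match your project.
print(missing_replicates("analysis/rmsf", [1, 2, 3]))
```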