# RMSF Plugin Reference

For a step-by-step guide to running RMSF analysis, see {doc}`../how_to/analysis_rmsf_quickstart`.

## Configuration Reference

RMSF settings live under `plugins.rmsf` in `comparison.yaml`.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enabled` | `bool` | `true` | Enable or disable RMSF analysis |
| `selection` | `str` | `"protein and name CA"` | MDAnalysis selection used for RMSF calculation |
| `reference_mode` | `str` | `"centroid"` | Alignment reference mode: `centroid`, `average`, `frame`, `external` |
| `reference_frame` | `int \| null` | `null` | 1-indexed frame when `reference_mode: frame` |
| `reference_file` | `str \| null` | `null` | Path to external PDB when `reference_mode: external` |
| `alignment_selection` | `str` | `"protein and name CA"` | Selection used for trajectory alignment |
| `centroid_selection` | `str` | `"protein"` | Selection used to find centroid reference frame |

```{note}
Validation rules:
- `reference_mode: frame` requires `reference_frame`
- `reference_mode: external` requires `reference_file`
- `reference_file` must point to an existing PDB file
```

### Minimal plugin block

```yaml
plugins:
  rmsf:
    enabled: true
    selection: "protein and name CA"
    reference_mode: "centroid"
```

### External reference example

```yaml
plugins:
  rmsf:
    selection: "protein and name CA"
    reference_mode: "external"
    reference_file: "/path/to/crystal_structure.pdb"
```

## Output Files

RMSF writes per-replicate and aggregated JSON files under each condition's analysis directory:

```text
/
└── analysis/
    └── rmsf/
        ├── run_1/
        │   └── rmsf_eq10ns.json
        ├── run_2/
        │   └── rmsf_eq10ns.json
        ├── run_3/
        │   └── rmsf_eq10ns.json
        └── aggregated/
            └── rmsf_reps1-3_eq10ns.json
```

Comparison-level output is written separately in the comparison workspace:

```text
/
└── comparison/
    └── rmsf/
        └── result.json
```

### Per-replicate JSON (`RMSFResult`)

```python
{
  "config_hash": "abc123...",
  "replicate": 1,
  "equilibration_time": 10.0,
  "equilibration_unit": "ns",
  "selection_string": "protein and name CA",
  "correlation_time": 15394.5,
  "correlation_time_unit": "ps",
  "n_independent_frames": 6,
  "residue_ids": [1, 2, 3],
  "residue_names": ["MET", "ALA", "SER"],
  "rmsf_values": [0.45, 0.52, 0.49],
  "mean_rmsf": 0.621,
  "std_rmsf": 0.215,
  "min_rmsf": 0.248,
  "max_rmsf": 3.160,
  "reference_mode": "centroid",
  "reference_frame": 401,
  "alignment_selection": "protein and name CA",
  "reference_file": null,
  "n_frames_total": 10000,
  "n_frames_used": 9000,
  "trajectory_files": [".../prod_1.xtc"]
}
```

### Aggregated JSON (`RMSFAggregatedResult`)

```python
{
  "replicates": [1, 2, 3],
  "n_replicates": 3,
  "residue_ids": [1, 2, 3],
  "residue_names": ["MET", "ALA", "SER"],
  "mean_rmsf_per_residue": [0.46, 0.50, 0.47],
  "sem_rmsf_per_residue": [0.02, 0.03, 0.02],
  "per_replicate_mean_rmsf": [0.64, 0.59, 0.63],
  "overall_mean_rmsf": 0.62,
  "overall_sem_rmsf": 0.02,
  "overall_min_rmsf": 0.30,
  "overall_max_rmsf": 4.21
}
```

### Comparison JSON (`result.json`)

```python
{
  "metric": "rmsf",
  "conditions": [
    {
      "label": "No Polymer",
      "n_replicates": 3,
      "mean_rmsf": 0.715,
      "sem_rmsf": 0.020,
      "replicate_values": [0.755, 0.693, 0.696]
    },
    {
      "label": "With Polymer",
      "n_replicates": 3,
      "mean_rmsf": 0.551,
      "sem_rmsf": 0.034,
      "replicate_values": [0.590, 0.520, 0.542]
    }
  ],
  "pairwise_comparisons": [
    {
      "condition_a": "No Polymer",
      "condition_b": "With Polymer",
      "percent_change": -22.9,
      "p_value": 0.0211,
      "cohens_d": 4.06,
      "significant": true,
      "direction": "stabilizing"
    }
  ],
  "ranking": ["With Polymer", "No Polymer"]
}
```

## Plot Types

RMSF plots are generated by `polyzymd compare plot-all -f comparison.yaml`.

| Plot output | Description |
|-------------|-------------|
| `rmsf_profile.png` | Per-residue RMSF profile by condition; optional SEM shading |
| `rmsf_comparison.png` | Horizontal bar chart of condition-level mean RMSF with SEM |

The profile plot can include a reference secondary-structure annotation row when `reference_file` is set and readable.

### RMSF plot settings

```yaml
plot_settings:
  rmsf:
    show_error: true               # Show SEM band/bars
    highlight_residues: [77, 133]  # Vertical guide lines in profile plot
    figsize_profile: [14, 4]       # Profile figure size
    figsize_comparison: [8, 6]     # Comparison figure size
```

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `show_error` | `bool` | `true` | Show SEM shading/bars |
| `highlight_residues` | `list[int]` | `[]` | Residue numbers to mark on profile plot |
| `figsize_profile` | `tuple[float, float]` | `[14, 4]` | Profile figure size |
| `figsize_comparison` | `tuple[float, float]` | `[8, 6]` | Comparison figure size |

## Common CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| `-f, --file` | `comparison.yaml` | Path to comparison configuration |
| `--eq-time` | `0ns` | Equilibration time to skip |
| `--recompute` | off | Ignore cached results and recompute |
| `--format` | `table` | Output format (`table` or `json`) |
| `-o, --output` | (none) | Save formatted output to file |
| `-q, --quiet` | off | Suppress INFO messages |
| `--debug` | off | Enable DEBUG logging |

## Troubleshooting

### "Selection matched no atoms"

**Cause:** The MDAnalysis selection does not match atoms in the topology.

**Fix:**
- Check residue numbering and atom names in your input structure
- Start with `selection: "protein and name CA"`
- Re-run with `--debug` for detailed selection diagnostics

### "reference_file does not exist"

**Cause:** `reference_mode: external` is set, but the path is invalid.

**Fix:** Use an absolute path or a path relative to your working directory.
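
To rule out a path problem before re-running, one option is to resolve the `reference_file` value yourself and check that it points to a file. The sketch below is a standalone check, not part of the plugin: the helper name `resolve_reference_file` and its `base_dir` parameter are hypothetical, and it assumes relative paths are interpreted against the current working directory.

```python
from pathlib import Path


def resolve_reference_file(reference_file, base_dir=None):
    """Return the absolute path of reference_file, or raise if it is missing.

    Hypothetical helper (not a plugin API): relative paths are resolved
    against base_dir, which defaults to the current working directory.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = Path(reference_file).expanduser()
    if not path.is_absolute():
        path = base / path
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"reference_file does not exist: {path}")
    return path
```

Calling `resolve_reference_file("crystal_structure.pdb")` from the directory where you launch the analysis shows the absolute path that a relative entry would refer to under this assumption.
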

### "External PDB atom count does not match trajectory selection"

**Cause:** The `selection` string resolves to different atom counts in trajectory and external reference.

**Fix:**
- Ensure both structures use compatible atom naming
- Use a stricter selection that matches in both systems
- Confirm the external PDB contains the same residue set

### Very high RMSF values (> 10 Å)

**Cause:** Usually alignment mismatch, overly broad selection, or genuine structural instability.

**Fix:**
- Verify `alignment_selection` and `selection`
- Try `reference_mode: "average"` as a cross-check
- Confirm trajectory files are complete

### "Low statistical reliability" warning

**Cause:** Correlation time is large relative to available production data.

**Fix:**
- Use more replicates
- Extend simulation length
- Treat the result as qualitative if uncertainty is large

For interpretation guidance, see {doc}`../explanation/analysis_rmsf_best_practices`.

### Missing replicate data

**Message:** `Skipping replicate N: trajectory data not found`

**Cause:** Replicate output is missing or path configuration is incorrect.

**Fix:** Analysis continues with available replicates. Verify simulation completion and file paths if missing replicates are unexpected.
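
To see which replicates ended up without results, one option is to scan the per-replicate output layout described under Output Files. The sketch below is illustrative only: the function name `find_missing_rmsf_results` is hypothetical, and it assumes results are stored as `analysis/rmsf/run_N/rmsf_*.json` under the condition directory you pass in.

```python
from pathlib import Path


def find_missing_rmsf_results(condition_dir, replicates):
    """Return replicate numbers that have no per-replicate RMSF JSON file.

    Hypothetical helper: assumes the layout
    <condition_dir>/analysis/rmsf/run_<N>/rmsf_*.json.
    """
    rmsf_dir = Path(condition_dir) / "analysis" / "rmsf"
    missing = []
    for rep in replicates:
        run_dir = rmsf_dir / f"run_{rep}"
        # A replicate counts as present only if its directory holds a result file.
        has_result = run_dir.is_dir() and any(run_dir.glob("rmsf_*.json"))
        if not has_result:
            missing.append(rep)
    return missing
```

A non-empty return value lists the replicates to check for incomplete simulations or misconfigured paths.
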