Rg Plugin Reference
For a step-by-step guide to running Rg analysis, see Rg Analysis: Quick Start.
Configuration Reference
All fields for RgRunSettings:
| Field | Type | Default | Description |
|---|---|---|---|
| label | | required | Human-readable run label (must be unique) |
| selection | | required | MDAnalysis selection for Rg calculation |
| | | | |
| | | | |
| | | | Save per-fragment Rg values in NPZ sidecar for distribution analysis |
| | | | Number of bins for fragment/reduced distribution histograms (minimum 2) |
Top-level RgSettings contains a single field:
| Field | Type | Default | Description |
|---|---|---|---|
| runs | | required | One or more named Rg runs (at least one required) |
Note
Run labels must be unique within a single comparison.yaml. Duplicate labels
raise a validation error.
Warning
Unlike RMSD (which defaults selection to "protein and name CA"), Rg has
no default selection. You must always specify selection explicitly for
each run.
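The validation rules above (at least one run, unique labels, an explicit selection for every run) can be sketched in plain Python. This is only an illustration: `validate_rg_runs` is a hypothetical helper, not part of PolyzyMD, and the real checks live in the settings models.

```python
def validate_rg_runs(runs):
    """Hypothetical validator mirroring the documented rules for Rg runs."""
    if not runs:
        raise ValueError("At least one Rg run must be defined")
    seen = set()
    for run in runs:
        label = run.get("label")
        if not label:
            raise ValueError("Each Rg run needs a label")
        if label in seen:
            raise ValueError(f"Rg run labels must be unique: {label!r}")
        seen.add(label)
        # Rg has no default selection, so a missing one is an error.
        if not run.get("selection"):
            raise ValueError(f"Run {label!r} has no selection")
    return runs


runs = [
    {"label": "protein_rg", "selection": "protein"},
    {"label": "polymer_blob_rg",
     "selection": "resname SBM or resname EGM or resname EGP"},
]
validate_rg_runs(runs)  # passes silently for a valid run list
```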
Fragment Mode Reference
Added in version 1.3.0: Fragment-aware Rg calculation was added in PolyzyMD 1.3.0.
Standard selection mode computes one Rg value per frame for the full atom
group matched by selection. Fragment mode first computes Rg for each
disconnected topological fragment, then reduces those per-fragment values to
one per-frame value.
Use fragment mode when your selection includes many independent molecules and you care about the average fragment size rather than the size of the entire multi-molecule cloud.
Fragment mode configuration
plugins:
rg:
runs:
- label: protein_rg
selection: protein
- label: polymer_blob_rg
selection: "resname SBM or resname EGM or resname EGP"
calculation_mode: fragments
fragment_weighting: equal
| Setting | Meaning |
|---|---|
| | Whole-group Rg |
| | Per-fragment Rg with reduction |
| | Arithmetic mean over fragments |
| | Mass-weighted mean over fragments |
How fragment mode reduction works
Identify disconnected fragments within the selected atom group
Compute fragment-level Rg for each frame
Reduce fragment values to one per-frame value using equal or mass weighting
Use the reduced timeseries for summary statistics and comparisons
Optionally save pooled fragment values in NPZ sidecars for distribution plots
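The reduction steps above can be sketched for a single frame in plain Python. This is a minimal illustration under stated assumptions, not the plugin's implementation: the function names are hypothetical, the `"mass"` option name is assumed, and mass weighting is assumed to use each fragment's total mass as its weight.

```python
import math


def radius_of_gyration(positions, masses):
    """Mass-weighted Rg of one atom group; positions are (x, y, z) tuples."""
    total = sum(masses)
    com = [sum(m * p[k] for p, m in zip(positions, masses)) / total
           for k in range(3)]
    sq = sum(m * sum((p[k] - com[k]) ** 2 for k in range(3))
             for p, m in zip(positions, masses))
    return math.sqrt(sq / total)


def reduce_fragments(fragments, weighting="equal"):
    """Reduce per-fragment Rg values to one value for the frame."""
    rgs = [radius_of_gyration(pos, m) for pos, m in fragments]
    if weighting == "equal":
        weights = [1.0] * len(rgs)
    elif weighting == "mass":  # assumed option name
        weights = [sum(m) for _, m in fragments]  # assumed: total fragment mass
    else:
        raise ValueError(f"unknown weighting: {weighting!r}")
    return sum(w * r for w, r in zip(weights, rgs)) / sum(weights)


# Two toy fragments: (positions, masses). Their Rg values are 1.0 and 3.0.
frag_a = ([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
frag_b = ([(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)], [3.0, 3.0])

equal_rg = reduce_fragments([frag_a, frag_b], "equal")  # (1 + 3) / 2
mass_rg = reduce_fragments([frag_a, frag_b], "mass")    # (2*1 + 6*3) / 8
```

With equal weighting every fragment counts the same regardless of size, while mass weighting pulls the reduced value toward the heavier fragment.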
Why Rg Has No Alignment or Reference Fields
Rg is based on mass-weighted distances from the center of mass, so it is translation and rotation invariant.
Because of that, Rg runs do not use RMSD-style fields such as
alignment_selection, reference_mode, reference_file, or
reference_frame.
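For reference, the invariance follows directly from the standard mass-weighted definition of Rg. The notation below (masses m_i, positions r_i, an orthogonal rotation Q, a translation t) is introduced here for illustration and is not taken from the PolyzyMD source.

```latex
R_g = \sqrt{\frac{\sum_i m_i \,\lVert \mathbf{r}_i - \mathbf{r}_{\mathrm{com}} \rVert^2}{\sum_i m_i}},
\qquad
\mathbf{r}_{\mathrm{com}} = \frac{\sum_i m_i \,\mathbf{r}_i}{\sum_i m_i}

% Rigid-body move: r_i -> Q r_i + t, with Q orthogonal
\mathbf{r}_{\mathrm{com}} \mapsto Q\,\mathbf{r}_{\mathrm{com}} + \mathbf{t},
\qquad
\lVert (Q\mathbf{r}_i + \mathbf{t}) - (Q\mathbf{r}_{\mathrm{com}} + \mathbf{t}) \rVert
= \lVert Q(\mathbf{r}_i - \mathbf{r}_{\mathrm{com}}) \rVert
= \lVert \mathbf{r}_i - \mathbf{r}_{\mathrm{com}} \rVert
```

Every distance entering the sum is unchanged by the move, so Rg is unchanged and no alignment step or reference structure is needed.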
Output Files
Results are saved as canonical v1.3 artifacts. JSON files use framework-owned artifact envelopes, and per-frame or distribution arrays are stored as NPZ sidecars.
<comparison_workspace>/
├── analysis/
│ └── <condition>/
│ └── rg/
│ ├── run_1/
│ │ ├── result.json
│ │ └── sidecars/
│ │ ├── rg_protein_rg_timeseries.npz
│ │ └── rg_polymer_blob_rg_timeseries.npz
│ ├── run_2/
│ │ └── ...
│ ├── run_3/
│ │ └── ...
│ └── aggregated/
│ ├── result.json
│ └── sidecars/
│ └── rg_polymer_blob_rg_distribution.npz
└── comparison/
└── rg/
└── result.json
The canonical paths are:
| Level | Artifact | Path |
|---|---|---|
| Per replicate | | |
| Per condition | | |
| Cross condition | Comparison result | |
| Large arrays | NPZ sidecars | |
Each replicate artifact contains JSON summaries for all configured runs. NPZ sidecars store per-frame Rg timeseries and optional fragment distributions.
Artifact envelope fields
| Field | Description |
|---|---|
| payload | Rg run summaries, scalar metrics, fragment statistics, and relative sidecar paths |
| metadata | Run settings, calculation modes, equilibration labels, and units |
| provenance | Input topology/trajectory identity and workflow details |
| sidecars | Validated references to |
Use ArtifactStore for programmatic access:
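from pathlib import Path

from polyzymd.analyses.mda import ArtifactStore

# Paths follow the directory layout shown above, relative to <comparison_workspace>.
replicate = ArtifactStore(Path("analysis/PEGylated/rg/run_1")).read_replicate_result()
condition = ArtifactStore(Path("analysis/PEGylated/rg/aggregated")).read_condition_result()

# Replicate payload: scalar summary for the first configured run.
print(replicate.payload["runs"][0]["mean_rg"])
# Condition payload: per-replicate values with their mean and SEM.
print(condition.payload["runs"][0]["metrics"]["mean_rg"])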
NPZ sidecar arrays
Each rg_<label>_timeseries.npz may include:
| Array | Mode | Description |
|---|---|---|
| | always | Per-frame reduced Rg timeseries (Å) |
| | always | Time axis in ns |
| | always | 0-indexed frame indices |
| | fragments only | Pooled fragment-level Rg values across all frames |
| | fragments only | Number of fragments detected per frame |
| | fragments + mass weighting | Fragment masses used for weighted reduction |
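To check which arrays a given sidecar actually contains, one option is to list its members without loading any data. An NPZ file is a zip archive with one `.npy` member per array, so the standard library is enough. `list_npz_arrays` is a hypothetical helper written for this page; with NumPy installed, `np.load(path).files` gives the same names.

```python
import zipfile
from pathlib import Path


def list_npz_arrays(path):
    """Return the array names stored in an NPZ file without loading the data."""
    with zipfile.ZipFile(path) as archive:
        return sorted(name[:-4] for name in archive.namelist()
                      if name.endswith(".npy"))


# Example path taken from the layout above; only inspected if it exists.
sidecar = Path("analysis/PEGylated/rg/run_1/sidecars/rg_protein_rg_timeseries.npz")
if sidecar.exists():
    print(list_npz_arrays(sidecar))
```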
JSON result structures
Per-replicate result (ReplicateArtifact), representative structure:
{
"schema_version": "1",
"artifact_type": "replicate",
"analysis_name": "rg",
"condition_label": "PEGylated",
"replicate": 1,
"payload": {
"runs": [
{
"run_label": "protein_rg",
"selection": "protein",
"calculation_mode": "selection",
"mean_rg": 18.234,
"sem_rg": 0.098,
"timeseries_sidecar": "sidecars/rg_protein_rg_timeseries.npz"
},
{
"run_label": "polymer_blob_rg",
"selection": "resname SBM or resname EGM or resname EGP",
"calculation_mode": "fragments",
"fragment_weighting": "equal",
"mean_rg": 8.412,
"sem_rg": 0.054,
"mean_fragments_per_frame": 50.0,
"timeseries_sidecar": "sidecars/rg_polymer_blob_rg_timeseries.npz"
}
]
},
"metadata": {"equilibration": "10ns", "time_unit": "ns"},
"provenance": {"trajectory_files": ["..."], "n_frames_used": 9000},
"sidecars": [
{"path": "sidecars/rg_protein_rg_timeseries.npz", "metadata": {"kind": "timeseries"}},
{"path": "sidecars/rg_polymer_blob_rg_timeseries.npz", "metadata": {"kind": "timeseries"}}
]
}
Aggregated result (ConditionArtifact), representative structure:
{
"schema_version": "1",
"artifact_type": "condition",
"analysis_name": "rg",
"condition_label": "PEGylated",
"replicates": [1, 2, 3],
"payload": {
"runs": [
{
"run_label": "protein_rg",
"calculation_mode": "selection",
"metrics": {
"mean_rg": {"values": [18.234, 18.291, 18.244], "mean": 18.256, "sem": 0.044}
},
"distribution_sidecar": "sidecars/rg_protein_rg_distribution.npz"
},
{
"run_label": "polymer_blob_rg",
"calculation_mode": "fragments",
"metrics": {
"mean_rg": {"values": [8.412, 8.445, 8.439], "mean": 8.432, "sem": 0.021}
},
"fragment_summary": {"overall_mean_fragments_per_frame": 50.0},
"distribution_sidecar": "sidecars/rg_polymer_blob_rg_distribution.npz"
}
]
},
"metadata": {"equilibration": "10ns"},
"provenance": {"source_replicates": [1, 2, 3]},
"sidecars": [
{"path": "sidecars/rg_polymer_blob_rg_distribution.npz", "metadata": {"kind": "distribution"}}
]
}
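For quick inspection without ArtifactStore, a replicate result.json can also be read with the standard `json` module. The sketch below assumes the representative replicate structure shown above; `summarize_runs` is a hypothetical helper, and the embedded document is a trimmed copy of that example rather than real output.

```python
import json


def summarize_runs(artifact):
    """Collect (label, mode, mean, sem) for each run in a replicate artifact."""
    return [
        (run["run_label"], run["calculation_mode"], run["mean_rg"], run["sem_rg"])
        for run in artifact["payload"]["runs"]
    ]


# Trimmed copy of the representative replicate structure.
example = json.loads("""
{
  "artifact_type": "replicate",
  "analysis_name": "rg",
  "payload": {
    "runs": [
      {"run_label": "protein_rg", "calculation_mode": "selection",
       "mean_rg": 18.234, "sem_rg": 0.098},
      {"run_label": "polymer_blob_rg", "calculation_mode": "fragments",
       "mean_rg": 8.412, "sem_rg": 0.054}
    ]
  }
}
""")

for label, mode, mean, sem in summarize_runs(example):
    print(f"{label:<18} {mode:<10} {mean:.3f} +/- {sem:.3f}")
```

For a file on disk, replace the embedded string with `json.loads(Path(...).read_text())` pointing at the replicate's result.json.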
Plot Types
The Rg plugin generates figures via polyzymd compare plot-all.
| Plot output | Description |
|---|---|
| | Mean Rg vs time with SEM shading, one figure per run |
| | Grouped bar chart of mean Rg across conditions, one figure per run |
| | Distribution view. Selection mode: reduced distribution panel only. Fragment mode: reduced + pooled fragment distributions |
Distribution plots are generated for runs that include histogram data in aggregated results.
Plot settings in comparison.yaml:
plot_settings:
rg:
show_per_replicate: false # Overlay individual replicate traces
figsize: [10, 6] # Default figure size (bar charts)
timeseries_figsize: [12, 5] # Timeseries figure size
Common CLI Options
| Option | Default | Description |
|---|---|---|
| | | Path to comparison configuration |
| --eq-time | | Equilibration time to skip |
| | off | Ignore cached results and recompute |
| | | Output format ( |
| | (none) | Save formatted output to file |
| | off | Suppress INFO messages |
| --debug | off | Enable DEBUG logging |
Replicates are configured per condition in comparison.yaml:
conditions:
- label: "no_polymer"
config: "configs/no_polymer.yaml"
replicates: [1, 3, 5]
Troubleshooting
“Selection matched no atoms”
When a run selection matches zero atoms, that run is skipped with a warning. Analysis continues for runs and conditions with valid data.
If this is unexpected:
Confirm residue numbering and atom naming in your topology
Check selection syntax directly against your system
Re-run with --debug for detailed diagnostics
“At least one Rg run must be defined”
Cause: plugins.rg.runs is missing or empty.
Fix: add at least one run with label and selection.
“Rg run labels must be unique”
Cause: duplicate run labels.
Fix: assign a unique label to each run.
“Equilibration removed all frames”
Cause: --eq-time exceeds trajectory duration.
Fix: lower equilibration time or verify simulation completion.
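A small sketch shows how this situation arises. It assumes that frames at or after the equilibration cutoff are kept, which may differ in detail from the plugin's trimming rule; `frames_after_equilibration` and the time values are hypothetical.

```python
def frames_after_equilibration(times_ns, eq_time_ns):
    """Indices of frames at or after the equilibration cutoff (assumed rule)."""
    return [i for i, t in enumerate(times_ns) if t >= eq_time_ns]


times_ns = [0.0, 2.0, 4.0, 6.0, 8.0]  # a trajectory that stopped at 8 ns
kept = frames_after_equilibration(times_ns, eq_time_ns=10.0)
if not kept:
    print("Equilibration removed all frames")
```

Lowering the cutoff to 4 ns in this toy case would keep the last three frames.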
Very large Rg fluctuations (> 5 Å std)
Cause: often unfolding, large flexibility, or an overly broad selection.
Fix:
Validate the selection (whole protein vs backbone vs core)
Check trajectory integrity
Inspect timeseries for transitions or discontinuities
“Low statistical reliability” warning
Cause: correlation time is long relative to trajectory length.
This is informational. Consider longer simulations or more replicates.
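To get a feel for why long correlation times reduce reliability, a common rule of thumb estimates the number of effectively independent samples as the trajectory length divided by twice the correlation time. The sketch below uses that rule with hypothetical numbers; the criterion the plugin actually applies before emitting the warning is not documented here.

```python
def effective_samples(trajectory_ns, correlation_time_ns):
    """Rule-of-thumb count of independent samples: T / (2 * tau)."""
    if correlation_time_ns <= 0:
        raise ValueError("correlation time must be positive")
    return trajectory_ns / (2.0 * correlation_time_ns)


# Hypothetical example: 90 ns of production data with a 15 ns correlation time.
n_eff = effective_samples(90.0, 15.0)
print(f"about {n_eff:.0f} independent samples")
```

Only a handful of independent samples per replicate means the within-replicate SEM is poorly determined, which is why more replicates or longer runs help.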
Missing replicate data
Message: Skipping replicate N: trajectory data not found
Cause: trajectory files are missing or replicate is incomplete.
Fix: verify simulation output paths and replicate status.
Control condition missing for a run in fragment-mode workflows
Message: control has no data for a run and comparison falls back to all-vs-all pairwise testing for that run.
Cause: control selection matched no atoms for that run.
Behavior: expected in mixed run sets (for example, polymer-only runs with a no-polymer control).
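The fallback described above can be illustrated with a short sketch. It is not the plugin's code: `comparison_pairs` is a hypothetical helper, and apart from no_polymer and PEGylated the condition names are made up.

```python
from itertools import combinations


def comparison_pairs(conditions_with_data, control):
    """Pairs to test for one run: control-vs-each, or all-vs-all as fallback."""
    if control in conditions_with_data:
        return [(control, c) for c in conditions_with_data if c != control]
    # Control has no data for this run: compare every pair of conditions.
    return list(combinations(conditions_with_data, 2))


# A polymer-only run: the no_polymer control matched no atoms, so it is absent.
polymer_run = comparison_pairs(["PEGylated", "SBMA", "EGMA"], control="no_polymer")

# A protein run: every condition, including the control, has data.
protein_run = comparison_pairs(["no_polymer", "PEGylated", "SBMA"], control="no_polymer")
```

The polymer-only run yields three pairwise comparisons among the polymer conditions, while the protein run yields two control-versus-treatment comparisons.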
Rg vs Other Metrics
Rg vs RMSD
| Feature | Rg | RMSD |
|---|---|---|
| Measures | Structural compactness (mass-weighted size) | Deviation from a reference structure |
| Output | One value per frame (timeseries) | One value per frame (timeseries) |
| Reference | None required | Required ( |
| Alignment | Not required | Required |
| Configuration | | |
| Direction labels | | |
| Best for | Compaction, swelling, folding state shifts | Drift and reference-relative stability |
Rg vs RMSF
| Feature | Rg | RMSF |
|---|---|---|
| Measures | Global compactness over time | Per-residue positional fluctuation |
| Primary output | Timeseries | Residue profile |
| Best question | Is the structure compacting or expanding? | Which regions are flexible or rigid? |