Distances Plugin Reference
For a step-by-step task guide, see Distance Analysis: Quick Start.
Configuration Reference
All fields for plugins.distances:
| Field | Type | Default | Description |
|---|---|---|---|
| | | | Global default threshold in Angstroms |
| pairs | | required | One or more named distance pairs |
| use_pbc | | | Apply periodic boundary conditions using minimum-image distance |
| align_trajectory | | | Align trajectory before distance calculation |
| | | | MDAnalysis selection used for alignment |
| | | | Alignment reference mode: |
| | | | Reference frame index when |
Each entry in pairs:
| Field | Type | Default | Description |
|---|---|---|---|
| label | | required | Human-readable pair label |
| selection_a | | required | Selection for group A |
| selection_b | | required | Selection for group B |
| | | global | Per-pair threshold override |
| | | | Label for distance |
| | | | Label for distance |
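The "global" default on the per-pair threshold means that a pair without its own override inherits the plugin-wide threshold. A minimal Python sketch of that fallback follows. The key name threshold and the helper resolve_thresholds are assumptions made for illustration (the config field names are not confirmed here), and this is not PolyzyMD's implementation.

```python
def resolve_thresholds(plugin_config):
    """Map each pair label to its effective threshold in Angstroms.

    Assumes (hypothetically) that both the global setting and the
    per-pair override are stored under a key named "threshold".
    """
    global_threshold = plugin_config["threshold"]
    resolved = {}
    for pair in plugin_config["pairs"]:
        # A per-pair value wins; otherwise fall back to the global default.
        resolved[pair["label"]] = pair.get("threshold", global_threshold)
    return resolved


example_config = {
    "threshold": 3.5,  # hypothetical global default
    "pairs": [
        {"label": "Ser77-His156", "selection_a": "...", "selection_b": "..."},
        {
            "label": "His156-Asp133",
            "selection_a": "...",
            "selection_b": "...",
            "threshold": 4.0,  # hypothetical per-pair override
        },
    ],
}
effective_thresholds = resolve_thresholds(example_config)
```

With this example, the first pair inherits the global value and the second keeps its own.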
Selection syntax extensions
Distances supports MDAnalysis selections plus helper wrappers:
| Syntax | Description | Typical use |
|---|---|---|
| midpoint(...) | Geometric midpoint of selected atoms | Carboxylate oxygens (Asp/Glu) |
| com(...) | Center of mass of selected atoms | Whole residues, ligands, or rings |
| pdbindex N | Atom by PDB serial index (1-indexed) | Copying atom IDs from PDB/PyMOL |
Example pair definitions:
plugins:
  distances:
    pairs:
      - label: "His156-Asp133"
        selection_a: "protein and resid 156 and name ND1"
        selection_b: "midpoint(protein and resid 133 and name OD1 OD2)"
      - label: "Ligand-COM to Ser77"
        selection_a: "com(resname LIG)"
        selection_b: "protein and resid 77 and name OG"
      - label: "Restraint atom check"
        selection_a: "pdbindex 2740"
        selection_b: "pdbindex 3011"
Important
Residue indices restart by chain in PolyzyMD systems. For protein residues,
prefer protein and resid ... to avoid accidental multi-chain matches.
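To make the difference between the midpoint(...) and com(...) wrappers concrete, the following Python sketch computes both reductions on toy coordinates. It illustrates only the underlying geometry; the function names and coordinates are invented for the example and are not the plugin's code.

```python
def midpoint(positions):
    """Geometric midpoint: the unweighted average of the coordinates."""
    n = len(positions)
    return tuple(sum(p[axis] for p in positions) / n for axis in range(3))


def center_of_mass(positions, masses):
    """Mass-weighted average of the coordinates."""
    total = sum(masses)
    return tuple(
        sum(m * p[axis] for p, m in zip(positions, masses)) / total
        for axis in range(3)
    )


# Two carboxylate oxygens have equal masses, so both reductions agree.
od1, od2 = (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)
carboxylate_mid = midpoint([od1, od2])
carboxylate_com = center_of_mass([od1, od2], [15.999, 15.999])

# A carbon and an oxygen: the center of mass shifts toward the heavier oxygen.
carbon, oxygen = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
mixed_mid = midpoint([carbon, oxygen])
mixed_com = center_of_mass([carbon, oxygen], [12.011, 15.999])
```

For groups of identical atoms the two wrappers give the same point; they diverge once the selection mixes elements, which is why com(...) suits whole residues or ligands.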
PBC and alignment behavior
use_pbc: true computes minimum-image distances for wrapped trajectories
align_trajectory: true removes global rotation/translation before analysis
Orthorhombic boxes are fully supported for PBC correction
Triclinic boxes trigger a warning and fall back to Euclidean distance
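The minimum-image convention for an orthorhombic box can be sketched in a few lines of Python. This is a generic textbook formulation with illustrative names and coordinates, shown to clarify what the PBC correction does; it is not taken from the PolyzyMD source.

```python
import math


def minimum_image_distance(a, b, box):
    """Distance between points a and b in an orthorhombic periodic box.

    box holds the three edge lengths (lx, ly, lz) in the same units
    as the coordinates.
    """
    squared = 0.0
    for xa, xb, length in zip(a, b, box):
        delta = xb - xa
        # Shift by whole box lengths so that |delta| <= length / 2.
        delta -= length * round(delta / length)
        squared += delta * delta
    return math.sqrt(squared)


box = (10.0, 10.0, 10.0)
atom_a = (0.5, 5.0, 5.0)
atom_b = (9.5, 5.0, 5.0)

euclidean = math.dist(atom_a, atom_b)
wrapped = minimum_image_distance(atom_a, atom_b, box)
```

Here the two atoms sit on opposite faces of the box: the plain Euclidean distance is 9.0, while the minimum-image distance is 1.0. The per-axis shift is only valid when the box vectors are orthogonal, which is consistent with triclinic boxes needing separate handling.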
Alignment reference options:
| Mode | Description | Best fit |
|---|---|---|
| centroid | Align to most populated conformation | General default |
| | Align to a specific frame index | Reproducible fixed-reference comparisons |
Cache invalidation by settings
Distances cache keys include equilibration and geometry settings. Changing PBC/alignment settings produces new cache filenames automatically, for example:
distances_Ser77-His156_eq10ns_pbc_align-centroid.json
distances_Ser77-His156_eq10ns_nopbc_noalign.json
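The naming pattern in the two example filenames above can be mimicked with a small Python helper. The function name, its parameters, and the token layout are hypothetical: they are inferred from those two examples only, and the real cache key may encode more settings.

```python
def cache_filename(pair_label, eq_time, use_pbc, align_mode=None):
    """Build a cache filename that mirrors the two documented examples.

    Hypothetical helper: the token order is inferred from the examples,
    not from the PolyzyMD source.
    """
    pbc_token = "pbc" if use_pbc else "nopbc"
    align_token = f"align-{align_mode}" if align_mode else "noalign"
    return f"distances_{pair_label}_eq{eq_time}_{pbc_token}_{align_token}.json"


with_geometry = cache_filename("Ser77-His156", "10ns", True, "centroid")
without_geometry = cache_filename("Ser77-His156", "10ns", False)
```

Because every geometry setting contributes a token, toggling any of them yields a different filename, so stale results are never silently reused.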
Output Files
Results are saved under your project analysis directory:
<projects_directory>/
└── analysis/
    └── distances/
        ├── run_1/
        │   ├── distances_Ser77-His156_eq10ns_pbc_align-centroid.json
        │   └── distances_His156-Asp133_eq10ns_pbc_align-centroid.json
        ├── run_2/
        │   └── ...
        ├── run_3/
        │   └── ...
        └── aggregated/
            └── distances_reps1-3_eq10ns.json
Per-replicate result structure (representative):
{
  "config_hash": "abc123...",
  "replicate": 1,
  "equilibration_time": 10.0,
  "equilibration_unit": "ns",
  "n_frames_total": 10000,
  "n_frames_used": 9000,
  "pair_results": [
    {
      "pair_label": "Ser77-His156",
      "selection1": "protein and resid 77 and name OG",
      "selection2": "protein and resid 156 and name NE2",
      "mean_distance": 3.42,
      "std_distance": 0.87,
      "sem_distance": 0.15,
      "median_distance": 3.31,
      "min_distance": 2.61,
      "max_distance": 5.87,
      "kde_peak": 3.18,
      "threshold": 3.5,
      "fraction_below_threshold": 0.624,
      "correlation_time": 245.3,
      "n_independent_frames": 34,
      "histogram_edges": [...],
      "histogram_counts": [...],
      "kde_x": [...],
      "kde_y": [...]
    }
  ]
}
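The representative numbers above are consistent with the standard error being the standard deviation divided by the square root of n_independent_frames. The Python sketch below checks that relation and shows one way a below-threshold fraction could be computed. Both helpers are illustrative assumptions, including the strict less-than comparison; they are not confirmed to match PolyzyMD's internals.

```python
import math


def sem_from_independent_frames(std_distance, n_independent_frames):
    """Standard error of the mean using the effective sample size."""
    return std_distance / math.sqrt(n_independent_frames)


def fraction_below(distances, threshold):
    """Fraction of frames with a distance strictly below the threshold.

    Whether the real analysis uses < or <= is an assumption here.
    """
    return sum(1 for d in distances if d < threshold) / len(distances)


# Values taken from the representative per-replicate result.
sem = sem_from_independent_frames(0.87, 34)

# Toy distance series in Angstroms, invented for illustration.
toy_fraction = fraction_below([2.9, 3.1, 3.6, 4.2], 3.5)
```

Rounded to two decimals, sem matches the sem_distance of 0.15 in the example, which suggests that correlated frames are discounted before the uncertainty is reported.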
Aggregated result structure (representative):
{
  "replicates": [1, 2, 3],
  "n_replicates": 3,
  "pair_summaries": [
    {
      "pair_label": "Ser77-His156",
      "overall_mean": 3.39,
      "overall_sem": 0.11,
      "overall_median": 3.28,
      "per_replicate_means": [3.42, 3.31, 3.45],
      "per_replicate_sems": [0.15, 0.13, 0.16],
      "threshold": 3.5,
      "overall_fraction_below_threshold": 0.61
    }
  ]
}
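As a quick sanity check when consuming an aggregated file, the Python sketch below parses a trimmed copy of the representative structure and compares overall_mean with the plain average of per_replicate_means. That the two agree is an observation about this example; whether PolyzyMD computes the overall mean this way is an assumption.

```python
import json
import statistics

# Trimmed copy of the representative aggregated result shown above.
aggregated_text = """
{
  "replicates": [1, 2, 3],
  "n_replicates": 3,
  "pair_summaries": [
    {
      "pair_label": "Ser77-His156",
      "overall_mean": 3.39,
      "per_replicate_means": [3.42, 3.31, 3.45],
      "threshold": 3.5,
      "overall_fraction_below_threshold": 0.61
    }
  ]
}
"""

aggregated = json.loads(aggregated_text)

rows = []
for summary in aggregated["pair_summaries"]:
    # Average the replicate means and round to the precision of the report.
    replicate_mean = round(statistics.mean(summary["per_replicate_means"]), 2)
    rows.append((summary["pair_label"], summary["overall_mean"], replicate_mean))
```

For a real file, replace the inline string with the contents of the JSON file in the aggregated/ directory.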
Plot Types
Generate figures with:
polyzymd compare plot-all -f comparison.yaml
Distances plot outputs:
| Plot output | Description |
|---|---|
| | Distribution overlays across conditions for one pair |
| | Grouped bars of fraction below threshold |
| | Per-pair below/above threshold state summary |
plot_settings.distances options:
| Field | Default | Description |
|---|---|---|
| | | Draw threshold line on distributions |
| | | Use KDE overlays (else histogram emphasis) |
| | | Generate per-pair state bar figures |
Common CLI Options
| Option | Default | Description |
|---|---|---|
| -f | | Comparison config path |
| --eq-time | | Equilibration time to discard |
| --recompute | off | Ignore cached results and recompute |
| --format | | Output format ( |
| | (none) | Write formatted output to a file |
| | off | Suppress INFO logs |
| --debug | off | Enable DEBUG logging |
Typical run commands:
polyzymd compare run distances -f comparison.yaml --eq-time 10ns
polyzymd compare run distances -f comparison.yaml --eq-time 10ns --recompute --format json
polyzymd compare run-all -f comparison.yaml --eq-time 10ns
Troubleshooting
“Selection matched no atoms”
Cause: Selection string does not match topology atoms.
Fix:
Add chain-aware qualifiers, such as protein and resid ...
Verify atom names and residue IDs in your topology
Re-run with --debug for expanded selection diagnostics
Very wide or multimodal distance distribution
Cause: Selection may include flexible groups or unintended atoms.
Fix:
Confirm each selection resolves to the intended atom/group
Use midpoint(...) or com(...) where chemically appropriate
Inspect atom selections in a molecular viewer
Apparent long-distance outliers near box boundaries
Cause: PBC handling disabled or unsupported box geometry.
Fix:
Ensure use_pbc: true is set
Check logs for triclinic fallback warnings
Compare with aligned trajectories to reduce rigid-body artifacts
“Low statistical reliability” warning
Cause: Correlation time is large relative to trajectory length.
Fix:
Add replicates and compare aggregated results
Extend production trajectory length
Treat uncertainty estimates as conservative qualitative guidance
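To see why a long correlation time reduces reliability, the Python sketch below estimates a statistical inefficiency (one plus twice the sum of the normalized autocorrelation, truncated at the first non-positive lag) and the resulting effective sample size. This is one common estimator, chosen for illustration; it is not necessarily the one PolyzyMD uses, and the toy series are invented.

```python
def statistical_inefficiency(series):
    """Return g = 1 + 2 * sum of autocorrelations over positive lags.

    The sum stops at the first lag whose autocorrelation is not positive.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    variance = sum(d * d for d in dev) / n
    if variance == 0.0:
        return 1.0  # a constant series carries no correlation information
    g = 1.0
    for lag in range(1, n // 2):
        pairs = n - lag
        corr = sum(dev[i] * dev[i + lag] for i in range(pairs)) / (pairs * variance)
        if corr <= 0.0:
            break
        g += 2.0 * corr
    return g


# Slowly switching series: long stretches in each state.
blocky = ([0.0] * 50 + [1.0] * 50) * 2
# Rapidly alternating series: no positive correlation between frames.
alternating = [0.0, 1.0] * 100

g_blocky = statistical_inefficiency(blocky)
g_alternating = statistical_inefficiency(alternating)
n_effective_blocky = len(blocky) / g_blocky
```

Both series contain 200 frames, but the slowly switching one yields far fewer effectively independent samples, which is the situation the warning flags.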
Missing replicate data
Message: Skipping replicate N: trajectory data not found
Cause: Replicate output is missing or incomplete.
Fix:
Confirm the requested replicate finished simulation
Verify scratch/project path mapping in config
Re-run analysis after data is available