Tutorial: Run Your First Analysis
This tutorial walks you from finished trajectory files to your first analysis result. You will run RMSF analysis on a single simulation condition using the comparison pipeline, and see where the results end up on disk.
What You Will Learn
How to create a comparison project for a single condition
How to run RMSF analysis using polyzymd compare run
How to read the output and find result files
Prerequisites
Before starting, make sure you have:
A completed production simulation with at least 1 replicate
The config.yaml file from that simulation
Trajectory files in the expected directory layout (see Data Requirements & Directory Layout)
PolyzyMD installed in a pixi environment (see Install PolyzyMD with pixi)
If you have not run a simulation yet, complete Run Your First PolyzyMD Simulation first.
Step 1: Create a Comparison Project
From the directory where you keep your simulation projects, run:
pixi run -e build polyzymd compare init -n my_first_analysis
cd my_first_analysis
This creates a small project scaffold:
my_first_analysis/
├── comparison.yaml # Analysis configuration (you will edit this)
├── comparison/ # Where result JSON files are written
├── figures/ # Where plots are saved
└── structures/ # Optional shared structure files
The generated comparison.yaml is a template with placeholder values. You
will replace them in the next step.
Step 2: Edit comparison.yaml
Open comparison.yaml in your editor and replace the contents with a minimal
single-condition configuration:
name: "my_first_analysis"
description: "First analysis run"
control: null

conditions:
  - label: "My Simulation"
    config: "/path/to/my_simulation/config.yaml"
    replicates: [1]

defaults:
  equilibration_time: "10ns"

plugins:
  rmsf:
    selection: "protein and name CA"
Here is what each section does:
name and description: identify this comparison project.
control: the label of the control condition for statistical tests. Set to null when you only have one condition.
conditions: a list of simulation conditions to analyze. Each entry needs a label, a path to that simulation's config.yaml, and which replicates to include.
defaults.equilibration_time: how much time at the start of each trajectory to discard before analysis. Adjust to match your system's equilibration period.
plugins.rmsf: settings for the RMSF analysis plugin. The selection field is an MDAnalysis atom selection string.
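To make the roles of equilibration_time and the RMSF calculation concrete, here is a minimal pure-Python sketch of the standard RMSF definition: for each selected atom, the square root of the time-averaged squared deviation from its mean position. It is an illustration only, not PolyzyMD's implementation. The function name rmsf_per_atom, the input layout, and the choice to keep frames at exactly the cutoff time are assumptions, and the sketch assumes the frames have already been aligned to a common reference.

```python
import math


def rmsf_per_atom(frames, times_ns, eq_time_ns=0.0):
    """Return one RMSF value per atom from already-aligned coordinates.

    frames   : list of frames; each frame is a list of (x, y, z) tuples
    times_ns : simulation time of each frame, in nanoseconds
    (hypothetical helper for illustration, not part of PolyzyMD)
    """
    # Discard the equilibration portion at the start of the trajectory.
    # Keeping frames at exactly the cutoff is an assumption of this sketch.
    kept = [f for f, t in zip(frames, times_ns) if t >= eq_time_ns]
    if not kept:
        raise ValueError("no frames left after discarding equilibration")

    n_frames = len(kept)
    n_atoms = len(kept[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the kept frames.
        mean = [sum(f[i][k] for f in kept) / n_frames for k in range(3)]
        # Mean squared deviation of atom i from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in kept
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 never moves, atom 1 oscillates by 1 unit in x.
# The first frame (t = 0 ns) is an outlier that the cutoff removes.
frames = [
    [(0.0, 0.0, 0.0), (50.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsf_per_atom(frames, [0.0, 10.0, 20.0], eq_time_ns=10.0))  # -> [0.0, 1.0]
```

Without the cutoff, the outlier frame would inflate the RMSF of atom 1, which is why the equilibration period should match your system.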
Important
The config path must point to the simulation project’s config.yaml. This
is how PolyzyMD locates your topology and trajectory files on disk. Relative
paths are resolved from the directory containing comparison.yaml.
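If you are unsure whether a relative config path points where you expect, the resolution rule described above can be sketched in a few lines of Python. This is an illustration of the rule, not PolyzyMD's own code; resolve_config_path is a hypothetical helper and the example paths are placeholders.

```python
from pathlib import Path


def resolve_config_path(comparison_yaml, config_value):
    """Resolve a condition's `config` entry to an absolute path.

    Hypothetical helper: mirrors the documented rule that relative paths
    are resolved from the directory containing comparison.yaml.
    """
    config_path = Path(config_value)
    if not config_path.is_absolute():
        # Relative entries are anchored at the folder holding comparison.yaml.
        config_path = Path(comparison_yaml).resolve().parent / config_path
    return config_path.resolve()


# Placeholder paths: substitute your own project and simulation folders.
path = resolve_config_path(
    "my_first_analysis/comparison.yaml", "../my_simulation/config.yaml"
)
print(path, "exists" if path.is_file() else "MISSING")
```

A MISSING result here usually means the config entry needs correcting before you run the analysis.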
For the full list of configuration fields, see Comparison and Plotting Reference.
Step 3: Run RMSF Analysis
Run the analysis with:
pixi run -e build polyzymd compare run rmsf -f comparison.yaml --eq-time 10ns
Note
The --eq-time flag overrides defaults.equilibration_time from your YAML
file. If you omit --eq-time, the value from comparison.yaml is used. This
is handy for quickly testing different equilibration cutoffs without editing the
YAML each time.
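The precedence described in this note can be written as a small Python sketch. It only illustrates the rule (the command-line value wins, otherwise the YAML default applies); effective_eq_time is a hypothetical name and this is not how PolyzyMD is necessarily implemented.

```python
def effective_eq_time(cli_eq_time, yaml_defaults):
    """Return the equilibration time that applies to a run.

    cli_eq_time   : value passed with --eq-time, or None if the flag is omitted
    yaml_defaults : the `defaults` mapping from comparison.yaml
    (hypothetical helper for illustration)
    """
    if cli_eq_time is not None:
        # An explicit --eq-time overrides the YAML default.
        return cli_eq_time
    return yaml_defaults.get("equilibration_time")


defaults = {"equilibration_time": "10ns"}
print(effective_eq_time("20ns", defaults))  # -> 20ns
print(effective_eq_time(None, defaults))    # -> 10ns
```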
Tip
On an HPC cluster? Use polyzymd compare submit instead of compare run
to dispatch analysis as SLURM jobs. This is especially important for expensive
analyses (SASA, contacts, hydrogen bonds) on large studies. See
How To: Submit Analysis Jobs to a SLURM Cluster for the full workflow.
You should see output similar to:
Comparison: my_first_analysis
Plugin: rmsf
Conditions: 1
Equilibration: 10ns
[My Simulation] Computing replicate 1...
Loading trajectory (skipping first 10 ns)...
RMSF computed (142 residues, 490 frames)
[My Simulation] Aggregating 1 replicate...
RMSF Comparison Complete
My Simulation: mean RMSF = 0.621 ± 0.015 Å
Tip
If you see RMSF Comparison Complete with a mean value, the analysis succeeded.
If you see an error about a missing working directory or trajectory, check that
the config path in comparison.yaml is correct and that your trajectory
files exist on disk. See Troubleshooting for common fixes.
Step 4: Find Your Results
After the run completes, your project directory looks like this:
my_first_analysis/
├── comparison.yaml
├── analysis/
│   └── My_Simulation/
│       └── rmsf/
│           ├── run_1/
│           │   └── rmsf_eq10ns.json    # Per-replicate result
│           └── aggregated/
│               └── result.json         # Combined result
├── comparison/
│   └── rmsf/
│       └── result.json                 # Comparison summary
├── figures/
└── structures/
The key files are:
rmsf_eq10ns.json: per-replicate RMSF values for every residue in the selection, computed after discarding the first 10 ns.
aggregated/result.json: aggregated statistics across replicates (with one replicate, this matches the per-replicate file).
comparison/rmsf/result.json: the comparison-level summary with mean RMSF, standard error, and ranking information.
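To inspect these files from Python, a short sketch like the following can locate them and show what each one contains. It relies only on the directory layout shown above and makes no assumption about the JSON schema, so it lists top-level keys rather than reading specific fields. The function names are hypothetical.

```python
import json
from pathlib import Path


def find_rmsf_results(project_dir):
    """Return the RMSF result files found under a comparison project."""
    project = Path(project_dir)
    patterns = [
        "analysis/*/rmsf/run_*/rmsf_eq*.json",      # per-replicate results
        "analysis/*/rmsf/aggregated/result.json",   # combined per condition
        "comparison/rmsf/result.json",              # comparison summary
    ]
    found = []
    for pattern in patterns:
        found.extend(sorted(project.glob(pattern)))
    return found


def summarize(path):
    """Load one result file and report its top-level keys.

    The schema is not assumed: non-mapping content is reported by type name.
    """
    with open(path) as handle:
        data = json.load(handle)
    return sorted(data) if isinstance(data, dict) else type(data).__name__


# Run from inside the project directory (for example, my_first_analysis/).
for path in find_rmsf_results("."):
    print(path, summarize(path))
```

Listing the keys first is a cautious way to learn the file structure before writing code that depends on particular field names.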
Step 5: Add Plotting (Optional)
To generate figures alongside the analysis, re-run with the --plot flag on
the run-all command:
pixi run -e build polyzymd compare run-all -f comparison.yaml --eq-time 10ns --plot
Or generate plots separately after the analysis has already been cached:
pixi run -e build polyzymd compare plot-all -f comparison.yaml
Figures are saved to the figures/ directory:
figures/
└── rmsf/
    └── rmsf_profile.png
With a single condition, the profile plot shows per-residue RMSF values. Comparison bar charts appear when you add a second condition.
What’s Next
Now that you have run one analysis on one condition, here are some natural next steps:
How to Compare Simulation Conditions — Add a second condition and run a statistical comparison
Tutorial: Analyze a Study from Finished Simulations — Full multi-condition workflow with multiple analysis types
RMSF Analysis: Quick Start — RMSF-specific options (reference modes, selections, troubleshooting)
Data Requirements & Directory Layout — Directory layout reference and path resolution rules