Tutorial: Analyze a Study from Finished Simulations

This tutorial walks through one complete PolyzyMD analysis workflow, from finished simulations to comparison figures:

  • you start with three simulation conditions that have already finished

  • you create one comparison.yaml

  • you compare the conditions

  • you finish with polyzymd compare plot-all as the smoke test

By the end, you will have a working comparison workspace with JSON results and figures for a small three-condition study.

What You Will Learn

  • How to initialize a comparison workspace with polyzymd compare init

  • How to write a comparison.yaml that defines conditions and analysis plugins

  • How to run cross-condition comparisons and generate figures

  • What the output directory structure looks like after a successful run

Prerequisites

Before starting, make sure you have:

  • Completed production trajectories for at least three conditions (DCD format in PolyzyMD’s standard directory layout)

  • One config.yaml per condition

  • A topology file (for example, solvated_system.pdb) produced during the build

  • PolyzyMD installed in a pixi environment (see Install PolyzyMD with pixi)

If you have not run a single-condition analysis yet, complete Tutorial: Run Your First Analysis first.

Important

This tutorial uses the stable v1.3.0 comparison stack: RMSD, Rg, RMSF, contacts, distances, catalytic triad, secondary structure, SASA, and hydrogen bonds. Experimental workflows are linked at the end, but they are not part of the main tutorial path.

The Study We Will Analyze

We will assume a project laid out like this:

my_enzyme_study/
├── noPoly_enzyme_DMSO/
│   ├── config.yaml
│   └── scratch/
├── SBMA_100_enzyme_DMSO/
│   ├── config.yaml
│   └── scratch/
└── EGMA_100_enzyme_DMSO/
    ├── config.yaml
    └── scratch/

The scratch/ directories may be symlinks to large trajectory storage on your cluster. PolyzyMD resolves those paths through each condition’s config.yaml.
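
Before creating the comparison workspace, it can help to confirm that the layout above is really in place. The following is a minimal sketch, not part of PolyzyMD: the function name check_condition_dirs is hypothetical, and it assumes only what the tree shows, namely that each condition directory holds a config.yaml and a scratch/ entry.

```python
from pathlib import Path

# Condition directory names taken from the example layout above.
CONDITIONS = ["noPoly_enzyme_DMSO", "SBMA_100_enzyme_DMSO", "EGMA_100_enzyme_DMSO"]


def check_condition_dirs(study_root, condition_dirs):
    """Return a list of layout problems; an empty list means nothing is missing."""
    problems = []
    root = Path(study_root)
    for name in condition_dirs:
        cond = root / name
        if not (cond / "config.yaml").is_file():
            problems.append(f"{name}: missing config.yaml")
        # Path.exists() follows symlinks, so a dangling scratch link is reported too.
        if not (cond / "scratch").exists():
            problems.append(f"{name}: scratch/ is missing or a broken symlink")
    return problems


if __name__ == "__main__":
    for problem in check_condition_dirs("my_enzyme_study", CONDITIONS):
        print(problem)
```

Run it from the directory that contains my_enzyme_study/. No output means every condition has both entries. The script does not look inside scratch/, so it says nothing about whether the trajectories themselves are complete.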

Step 1: Create the Comparison Workspace

From the study root, initialize a comparison project and move into it:

cd my_enzyme_study
pixi run -e build polyzymd compare init -n polymer_stabilization_study
cd polymer_stabilization_study

Now edit comparison.yaml to point at the three conditions and define the analysis settings:

name: "polymer_stabilization_study"
description: "Effect of SBMA vs EGMA polymer conjugation on enzyme stability"
control: "No Polymer"  # label of the control condition defined below

conditions:
  - label: "No Polymer"
    config: "../noPoly_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

  - label: "100% SBMA"
    config: "../SBMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

  - label: "100% EGMA"
    config: "../EGMA_100_enzyme_DMSO/config.yaml"
    replicates: [1, 2, 3]

defaults:
  equilibration_time: "10ns"

plugins:
  rmsf:
    selection: "protein and name CA"
    reference_mode: "average"

  catalytic_triad:
    name: "Ser-His-Asp"
    threshold: 3.5
    pairs:
      - label: "Ser77-His156"
        selection_a: "protein and resid 77 and name OG"
        selection_b: "protein and resid 156 and name NE2"
      - label: "His156-Asp133"
        selection_a: "protein and resid 156 and name ND1"
        selection_b: "midpoint(protein and resid 133 and name OD1 OD2)"

  distances:
    pairs:
      - label: "Substrate-Ser77"
        selection_a: "resname SUB and name C1"
        selection_b: "protein and resid 77 and name OG"

  contacts:
    polymer_selection: "chainID C"
    protein_selection: "protein"
    cutoff: 4.5
    compute_residence_times: true
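
To make the catalytic_triad settings more concrete, here is a small illustrative sketch of the geometry they appear to describe. It is not PolyzyMD's implementation. It assumes that threshold is a distance cutoff in Å, that midpoint(...) refers to the point halfway between the selected atoms, and that a frame counts as intact only when every pair is within the threshold. The coordinates and distances are made up, and all function names are hypothetical.

```python
import math


def midpoint(points):
    """Geometric midpoint (mean position) of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def fraction_within_threshold(frames, threshold):
    """Fraction of frames in which every pair distance is <= threshold.

    frames is a list with one entry per frame; each entry lists the
    distances of all configured pairs in that frame.
    """
    if not frames:
        return 0.0
    hits = sum(all(d <= threshold for d in frame) for frame in frames)
    return hits / len(frames)


# Made-up coordinates (in Angstrom) for one frame of the His156-Asp133 pair.
his_nd1 = (0.0, 0.0, 0.0)
asp_od1 = (2.0, 2.0, 0.0)
asp_od2 = (2.0, -2.0, 0.0)

asp_mid = midpoint([asp_od1, asp_od2])   # halfway between OD1 and OD2
d = math.dist(his_nd1, asp_mid)          # distance from ND1 to that midpoint

# Made-up per-frame distances: [Ser77-His156, His156-Asp133].
frames = [[2.8, 3.0], [3.2, 3.9], [3.4, 3.5], [4.1, 2.9]]
frac = fraction_within_threshold(frames, threshold=3.5)

print(asp_mid, d, frac)
```

With these invented numbers, two of the four frames have both distances at or below 3.5, so the fraction is 0.5. Measuring to the carboxylate midpoint rather than to OD1 or OD2 alone keeps the distance from jumping when the two oxygens swap roles.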

Step 2: Validate the Comparison Config

pixi run -e build polyzymd compare validate

You should see a passing summary that lists the three conditions and the enabled analyses.
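
If validation fails and the message is not obvious, a few structural checks on the parsed file can narrow things down. The sketch below is a hypothetical pre-check, not a replacement for polyzymd compare validate. It assumes that control should equal one of the condition labels and that each condition should list at least one replicate. EXAMPLE_CFG is an abbreviated hand-written copy of the tutorial's comparison.yaml.

```python
def precheck_comparison(cfg):
    """Return a list of structural problems found in a parsed comparison config."""
    problems = []
    conditions = cfg.get("conditions", [])
    labels = [c.get("label") for c in conditions]

    if len(labels) != len(set(labels)):
        problems.append("duplicate condition labels")

    control = cfg.get("control")
    # Assumption: the control entry names one of the condition labels.
    if control is not None and control not in labels:
        problems.append(f"control {control!r} does not match any condition label")

    for cond in conditions:
        if not cond.get("replicates"):
            problems.append(f"{cond.get('label')!r}: no replicates listed")

    if not cfg.get("plugins"):
        problems.append("no analysis plugins configured")
    return problems


# Abbreviated copy of the tutorial's comparison.yaml as a Python dict.
EXAMPLE_CFG = {
    "name": "polymer_stabilization_study",
    "control": "No Polymer",
    "conditions": [
        {"label": "No Polymer", "config": "../noPoly_enzyme_DMSO/config.yaml", "replicates": [1, 2, 3]},
        {"label": "100% SBMA", "config": "../SBMA_100_enzyme_DMSO/config.yaml", "replicates": [1, 2, 3]},
        {"label": "100% EGMA", "config": "../EGMA_100_enzyme_DMSO/config.yaml", "replicates": [1, 2, 3]},
    ],
    "plugins": {"rmsf": {"selection": "protein and name CA"}},
}

print(precheck_comparison(EXAMPLE_CFG))
```

To run the same check on the real file, load it first, for example with PyYAML's yaml.safe_load, and pass the resulting dict to precheck_comparison. A typo in control, such as "No polymer" with a lowercase p, shows up as a mismatch against the condition labels.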

Step 3: Run the Cross-Condition Comparison

For the tutorial, use the batch runner:

pixi run -e build polyzymd compare run-all

This runs every enabled comparison and writes one canonical cache file per analysis to comparison/<analysis>/result.json.
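
Since the caches are plain JSON, you can look at them without any PolyzyMD code. The sketch below uses a hypothetical helper, load_result_caches, and assumes only the comparison/<analysis>/result.json path pattern described above. It makes no assumption about the schema inside each file and prints only the top-level keys.

```python
import json
from pathlib import Path


def load_result_caches(workspace="."):
    """Map each analysis name to the parsed contents of its result.json."""
    results = {}
    for path in sorted(Path(workspace).glob("comparison/*/result.json")):
        with path.open() as fh:
            # The analysis name is the directory that holds result.json.
            results[path.parent.name] = json.load(fh)
    return results


if __name__ == "__main__":
    for analysis, data in load_result_caches().items():
        # Show only the top-level structure; the schema is not assumed here.
        summary = sorted(data) if isinstance(data, dict) else type(data).__name__
        print(analysis, summary)
```

Run it from inside the comparison workspace. An analysis that is missing from the output has not written its cache yet.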

Tip

On an HPC cluster? For large studies, submit each analysis as a SLURM job DAG instead of running interactively:

pixi run -e build polyzymd compare submit sasa --partition <part> --mem 8G

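This parallelizes across replicates and conditions. The example submits sasa, which this tutorial's comparison.yaml does not configure, so substitute an analysis from your own file. See How To: Submit Analysis Jobs to a SLURM Cluster for the complete HPC workflow.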

If you prefer to inspect one comparison first, a good sanity check is:

pixi run -e build polyzymd compare run rmsf

Step 4: Generate the Figures

Now run the plotting smoke test:

pixi run -e build polyzymd compare plot-all --list-available
pixi run -e build polyzymd compare plot-all

If those commands succeed, your comparison workspace is in good shape.

What Success Looks Like

At this point you should have:

polymer_stabilization_study/
├── comparison.yaml
├── comparison/
│   ├── rmsf/
│   │   └── result.json
│   ├── contacts/
│   │   └── result.json
│   ├── distances/
│   │   └── result.json
│   └── catalytic_triad/
│       └── result.json
└── figures/
    ├── rmsf_comparison.png
    ├── rmsf_profile.png
    ├── triad_kde_panel.png
    └── ...

That is the tutorial success state: the canonical comparison caches exist, the figures exist, and polyzymd compare plot-all completes without error.
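
The success state can also be checked mechanically. This is a minimal sketch with a hypothetical function name, check_success_state. It assumes the layout shown above: one result.json per configured analysis under comparison/, and at least one PNG under figures/. It does not check for specific figure file names, because the full list is elided in the tree.

```python
from pathlib import Path

# Analyses configured in this tutorial's comparison.yaml.
ANALYSES = ["rmsf", "contacts", "distances", "catalytic_triad"]


def check_success_state(workspace, analyses):
    """Return the expected outputs that are missing; empty means success."""
    ws = Path(workspace)
    missing = [
        f"comparison/{name}/result.json"
        for name in analyses
        if not (ws / "comparison" / name / "result.json").is_file()
    ]
    # Only require that some figure exists, not a particular set of names.
    if not any((ws / "figures").glob("*.png")):
        missing.append("figures/*.png")
    return missing


if __name__ == "__main__":
    for item in check_success_state(".", ANALYSES):
        print("missing:", item)
```

Run it from inside polymer_stabilization_study/. No output means every cache and at least one figure is present.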

What to Do Next

Experimental workflows remain available, but they are intentionally outside the main tutorial path for this release.