Run Your First PolyzyMD Simulation

This tutorial walks through one complete first run: create a project scaffold, add an enzyme structure, write a minimal configuration, validate it, and make sure PolyzyMD can build the system.

What You Will Learn

  • How to create a project scaffold with polyzymd init

  • How to write a minimal config.yaml for an enzyme-only simulation

  • How to validate your configuration and run a system build

  • How to choose between local execution and HPC submission

Prerequisites

By the end, you will have a working project directory and a validated config.yaml that is ready for local building or HPC submission.

Important

Resource requirements: Lightweight commands such as polyzymd init, validate, status, and --help are safe to run interactively. System builds, simulations, and trajectory analyses can require substantial RAM, CPU/GPU time, and scratch I/O. On shared HPC systems, run heavy commands inside an allocated job or interactive compute session, not on a login node. If a tutorial command is killed or runs out of memory, request more resources or use the SLURM workflow.
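As a small guard before heavy commands, you can check whether the current shell is inside a SLURM allocation. The sketch below relies only on the SLURM_JOB_ID environment variable, which SLURM sets inside jobs and interactive allocations. The function name is hypothetical and is not part of PolyzyMD.

```python
import os


def inside_slurm_job(env=None):
    """Return True when a SLURM job ID is present in the environment.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return bool(env.get("SLURM_JOB_ID"))


if __name__ == "__main__":
    if inside_slurm_job():
        print("SLURM job detected: heavy commands are reasonable here.")
    else:
        # On a cluster this usually means a login node; on a workstation it is expected.
        print("No SLURM job detected: avoid builds and simulations on a login node.")
```

On a personal workstation the check simply reports that no job is present, which is fine for local runs.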

Step 1: Create a new project scaffold

From the PolyzyMD repository root, run:

pixi run -e build polyzymd init --name my_first_simulation

This creates a directory like:

my_first_simulation/
|- config.yaml
|- structures/
|- job_scripts/
`- slurm_logs/
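If you want to confirm the scaffold programmatically, a minimal sketch follows. It only compares directory contents against the four entries in the listing above; the helper name is hypothetical, and the real scaffold may contain additional files such as placeholders.

```python
import os

# Entries taken from the scaffold listing in this tutorial.
EXPECTED_ENTRIES = ("config.yaml", "structures", "job_scripts", "slurm_logs")


def missing_entries(present_names):
    """Return the expected scaffold entries that are absent from present_names."""
    present = set(present_names)
    return [name for name in EXPECTED_ENTRIES if name not in present]


if __name__ == "__main__":
    project = "my_first_simulation"
    if os.path.isdir(project):
        print("missing:", missing_entries(os.listdir(project)))
    else:
        print(f"{project}/ not found; run polyzymd init first")
```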

Move into the new project:

cd my_first_simulation

Step 2: Add your enzyme structure

Copy your prepared enzyme structure into structures/:

cp /path/to/enzyme.pdb structures/enzyme.pdb
rm structures/*.placeholder.txt

Your PDB should already be ready for simulation:

  • hydrogens added

  • missing atoms or residues fixed

  • intended protonation state chosen

  • alternate conformers removed

  • all protein residues assigned to chain ID A

If your input came directly from the Protein Data Bank, do not assume it is ready for PolyzyMD. See Prepare a PDB for OpenFF and PolyzyMD for a worked 4CHA example that separates biological-system selection from mechanical cleanup.

If you also plan to simulate a substrate, you can place its SDF file in structures/ now. This tutorial keeps the first run enzyme-only.
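Part of the readiness checklist above can be spot-checked by reading the fixed-width ATOM records directly. The sketch below is not a PolyzyMD feature. It assumes standard PDB column positions (alternate location in column 17, chain ID in column 22, element in columns 77-78) and only counts chains, alternate-location atoms, and hydrogens. It cannot judge protonation states or missing residues.

```python
def summarize_pdb(lines):
    """Summarize chain IDs, altLoc atoms, and hydrogens in ATOM records."""
    chains = set()
    altloc_atoms = 0
    hydrogens = 0
    for line in lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK, and other records
        chains.add(line[21:22])  # column 22: chain ID
        if line[16:17].strip():  # column 17: alternate location indicator
            altloc_atoms += 1
        element = line[76:78].strip().upper()  # columns 77-78: element symbol
        # Fall back to the atom name (columns 13-16) when the element is blank.
        name = line[12:16].strip().lstrip("0123456789")
        if element == "H" or (not element and name.startswith("H")):
            hydrogens += 1
    return {
        "chains": sorted(chains),
        "altloc_atoms": altloc_atoms,
        "hydrogens": hydrogens,
    }


if __name__ == "__main__":
    import os

    path = "structures/enzyme.pdb"
    if os.path.exists(path):
        with open(path) as handle:
            print(summarize_pdb(handle))
```

For this tutorial you would hope to see a single chain "A", zero alternate-location atoms, and a nonzero hydrogen count.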

Step 3: Replace the template with a minimal config

Open config.yaml and replace the template contents with:

name: "my_first_simulation"
description: "First PolyzyMD tutorial run"

enzyme:
  name: "MyEnzyme"
  pdb_path: "structures/enzyme.pdb"

solvent:
  primary:
    type: "water"
    model: "tip3p"
  co_solvents: []
  ions:
    neutralize: true
    nacl_concentration: 0.15
  box:
    padding: 1.2
    shape: "rhombic_dodecahedron"
    target_density: 1.0
    tolerance: 2.0

thermodynamics:
  temperature: 300.0
  pressure: 1.0

simulation_phases:
  equilibration_stages:
    - name: "heating"
      duration: 0.2
      samples: 20
      ensemble: "NVT"
      temperature_start: 60.0
      temperature_end: 300.0
      temperature_increment: 1.0
      temperature_interval: 1200.0
      position_restraints:
        - group: "protein_heavy"
          force_constant: 4184.0
    - name: "free_equilibration"
      duration: 0.8
      samples: 80
      ensemble: "NPT"
      temperature: 300.0
  production:
    ensemble: "NPT"
    duration: 10.0
    samples: 250
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
    barostat: "MC"
    barostat_frequency: 25

output:
  projects_directory: "."
  scratch_directory: null
  job_scripts_subdir: "job_scripts"
  slurm_logs_subdir: "slurm_logs"
  naming_template: "{enzyme}_{temperature}K_run{replicate}"
  save_checkpoint: true
  save_state_data: true
  trajectory_format: "dcd"

This is intentionally small, but it still uses the staged equilibration model that PolyzyMD now requires for all simulations.
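To get a feel for what the production settings imply, the sketch below works through the arithmetic. The units are assumptions, since this tutorial does not state them: duration is taken to be in nanoseconds, time_step in femtoseconds, and samples to be the number of evenly spaced saved frames. Check the configuration reference before relying on these numbers.

```python
def production_sampling(duration_ns, samples, time_step_fs):
    """Return (total_steps, steps_per_sample, ps_per_sample).

    Assumes duration in ns, time step in fs, and evenly spaced samples.
    """
    total_steps = round(duration_ns * 1_000_000 / time_step_fs)  # 1 ns = 1e6 fs
    steps_per_sample = total_steps // samples
    ps_per_sample = duration_ns * 1000 / samples  # 1 ns = 1000 ps
    return total_steps, steps_per_sample, ps_per_sample


if __name__ == "__main__":
    # Values from the production block of the config above.
    total, per_sample, spacing = production_sampling(10.0, 250, 2.0)
    print(f"{total} steps, one frame every {per_sample} steps ({spacing} ps)")
```

Under those assumptions, the production block corresponds to 5,000,000 integration steps with one saved frame every 20,000 steps, or every 40 ps.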

Step 4: Validate the config

Run:

pixi run -e build polyzymd validate -c config.yaml

Success looks like a clean validation report with your enzyme name, temperature, and equilibration/production phases listed.

If validation fails, fix the reported paths or field values before continuing.
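Path errors are a common reason for a failed validation. As a rough pre-check that does not replace polyzymd validate, the sketch below scans the config text for keys ending in _path and reports any that do not exist on disk. It is a line-based scan, not a YAML parser, and the function names are hypothetical.

```python
import os


def referenced_paths(config_text):
    """Collect values of keys ending in '_path' from simple 'key: value' lines."""
    paths = []
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().endswith("_path"):
            # Drop trailing comments and surrounding quotes.
            value = value.split("#", 1)[0].strip().strip("\"'")
            if value:
                paths.append(value)
    return paths


def missing_paths(config_text, base_dir="."):
    """Return referenced paths that do not exist relative to base_dir."""
    return [
        path
        for path in referenced_paths(config_text)
        if not os.path.exists(os.path.join(base_dir, path))
    ]


if __name__ == "__main__":
    if os.path.exists("config.yaml"):
        with open("config.yaml") as handle:
            print("missing files:", missing_paths(handle.read()))
```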

Step 5: Check that the system can build

Start with a dry run to see the full validation report:

pixi run -e build polyzymd build -c config.yaml --dry-run

If that succeeds, run the actual build. The commands differ depending on which simulation engine you plan to use:

Build the system for OpenMM simulation:

pixi run -e build polyzymd build -c config.yaml

This prepares the solvated system (PDB + OpenMM XML) for the configured equilibration and production phases.

Build the system and export GROMACS input files:

pixi run -e build polyzymd build -c config.yaml --format gromacs

This builds the solvated system and exports .gro, .top, .itp, .mdp, and a run script to replicate_1/gromacs/.
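After a GROMACS export, you can confirm that each expected file type is present. The sketch below checks only the four extensions named above; the example file names are hypothetical, and the run script is not checked because its name is not given here.

```python
import os
from pathlib import Path

# File types the GROMACS export is described as producing.
REQUIRED_SUFFIXES = (".gro", ".top", ".itp", ".mdp")


def missing_suffixes(filenames):
    """Return the required suffixes that no file in filenames carries."""
    present = {Path(name).suffix for name in filenames}
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]


if __name__ == "__main__":
    export_dir = "replicate_1/gromacs"
    if os.path.isdir(export_dir):
        print("missing file types:", missing_suffixes(os.listdir(export_dir)))
```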

Tip

For the full GROMACS workflow, see Run GROMACS Simulations on HPC Clusters. Use polyzymd build --format gromacs when you only want input files, polyzymd run --engine gromacs when you want PolyzyMD to build and run locally, or polyzymd submit --engine gromacs to submit self-resubmitting SLURM jobs.

For multiple replicates (each with an independently built system):

pixi run -e build polyzymd build -c config.yaml --format gromacs --replicates 1-3
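The 1-3 value reads as a range of replicate numbers. As an illustration only, and assuming the range is inclusive, the sketch below expands the two forms used in this tutorial (a single number and start-end). It does not reproduce PolyzyMD's own parser, which may accept other forms.

```python
def expand_replicates(spec):
    """Expand '3' to [3] and '1-3' to [1, 2, 3] (inclusive range assumed)."""
    if "-" in spec:
        start_text, end_text = spec.split("-", 1)
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"range start exceeds end: {spec!r}")
        return list(range(start, end + 1))
    return [int(spec)]


if __name__ == "__main__":
    print(expand_replicates("1-3"))
```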

Step 6: Decide how you want to run production

For a local smoke test on a GPU-enabled workstation, you can run one segment:

pixi run -e cuda-12-4 polyzymd run-segment -c config.yaml -r 1

For an HPC workflow, generate job scripts first:

pixi run -e cuda-12-4 polyzymd submit -c config.yaml --preset aa100 --replicates 1 --generate-only

Use --dry-run instead if you only want a preview without creating files.

If the generated scripts look right, rerun the submit command without --generate-only to submit for real, using the CUDA environment that matches your cluster.
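One way to review generated scripts before submitting is to list their #SBATCH directives, which is where SLURM reads the partition, time limit, and resource requests. The sketch below assumes the scripts are written to the job_scripts/ directory named in the config. The sample script text in the comment is hypothetical, not actual PolyzyMD output.

```python
import os


def sbatch_directives(script_text):
    """Return the option part of every '#SBATCH ...' line in a job script."""
    directives = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#SBATCH"):
            directives.append(stripped[len("#SBATCH"):].strip())
    return directives


if __name__ == "__main__":
    # Hypothetical example of what a script might contain:
    #   #!/bin/bash
    #   #SBATCH --partition=aa100
    #   #SBATCH --time=01:00:00
    scripts_dir = "job_scripts"
    if os.path.isdir(scripts_dir):
        for name in sorted(os.listdir(scripts_dir)):
            path = os.path.join(scripts_dir, name)
            if os.path.isfile(path):
                with open(path) as handle:
                    print(name, sbatch_directives(handle.read()))
```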

Run the full GROMACS workflow locally (build + EM + equilibration + production):

pixi run -e build polyzymd run -c config.yaml --engine gromacs --replicates 1

If you only need the input files without running GROMACS, use build --format gromacs instead (see Step 5 above).

Submit GROMACS jobs to SLURM with self-resubmitting checkpoint-based restart:

pixi run -e build polyzymd submit \
    -c config.yaml \
    --engine gromacs \
    --preset aa100 \
    --replicates 1-3

Add a gromacs: block to your config.yaml for GPU acceleration and module loading. See Run GROMACS Simulations on HPC Clusters for the full GROMACS HPC workflow with cluster-specific recipes.

Tip

CU Boulder Blanca users: Replace --preset aa100 with --preset blanca-shirts --constraint "A40" and run ml slurm/blanca first. See the Blanca GPU recipe in Run GROMACS Simulations on HPC Clusters.

What you should have now

At this point you should have:

  • a project directory created by polyzymd init

  • a minimal, validated config.yaml

  • a successful build dry run or real build

  • a clear next step for local execution or SLURM submission

Step 7: Monitor progress

After submitting jobs, check simulation progress across all replicates:

pixi run -e build polyzymd status -c config.yaml

This shows a compact dashboard with colored progress bars, completion percentages, and status for each replicate. If any replicate shows interrupted, use polyzymd recover to resume it. See Run PolyzyMD on SLURM Clusters for details.

Where to go next