Quick Start Guide

This guide walks through running your first PolyzyMD simulation.

Overview

A typical PolyzyMD workflow:

  1. Initialize project with polyzymd init

  2. Add structure files (PDB, SDF)

  3. Configure simulation (edit YAML file)

  4. Validate the configuration

  5. Submit jobs to HPC cluster

Step 1: Initialize Your Project

The easiest way to start is with the init command, which creates a complete project scaffold:

polyzymd init --name my_simulation

This creates:

my_simulation/
├── config.yaml              <- Template configuration (edit this)
├── structures/              <- Add your PDB/SDF files here
│   ├── place_protein_here.placeholder.txt
│   └── place_ligand_here.placeholder.txt
├── job_scripts/             <- SLURM scripts will go here
└── slurm_logs/              <- Job output will go here

Tip

The config.yaml template has all sections commented out with example values. This teaches you the configuration syntax while you customize it for your system.

Step 2: Add Your Structure Files

Copy your prepared structure files into the structures/ directory:

cd my_simulation

# Copy your enzyme PDB
cp /path/to/my_enzyme.pdb structures/enzyme.pdb

# Copy your substrate SDF (if using)
cp /path/to/my_substrate.sdf structures/substrate.sdf

# Remove the placeholder files
rm structures/*.placeholder.txt

Enzyme PDB Requirements

Your enzyme PDB must be simulation-ready:

  • All hydrogens added

  • Missing residues/atoms modeled

  • Proper protonation states

  • No alternate conformations

  • No crystallographic waters/ligands (unless intentional)

Recommended preparation tools:

  • PDB2PQR - Protonation

  • CHARMM-GUI - Full preparation

  • PyMOL/Chimera - Manual inspection

Substrate SDF Requirements (if using)

  • 3D coordinates (from docking or crystal structure)

  • Correct protonation state for simulation pH

  • Single conformer (or specify conformer_index in config)

Step 3: Configure Your Simulation

Open config.yaml in your editor. The template has all sections commented out with placeholder values. You need to:

  1. Uncomment the sections you need

  2. Replace placeholder values with your actual data

  3. Leave commented sections you don’t need

Minimal Configuration (Enzyme Only)

For a simple enzyme-in-water simulation, uncomment and edit these sections:

name: "my_first_simulation"
description: "Testing PolyzyMD with my enzyme"

# Enzyme - REQUIRED
enzyme:
  name: "MyEnzyme"
  pdb_path: "structures/enzyme.pdb"

# Solvent - REQUIRED
solvent:
  primary:
    type: "water"
    model: "tip3p"
  co_solvents: []
  ions:
    neutralize: true
    nacl_concentration: 0.15
  box:
    padding: 1.2
    shape: "rhombic_dodecahedron"
    target_density: 1.0
    tolerance: 2.0

# Thermodynamics - REQUIRED
thermodynamics:
  temperature: 300.0
  pressure: 1.0

# Simulation phases - REQUIRED
simulation_phases:
  equilibration:
    ensemble: "NVT"
    duration: 1.0
    samples: 100
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
  production:
    ensemble: "NPT"
    duration: 10.0       # Short for testing
    samples: 250
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
    barostat: "MC"
    barostat_frequency: 25


# Output - REQUIRED
output:
  projects_directory: "."
  scratch_directory: null
  job_scripts_subdir: "job_scripts"
  slurm_logs_subdir: "slurm_logs"
  naming_template: "{enzyme}_{temperature}K_run{replicate}"
  save_checkpoint: true
  save_state_data: true
  trajectory_format: "dcd"

Adding a Substrate

To include a docked ligand, uncomment the substrate section:

substrate:
  name: "MyLigand"
  sdf_path: "structures/substrate.sdf"
  conformer_index: 0
  charge_method: "nagl"
  residue_name: "LIG"

Adding Polymers

To add co-polymer chains around your enzyme:

polymers:
  enabled: true
  type_prefix: "SBMA-EGPMA"
  monomers:
    - label: "A"
      probability: 0.98
      name: "SBMA"
    - label: "B"
      probability: 0.02
      name: "EGPMA"
  length: 5
  count: 2
  cache_directory: ".polymer_cache"

Warning

YAML List Syntax

When defining monomers (or any list), each - starts a new list item. All fields for one monomer must be grouped together:

Incorrect:

monomers:
  - label: "A"
  - probability: 0.98
  - name: "SBMA"

Correct:

monomers:
  - label: "A"
    probability: 0.98
    name: "SBMA"

See Polymer Setup Guide for detailed polymer configuration.

Adding Restraints

To keep a substrate in the active site:

restraints:
  - type: "flat_bottom"
    name: "substrate_restraint"
    atom1:
      selection: "resid 77 and name OG"
      description: "Catalytic serine"
    atom2:
      selection: "resname LIG and name C1"
      description: "Substrate carbon"
    distance: 3.5
    force_constant: 10000.0
    enabled: true

See Restraints Guide for detailed restraint configuration.

Step 4: Validate Configuration

Before running, validate your configuration:

polyzymd validate -c config.yaml

This checks:

  • All required sections are present

  • File paths exist

  • Values are within valid ranges

  • No conflicting settings

Expected output:

Validating configuration: config.yaml
Configuration is valid!

Summary:
  Name: my_first_simulation
  Enzyme: MyEnzyme
  Substrate: None (apo simulation)
  Polymers: Disabled
  Temperature: 300.0 K
  Pressure: 1.0 atm

Simulation phases:
  Equilibration: 1.0 ns (NVT)
  Production: 10.0 ns (NPT)

Step 5: Test Build (Dry Run)

Test that the system can be built without actually building:

polyzymd build -c config.yaml --dry-run

Step 6: Run Locally (Optional)

For testing on a local machine with GPU, first build the system, then run a single production segment:

# Build the system (energy minimization + equilibration)
polyzymd build -c config.yaml

# Run one production segment
polyzymd run-segment -c config.yaml -r 1

Step 7: Submit to HPC

For production runs on an HPC cluster:

# Generate scripts without submitting (dry run)
polyzymd submit -c config.yaml \
    --replicates 1-3 \
    --preset aa100 \
    --dry-run

# Submit for real
polyzymd submit -c config.yaml \
    --replicates 1-3 \
    --preset aa100 \
    --email your.email@university.edu

Step 8: Monitor Jobs

# Check job status
squeue -u $USER

# View job output
tail -f slurm_logs/*.out

Output Files

After completion:

my_simulation/
├── config.yaml
├── job_scripts/           # SLURM scripts
├── slurm_logs/            # Job output
└── MyEnzyme_300K_run1/    # Simulation output
    ├── solvated_system.pdb
    ├── equilibration/
    │   └── trajectory.dcd
    └── production_0/
        ├── trajectory.dcd
        ├── checkpoint.chk
        └── state_data.csv

Next Steps