Quick Start Guide

This guide walks through running your first PolyzyMD simulation.

Overview

A typical PolyzyMD workflow:

Initialize project with polyzymd init
Add structure files (PDB, SDF)
Configure simulation (edit YAML file)
Validate the configuration
Submit jobs to HPC cluster

Step 1: Initialize Your Project

The easiest way to start is with the init command, which creates a complete project scaffold:

polyzymd init --name my_simulation

This creates:

my_simulation/
├── config.yaml              <- Template configuration (edit this)
├── structures/              <- Add your PDB/SDF files here
│   ├── place_protein_here.placeholder.txt
│   └── place_ligand_here.placeholder.txt
├── job_scripts/             <- SLURM scripts will go here
└── slurm_logs/              <- Job output will go here

Tip

In the config.yaml template, every section is commented out and filled with example values, so you can learn the configuration syntax as you customize it for your system.

Step 2: Add Your Structure Files

Copy your prepared structure files into the structures/ directory:

cd my_simulation

# Copy your enzyme PDB
cp /path/to/my_enzyme.pdb structures/enzyme.pdb

# Copy your substrate SDF (if using)
cp /path/to/my_substrate.sdf structures/substrate.sdf

# Remove the placeholder files
rm structures/*.placeholder.txt
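A quick way to confirm this step is complete is to check the structures/ directory from Python. The sketch below is illustrative only and is not part of PolyzyMD; the function names are hypothetical, and the expected file name enzyme.pdb is taken from the copy command above.

```python
from pathlib import Path


def leftover_placeholders(project_dir):
    """Return names of placeholder files still present in structures/."""
    structures = Path(project_dir) / "structures"
    return sorted(p.name for p in structures.glob("*.placeholder.txt"))


def missing_structures(project_dir, expected=("enzyme.pdb",)):
    """Return expected structure files that are not in structures/ yet."""
    structures = Path(project_dir) / "structures"
    return [name for name in expected if not (structures / name).is_file()]
```

Calling both functions with "." from inside my_simulation/ should return two empty lists once the files are in place and the placeholders are removed.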

Enzyme PDB Requirements

Your enzyme PDB must be simulation-ready:

All hydrogens added
Missing residues/atoms modeled
Proper protonation states
No alternate conformations
No crystallographic waters/ligands (unless intentional)

Recommended preparation tools:

PDB2PQR - Protonation
CHARMM-GUI - Full preparation
PyMOL/Chimera - Manual inspection
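Some of the requirements above can be spot-checked with a few lines of Python before you run anything. The following is a minimal sketch, not a PolyzyMD feature: summarize_pdb is a hypothetical helper that reads the fixed-width columns of ATOM/HETATM records (alternate location in column 17, residue name in columns 18-20, element in columns 77-78). It cannot judge protonation states or missing residues.

```python
def summarize_pdb(lines):
    """Count atoms, hydrogens, alternate locations, and water atoms in PDB records."""
    summary = {"atoms": 0, "hydrogens": 0, "altlocs": 0, "water_atoms": 0}
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        # Pad short lines so the fixed-width slices below never go out of range.
        line = line.rstrip("\n").ljust(80)
        summary["atoms"] += 1
        element = line[76:78].strip().upper()
        atom_name = line[12:16].strip()
        # Heuristic fallback: if the element column is blank, use the atom name.
        if element == "H" or (not element and atom_name.startswith("H")):
            summary["hydrogens"] += 1
        # A non-blank altLoc column marks an alternate conformation.
        if line[16] != " ":
            summary["altlocs"] += 1
        # Common water residue names; extend this tuple if your file uses others.
        if line[17:20] in ("HOH", "WAT"):
            summary["water_atoms"] += 1
    return summary
```

For example, summarize_pdb(open("structures/enzyme.pdb")) returning zero hydrogens, or a non-zero altlocs count, suggests the structure needs more preparation.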

Substrate SDF Requirements (if using)

3D coordinates (from docking or crystal structure)
Correct protonation state for simulation pH
Single conformer (or specify conformer_index in config)
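To see whether an SDF holds one record or several, you can count its record terminators: in the SDF format each molecule record ends with a line containing only $$$$. The helper below is a hypothetical sketch, not PolyzyMD code. A file saved as a bare MOL block without the terminator will report zero.

```python
def count_sdf_records(text):
    """Count molecule records in SDF text; each record ends with a '$$$$' line."""
    return sum(1 for line in text.splitlines() if line.strip() == "$$$$")
```

If count_sdf_records(open("structures/substrate.sdf").read()) is greater than one, set conformer_index in the config to pick the record you want. Whether that index is zero-based is an assumption here, suggested by the value 0 used later in this guide.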

Step 3: Configure Your Simulation

Open config.yaml in your editor. The template has all sections commented out with placeholder values. You need to:

Uncomment the sections you need
Replace placeholder values with your actual data
Leave sections you don’t need commented out

Minimal Configuration (Enzyme Only)

For a simple enzyme-in-water simulation, uncomment and edit these sections:

name: "my_first_simulation"
description: "Testing PolyzyMD with my enzyme"

# Enzyme - REQUIRED
enzyme:
  name: "MyEnzyme"
  pdb_path: "structures/enzyme.pdb"

# Solvent - REQUIRED
solvent:
  primary:
    type: "water"
    model: "tip3p"
  co_solvents: []
  ions:
    neutralize: true
    nacl_concentration: 0.15
  box:
    padding: 1.2
    shape: "rhombic_dodecahedron"
    target_density: 1.0
    tolerance: 2.0

# Thermodynamics - REQUIRED
thermodynamics:
  temperature: 300.0
  pressure: 1.0

# Simulation phases - REQUIRED
simulation_phases:
  equilibration:
    ensemble: "NVT"
    duration: 1.0
    samples: 100
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
  production:
    ensemble: "NPT"
    duration: 10.0       # Short for testing
    samples: 250
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
    barostat: "MC"
    barostat_frequency: 25


# Output - REQUIRED
output:
  projects_directory: "."
  scratch_directory: null
  job_scripts_subdir: "job_scripts"
  slurm_logs_subdir: "slurm_logs"
  naming_template: "{enzyme}_{temperature}K_run{replicate}"
  save_checkpoint: true
  save_state_data: true
  trajectory_format: "dcd"
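It can help to translate the phase settings above into integrator steps. The sketch below rests on assumptions that this guide does not state outright: duration is in nanoseconds (the validation summary later reports phases in ns), time_step is in femtoseconds, and samples is the number of evenly spaced frames saved per phase. phase_schedule is a hypothetical helper, not a PolyzyMD function.

```python
def phase_schedule(duration_ns, time_step_fs, samples):
    """Return (total_steps, steps_between_samples) for one simulation phase."""
    # 1 ns = 1,000,000 fs (assumed units, see the text above)
    total_steps = round(duration_ns * 1_000_000 / time_step_fs)
    steps_between_samples = total_steps // samples
    return total_steps, steps_between_samples


# Values from the minimal configuration above
equilibration = phase_schedule(1.0, 2.0, 100)   # (500000, 5000)
production = phase_schedule(10.0, 2.0, 250)     # (5000000, 20000)
```

Under these assumptions the 10 ns production phase saves one frame every 20,000 steps, that is, every 40 ps.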

Adding a Substrate

To include a docked ligand, uncomment the substrate section:

substrate:
  name: "MyLigand"
  sdf_path: "structures/substrate.sdf"
  conformer_index: 0
  charge_method: "nagl"
  residue_name: "LIG"

Adding Polymers

To add copolymer chains around your enzyme:

polymers:
  enabled: true
  type_prefix: "SBMA-EGPMA"
  monomers:
    - label: "A"
      probability: 0.98
      name: "SBMA"
    - label: "B"
      probability: 0.02
      name: "EGPMA"
  length: 5
  count: 2
  cache_directory: ".polymer_cache"
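To get a feel for what these probabilities mean for short chains, the sketch below draws monomer sequences at random. It assumes each position in a chain is drawn independently with the listed probabilities; that is an illustrative model and may not match how PolyzyMD actually builds chains. Both function names are hypothetical.

```python
import random

# Monomer definitions copied from the configuration above
MONOMERS = [
    {"label": "A", "probability": 0.98, "name": "SBMA"},
    {"label": "B", "probability": 0.02, "name": "EGPMA"},
]


def sample_chain(monomers, length, rng):
    """Draw one chain as a string of labels (assumes independent positions)."""
    labels = [m["label"] for m in monomers]
    weights = [m["probability"] for m in monomers]
    return "".join(rng.choices(labels, weights=weights, k=length))


def pure_chain_probability(probability, length):
    """Probability that every position in a chain is the same monomer."""
    return probability ** length


rng = random.Random(0)  # fixed seed so the draw is repeatable
chains = [sample_chain(MONOMERS, length=5, rng=rng) for _ in range(2)]
p_all_a = pure_chain_probability(0.98, 5)  # about 0.90
```

Under this model roughly 90% of 5-mers contain no EGPMA at all, so with count: 2 it is common for a system to contain only SBMA chains.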

Warning

YAML List Syntax

When defining monomers (or any list), each - starts a new list item. All fields for one monomer must be grouped together:

Incorrect (creates three separate list items, each with a single field):

monomers:
  - label: "A"
  - probability: 0.98
  - name: "SBMA"

Correct (creates one list item with three fields):

monomers:
  - label: "A"
    probability: 0.98
    name: "SBMA"
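The difference is easier to see in the data a YAML parser hands back. The sketch below writes the two parsed results as Python literals and flags list items that lack a field. The required field names are taken from the example above, and incomplete_monomers is a hypothetical helper, not PolyzyMD's validator.

```python
# What the two YAML snippets above parse to, written as Python literals.
incorrect = [{"label": "A"}, {"probability": 0.98}, {"name": "SBMA"}]
correct = [{"label": "A", "probability": 0.98, "name": "SBMA"}]

# Field names taken from the monomer example in this guide
REQUIRED_FIELDS = ("label", "probability", "name")


def incomplete_monomers(monomers):
    """Return (index, missing_fields) for every list item lacking a field."""
    problems = []
    for index, item in enumerate(monomers):
        missing = [field for field in REQUIRED_FIELDS if field not in item]
        if missing:
            problems.append((index, missing))
    return problems
```

incomplete_monomers(correct) returns an empty list, while the incorrect form yields three entries because every item is missing two of the three fields.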

See Polymer Setup Guide for detailed polymer configuration.

Adding Restraints

To keep a substrate in the active site, define a flat-bottom distance restraint between an enzyme atom and a substrate atom:

restraints:
  - type: "flat_bottom"
    name: "substrate_restraint"
    atom1:
      selection: "resid 77 and name OG"
      description: "Catalytic serine"
    atom2:
      selection: "resname LIG and name C1"
      description: "Substrate carbon"
    distance: 3.5
    force_constant: 10000.0
    enabled: true
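A flat-bottom restraint is commonly written as a potential that is zero up to a threshold distance and harmonic beyond it. The sketch below shows that common form. The exact expression and the units PolyzyMD uses for distance and force_constant are not stated in this guide, so treat both as assumptions and consult the Restraints Guide.

```python
def flat_bottom_energy(r, r0, k):
    """Energy of a one-sided flat-bottom harmonic restraint.

    r  : current distance between the two atoms
    r0 : threshold distance (the 'distance' field above)
    k  : force constant (the 'force_constant' field above)
    """
    if r <= r0:
        return 0.0  # no penalty while the atoms stay within the threshold
    return 0.5 * k * (r - r0) ** 2  # harmonic penalty beyond the threshold
```

With the values above, the substrate moves freely while the two atoms are within 3.5 distance units of each other, and the penalty grows quadratically once they separate further.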

See Restraints Guide for detailed restraint configuration.

Step 4: Validate Configuration

Before running, validate your configuration:

polyzymd validate -c config.yaml

This checks:

All required sections are present
File paths exist
Values are within valid ranges
No conflicting settings

Expected output:

Validating configuration: config.yaml
Configuration is valid!

Summary:
  Name: my_first_simulation
  Enzyme: MyEnzyme
  Substrate: None (apo simulation)
  Polymers: Disabled
  Temperature: 300.0 K
  Pressure: 1.0 atm

Simulation phases:
  Equilibration: 1.0 ns (NVT)
  Production: 10.0 ns (NPT)
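The first check in the list above can be approximated on a configuration that has already been loaded into a Python dict. This is an illustrative sketch, not the validator's actual code: the section names are the ones marked REQUIRED in the minimal configuration in Step 3, and missing_sections is a hypothetical helper.

```python
# Sections marked REQUIRED in the minimal configuration of this guide
REQUIRED_SECTIONS = (
    "enzyme",
    "solvent",
    "thermodynamics",
    "simulation_phases",
    "output",
)


def missing_sections(config):
    """Return required top-level sections that are absent or empty (None)."""
    return [name for name in REQUIRED_SECTIONS if config.get(name) is None]
```

A section that is still commented out in config.yaml is simply absent from the loaded dict, so it would appear in the returned list.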

Step 5: Test Build (Dry Run)

Check that the system can be built without actually building it:

polyzymd build -c config.yaml --dry-run

Step 6: Run Locally (Optional)

To test on a local machine with a GPU, first build the system, then run a single production segment:

# Build the system (energy minimization + equilibration)
polyzymd build -c config.yaml

# Run one production segment
polyzymd run-segment -c config.yaml -r 1

Step 7: Submit to HPC

For production runs on an HPC cluster:

# Generate scripts without submitting (dry run)
polyzymd submit -c config.yaml \
    --replicates 1-3 \
    --preset aa100 \
    --dry-run

# Submit for real
polyzymd submit -c config.yaml \
    --replicates 1-3 \
    --preset aa100 \
    --email your.email@university.edu

Step 8: Monitor Jobs

# Check job status
squeue -u $USER

# View job output
tail -f slurm_logs/*.out

Output Files

After completion:

my_simulation/
├── config.yaml
├── job_scripts/           # SLURM scripts
├── slurm_logs/            # Job output
└── MyEnzyme_300K_run1/    # Simulation output
    ├── solvated_system.pdb
    ├── equilibration/
    │   └── trajectory.dcd
    └── production_0/
        ├── trajectory.dcd
        ├── checkpoint.chk
        └── state_data.csv
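The run directory name in the tree above follows the naming_template from the output section. The sketch below reproduces that expansion with Python string formatting. Rendering 300.0 as 300 is an assumption made to match the directory name shown, and treating --replicates 1-3 as replicates 1, 2, and 3 is likewise assumed. run_directory is a hypothetical helper.

```python
def run_directory(template, enzyme, temperature, replicate):
    """Expand a naming template such as '{enzyme}_{temperature}K_run{replicate}'."""
    return template.format(
        enzyme=enzyme,
        temperature=f"{temperature:g}",  # 'g' drops the trailing '.0' (assumed style)
        replicate=replicate,
    )


TEMPLATE = "{enzyme}_{temperature}K_run{replicate}"
directories = [run_directory(TEMPLATE, "MyEnzyme", 300.0, r) for r in range(1, 4)]
```

Under these assumptions the three replicates map to MyEnzyme_300K_run1, MyEnzyme_300K_run2, and MyEnzyme_300K_run3.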

Next Steps

Add a substrate: See Configuration Reference
Add polymers: See Polymer Setup Guide
Add restraints: See Restraints Guide
Run on HPC: See HPC and SLURM Guide