Dynamic Polymer Generation
This guide explains how to generate polymer chains on the fly from raw monomer SMILES strings, eliminating the need for pre-built polymer SDF files.
Overview
PolyzyMD supports two modes for polymer generation:
| Mode | Description | Use Case |
|---|---|---|
| Cached (default) | Load pre-built polymers from SDF files | Reproducibility, specific sequences |
| Dynamic | Generate polymers from monomer SMILES | Flexibility, new monomers, rapid prototyping |
Dynamic mode uses ATRP (Atom-Transfer Radical Polymerization) chemistry to:
Activate raw monomer SMILES via initiation reactions
Generate all possible polymer fragments
Build complete polymer chains with proper termination
Assign partial charges automatically
When to Use Dynamic Mode
Use Dynamic Mode when:
You want to test new monomer chemistries quickly
You don’t have pre-built polymer SDF files
You want the system to handle fragment generation automatically
Use Cached Mode when:
You need exact reproducibility of polymer structures
You have validated, pre-built polymer SDFs
You’re running production simulations with known good structures
Quick Start
Here’s a minimal configuration for dynamic polymer generation:
name: "dynamic_polymer_test"
enzyme:
  name: "MyEnzyme"
  pdb_path: "structures/enzyme.pdb"
polymers:
  enabled: true
  generation_mode: "dynamic" # Enable dynamic generation
  type_prefix: "SBMA-EGPMA"
  monomers:
    - label: "A"
      probability: 0.7
      name: "SBMA"
      smiles: "[H]C([H])=C(C(=O)OC([H])([H])C([H])([H])[N+](C([H])([H])[H])(C([H])([H])[H])C([H])([H])C([H])([H])C([H])([H])S(=O)(=O)[O-])C([H])([H])[H]"
    - label: "B"
      probability: 0.3
      name: "EGPMA"
      smiles: "[H]C([H])=C(C(=O)OC([H])([H])C([H])([H])Oc1c([H])c([H])c([H])c([H])c1[H])C([H])([H])[H]"
  length: 5
  count: 2
  charger: "nagl"
# ... rest of configuration (solvent, thermodynamics, etc.)
Then build and run:
polyzymd build -c config.yaml
polyzymd run-segment -c config.yaml -r 1
Configuration Reference
Generation Mode
polymers:
  generation_mode: "dynamic" # or "cached" (default)
| Value | Description |
|---|---|
| cached | Load polymers from pre-built SDF files (default) |
| dynamic | Generate polymers from monomer SMILES using ATRP chemistry |
Monomer SMILES
In dynamic mode, each monomer requires a smiles field:
monomers:
  - label: "A"
    probability: 0.7
    name: "SBMA"
    smiles: "[H]C([H])=C(C(=O)..." # Raw methacrylate SMILES
    residue_name: "SBM" # Optional: 3-letter residue name
| Field | Required | Description |
|---|---|---|
| label | Yes | Single character identifier (A, B, C…) |
| probability | Yes | Selection probability (values across all monomers must sum to 1.0) |
| name | Yes | Monomer name for identification |
| smiles | Yes (dynamic) | Raw monomer SMILES string |
| residue_name | No | 3-letter residue name (auto-generated if not provided) |
Important
SMILES Format: Provide the raw, unactivated monomer SMILES. For methacrylates, this means the structure with the C=C double bond intact. The system will handle activation (chlorination) automatically.
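The field rules above can be checked before a build. The following is a minimal sketch, not part of PolyzyMD: `validate_monomers` is a hypothetical helper, the uniqueness check on labels is an assumption, and the `=C` test is only a crude text heuristic for an intact double bond (a cheminformatics toolkit would be needed for a real structural check).

```python
import math


def validate_monomers(monomers, generation_mode="dynamic"):
    """Return a list of problems found in a list of monomer dicts.

    Hypothetical helper: it mirrors the field table in this guide and is
    not a PolyzyMD function.
    """
    problems = []
    labels = [m.get("label", "") for m in monomers]

    for m in monomers:
        label = m.get("label", "")
        if len(label) != 1:
            problems.append(f"label {label!r} must be a single character")
        if not m.get("name"):
            problems.append(f"monomer {label!r} is missing a name")

        if generation_mode == "dynamic":
            smiles = m.get("smiles", "")
            if not smiles:
                problems.append(f"monomer {label!r} needs a smiles field in dynamic mode")
            elif "=C" not in smiles:
                # Crude text heuristic only: raw monomers should keep their C=C bond.
                problems.append(f"monomer {label!r} SMILES shows no C=C double bond")

        residue_name = m.get("residue_name")
        if residue_name is not None and len(residue_name) != 3:
            problems.append(f"residue_name {residue_name!r} must have 3 letters")

    # Assumption: labels are expected to be unique within one configuration.
    if len(set(labels)) != len(labels):
        problems.append("monomer labels must be unique")

    total = sum(m.get("probability", 0.0) for m in monomers)
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        problems.append(f"probabilities sum to {total:.3f}, expected 1.0")

    return problems
```

An empty list means the monomer entries satisfy every rule listed in the table; each string otherwise names one violation.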
ATRP Reaction Configuration
By default, PolyzyMD uses bundled ATRP reaction templates. You can also specify custom reaction files:
polymers:
  generation_mode: "dynamic"
  reactions:
    initiation: "default" # Use bundled template
    polymerization: "default" # Use bundled template
    termination: "default" # Use bundled template
  # Or specify custom reaction files:
  # reactions:
  #   initiation: "/path/to/my_initiation.rxn"
  #   polymerization: "/path/to/my_polymerization.rxn"
  #   termination: "/path/to/my_termination.rxn"
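One way to picture the "default" versus custom-path choice is a small resolver. This is a sketch under stated assumptions: `resolve_reactions`, the `bundled_dir` argument, and the bundled file names `<step>.rxn` are all hypothetical and do not describe where PolyzyMD actually stores its templates.

```python
from pathlib import Path

REACTION_STEPS = ("initiation", "polymerization", "termination")


def resolve_reactions(reactions, bundled_dir):
    """Map each reaction step to a file path.

    A value of "default" (or a missing key) selects a template from
    bundled_dir; any other value is treated as a user-supplied path.
    The bundled file naming used here is an assumption for illustration.
    """
    reactions = reactions or {}
    resolved = {}
    for step in REACTION_STEPS:
        value = reactions.get(step, "default")
        if value == "default":
            resolved[step] = Path(bundled_dir) / f"{step}.rxn"  # hypothetical name
        else:
            resolved[step] = Path(value)
    return resolved
```

Mixing is possible in this sketch: a configuration can override only the initiation file and keep the bundled templates for the other two steps.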
Charge Assignment
polymers:
  charger: "nagl" # Charge method for polymer atoms
| Method | Description | Speed |
|---|---|---|
| nagl | Graph neural network charges (recommended) | Fast |
| | Machine learning charges | Medium |
| | Semi-empirical QM charges | Slow |
Retry Configuration
Dynamic generation may occasionally produce structures with ring-piercing artifacts. When this happens, the system rebuilds the chain, up to max_retries times:
polymers:
  max_retries: 10 # Maximum attempts before failing (default: 10)
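The retry behavior amounts to a bounded build-and-validate loop. The sketch below is illustrative only: `build_with_retries`, `build_once`, and `is_valid` are hypothetical names standing in for the real chain builder and ring-piercing check, which are not shown in this guide.

```python
def build_with_retries(build_once, is_valid, max_retries=10):
    """Call build_once() until is_valid(result) passes.

    Returns the accepted result and the number of attempts used.
    Raises RuntimeError once max_retries attempts have all failed.
    """
    for attempt in range(1, max_retries + 1):
        candidate = build_once()
        if is_valid(candidate):
            return candidate, attempt
    raise RuntimeError(
        f"Failed to build polymer after {max_retries} attempts due to ring-piercing"
    )
```

Because each attempt is an independent rebuild, raising max_retries increases the chance of success only when failures are occasional; a chain that fails every time usually points to a problem with the inputs.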
Complete Example Configuration
Here’s a complete configuration for running a simulation with dynamically generated polymers:
name: "LipA_dynamic_polymer_simulation"
description: "Lipase A with dynamically generated SBMA-EGPMA copolymers"
# Enzyme
enzyme:
  name: "LipA"
  pdb_path: "structures/enzyme.pdb"
# Substrate (optional)
substrate:
  name: "ResorufinButyrate"
  sdf_path: "structures/substrate.sdf"
  charge_method: "nagl"
  residue_name: "RBY"
# Dynamic Polymer Generation
polymers:
  enabled: true
  generation_mode: "dynamic"
  type_prefix: "SBMA-EGPMA"
  # ATRP reactions (use bundled defaults)
  reactions:
    initiation: "default"
    polymerization: "default"
    termination: "default"
  # Monomer definitions with SMILES
  monomers:
    - label: "A"
      probability: 0.7
      name: "SBMA"
      smiles: "[H]C([H])=C(C(=O)OC([H])([H])C([H])([H])[N+](C([H])([H])[H])(C([H])([H])[H])C([H])([H])C([H])([H])C([H])([H])S(=O)(=O)[O-])C([H])([H])[H]"
      residue_name: "SBM"
    - label: "B"
      probability: 0.3
      name: "EGPMA"
      smiles: "[H]C([H])=C(C(=O)OC([H])([H])C([H])([H])Oc1c([H])c([H])c([H])c([H])c1[H])C([H])([H])[H]"
      residue_name: "EGM"
  # Chain parameters
  length: 5
  count: 2
  # Charge assignment
  charger: "nagl"
  max_retries: 10
  # Caching
  cache_directory: ".polymer_cache"
# Solvent
solvent:
  primary:
    type: "water"
    model: "tip3p"
  co_solvents:
    - name: "dmso"
      volume_fraction: 0.30
      residue_name: "DMS"
  ions:
    neutralize: true
    nacl_concentration: 0.0
  box:
    padding: 1.2
    shape: "rhombic_dodecahedron"
    target_density: 1.05
    tolerance: 2.0
# Thermodynamics
thermodynamics:
  temperature: 300.0
  pressure: 1.0
# Simulation phases
simulation_phases:
  equilibration_stages:
    - name: "heating"
      duration: 0.2
      samples: 20
      ensemble: "NVT"
      temperature_start: 60.0
      temperature_end: 300.0
      temperature_increment: 1.0
      temperature_interval: 1200.0
      position_restraints:
        - group: "protein_heavy"
          force_constant: 4184.0
        - group: "polymer_heavy"
          force_constant: 4184.0
    - name: "free_equilibration"
      duration: 0.8
      samples: 80
      ensemble: "NPT"
      temperature: 300.0
  production:
    ensemble: "NPT"
    duration: 100.0
    samples: 2500
    time_step: 2.0
    thermostat: "LangevinMiddle"
    thermostat_timescale: 1.0
    barostat: "MC"
    barostat_frequency: 25
# Output
output:
  projects_directory: "."
  scratch_directory: null
  job_scripts_subdir: "job_scripts"
  slurm_logs_subdir: "slurm_logs"
  naming_template: "{enzyme}_{polymer_type}_dynamic_run{replicate}"
  save_checkpoint: true
  save_state_data: true
  trajectory_format: "dcd"
How Dynamic Generation Works
Step 1: Fragment Generation
When you run a simulation with generation_mode: "dynamic", the system:
Loads raw monomer SMILES from your configuration
Runs initiation reactions to activate the monomers (chlorination for ATRP)
Runs polymerization reactions to enumerate all possible fragments
Runs termination reactions on 1-site fragments to restore terminal alkenes
Creates a MonomerGroup with named fragments (e.g., SBMA_2-site, SBMA_1-site)
Step 2: Polymer Building
For each polymer chain:
Generate random sequence based on monomer probabilities (e.g., “AABBA”)
Set terminal orientations (head/tail use 1-site fragments)
Build 3D structure using Polymerist’s build_linear_polymer()
Validate no ring-piercing (retry if necessary)
Assign partial charges using the configured charger
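The first two steps of chain building can be sketched with the standard library alone. This is an illustration, not PolyzyMD code: `sample_sequence` and `fragment_names` are hypothetical helpers, the fragment naming follows the `<name>_1-site` / `<name>_2-site` examples above, and chains are assumed to have at least two units.

```python
import random


def sample_sequence(monomers, length, seed=None):
    """Draw a label sequence (e.g. "AABBA") weighted by monomer probabilities."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    labels = [m["label"] for m in monomers]
    weights = [m["probability"] for m in monomers]
    return "".join(rng.choices(labels, weights=weights, k=length))


def fragment_names(sequence, names_by_label):
    """Pick a fragment for each position in the sequence.

    The head and tail use 1-site (terminal) fragments; interior positions
    use 2-site fragments. Assumes len(sequence) >= 2.
    """
    last = len(sequence) - 1
    fragments = []
    for index, label in enumerate(sequence):
        sites = "1-site" if index in (0, last) else "2-site"
        fragments.append(f"{names_by_label[label]}_{sites}")
    return fragments
```

For the sequence "AABBA" with A mapped to SBMA and B mapped to EGPMA, `fragment_names` yields SBMA_1-site, SBMA_2-site, EGPMA_2-site, EGPMA_2-site, SBMA_1-site. The 3D build, ring-piercing validation, and charge assignment are left to the real toolchain.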
Step 3: Caching
Generated fragments and polymers are cached for reuse:
.polymer_cache/
├── SBMA-EGPMA_monomer_group.json # Fragment definitions
├── SBMA-EGPMA_AABBA_5-mer_charged.sdf # Charged polymer structures
└── ...
To regenerate from scratch, delete the cache:
rm -rf .polymer_cache
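The file names in the cache listing suggest a naming pattern that a script could use to check whether a polymer has already been built. The sketch below infers that pattern from the two example names only; `cached_polymer_path` and `monomer_group_path` are hypothetical helpers, and the real naming rules may differ.

```python
from pathlib import Path


def monomer_group_path(cache_dir, type_prefix):
    """Path of the cached fragment definitions for one monomer set."""
    return Path(cache_dir) / f"{type_prefix}_monomer_group.json"


def cached_polymer_path(cache_dir, type_prefix, sequence):
    """Path of the cached charged structure for one sequence.

    Pattern inferred from the listing: <prefix>_<sequence>_<N>-mer_charged.sdf
    """
    filename = f"{type_prefix}_{sequence}_{len(sequence)}-mer_charged.sdf"
    return Path(cache_dir) / filename
```

Calling `cached_polymer_path(".polymer_cache", "SBMA-EGPMA", "AABBA").exists()` would then indicate whether that chain can be reused or has to be generated.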
Supported Chemistries
ATRP (Atom-Transfer Radical Polymerization)
Currently, dynamic generation supports methacrylate-based monomers using ATRP chemistry:
Sulfobetaine methacrylate (SBMA)
Ethylene glycol phenyl ether methacrylate (EGPMA)
Trimethylammonium ethyl methacrylate (TMAEMA)
Sulfopropyl methacrylate (SPMA)
Oligo(ethylene glycol) methacrylate (OEGMA)
Adding New Monomers
To use a new methacrylate monomer:
Obtain the SMILES string (with C=C double bond intact)
Add it to your configuration:
monomers:
  - label: "C"
    probability: 0.1
    name: "MyNewMonomer"
    smiles: "[H]C([H])=C(C(=O)O...)C([H])([H])[H]"
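Because monomer probabilities must sum to 1.0, adding a third monomer means the existing entries have to be adjusted as well. This guide does not say that PolyzyMD rescales them for you, so the following hypothetical helper only computes the values to write into the configuration, keeping the existing monomers in their original ratio.

```python
def add_monomer(monomers, new_monomer):
    """Return a new monomer list with new_monomer appended.

    Existing probabilities are scaled so the full list sums to 1.0 while
    preserving their relative ratio. Hypothetical helper for preparing a
    config; the input list is not modified.
    """
    p_new = new_monomer["probability"]
    if not 0.0 < p_new < 1.0:
        raise ValueError("new monomer probability must be between 0 and 1")

    total_old = sum(m["probability"] for m in monomers)
    if total_old <= 0.0:
        raise ValueError("existing probabilities must sum to a positive value")

    scale = (1.0 - p_new) / total_old
    rescaled = [{**m, "probability": m["probability"] * scale} for m in monomers]
    return rescaled + [dict(new_monomer)]
```

Starting from A at 0.7 and B at 0.3, adding C at 0.1 gives A at 0.63 and B at 0.27, up to floating-point rounding.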
Note
For non-methacrylate monomers or different polymerization chemistries (e.g., ring-opening, condensation), you would need to provide custom reaction files. This is an advanced use case.
Troubleshooting
“No 1-site terminal fragment found for monomer”
Cause: The cached MonomerGroup has old fragment names.
Solution: Delete the cache and regenerate:
rm -rf .polymer_cache
“Failed to build polymer after N attempts due to ring-piercing”
Cause: The polymer structure has atoms passing through rings.
Solutions:
Increase max_retries (default is 10)
Try shorter polymer chains
Check that monomer SMILES are correct
“Symbol ‘X’ not present in monomer group mapping”
Cause: Mismatch between sequence labels and MonomerGroup fragments.
Solution: Ensure all monomer labels (A, B, C…) in your config have corresponding fragments generated. Delete the cache and regenerate.
Slow fragment generation
Cause: First run generates all fragments, which can take time.
Solution: This is normal for the first run. Subsequent runs use the cached MonomerGroup and are much faster.
Performance Tips
Start small: Test with short chains (length: 3-5) and few polymers (count: 1-2) first
Use NAGL charger: It’s much faster than AM1BCC for polymer charging
Reuse cache: Keep .polymer_cache between runs for the same monomer set
Parallelize later: Dynamic generation currently runs sequentially; HPC parallelization is planned
See Also
Polymer Setup Guide - General polymer configuration guide
Configuration Reference - Complete configuration reference
GROMACS Export and Simulation - Running with GROMACS