Configuration Module
Configuration schema for PolyzyMD simulations.
This module defines Pydantic models for all configuration sections, providing validation, type safety, and YAML/JSON serialization support.
- class polyzymd.config.schema.ChargeMethod(value)[source]
Supported charge assignment methods for small molecules.
- NAGL = 'nagl'
- ESPALOMA = 'espaloma'
- AM1BCC = 'am1bcc'
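The enumeration values listed here are the strings that appear in configuration files. As a minimal sketch of how such a string maps to a member, the stand-in class below mirrors the documented names and values using only the standard library. It is not imported from PolyzyMD, and whether the real class also subclasses `str` is an assumption.

```python
from enum import Enum


class ChargeMethod(str, Enum):
    """Stand-in mirroring the documented members (not the PolyzyMD class)."""

    NAGL = "nagl"
    ESPALOMA = "espaloma"
    AM1BCC = "am1bcc"


# Lookup by value, as a parser would do with the string from a config file.
method = ChargeMethod("am1bcc")
print(method.name)

# Strings that are not a documented value are rejected with ValueError.
try:
    ChargeMethod("gasteiger")
except ValueError:
    rejected = True
else:
    rejected = False
```

The same lookup-by-value pattern would apply to the other enumerations on this page, such as WaterModel or BoxShape.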
- class polyzymd.config.schema.WaterModel(value)[source]
Supported water models.
- TIP3P = 'tip3p'
- SPCE = 'spce'
- TIP4P = 'tip4p'
- TIP4PEW = 'tip4pew'
- OPC = 'opc'
- class polyzymd.config.schema.BoxShape(value)[source]
Supported simulation box shapes.
- CUBE = 'cube'
- RHOMBIC_DODECAHEDRON = 'rhombic_dodecahedron'
- TRUNCATED_OCTAHEDRON = 'truncated_octahedron'
- class polyzymd.config.schema.Ensemble(value)[source]
Thermodynamic ensemble types.
- NVT = 'NVT'
- NPT = 'NPT'
- NVE = 'NVE'
- class polyzymd.config.schema.ThermostatType(value)[source]
Supported thermostat types.
- LANGEVIN_MIDDLE = 'LangevinMiddle'
- LANGEVIN = 'Langevin'
- ANDERSEN = 'Andersen'
- NOSE_HOOVER = 'NoseHoover'
- class polyzymd.config.schema.BarostatType(value)[source]
Supported barostat types.
- MONTE_CARLO = 'MC'
- MONTE_CARLO_ANISOTROPIC = 'MCA'
- class polyzymd.config.schema.RestraintType(value)[source]
Types of restraints that can be applied.
- FLAT_BOTTOM = 'flat_bottom'
- HARMONIC = 'harmonic'
- UPPER_WALL = 'upper_wall'
- LOWER_WALL = 'lower_wall'
- class polyzymd.config.schema.EnzymeConfig(*, name, pdb_path, description=None)[source]
Bases: BaseModel
Configuration for the enzyme/protein component.
- Variables:
- pdb_path: Path
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.SubstrateConfig(*, name, sdf_path, conformer_index=0, charge_method=ChargeMethod.NAGL, residue_name='LIG')[source]
Bases: BaseModel
Configuration for the docked substrate/ligand.
- Variables:
name (str) – Identifier for the substrate (e.g., “Resorufin-Butyrate”)
sdf_path (Path) – Path to SDF file with docked conformers
conformer_index (int) – Which conformer to use (0-indexed)
charge_method (ChargeMethod) – Method for assigning partial charges
residue_name (str) – 3-letter residue name for topology (default: “LIG”)
- sdf_path: Path
- charge_method: ChargeMethod
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.PolymerPackingConfig(*, padding=2.0, tolerance=2.0, movebadrandom=False, box_vectors=None)[source]
Bases: BaseModel
Settings for packing polymers around the solute.
Controls the box size and PACKMOL behavior when packing polymers around the protein-ligand complex.
- Variables:
padding (float) – Box padding around the solute in nanometers. Larger values give polymers more room and can speed up PACKMOL convergence.
tolerance (float) – Minimum molecular spacing for PACKMOL in Angstrom.
movebadrandom (bool) – When True, pass the movebadrandom keyword to PACKMOL. This places badly-packed molecules at random positions in the box rather than near well-packed neighbours, which improves convergence for dense or heterogeneous systems (many unique chain types). Has no effect when only a single chain type is present. Default is False (PACKMOL default behaviour is preserved).
box_vectors (list[float] | None) – Optional explicit box dimensions [Lx, Ly, Lz] in nanometers. When set, overrides the auto-computed bounding box plus padding. The protein is centered at the midpoint of the box. Default is None (auto-compute from solute bounding box + padding).
Example
>>> PolymerPackingConfig(padding=2.5, movebadrandom=True)
>>> PolymerPackingConfig(box_vectors=[8.0, 10.0, 12.0])
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
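To illustrate how padding and box_vectors relate, here is a minimal sketch of the box selection described for PolymerPackingConfig. The helper `packing_box` is hypothetical, and it assumes that padding is added on every side of the solute bounding box; the convention PolyzyMD actually uses may differ.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def packing_box(
    solute_min: Vec3,
    solute_max: Vec3,
    padding: float = 2.0,
    box_vectors: Optional[Sequence[float]] = None,
) -> Tuple[Vec3, Vec3]:
    """Return (box lengths, box center) in nanometers.

    Hypothetical helper. Assumes padding is added on every side of the
    solute bounding box; the real convention may differ.
    """
    if box_vectors is not None:
        # Explicit dimensions override the auto-computed box.
        if len(box_vectors) != 3:
            raise ValueError("box_vectors must be [Lx, Ly, Lz]")
        lengths = tuple(float(v) for v in box_vectors)
    else:
        lengths = tuple(
            hi - lo + 2.0 * padding for lo, hi in zip(solute_min, solute_max)
        )
    # The solute is placed at the midpoint of the box.
    center = tuple(length / 2.0 for length in lengths)
    return lengths, center
```

Under these assumptions, a 4 x 5 x 6 nm solute with padding=2.0 yields an 8 x 9 x 10 nm box, while box_vectors=[8.0, 10.0, 12.0] is used as given.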
- class polyzymd.config.schema.MonomerSpec(*, label, probability, name=None, smiles=None, residue_name=None)[source]
Bases: BaseModel
Specification for a single monomer type in a co-polymer.
For dynamic polymer generation, provide the raw (unactivated) monomer SMILES. The system will run initiation reactions to create the active fragments.
- Variables:
label (str) – Single character label for this monomer (e.g., “A”, “B”)
probability (float) – Probability of selecting this monomer (0-1)
name (Optional[str]) – Optional full name (e.g., “SBMA”, “EGPMA”)
smiles (Optional[str]) – Raw monomer SMILES string (required for dynamic generation)
residue_name (Optional[str]) – 3-character PDB residue name (auto-generated if not provided)
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
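The label and probability fields suggest that chain sequences are drawn at random, one monomer at a time. The following is a sketch of that idea using the standard library, not PolyzyMD's own sampler. The function name `sample_sequence` and the requirement that probabilities sum to 1 are assumptions.

```python
import random
from typing import List, Optional, Tuple


def sample_sequence(
    monomers: List[Tuple[str, float]],
    length: int,
    random_seed: Optional[int] = None,
) -> str:
    """Draw a chain of `length` monomer labels, weighted by probability."""
    labels = [label for label, _ in monomers]
    weights = [probability for _, probability in monomers]
    # Assumption: the probabilities of all monomers should sum to 1.
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("monomer probabilities should sum to 1")
    # A local generator keeps the draw reproducible for a given seed.
    rng = random.Random(random_seed)
    return "".join(rng.choices(labels, weights=weights, k=length))


sequence = sample_sequence([("A", 0.7), ("B", 0.3)], length=20, random_seed=42)
print(sequence)
```

Passing the same seed reproduces the same sequence, which is the role the random_seed field of PolymerConfig is documented to play.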
- class polyzymd.config.schema.PolymerGenerationMode(value)[source]
Mode for polymer generation.
- CACHED = 'cached'
- DYNAMIC = 'dynamic'
- class polyzymd.config.schema.ReactionConfig(*, initiation, polymerization, termination)[source]
Bases: BaseModel
Paths to reaction templates for ATRP polymer generation.
These .rxn files define the chemical transformations used to create polymer fragments from raw monomer SMILES. For ATRP, this includes:
- Initiation: Activates the vinyl group (e.g., chlorination)
- Polymerization: Creates chain-extending fragments
- Termination: Restores the alkene for chain ends
You can use “default” as a special value to use the bundled ATRP methacrylate reaction templates that ship with PolyzyMD.
Example
reactions:
  initiation: "default"
  polymerization: "default"
  termination: "default"
- Variables:
initiation (Path) – Path to the initiation reaction template (.rxn) or “default”
polymerization (Path) – Path to the polymerization reaction template (.rxn) or “default”
termination (Path) – Path to the termination reaction template (.rxn) or “default”
- initiation: Path
- polymerization: Path
- termination: Path
- classmethod resolve_default_paths(v, info)[source]
Resolve ‘default’ to bundled ATRP reaction paths.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
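The special value "default" has to be turned into a real file path before the templates can be read. A minimal sketch of that kind of resolution follows. The directory `BUNDLED_DIR`, the file naming, and the function name are all hypothetical, since the location of the bundled templates is not documented here.

```python
from pathlib import Path

# Hypothetical location of the bundled templates; the real package layout is
# not documented on this page.
BUNDLED_DIR = Path("polyzymd/data/reactions")


def resolve_reaction_path(value: str, field_name: str) -> Path:
    """Map the special value "default" to a bundled template path."""
    if str(value) == "default":
        # File name derived from the field name purely for illustration.
        return BUNDLED_DIR / f"{field_name}.rxn"
    return Path(value)
```

Any other value is treated as a user-supplied path and passed through unchanged.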
- class polyzymd.config.schema.PolymerConfig(*, enabled=True, generation_mode=PolymerGenerationMode.CACHED, type_prefix, monomers, length, count, sdf_directory=None, reactions=None, charger=ChargeMethod.NAGL, max_retries=10, cache_directory=PosixPath('.polymer_cache'), packing=<factory>, random_seed=None)[source]
Bases: BaseModel
Configuration for polymer components.
Supports two generation modes:
- “cached”: Load pre-built polymer SDF files from sdf_directory (legacy)
- “dynamic”: Generate polymers on-the-fly using Polymerist from SMILES
For dynamic mode, you must provide:
- SMILES for each monomer in monomers[].smiles
- Reaction templates in the reactions field
- Variables:
enabled (bool) – Whether to include polymers in the system
generation_mode (PolymerGenerationMode) – “cached” for pre-built SDFs, “dynamic” for on-the-fly generation
type_prefix (str) – Prefix for polymer type in filenames (e.g., “SBMA-EGPMA”)
monomers (List[MonomerSpec]) – List of monomer specifications with probabilities (and SMILES for dynamic)
length (int) – Number of monomer units per polymer chain
count (int) – Number of polymer chains to add
sdf_directory (Optional[Path]) – Path to pre-built polymer SDF files (for cached mode)
reactions (Optional[ReactionConfig]) – Reaction templates for ATRP (required for dynamic mode)
charger (ChargeMethod) – Charge assignment method for generated polymers
max_retries (int) – Maximum retries for polymer generation (ring-piercing failures)
cache_directory (Path) – Directory for caching generated polymers and fragments
packing (PolymerPackingConfig) – Settings for packing polymers around the solute
random_seed (Optional[int]) – Random seed for polymer sequence generation (for reproducibility)
- generation_mode: PolymerGenerationMode
- monomers: List[MonomerSpec]
- reactions: ReactionConfig | None
- charger: ChargeMethod
- cache_directory: Path
- packing: PolymerPackingConfig
- validate_generation_mode_requirements()[source]
Validate that required fields are present for the selected generation mode.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
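The mode-dependent requirements can be sketched as a plain function over a dictionary. This is an illustration of the documented rules, not the actual validator. The function name is hypothetical, and the cached-mode check (that sdf_directory must be set) is an assumption inferred from the field description.

```python
from typing import Any, Dict, List


def check_generation_mode(polymer: Dict[str, Any]) -> List[str]:
    """Return a list of problems for a polymer config given as a plain dict."""
    problems: List[str] = []
    mode = polymer.get("generation_mode", "cached")
    if mode == "dynamic":
        # Dynamic mode needs SMILES for every monomer and reaction templates.
        for monomer in polymer.get("monomers", []):
            if not monomer.get("smiles"):
                problems.append(f"monomer {monomer.get('label')!r} needs smiles")
        if polymer.get("reactions") is None:
            problems.append("reactions is required in dynamic mode")
    elif mode == "cached":
        # Assumption: cached mode needs a directory of pre-built SDF files.
        if polymer.get("sdf_directory") is None:
            problems.append("sdf_directory is required in cached mode")
    else:
        problems.append(f"unknown generation_mode {mode!r}")
    return problems
```

A Pydantic model validator would typically raise on the first problem instead of collecting them; returning a list here just makes the rules easy to inspect.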
- class polyzymd.config.schema.CoSolventSpec(*, name, smiles=None, volume_fraction=None, concentration=None, density=None, residue_name=None)[source]
Bases: BaseModel
Specification for a co-solvent component.
You must specify EITHER volume_fraction OR concentration, not both.
For co-solvents in the built-in library (dmso, dmf, urea, ethanol, etc.), you can omit the smiles and density fields - they will be looked up automatically.
- Variables:
name (str) – Identifier for the co-solvent (e.g., “dmso”)
smiles (Optional[str]) – SMILES string (optional if co-solvent is in library)
volume_fraction (Optional[float]) – Volume fraction (0-1), e.g., 0.30 for 30% v/v
concentration (Optional[float]) – Molar concentration (mol/L)
density (Optional[float]) – Density in g/mL (required for volume_fraction with custom molecules)
residue_name (Optional[str]) – 3-letter residue name (default: first 3 chars of name)
- Example (library co-solvent with volume fraction):
>>> CoSolventSpec(name="dmso", volume_fraction=0.30)
- Example (library co-solvent with concentration):
>>> CoSolventSpec(name="urea", concentration=2.0)
- Example (custom co-solvent):
>>> CoSolventSpec(
...     name="my_solvent",
...     smiles="CCOC(=O)C",
...     density=0.902,
...     volume_fraction=0.15
... )
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
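The either/or rule for volume_fraction and concentration can be expressed as a small check. The sketch below uses a hypothetical function, `check_amount`, and only illustrates the rule; the real model may report errors differently.

```python
from typing import Optional


def check_amount(
    volume_fraction: Optional[float] = None,
    concentration: Optional[float] = None,
) -> str:
    """Return which quantity was given, enforcing the either/or rule."""
    given = [volume_fraction is not None, concentration is not None]
    if sum(given) != 1:
        raise ValueError("specify exactly one of volume_fraction or concentration")
    if volume_fraction is not None:
        # The documented range for a volume fraction is 0 to 1.
        if not 0.0 <= volume_fraction <= 1.0:
            raise ValueError("volume_fraction must be between 0 and 1")
        return "volume_fraction"
    return "concentration"
```

Calling it with both arguments, or with neither, raises ValueError.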
- class polyzymd.config.schema.PrimarySolventConfig(*, type='water', model=WaterModel.TIP3P)[source]
Bases: BaseModel
Configuration for the primary solvent (usually water).
- Variables:
type (str) – Solvent type identifier
model (WaterModel) – Water model to use (if type is “water”)
- model: WaterModel
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.IonConfig(*, neutralize=True, nacl_concentration=0.1, kcl_concentration=0.0, mgcl2_concentration=0.0)[source]
Bases: BaseModel
Configuration for ions in the solvent.
- Variables:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.BoxConfig(*, padding=1.2, shape=BoxShape.RHOMBIC_DODECAHEDRON, target_density=1.0, tolerance=2.0)[source]
Bases: BaseModel
Configuration for the simulation box.
- Variables:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.SolventConfig(*, primary=<factory>, co_solvents=<factory>, ions=<factory>, box=<factory>)[source]
Bases: BaseModel
Complete solvent configuration.
- Variables:
primary (PrimarySolventConfig) – Primary solvent settings
co_solvents (List[CoSolventSpec]) – List of co-solvent specifications
ions (IonConfig) – Ion configuration
box (BoxConfig) – Box geometry settings
- primary: PrimarySolventConfig
- co_solvents: List[CoSolventSpec]
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.AtomSelectionConfig(*, selection, description=None)[source]
Bases: BaseModel
Configuration for selecting atoms for restraints.
Uses MDAnalysis-compatible selection syntax for flexibility.
- Variables:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.RestraintConfig(*, type, name, atom1, atom2, distance, force_constant=10000.0, enabled=True)[source]
Bases: BaseModel
Configuration for a single restraint.
- Variables:
type (RestraintType) – Type of restraint (flat_bottom, harmonic, etc.)
name (str) – Identifier for this restraint
atom1 (AtomSelectionConfig) – First atom selection
atom2 (AtomSelectionConfig) – Second atom selection
distance (float) – Target/threshold distance in Angstroms
force_constant (float) – Force constant in kJ/mol/nm^2
enabled (bool) – Whether this restraint is active
- type: RestraintType
- atom1: AtomSelectionConfig
- atom2: AtomSelectionConfig
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.ThermodynamicsConfig(*, temperature, pressure=1.0)[source]
Bases: BaseModel
Thermodynamic conditions for the simulation.
- Variables:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.SimulationPhaseConfig(*, ensemble, duration, samples, time_step=2.0, thermostat=ThermostatType.LANGEVIN_MIDDLE, thermostat_timescale=1.0, barostat=None, barostat_frequency=25, checkpoint_interval=60.0)[source]
Bases: BaseModel
Configuration for a single simulation phase (equilibration or production).
- Variables:
ensemble (Ensemble) – Thermodynamic ensemble (NVT, NPT)
duration (float) – Simulation duration in nanoseconds
samples (int) – Number of trajectory frames to save
time_step (float) – Integration time step in femtoseconds
thermostat (ThermostatType) – Thermostat type
thermostat_timescale (float) – Thermostat coupling timescale in ps
barostat (Optional[BarostatType]) – Barostat type (for NPT)
barostat_frequency (int) – Barostat update frequency (steps)
checkpoint_interval (float) – Wall-time interval (seconds) between restart checkpoints for preemption resilience
- thermostat: ThermostatType
- barostat: BarostatType | None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
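Because duration is given in nanoseconds and time_step in femtoseconds, it can help to see how the two combine with samples. The arithmetic below is a sketch with a hypothetical helper; how PolyzyMD rounds or distributes frames internally is not documented here.

```python
def phase_steps(duration_ns: float, samples: int, time_step_fs: float = 2.0):
    """Return (total integration steps, steps between saved frames)."""
    # 1 ns = 1,000,000 fs
    total_steps = round(duration_ns * 1_000_000 / time_step_fs)
    if samples <= 0 or samples > total_steps:
        raise ValueError("samples must be between 1 and the number of steps")
    return total_steps, total_steps // samples


total, interval = phase_steps(duration_ns=100.0, samples=2500, time_step_fs=2.0)
print(total, interval)
```

For 100 ns at a 2 fs time step this gives 50,000,000 steps, and 2500 frames correspond to one frame every 20,000 steps (40 ps).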
- class polyzymd.config.schema.PositionRestraintConfig(*, group, force_constant=4184.0)[source]
Bases: BaseModel
Configuration for positional restraints on an atom group.
Position restraints apply a harmonic potential to keep atoms near their initial coordinates. This is commonly used during equilibration to prevent large structural changes while the system relaxes.
- Variables:
- classmethod validate_group_name(v)[source]
Validate that the group name is a recognized predefined group.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.EquilibrationStageConfig(*, name, duration, samples=100, ensemble=Ensemble.NVT, temperature=None, temperature_start=None, temperature_end=None, temperature_increment=1.0, temperature_interval=1200.0, position_restraints=<factory>, time_step=None, thermostat=None, thermostat_timescale=None, barostat=None, barostat_frequency=None)[source]
Bases: BaseModel
Configuration for a single equilibration stage.
Supports two temperature modes:
- Constant temperature: Set the temperature field
- Temperature ramping (simulated annealing): Set the temperature_start and temperature_end fields
Position restraints can be applied to hold specific atom groups in place during the stage.
- Variables:
name (str) – Stage identifier (used in output paths)
duration (float) – Stage duration in nanoseconds
samples (int) – Number of trajectory frames to save
ensemble (Ensemble) – Thermodynamic ensemble (NVT or NPT)
temperature (Optional[float]) – Constant temperature in K (mutually exclusive with ramping)
temperature_start (Optional[float]) – Starting temperature for ramping in K
temperature_end (Optional[float]) – Ending temperature for ramping in K
temperature_increment (float) – Temperature increment per update in K
temperature_interval (float) – Time between temperature updates in fs
position_restraints (List[PositionRestraintConfig]) – List of position restraints for this stage
time_step (Optional[float]) – Optional time step override in fs
thermostat (Optional[ThermostatType]) – Optional thermostat type override
thermostat_timescale (Optional[float]) – Optional thermostat timescale override in ps
barostat (Optional[BarostatType]) – Optional barostat type (for NPT ensemble)
barostat_frequency (Optional[int]) – Optional barostat update frequency
- position_restraints: List[PositionRestraintConfig]
- thermostat: ThermostatType | None
- barostat: BarostatType | None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
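The ramping fields describe a staircase: the temperature changes by temperature_increment every temperature_interval until it reaches temperature_end. The sketch below builds such a schedule. The function is hypothetical, and it assumes the stage starts at temperature_start at time zero with the first update after one full interval.

```python
from typing import List, Tuple


def ramp_schedule(
    temperature_start: float,
    temperature_end: float,
    temperature_increment: float = 1.0,
    temperature_interval: float = 1200.0,
) -> List[Tuple[float, float]]:
    """Return (elapsed time in fs, temperature in K) for each update."""
    if temperature_increment <= 0:
        raise ValueError("temperature_increment must be positive")
    # Ramp up or down depending on the sign of the difference.
    direction = 1.0 if temperature_end >= temperature_start else -1.0
    schedule = [(0.0, temperature_start)]
    temperature = temperature_start
    while abs(temperature_end - temperature) > 1e-9:
        # The last step is shortened so the ramp ends exactly on target.
        step = min(temperature_increment, abs(temperature_end - temperature))
        temperature += direction * step
        schedule.append((len(schedule) * temperature_interval, temperature))
    return schedule


schedule = ramp_schedule(0.0, 300.0)
print(len(schedule), schedule[-1])
```

With the default increment (1.0 K) and interval (1200 fs), heating from 0 K to 300 K takes 300 updates and reaches the target after 360,000 fs (360 ps), so the stage duration should be at least that long for the ramp to complete.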
- class polyzymd.config.schema.SimulationPhasesConfig(*, equilibration_stages=None, production)[source]
Bases: BaseModel
Configuration for all simulation phases.
- Variables:
equilibration_stages (Optional[List[EquilibrationStageConfig]]) – Multi-stage equilibration protocol
production (SimulationPhaseConfig) – Production phase settings
- model_config = {'extra': 'ignore'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- equilibration_stages: List[EquilibrationStageConfig] | None
- production: SimulationPhaseConfig
- classmethod warn_deprecated_segments(data)[source]
Warn if the deprecated ‘segments’ field is present and remove it.
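A hook of this kind usually runs on the raw input before field validation. As a rough sketch of the described behaviour (warn, then drop the key), written as a plain function rather than the actual classmethod, and with an illustrative warning message:

```python
import warnings
from typing import Any, Dict


def drop_deprecated_segments(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return `data` without 'segments', warning if the key was present."""
    if "segments" not in data:
        return data
    warnings.warn(
        "'segments' is deprecated and will be ignored",
        DeprecationWarning,
        stacklevel=2,
    )
    # Build a new dict so the caller's input is not mutated.
    return {key: value for key, value in data.items() if key != "segments"}
```

Which warning category the real method emits is not stated on this page; DeprecationWarning is an assumption.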
- polyzymd.config.schema.expand_path(path)[source]
Expand environment variables and user home in a path.
Supports:
- $VAR and ${VAR} syntax for environment variables
- ~ for user home directory
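Both kinds of expansion are available in the Python standard library, so the behaviour can be approximated as follows. This is a sketch under that assumption, not the source of expand_path; the function name and the environment variable are illustrative.

```python
import os
from pathlib import Path


def expand_path_sketch(path) -> Path:
    """Expand $VAR, ${VAR}, and a leading ~ in `path`."""
    expanded = os.path.expandvars(str(path))
    return Path(os.path.expanduser(expanded))


# Variable name chosen for this example only.
os.environ["POLYZYMD_DEMO_ROOT"] = "/scratch/demo"
print(expand_path_sketch("${POLYZYMD_DEMO_ROOT}/run1"))
```

Note that os.path.expandvars leaves references to undefined variables unchanged rather than raising an error.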
- class polyzymd.config.schema.OutputConfig(*, projects_directory=PosixPath('.'), scratch_directory=None, naming_template='{enzyme}_{substrate}_{polymer_type}_{duration}ns_{temperature}K_run{replicate}', job_scripts_subdir='job_scripts', slurm_logs_subdir='slurm_logs', save_checkpoint=True, save_state_data=True, trajectory_format='dcd', base_directory=None)[source]
Bases: BaseModel
Configuration for simulation output.
Supports separate directories for:
- scripts/logs (projects_directory): Where job scripts and SLURM logs are written
- simulation data (scratch_directory): Where trajectories and checkpoints go
This separation allows running simulations on HPC systems where code lives in long-term storage (projects) but data is written to high-performance scratch storage.
- Variables:
projects_directory (Path) – Directory for scripts, configs, logs (long-term storage)
scratch_directory (Optional[Path]) – Directory for simulation output (high-performance storage)
naming_template (str) – Template for naming working directories
job_scripts_subdir (str) – Subdirectory name for job scripts within projects
slurm_logs_subdir (str) – Subdirectory name for SLURM logs within projects
save_checkpoint (bool) – Whether to save checkpoint files
save_state_data (bool) – Whether to save thermodynamic state data
trajectory_format (str) – Output trajectory format
- Example YAML:
output:
  projects_directory: /projects/user/polyzymd
  scratch_directory: /scratch/alpine/user/simulations
  naming_template: "{enzyme}_{substrate}_{temperature}K_run{replicate}"
- projects_directory: Path
- classmethod expand_env_vars_in_paths(v)[source]
Expand environment variables and ~ in path fields.
Supports $USER, ${HOME}, ~/path, etc.
- handle_legacy_base_directory()[source]
Handle legacy base_directory field for backwards compatibility.
- property effective_scratch_directory: Path
Get the effective scratch directory (falls back to projects if not set).
- format_directory_name(enzyme, substrate, polymer_type, temperature, replicate, duration=0.0, **kwargs)[source]
Format the directory name using the template.
- Parameters:
- Returns:
Formatted directory name
- Return type:
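The naming template uses Python's brace-style placeholders, so the formatting step can be approximated with str.format. The sketch below uses the default template from the OutputConfig signature; the function name is hypothetical, and how the real method formats numeric fields (for example 100 versus 100.0) is not documented here.

```python
DEFAULT_TEMPLATE = (
    "{enzyme}_{substrate}_{polymer_type}_{duration}ns_{temperature}K_run{replicate}"
)


def format_directory_name_sketch(template: str = DEFAULT_TEMPLATE, **fields) -> str:
    """Fill the naming template; fields the template does not use are ignored."""
    return template.format(**fields)


fields = dict(
    enzyme="LipA",
    substrate="ResorufinButyrate",
    polymer_type="SBMA-EGPMA",
    duration=100,
    temperature=300,
    replicate=1,
)
name = format_directory_name_sketch(**fields)
short = format_directory_name_sketch(
    "{enzyme}_{substrate}_{temperature}K_run{replicate}", **fields
)
print(name)
print(short)
```

Because str.format ignores unused keyword arguments, a shorter custom template can be filled from the same set of fields.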
- get_job_scripts_directory()[source]
Get the directory for job scripts.
- Returns:
Path to job scripts directory (within projects)
- Return type:
- get_slurm_logs_directory()[source]
Get the directory for SLURM log files.
- Returns:
Path to SLURM logs directory (within projects)
- Return type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.ForceFieldConfig(*, protein='ff14sb_off_impropers_0.0.4.offxml', small_molecule='openff-2.0.0.offxml')[source]
Bases: BaseModel
Configuration for force field selection.
- Variables:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.config.schema.SimulationConfig(*, name, description=None, enzyme, substrate=None, polymers=None, solvent=<factory>, restraints=<factory>, thermodynamics, simulation_phases, output=<factory>, force_field=<factory>)[source]
Bases: BaseModel
Complete simulation configuration.
This is the top-level configuration model that contains all settings for a PolyzyMD simulation.
- Variables:
name (str) – Simulation name/identifier
description (Optional[str]) – Optional description
enzyme (EnzymeConfig) – Enzyme configuration
substrate (Optional[SubstrateConfig]) – Substrate/ligand configuration
polymers (Optional[PolymerConfig]) – Polymer configuration (optional)
solvent (SolventConfig) – Solvent and box configuration
restraints (List[RestraintConfig]) – List of restraint configurations
thermodynamics (ThermodynamicsConfig) – Temperature/pressure settings
simulation_phases (SimulationPhasesConfig) – Equilibration and production settings
output (OutputConfig) – Output file settings
force_field (ForceFieldConfig) – Force field selection
Example
>>> config = SimulationConfig.from_yaml("simulation.yaml")
>>> print(config.enzyme.name)
LipA
- enzyme: EnzymeConfig
- substrate: SubstrateConfig | None
- polymers: PolymerConfig | None
- solvent: SolventConfig
- restraints: List[RestraintConfig]
- thermodynamics: ThermodynamicsConfig
- simulation_phases: SimulationPhasesConfig
- output: OutputConfig
- force_field: ForceFieldConfig
- classmethod from_yaml(path)[source]
Load configuration from a YAML file.
- Parameters:
- Returns:
SimulationConfig instance
- Raises:
FileNotFoundError – If file doesn’t exist
ValidationError – If configuration is invalid
- Return type:
- get_working_directory(replicate=1)[source]
Get the working directory path for a given replicate (in scratch).
This returns the path where simulation output (trajectories, checkpoints) will be written, which is in the scratch directory.
- get_projects_directory()[source]
Get the projects directory path.
This is where scripts, configs, and logs are stored.
- Returns:
Path to the projects directory
- Return type:
- discover_replicate_dirs()[source]
Auto-detect all replicate directories on disk.
Builds a glob pattern from the naming template with replicate="*" and scans the effective scratch directory.
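The glob-based discovery can be sketched with pathlib. This example uses a shortened, illustrative template and a temporary directory in place of the scratch directory; the function name is hypothetical and the real method may filter or sort results differently.

```python
import tempfile
from pathlib import Path
from typing import List


def discover_dirs(scratch: Path, template: str, **fields) -> List[Path]:
    """Glob for directories matching the template with replicate='*'."""
    pattern = template.format(replicate="*", **fields)
    return sorted(p for p in scratch.glob(pattern) if p.is_dir())


template = "{enzyme}_{temperature}K_run{replicate}"
with tempfile.TemporaryDirectory() as tmp:
    scratch = Path(tmp)
    for replicate in (1, 2, 3):
        (scratch / f"LipA_300K_run{replicate}").mkdir()
    # A directory for a different temperature should not match.
    (scratch / "LipA_310K_run1").mkdir()
    found = [
        p.name
        for p in discover_dirs(scratch, template, enzyme="LipA", temperature=300)
    ]
print(found)
```

Only the three 300 K replicate directories are returned, since the pattern fixes every field except the replicate number.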
- to_signac_statepoint(replicate=1)[source]
Convert configuration to a Signac-compatible state point dictionary.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Loader Utilities
YAML configuration loader and saver for PolyzyMD.
This module provides functions to load and save SimulationConfig objects from/to YAML files, with support for Path objects and environment variable expansion.
- class polyzymd.config.loader.ConfigLoader(base_path)[source]
Bases: object
Custom YAML loader with support for includes and references.
- polyzymd.config.loader.load_config(path)[source]
Load a SimulationConfig from a YAML file.
- Parameters:
- Returns:
Validated SimulationConfig instance
- Raises:
FileNotFoundError – If the config file doesn’t exist
yaml.YAMLError – If the YAML is malformed
pydantic.ValidationError – If the configuration is invalid
- Return type:
Example
>>> config = load_config("my_simulation.yaml")
>>> print(config.enzyme.name)
LipA
- polyzymd.config.loader.save_config(config, path, relative_paths=True)[source]
Save a SimulationConfig to a YAML file.
- Parameters:
config (SimulationConfig) – Configuration to save
relative_paths (bool) – Whether to convert paths to relative (default: True)
Example
>>> config = SimulationConfig(...)
>>> save_config(config, "output_config.yaml")
- polyzymd.config.loader.load_config_dict(data, base_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/polyzymd/checkouts/latest/docs/source'))[source]
Create a SimulationConfig from a dictionary.
This is useful for programmatic configuration creation.
- Parameters:
- Returns:
Validated SimulationConfig instance
- Return type:
Example
>>> data = {
...     "name": "test_sim",
...     "enzyme": {"name": "LipA", "pdb_path": "enzyme.pdb"},
...     ...
... }
>>> config = load_config_dict(data)
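For orientation, the dictionary below lists only the fields that the constructor signatures on this page mark as required (those without defaults), plus an explicit barostat for the NPT phase. All values are illustrative, and additional validators in PolyzyMD may impose requirements that are not visible from the signatures.

```python
# Required fields only, read off the constructor signatures documented above.
# All values are illustrative.
minimal_config = {
    "name": "test_sim",
    "enzyme": {"name": "LipA", "pdb_path": "enzyme.pdb"},
    "thermodynamics": {"temperature": 300.0},  # presumably kelvin
    "simulation_phases": {
        "production": {
            "ensemble": "NPT",
            "duration": 100.0,  # nanoseconds
            "samples": 2500,
            "barostat": "MC",  # optional field, set here because of NPT
        },
    },
}

# With PolyzyMD installed, this dictionary could then be validated with:
#     from polyzymd.config.loader import load_config_dict
#     config = load_config_dict(minimal_config)
```

Sections with defaults or factories (solvent, restraints, output, force_field) are omitted and would take their documented default values.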