Contributing Guide
This guide covers how to contribute to PolyzyMD, with a focus on adding new solvents to the library.
Contributing New Solvents
The co-solvent library is designed to be easily extensible. Here’s how to add a new solvent.
Step 1: Add to the Co-solvent Library
Edit src/polyzymd/data/cosolvent_library.py and add your solvent to COSOLVENT_LIBRARY:
COSOLVENT_LIBRARY: Dict[str, CoSolventData] = {
    # ... existing solvents ...
    "formamide": CoSolventData(
        name="formamide",
        smiles="C(=O)N",
        density=1.133,  # g/mL from PubChem
        residue_name="FOR",  # 3-letter code
        pubchem_cid=713,  # For reference
    ),
}
Requirements:
name: Lowercase identifier (used in YAML configs)
smiles: Valid SMILES string (verify with RDKit or OpenEye)
density: Density in g/mL (required for volume fraction calculations)
residue_name: 3-letter code for PDB/topology files
pubchem_cid: PubChem Compound ID for traceability
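The requirements above can be checked mechanically before running the generator. The following is a minimal sketch, not part of PolyzyMD: the `CoSolventData` dataclass here is a hypothetical stand-in that mirrors the five fields listed above, `validate_entry` is an illustrative helper, and the rule that residue names are exactly three uppercase letters or digits is an assumption. It does not check that the SMILES string is chemically valid; use RDKit or OpenEye for that.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoSolventData:
    """Hypothetical stand-in mirroring the fields listed in this guide."""

    name: str
    smiles: str
    density: float  # g/mL
    residue_name: str
    pubchem_cid: int


def validate_entry(entry: CoSolventData) -> List[str]:
    """Return a list of problems found in a library entry (empty if none)."""
    problems: List[str] = []
    if not entry.name or entry.name != entry.name.lower():
        problems.append("name must be a non-empty lowercase identifier")
    if not entry.smiles.strip():
        problems.append("smiles must not be empty")
    if entry.density <= 0:
        problems.append("density must be a positive value in g/mL")
    # Assumption: residue codes are exactly three uppercase letters or digits.
    if not re.fullmatch(r"[A-Z0-9]{3}", entry.residue_name):
        problems.append("residue_name must be a 3-character uppercase code")
    if entry.pubchem_cid <= 0:
        problems.append("pubchem_cid must be a positive integer")
    return problems


formamide = CoSolventData(
    name="formamide",
    smiles="C(=O)N",
    density=1.133,
    residue_name="FOR",
    pubchem_cid=713,
)
print(validate_entry(formamide))
```

An empty list means the entry passes these basic checks; each string in a non-empty list names one field to fix.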
Step 2: Generate the Pre-computed SDF
Run the generator script to create the SDF file with embedded charges:
pixi shell -e build
cd /path/to/polyzymd
python src/polyzymd/data/solvents/_generator.py
This will:
Load the molecule from SMILES
Generate a 3D conformer
Compute AM1BCC partial charges
Save to src/polyzymd/data/solvents/{name}.sdf
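For orientation, the four steps above map onto a handful of OpenFF Toolkit calls. The sketch below is an assumption about what such a generator might look like, not the contents of `_generator.py`. `sdf_path_for` and `generate_sdf` are hypothetical names, and the sketch leaves out the residue-name handling because this guide does not say how the script writes it.

```python
from pathlib import Path

# Output directory named in this guide.
SOLVENT_DIR = Path("src/polyzymd/data/solvents")


def sdf_path_for(name: str) -> Path:
    """Return the expected SDF path for a solvent name."""
    return SOLVENT_DIR / f"{name}.sdf"


def generate_sdf(name: str, smiles: str) -> Path:
    """Build a charged 3D molecule from SMILES and write it as SDF.

    Hypothetical helper: requires openff-toolkit plus an AM1-BCC
    backend (for example AmberTools) in the active environment.
    """
    # Imported lazily so sdf_path_for works without the toolkit installed.
    from openff.toolkit import Molecule

    mol = Molecule.from_smiles(smiles)        # 1. load from SMILES
    mol.generate_conformers(n_conformers=1)   # 2. generate a 3D conformer
    mol.assign_partial_charges("am1bcc")      # 3. compute AM1BCC charges
    out_path = sdf_path_for(name)
    mol.to_file(str(out_path), file_format="sdf")  # 4. save the SDF
    return out_path


print(sdf_path_for("formamide"))
```

Calling `generate_sdf("formamide", "C(=O)N")` from the repository root would then write the file whose path is printed above, provided the output directory exists.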
Step 3: Verify the SDF
Check that the SDF was created correctly:
from openff.toolkit import Molecule
mol = Molecule.from_file("src/polyzymd/data/solvents/formamide.sdf")
print(f"Atoms: {mol.n_atoms}")
print(f"Charges: {mol.partial_charges}")
print(f"Residue: {mol.atoms[0].metadata.get('residue_name')}")
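Beyond printing the values, two quick sanity checks are worth automating: the SDF should carry one charge per atom, and the charges should sum to the molecule's net charge. The helper below is a minimal sketch with a hypothetical name (`check_partial_charges`) and an assumed tolerance of 1e-4 e. It works on plain floats so it can be tried without the toolkit, and the example values are the TIP3P charges quoted in the Water Models section of this guide.

```python
import math
from typing import Sequence


def check_partial_charges(
    charges: Sequence[float],
    n_atoms: int,
    net_charge: int = 0,
    tol: float = 1e-4,
) -> float:
    """Validate a charge list and return its total in elementary charges."""
    if len(charges) != n_atoms:
        raise ValueError(f"expected {n_atoms} charges, found {len(charges)}")
    total = math.fsum(charges)
    if abs(total - net_charge) > tol:
        raise ValueError(f"charges sum to {total:+.5f} e, expected {net_charge:+d} e")
    return total


# TIP3P water charges from this guide, in the order O, H, H.
total = check_partial_charges([-0.834, 0.417, 0.417], n_atoms=3)
print(f"Total charge: {total:+.4f} e")
```

With a molecule loaded as in the verification snippet, the magnitudes could be passed as `list(mol.partial_charges.m)` together with `mol.n_atoms`, assuming a recent OpenFF Toolkit version in which charges are stored as a unit-bearing array.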
Step 4: Test the New Solvent
Create a test configuration using your new solvent:
solvent:
  primary:
    type: "water"
    model: "tip3p"
  co_solvents:
    - name: "formamide"
      volume_fraction: 0.10
Run the build to verify it works:
polyzymd build test_config.yaml --dry-run
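To judge whether the build output looks plausible, it helps to estimate how many co-solvent molecules a given `volume_fraction` implies. The sketch below shows one way the density could enter that calculation, assuming ideal mixing (the co-solvent occupies its pure-liquid volume). It is not necessarily how PolyzyMD computes the count. The function name `cosolvent_count` and the 10 nm cubic box are illustrative assumptions, and 45.04 g/mol is the molar mass of formamide (CH3NO).

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ML_PER_NM3 = 1e-21        # 1 nm^3 = 1e-21 cm^3 = 1e-21 mL


def cosolvent_count(
    volume_fraction: float,
    box_volume_nm3: float,
    density_g_per_ml: float,
    molar_mass_g_per_mol: float,
) -> int:
    """Estimate the number of co-solvent molecules for a volume fraction.

    Assumes ideal mixing: the co-solvent fills volume_fraction of the box
    at its pure-liquid density.
    """
    if not 0.0 <= volume_fraction <= 1.0:
        raise ValueError("volume_fraction must be between 0 and 1")
    cosolvent_volume_ml = volume_fraction * box_volume_nm3 * ML_PER_NM3
    moles = cosolvent_volume_ml * density_g_per_ml / molar_mass_g_per_mol
    return round(moles * AVOGADRO)


# 10% formamide (density 1.133 g/mL) in an illustrative 10 nm cubic box.
n_formamide = cosolvent_count(0.10, 10.0**3, 1.133, 45.04)
print(f"Estimated formamide molecules: {n_formamide}")
```

For these inputs the estimate is on the order of 1,500 molecules; a build that reports a count far from such an estimate is a hint that the density or the SMILES in the library entry is wrong.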
Step 5: Submit a Pull Request
Commit your changes:
src/polyzymd/data/cosolvent_library.py (library entry)
src/polyzymd/data/solvents/{name}.sdf (pre-computed charges)
Include in your PR description:
The solvent name and use case
PubChem CID reference for density value
Any special considerations (e.g., hydrogen bonding, coordination)
SDF File Format
The SDF files in src/polyzymd/data/solvents/ follow a specific format:
molecule_name
RDKit 3D
4 3 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
...
M CHG ...
M END
> <atom.dprop.PartialCharge>
-0.123456 0.234567 ...
> <residue_name>
DMS
$$$$
Key elements:
3D coordinates: Generated conformer
PartialCharge property: AM1BCC charges for each atom
residue_name property: 3-letter residue code for topology
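The two data tags can be inspected without any chemistry toolkit, which is handy when reviewing an SDF in a pull request. The parser below is a minimal sketch: `parse_sd_tags` is a hypothetical helper, it reads only the first record of a file, and the sample text is an illustrative tail of a water SDF that uses the TIP3P charges from this guide rather than a file shipped with PolyzyMD.

```python
import re
from typing import Dict, List, Optional

# A data header line looks like "> <tag_name>" (extra spaces are allowed).
_TAG_HEADER = re.compile(r"^>.*<([^>]+)>")


def parse_sd_tags(sdf_text: str) -> Dict[str, str]:
    """Collect the SD data tags of the first record in an SDF string."""
    tags: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in sdf_text.splitlines():
        if line.startswith("$$$$"):  # end of the first record
            break
        match = _TAG_HEADER.match(line)
        if match:
            current = match.group(1)
            tags[current] = []
        elif current is not None:
            if line.strip():
                tags[current].append(line.strip())
            else:
                current = None  # a blank line closes the data item
    return {name: "\n".join(values) for name, values in tags.items()}


sample = """M  END
>  <atom.dprop.PartialCharge>
-0.834 0.417 0.417

>  <residue_name>
HOH

$$$$
"""

tags = parse_sd_tags(sample)
charges = [float(value) for value in tags["atom.dprop.PartialCharge"].split()]
print(tags["residue_name"], charges)
```

The number of parsed charges should equal the atom count on the counts line of the molecule block, and the residue name should match the `residue_name` given in the library entry.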
Water Models
Water molecules use hardcoded literature charges (not AM1BCC) for accuracy:
| Model | O Charge | H Charge | Source |
|---|---|---|---|
| TIP3P | -0.834 | +0.417 | Jorgensen et al., 1983 |
| SPC/E | -0.8476 | +0.4238 | Berendsen et al., 1987 |
To add a new water model, edit src/polyzymd/data/solvent_molecules.py:
def _create_tip4pew_water() -> Molecule:
    """Create TIP4P-Ew water with literature charges."""
    mol = Molecule.from_smiles("O")  # atom order: O, H, H
    mol.generate_conformers(n_conformers=1)
    # TIP4P-Ew charges (note: virtual site not included, so the
    # M-site charge of -1.04844 e is placed on the oxygen)
    charges = [-1.04844, 0.52422, 0.52422] * unit.elementary_charge
    mol.partial_charges = charges
    for atom in mol.atoms:
        atom.metadata["residue_name"] = "HOH"
    return mol
Code Style
Follow PEP 8 and use ruff for linting
Add type hints for all function signatures
Include docstrings with Args/Returns sections
Keep lines under 100 characters
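As an illustration of these conventions, here is a small function with full type hints and a docstring that has Args and Returns sections. The function itself (`solvent_mass_g`) is a hypothetical example and not part of the PolyzyMD code base.

```python
def solvent_mass_g(volume_ml: float, density_g_per_ml: float) -> float:
    """Compute the mass of a solvent sample from its volume and density.

    Args:
        volume_ml: Sample volume in millilitres. Must not be negative.
        density_g_per_ml: Solvent density in g/mL. Must be positive.

    Returns:
        The sample mass in grams.

    Raises:
        ValueError: If the volume is negative or the density is not positive.
    """
    if volume_ml < 0:
        raise ValueError("volume_ml must not be negative")
    if density_g_per_ml <= 0:
        raise ValueError("density_g_per_ml must be positive")
    return volume_ml * density_g_per_ml


# 10 mL of formamide at the density used in this guide (1.133 g/mL).
print(f"{solvent_mass_g(10.0, 1.133):.2f} g")
```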
Testing
Before submitting:
# Run linting
ruff check src/
# Run type checking (if available)
mypy src/polyzymd/
# Test the build process
polyzymd build examples/enzyme_cosolvent.yaml --dry-run
Questions?
Open an issue on GitHub: https://github.com/joelaforet/polyzymd/issues