Compare Module
Core
Base classes for comparison analysis.
This module provides abstract base classes that consolidate common patterns across all comparator types, following the Template Method design pattern.
Classes
- BaseConditionSummary
Abstract base for condition-level summary statistics.
- BaseComparisonResult
Abstract base for complete comparison results with save/load.
- PairwiseComparison
Shared model for statistical comparison between two conditions.
- ANOVASummary
Shared model for ANOVA results.
- BaseComparator
Abstract base implementing the Template Method pattern for comparisons.
Design Principles
Open-Closed Principle: New comparators extend base classes without modifying them.
Template Method: compare() defines the algorithm skeleton; subclasses fill in specifics.
DRY: Statistical tests, pairwise logic, and serialization are implemented once.
- class polyzymd.compare.core.base.PairwiseComparison(*, condition_a, condition_b, metric='default', t_statistic, p_value, cohens_d, effect_size_interpretation, direction, significant, percent_change)[source]
Bases: BaseModel
Statistical comparison between two conditions.
This is the standard pairwise comparison result used across all comparator types. For comparators that need additional fields (e.g., multiple metrics), subclass this model.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
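As a rough illustration of how fields such as t_statistic, cohens_d, and percent_change relate to per-replicate values, the sketch below computes them with the standard library only. The helper name `pairwise_stats`, the use of Welch's t statistic, and the pooled-SD form of Cohen's d are assumptions made for illustration; the source does not state which formulas PolyzyMD applies.

```python
import math
import statistics


def pairwise_stats(a: list, b: list) -> dict:
    """Hypothetical helper: statistics for condition B relative to condition A."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    n_a, n_b = len(a), len(b)

    # Welch's t statistic (does not assume equal variances).
    t_stat = (mean_b - mean_a) / math.sqrt(var_a / n_a + var_b / n_b)

    # Cohen's d using the pooled standard deviation.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    cohens_d = (mean_b - mean_a) / pooled_sd

    # Relative change of B with respect to A, in percent.
    percent_change = 100.0 * (mean_b - mean_a) / mean_a
    return {"t_statistic": t_stat, "cohens_d": cohens_d, "percent_change": percent_change}


stats = pairwise_stats([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A p-value is omitted here because it needs the t distribution, which the standard library does not provide.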
- class polyzymd.compare.core.base.ANOVASummary(*, metric='default', f_statistic, p_value, significant)[source]
Bases: BaseModel
One-way ANOVA result summary.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
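For readers unfamiliar with the f_statistic field, a minimal sketch of a one-way ANOVA F statistic follows. The function name `one_way_f` is hypothetical and this is not necessarily how PolyzyMD computes the value; a library routine such as scipy.stats.f_oneway would also supply the p-value.

```python
import statistics


def one_way_f(groups: list) -> float:
    """Hypothetical helper: F statistic for a one-way ANOVA over lists of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Between-group sum of squares: group size times squared offset of the group mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    # Ratio of mean squares, with k - 1 and n - k degrees of freedom.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


f_stat = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```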
- class polyzymd.compare.core.base.BaseConditionSummary(*, label, config_path, n_replicates, replicate_values)[source]
Bases: BaseModel, ABC
Abstract base class for condition-level summary statistics.
All condition summaries share these common fields. Subclasses add analysis-specific fields (e.g., mean_rmsf, coverage_mean).
- replicate_values
Per-replicate values of the primary metric (for statistical tests).
- abstract property primary_metric_value: float
Return the primary metric value for ranking/comparison.
This is used by BaseComparator for sorting and statistical tests.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
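The abstract-property contract described above can be sketched with plain dataclasses. The class names below are stand-ins (the real classes are Pydantic models), and `mean_rmsf` is borrowed from the example field mentioned in the class description.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from statistics import mean


@dataclass
class ConditionSummarySketch(ABC):
    """Stand-in for the shared fields; not the real Pydantic model."""
    label: str
    replicate_values: list[float]

    @property
    @abstractmethod
    def primary_metric_value(self) -> float:
        """Value used for ranking and statistical tests."""


@dataclass
class RMSFSummarySketch(ConditionSummarySketch):
    """Hypothetical subclass adding one analysis-specific field."""
    mean_rmsf: float = 0.0

    @property
    def primary_metric_value(self) -> float:
        return self.mean_rmsf


values = [0.8, 1.0, 1.2]
summary = RMSFSummarySketch(label="control", replicate_values=values, mean_rmsf=mean(values))
```

Instantiating the base class directly raises TypeError, which is what forces each subclass to declare its primary metric.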
- class polyzymd.compare.core.base.BaseComparisonResult(*, metric, name, control_label=None, conditions, pairwise_comparisons, anova=None, ranking, equilibration_time, created_at, polyzymd_version='1.2.1')[source]
Bases: BaseModel, ABC, Generic[TConditionSummary, TPairwiseComparison]
Abstract base class for comparison results.
Provides common serialization (save/load) and accessor methods. Subclasses define analysis-specific fields.
- metric
The primary metric being compared (e.g., “rmsf”, “simultaneous_contact_fraction”).
- Type:
- pairwise_comparisons
Statistical comparisons (all vs control, or all pairs).
- Type:
list[TPairwiseComparison]
- anova
ANOVA result if 3+ conditions.
- Type:
ANOVASummary, optional
- created_at
When the analysis was run.
- Type:
datetime
- anova: ANOVASummary | list[ANOVASummary] | None
- created_at: datetime
- save(path)[source]
Save result to JSON file.
- Parameters:
path (Path or str) – Output path.
- Returns:
Path to saved file.
- Return type:
Path
- classmethod load(path)[source]
Load result from JSON file.
- Parameters:
path (Path or str) – Path to JSON file.
- Returns:
Loaded result.
- Return type:
Self
- get_condition(label)[source]
Get a condition by label.
- get_comparison(label)[source]
Get pairwise comparison for a condition vs control.
- Parameters:
label (str) – Treatment condition label.
- Returns:
The comparison, or None if not found.
- Return type:
PairwiseComparison or None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
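The save/load round trip can be sketched as follows. `ResultSketch` is a simplified stand-in built on dataclasses and the json module; the real class is a Pydantic model and its on-disk layout may differ.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ResultSketch:
    """Stand-in for a comparison result with only three fields."""
    metric: str
    name: str
    ranking: list[str]

    def save(self, path: Path | str) -> Path:
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)  # create missing directories
        path.write_text(json.dumps(asdict(self), indent=2))
        return path

    @classmethod
    def load(cls, path: Path | str) -> ResultSketch:
        return cls(**json.loads(Path(path).read_text()))


with tempfile.TemporaryDirectory() as tmp:
    original = ResultSketch(metric="rmsf", name="demo", ranking=["polymer_a", "control"])
    saved_path = original.save(Path(tmp) / "out" / "result.json")
    restored = ResultSketch.load(saved_path)
```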
- class polyzymd.compare.core.base.BaseComparator(config, analysis_settings, equilibration=None)[source]
Bases: ABC, Generic[TAnalysisSettings, TConditionData, TConditionSummary, TResult]
Abstract base class for all comparators using the Template Method pattern.
The compare() method defines the comparison algorithm skeleton:
1. Load/compute analysis for each condition
2. Build condition summaries
3. Compute pairwise statistical comparisons
4. Compute ANOVA (if 3+ conditions)
5. Rank conditions
6. Build and return result
Subclasses implement the abstract methods to customize each step.
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (TAnalysisSettings) – Analysis-specific settings.
equilibration (str, optional) – Equilibration time override.
- Type Parameters:
TAnalysisSettings – Type of analysis settings (e.g., RMSFAnalysisSettings).
TConditionData – Type of raw data loaded for each condition.
TConditionSummary – Type of condition summary (e.g., RMSFConditionSummary).
TResult – Type of comparison result (e.g., RMSFComparisonResult).
- abstractmethod classmethod comparison_type_name()[source]
Return the comparison type identifier (e.g., “rmsf”, “contacts”).
- Returns:
Type identifier used in registry and CLI.
- Return type:
- abstract property metric_type: MetricType
Declare whether this comparator’s metric is mean or variance-based.
This determines how autocorrelation is handled in the underlying analysis:
MEAN_BASED: Use all frames for computation, correct uncertainty using N_eff (effective sample size). Examples: average distance, contact fraction, catalytic triad proximity.
VARIANCE_BASED: Subsample to independent frames separated by 2τ (correlation time) to avoid bias in variance estimates. Examples: RMSF, fluctuation metrics.
Contributors implementing new comparators MUST declare the appropriate metric type to ensure correct statistical treatment per LiveCoMS best practices (Grossfield et al., 2018).
- Returns:
The metric type for this comparator.
- Return type:
MetricType
References
Grossfield et al. (2018) LiveCoMS 1:5067 (Best Practices for Uncertainty)
GitHub: dmzuckerman/Sampling-Uncertainty
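The two treatments can be sketched as below. This is a simplified illustration, not PolyzyMD's implementation: the enum and function names are hypothetical, the correlation time is assumed to be given in frames, and N_eff is approximated as N / (2τ).

```python
import math
import statistics
from enum import Enum


class MetricTypeSketch(Enum):
    """Stand-in enum mirroring the two categories described above."""
    MEAN_BASED = "mean_based"
    VARIANCE_BASED = "variance_based"


def prepare_samples(values, tau_frames, kind):
    """Return (samples_to_use, effective_sample_size) for a correlated series."""
    if kind is MetricTypeSketch.VARIANCE_BASED:
        # Keep only frames spaced 2*tau apart and treat them as independent.
        stride = max(1, math.ceil(2 * tau_frames))
        kept = values[::stride]
        return kept, float(len(kept))
    # Mean-based: keep every frame, but use a reduced N for the uncertainty.
    n_eff = max(1.0, len(values) / max(1.0, 2 * tau_frames))
    return values, n_eff


def corrected_sem(values, n_eff):
    """Standard error of the mean using N_eff instead of the raw frame count."""
    return statistics.stdev(values) / math.sqrt(n_eff)


series = [float(i % 4) for i in range(100)]
all_frames, n_eff = prepare_samples(series, 5.0, MetricTypeSketch.MEAN_BASED)
independent, n_kept = prepare_samples(series, 5.0, MetricTypeSketch.VARIANCE_BASED)
```

With a correlation time of 5 frames, the mean-based path keeps all 100 frames but reports an effective sample size of 10, while the variance-based path keeps every tenth frame.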
- compare(recompute=False)[source]
Run comparison across all conditions (Template Method).
This method defines the algorithm skeleton. Subclasses customize behavior by implementing the abstract hook methods.
- Parameters:
recompute (bool, optional) – If True, force recompute even if cached results exist.
- Returns:
Complete comparison results with statistics and rankings.
- Return type:
TResult
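A stripped-down sketch of the Template Method structure follows. The hook names (`_load`, `_summarize`, `_higher_is_better`) are hypothetical and the statistical steps are omitted; only the fixed skeleton with subclass-supplied hooks is shown.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from statistics import mean


class ComparatorSketch(ABC):
    """Illustrative skeleton; hook names here are hypothetical."""

    def __init__(self, conditions: dict[str, list[float]]):
        self.conditions = conditions

    def compare(self) -> dict:
        # Steps 1-2: load data and summarise each condition.
        summaries = {label: self._summarize(self._load(label)) for label in self.conditions}
        # Step 5: rank conditions by the primary metric (steps 3-4 omitted here).
        ranking = sorted(summaries, key=summaries.get, reverse=self._higher_is_better())
        # Step 6: build the result.
        return {"summaries": summaries, "ranking": ranking}

    def _load(self, label: str) -> list[float]:
        return self.conditions[label]

    @abstractmethod
    def _summarize(self, data: list[float]) -> float: ...

    @abstractmethod
    def _higher_is_better(self) -> bool: ...


class MeanComparator(ComparatorSketch):
    def _summarize(self, data: list[float]) -> float:
        return mean(data)

    def _higher_is_better(self) -> bool:
        return False  # e.g. a lower fluctuation ranks first


result = MeanComparator({"control": [2.0, 2.2], "polymer": [1.0, 1.2]}).compare()
```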
Registry for comparator types.
This module provides extensible infrastructure for registering comparator types following the Open-Closed Principle (OCP). New comparators can be added by registering with the ComparatorRegistry without modifying core code.
Example
Registering a new comparator:
>>> from polyzymd.compare.core.registry import ComparatorRegistry
>>> from polyzymd.compare.core.base import BaseComparator
>>>
>>> @ComparatorRegistry.register("my_metric")
... class MyComparator(BaseComparator):
...     @classmethod
...     def comparison_type_name(cls) -> str:
...         return "my_metric"
...     ...
>>>
>>> # Create comparator instance via registry
>>> comparator = ComparatorRegistry.create("my_metric", config, settings)
- class polyzymd.compare.core.registry.ComparatorRegistry[source]
Bases: object
Registry for comparator implementations.
Allows new comparators to be registered without modifying core code. Use the register decorator to add new comparator classes.
Examples
>>> @ComparatorRegistry.register("rmsf")
... class RMSFComparator(BaseComparator):
...     ...
>>>
>>> # List available comparators
>>> ComparatorRegistry.list_available()
['contacts', 'rmsf', 'triad']
>>>
>>> # Create comparator instance
>>> comparator = ComparatorRegistry.create("rmsf", config, settings)
- classmethod register(name=None)[source]
Decorator to register a comparator class.
- Parameters:
name (str, optional) – Registry key. If None, uses the class’s comparison_type_name().
- Returns:
Decorator function.
- Return type:
Callable
Examples
>>> @ComparatorRegistry.register("rmsf")
... class RMSFComparator(BaseComparator):
...     @classmethod
...     def comparison_type_name(cls) -> str:
...         return "rmsf"
- classmethod get(name)[source]
Get comparator class by name.
- Parameters:
name (str) – Comparator type identifier.
- Returns:
The registered comparator class.
- Return type:
Type[BaseComparator]
- Raises:
ValueError – If the comparator type is not registered.
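How a decorator-based registry with a failing lookup might work is sketched below. `RegistrySketch` is a simplified stand-in, not the real ComparatorRegistry: it requires an explicit name and stores plain classes.

```python
from __future__ import annotations

from typing import Callable


class RegistrySketch:
    """Minimal class-level registry; not the real ComparatorRegistry."""
    _registry: dict[str, type] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[type], type]:
        def decorator(comparator_cls: type) -> type:
            cls._registry[name] = comparator_cls
            return comparator_cls  # return the class unchanged so it stays usable
        return decorator

    @classmethod
    def get(cls, name: str) -> type:
        if name not in cls._registry:
            available = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(f"Unknown comparator '{name}'. Available: {available}")
        return cls._registry[name]

    @classmethod
    def list_available(cls) -> list[str]:
        return sorted(cls._registry)


@RegistrySketch.register("rmsf")
class DemoRMSFComparator:
    pass
```

Because registration happens at class-definition time, adding a new comparator only requires importing the module that defines it.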
- classmethod create(name, config, analysis_settings, equilibration=None, **kwargs)[source]
Factory to create a comparator instance.
- Parameters:
name (str) – Comparator type identifier.
config (ComparisonConfig) – Comparison configuration.
analysis_settings – Analysis-specific settings.
equilibration (str, optional) – Equilibration time override.
**kwargs – Additional comparator-specific arguments.
- Returns:
Configured comparator instance.
- Return type:
Configuration
Configuration schema for comparison projects.
This module defines the YAML schema for comparison.yaml files that specify which simulation conditions to compare.
The schema has two main sections:
- analysis_settings: Defines WHAT analyses to run (shared across conditions)
- comparison_settings: Defines HOW to compare (statistical parameters)
Both sections use a registry-based approach for extensibility. New analysis types can be added by registering with AnalysisSettingsRegistry and ComparisonSettingsRegistry (see polyzymd.compare.settings).
- class polyzymd.compare.config.ConditionConfig(*, label, config, replicates)[source]
Bases: BaseModel
Configuration for one condition in a comparison.
- config
Path to the simulation’s config.yaml file
- Type:
Path
- config: Path
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.AnalysisSettingsContainer(**data)[source]
Bases: BaseModel
Container for analysis settings (WHAT to analyze).
Uses dynamic attribute access to support any registered analysis type without hardcoding field names.
- model_config = {'extra': 'allow'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- __init__(**data)[source]
Initialize with dynamic analysis settings.
- Parameters:
**data (Any) – Analysis settings keyed by analysis type name.
- get(analysis_type)[source]
Get settings for a specific analysis type.
- Parameters:
analysis_type (str) – Analysis type identifier (e.g., “rmsf”, “contacts”).
- Returns:
Settings for the analysis type, or None if not configured.
- Return type:
BaseAnalysisSettings or None
- get_enabled_analyses()[source]
Get list of enabled analysis types.
Notes
Uses actual model data from comparison.yaml rather than relying on a registry. This makes comparison.yaml the source of truth for which analyses are enabled.
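The container behaviour described above (arbitrary sections, attribute access, and the supplied keys as the source of truth) can be approximated without Pydantic. `SettingsContainerSketch` and the `selection` value are illustrative assumptions; the real class relies on Pydantic's `extra='allow'` configuration.

```python
from __future__ import annotations

from typing import Any


class SettingsContainerSketch:
    """Stores arbitrary keyword sections and exposes them as attributes."""

    def __init__(self, **data: Any) -> None:
        self._sections = dict(data)

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails, so _sections itself resolves normally.
        try:
            return self.__dict__["_sections"][name]
        except KeyError:
            raise AttributeError(name) from None

    def get(self, analysis_type: str) -> Any | None:
        return self._sections.get(analysis_type)

    def get_enabled_analyses(self) -> list[str]:
        # The keys supplied by the user decide which analyses are enabled.
        return list(self._sections)


container = SettingsContainerSketch(rmsf={"selection": "protein"}, contacts={})
```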
- class polyzymd.compare.config.ComparisonSettingsContainer(**data)[source]
Bases: BaseModel
Container for comparison settings (HOW to compare).
Uses dynamic attribute access to support any registered comparison type. Each analysis type in analysis_settings must have a corresponding entry here (can be empty dict) to enable comparison.
- model_config = {'extra': 'allow'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- __init__(**data)[source]
Initialize with dynamic comparison settings.
- Parameters:
**data (Any) – Comparison settings keyed by analysis type name.
- class polyzymd.compare.config.RMSFPlotSettings(*, show_error=True, highlight_residues=<factory>, figsize_profile=(14, 4), figsize_comparison=(8, 6))[source]
Bases: BasePlotSettings
RMSF-specific plot customization.
- highlight_residues
Residue numbers to highlight with vertical lines (e.g., active site)
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.TriadPlotSettings(*, generate_kde_panel=True, generate_bars=True, generate_2d_kde=False, threshold_line_color='red', kde_fill_alpha=0.7, kde_xlim=(0.0, 7.0), figsize_kde_panel=None, figsize_bars=(10, 6))[source]
Bases: BasePlotSettings
Triad-specific plot customization.
- figsize_kde_panel
Figure size for KDE panel (auto-calculated if None)
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.DistancesPlotSettings(*, show_threshold=True, use_kde=True, generate_state_bars=True, figsize=(10, 6))[source]
Bases: BasePlotSettings
Distance analysis plot customization.
- generate_state_bars
Generate per-pair state bar charts (above/below threshold). Each pair gets its own figure showing the fraction of frames in each state per condition. Default True.
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.ContactsPlotSettings(*, figsize=(10, 8), generate_enrichment_heatmap=True, generate_enrichment_bars=True, figsize_enrichment_heatmap=None, figsize_enrichment_bars=(10, 6), enrichment_colormap='RdBu_r', show_enrichment_error=True, generate_system_coverage_heatmap=True, generate_system_coverage_bars=True, figsize_system_coverage_heatmap=None, figsize_system_coverage_bars=(10, 6), show_system_coverage_error=True, generate_user_partition_bars=True, figsize_user_partition_bars=(10, 6), show_user_partition_error=True, generate_contact_fraction_profile=True, figsize_contact_fraction_profile=(14, 5), show_contact_fraction_profile_error=True, contact_fraction_profile_threshold=None, generate_residence_time_profile=True, figsize_residence_time_profile=(14, 5), show_residence_time_profile_error=True, generate_cf_by_aa_class_bars=True, figsize_cf_by_aa_class_bars=(10, 6), show_cf_by_aa_class_error=True, generate_cf_by_partition_bars=True, figsize_cf_by_partition_bars=(10, 6), show_cf_by_partition_error=True, generate_rt_by_aa_class_bars=True, figsize_rt_by_aa_class_bars=(10, 6), show_rt_by_aa_class_error=True, generate_rt_by_partition_bars=True, figsize_rt_by_partition_bars=(10, 6), show_rt_by_partition_error=True, highlight_residues=<factory>)[source]
Bases: BasePlotSettings
Contacts analysis plot customization.
- generate_enrichment_heatmap
Generate binding preference enrichment heatmap (default True)
- Type:
- figsize_enrichment_heatmap
Figure size for enrichment heatmap (auto-calculated if None)
- generate_system_coverage_heatmap
Generate system coverage enrichment heatmap (default True)
- Type:
- figsize_system_coverage_heatmap
Figure size for system coverage heatmap (auto-calculated if None)
- figsize_user_partition_bars
Figure size for user-defined partition bar charts
- show_user_partition_error
Show error bars on user-defined partition bar charts (default True)
- Type:
- generate_contact_fraction_profile
Generate per-residue contact fraction line plot (default True)
- Type:
- figsize_contact_fraction_profile
Figure size for contact fraction profile plot
- show_contact_fraction_profile_error
Show SEM fill_between bands on contact fraction profile (default True)
- Type:
- contact_fraction_profile_threshold
If set, draw a horizontal threshold line on the contact fraction profile. Residues above this value are considered “high contact”.
- Type:
float or None
- generate_residence_time_profile
Generate per-residue mean residence time line plot (default True)
- Type:
- figsize_residence_time_profile
Figure size for residence time profile plot
- show_residence_time_profile_error
Show SEM fill_between bands on residence time profile (default True)
- Type:
- generate_cf_by_aa_class_bars
Generate contact fraction by AA class grouped bar chart (default True)
- Type:
- figsize_cf_by_aa_class_bars
Figure size for contact fraction by AA class bar chart
- show_cf_by_aa_class_error
Show error bars on contact fraction by AA class bar chart (default True)
- Type:
- generate_cf_by_partition_bars
Generate contact fraction by user-defined partition bar charts (default True)
- Type:
- figsize_cf_by_partition_bars
Figure size for contact fraction by partition bar charts
- show_cf_by_partition_error
Show error bars on contact fraction by partition bar charts (default True)
- Type:
- generate_rt_by_aa_class_bars
Generate residence time by AA class grouped bar chart (default True)
- Type:
- figsize_rt_by_aa_class_bars
Figure size for residence time by AA class bar chart
- show_rt_by_aa_class_error
Show error bars on residence time by AA class bar chart (default True)
- Type:
- generate_rt_by_partition_bars
Generate residence time by user-defined partition bar charts (default True)
- Type:
- figsize_rt_by_partition_bars
Figure size for residence time by partition bar charts
- show_rt_by_partition_error
Show error bars on residence time by partition bar charts (default True)
- Type:
- highlight_residues
Residue IDs to mark with vertical dashed lines on profile plots. Useful for highlighting active-site residues or known anchor points.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.BFEPlotSettings(*, generate_heatmap=True, generate_bars=True, figsize_heatmap=None, figsize_bars=(10, 6), colormap='RdBu_r', show_error_bars=True, annotate_heatmap=True)[source]
Bases: BasePlotSettings
Binding free energy plot customization.
- generate_heatmap
Generate ΔG_sel heatmap (rows = AA groups, columns = conditions). Default True.
- Type:
- generate_bars
Generate ΔG_sel grouped bar chart (one bar per condition per AA group). Default True.
- Type:
- figsize_heatmap
Figure size for ΔG_sel heatmap (auto-calculated if None).
- colormap
Diverging colormap for heatmap (default “RdBu_r”: red = avoidance, blue = preference).
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.AffinityPlotSettings(*, generate_stacked_bars=True, generate_group_bars=True, figsize_stacked=(10, 6), figsize_group_bars=(10, 6), show_error_bars=True)[source]
Bases: BasePlotSettings
Polymer affinity score plot customization.
- generate_stacked_bars
Generate stacked bar chart of total score by condition, broken down by polymer type. Default True.
- Type:
- generate_group_bars
Generate grouped bar chart showing per-group contributions across conditions. Default True.
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.SSPlotSettings(*, generate_timeline=True, generate_content_bars=True, generate_individual_bars=True, generate_diff_heatmap=True, figsize_timeline=(14, 6), figsize_content_bars=(10, 6), figsize_diff_heatmap=None, diff_colormap='RdBu_r')[source]
Bases: BasePlotSettings
Secondary structure plot customization.
- generate_content_bars
Generate grouped bar chart of helix/strand/coil fractions. Default True.
- Type:
- generate_individual_bars
Generate one bar chart per SS type (helix, beta-sheet, no-SS). Default True.
- Type:
- generate_diff_heatmap
Generate condition x residue persistence difference heatmap. Default True.
- Type:
- figsize_diff_heatmap
Figure size for difference heatmap (auto-calculated if None).
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.config.PlotTheme(*, title_fontsize=13, suptitle_fontsize=14, label_fontsize=11, tick_fontsize=9, legend_fontsize=9, annotation_fontsize=9, small_fontsize=8, tiny_fontsize=7, bar_alpha=0.85, bar_edgecolor='black', bar_linewidth=0.5, bar_capsize=4, dot_size=18, dot_alpha=0.7, dot_color='black', line_alpha=0.8, fill_alpha=0.25, reference_line_color='black', reference_line_style='--', reference_line_width=1.5, highlight_line_alpha=0.5, hide_top_spine=True, hide_right_spine=True, title_fontweight='bold', legend_loc='center left', legend_bbox=(1.02, 0.5), show_watermark=True)[source]
Bases: BaseModel
Centralized visual defaults for all comparison plots.
Replaces ~219 hardcoded style values (font sizes, alphas, line widths, marker sizes, spine visibility, etc.) across all plotter files with a single configurable Pydantic model.
Three presets are available via class methods:
PlotTheme.publication(): default; print-ready sizes and weights.
PlotTheme.presentation(): ~1.3x larger fonts/dots/lines for slides.
PlotTheme.minimal(): no dots, no bar edges, thinner lines.
Users can override individual values in comparison.yaml:
plot_settings:
  style: "publication"
  theme:
    title_fontsize: 16
    dot_size: 24
- Parameters:
title_fontsize (int) – Font size for axes titles.
suptitle_fontsize (int) – Font size for figure suptitles.
label_fontsize (int) – Font size for axis labels (xlabel/ylabel).
tick_fontsize (int) – Font size for tick labels.
legend_fontsize (int) – Font size for legend entries.
annotation_fontsize (int) – Font size for heatmap cell annotations and inline text.
small_fontsize (int) – Font size for secondary annotations (e.g. SEM ± labels).
tiny_fontsize (int) – Font size for fine-grained annotations (e.g. residue IDs).
bar_alpha (float) – Opacity for bar chart fill.
bar_edgecolor (str) – Edge colour for bar outlines.
bar_linewidth (float) – Edge line width for bars.
bar_capsize (int) – Error bar cap size in points.
dot_size (int) – Marker size for replicate dot overlays (s= in scatter).
dot_alpha (float) – Opacity for replicate dots.
dot_color (str) – Colour for replicate dots.
line_alpha (float) – Opacity for line plots (e.g. RMSF profiles).
fill_alpha (float) – Opacity for fill_between bands (e.g. SEM regions).
reference_line_color (str) – Colour for horizontal/vertical reference lines.
reference_line_style (str) – Linestyle for reference lines (e.g. "--").
reference_line_width (float) – Line width for reference lines.
highlight_line_alpha (float) – Opacity for highlight / vertical reference lines.
hide_top_spine (bool) – Whether to hide the top axis spine.
hide_right_spine (bool) – Whether to hide the right axis spine.
title_fontweight (str) – Font weight for titles (e.g. "bold", "normal").
legend_loc (str) – Matplotlib legend location string (e.g. "center left"). Used with legend_bbox to place the legend outside the axes.
legend_bbox (tuple of float) – bbox_to_anchor for legend placement, relative to axes. Default (1.02, 0.5) places it just outside the right edge, vertically centred.
show_watermark (bool) – Whether to render a subtle “Made by PolyzyMD” watermark in the bottom-right corner of every saved figure. Default True.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
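The preset-plus-override behaviour can be sketched with a frozen dataclass. `ThemeSketch` carries only three of the documented fields, and the presentation values below are made-up placeholders (the source only says the preset is roughly 1.3x larger); the publication defaults are taken from the signature above.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ThemeSketch:
    """Tiny stand-in with three of the documented fields."""
    title_fontsize: int = 13
    dot_size: int = 18
    bar_linewidth: float = 0.5


# Hypothetical preset table; the presentation numbers are illustrative only.
PRESETS = {
    "publication": ThemeSketch(),
    "presentation": ThemeSketch(title_fontsize=17, dot_size=24, bar_linewidth=0.7),
}


def resolve_theme(style: str, overrides: dict | None = None) -> ThemeSketch:
    """Start from the named preset, then apply user overrides on top."""
    return replace(PRESETS[style], **(overrides or {}))


theme = resolve_theme("presentation", {"dot_size": 40})
```

Only the overridden field changes; every other value still comes from the chosen preset.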
- class polyzymd.compare.config.PlotSettings(*, output_dir=PosixPath('figures'), format='png', dpi=300, style='publication', color_palette='tab10', theme=<factory>, **data)[source]
Bases: BaseModel
Global plot settings for comparison.yaml.
Controls plot generation for all analyses. Per-analysis plot settings are discovered via PlotSettingsRegistry: any key in the YAML that matches a registered analysis type is parsed into the corresponding settings class. Unrecognised keys that are not global fields are logged and skipped.
- output_dir
Directory for generated plots (relative to comparison.yaml)
- Type:
Path
- theme
Resolved visual theme. Built from the style preset and any user overrides in the theme: YAML block.
- Type:
Notes
Attribute access for any registered analysis type always succeeds: if the user did not provide that section in YAML, a default-constructed settings instance is returned. This means self.settings.rmsf.show_error is always safe, even when the YAML has no rmsf: block.
Examples
In comparison.yaml:
plot_settings:
  output_dir: "figures/"
  format: "png"
  dpi: 300
  style: "publication"
  rmsf:
    highlight_residues: [77, 133, 156]
  triad:
    generate_2d_kde: true
- model_config = {'extra': 'allow'}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- output_dir: Path
- __init__(**data)[source]
Initialize with global fields and registry-discovered per-analysis settings.
Theme resolution: the style field selects a preset (publication, presentation, or minimal) and then any user-supplied theme: overrides are merged on top. This allows style: presentation with theme: {dot_size: 40} to use the presentation preset but override just the dot size.
- Parameters:
**data (Any) – Plot settings from YAML. Keys matching registered analysis types are parsed into their settings classes; global keys are handled by Pydantic; unknown keys are logged and skipped.
- __getattr__(name)[source]
Fall back to default-constructed settings for registered types.
This ensures self.settings.rmsf.show_error works even when the user omitted the rmsf: block from their YAML.
- Parameters:
name (str) – Attribute name.
- Returns:
Default-constructed settings if name is a registered type.
- Return type:
BasePlotSettings
- Raises:
AttributeError – If name is not a registered plot settings type.
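The fallback can be illustrated with a plain class. The names `RMSFPlotSketch`, `SETTINGS_CLASSES`, and `PlotSettingsSketch` are hypothetical stand-ins for the registered settings classes and the registry lookup; the real implementation sits on top of Pydantic.

```python
from dataclasses import dataclass


@dataclass
class RMSFPlotSketch:
    """Stand-in for a per-analysis settings class."""
    show_error: bool = True


# Hypothetical registry mapping analysis names to settings classes.
SETTINGS_CLASSES = {"rmsf": RMSFPlotSketch}


class PlotSettingsSketch:
    def __init__(self, **data):
        for key, value in data.items():
            if key in SETTINGS_CLASSES and isinstance(value, dict):
                value = SETTINGS_CLASSES[key](**value)  # parse the YAML section
            setattr(self, key, value)

    def __getattr__(self, name):
        # Reached only when the attribute was not set in __init__.
        if name in SETTINGS_CLASSES:
            return SETTINGS_CLASSES[name]()
        raise AttributeError(f"{type(self).__name__} has no attribute {name!r}")


settings = PlotSettingsSketch(dpi=300)
```

Because Python calls `__getattr__` only after normal lookup fails, sections the user did provide are returned as parsed, and only missing ones fall back to defaults.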
- class polyzymd.compare.config.ComparisonConfig(*, name, description=None, control=None, conditions, defaults=<factory>, analysis_settings=<factory>, comparison_settings=<factory>, plot_settings=<factory>, source_path=None)[source]
Bases: BaseModel
Schema for comparison.yaml configuration files.
A comparison config defines multiple simulation conditions to compare, along with analysis settings and comparison-specific parameters.
The schema follows a three-section pattern:
- analysis_settings: WHAT to analyze (shared across conditions)
- comparison_settings: HOW to compare (statistical parameters)
- plot_settings: HOW to visualize (plot customization)
- conditions
List of conditions to compare
- Type:
- defaults
Default analysis parameters (equilibration_time)
- Type:
AnalysisDefaults
- analysis_settings
Analysis parameters (WHAT to analyze)
- comparison_settings
Comparison parameters (HOW to compare)
- plot_settings
Plot customization (HOW to visualize)
- Type:
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> print(config.name)
"Polymer Stabilization Study"
>>> for cond in config.conditions:
...     print(f"{cond.label}: {cond.config}")
>>> print("Enabled analyses:", config.analysis_settings.get_enabled_analyses())
>>> rmsf_settings = config.analysis_settings.get("rmsf")
>>> if rmsf_settings:
...     print(f"RMSF selection: {rmsf_settings.selection}")
- conditions: list[ConditionConfig]
- defaults: AnalysisDefaults
- analysis_settings: AnalysisSettingsContainer
- comparison_settings: ComparisonSettingsContainer
- plot_settings: PlotSettings
- validate_comparison_coverage()[source]
Validate that comparison_settings covers all analysis_settings.
Each analysis type in analysis_settings must have a corresponding entry in comparison_settings (can be empty {}).
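The coverage rule can be sketched as a set difference over the two sections' keys. This is a hedged illustration using plain dictionaries; missing_comparison_entries is a hypothetical helper, and the real validator works on the settings container objects.

```python
def missing_comparison_entries(analysis_settings, comparison_settings):
    """Return analysis types that lack a comparison_settings entry, sorted."""
    return sorted(set(analysis_settings) - set(comparison_settings))


analysis = {"rmsf": {"selection": "protein and name CA"}, "contacts": {"cutoff": 4.5}}
comparison = {"rmsf": {}}  # an empty mapping still counts as coverage

missing = missing_comparison_entries(analysis, comparison)
```

A validator built on this check would raise when the returned list is non-empty.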
- classmethod from_yaml(path)[source]
Load comparison config from YAML file.
- Parameters:
path (Path or str) – Path to comparison.yaml file
- Returns:
Loaded and validated configuration
- Return type:
- Raises:
FileNotFoundError – If the config file doesn’t exist
ValidationError – If the config is invalid
- to_yaml(path)[source]
Save comparison config to YAML file.
- Parameters:
path (Path or str) – Output path for comparison.yaml
- get_condition(label)[source]
Get a condition by its label.
- generate_analysis_yaml(condition)[source]
Generate analysis.yaml content for a specific condition.
- Parameters:
condition (ConditionConfig) – The condition to generate analysis.yaml for.
- Returns:
YAML content for the analysis.yaml file.
- Return type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Settings
Analysis and comparison settings for the comparison workflow.
This module defines the concrete settings classes for each analysis type, registered via the AnalysisSettingsRegistry and ComparisonSettingsRegistry.
Analysis Settings (WHAT to analyze):
- RMSFAnalysisSettings: RMSF calculation parameters
- DistancesAnalysisSettings: Distance pair monitoring parameters
- CatalyticTriadAnalysisSettings: Active site distance analysis
- ContactsAnalysisSettings: Polymer-protein contact parameters
Comparison Settings (HOW to compare):
- RMSFComparisonSettings: (no comparison-specific params)
- DistancesComparisonSettings: (no comparison-specific params)
- CatalyticTriadComparisonSettings: (no comparison-specific params)
- ContactsComparisonSettings: FDR, effect size, top residues
All settings classes are auto-registered on module import.
- class polyzymd.compare.settings.RMSFAnalysisSettings(*, selection='protein and name CA', reference_mode='centroid', reference_frame=None, reference_file=None)[source]
Bases: BaseAnalysisSettings
RMSF analysis settings.
- classmethod validate_reference_mode(v)[source]
Validate reference mode is one of the allowed values.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.RMSFComparisonSettings[source]
Bases: BaseComparisonSettings
Comparison settings for RMSF analysis.
Currently empty: all RMSF comparison behavior uses defaults from BaseComparisonSettings. This class exists as an extension point: add fields here when RMSF-specific comparison parameters are needed (e.g., a per-residue significance threshold) without modifying the orchestrator or other comparison types.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.DistancePairSettings(*, label, selection_a, selection_b, threshold=None, below_label=None, above_label=None)[source]
Bases: BaseAnalysisSettings
Configuration for a single distance pair.
- threshold
Per-pair distance threshold (Angstroms). If None, uses the global threshold from DistancesAnalysisSettings.
- Type:
float, optional
- below_label
Display label for the “below threshold” state (e.g. "Bound", "Closed"). When None, defaults to "Below {threshold}Å".
- Type:
str, optional
- above_label
Display label for the “above threshold” state (e.g. "Unbound", "Open"). When None, defaults to "Above {threshold}Å".
- Type:
str, optional
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.DistancesAnalysisSettings(*, threshold=3.5, pairs=<factory>, use_pbc=True, align_trajectory=True, alignment_selection='protein and name CA', alignment_mode='centroid', alignment_frame=None)[source]
Bases: BaseAnalysisSettings
Distance analysis settings.
- pairs
List of atom pairs to measure distances between.
- Type:
- align_trajectory
Align trajectory before distance calculation. Default True. When enabled, removes rotational drift and COM motion that can add noise to inter-domain distance measurements.
- Type:
- alignment_selection
MDAnalysis selection for trajectory alignment. Default: “protein and name CA”.
- Type:
- alignment_mode
Reference mode for alignment: “centroid”, “average”, or “frame”. Default: “centroid”.
- Type:
- pairs: list[DistancePairSettings]
- classmethod validate_alignment_mode(v)[source]
Validate alignment mode is one of the allowed values.
- validate_alignment_frame_required()[source]
Ensure alignment_frame is provided when alignment_mode is ‘frame’.
- get_alignment_config()[source]
Build an AlignmentConfig from these settings.
- Returns:
Configuration for trajectory alignment, ready to pass to align_trajectory() or DistanceCalculator.
- Return type:
AlignmentConfig
Notes
Import is done inside the method to avoid circular imports.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.DistancesComparisonSettings[source]
Bases: BaseComparisonSettings
Comparison settings for distance analysis.
Currently empty: all distance comparison behavior uses defaults from BaseComparisonSettings. This class exists as an extension point: add fields here when distance-specific comparison parameters are needed (e.g., per-pair significance thresholds) without modifying the orchestrator or other comparison types.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.TriadPairSettings(*, label, selection_a, selection_b)[source]
Bases: BaseAnalysisSettings
Configuration for one distance pair in a catalytic triad/active site.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.CatalyticTriadAnalysisSettings(*, name, pairs, threshold=3.5, description=None)[source]
Bases: BaseAnalysisSettings
Catalytic triad/active site analysis settings.
- pairs
Distance pairs to monitor.
- Type:
- pairs: list[TriadPairSettings]
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.CatalyticTriadComparisonSettings[source]
Bases: BaseComparisonSettings
Comparison settings for catalytic triad analysis.
Currently empty: all triad comparison behavior uses defaults from BaseComparisonSettings. This class exists as an extension point: add fields here when triad-specific comparison parameters are needed (e.g., functional distance thresholds) without modifying the orchestrator or other comparison types.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.BindingPreferenceFieldsMixin(*, surface_exposure_threshold=0.2, enzyme_pdb_for_sasa=None, include_default_aa_groups=True, protein_groups=None, protein_partitions=None, polymer_type_selections=None)[source]
Bases: BaseAnalysisSettings
Shared fields for experimental binding-preference-derived analyses.
Both ContactsAnalysisSettings and BindingFreeEnergyAnalysisSettings need identical fields for surface exposure, protein grouping, and polymer type selection. This mixin provides them once, keeping defaults in sync.
- protein_groups
Custom protein groups as {name: [resid1, resid2, …]}.
- protein_partitions
Custom partitions for system coverage comparison.
- polymer_type_selections
Custom polymer type selections as {name: “MDAnalysis selection”}.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.ContactsAnalysisSettings(*, surface_exposure_threshold=0.2, enzyme_pdb_for_sasa=None, include_default_aa_groups=True, protein_groups=None, protein_partitions=None, polymer_type_selections=None, polymer_selection='chainID C', protein_selection='protein', cutoff=4.5, polymer_types=None, grouping='aa_class', compute_residence_times=True, compute_binding_preference=False, enrichment_normalization='residue')[source]
Bases: BindingPreferenceFieldsMixin
Polymer-protein contact analysis settings.
Inherits binding preference fields (surface_exposure_threshold, enzyme_pdb_for_sasa, include_default_aa_groups, protein_groups, protein_partitions, polymer_type_selections) from BindingPreferenceFieldsMixin.
- enrichment_normalization
DEPRECATED (kept for backward compatibility). Enrichment is now always normalized by protein surface availability. This field is ignored.
- Type:
- validate_protein_partitions()[source]
Validate protein_partitions references and mutual exclusivity.
Validates:
1. All groups referenced in partitions exist in protein_groups
2. Groups within each partition don’t overlap (mutually exclusive)
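The two checks can be sketched over plain dictionaries. This is an illustration only: check_partitions is a hypothetical helper, and the shape of protein_partitions is assumed to be {partition_name: [group_name, ...]}, which the reference above does not spell out.

```python
def check_partitions(protein_groups, protein_partitions):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for partition, group_names in protein_partitions.items():
        seen = {}  # resid -> name of the group that first claimed it
        for group in group_names:
            if group not in protein_groups:
                problems.append(f"{partition}: unknown group {group!r}")
                continue
            for resid in protein_groups[group]:
                if resid in seen:
                    problems.append(
                        f"{partition}: residue {resid} is in both "
                        f"{seen[resid]!r} and {group!r}"
                    )
                else:
                    seen[resid] = group
    return problems


# Hypothetical groups: "lid" and "core" share residue 12, and "tail" is undefined.
groups = {"lid": [10, 11, 12], "core": [12, 50], "rim": [70]}
ok = check_partitions(groups, {"regions": ["lid", "rim"]})
bad = check_partitions(groups, {"regions": ["lid", "core", "tail"]})
```

Collecting every problem before failing lets a validator report all configuration mistakes in one pass.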
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.ContactsComparisonSettings(*, fdr_alpha=0.05, min_effect_size=0.5, top_residues=10)[source]
Bases: BaseComparisonSettings
Comparison settings for polymer-protein contacts analysis.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.ExposureAnalysisSettings(*, protein_selection='protein', polymer_selection='chainID C', exposure_threshold=0.2, transient_lower=0.2, transient_upper=0.8, min_event_length=1, probe_radius_nm=0.14, n_sphere_points=960, protein_chain='A', polymer_resnames=None)[source]
Bases: BaseAnalysisSettings
Experimental exposure dynamics settings (dynamic SASA-based chaperone analysis).
- polymer_resnames
Subset of polymer monomer resnames to include. If None, all detected.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.ExposureComparisonSettings[source]
Bases: BaseComparisonSettings
Comparison settings for exposure dynamics analysis.
Currently empty: all exposure comparison behavior uses defaults from BaseComparisonSettings. This class exists as an extension point: add fields here when exposure-specific comparison parameters are needed (e.g., transient classification thresholds) without modifying the orchestrator or other comparison types.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.BindingFreeEnergyAnalysisSettings(*, surface_exposure_threshold=0.2, enzyme_pdb_for_sasa=None, include_default_aa_groups=True, protein_groups=None, protein_partitions=None, polymer_type_selections=None, units='kT', compute_binding_preference=True)[source]
Bases: BindingPreferenceFieldsMixin
Experimental settings for binding free energy analysis via Boltzmann inversion.
Computes the selectivity free energy:
ΔG_sel = -k_B·T · ln(contact_share / expected_share)
where:
- contact_share = fraction of polymer contacts directed at an AA group
- expected_share = fraction of exposed surface belonging to that AA group
- T = simulation temperature (from SimulationConfig)
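The formula above can be evaluated directly. This is a minimal sketch in units of kT (dividing out k_B·T) with made-up shares; selectivity_free_energy_kt is a hypothetical helper, not a function of this package.

```python
import math


def selectivity_free_energy_kt(contact_share, expected_share):
    """ΔG_sel in units of kT: -ln(contact_share / expected_share)."""
    if contact_share <= 0 or expected_share <= 0:
        raise ValueError("shares must be positive for the logarithm to exist")
    return -math.log(contact_share / expected_share)


# Hypothetical group covering 20% of the exposed surface.
enriched = selectivity_free_energy_kt(0.30, 0.20)  # negative: contacts are favored
depleted = selectivity_free_energy_kt(0.10, 0.20)  # positive: contacts are avoided
neutral = selectivity_free_energy_kt(0.20, 0.20)   # zero: no preference
```

A group that receives more contacts than its surface share predicts gets a negative ΔG_sel, and one that receives fewer gets a positive value.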
This is a post-processing analysis that consumes binding preference results from the contacts analysis layer (no new per-frame computation is needed).
Inherits binding preference fields (surface_exposure_threshold, enzyme_pdb_for_sasa, include_default_aa_groups, protein_groups, protein_partitions, polymer_type_selections) from BindingPreferenceFieldsMixin.
- units
Energy units for output. One of “kT” (dimensionless, in units of the thermal energy k_B·T), “kcal/mol”, or “kJ/mol”.
- Type:
- compute_binding_preference
Compute binding preference from contacts data when cached results are not found.
- Type:
- k_b()[source]
Return k_B in the selected energy units.
- Returns:
Boltzmann constant in kcal/(mol·K) or kJ/(mol·K). When units=’kT’, returns 0.0 — callers should use kT=1.0 directly instead of k_b() * T.
- Return type:
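The unit handling can be sketched as a lookup plus a special case for kT. This is an assumption-laden illustration: K_B_BY_UNITS, thermal_energy, and convert_from_kt are hypothetical names, and the constants are the molar gas constant R (Boltzmann's constant per mole) rounded to ten decimal places.

```python
# Molar gas constant R = N_A * k_B, the per-mole form of Boltzmann's constant.
K_B_BY_UNITS = {
    "kcal/mol": 0.0019872043,  # kcal/(mol·K)
    "kJ/mol": 0.0083144626,    # kJ/(mol·K)
}


def thermal_energy(units, temperature_k):
    """Return kT in the requested units; for 'kT' the scale is 1 by definition."""
    if units == "kT":
        return 1.0  # follows the documented advice not to compute k_b() * T here
    return K_B_BY_UNITS[units] * temperature_k


def convert_from_kt(delta_g_kt, units, temperature_k):
    """Convert a free energy expressed in kT into the requested units."""
    return delta_g_kt * thermal_energy(units, temperature_k)


# Hypothetical ΔG_sel of -0.405 kT converted at 300 K.
dg = convert_from_kt(-0.405, "kcal/mol", 300.0)
```

Special-casing kT avoids multiplying by the 0.0 that k_b() returns for that unit.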
- to_analysis_yaml_dict()[source]
Convert to analysis.yaml-compatible dictionary.
- Returns:
Dictionary suitable for writing to analysis.yaml.
- Return type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.BindingFreeEnergyComparisonSettings(*, fdr_alpha=0.05)[source]
Bases: BaseComparisonSettings
Comparison settings for binding free energy analysis.
- fdr_alpha
False discovery rate alpha for Benjamini-Hochberg correction of p-values across (polymer_type, AA_group) pairs.
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.PolymerAffinityScoreSettings(*, surface_exposure_threshold=0.2, enzyme_pdb_for_sasa=None, include_default_aa_groups=True, protein_groups=None, protein_partitions=None, polymer_type_selections=None, compute_binding_preference=True)[source]
Bases: BindingPreferenceFieldsMixin
Experimental settings for polymer affinity score analysis.
The polymer affinity score is a comparative metric that quantifies total polymer-protein interaction strength:
S = Σ_{p,g} N_{p,g} × ΔG_sel_{p,g} [kT]
where:
  N = mean_contact_fraction × n_exposed_in_group
  ΔG_sel = -ln(contact_share / expected_share)
This is a post-processing analysis that consumes binding preference results from the contacts analysis layer — no new per-frame computation is needed. All scores are in kT (dimensionless); the temperature factor cancels in the Boltzmann inversion ratio.
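The score is a weighted sum and can be sketched in a few lines. The numbers and the dictionary layout below are hypothetical, and affinity_score is an illustrative helper rather than the package's function.

```python
import math


def affinity_score(entries):
    """Sum N × ΔG_sel over (polymer_type, aa_group) entries, in kT."""
    total = 0.0
    for e in entries:
        n = e["mean_contact_fraction"] * e["n_exposed_in_group"]
        dg_sel = -math.log(e["contact_share"] / e["expected_share"])
        total += n * dg_sel
    return total


# Hypothetical numbers for one polymer type and two amino-acid groups.
entries = [
    {"mean_contact_fraction": 0.5, "n_exposed_in_group": 10,
     "contact_share": 0.6, "expected_share": 0.4},
    {"mean_contact_fraction": 0.2, "n_exposed_in_group": 15,
     "contact_share": 0.4, "expected_share": 0.6},
]
score = affinity_score(entries)
```

The enriched group contributes a negative term and the depleted group a positive one, so the sign of the total depends on how the contacts are weighted across groups.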
Important
This metric assumes thermodynamic independence of contacts. The absolute values are NOT rigorous binding free energies. Only relative differences between polymer compositions are meaningful (comparative ranking).
Inherits binding preference fields (surface_exposure_threshold, enzyme_pdb_for_sasa, include_default_aa_groups, protein_groups, protein_partitions, polymer_type_selections) from BindingPreferenceFieldsMixin.
- compute_binding_preference
Compute binding preference from contacts data when cached results are not found.
- Type:
- to_analysis_yaml_dict()[source]
Convert to analysis.yaml-compatible dictionary.
- Returns:
Dictionary suitable for writing to analysis.yaml.
- Return type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.PolymerAffinityScoreComparisonSettings(*, fdr_alpha=0.05)[source]
Bases: BaseComparisonSettings
Comparison settings for polymer affinity score analysis.
- fdr_alpha
False discovery rate alpha for Benjamini-Hochberg correction of pairwise p-values across conditions.
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.SecondaryStructureAnalysisSettings(*, chain_id='A')[source]
Bases: BaseAnalysisSettings
Secondary structure (DSSP) analysis settings.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.settings.SecondaryStructureComparisonSettings[source]
Bases: BaseComparisonSettings
Comparison settings for secondary structure analysis.
Currently empty: all secondary structure comparison behavior uses defaults from BaseComparisonSettings. This class exists as an extension point: add fields here when SS-specific comparison parameters are needed without modifying the orchestrator.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Statistics
Statistical tests for comparing simulation conditions.
This module provides statistical functions for comparing analysis results across multiple conditions, including t-tests, ANOVA, and effect sizes.
All functions use SciPy for statistical calculations.
- class polyzymd.compare.statistics.TTestResult(t_statistic, p_value)[source]
Bases: object
Result of a two-sample t-test.
- __init__(t_statistic, p_value)
- class polyzymd.compare.statistics.EffectSize(cohens_d, interpretation, direction)[source]
Bases: object
Cohen’s d effect size with interpretation.
- __init__(cohens_d, interpretation, direction)
- class polyzymd.compare.statistics.ANOVAResult(f_statistic, p_value)[source]
Bases: object
Result of one-way ANOVA.
- __init__(f_statistic, p_value)
- polyzymd.compare.statistics.independent_ttest(group1, group2)[source]
Perform two-sample independent t-test.
Tests the null hypothesis that two independent samples have identical expected values.
- Parameters:
group1 (array_like) – First group of values (e.g., control replicate means)
group2 (array_like) – Second group of values (e.g., treatment replicate means)
- Returns:
Result containing t-statistic and p-value
- Return type:
Examples
>>> control = [0.715, 0.693, 0.696]  # No polymer RMSF
>>> treatment = [0.517, 0.586]  # 100% SBMA RMSF
>>> result = independent_ttest(control, treatment)
>>> print(f"t = {result.t_statistic:.3f}, p = {result.p_value:.4f}")
- polyzymd.compare.statistics.cohens_d(group1, group2, rmsf_mode=True)[source]
Compute Cohen’s d effect size.
Cohen’s d is the difference between means divided by the pooled standard deviation. A positive d means group1 has higher values.
For RMSF comparisons (rmsf_mode=True), direction is interpreted as:
- d > 0 (control > treatment) = “stabilizing” (treatment reduces RMSF)
- d < 0 (control < treatment) = “destabilizing” (treatment increases RMSF)
- Parameters:
group1 (array_like) – First group (typically control)
group2 (array_like) – Second group (typically treatment)
rmsf_mode (bool, optional) – If True, interpret direction for RMSF (lower = better). Default is True.
- Returns:
Effect size with interpretation
- Return type:
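The pooled-standard-deviation definition can be sketched with the standard library alone. This is an illustration, not the package's cohens_d: the helper names are hypothetical, sample variances (n - 1 denominator) are assumed, and the "unchanged" label for d == 0 is invented because the text above defines only the two signed cases.

```python
from statistics import mean, variance


def cohens_d_sketch(group1, group2):
    """Cohen's d with a pooled standard deviation (sample variances, n - 1)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5


def rmsf_direction(d):
    """Direction labels following the RMSF convention stated above."""
    if d > 0:
        return "stabilizing"
    if d < 0:
        return "destabilizing"
    return "unchanged"  # hypothetical label; the source does not define d == 0


control = [0.715, 0.693, 0.696]  # replicate means from the t-test example
treatment = [0.517, 0.586]
d = cohens_d_sketch(control, treatment)
```

With these values the control mean is higher than the treatment mean, so d is positive and the direction is "stabilizing".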
- polyzymd.compare.statistics.one_way_anova(*groups)[source]
Perform one-way ANOVA across multiple groups.
Tests the null hypothesis that all groups have the same mean.
- Parameters:
*groups (array_like) – Variable number of groups to compare
- Returns:
Result containing F-statistic and p-value
- Return type:
Examples
>>> no_poly = [0.715, 0.693, 0.696]
>>> sbma = [0.517, 0.586]
>>> egma = [0.558, 0.738, 0.496]
>>> result = one_way_anova(no_poly, sbma, egma)
>>> print(f"F = {result.f_statistic:.3f}, p = {result.p_value:.4f}")
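To show what the F-statistic measures, the sketch below computes it by hand as the ratio of between-group to within-group mean squares. It is a standard-library illustration with a hypothetical f_statistic helper; it omits the p-value, which in practice comes from a routine such as SciPy's scipy.stats.f_oneway.

```python
from statistics import mean


def f_statistic(*groups):
    """One-way ANOVA F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


no_poly = [0.715, 0.693, 0.696]
sbma = [0.517, 0.586]
egma = [0.558, 0.738, 0.496]
f_value = f_statistic(no_poly, sbma, egma)
```

A large F means the group means differ by more than the scatter within each group would explain.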
Comparators
Contacts
Contacts comparator for comparing polymer-protein contacts across conditions.
This module provides the ContactsComparator class that orchestrates contacts analysis and statistical comparison across multiple conditions.
Key features:
- Aggregate-level comparisons (coverage, mean contact fraction)
- Effect size (Cohen’s d) for practical significance
- ANOVA for 3+ conditions
- Auto-exclusion of conditions without polymer (e.g., “No Polymer” controls)
The comparator inherits from BaseComparator and implements the Template Method pattern for DRY comparison logic. Since contacts has TWO primary metrics (coverage and mean_contact_fraction), some methods are customized.
Note
Per-residue pairwise comparisons have been removed. Contact data is mechanistic (explains WHY stability changes), not an observable. Per-residue contact-RMSF correlations are computed in polyzymd compare report.
- class polyzymd.compare.comparators.contacts.ContactsComparator(config, analysis_settings, comparison_settings=None, equilibration=None)[source]
Bases: BaseComparator[ContactsAnalysisSettings, dict[str, Any], ContactsConditionSummary, ContactsComparisonResult]
Compare polymer-protein contacts across multiple simulation conditions.
This class loads contacts analysis results for each condition (computing them if necessary), then performs statistical comparisons including:
- Aggregate-level comparisons (coverage, mean contact fraction)
- ANOVA for 3+ conditions
- Effect sizes (Cohen’s d) for practical significance
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (ContactsAnalysisSettings) – Settings defining what contacts to analyze (selections, cutoff).
comparison_settings (ContactsComparisonSettings, optional) – Settings for how to compare (FDR alpha, effect sizes). Defaults to ContactsComparisonSettings() if not provided.
equilibration (str, optional) – Equilibration time override (e.g., “10ns”). If None, uses config.defaults.equilibration_time.
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> analysis_settings = config.analysis_settings.get("contacts")
>>> comparison_settings = config.comparison_settings.get("contacts")
>>> comparator = ContactsComparator(config, analysis_settings, comparison_settings)
>>> result = comparator.compare()
>>> print(result.ranking_by_coverage)
["100% SBMA", "50/50 Mix", "100% EGMA"]
Notes
Higher contact fraction is considered “better” (more polymer-protein interaction)
Conditions without polymer atoms are automatically excluded
This is a MEAN_BASED metric (contact fractions are averages)
- property metric_type: MetricType
Contact fraction is a mean-based metric.
Contact fraction is the average fraction of frames where a residue is in contact with the polymer. This is an average over frames, so the mean converges regardless of autocorrelation. However, we need to correct uncertainty using N_eff (effective sample size).
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
- compare(recompute=False)[source]
Run comparison across all conditions.
Overrides base to handle contacts-specific logic:
- Dual metrics (coverage and mean_contact_fraction)
- Auto-exclusion of no-polymer conditions
- Custom result building
- Parameters:
recompute (bool, optional) – If True, force recompute even if cached results exist.
- Returns:
Complete comparison results with statistics and rankings.
- Return type:
ContactsComparisonResult
RMSF
RMSF comparator for comparing flexibility across conditions.
This module provides the RMSFComparator class that orchestrates RMSF analysis and statistical comparison across multiple conditions.
The comparator inherits from BaseComparator and implements the Template Method pattern for DRY comparison logic.
- class polyzymd.compare.comparators.rmsf.RMSFComparator(config, analysis_settings, equilibration=None, selection_override=None, reference_mode_override=None, reference_frame_override=None, reference_file_override=None)[source]
Bases: BaseComparator[RMSFAnalysisSettings, dict[str, Any], RMSFConditionSummary, RMSFComparisonResult]
Compare RMSF across multiple simulation conditions.
This class loads RMSF results for each condition (computing them if necessary), then performs statistical comparisons including t-tests, ANOVA, and effect size calculations.
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (RMSFAnalysisSettings) – RMSF analysis settings (from config.analysis_settings.get("rmsf")).
equilibration (str, optional) – Equilibration time override (e.g., “10ns”). If None, uses config.defaults.equilibration_time.
selection_override (str, optional) – Override for atom selection (requires --override flag on CLI).
reference_mode_override (str, optional) – Override for reference mode (requires --override flag on CLI).
reference_frame_override (int, optional) – Override for reference frame (requires --override flag on CLI).
reference_file_override (str, optional) – Override for external reference PDB file path (requires --override flag on CLI). Used when reference_mode is “external”.
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> rmsf_settings = config.analysis_settings.get("rmsf")
>>> comparator = RMSFComparator(config, rmsf_settings, equilibration="10ns")
>>> result = comparator.compare()
>>> print(result.ranking)
["100% SBMA", "100% EGMA", "No Polymer", "50/50 Mix"]
- __init__(config, analysis_settings, equilibration=None, selection_override=None, reference_mode_override=None, reference_frame_override=None, reference_file_override=None)[source]
- property metric_type: MetricType
RMSF is a variance-based metric.
RMSF measures root-mean-square fluctuations, which are inherently variance-based. Correlated frames lead to biased variance estimates, so independent subsampling (2τ separation) is required for accurate uncertainty quantification.
- Returns:
MetricType.VARIANCE_BASED
- Return type:
MetricType
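The 2τ subsampling idea can be sketched as a strided index selection. This is a simplified illustration: independent_frames is a hypothetical helper, τ is assumed to be a correlation time already expressed in frames, and how the package estimates τ is not shown in this reference.

```python
import math


def independent_frames(n_frames, tau_frames):
    """Indices of frames separated by at least 2*tau (tau in units of frames)."""
    stride = max(1, math.ceil(2 * tau_frames))
    return list(range(0, n_frames, stride))


# Hypothetical trajectory of 100 frames with a correlation time of 12.5 frames.
idx = independent_frames(n_frames=100, tau_frames=12.5)
```

Only the selected frames would then enter the fluctuation estimate, trading sample count for statistical independence.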
Triad
Catalytic triad comparator for comparing active site geometry across conditions.
This module provides the TriadComparator class that orchestrates catalytic triad analysis and statistical comparison across multiple conditions.
The key metric is “simultaneous contact fraction” - the percentage of frames where ALL pairs in the triad are below the contact threshold simultaneously. Higher values indicate better triad integrity and potentially better catalytic competence.
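The simultaneous contact fraction can be sketched from per-frame distances. The pair labels and distances below are made up, simultaneous_contact_fraction is a hypothetical helper, and "below" is read as a strict less-than, which is an assumption. The helper returns a fraction between 0 and 1 rather than a percentage.

```python
def simultaneous_contact_fraction(distances_by_pair, threshold=3.5):
    """Fraction of frames where every pair distance is below the threshold."""
    series = list(distances_by_pair.values())
    n_frames = len(series[0])
    if any(len(s) != n_frames for s in series):
        raise ValueError("all pairs must cover the same frames")
    # A frame counts only if all pairs are in contact at the same time.
    hits = sum(all(s[i] < threshold for s in series) for i in range(n_frames))
    return hits / n_frames


# Hypothetical per-frame distances in Angstroms for a two-pair triad.
distances = {
    "Ser-His": [2.9, 3.1, 4.0, 3.0],
    "His-Asp": [2.8, 3.6, 3.0, 3.2],
}
fraction = simultaneous_contact_fraction(distances)
```

Frames 0 and 3 have both pairs in contact, so the fraction is 0.5 even though each pair on its own is in contact in three of the four frames.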
The comparator inherits from BaseComparator and implements the Template Method pattern for DRY comparison logic.
- class polyzymd.compare.comparators.triad.TriadComparator(config, analysis_settings, equilibration=None)[source]
Bases: BaseComparator[CatalyticTriadAnalysisSettings, dict[str, Any], TriadConditionSummary, TriadComparisonResult]
Compare catalytic triad geometry across multiple simulation conditions.
This class loads triad analysis results for each condition (computing them if necessary), then performs statistical comparisons including t-tests, ANOVA, and effect size calculations on the simultaneous contact fraction.
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (CatalyticTriadAnalysisSettings) – Catalytic triad analysis settings (from config.analysis_settings.get("catalytic_triad")).
equilibration (str, optional) – Equilibration time override (e.g., “10ns”). If None, uses config.defaults.equilibration_time.
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> triad_settings = config.analysis_settings.get("catalytic_triad")
>>> comparator = TriadComparator(config, triad_settings, equilibration="10ns")
>>> result = comparator.compare()
>>> print(result.ranking)
["100% SBMA", "100% EGMA", "No Polymer", "50/50 Mix"]
Notes
Higher simultaneous contact fraction is better (triad is more intact).
- property metric_type: MetricType
Catalytic triad contact fraction is a mean-based metric.
The simultaneous contact fraction is an average over frames (fraction of frames where all pairs are in contact). The mean converges regardless of autocorrelation, but we need to correct the uncertainty using N_eff (effective sample size = N/g where g is the statistical inefficiency).
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
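The N_eff correction can be sketched as a rescaled standard error. This is an illustration under assumptions: corrected_sem is a hypothetical helper, the statistical inefficiency g is taken as a given input rather than estimated from the autocorrelation function, and the value g = 4 is invented.

```python
from statistics import stdev


def corrected_sem(values, g):
    """Standard error of the mean using N_eff = N / g instead of N."""
    if g < 1:
        raise ValueError("statistical inefficiency g must be >= 1")
    n_eff = len(values) / g
    return stdev(values) / n_eff ** 0.5


# Hypothetical per-frame indicator: 1 when all triad pairs are in contact.
in_contact = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
naive = corrected_sem(in_contact, g=1.0)      # treats frames as independent
corrected = corrected_sem(in_contact, g=4.0)  # g = 4 is an assumed value
```

The mean itself is unchanged; only the uncertainty grows, by a factor of sqrt(g).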
Distances
Distances comparator for comparing distance metrics across conditions.
This module provides the DistancesComparator class that orchestrates distance analysis and statistical comparison across multiple conditions.
The primary ranking metric is mean distance (lower = closer interactions). Secondary metric is fraction below threshold (if threshold specified).
The comparator inherits from BaseComparator and implements the Template Method pattern for DRY comparison logic.
- class polyzymd.compare.comparators.distances.DistancesComparator(config, analysis_settings, equilibration=None)[source]
Bases: BaseComparator[DistancesAnalysisSettings, dict[str, Any], DistanceConditionSummary, DistanceComparisonResult]
Compare distance metrics across multiple simulation conditions.
This class loads distance analysis results for each condition (computing them if necessary), then performs statistical comparisons including t-tests, ANOVA, and effect size calculations on both mean distance and fraction below threshold.
Each distance pair is compared independently - there is no cross-pair averaging since different pairs measure fundamentally different physical quantities (e.g., H-bond distances vs lid-opening distances).
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (DistancesAnalysisSettings) – Distance analysis settings (from config.analysis_settings.get("distances")).
equilibration (str, optional) – Equilibration time override (e.g., “10ns”). If None, uses config.defaults.equilibration_time.
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> dist_settings = config.analysis_settings.get("distances")
>>> comparator = DistancesComparator(config, dist_settings, equilibration="10ns")
>>> result = comparator.compare()
>>> print(result.ranking_by_pair["Catalytic H-bond"])  # Per-pair ranking
["100% SBMA", "No Polymer", "50/50 Mix", "100% EGMA"]
Notes
Lower mean distance is better (closer interactions). Higher fraction below threshold is better (more time in contact).
- property metric_type: MetricType
Distance analysis is a mean-based metric.
The mean distance is an average over frames. The mean converges regardless of autocorrelation, but we need to correct the uncertainty using N_eff (effective sample size = N/g where g is the statistical inefficiency).
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
- compare(recompute=False)[source]
Run the comparison across all conditions.
Each distance pair is compared independently - rankings and statistics are computed per-pair since averaging unrelated distances (e.g., H-bond + lid-opening) is not semantically meaningful.
- Parameters:
recompute (bool) – Force recompute even if cached results exist.
- Returns:
Complete comparison result with per-pair rankings.
- Return type:
DistanceComparisonResult
Exposure Dynamics
Exposure dynamics comparator for chaperone-like polymer-protein interaction analysis.
This module provides ExposureDynamicsComparator, which orchestrates:
1. SASA computation (MDTraj shrake_rupley, protein-only)
2. Exposure dynamics analysis (classify residues, detect chaperone events)
3. Chaperone enrichment (dual residue/atom normalization)
4. Statistical comparison of chaperone fraction across conditions
Design follows the ContactsComparator pattern:
- compare() is fully overridden (custom multi-metric flow)
- _load_or_compute() handles caching at replicate level
- Condition summaries aggregate per-replicate ExposureDynamicsResults
Registration: @ComparatorRegistry.register("exposure")
- class polyzymd.compare.comparators.exposure.ExposureDynamicsComparator(config, analysis_settings, comparison_settings=None, equilibration=None)[source]
Bases: BaseComparator[ExposureAnalysisSettings, dict[str, Any], ExposureConditionSummary, ExposureComparisonResult]
Compare chaperone-like polymer activity across simulation conditions.
Combines per-frame SASA data with polymer-protein contact data to:
1. Classify each protein residue as stably exposed, stably buried, or transiently exposed.
2. Detect “chaperone events” (buried → exposed → polymer contact → re-buried) and unassisted refolding events.
3. Compute dynamic chaperone enrichment per (polymer_type, aa_group) pair with dual residue/atom normalization.
4. Statistically compare chaperone_fraction across conditions.
- Parameters:
config (ComparisonConfig) – Comparison configuration defining conditions.
analysis_settings (ExposureAnalysisSettings) – Settings defining SASA and exposure parameters.
comparison_settings (ExposureComparisonSettings, optional) – Settings for statistical comparison. Defaults to ExposureComparisonSettings() if not provided.
equilibration (str, optional) – Equilibration time override. If None, uses config.defaults.equilibration_time.
Notes
This is a MEAN_BASED metric (chaperone fraction is an average over frames).
Conditions without polymer (no chaperone events possible) are excluded.
Contacts must be pre-computed (contacts_rep{n}.json must exist).
SASA is computed on demand and cached under analysis_dir/sasa/.
- property metric_type: MetricType
Chaperone fraction is a mean-based metric.
Chaperone fraction is the fraction of exposed windows that coincide with polymer contact — an average over discrete events. The mean converges regardless of autocorrelation; uncertainty is corrected using N_eff.
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
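To make the chaperone-fraction definition concrete, here is a hedged sketch for a single residue. It assumes two per-frame boolean series (exposed and in polymer contact) and counts an exposed window as a chaperone event if any polymer contact occurs inside it. The function names and this exact event rule are assumptions for illustration; the real analysis may apply additional criteria such as minimum window lengths.

```python
def exposed_windows(exposed):
    """Return (start, stop) pairs for runs of exposed frames that are
    bounded by buried frames on both sides (buried -> exposed -> re-buried).
    `stop` is exclusive."""
    windows = []
    start = None
    for i, is_exposed in enumerate(exposed):
        if is_exposed and start is None:
            # Open a window only if the previous frame was buried.
            if i > 0 and not exposed[i - 1]:
                start = i
        elif not is_exposed and start is not None:
            windows.append((start, i))
            start = None
    # A window still open at the end never re-buried, so it is dropped.
    return windows


def chaperone_fraction(exposed, contact):
    """Fraction of exposed windows that coincide with polymer contact."""
    windows = exposed_windows(exposed)
    if not windows:
        return None  # no complete exposure events to classify
    assisted = sum(1 for a, b in windows if any(contact[a:b]))
    return assisted / len(windows)


# Hypothetical per-frame data for one residue.
exposed = [False, True, True, False, False, True, False, True, True]
contact = [False, False, True, False, False, False, False, True, True]
windows = exposed_windows(exposed)
fraction = chaperone_fraction(exposed, contact)
```

In this toy series there are two complete windows, one with polymer contact and one without (an unassisted refolding event), while the trailing window is ignored because the residue never re-buries.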
Binding Free Energy
Binding free energy comparator via Boltzmann inversion of binding preference.
This module implements BindingFreeEnergyComparator, which converts the existing binding preference (enrichment) data into a selectivity free energy ΔG_sel in real units (kT, kcal/mol, or kJ/mol).
Physics
In the NPT ensemble the correct thermodynamic potential is the Gibbs free energy G. The polymer distributes its contacts across protein surface groups. Both the observed contact distribution (contact_share) and the null reference distribution (expected_share, proportional to each group’s solvent-exposed surface area) are proper probability distributions that sum to 1 over the partition. Boltzmann inversion of their ratio gives the selectivity free energy:
ΔG_sel(j) = -k_B·T · ln(contact_share_j / expected_share_j)
Because both distributions are normalized over the same partition, there is no arbitrary constant — ΔG_sel(j) is fully determined by the data.
Because contact_share / expected_share = enrichment + 1, per replicate:
ΔG_sel,rep = -k_B·T · ln(enrichment_rep + 1)
This is the exact Boltzmann-inverted version of the dimensionless enrichment score.
- Sign convention:
ΔG_sel < 0 → preferential contact (observed > surface-availability reference)
ΔG_sel > 0 → contact avoidance (observed < surface-availability reference)
ΔG_sel = 0 → contacts match the surface-availability reference exactly
Differences between groups (ΔG_sel(i) - ΔG_sel(j)) give the relative selectivity. Differences between conditions (ΔG_sel,B(j) - ΔG_sel,A(j)) give a true ΔΔG.
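The Boltzmann inversion and its unit handling can be sketched in a few lines. This is an illustrative stand-alone function, not the comparator's code: the name `delta_g_sel`, the default temperature, and the choice to return None for zero shares are assumptions made for the example.

```python
import math

K_B_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann (gas) constant in kcal mol^-1 K^-1
KCAL_TO_KJ = 4.184                 # thermochemical calorie


def delta_g_sel(contact_share, expected_share, temperature_K=300.0, units="kT"):
    """Selectivity free energy from Boltzmann inversion of the share ratio.

    Returns None when either share is zero, because the logarithm is
    undefined or the reference is missing.
    """
    if contact_share <= 0.0 or expected_share <= 0.0:
        return None
    dg_kt = -math.log(contact_share / expected_share)  # dimensionless, in kT
    if units == "kT":
        return dg_kt
    if units == "kcal/mol":
        return dg_kt * K_B_KCAL_PER_MOL_K * temperature_K
    if units == "kJ/mol":
        return dg_kt * K_B_KCAL_PER_MOL_K * temperature_K * KCAL_TO_KJ
    raise ValueError(f"unknown units: {units!r}")


preferred = delta_g_sel(0.20, 0.10)                   # observed > reference
avoided = delta_g_sel(0.05, 0.10)                     # observed < reference
in_kcal = delta_g_sel(0.20, 0.10, 300.0, "kcal/mol")  # same ratio, physical units
```

A group that receives twice its expected share gives ΔG_sel = -ln 2 ≈ -0.69 kT, and one that receives half gives +0.69 kT, matching the sign convention above.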
Temperature handling
ΔG_sel computed at temperature T is not comparable to ΔG_sel at T’ (in physical units). Pairwise statistics are suppressed between conditions at different simulation temperatures.
Design
Consumes cached binding preference files produced by ContactsComparator / binding_preference.py. When cached data is missing, computes binding preference on-demand from per-replicate contacts_rep{N}.json files (following the same load-or-compute contract as every other comparator).
Inherits BaseComparator but overrides compare() (like ContactsComparator) because the result type (BindingFreeEnergyResult) does not conform to BaseComparisonResult.
- class polyzymd.compare.comparators.binding_free_energy.BindingFreeEnergyComparator(config, analysis_settings, comparison_settings=None, equilibration=None)[source]
Bases:
BaseComparator[BindingFreeEnergyAnalysisSettings, dict[str, Any], FreeEnergyConditionSummary, BindingFreeEnergyResult]
Compare selectivity free energy (ΔG_sel) across simulation conditions.
Consumes cached binding preference results (produced by the contacts analysis layer) and converts them to selectivity free energies via Boltzmann inversion:
ΔG_sel = -k_B·T · ln(contact_share / expected_share)
Statistical comparisons are only computed between conditions that share the same simulation temperature. Cross-temperature pairs are flagged and their statistics suppressed.
- Parameters:
config (ComparisonConfig) – Comparison configuration.
analysis_settings (BindingFreeEnergyAnalysisSettings) – Units, surface-exposure threshold, custom partitions.
comparison_settings (BindingFreeEnergyComparisonSettings, optional) – FDR alpha. Defaults to BindingFreeEnergyComparisonSettings().
equilibration (str, optional) – Equilibration time override.
Examples
>>> config = ComparisonConfig.from_yaml("comparison.yaml")
>>> settings = BindingFreeEnergyAnalysisSettings(units="kcal/mol")
>>> comparator = BindingFreeEnergyComparator(config, settings)
>>> result = comparator.compare()
>>> print(result.units)
kcal/mol
Notes
This is a MEAN_BASED metric (contact fractions are averages over frames, not fluctuation-based quantities).
- property metric_type: MetricType
Contact share is a mean-based metric.
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
- compare(recompute=False)[source]
Run binding free energy comparison across all conditions.
- Parameters:
recompute (bool, optional) – Ignored (binding free energy is always recomputed from cached binding preference data; it is fast and stateless).
- Returns:
Complete ΔG_sel comparison result.
- Return type:
BindingFreeEnergyResult
Polymer Affinity Score
Polymer affinity score comparator.
This module implements PolymerAffinityScoreComparator, which quantifies the total strength of polymer-protein interactions by summing per-contact free energy contributions weighted by the number of simultaneous contacts.
Physics
For each (polymer_type, protein_group) pair:
S_{p,g} = N_{p,g} × ΔG_sel(p,g)
- where:
N_{p,g} = mean_contact_fraction × n_exposed_in_group
ΔG_sel(p,g) = -ln(contact_share / expected_share) [kT]
Because contact_share / expected_share = enrichment + 1:
ΔG_sel,rep = -ln(enrichment_rep + 1)
The total affinity score for a polymer type is:
S_p = Σ_g S_{p,g}
The total affinity score for a condition is:
S = Σ_p S_p
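The three levels of summation above can be illustrated with a small sketch. The polymer and group labels and all numbers are made up, and `group_score` is a hypothetical helper, not part of polyzymd.

```python
import math
from collections import defaultdict


def group_score(mean_contact_fraction, n_exposed_in_group,
                contact_share, expected_share):
    """S_{p,g} = N_{p,g} * dG_sel(p,g), in kT.

    Returns None when the logarithm is undefined (a zero share).
    """
    if contact_share <= 0.0 or expected_share <= 0.0:
        return None
    n_contacts = mean_contact_fraction * n_exposed_in_group
    delta_g = -math.log(contact_share / expected_share)
    return n_contacts * delta_g


# Hypothetical inputs keyed by (polymer_type, protein_group):
# (mean_contact_fraction, n_exposed_in_group, contact_share, expected_share)
entries = {
    ("polyA", "aromatic"): (0.30, 10, 0.20, 0.10),
    ("polyA", "charged"): (0.05, 20, 0.10, 0.20),
}

per_polymer = defaultdict(float)
for (polymer, group), args in entries.items():
    score = group_score(*args)
    if score is not None:          # skip undefined entries
        per_polymer[polymer] += score   # S_p = sum over groups

total_score = sum(per_polymer.values())  # S = sum over polymer types
```

Here the aromatic group contributes 3 × (-ln 2) and the charged group contributes 1 × (+ln 2), so the favorable term dominates and the total is -2 ln 2 ≈ -1.39 kT.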
Independence assumption
This formulation assumes contacts are thermodynamically independent — each contact contributes the same free energy regardless of what other contacts exist simultaneously. The absolute values are NOT rigorous binding free energies. Only the relative differences between polymer compositions are meaningful as a comparative scoring metric.
Sign convention
S < 0 → net favorable polymer-protein interaction
S > 0 → net unfavorable (avoidance dominates)
S = 0 → contacts match the surface-availability reference
Temperature handling
All scores are in kT (dimensionless); the temperature factor cancels in the Boltzmann inversion ratio. Pairwise statistics are still suppressed between conditions at different simulation temperatures because the mean contact count N is itself temperature-dependent.
Design
Consumes cached binding preference files produced by the contacts analysis layer. When cached data is missing, computes binding preference on-demand from per-replicate contacts_rep{N}.json files.
Inherits BaseComparator but overrides compare() (like the BFE comparator) because the result type does not conform to BaseComparisonResult.
Uses AggregatedBindingPreferenceEntry objects (from bp_result.entries) for the group-level data that includes mean_contact_fraction.
- class polyzymd.compare.comparators.polymer_affinity.PolymerAffinityScoreComparator(config, analysis_settings, comparison_settings=None, equilibration=None)[source]
Bases:
BaseComparator[PolymerAffinityScoreSettings, dict[str, Any], AffinityScoreConditionSummary, PolymerAffinityScoreResult]
Compare polymer affinity scores across simulation conditions.
Computes a composite interaction score for each (polymer_type, protein_group) pair by multiplying the mean number of simultaneous contacts by the per-contact selectivity free energy:
S = N × ΔG_sel [kT]
The total score is summed across all polymer types and protein groups. More negative = stronger net polymer-protein affinity.
Statistical comparisons use per-replicate total scores and are only computed between conditions at the same simulation temperature.
- Parameters:
config (ComparisonConfig) – Comparison configuration.
analysis_settings (PolymerAffinityScoreSettings) – Surface-exposure threshold, protein groups, etc.
comparison_settings (PolymerAffinityScoreComparisonSettings, optional) – FDR alpha. Defaults to PolymerAffinityScoreComparisonSettings().
equilibration (str, optional) – Equilibration time override.
Notes
This is a MEAN_BASED metric (contact fractions are averages over frames, not fluctuation-based quantities).
- property metric_type: MetricType
Contact fractions and shares are mean-based metrics.
- Returns:
MetricType.MEAN_BASED
- Return type:
MetricType
- compare(recompute=False)[source]
Run polymer affinity score comparison across all conditions.
- Parameters:
recompute (bool, optional) – Ignored (affinity scores are always recomputed from cached binding preference data; the computation is fast and stateless).
- Returns:
Complete polymer affinity score comparison result.
- Return type:
PolymerAffinityScoreResult
Results
Common result modules live under polyzymd.compare.results.
Stable result families include:
- polyzymd.compare.results.rmsf
- polyzymd.compare.results.triad
- polyzymd.compare.results.contacts
- polyzymd.compare.results.distances
- polyzymd.compare.results.secondary_structure
Result models for binding free energy comparison analysis.
Physics background
In the NPT ensemble (constant pressure, as used in all polyzymd simulations) the correct thermodynamic potential is the Gibbs free energy G.
The quantity computed here is a selectivity free energy (ΔG_sel) that measures how much more (or less) favorable it is for a polymer to contact a given group of protein residues compared to what would be expected if the polymer contacted each exposed surface residue in proportion to that residue group’s share of the total solvent-exposed protein surface.
Concretely: if aromatic residues make up 10% of the solvent-exposed surface but receive 20% of the polymer’s contacts, the polymer preferentially contacts aromatic residues. The reference (expected) distribution is simply proportional to surface availability — not any property of the polymer itself.
ΔG_sel(j) = -k_B·T · ln(contact_share_j / expected_share_j)
- where:
contact_share_j = (contact frames involving residues in group j) / (total contact frames across all protein residues): the observed fraction of polymer contacts directed at group j
expected_share_j = (number of solvent-exposed residues in group j) / (total number of solvent-exposed protein residues): the fraction of the protein surface belonging to group j; this is the reference assuming contacts are distributed purely by surface area
k_B = Boltzmann constant (0.0019872041 kcal mol⁻¹ K⁻¹)
T = simulation temperature in Kelvin
Because both distributions are normalized over the same partition (they sum to 1 over all groups), there is no arbitrary additive constant — ΔG_sel is fully determined by the data.
When units=’kT’ (default), the formula simplifies to:
ΔG_sel(j) / k_BT = -ln(contact_share_j / expected_share_j)
yielding a dimensionless value directly comparable to the thermal energy scale. A value of -1.0 means the binding preference is exactly 1 k_BT favorable relative to the surface-availability reference.
Note: contact_share / expected_share = enrichment_ratio = enrichment + 1 (where enrichment is the existing dimensionless enrichment score from binding preference analysis). So ΔG_sel = -kT·ln(enrichment + 1), and the two representations are mathematically equivalent; ΔG_sel simply puts the enrichment score on a physically meaningful energy scale.
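The relationship between raw counts, the two shares, the enrichment score, and ΔG_sel can be checked numerically. The sketch below reuses the aromatic example from above (10% of the exposed surface, 20% of the contacts); the group names, counts, and the `normalize` helper are illustrative assumptions.

```python
import math


def normalize(counts):
    """Turn per-group counts into shares that sum to 1 over the partition."""
    total = sum(counts.values())
    return {group: value / total for group, value in counts.items()}


# Hypothetical partition of the protein surface into three groups.
contact_frames = {"aromatic": 200, "polar": 300, "charged": 500}
exposed_residues = {"aromatic": 10, "polar": 40, "charged": 50}

contact_share = normalize(contact_frames)      # observed distribution
expected_share = normalize(exposed_residues)   # surface-availability reference

ratio = contact_share["aromatic"] / expected_share["aromatic"]
enrichment = ratio - 1.0                       # dimensionless enrichment score
delta_g_kt = -math.log(enrichment + 1.0)       # identical to -ln(ratio)
```

Both distributions sum to 1 over the same partition, so the aromatic ratio of 2 maps to an enrichment of 1 and ΔG_sel = -ln 2 ≈ -0.69 kT with no additive constant to choose.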
- Sign convention:
ΔG_sel < 0 → preferential contact (observed > surface-availability reference)
ΔG_sel > 0 → contact avoidance (observed < surface-availability reference)
ΔG_sel = 0 → contacts match the surface-availability reference exactly
Differences between conditions (ΔG_sel,B(j) − ΔG_sel,A(j)) give a true ΔΔG, stored in FreeEnergyPairwiseEntry.delta_delta_G.
Uncertainty propagation
When multiple independent replicates are available, two uncertainty estimates are reported:
Between-replicate SEM on ΔG_sel (primary, used for pairwise statistics): ΔG_sel is computed independently for each replicate, and the SEM is taken directly across those values. This is the most statistically sound approach for independent replicates and is the quantity used in t-tests.
Delta-method propagation (analytical approximation, stored for reference): For the mean contact_share and its SEM, uncertainty is propagated through the logarithm using first-order error propagation (Taylor 1997, ch. 3; Bevington & Robinson 2003, ch. 3):
σ(ΔG_sel) ≈ k_B·T · √[(σ_cs / cs)² + (σ_es / es)²] (or simply √[…] when units=’kT’)
where σ_cs = SEM of contact_share across replicates, and σ_es ≈ 0 because expected_share is computed from a single static PDB structure (no replicate variance). This simplifies to σ(ΔG_sel) ≈ k_B·T · (σ_cs / cs) (or σ_cs / cs when units=’kT’).
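The two uncertainty estimates can be compared side by side in kT units. This is a sketch with made-up replicate values and hypothetical function names; it assumes σ_es = 0 as stated above.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean across independent replicates."""
    if len(values) < 2:
        return None  # SEM is undefined for a single replicate
    return statistics.stdev(values) / math.sqrt(len(values))


def delta_method_sigma(contact_shares):
    """First-order propagation through the log: sigma(dG) ~ sigma_cs / cs (kT)."""
    cs_mean = statistics.mean(contact_shares)
    cs_sem = sem(contact_shares)
    if cs_sem is None or cs_mean <= 0.0:
        return None
    return cs_sem / cs_mean


# Hypothetical contact shares from three replicates; expected_share is
# taken from a single static structure, so it carries no variance.
contact_shares = [0.18, 0.20, 0.22]
expected_share = 0.10

dg_per_replicate = [-math.log(cs / expected_share) for cs in contact_shares]
sigma_replicate = sem(dg_per_replicate)            # primary estimate
sigma_delta = delta_method_sigma(contact_shares)   # analytical approximation
```

For small relative scatter the two estimates agree closely (about 0.058 kT each here); they diverge as the spread in contact_share grows and the first-order expansion of the logarithm becomes less accurate.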
References:
- Taylor, J. R. (1997). An Introduction to Error Analysis, 2nd ed. University Science Books. (Ch. 3: Error propagation for functions of one or more variables)
- Bevington, P. R. & Robinson, D. K. (2003). Data Reduction and Error Analysis for the Physical Sciences, 3rd ed. McGraw-Hill. (Ch. 3)
- Wikipedia: Delta method, https://en.wikipedia.org/wiki/Delta_method
- Temperature handling:
When units=’kT’, ΔG_sel = -ln(ratio) is temperature-independent (the same ratio at any temperature gives the same dimensionless value). However, the underlying contact probabilities ARE temperature-dependent, so cross-temperature comparisons still require caution. When units=’kcal/mol’ or ‘kJ/mol’, ΔG_sel computed at temperature T is NOT directly comparable to ΔG_sel at temperature T’. Pairwise statistical comparisons are only computed between conditions sharing the same simulation temperature.
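The temperature guard on pairwise comparisons might look like the sketch below. Several details are assumptions made for illustration and are not confirmed by this page: the Welch form of the t-statistic, the 0.5 K matching tolerance, the dict layout, and the choice to leave ΔΔG unset for cross-temperature pairs.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for mean(b) - mean(a) with unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(
        var_a / len(a) + var_b / len(b)
    )


def pairwise_entry(dg_a, dg_b, temp_a_K, temp_b_K, tol_K=0.5):
    """Compare per-replicate dG_sel values of conditions A and B.

    Statistics are filled in only when both conditions were simulated
    at the same temperature (within `tol_K`, a hypothetical tolerance).
    """
    cross = abs(temp_a_K - temp_b_K) > tol_K
    entry = {"cross_temperature": cross, "delta_delta_G": None, "t_statistic": None}
    if cross:
        return entry  # suppressed: values are not comparable across T
    entry["delta_delta_G"] = statistics.mean(dg_b) - statistics.mean(dg_a)
    entry["t_statistic"] = welch_t(dg_a, dg_b)
    return entry


dg_a = [-0.60, -0.70, -0.65]   # hypothetical replicates, condition A
dg_b = [-0.20, -0.30, -0.25]   # hypothetical replicates, condition B
same_t = pairwise_entry(dg_a, dg_b, 300.0, 300.0)
cross_t = pairwise_entry(dg_a, dg_b, 300.0, 330.0)
```

In the same-temperature case ΔΔG = +0.40, meaning condition B has less favorable selectivity than A; in the cross-temperature case only the flag is set.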
- class polyzymd.compare.results.binding_free_energy.FreeEnergyEntry(*, polymer_type, protein_group, partition_name='aa_class', contact_share, expected_share, enrichment_ratio, delta_G=None, delta_G_uncertainty=None, delta_G_per_replicate=<factory>, units='kT', temperature_K, n_replicates=0, n_exposed_in_group=0)[source]
Bases:
BaseModel
Free energy analysis for one (polymer_type, protein_group) pair in one condition.
Stores both the ΔG_sel value and the raw probability quantities used to compute it, enabling reproducibility and downstream verification.
- contact_share
Observed fraction of polymer contacts directed at this group. This is P_obs in ΔG_sel = -kT·ln(P_obs / P_ref).
- Type:
- expected_share
Surface-availability-weighted reference fraction. This is P_ref in ΔG_sel = -kT·ln(P_obs / P_ref).
- Type:
- enrichment_ratio
contact_share / expected_share (= enrichment + 1). Stored for traceability; ΔG_sel = -kT·ln(enrichment_ratio).
- Type:
- delta_G
ΔG_sel in the configured units. None when contact_share = 0 or expected_share = 0 (log undefined or reference missing).
- Type:
float | None
- delta_G_uncertainty
σ(ΔG_sel) from delta-method error propagation. None if delta_G is None or if SEM data is unavailable (single replicate).
- Type:
float | None
- delta_G_per_replicate
Per-replicate ΔG_sel values used for cross-condition statistics.
- n_exposed_in_group
Number of surface-exposed residues in this group (used for expected_share).
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.binding_free_energy.FreeEnergyConditionSummary(*, label, config_path, temperature_K, n_replicates, units='kT', entries=<factory>, polymer_types=<factory>, protein_groups=<factory>)[source]
Bases:
BaseModel
Free energy summary for one simulation condition.
Aggregates FreeEnergyEntry objects across all (polymer_type, protein_group) pairs for a single condition, together with condition metadata.
- entries
All (polymer_type, protein_group) ΔG_sel entries.
- Type:
- entries: list[FreeEnergyEntry]
- property primary_metric_value: float
Mean ΔG_sel across all valid entries (for BaseConditionSummary compatibility).
- get_entry(polymer_type, protein_group, partition_name=None)[source]
Get the FreeEnergyEntry for a (polymer_type, protein_group) pair.
- Parameters:
polymer_type (str) – Polymer type.
protein_group (str) – AA group label.
partition_name (str or None, optional) – If given, further restrict to entries belonging to this partition. Necessary when the same protein_group label appears in multiple partitions (e.g., “rest_of_protein” in several user-defined partitions).
- Return type:
FreeEnergyEntry or None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.binding_free_energy.FreeEnergyPairwiseEntry(*, polymer_type, protein_group, condition_a, condition_b, temperature_a_K, temperature_b_K, cross_temperature=False, delta_G_a=None, delta_G_b=None, delta_delta_G=None, t_statistic=None, p_value=None)[source]
Bases:
BaseModel
Pairwise comparison between two conditions for one (polymer, group) pair.
Each condition has a per-group selectivity free energy ΔG_sel. The difference ΔΔG = ΔG_sel,B − ΔG_sel,A is a true double-delta quantity.
Statistics are only computed when both conditions share the same simulation temperature. If temperatures differ, all stat fields are None and the cross_temperature flag is set to True.
- delta_delta_G
ΔΔG = ΔG_sel,B − ΔG_sel,A. Positive → B has less favorable selectivity.
- Type:
float | None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.binding_free_energy.BindingFreeEnergyResult(*, name, units='kT', formula='ΔG_sel = -ln(contact_share / expected_share) [units: k_bT]', mixed_temperatures=False, temperature_groups=<factory>, conditions=<factory>, pairwise_comparisons=<factory>, polymer_types=<factory>, protein_groups=<factory>, surface_exposure_threshold=None, equilibration_time='', created_at=<factory>, polyzymd_version=<factory>)[source]
Bases:
BaseModel
Complete binding free energy comparison result.
This is the main output from BindingFreeEnergyComparator.compare().
Physics summary
Formula: ΔG_sel = -k_B·T · ln(contact_share / expected_share)
Uncertainty: σ(ΔG_sel) ≈ k_B·T · √[(σ_cs/cs)² + (σ_es/es)²]
Temperature note: pairwise statistics are suppressed between conditions at different temperatures. The mixed_temperatures flag indicates this occurred. Each condition’s temperature is stored in its summary.
- temperature_groups
Mapping of temperature (K) to condition labels at that temperature.
- conditions
Summary for each condition.
- Type:
- pairwise_comparisons
All pairwise comparisons (cross-T pairs have stats suppressed).
- Type:
- surface_exposure_threshold
SASA threshold used (from binding preference settings).
- Type:
float | None
- created_at
When the analysis was run.
- Type:
datetime
- conditions: list[FreeEnergyConditionSummary]
- pairwise_comparisons: list[FreeEnergyPairwiseEntry]
- created_at: datetime
- save(path)[source]
Save result to JSON file.
- Parameters:
path (Path or str) – Output path.
- Returns:
Path to the saved file.
- Return type:
Path
- classmethod load(path)[source]
Load result from JSON file.
- Parameters:
path (Path or str) – Path to JSON file.
- Returns:
Loaded result.
- Return type:
BindingFreeEnergyResult
- get_condition(label)[source]
Get a condition summary by label.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Exposure dynamics condition summary and comparison result models.
These classes inherit from the base classes in compare/core/ and add exposure-dynamics-specific fields for chaperone event analysis.
- class polyzymd.compare.results.exposure.ExposureConditionSummary(*, label, config_path, n_replicates, replicate_values, mean_transient_fraction, sem_transient_fraction, mean_chaperone_fraction, sem_chaperone_fraction, mean_n_transient, mean_total_chaperone_events=0.0, mean_total_unassisted_events=0.0, enrichment_by_polymer_type=<factory>, polymer_types=<factory>, aa_groups=<factory>)[source]
Bases:
BaseConditionSummary
Summary statistics for one condition in an exposure dynamics comparison.
- replicate_values
Per-replicate mean chaperone fraction across transient residues.
- mean_transient_fraction
Mean fraction of protein residues that are transiently exposed, averaged across replicates.
- Type:
- mean_chaperone_fraction
Mean chaperone fraction (chaperone events / total exposed windows) across transient residues and replicates.
- Type:
- enrichment_by_polymer_type
Nested dict: polymer_type → aa_group → mean enrichment_residue.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.exposure.ExposureComparisonResult(*, metric='chaperone_fraction', name, control_label=None, conditions=<factory>, pairwise_comparisons=<factory>, anova=None, ranking, equilibration_time='0ns', created_at=<factory>, polyzymd_version='1.2.1', ranking_by_transient_fraction=<factory>, excluded_conditions=<factory>)[source]
Bases:
BaseComparisonResult[ExposureConditionSummary, PairwiseComparison]
Complete exposure dynamics comparison result.
This is the main output from ExposureDynamicsComparator.compare(). Contains per-condition summaries of transient exposure and chaperone event statistics, plus pairwise statistical comparisons.
- conditions
Summary for each condition.
- Type:
- pairwise_comparisons
Pairwise t-tests on chaperone_fraction.
- Type:
- anova
One-way ANOVA across all conditions.
- Type:
ANOVASummary, optional
- ranking_by_transient_fraction
Condition labels sorted by transient_fraction (highest first).
- created_at
When the analysis was run.
- Type:
datetime
- conditions: list[ExposureConditionSummary]
- pairwise_comparisons: list[PairwiseComparison]
- created_at: datetime
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Result models for polymer affinity score comparison analysis.
The polymer affinity score is a comparative metric that quantifies the total strength of polymer-protein interactions by summing per-contact free energy contributions weighted by the number of simultaneous contacts.
Physics
For each (polymer_type, protein_group) pair, the affinity score is:
S_{p,g} = N_{p,g} × ΔG_sel(p,g)
- where:
N_{p,g} = mean number of simultaneous contacts per frame = mean_contact_fraction × n_exposed_in_group
ΔG_sel(p,g) = -ln(contact_share / expected_share) [in units of k_BT]
The total affinity score for a polymer type is:
S_p = Σ_g S_{p,g}
The total affinity score for a condition is:
S = Σ_p S_p
Independence assumption
This formulation assumes contacts are thermodynamically independent — each contact contributes the same free energy regardless of what other contacts exist simultaneously. This is the standard polyvalent binding approximation (Mammen et al., Angew. Chem. Int. Ed. 1998, 37, 2754).
The absolute values are NOT rigorous thermodynamic binding free energies. However, the relative differences between polymer compositions are meaningful as a comparative scoring function, analogous to scoring functions in molecular docking or MM/PBSA decomposition.
Sign convention
S < 0 → net favorable polymer-protein interaction
S > 0 → net unfavorable (avoidance dominates)
S = 0 → contacts match the surface-availability reference
Interpretation
More negative total score → stronger net polymer-protein affinity. When combined with structural stability metrics (RMSF, triad contacts), the affinity score helps rank polymer compositions by total interaction strength.
Uncertainty propagation
Per-replicate scores are computed independently:
S_rep = N_rep × ΔG_sel,rep
where N_rep = contact_fraction_rep × n_exposed_in_group, and ΔG_sel,rep = -ln(enrichment_rep + 1). The mean and SEM are taken across replicates. This approach naturally captures the covariance between N and ΔG_sel.
When per-replicate data is unavailable, analytical error propagation is used:
σ(S) = √[(N·σ_ΔG_sel)² + (ΔG_sel·σ_N)²]
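Both uncertainty routes can be sketched briefly. The function names and input values are hypothetical; the formulas are the ones stated above, with the analytical form treating N and ΔG_sel as uncorrelated.

```python
import math
import statistics


def score_from_replicates(contact_fraction_reps, enrichment_reps, n_exposed_in_group):
    """Per-replicate S_rep = N_rep * dG_sel,rep, then mean and SEM (kT)."""
    scores = [
        cf * n_exposed_in_group * -math.log(enrichment + 1.0)
        for cf, enrichment in zip(contact_fraction_reps, enrichment_reps)
    ]
    mean_score = statistics.mean(scores)
    sem_score = (
        statistics.stdev(scores) / math.sqrt(len(scores)) if len(scores) > 1 else None
    )
    return mean_score, sem_score, scores


def analytic_sigma(n_contacts, sigma_n, delta_g, sigma_delta_g):
    """Fallback: sigma(S) = sqrt((N * sigma_dG)^2 + (dG * sigma_N)^2)."""
    return math.sqrt((n_contacts * sigma_delta_g) ** 2 + (delta_g * sigma_n) ** 2)


# Hypothetical replicate data for one (polymer_type, protein_group) pair.
mean_s, sem_s, per_rep = score_from_replicates(
    contact_fraction_reps=[0.28, 0.30, 0.32],
    enrichment_reps=[0.9, 1.0, 1.1],
    n_exposed_in_group=10,
)
fallback = analytic_sigma(n_contacts=3.0, sigma_n=0.0, delta_g=-0.7, sigma_delta_g=0.1)
```

The replicate route multiplies N and ΔG_sel within each replicate before averaging, which is why it picks up their covariance; the analytical fallback cannot.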
- class polyzymd.compare.results.polymer_affinity.AffinityScoreEntry(*, polymer_type, protein_group, partition_name='aa_class', n_contacts, delta_G_per_contact=None, affinity_score=None, affinity_score_uncertainty=None, affinity_score_per_replicate=<factory>, mean_contact_fraction=0.0, n_exposed_in_group=0, contact_share=0.0, expected_share=0.0, temperature_K=0.0, n_replicates=0)[source]
Bases:
BaseModel
Affinity score for one (polymer_type, protein_group) pair in one condition.
Stores both the composite score and its constituent quantities for reproducibility and downstream verification.
- n_contacts
Mean number of simultaneous contacts per frame. Computed as mean_contact_fraction * n_exposed_in_group.
- Type:
- delta_G_per_contact
Per-contact selectivity free energy in kT. Computed as -ln(contact_share / expected_share).
- Type:
float | None
- affinity_score
Composite score: n_contacts * delta_G_per_contact (kT). More negative = stronger favorable interaction.
- Type:
float | None
- affinity_score_uncertainty
Uncertainty on affinity_score. From replicate SEM when available, otherwise from analytical error propagation.
- Type:
float | None
- affinity_score_per_replicate
Per-replicate affinity scores for statistical testing.
- mean_contact_fraction
Mean per-residue contact fraction in this group (from binding preference).
- Type:
- contact_share
Observed fraction of polymer contacts directed at this group.
- Type:
- expected_share
Surface-availability reference fraction.
- Type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.polymer_affinity.PolymerTypeScore(*, polymer_type, total_score, total_score_uncertainty=None, total_score_per_replicate=<factory>, total_n_contacts=0.0, group_contributions=<factory>)[source]
Bases:
BaseModel
Aggregated affinity score for one polymer type across all protein groups.
- The score is the sum of per-group affinity scores:
S_p = Σ_g (N_g × ΔG_sel(g))
- group_contributions
Breakdown by protein group (for detail reporting).
- Type:
- group_contributions: list[AffinityScoreEntry]
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.polymer_affinity.AffinityScoreConditionSummary(*, label, config_path, temperature_K, n_replicates=0, total_score=0.0, total_score_uncertainty=None, total_score_per_replicate=<factory>, total_n_contacts=0.0, polymer_type_scores=<factory>, entries=<factory>, polymer_types=<factory>, protein_groups=<factory>)[source]
Bases:
BaseModel
Affinity score summary for one simulation condition.
Aggregates scores at three levels: per (polymer_type, protein_group), per polymer_type, and total condition score.
- total_score_per_replicate
Per-replicate grand total scores for pairwise statistics.
- polymer_type_scores
Per-polymer-type score breakdown.
- Type:
- entries
All (polymer_type, protein_group) entries.
- Type:
- polymer_type_scores: list[PolymerTypeScore]
- entries: list[AffinityScoreEntry]
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.polymer_affinity.AffinityScorePairwiseEntry(*, condition_a, condition_b, temperature_a_K, temperature_b_K, cross_temperature=False, score_a=0.0, score_b=0.0, delta_score=None, t_statistic=None, p_value=None)[source]
Bases:
BaseModel
Pairwise affinity score comparison between two conditions.
Compares total affinity scores. Statistics are suppressed for cross-temperature pairs.
- delta_score
Difference: score_B - score_A (kT). Negative = B has stronger affinity than A.
- Type:
float | None
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class polyzymd.compare.results.polymer_affinity.PolymerAffinityScoreResult(*, name, methodology='Polymer Affinity Score: S = Σ (N_contacts × ΔG_sel_per_contact) [kT]. N_contacts = mean_contact_fraction × n_exposed_in_group. ΔG_sel_per_contact = -ln(contact_share / expected_share). More negative = stronger net polymer-protein affinity. Assumes contact independence; interpret as comparative scoring metric.', mixed_temperatures=False, temperature_groups=<factory>, conditions=<factory>, pairwise_comparisons=<factory>, polymer_types=<factory>, protein_groups=<factory>, surface_exposure_threshold=None, equilibration_time='', created_at=<factory>, polyzymd_version=<factory>)[source]
Bases:
BaseModel
Complete polymer affinity score comparison result.
This is the main output from PolymerAffinityScoreComparator.compare().
The polymer affinity score quantifies total polymer-protein interaction strength as a comparative metric. It is computed by summing per-contact selectivity free energies weighted by the number of simultaneous contacts:
S = Σ_{p,g} N_{p,g} × ΔG_sel(p,g)
where the sum runs over all (polymer_type, protein_group) pairs.
Important
This quantity assumes contact independence and should be interpreted as a relative affinity score, not a rigorous thermodynamic binding free energy. See the module docstring for details.
- temperature_groups
Mapping of temperature (K, as str) to condition labels.
- conditions
Summary for each condition.
- pairwise_comparisons
All pairwise comparisons.
- Type:
- created_at
When the analysis was run.
- Type:
datetime
- conditions: list[AffinityScoreConditionSummary]
- pairwise_comparisons: list[AffinityScorePairwiseEntry]
- created_at: datetime
- save(path)[source]
Save result to JSON file.
- Parameters:
path (Path or str) – Output path.
- Returns:
Path to the saved file.
- Return type:
Path
- classmethod load(path)[source]
Load result from JSON file.
- Parameters:
path (Path or str) – Path to JSON file.
- Returns:
Loaded result.
- Return type:
PolymerAffinityScoreResult
- get_condition(label)[source]
Look up a condition summary by label.
- Parameters:
label (str) – Condition display name.
- Return type:
AffinityScoreConditionSummary or None
- get_ranking()[source]
Return conditions ranked by total affinity score (most negative first).
- Returns:
Conditions sorted by total_score ascending.
- Return type:
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Formatters
Output formatters for binding free energy comparison results.
Provides console table, Markdown, and JSON output for BindingFreeEnergyResult.
- polyzymd.compare.binding_free_energy_formatters.format_bfe_console_table(result)[source]
Format a BindingFreeEnergyResult as a console-friendly ASCII table.
- Parameters:
result (BindingFreeEnergyResult) – Comparison result to format.
- Returns:
ASCII table string.
- Return type:
str
- polyzymd.compare.binding_free_energy_formatters.format_bfe_markdown(result)[source]
Format a BindingFreeEnergyResult as Markdown.
- Parameters:
result (BindingFreeEnergyResult) – Comparison result to format.
- Returns:
Markdown-formatted string.
- Return type:
str
- polyzymd.compare.binding_free_energy_formatters.format_bfe_json(result)[source]
Format a BindingFreeEnergyResult as JSON.
- Parameters:
result (BindingFreeEnergyResult) – Comparison result to format.
- Returns:
JSON string.
- Return type:
str
- polyzymd.compare.binding_free_energy_formatters.format_bfe_result(result, format='table')[source]
Format a BindingFreeEnergyResult in the requested format.
- Parameters:
result (BindingFreeEnergyResult) – Comparison result to format.
format (str) – Output format: “table” (default), “markdown”, or “json”.
- Returns:
Formatted string.
- Return type:
str
- Raises:
ValueError – If format is not recognized.
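The format dispatch described above (a `format` string selecting table, Markdown, or JSON output, with ValueError for anything else) might look roughly like the sketch below. All names here (`format_rows`, `to_table`, the `rows` structure) are hypothetical and chosen for illustration; the real functions take a BindingFreeEnergyResult, whose fields are not shown in this reference.

```python
import json
from typing import Callable, Dict, List

# Hypothetical input: one dictionary per condition.
Rows = List[Dict[str, object]]


def to_table(rows: Rows) -> str:
    # Fixed-width columns for console output.
    header = f"{'condition':<12}{'score':>8}"
    lines = [header, "-" * len(header)]
    lines += [f"{r['condition']:<12}{r['score']:>8.2f}" for r in rows]
    return "\n".join(lines)


def to_markdown(rows: Rows) -> str:
    # Pipe-delimited Markdown table.
    lines = ["| condition | score |", "| --- | --- |"]
    lines += [f"| {r['condition']} | {r['score']:.2f} |" for r in rows]
    return "\n".join(lines)


def to_json(rows: Rows) -> str:
    return json.dumps(rows, indent=2)


_FORMATTERS: Dict[str, Callable[[Rows], str]] = {
    "table": to_table,
    "markdown": to_markdown,
    "json": to_json,
}


def format_rows(rows: Rows, format: str = "table") -> str:
    # Look up the formatter by name; unknown names raise ValueError.
    try:
        formatter = _FORMATTERS[format]
    except KeyError:
        raise ValueError(
            f"Unknown format {format!r}; expected one of {sorted(_FORMATTERS)}"
        ) from None
    return formatter(rows)


rows = [{"condition": "0% SBMA", "score": -1.25}]
markdown_output = format_rows(rows, "markdown")
```

A dictionary lookup keeps the three formatters independent, so adding a fourth format would only need a new entry in the mapping.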
Output formatters for polymer affinity score comparison results.
Provides console table, Markdown, and JSON output for PolymerAffinityScoreResult.
- polyzymd.compare.polymer_affinity_formatters.format_affinity_console_table(result)[source]
Format a PolymerAffinityScoreResult as a console-friendly ASCII table.
- Parameters:
result (PolymerAffinityScoreResult) – Comparison result to format.
- Returns:
ASCII table string.
- Return type:
str
- polyzymd.compare.polymer_affinity_formatters.format_affinity_markdown(result)[source]
Format a PolymerAffinityScoreResult as Markdown.
- Parameters:
result (PolymerAffinityScoreResult) – Comparison result to format.
- Returns:
Markdown-formatted string.
- Return type:
str
- polyzymd.compare.polymer_affinity_formatters.format_affinity_json(result)[source]
Format a PolymerAffinityScoreResult as JSON.
- Parameters:
result (PolymerAffinityScoreResult) – Comparison result to format.
- Returns:
JSON string.
- Return type:
str
- polyzymd.compare.polymer_affinity_formatters.format_affinity_result(result, format='table')[source]
Format a PolymerAffinityScoreResult in the requested format.
- Parameters:
result (PolymerAffinityScoreResult) – Comparison result to format.
format (str) – Output format: “table” (default), “markdown”, or “json”.
- Returns:
Formatted string.
- Return type:
str
- Raises:
ValueError – If format is not recognized.
Plotters
Binding free energy plotters for comparison workflow.
This module provides registered plotters for ΔG_sel (selectivity free energy) analysis:
- BFEHeatmapPlotter: ΔG_sel heatmap with rows = AA groups, columns = conditions
- BFEBarPlotter: Grouped bar chart of ΔG_sel by AA residue class
Both plotters load a BindingFreeEnergyResult JSON saved by the
polyzymd compare binding-free-energy command (in results/ adjacent to
comparison.yaml) rather than per-condition analysis directories.
Partition-aware plotting
Each FreeEnergyEntry carries a partition_name field (e.g., “aa_class”,
“lid_helices”, “whole_lid_domain”) that identifies which residue grouping
scheme produced that entry. Different partitions use different denominators
(each partition’s total exposed surface area), so mixing groups from different
partitions on the same figure is scientifically misleading.
Both plotters therefore produce one figure per (partition, polymer_type) combination. When only a single partition is present (the common case for datasets that only use default AA-class grouping), filenames and titles omit the partition name to preserve backward compatibility.
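The one-figure-per-(partition, polymer_type) rule and the single-partition filename behavior can be sketched as follows. `EntrySketch` is a hypothetical stand-in for FreeEnergyEntry (only `partition_name` is documented above; the other fields are assumptions), and the filename scheme in `figure_filename` is invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EntrySketch:
    """Hypothetical stand-in for a FreeEnergyEntry."""
    partition_name: str
    polymer_type: str
    group: str
    delta_g: float


def group_entries(
    entries: List[EntrySketch],
) -> Dict[Tuple[str, str], List[EntrySketch]]:
    # One bucket (and therefore one figure) per (partition, polymer_type).
    figures = defaultdict(list)
    for entry in entries:
        figures[(entry.partition_name, entry.polymer_type)].append(entry)
    return dict(figures)


def figure_filename(
    partition: str, polymer_type: str, n_partitions: int, stem: str = "bfe_heatmap"
) -> str:
    # Assumed naming scheme: drop the partition when only one exists.
    if n_partitions == 1:
        return f"{stem}_{polymer_type}.png"
    return f"{stem}_{partition}_{polymer_type}.png"


entries = [
    EntrySketch("aa_class", "SBMA", "aromatic", -0.4),
    EntrySketch("aa_class", "SBMA", "charged", 0.3),
    EntrySketch("lid_helices", "SBMA", "helix_1", -0.1),
]
figures = group_entries(entries)
n_partitions = len({partition for partition, _ in figures})
names = sorted(figure_filename(p, t, n_partitions) for p, t in figures)
```

Keeping each partition in its own bucket means groups normalized by different exposed-surface totals never share an axis.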
Physics interpretation
- ΔG_sel < 0 → preferential contact (polymer contacts this group more than expected from surface availability alone)
- ΔG_sel > 0 → contact avoidance (polymer contacts this group less than expected)
- ΔG_sel = 0 → contacts match surface-availability reference exactly
The diverging colormap (RdBu_r by default) is centered at 0.0:
- Blue (negative) → preference
- White (zero) → neutral
- Red (positive) → avoidance
Units are whatever was specified in analysis_settings.binding_free_energy.units (kT by default, which is dimensionless: energies are expressed in units of k_BT).
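One common way to keep a diverging colormap centered at zero is to use color limits that are symmetric about zero. The sketch below shows that idea together with the sign interpretation; it is an assumption about how centering could be done, not a description of the plotter's code, and both function names are hypothetical.

```python
from typing import Iterable, Tuple


def symmetric_limits(values: Iterable[float]) -> Tuple[float, float]:
    # Use the largest magnitude on both sides so that 0.0 sits mid-scale.
    bound = max((abs(v) for v in values), default=0.0)
    bound = bound or 1.0  # avoid a zero-width range when all values are 0
    return -bound, bound


def interpret(delta_g: float, tol: float = 1e-12) -> str:
    # Negative means preferential contact, positive means avoidance.
    if delta_g < -tol:
        return "preference"
    if delta_g > tol:
        return "avoidance"
    return "neutral"


values = [-0.8, 0.1, 0.5]
vmin, vmax = symmetric_limits(values)
labels = [interpret(v) for v in values]
```

The resulting `vmin` and `vmax` could then be passed to a Matplotlib call such as `imshow(..., cmap="RdBu_r", vmin=vmin, vmax=vmax)`, so that white corresponds to ΔG_sel = 0.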
- class polyzymd.compare.plotters.binding_free_energy.BFEHeatmapPlotter(settings)[source]
Bases: BasePlotter
Generate ΔG_sel heatmap comparing binding free energy across conditions.
Creates one figure per (partition, polymer_type) combination:
- Rows: protein groups belonging to that partition
- Columns: conditions (e.g., 0% SBMA, 25% SBMA, …)
- Color: ΔG_sel value with diverging colormap centered at 0
When only a single partition exists (e.g., just “aa_class”), filenames and titles match the previous single-partition behavior for backward compatibility.
Loads BindingFreeEnergyResult from results/ adjacent to comparison.yaml (accepts both binding_free_energy_comparison_*.json and bfe_comparison_*.json naming conventions).
Sign convention
Blue (negative ΔG_sel) = preferential contact
Red (positive ΔG_sel) = contact avoidance
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Return True for ‘binding_free_energy’ when heatmap is enabled.
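Locating a saved result under either naming convention can be sketched with `pathlib` globbing. The helper name `find_result_files` is hypothetical, and how the plotter chooses among several matching files is not stated in this reference, so the sketch simply returns all matches.

```python
import tempfile
from pathlib import Path
from typing import List

# The two filename patterns named in the class description.
PATTERNS = ("binding_free_energy_comparison_*.json", "bfe_comparison_*.json")


def find_result_files(comparison_yaml) -> List[Path]:
    # Results live in a results/ directory next to comparison.yaml.
    results_dir = Path(comparison_yaml).parent / "results"
    found: List[Path] = []
    for pattern in PATTERNS:
        found.extend(sorted(results_dir.glob(pattern)))
    return found


# Demonstration in a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "results").mkdir()
    (project / "results" / "bfe_comparison_demo.json").write_text("{}")
    (project / "results" / "notes.txt").write_text("ignored")
    found_names = [p.name for p in find_result_files(project / "comparison.yaml")]
```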
- class polyzymd.compare.plotters.binding_free_energy.BFEBarPlotter(settings)[source]
Bases: BasePlotter
Generate ΔG_sel grouped bar charts comparing binding free energy across conditions.
Creates one figure per (partition, polymer_type) combination with:
- Groups on x-axis: protein groups from that partition
- Bars within each group: one per condition
- Error bars: between-replicate SEM on ΔG_sel (delta-method fallback)
- Reference line at ΔG_sel = 0
When only a single partition exists, filenames and titles match the previous single-partition behavior for backward compatibility.
Loads BindingFreeEnergyResult from results/ adjacent to comparison.yaml (accepts both binding_free_energy_comparison_*.json and bfe_comparison_*.json naming conventions).
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Return True for ‘binding_free_energy’ when bar charts are enabled.
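The between-replicate SEM used for the error bars is the sample standard deviation of the per-replicate values divided by the square root of the replicate count. A minimal sketch follows; `mean_and_sem` is a hypothetical helper, and the delta-method fallback mentioned above is not covered here.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_sem(replicate_values: Sequence[float]) -> Tuple[float, float]:
    n = len(replicate_values)
    if n == 0:
        raise ValueError("need at least one replicate value")
    mean = statistics.fmean(replicate_values)
    if n < 2:
        # A single replicate has no between-replicate spread.
        return mean, float("nan")
    # Sample standard deviation (n - 1 denominator) over sqrt(n).
    return mean, statistics.stdev(replicate_values) / math.sqrt(n)


mean, sem = mean_and_sem([-0.5, -0.3, -0.4])
```

With a single replicate the SEM is undefined, which is one situation where a fallback estimate would be needed.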
Exposure dynamics plotters for comparison workflow.
Provides two registered plotters:
ExposureChaperoneFractionPlotter ("exposure_chaperone_fraction"): Bar chart comparing mean chaperone fraction across conditions.
ExposureEnrichmentHeatmapPlotter ("exposure_enrichment_heatmap"): Heatmap of residue-based chaperone enrichment per (polymer_type, aa_group).
Both plotters follow the established BasePlotter pattern: load data from
data[label]["analysis_dir"] paths rather than expecting data to be
passed via kwargs.
- class polyzymd.compare.plotters.exposure.ExposureChaperoneFractionPlotter(settings)[source]
Bases: BasePlotter
Bar chart comparing chaperone fraction across conditions.
Shows mean chaperone fraction (with SEM error bars) per condition, ordered by the ranking from ExposureDynamicsComparator.compare().
Compatible with analysis_type="exposure".
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Check if this plotter can handle the given analysis type.
- Parameters:
comparison_config (ComparisonConfig) – Full comparison configuration
analysis_type (str) – Analysis type to check (e.g., “rmsf”, “triad”, “distances”)
- Returns:
True if this plotter can generate plots for the analysis type
- Return type:
bool
- class polyzymd.compare.plotters.exposure.ExposureEnrichmentHeatmapPlotter(settings)[source]
Bases: BasePlotter
Heatmap of chaperone enrichment per (polymer_type, aa_group).
One subplot per condition; rows = polymer types, columns = AA groups. Color encodes residue-based enrichment (warm = enriched, cool = depleted).
Compatible with analysis_type="exposure".
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Check if this plotter can handle the given analysis type.
- Parameters:
comparison_config (ComparisonConfig) – Full comparison configuration
analysis_type (str) – Analysis type to check (e.g., “rmsf”, “triad”, “distances”)
- Returns:
True if this plotter can generate plots for the analysis type
- Return type:
bool
Polymer affinity score plotters for comparison workflow.
This module provides registered plotters for the polymer affinity score:
AffinityStackedBarPlotter: Total affinity score per condition, with stacked segments showing each polymer type’s contribution.
AffinityGroupBarPlotter: Per-group breakdown comparing conditions, one figure per polymer type.
Both plotters load a PolymerAffinityScoreResult JSON saved by the
polyzymd compare polymer-affinity command (in results/ adjacent to
comparison.yaml).
Physics interpretation
- Score < 0 → net favorable polymer-protein affinity
- Score > 0 → net unfavorable (avoidance dominates)
- Score = 0 → contacts match the surface-availability reference
Units are always kT (dimensionless, in units of k_BT).
Sign convention
More negative = stronger polymer-protein interaction. Unlike the BFE heatmaps, these plots do not use a diverging colormap, because the primary display is a bar chart in which the sign is visually obvious.
- class polyzymd.compare.plotters.polymer_affinity.AffinityStackedBarPlotter(settings)[source]
Bases: BasePlotter
Stacked bar chart of total affinity score per condition.
Each bar represents one condition’s total affinity score, with segments colored by polymer type contribution. This gives a quick overview of which polymer types contribute most to the total interaction strength.
Loads PolymerAffinityScoreResult from results/ adjacent to comparison.yaml.
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Check if this plotter can handle the given analysis type.
- Parameters:
comparison_config (ComparisonConfig) – Full comparison configuration
analysis_type (str) – Analysis type to check (e.g., “rmsf”, “triad”, “distances”)
- Returns:
True if this plotter can generate plots for the analysis type
- Return type:
bool
- class polyzymd.compare.plotters.polymer_affinity.AffinityGroupBarPlotter(settings)[source]
Bases: BasePlotter
Grouped bar chart of per-group affinity score contributions.
Creates one figure per polymer type with:
- Groups on x-axis: protein groups (AA classes)
- Bars within each group: one per condition
- Error bars: SEM on per-group affinity score
- Reference line at score = 0
Loads PolymerAffinityScoreResult from results/.
- classmethod plot_type()[source]
Return the unique identifier for this plotter.
- Returns:
Plot type identifier (e.g., “triad_kde_panel”, “rmsf_comparison”)
- Return type:
str
- can_plot(comparison_config, analysis_type)[source]
Check if this plotter can handle the given analysis type.
- Parameters:
comparison_config (ComparisonConfig) – Full comparison configuration
analysis_type (str) – Analysis type to check (e.g., “rmsf”, “triad”, “distances”)
- Returns:
True if this plotter can generate plots for the analysis type
- Return type:
bool
CLI
CLI commands for the compare module.
This module provides the polyzymd compare command group with subcommands for initializing comparison projects and running comparisons.