Analysis Base Classes

API reference for polyzymd.analyses.base, including the plugin base class, context objects, and shared comparison/result models. Contributor plugins should import Analysis, MetricValue, and lifecycle contexts from this module rather than private implementation modules.

Public facade for the PolyzyMD analysis plugin system.

Every analysis in PolyzyMD inherits from Analysis. The framework discovers subclasses automatically and owns replicate iteration, caching, dependency ordering, comparison, plotting, and CLI wiring.

This module remains the stable public import surface. Implementation details live in private framework modules so plugins and tests can keep importing all public symbols from polyzymd.analyses.base.
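The dependency ordering that the framework owns can be pictured with the standard-library graphlib module. This is a minimal sketch, not the framework's scheduler: the plugin names and the declared mapping are hypothetical, and only the idea of reading each plugin's dependencies tuple comes from this reference.

```python
from graphlib import TopologicalSorter

# Hypothetical plugin names mapped to the names listed in their
# `dependencies` class attribute.
declared = {
    "rmsf": (),
    "contacts": (),
    "binding_preference": ("contacts",),
    "summary": ("rmsf", "binding_preference"),
}


def execution_order(deps):
    """Return plugin names so that every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(declared)
print(order)
```

TopologicalSorter raises graphlib.CycleError when two plugins depend on each other, which is one way a circular declaration could be surfaced early.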

class polyzymd.analyses.base.ANOVAResult(*, metric='default', f_statistic, p_value, significant, testable=True, note=None)

Bases: BaseModel

One-way ANOVA result for one metric.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

metric: str
f_statistic: float
p_value: float
significant: bool
testable: bool
note: str | None
class polyzymd.analyses.base.AggregateContext(condition, replicates, output_dir, equilibration, settings, result_path=None, recompute=False)

Bases: object

Context passed to condition-level aggregation.

__init__(condition, replicates, output_dir, equilibration, settings, result_path=None, recompute=False)
recompute: bool = False
result_path: Path | None = None
condition: Condition
replicates: tuple[int, ...]
output_dir: Path
equilibration: str
settings: BaseModel
exception polyzymd.analyses.base.AggregateValidationError

Bases: ValueError

Raised when an aggregated result is stale for the active context.

class polyzymd.analyses.base.Analysis

Bases: ABC

Base class for all PolyzyMD analyses.

Subclasses represent a complete analysis lifecycle: MDAnalysis-backed per-replicate computation, aggregation across replicates, cross-condition comparison, plotting, and CLI formatting.

name: ClassVar[str]
Settings: ClassVar[type]
PlotSettingsModel: ClassVar[type[BasePlotSettings] | None] = None
AggregatedResultClass: ClassVar[type | None] = None
ReplicateResultClass: ClassVar[type | None] = None
execution_cost_hint: ClassVar[str] = 'medium'
dependencies: ClassVar[tuple[str, ...]] = ()
min_replicates: ClassVar[int] = 2
has_compute_stage: ClassVar[bool] = True
has_aggregate_stage: ClassVar[bool] = True
slurm_resource_hint: ClassVar[SlurmResourceHint | None] = None
settings_path_fields: ClassVar[tuple[str, ...]] = ()
aggregate(ctx, results)

Aggregate results across replicates for one condition.

Parameters:
  • ctx (AggregateContext) – Framework-provided aggregation context.

  • results (Sequence[Any]) – Per-replicate results.

Returns:

Aggregated result, or None when aggregation is disabled.

Return type:

Any

build_mda_metric_policy(ctx)

Build the metric policy used by default MDA artifact aggregation.

Parameters:

ctx (AggregateContext) – Framework-provided aggregation context.

Returns:

Custom replicate metric policy, or None to use the explicit scalar metric policy.

Return type:

ReplicateMetricPolicy or None

build_mda_jobs(ctx)

Build MDAnalysis-compatible jobs for one replicate.

Parameters:

ctx (MDAReplicateJobContext) – Framework-provided MDAnalysis job context with a loaded universe, frame selection, universe policy, and artifact store.

Returns:

Jobs to execute for the replicate. None is valid only for non-compute plugins and is rejected for compute-stage plugins.

Return type:

sequence of MDAAnalysisJob or None

build_mda_collector(ctx)

Build the artifact collector for completed MDAnalysis jobs.

Parameters:

ctx (MDACollectorContext) – Framework-provided collector context for one replicate.

Returns:

Collector that maps completed job results to a replicate artifact.

Return type:

MDAArtifactCollector

get_trajectory_window(ctx, replicate, loader, universe)

Resolve the frame window for a replicate analysis.

Parameters:
  • ctx (ReplicateContext) – Framework-provided replicate context.

  • replicate (int) – Replicate number.

  • loader (Any) – Trajectory loader used for the replicate.

  • universe (Any) – Loaded trajectory universe.

Returns:

Resolved trajectory window object.

Return type:

Any

filter_conditions(conditions, settings=None)

Filter conditions before comparison.

Parameters:
  • conditions (list[Condition]) – All conditions from the comparison config.

  • settings (BaseModel or None) – Resolved plugin settings.

Returns:

Conditions to include in analysis.

Return type:

list[Condition]
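A filter_conditions override can be as small as a list comprehension. The sketch below uses a stand-in StubCondition dataclass instead of the real Condition, and exclude_labels is a hypothetical settings field invented for illustration; neither is part of the documented API.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class StubCondition:
    """Stand-in for Condition; only the fields used here are modeled."""

    label: str
    replicates: tuple


def filter_conditions(conditions, settings=None):
    """Drop conditions whose label appears in settings.exclude_labels.

    `exclude_labels` is a hypothetical settings field. With settings=None
    every condition is kept.
    """
    excluded = set(getattr(settings, "exclude_labels", ()))
    return [c for c in conditions if c.label not in excluded]


conditions = [
    StubCondition("no_polymer", (1, 2, 3)),
    StubCondition("pilot_run", (1,)),
]
settings = SimpleNamespace(exclude_labels=["pilot_run"])
kept_labels = [c.label for c in filter_conditions(conditions, settings)]
print(kept_labels)
```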

compare(ctx)

Compare results across conditions.

Parameters:

ctx (ComparisonContext) – Framework-provided comparison context.

Returns:

Comparison result, or None if comparison is not supported.

Return type:

BaseModel or None

extract_metrics(summary)

Extract scalar metrics from an aggregated result for comparison.

Parameters:

summary (Any) – Aggregated result.

Returns:

Mapping from metric name to metric value.

Return type:

dict[str, MetricValue]
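The returned mapping pairs each metric name with a mean, a standard error, and the raw replicate values. The following sketch assumes the aggregated summary is a plain dict of replicate values, which is a simplification, and uses a stand-in StubMetricValue carrying only the required MetricValue fields.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass
class StubMetricValue:
    """Stand-in with the required MetricValue fields; defaults are omitted."""

    name: str
    mean: float
    sem: float
    replicate_values: list


def extract_metrics(summary):
    """Map each metric name to its mean and SEM across replicates.

    `summary` is assumed to be a dict of metric name -> replicate values.
    """
    metrics = {}
    for name, values in summary.items():
        # SEM is the sample standard deviation divided by sqrt(n);
        # a single replicate has no spread to estimate.
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        metrics[name] = StubMetricValue(name, mean(values), sem, list(values))
    return metrics


metrics = extract_metrics({"mean_rmsf": [1.0, 2.0, 3.0]})
print(metrics["mean_rmsf"])
```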

aggregate_settings_fingerprint(settings)

Return the settings fingerprint expected on aggregate results.

Parameters:

settings (BaseModel or None) – Active analysis settings.

Returns:

Fingerprint used to validate aggregate caches, or None to skip settings identity checks.

Return type:

str or None
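One common way to build such a fingerprint is to hash a canonical serialization of the settings. The sketch below does that with hashlib and json on a plain dict; it is an assumption about how a fingerprint could be derived, not the algorithm PolyzyMD uses, and the settings keys are hypothetical.

```python
import hashlib
import json


def settings_fingerprint(settings):
    """Return a stable SHA-256 hex digest of a settings dict, or None."""
    if settings is None:
        return None
    # Sorted keys and fixed separators make the digest independent of
    # key insertion order.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = settings_fingerprint({"selection": "protein", "stride": 10})
b = settings_fingerprint({"stride": 10, "selection": "protein"})
c = settings_fingerprint({"selection": "protein", "stride": 5})
print(a == b, a == c)
```

Equal settings give equal digests regardless of key order, while any changed value gives a different digest, so a cached aggregate written under old settings can be detected.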

validate_aggregated_result(result, *, condition, settings, equilibration, source=None, expected_replicates=None, allow_replicate_subset=False)

Validate an aggregate result against the active framework context.

Parameters:
  • result (Any) – Loaded or newly computed aggregate result.

  • condition (Condition or None) – Condition providing configuration context.

  • settings (BaseModel or None) – Active analysis settings.

  • equilibration (str) – Requested equilibration window.

  • source (str or Path or None, optional) – Cache path or description used in diagnostics.

  • expected_replicates (sequence of int or None, optional) – Replicate IDs expected in the aggregate.

  • allow_replicate_subset (bool, optional) – Whether a successful subset of requested replicates is acceptable.

Returns:

Validated aggregate result, potentially coerced through the plugin’s AggregatedResultClass.

Return type:

Any
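The kind of staleness check described here can be sketched with plain dicts. Everything below is illustrative: StaleAggregateError stands in for AggregateValidationError, the cached-result keys are hypothetical, and the real method also takes condition and source arguments that are omitted here.

```python
class StaleAggregateError(ValueError):
    """Stand-in for AggregateValidationError."""


def validate_aggregate(result, *, equilibration, fingerprint=None,
                       expected_replicates=None, allow_replicate_subset=False):
    """Return `result` if it matches the active context, else raise."""
    if result.get("equilibration") != equilibration:
        raise StaleAggregateError("equilibration window changed")
    # A fingerprint of None skips the settings identity check.
    if fingerprint is not None and result.get("settings_fingerprint") != fingerprint:
        raise StaleAggregateError("settings changed")
    if expected_replicates is not None:
        found = set(result.get("replicates", ()))
        expected = set(expected_replicates)
        if allow_replicate_subset:
            matches = bool(found) and found <= expected
        else:
            matches = found == expected
        if not matches:
            raise StaleAggregateError("replicate set changed")
    return result


cached = {"equilibration": "10ns", "settings_fingerprint": "abc", "replicates": [1, 2]}

accepted = validate_aggregate(
    cached, equilibration="10ns", fingerprint="abc",
    expected_replicates=(1, 2, 3), allow_replicate_subset=True,
)

try:
    validate_aggregate(cached, equilibration="10ns", expected_replicates=(1, 2, 3))
    rejected = False
except StaleAggregateError:
    rejected = True

print(accepted is cached, rejected)
```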

plot(ctx)

Generate comparison figures.

Parameters:

ctx (PlotContext) – Framework-provided plot context.

Returns:

Paths to generated figure files.

Return type:

list[Path]

format(result, output_format='text')

Format a comparison result for CLI display.

Parameters:
  • result (Any) – Comparison result to format.

  • output_format (str, optional) – Output format.

Returns:

Formatted string ready for CLI display.

Return type:

str
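A formatter typically branches on output_format. This reference only documents the default 'text', so the 'json' branch and the dict-shaped result below are assumptions made for illustration.

```python
import json


def format_result(result, output_format="text"):
    """Render a comparison-like dict as text or JSON.

    Only 'text' is documented as a default; 'json' is a hypothetical
    second format.
    """
    if output_format == "json":
        return json.dumps(result, indent=2, sort_keys=True)
    if output_format == "text":
        lines = [f"Comparison: {result['name']}"]
        for rank, label in enumerate(result["ranking"], start=1):
            lines.append(f"  {rank}. {label}")
        return "\n".join(lines)
    raise ValueError(f"Unsupported output format: {output_format!r}")


demo = {"name": "demo", "ranking": ["polymer_a", "no_polymer"]}
text = format_result(demo)
print(text)
```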

static replicate_result_path(output_dir)

Return the canonical per-replicate cache path.

static aggregate_result_path(output_dir)

Return the canonical aggregated cache path.

comparison_result_path(results_dir)

Return the canonical comparison cache path.

figures_output_dir(figures_root)

Return the analysis-specific figure directory.

save_result(result, path)

Save a result object to disk using a common contract.

resolve_output_dir(analysis_root, condition_label)

Build the analysis output directory for a condition.

classmethod __init_subclass__(**kwargs)

Validate that subclasses satisfy the analysis contract.
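Python's __init_subclass__ hook runs when a subclass is defined, which makes it a natural place for contract checks and automatic registration. The sketch below shows the general pattern on a stand-in StubAnalysis class; the specific checks and the registry are assumptions, not the rules the real Analysis enforces.

```python
from abc import ABC


class ContractError(TypeError):
    """Stand-in for PluginContractError."""


class StubAnalysis(ABC):
    """Stand-in base class; not the real Analysis implementation."""

    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Require each subclass to declare its own non-empty name.
        name = cls.__dict__.get("name")
        if not isinstance(name, str) or not name:
            raise ContractError(f"{cls.__name__} must define a non-empty 'name'")
        if name in StubAnalysis.registry:
            raise ContractError(f"Duplicate analysis name: {name!r}")
        StubAnalysis.registry[name] = cls


class Rmsf(StubAnalysis):
    name = "rmsf"


try:
    class Broken(StubAnalysis):
        pass
except ContractError as exc:
    message = str(exc)

print(sorted(StubAnalysis.registry), message)
```

Because the check fires at class-definition time, a malformed plugin fails on import rather than partway through a run.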

__repr__()

Return a concise representation for debugging.

class polyzymd.analyses.base.BaseComparisonResult(*, metric, name, control_label=None, conditions, pairwise_comparisons, anova=None, ranking, equilibration_time, created_at, polyzymd_version)

Bases: BaseModel, ABC, Generic[TConditionSummary, TPairwiseResult]

Abstract base class for custom plugin comparison results.

comparison_type: ClassVar[str] = 'base'
get_comparison(label)

Get a pairwise comparison by condition pair.

Parameters:

label (tuple[str, str]) – Explicit (condition_a, condition_b) pair.

Returns:

The matching comparison, or None if not found.

Return type:

TPairwiseResult or None

get_condition(label)

Get a condition by label.

Parameters:

label (str) – Condition label.

Returns:

The matching condition summary.

Return type:

TConditionSummary

Raises:

KeyError – If the condition is not found.

classmethod load(path)

Load the result from a JSON file.

Parameters:

path (Path or str) – Path to a JSON file.

Returns:

Loaded result.

Return type:

Self

model_config = {'ser_json_inf_nan': 'strings'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

save(path)

Save the result to a JSON file.

Parameters:

path (Path or str) – Output path.

Returns:

Path to the saved file.

Return type:

Path
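The ser_json_inf_nan setting of 'strings' matters because strict JSON has no tokens for infinity or NaN, and statistics such as t_statistic can be non-finite. The standard-library sketch below mimics the idea of writing those values as strings; it does not call Pydantic, the helper names are hypothetical, and the exact strings Pydantic emits should be checked against its documentation.

```python
import json
import math
import tempfile
from pathlib import Path


def encode_non_finite(value):
    """Recursively replace non-finite floats with descriptive strings."""
    if isinstance(value, float) and not math.isfinite(value):
        if math.isnan(value):
            return "NaN"
        return "Infinity" if value > 0 else "-Infinity"
    if isinstance(value, dict):
        return {key: encode_non_finite(item) for key, item in value.items()}
    if isinstance(value, list):
        return [encode_non_finite(item) for item in value]
    return value


def save(result, path):
    """Write strict JSON (no bare NaN or Infinity tokens) and return the path."""
    path = Path(path)
    path.write_text(json.dumps(encode_non_finite(result), allow_nan=False, indent=2))
    return path


def load(path):
    """Read the JSON file back into a dict."""
    return json.loads(Path(path).read_text())


with tempfile.TemporaryDirectory() as tmp:
    saved = save(
        {"metric": "rmsf", "t_statistic": float("inf"), "p_value": float("nan")},
        Path(tmp) / "comparison.json",
    )
    loaded = load(saved)

print(loaded["t_statistic"], loaded["p_value"])
```

In this sketch the loaded values stay as strings; passing them through float() recovers the non-finite numbers.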

metric: str
name: str
control_label: str | None
conditions: list[TConditionSummary]
pairwise_comparisons: list[TPairwiseResult]
anova: ANOVAResult | list[ANOVAResult] | None
ranking: list[str]
equilibration_time: str
created_at: datetime
polyzymd_version: str
class polyzymd.analyses.base.BaseConditionSummary(*, label, config_path, n_replicates, replicate_values)

Bases: BaseModel, ABC

Abstract base class for condition-level custom comparison summaries.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

abstract property primary_metric_sem: float

Return the SEM of the primary metric.

abstract property primary_metric_value: float

Return the primary metric value for ranking and comparison.

label: str
config_path: str
n_replicates: int
replicate_values: list[float]
class polyzymd.analyses.base.BasePlotSettings

Bases: BaseModel

Base class for per-analysis plot settings.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class polyzymd.analyses.base.ComparisonContext(name, conditions, excluded_conditions, control_label, analysis_dirs, results_dir, equilibration, settings, fdr_alpha=0.05, ttest_method='student', posthoc_method='ttest_bh', result_path=None, failed_conditions=<factory>, aggregated_results=<factory>, recompute=False)

Bases: object

Context passed to cross-condition comparison.

__init__(name, conditions, excluded_conditions, control_label, analysis_dirs, results_dir, equilibration, settings, fdr_alpha=0.05, ttest_method='student', posthoc_method='ttest_bh', result_path=None, failed_conditions=<factory>, aggregated_results=<factory>, recompute=False)
property effective_control: str | None

Return the control label when it is still included.

fdr_alpha: float = 0.05
posthoc_method: str = 'ttest_bh'
recompute: bool = False
result_path: Path | None = None
ttest_method: str = 'student'
name: str
conditions: list[Condition]
excluded_conditions: list[Condition]
control_label: str | None
analysis_dirs: dict[str, Path]
results_dir: Path
equilibration: str
settings: BaseModel
failed_conditions: list[Condition]
aggregated_results: dict[str, Any]
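The fdr_alpha field and the 'ttest_bh' default suggest pairwise t-tests followed by Benjamini-Hochberg false discovery rate control, though that reading of the name is an assumption. A minimal pure-Python version of the Benjamini-Hochberg adjustment, with made-up p-values, looks like this:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value never
    # exceeds the adjusted value of a larger raw p-value.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # illustrative p-values, not real results
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]
print(adjusted, significant)
```

A comparison is then flagged as significant when its adjusted p-value is at or below fdr_alpha, here the documented default of 0.05.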
class polyzymd.analyses.base.ComparisonResult(*, analysis_type, name, control_label=None, fdr_alpha=None, ttest_method='student', posthoc_method='ttest_bh', conditions=<factory>, pairwise_comparisons=<factory>, anova=None, ranking=<factory>, rankings_by_metric=None, equilibration_time='0ns', created_at='', polyzymd_version='')

Bases: BaseModel

Serializable result of a default scalar cross-condition comparison.

classmethod load(path)

Load the result from a JSON file.

Parameters:

path (Path or str) – Path to a JSON file.

Returns:

Loaded result.

Return type:

Self

model_config = {'ser_json_inf_nan': 'strings'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

save(path)

Save the result to a JSON file.

Parameters:

path (Path or str) – Output path.

Returns:

Path to the saved file.

Return type:

Path

analysis_type: str
name: str
control_label: str | None
fdr_alpha: float | None
ttest_method: str
posthoc_method: str
conditions: list[ConditionSummary]
pairwise_comparisons: list[PairwiseResult]
anova: list[ANOVAResult] | None
ranking: list[str]
rankings_by_metric: dict[str, list[str]] | None
equilibration_time: str
created_at: str
polyzymd_version: str
class polyzymd.analyses.base.Condition(label, config_path, replicates, sim_config)

Bases: object

A single simulation condition within a comparison.

__init__(label, config_path, replicates, sim_config)
classmethod from_condition_config(cond)

Create a condition from a comparison configuration entry.

Parameters:

cond (ConditionConfig) – Comparison condition entry.

Returns:

Condition with the simulation configuration loaded.

Return type:

Condition

label: str
config_path: Path
replicates: tuple[int, ...]
sim_config: SimulationConfig
class polyzymd.analyses.base.ConditionSummary(*, label, n_replicates=0, **extra_data)

Bases: BaseModel

Summary statistics for one condition in a scalar comparison.

model_config = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

label: str
n_replicates: int
class polyzymd.analyses.base.MetricValue(name, mean, sem, replicate_values, higher_is_better=True, direction_labels=('decreased', 'unchanged', 'increased'))

Bases: object

A scalar metric extracted from one aggregated condition result.

__init__(name, mean, sem, replicate_values, higher_is_better=True, direction_labels=('decreased', 'unchanged', 'increased'))
direction_labels: tuple[str, str, str] = ('decreased', 'unchanged', 'increased')
higher_is_better: bool | None = True
name: str
mean: float
sem: float
replicate_values: list[float]
class polyzymd.analyses.base.PairwiseResult(*, condition_a, condition_b, metric='default', t_statistic, p_value, p_value_adjusted, posthoc_method='ttest_bh', cohens_d, effect_size_interpretation, direction, significant, percent_change, testable=True, note=None)

Bases: BaseModel

Statistical comparison between two conditions for one metric.

model_config = {'ser_json_inf_nan': 'strings'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

condition_a: str
condition_b: str
metric: str
t_statistic: float
p_value: float
p_value_adjusted: float | None
posthoc_method: str
cohens_d: float
effect_size_interpretation: str
direction: str
significant: bool
percent_change: float
testable: bool
note: str | None
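The cohens_d, effect_size_interpretation, and percent_change fields can be illustrated with textbook formulas. The sketch below uses the pooled-standard-deviation form of Cohen's d and the conventional 0.2 / 0.5 / 0.8 thresholds; whether PolyzyMD uses exactly these definitions, labels, or sign convention is an assumption, and the replicate values are made up.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for b relative to a, using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(b) - mean(a)) / pooled


def percent_change(a, b):
    """Change of mean(b) relative to mean(a), in percent."""
    return 100.0 * (mean(b) - mean(a)) / mean(a)


def interpret(d):
    """Label an effect size using the conventional Cohen thresholds."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
d = cohens_d(control, treated)
change = percent_change(control, treated)
print(d, interpret(d), change)
```

Both helpers divide by a quantity that can be zero (identical replicates, or a zero control mean); cases like that are presumably what the testable and note fields are meant to record.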
class polyzymd.analyses.base.PlotContext(conditions, analysis_dirs, results_dir, output_dir, settings, plot_settings=<factory>, comparison_path=None, control_label=None, equilibration='0ns', recompute=False)

Bases: object

Context passed to comparison plotting.

__init__(conditions, analysis_dirs, results_dir, output_dir, settings, plot_settings=<factory>, comparison_path=None, control_label=None, equilibration='0ns', recompute=False)
__post_init__()

Ensure plot settings are materialized for plugins.

comparison_path: Path | None = None
control_label: str | None = None
equilibration: str = '0ns'
recompute: bool = False
conditions: list[Condition]
analysis_dirs: dict[str, Path]
results_dir: Path
output_dir: Path
settings: BaseModel
plot_settings: PlotSettings
exception polyzymd.analyses.base.PluginContractError

Bases: AnalysisError

Raised when a plugin violates the Analysis contract.

class polyzymd.analyses.base.ReplicateContext(condition, replicate, sim_config, output_dir, equilibration, recompute, settings, result_path=None, backend_policy=<factory>)

Bases: object

Context passed to per-replicate analysis execution.

__init__(condition, replicate, sim_config, output_dir, equilibration, recompute, settings, result_path=None, backend_policy=<factory>)
result_path: Path | None = None
condition: Condition
replicate: int
sim_config: SimulationConfig
output_dir: Path
equilibration: str
recompute: bool
settings: BaseModel
backend_policy: MDABackendPolicy
class polyzymd.analyses.base.SlurmResourceHint(*, mem=None, time=None, cpus_per_task=None)

Bases: BaseModel

Per-plugin SLURM resource hints for HPC submission.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

mem: str | None
time: str | None
cpus_per_task: int | None
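The three hint fields correspond by name to the sbatch options --mem, --time, and --cpus-per-task. How PolyzyMD actually turns hints into a submission script is not documented here, so the sketch below is only one plausible translation, built on a stand-in StubSlurmHint dataclass with example values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubSlurmHint:
    """Stand-in mirroring the three optional SlurmResourceHint fields."""

    mem: Optional[str] = None
    time: Optional[str] = None
    cpus_per_task: Optional[int] = None


def sbatch_flags(hint):
    """Translate the fields that are set into sbatch command-line options."""
    flags = []
    if hint.mem is not None:
        flags.append(f"--mem={hint.mem}")
    if hint.time is not None:
        flags.append(f"--time={hint.time}")
    if hint.cpus_per_task is not None:
        flags.append(f"--cpus-per-task={hint.cpus_per_task}")
    return flags


flags = sbatch_flags(StubSlurmHint(mem="16G", cpus_per_task=4))
print(flags)
```

Fields left as None produce no flag, so the scheduler's own defaults apply.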