Aggregation and comparison

The default aggregation policy reads one finite scalar per metric from payload["metrics"] or payload["replicate_metrics"] in each ReplicateArtifact. From those scalars it computes the condition-level mean, standard deviation (std), standard error of the mean (sem), and replicate count (n), and keeps the per-replicate values, all without loading trajectories.

The MDA comparison helper consumes ConditionArtifact objects. It validates metric keys, replicate identity, settings fingerprints, and aggregate-statistic consistency, then delegates scalar statistics to the shared comparison engine.

Condition aggregation for MDAnalysis replicate artifacts.

exception polyzymd.analyses.mda.aggregation.MDAAggregationError

Bases: MDAnalysisExtensionError

Error raised when MDAnalysis replicate artifacts cannot be aggregated.

class polyzymd.analyses.mda.aggregation.AggregatedMetric(*, name, values=<factory>, mean, sem, std, n)

Bases: BaseModel

Summary statistics for one metric across biological replicates.

name: str
values: list[float]
mean: float
sem: float
std: float
n: int
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
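The fields above can be illustrated with a minimal sketch of how one metric might be summarized across replicates. The helper name summarize is hypothetical, and the choice of the sample standard deviation (n - 1 denominator), with 0.0 for a single replicate, is an assumption; the source does not state which estimator the library uses.

```python
import math
import statistics


def summarize(values: list[float]) -> dict:
    """Summarize one metric across replicates (hypothetical helper)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one replicate value is required")
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 denominator),
    # reported as 0.0 when only one replicate is available.
    std = statistics.stdev(values) if n > 1 else 0.0
    # Standard error of the mean derived from the standard deviation.
    sem = std / math.sqrt(n)
    return {"values": list(values), "mean": mean, "std": std, "sem": sem, "n": n}


stats = summarize([1.0, 2.0, 3.0])
```

The returned keys mirror the AggregatedMetric field names so the correspondence is easy to follow; the real class is a pydantic model, not a dict.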

class polyzymd.analyses.mda.aggregation.ReplicateMetricPolicy(*args, **kwargs)

Bases: Protocol

Protocol for reducing one replicate artifact to scalar metrics.

extract_metrics(artifact)

Extract one scalar value per metric from a replicate artifact.

Parameters:

artifact (ReplicateArtifact) – Artifact produced for one replicate.

Returns:

Metric names mapped to one replicate-level scalar each.

Return type:

Mapping[str, float]
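Because the policy is a Protocol, any object with a matching extract_metrics method should satisfy it structurally. The sketch below shows that shape with stand-ins only: FakeReplicateArtifact, MetricPolicy, MeanOfSeriesPolicy, and the payload key "rmsd_series" are all hypothetical, and the assumption that the artifact exposes a payload attribute is inferred from the payload["metrics"] wording above.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Protocol


@dataclass
class FakeReplicateArtifact:
    """Hypothetical stand-in; only a payload attribute is modeled."""

    payload: dict = field(default_factory=dict)


class MetricPolicy(Protocol):
    """Local mirror of the documented extract_metrics(artifact) shape."""

    def extract_metrics(self, artifact: Any) -> Mapping[str, float]: ...


class MeanOfSeriesPolicy:
    """Reduce a hypothetical payload["rmsd_series"] list to its mean."""

    def extract_metrics(self, artifact: Any) -> Mapping[str, float]:
        series = artifact.payload["rmsd_series"]
        # The reduction (a plain mean) is an analysis-specific choice.
        return {"mean_rmsd": sum(series) / len(series)}


# No inheritance is needed: structural typing is enough for a Protocol.
policy: MetricPolicy = MeanOfSeriesPolicy()
metrics = policy.extract_metrics(
    FakeReplicateArtifact({"rmsd_series": [1.0, 2.0, 3.0]})
)
```

A policy written this way would presumably be passed as the policy argument of the aggregation functions, in place of the default.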

__init__(*args, **kwargs)

class polyzymd.analyses.mda.aggregation.ExplicitReplicateMetricPolicy

Bases: object

Extract explicitly declared replicate-level scalar metrics.

The default policy deliberately reads only payload["metrics"] or payload["replicate_metrics"]. It does not reduce arrays, events, job tables, or frame-level values because those reductions are analysis-specific scientific choices.

extract_metrics(artifact)

Return validated scalar metrics from a replicate artifact.

Parameters:

artifact (ReplicateArtifact) – Artifact produced for one replicate.

Returns:

Validated replicate-level scalar metrics.

Return type:

Mapping[str, float]
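The validation described for the default policy can be sketched as follows. This is not the library's implementation: the function name extract_explicit_metrics is hypothetical, it operates on a plain payload mapping, and two details are assumptions, namely that "metrics" takes precedence over "replicate_metrics" and that booleans are rejected as non-scalars.

```python
import math
from numbers import Real
from typing import Any, Mapping


def extract_explicit_metrics(payload: Mapping[str, Any]) -> dict[str, float]:
    """Return finite scalar metrics declared in the payload (hypothetical)."""
    # Assumption: "metrics" wins when both keys are present.
    for key in ("metrics", "replicate_metrics"):
        if key in payload:
            raw = payload[key]
            break
    else:
        raise ValueError("payload declares no replicate-level metrics")

    metrics: dict[str, float] = {}
    for name, value in raw.items():
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"metric {name!r} is not a scalar")
        # NaN and infinities are not usable replicate-level values.
        if not math.isfinite(value):
            raise ValueError(f"metric {name!r} is not finite")
        metrics[name] = float(value)
    return metrics


clean = extract_explicit_metrics({"metrics": {"rg": 1.5, "contacts": 12}})
```

Arrays and frame-level values fail the scalar check rather than being reduced silently, which matches the stated intent of leaving reductions to the analysis author.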

class polyzymd.analyses.mda.aggregation.MDAAggregationContext(analysis_name, condition_label, expected_replicates, settings_fingerprint=None, min_replicates=1, allow_partial=False, require_compatible_frame_selection=True, expected_frame_selection=None, validate_sidecars=True, artifact_stores=<factory>, source_replicates=(), skipped_replicates=())

Bases: object

Identity and provenance expected during condition aggregation.

analysis_name: str
condition_label: str
expected_replicates: tuple[int, ...]
settings_fingerprint: str | None = None
min_replicates: int = 1
allow_partial: bool = False
require_compatible_frame_selection: bool = True
expected_frame_selection: Mapping[str, Any] | None = None
validate_sidecars: bool = True
artifact_stores: Mapping[int, ArtifactStore]
source_replicates: Sequence[Mapping[str, Any]] = ()
skipped_replicates: Sequence[Mapping[str, Any]] = ()

__post_init__()

Normalize replicate identity and validate minimum count.

__init__(analysis_name, condition_label, expected_replicates, settings_fingerprint=None, min_replicates=1, allow_partial=False, require_compatible_frame_selection=True, expected_frame_selection=None, validate_sidecars=True, artifact_stores=<factory>, source_replicates=(), skipped_replicates=())

polyzymd.analyses.mda.aggregation.aggregate_replicate_artifacts(artifacts, ctx, policy=None)

Aggregate replicate artifacts into a condition artifact.

Parameters:
  • artifacts (sequence of ReplicateArtifact) – Replicate artifacts to aggregate.

  • ctx (MDAAggregationContext) – Expected condition identity and provenance.

  • policy (ReplicateMetricPolicy or None, optional) – Metric extraction policy, by default ExplicitReplicateMetricPolicy.

Returns:

Aggregated condition artifact containing replicate-level statistics.

Return type:

ConditionArtifact

polyzymd.analyses.mda.aggregation.aggregate_replicate_artifacts_from_disk(analysis_dir, ctx, policy=None, *, artifact_path='result.json')

Load replicate artifacts from disk and aggregate them.

Parameters:
  • analysis_dir (Path) – Condition analysis directory containing run_N subdirectories.

  • ctx (MDAAggregationContext) – Expected condition identity and aggregation policy controls.

  • policy (ReplicateMetricPolicy or None, optional) – Optional custom metric extraction policy.

  • artifact_path (str or Path, optional) – Store-relative replicate artifact filename, by default "result.json".

Returns:

Aggregated condition artifact.

Return type:

ConditionArtifact
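A possible call pattern, using only the signatures documented above, is sketched below. The wrapper names expected_artifact_paths and aggregate_condition, and the analysis name "rmsf", are hypothetical. The path helper additionally assumes that each replicate store is rooted at run_N with N equal to the replicate number; the source only says the directory contains run_N subdirectories.

```python
from pathlib import Path
from typing import Sequence


def expected_artifact_paths(
    analysis_dir: Path,
    replicates: Sequence[int],
    artifact_path: str = "result.json",
) -> list[Path]:
    """List where replicate artifacts would be looked up (assumed layout)."""
    # Assumption: one store per replicate at analysis_dir / "run_<N>".
    return [analysis_dir / f"run_{n}" / artifact_path for n in replicates]


def aggregate_condition(
    analysis_dir: Path, condition_label: str, replicates: Sequence[int]
):
    """Build a context and aggregate one condition from disk."""
    # Lazy import keeps this sketch importable without polyzymd installed.
    from polyzymd.analyses.mda.aggregation import (
        MDAAggregationContext,
        aggregate_replicate_artifacts_from_disk,
    )

    ctx = MDAAggregationContext(
        analysis_name="rmsf",  # hypothetical analysis name
        condition_label=condition_label,
        expected_replicates=tuple(replicates),
        min_replicates=2,  # require at least two replicates
    )
    # policy is omitted, so the default explicit-metric policy applies.
    return aggregate_replicate_artifacts_from_disk(analysis_dir, ctx)


paths = expected_artifact_paths(Path("analysis"), [1, 2])
```

Listing the expected paths before aggregating can help diagnose a missing replicate; how the library itself reports one is not covered here.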

Comparison engine for MDAnalysis condition artifacts.

exception polyzymd.analyses.mda.comparison.MDAComparisonError

Bases: MDAnalysisExtensionError

Error raised when condition artifacts cannot be compared.

class polyzymd.analyses.mda.comparison.MDAComparisonContext(analysis_name, project_name, expected_condition_labels=None, expected_replicates_by_condition=None, control_label=None, effective_control=None, equilibration='0ns', settings_fingerprint=None, min_replicates=1, fdr_alpha=0.05, ttest_method='student', posthoc_method='ttest_bh')

Bases: object

Identity and statistical controls for condition-artifact comparison.

analysis_name: str
project_name: str
expected_condition_labels: Sequence[str] | None = None
expected_replicates_by_condition: Mapping[str, Sequence[int]] | None = None
control_label: str | None = None
effective_control: str | None = None
equilibration: str = '0ns'
settings_fingerprint: str | None = None
min_replicates: int = 1
fdr_alpha: float = 0.05
ttest_method: str = 'student'
posthoc_method: str = 'ttest_bh'

__post_init__()

Normalize expected identity inputs and reject ambiguous values.
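The defaults fdr_alpha=0.05 and posthoc_method='ttest_bh' suggest pairwise t-tests followed by a Benjamini-Hochberg false-discovery-rate correction; that reading of the name is an assumption, not something the source states. Under that assumption, the correction step can be sketched in plain Python (the function name benjamini_hochberg is hypothetical and is not part of polyzymd):

```python
def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return, per input p-value, whether it is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


flags = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

With four tests, only a p-value at or below 0.0125 can pass at rank 1, so an unadjusted p-value of 0.04 is not necessarily significant after correction.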

__init__(analysis_name, project_name, expected_condition_labels=None, expected_replicates_by_condition=None, control_label=None, effective_control=None, equilibration='0ns', settings_fingerprint=None, min_replicates=1, fdr_alpha=0.05, ttest_method='student', posthoc_method='ttest_bh')

polyzymd.analyses.mda.comparison.compare_condition_artifacts(artifacts, ctx)

Compare aggregate condition artifacts with replicate-level statistics.

Parameters:
  • artifacts (sequence of ConditionArtifact) – Condition artifacts produced by MDAnalysis extension-layer aggregation.

  • ctx (MDAComparisonContext) – Comparison identity, expected condition labels, and statistical controls.

Returns:

Stable comparison artifact containing scalar statistics and provenance.

Return type:

ComparisonArtifact
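A comparison call might then look like the sketch below, which uses only the documented MDAComparisonContext fields and the compare_condition_artifacts signature. The wrapper name compare_conditions and the values "rmsf" and "demo_project" are hypothetical placeholders.

```python
from typing import Any, Sequence


def compare_conditions(condition_artifacts: Sequence[Any], control_label: str):
    """Compare already-aggregated condition artifacts against a control."""
    # Lazy import keeps this sketch importable without polyzymd installed.
    from polyzymd.analyses.mda.comparison import (
        MDAComparisonContext,
        compare_condition_artifacts,
    )

    ctx = MDAComparisonContext(
        analysis_name="rmsf",  # hypothetical analysis name
        project_name="demo_project",  # hypothetical project name
        control_label=control_label,
        min_replicates=2,  # require at least two replicates per condition
        fdr_alpha=0.05,  # documented default, written out for clarity
    )
    # Remaining fields keep their documented defaults.
    return compare_condition_artifacts(condition_artifacts, ctx)
```

The condition artifacts passed in would typically be the return values of the aggregation functions documented earlier, one per condition.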