Universe provenance and shared MDAnalysis primitives

UniverseProvider wraps the existing trajectory loader and records topology and trajectory identity for provenance.

AnalysisBaseLike and MDARunKwargs describe the lightweight protocol used by jobs and adapters. The pair-distance helper is public but specialized: it serves distance-family analyses that need the same MDAnalysis-native pair-distance behavior as the built-in plugins. Most contributors only need the job, frame-selection, artifact, and store objects above.

Universe loading and provenance helpers for the MDAnalysis extension layer.

class polyzymd.analyses.mda.universe.FileIdentity(path, format, size_bytes, mtime_ns)

Bases: object

Filesystem identity for an input topology or trajectory file.

path: Path
format: str | None
size_bytes: int
mtime_ns: int
classmethod from_path(path, file_format=None)

Create file identity metadata from a filesystem path.

Parameters:
  • path (Path) – File path to identify.

  • file_format (str or None, optional) – Format reported by the trajectory layout. When omitted, the file suffix is used without the leading dot.

Returns:

Path, format, size, and modification-time metadata.

Return type:

FileIdentity

as_dict()

Serialize the identity to JSON-compatible primitive values.

Returns:

Dictionary representation with the path converted to a string.

Return type:

dict[str, Any]

__init__(path, format, size_bytes, mtime_ns)
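
The field list above can be made concrete with a small stand-in. This is a minimal sketch that assumes FileIdentity behaves like a plain dataclass over `Path.stat()`. The class name `FileIdentitySketch` is hypothetical, and mapping an empty suffix to None is an assumption of the sketch, not something the reference states.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class FileIdentitySketch:
    """Hypothetical stand-in mirroring the documented FileIdentity fields."""

    path: Path
    format: str | None
    size_bytes: int
    mtime_ns: int

    @classmethod
    def from_path(cls, path: Path, file_format: str | None = None) -> FileIdentitySketch:
        path = Path(path)
        stat = path.stat()
        if file_format is None:
            # Documented fallback: the suffix without its leading dot.
            # Turning an empty suffix into None is an assumption of this sketch.
            file_format = path.suffix.lstrip(".") or None
        return cls(path, file_format, stat.st_size, stat.st_mtime_ns)

    def as_dict(self) -> dict[str, Any]:
        # The path becomes a string so the record is JSON-compatible.
        return {
            "path": str(self.path),
            "format": self.format,
            "size_bytes": self.size_bytes,
            "mtime_ns": self.mtime_ns,
        }


# Demonstration on a throwaway file standing in for a trajectory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "traj.dcd"
    demo_path.write_bytes(b"\x00" * 16)
    record = FileIdentitySketch.from_path(demo_path).as_dict()
```

Recording size and modification time alongside the path lets a later run notice that an input file changed, even when its name did not.
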
class polyzymd.analyses.mda.universe.UniverseProvenance(replicate, working_directory, topology, trajectories, n_segments, loader_class, config_engine, engine_override=None, warnings=<factory>)

Bases: object

Provenance for one replicate universe loaded from PolyzyMD outputs.

replicate: int
working_directory: Path
topology: FileIdentity
trajectories: tuple[FileIdentity, ...]
n_segments: int
loader_class: str
config_engine: str | None
engine_override: str | None = None
warnings: tuple[str, ...]
as_dict()

Serialize provenance to JSON-compatible primitive values.

Returns:

Dictionary representation suitable for manifests and tests.

Return type:

dict[str, Any]

__init__(replicate, working_directory, topology, trajectories, n_segments, loader_class, config_engine, engine_override=None, warnings=<factory>)
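
Because as_dict() is described as producing JSON-compatible values for manifests, a round trip through `json` is a natural check. The sketch below builds a provenance-shaped dictionary by hand. The helper `provenance_record_sketch`, the key layout, and the sample values are all assumptions for illustration, since the reference does not spell out the exact dictionary that as_dict() returns.

```python
import json
from pathlib import Path


def provenance_record_sketch(replicate, working_directory, topology, trajectories,
                             n_segments, loader_class, config_engine=None,
                             engine_override=None, warnings=()):
    """Hypothetical helper: assemble a JSON-compatible provenance record."""
    return {
        "replicate": replicate,
        "working_directory": str(working_directory),  # Path -> str
        "topology": dict(topology),
        "trajectories": [dict(item) for item in trajectories],  # tuple -> list
        "n_segments": n_segments,
        "loader_class": loader_class,
        "config_engine": config_engine,
        "engine_override": engine_override,
        "warnings": list(warnings),
    }


# Made-up file identities, already in dictionary form.
topology = {"path": "run_1/solvated.pdb", "format": "pdb", "size_bytes": 1024, "mtime_ns": 1}
segment = {"path": "run_1/seg_0.dcd", "format": "dcd", "size_bytes": 4096, "mtime_ns": 2}

manifest_entry = provenance_record_sketch(
    replicate=1,
    working_directory=Path("run_1"),
    topology=topology,
    trajectories=(segment,),
    n_segments=1,
    loader_class="ExampleLoader",  # placeholder name, not a real PolyzyMD class
)

# A record made only of primitives survives a JSON round trip unchanged.
restored = json.loads(json.dumps(manifest_entry))
```
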
class polyzymd.analyses.mda.universe.UniverseProvider(config, engine_override=None, loader=None, loader_factory=None)

Bases: object

Config-aware provider for MDAnalysis universes and input provenance.

config: SimulationConfig
engine_override: str | None = None
loader: _TrajectoryLoaderLike | None = None
loader_factory: LoaderFactory | None = None
__post_init__()

Validate loader injection settings after dataclass construction.

classmethod from_config(config, **kwargs)

Create a universe provider from a simulation configuration.

Parameters:
  • config (SimulationConfig) – PolyzyMD simulation configuration.

  • **kwargs (Any) – Optional provider settings such as engine_override, loader, or loader_factory.

Returns:

Provider that lazily instantiates the trajectory loader.

Return type:

UniverseProvider

load_universe(replicate, *, cache=True)

Load an MDAnalysis universe for a replicate through the existing loader.

Parameters:
  • replicate (int) – Replicate index to load.

  • cache (bool, optional) – Whether the underlying loader may reuse its universe cache, by default True.

Returns:

Loaded MDAnalysis universe from the underlying trajectory loader.

Return type:

Universe

provenance_for(replicate, *, refresh=False)

Return provenance for a replicate, computing it when needed.

Parameters:
  • replicate (int) – Replicate index to inspect.

  • refresh (bool, optional) – Recompute provenance even when cached, by default False.

Returns:

Input file identity and loader metadata for the replicate.

Return type:

UniverseProvenance

get_provenance(replicate)

Return cached provenance without triggering trajectory discovery.

Parameters:
  • replicate (int) – Replicate index whose cached provenance should be returned.

Returns:

Cached provenance when available, otherwise None.

Return type:

UniverseProvenance or None

__init__(config, engine_override=None, loader=None, loader_factory=None)
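
The difference between provenance_for() and get_provenance() is easiest to see in use: the first computes provenance when needed, the second only reports what is already cached. The sketch below relies only on those two documented method names and on as_dict(). The `_StubProvider` and `_StubProvenance` classes are hypothetical stand-ins so the example runs without PolyzyMD, and `collect_cached_provenance` is an illustrative helper, not part of the package.

```python
class _StubProvenance:
    """Hypothetical stand-in for UniverseProvenance."""

    def __init__(self, replicate):
        self.replicate = replicate

    def as_dict(self):
        return {"replicate": self.replicate}


class _StubProvider:
    """Hypothetical stand-in exposing the two documented provenance methods."""

    def __init__(self):
        self._cache = {}
        self.computed = 0  # counts how often provenance was computed

    def provenance_for(self, replicate, *, refresh=False):
        if refresh or replicate not in self._cache:
            self.computed += 1
            self._cache[replicate] = _StubProvenance(replicate)
        return self._cache[replicate]

    def get_provenance(self, replicate):
        # Never computes anything: cached value or None.
        return self._cache.get(replicate)


def collect_cached_provenance(provider, replicates):
    """Gather provenance dicts only for replicates that were already inspected."""
    records = {}
    for replicate in replicates:
        provenance = provider.get_provenance(replicate)
        if provenance is not None:
            records[replicate] = provenance.as_dict()
    return records


provider = _StubProvider()
provider.provenance_for(1)  # replicate 1 is inspected, replicate 2 is not
records = collect_cached_provenance(provider, [1, 2])
```

A helper like this can write a manifest after a run without triggering trajectory discovery for replicates the run never touched.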

Pair-distance AnalysisBase primitives for MDAnalysis integrations.

class polyzymd.analyses.mda.pair_distance.PairDistanceSpec(label, selection_a, selection_b, atoms_a, atoms_b, mode_a, mode_b, threshold=None)

Bases: object

Resolved atom-group inputs for one pair-distance measurement.

Parameters:
  • label (str) – Human-readable pair label.

  • selection_a (str) – Original selection string for the first atom group or point.

  • selection_b (str) – Original selection string for the second atom group or point.

  • atoms_a (Any) – First MDAnalysis atom group.

  • atoms_b (Any) – Second MDAnalysis atom group.

  • mode_a (Any) – Position-reduction mode understood by shared selection helpers.

  • mode_b (Any) – Position-reduction mode understood by shared selection helpers.

  • threshold (float or None, optional) – Optional distance threshold in Å for downstream state summaries.

label: str
selection_a: str
selection_b: str
atoms_a: Any
atoms_b: Any
mode_a: Any
mode_b: Any
threshold: float | None = None
__init__(label, selection_a, selection_b, atoms_a, atoms_b, mode_a, mode_b, threshold=None)
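
The threshold field is described as input to downstream state summaries. One such summary, sketched here under assumptions, is the fraction of frames in which a pair sits within the threshold. The helper name `fraction_within_threshold` and the choice of an inclusive comparison are hypothetical. The reference does not say how built-in plugins define the state.

```python
def fraction_within_threshold(distances, threshold):
    """Fraction of frames with distance <= threshold, or None if undefined.

    `distances` is one row of per-frame distances in angstroms.
    The inclusive comparison is an assumption of this sketch.
    """
    if threshold is None or not distances:
        return None
    within = sum(1 for distance in distances if distance <= threshold)
    return within / len(distances)


# Four frames of made-up distances for a single pair.
example_row = [3.0, 4.5, 6.0, 2.0]
contact_fraction = fraction_within_threshold(example_row, 4.5)
no_threshold = fraction_within_threshold(example_row, None)
```
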
polyzymd.analyses.mda.pair_distance.build_pair_distance_analysis(*, universe, pairs, use_pbc)

Build a lazy custom AnalysisBase for pair-distance matrices.

Parameters:
  • universe (Any) – MDAnalysis universe for one trajectory.

  • pairs (sequence of PairDistanceSpec) – Resolved pair specifications.

  • use_pbc (bool) – Whether to request minimum-image distances from MDAnalysis.

Returns:

AnalysisBase instance whose results.distance_matrix has shape (n_pairs, n_frames) and whose results also include frame, time, and warning metadata.

Return type:

Any
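
To make the (n_pairs, n_frames) layout and the effect of use_pbc concrete, the following pure-Python sketch computes such a matrix for point pairs. It is not the MDAnalysis implementation and not the code behind build_pair_distance_analysis. It assumes an orthorhombic box given as three edge lengths, and the function names are hypothetical.

```python
import math


def minimum_image_delta(delta, box_length):
    """Wrap one coordinate difference into [-L/2, L/2] for an orthorhombic box."""
    return delta - box_length * round(delta / box_length)


def pair_distance_matrix(frames, pairs, box=None):
    """Return distances as a nested list with shape (n_pairs, n_frames).

    frames: list of frames, each a list of (x, y, z) points.
    pairs:  list of (i, j) index pairs into each frame.
    box:    optional (Lx, Ly, Lz); when given, minimum-image distances are used.
    """
    matrix = []
    for i, j in pairs:
        row = []
        for points in frames:
            deltas = [a - b for a, b in zip(points[i], points[j])]
            if box is not None:
                deltas = [minimum_image_delta(d, length) for d, length in zip(deltas, box)]
            row.append(math.sqrt(sum(d * d for d in deltas)))
        matrix.append(row)
    return matrix


# Two frames, two points; in frame 0 the points sit near opposite box faces.
frames = [
    [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]
pairs = [(0, 1)]
with_pbc = pair_distance_matrix(frames, pairs, box=(10.0, 10.0, 10.0))
without_pbc = pair_distance_matrix(frames, pairs)
```

In the first frame the minimum-image distance is 2 Å while the unwrapped distance is 8 Å, which is the kind of difference the use_pbc flag controls.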

polyzymd.analyses.mda.pair_distance.pair_distance_version()

Return the pair-distance primitive schema version.

Returns:

Version string for provenance records.

Return type:

str

Import-light primitives for the MDAnalysis extension layer.

class polyzymd.analyses.mda.base.MDARunKwargs

Bases: TypedDict

Keyword arguments accepted by MDAnalysis.analysis.base.AnalysisBase.run.

The value types stay intentionally broad where MDAnalysis accepts multiple backend or frame-selector forms. This keeps the public extension-layer type import-light and independent of optional MDAnalysis runtime objects.

start: int | None
stop: int | None
step: int | None
frames: Any
verbose: bool | None
progressbar_kwargs: dict[str, Any] | None
backend: Any
n_workers: int | None
n_parts: int | None
unsupported_backend: bool | None
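
A TypedDict such as this one is typically filled in partially and then unpacked into run(). The sketch below shows that pattern with a reduced local type. `RunKwargsSketch`, its `total=False` setting, the `drop_unset` helper, and the recording class are hypothetical. Whether MDARunKwargs itself allows missing keys is not stated here.

```python
from typing import Any, Dict, Optional, TypedDict


class RunKwargsSketch(TypedDict, total=False):
    """Hypothetical reduced version of the documented run keyword arguments."""

    start: Optional[int]
    stop: Optional[int]
    step: Optional[int]
    verbose: Optional[bool]


def drop_unset(kwargs: RunKwargsSketch) -> Dict[str, Any]:
    """Keep only entries that carry a value, so defaults downstream still apply."""
    return {key: value for key, value in kwargs.items() if value is not None}


class _RecordingAnalysis:
    """Stand-in analysis that records the keyword arguments it receives."""

    def run(self, **kwargs: Any) -> "_RecordingAnalysis":
        self.seen = kwargs
        return self


requested: RunKwargsSketch = {"start": 10, "stop": None, "step": 5}
analysis = _RecordingAnalysis().run(**drop_unset(requested))
```
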
class polyzymd.analyses.mda.base.AnalysisBaseLike(*args, **kwargs)

Bases: Protocol

Structural protocol for MDAnalysis AnalysisBase-style objects.

results: Any
run(**kwargs)

Run the analysis and return the analysis object.

Parameters:
  • **kwargs (Any) – Keyword arguments forwarded to the wrapped analysis object’s run() method.

Returns:

The analysis object after execution.

Return type:

AnalysisBaseLike

__init__(*args, **kwargs)
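
A structural protocol means an object qualifies by shape, not by inheritance. The sketch below defines a local protocol with the two documented members, results and run(), and a class that satisfies it without subclassing anything. `AnalysisLikeSketch` and `FrameCounter` are hypothetical names, and `@runtime_checkable` is added only so the sketch can demonstrate an isinstance check. The reference does not say whether AnalysisBaseLike is runtime-checkable.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AnalysisLikeSketch(Protocol):
    """Hypothetical local protocol with the documented members."""

    results: Any

    def run(self, **kwargs: Any) -> "AnalysisLikeSketch":
        ...


class FrameCounter:
    """Toy analysis: counts the frames selected by start/stop/step."""

    def __init__(self, n_frames: int) -> None:
        self.n_frames = n_frames
        self.results: Any = {}

    def run(self, **kwargs: Any) -> "FrameCounter":
        start = kwargs.get("start") or 0
        stop = kwargs.get("stop")
        if stop is None:
            stop = self.n_frames
        step = kwargs.get("step") or 1
        self.results = {"n_frames": len(range(start, stop, step))}
        return self  # run() hands back the analysis object itself


counter = FrameCounter(10)
finished = counter.run(start=2, step=2)
conforms = isinstance(counter, AnalysisLikeSketch)
```

Returning the analysis object from run() allows the familiar chained form, such as reading `FrameCounter(10).run().results` in one expression.
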
exception polyzymd.analyses.mda.base.MDAnalysisExtensionError

Bases: RuntimeError

Base runtime error for MDAnalysis extension-layer failures.
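
A shared base error lets callers catch every extension-layer failure with one except clause while still distinguishing specific causes. The sketch below illustrates that pattern with local classes. `ExtensionErrorSketch`, `EmptySelectionError`, and the helper functions are hypothetical and do not reflect the actual subclasses in the package.

```python
class ExtensionErrorSketch(RuntimeError):
    """Hypothetical stand-in for a base extension-layer error."""


class EmptySelectionError(ExtensionErrorSketch):
    """Hypothetical specific failure derived from the base error."""


def run_step(fail: bool) -> str:
    if fail:
        raise EmptySelectionError("selection matched no atoms")
    return "ok"


def safe_run(fail: bool) -> str:
    try:
        return run_step(fail)
    except ExtensionErrorSketch as exc:
        # One handler covers the base error and all of its subclasses.
        return f"extension failure: {exc}"


success = safe_run(False)
failure = safe_run(True)
```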