SASA Plugin Reference

This page is lookup documentation for the sasa analysis plugin: settings, selection behavior, output paths, artifact fields, comparison outputs, and plot files.

For a guided workflow, see Tutorial: Measure Polymer Shielding with SASA. For practical recipes and commands, see SASA Analysis: Quick Start.

Plugin key

Top-level comparison YAML key: plugins.sasa.

plugins:
  sasa:
    runs:
      - label: "protein_total"
        target_selection: "protein"

Settings

`plugins.sasa`

Field	Type	Default	Constraints	Description
`runs`	list	required	at least one entry; labels must be unique	Named SASA computations to run.
`probe_radius_nm`	float	`0.14`	`> 0`	Shrake-Rupley probe radius in nanometers.
`n_sphere_points`	int	`960`	`>= 100`	Number of test points on each atom sphere. Higher values are more accurate but slower.
`chunk_size`	int	`100`	`>= 1`	Frames processed per chunk for memory-managed computation.

`runs` entries

Field	Type	Default	Constraints	Description
`label`	string	required	non-empty; must not contain `/` or `\`	Human-readable run label used in summaries and plot filenames.
`target_selection`	string	required	non-empty	MDAnalysis selection for atoms whose SASA is reported.
`context_selection`	string or null	`target_selection`	blank values are treated as omitted	MDAnalysis selection for atoms included as surface blockers during SASA computation.
`stride`	int	`1`	`>= 1`	Analyze every Nth selected frame.
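
The constraints in the `runs` table can be sketched as a small validator. This is only an illustration of the documented rules: `SasaRun` and `validate_runs` are hypothetical names, not PolyzyMD classes, and the real plugin may report errors differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SasaRun:
    """Hypothetical stand-in for one `runs` entry (not a PolyzyMD class)."""
    label: str
    target_selection: str
    context_selection: Optional[str] = None
    stride: int = 1


def validate_runs(runs: List[SasaRun]) -> List[SasaRun]:
    """Check the documented constraints and fill the context default."""
    if not runs:
        raise ValueError("runs needs at least one entry")
    seen = set()
    for run in runs:
        # Labels end up in plot filenames, so path separators are rejected.
        if not run.label.strip() or "/" in run.label or "\\" in run.label:
            raise ValueError(f"invalid label: {run.label!r}")
        if run.label in seen:
            raise ValueError(f"duplicate label: {run.label!r}")
        seen.add(run.label)
        if not run.target_selection.strip():
            raise ValueError("target_selection must be non-empty")
        if run.stride < 1:
            raise ValueError("stride must be >= 1")
        # A blank context counts as omitted, and an omitted context
        # falls back to the target selection.
        if run.context_selection is None or not run.context_selection.strip():
            run.context_selection = run.target_selection
    return runs
```

For example, a run declared with only `label` and `target_selection` comes back with `context_selection` equal to its target, which corresponds to a self-SASA measurement.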

Target and context behavior

SASA runs separate what is reported from what can block the surface:

Selection	Behavior
`target_selection`	Defines the atoms/residues whose SASA values are summarized.
`context_selection`	Defines the atoms present in the Shrake-Rupley surface calculation.

If context_selection is omitted, PolyzyMD sets it equal to target_selection. This is useful for self-SASA measurements such as whole-protein SASA.

Examples:

Goal	`target_selection`	`context_selection`
Whole-protein self-SASA	`protein`	`protein` or omitted
Protein SASA with polymer shielding	`protein`	`protein or chainid C`
Active-site SASA	`protein and (resid 77 or resid 156 or resid 262)`	`protein`
Active-site SASA with polymer shielding	`protein and (resid 77 or resid 156 or resid 262)`	`protein or chainid C`
Monomer-specific shielding	`protein`	`protein or resname SBMA`

The project chain convention is A = protein, B = substrate, C = polymer, and D+ = solvent/ions/other. MDAnalysis selections use the lowercase keyword chainid, as in chainid C.
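
The split between target and context can be illustrated with a toy Shrake-Rupley calculation in plain Python. This is a minimal sketch, not PolyzyMD's implementation: `toy_sasa`, `unit_sphere_points`, the golden-spiral point layout, and the atom radii in the usage note are assumptions made for illustration, and the function returns nm^2 rather than the A^2 reported in plugin outputs.

```python
import math


def unit_sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        ring = math.sqrt(1.0 - z * z)
        points.append((ring * math.cos(golden_angle * i),
                       ring * math.sin(golden_angle * i),
                       z))
    return points


def toy_sasa(atoms, target, context=None,
             probe_radius_nm=0.14, n_sphere_points=960):
    """Total SASA in nm^2 of the target atoms, occluded by the context atoms.

    atoms   -- list of (x, y, z, radius_nm) tuples
    target  -- indices of atoms whose SASA is reported
    context -- indices of atoms allowed to block; defaults to the target
    """
    context = list(target) if context is None else list(context)
    sphere = unit_sphere_points(n_sphere_points)
    total = 0.0
    for i in target:
        x, y, z, radius = atoms[i]
        extended = radius + probe_radius_nm
        exposed = 0
        for ux, uy, uz in sphere:
            px, py, pz = x + extended * ux, y + extended * uy, z + extended * uz
            blocked = False
            for j in context:
                if j == i:
                    continue  # an atom never blocks its own test points
                bx, by, bz, b_radius = atoms[j]
                reach = b_radius + probe_radius_nm
                if (px - bx) ** 2 + (py - by) ** 2 + (pz - bz) ** 2 < reach ** 2:
                    blocked = True
                    break
            if not blocked:
                exposed += 1
        # Each exposed test point stands for an equal share of the sphere area.
        total += 4.0 * math.pi * extended ** 2 * exposed / n_sphere_points
    return total
```

With `atoms = [(0.0, 0.0, 0.0, 0.17), (0.3, 0.0, 0.0, 0.17)]`, `toy_sasa(atoms, target=[0])` gives the full area of the isolated probe-extended sphere, while `toy_sasa(atoms, target=[0], context=[0, 1])` gives a smaller value because the second atom blocks part of the surface. The reported atoms are the same in both calls; only the set of blockers changes.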

Canonical output paths

SASA writes canonical artifact outputs for compute and aggregate stages, plus a comparison output and plots.

Level	Path	Contents
Per replicate	`analysis/<condition>/sasa/run_<replicate>/result.json`	`ReplicateArtifact` envelope with per-run payload summaries and sidecar references.
Per replicate sidecars	`analysis/<condition>/sasa/run_<replicate>/sidecars/*.npz`	Large arrays such as per-frame total SASA and per-residue SASA.
Per condition	`analysis/<condition>/sasa/aggregated/result.json`	`ConditionArtifact` envelope with aggregate per-run summaries across replicates.
Per condition sidecars	`analysis/<condition>/sasa/aggregated/sidecars/*.npz`	Aggregate arrays and supporting data, when written.
Cross condition	`comparison/sasa/result.json`	Comparison output with condition summaries, pairwise tests, ANOVA-by-run, rankings, and metadata.
Plots	`figures/sasa/` by default, or the configured plot output directory	SASA comparison, normalized-control, time-series, and profile plots.
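
The path templates in the table can be assembled with a few helpers. This is a sketch only: the function names are hypothetical, and it assumes that all paths are relative to a single project root directory.

```python
from pathlib import PurePosixPath


def replicate_result_path(root, condition, replicate):
    """analysis/<condition>/sasa/run_<replicate>/result.json under root."""
    return (PurePosixPath(root) / "analysis" / condition / "sasa"
            / f"run_{replicate}" / "result.json")


def condition_result_path(root, condition):
    """analysis/<condition>/sasa/aggregated/result.json under root."""
    return (PurePosixPath(root) / "analysis" / condition / "sasa"
            / "aggregated" / "result.json")


def comparison_result_path(root):
    """comparison/sasa/result.json under root."""
    return PurePosixPath(root) / "comparison" / "sasa" / "result.json"
```

For instance, `replicate_result_path("project", "With Polymer", 1)` yields `project/analysis/With Polymer/sasa/run_1/result.json`. Condition labels are used verbatim as directory names here, including any spaces.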

Artifact envelope fields

Replicate and condition JSON files are artifact envelopes. The stable public concepts are the artifact envelope and the canonical paths, not private helper classes.

Field	Meaning
`analysis_name`	Analysis plugin name, usually `sasa`.
`condition_label`	Comparison condition label.
`replicate`	Replicate number for replicate artifacts; absent or not meaningful for condition artifacts.
`payload`	JSON-compatible SASA summaries, metrics, run labels, and relative sidecar paths.
`metadata`	Settings fingerprints, software versions, equilibration labels, units, and related run metadata.
`provenance`	Input trajectory/topology identity and workflow details.
`sidecars`	Validated references to large sidecar files, including relative paths and integrity metadata.

Common replicate payload keys include:

Key	Meaning
`run_results`	List of per-run summaries for the replicate.
`n_runs`	Number of configured SASA runs.
`n_frames_total`	Total frames available after the workflow frame selection.
`n_frames_used`	Frames actually analyzed after per-run stride.
`metrics` / `replicate_metrics`	Scalar metrics extracted from run summaries.
`metric_metadata`	Units and labels for scalar metrics.

Common per-run payload fields include:

Key	Meaning
`label`	SASA run label.
`target_selection`	Selection whose SASA is reported.
`context_selection`	Selection used as the blocking context.
`mean_sasa`	Mean SASA for the run in A^2.
`sem_sasa`	Standard error estimate for the run in A^2.
`sidecar_path`	Relative path to the NPZ sidecar for arrays.
`probe_radius_nm`	Probe radius used for the calculation.
`n_sphere_points`	Sphere point count used for the calculation.

Loading artifacts with `ArtifactStore`

Use the public artifact API in polyzymd.analyses.mda to inspect canonical artifacts:

from pathlib import Path

from polyzymd.analyses.mda import ArtifactStore

# Per-replicate artifact: analysis/<condition>/sasa/run_<replicate>
replicate_store = ArtifactStore(Path("analysis/With Polymer/sasa/run_1"))
replicate = replicate_store.read_replicate_result()
print(replicate.payload["run_results"])  # list of per-run summaries

# Per-condition artifact: analysis/<condition>/sasa/aggregated
condition_store = ArtifactStore(Path("analysis/With Polymer/sasa/aggregated"))
condition = condition_store.read_condition_result()
print(condition.payload)

Sidecar NPZ files are referenced from the artifact sidecars list and from payload fields such as sidecar_path. Treat sidecars as large validated data files linked by the artifact, not as independently discovered cache files.
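
Following the sidecar links from a payload might look like the sketch below. It assumes that each `sidecar_path` is relative to the directory holding `result.json`; `sidecar_paths` is a hypothetical helper, and the sidecar filename in the usage note is invented for illustration.

```python
from pathlib import PurePosixPath


def sidecar_paths(artifact_dir, payload):
    """Map each run label to the location of its NPZ sidecar.

    Assumption: sidecar_path values are relative to artifact_dir.
    Runs without a sidecar_path are skipped.
    """
    base = PurePosixPath(artifact_dir)
    return {
        run["label"]: base / run["sidecar_path"]
        for run in payload.get("run_results", [])
        if run.get("sidecar_path")
    }
```

Given a payload such as `{"run_results": [{"label": "protein_total", "sidecar_path": "sidecars/protein_total.npz"}]}` and the artifact directory `analysis/With Polymer/sasa/run_1`, the helper returns `analysis/With Polymer/sasa/run_1/sidecars/protein_total.npz` for `protein_total`. Starting from the artifact keeps the lookup tied to the validated references instead of globbing the directory.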

Comparison output

comparison/sasa/result.json contains cross-condition statistics organized by configured run label.

Field	Description
`metric`	Comparison metric name, currently `mean_sasa`.
`name`	Comparison study name.
`n_runs`	Number of configured SASA runs.
`run_labels`	Ordered list of run labels.
`control_label`	Configured control condition, when present.
`conditions`	Per-condition summaries with per-run means and SEMs.
`pairwise_comparisons`	Per-run pairwise statistics between conditions.
`anova_by_run`	Per-run ANOVA results when testable.
`ranking_by_run`	Condition ranking for each run.
`equilibration_time`	Equilibration cutoff used for the comparison.

Pairwise comparison entries include:

Field	Description
`run_label`	SASA run being compared.
`condition_a`, `condition_b`	Conditions in the comparison.
`p_value`, `p_value_adjusted`	Raw and adjusted p-values when available.
`cohens_d`	Effect size when available.
`direction`	`shielding`, `exposure`, or `unchanged` based on SASA change.
`significant`	Whether the comparison passed the configured significance rule.
`percent_change`	Percent change from condition A to condition B.
`testable`	Whether the comparison had enough data for a statistical test.
`note`	Explanation for non-testable or special cases.
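
The pairwise fields can be filtered with ordinary dictionary access once `comparison/sasa/result.json` has been parsed, for example with `json.load`. The sketch below assumes that `pairwise_comparisons` is a flat list of entries, each carrying its own `run_label`; the actual nesting may differ, and `significant_shielding` is a hypothetical helper.

```python
def significant_shielding(comparison):
    """Return testable, significant comparisons whose direction is shielding.

    Assumption: comparison["pairwise_comparisons"] is a flat list of dicts
    using the field names documented above.
    """
    hits = []
    for entry in comparison.get("pairwise_comparisons", []):
        if not entry.get("testable"):
            continue  # too little data for a test; see the entry's note
        if entry.get("significant") and entry.get("direction") == "shielding":
            hits.append((
                entry["run_label"],
                entry["condition_a"],
                entry["condition_b"],
                entry.get("percent_change"),
            ))
    return hits
```

Checking `testable` first avoids reading `p_value` or `cohens_d` on entries where those statistics are unavailable.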

Normalized-control formula

Normalized comparison plots use the configured control condition as the denominator:

percent change = (condition_mean - control_mean) / control_mean * 100

For a shielding run such as protein_with_polymer, negative values indicate lower SASA than the control and are consistent with polymer shielding. Positive values indicate increased exposure relative to the control.
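
The formula translates directly into a small function. The name `percent_change` and the zero-control guard are illustrative choices, and the numbers in the usage note are arbitrary rather than measured results.

```python
def percent_change(condition_mean, control_mean):
    """Percent change of a condition mean relative to the control mean."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return (condition_mean - control_mean) / control_mean * 100
```

A condition mean of 90 against a control mean of 100 gives a negative value of about -10, which reads as shielding; a condition mean of 120 against the same control gives a positive value of about 20, which reads as increased exposure.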

Plot outputs

For each configured run label, SASA may generate these plots under the configured plot output directory, usually figures/sasa/:

Plot output	Description
`sasa_comparison_<run>.png`	Mean SASA bar chart with SEM and replicate points.
`sasa_normalized_comparison_<run>.png`	Percent change relative to the configured control.
`sasa_timeseries_<run>.png`	Per-frame SASA traces summarized across conditions.
`sasa_profile_<run>.png`	Per-residue mean SASA profile across conditions.

Time-axis plots assume uniformly saved frames. PolyzyMD maps frame index to time as frame_index * dt, so trajectories concatenated from segments with different save intervals are not supported.
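
The frame-to-time mapping, combined with a per-run `stride`, can be sketched as follows. The helper name is hypothetical, and the sketch assumes that `dt` is the spacing between saved frames in ns and that striding keeps frames 0, stride, 2 * stride, and so on.

```python
def frame_times_ns(n_frames_total, stride, dt_ns):
    """Return the analyzed frame indices and their times in ns.

    Assumes uniformly saved frames, so time = frame_index * dt_ns.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    frame_indices = list(range(0, n_frames_total, stride))
    times = [index * dt_ns for index in frame_indices]
    return frame_indices, times
```

With 10 frames, a stride of 3, and a spacing of 0.5 ns, the analyzed frames are 0, 3, 6, and 9, at times 0.0, 1.5, 3.0, and 4.5 ns. The number of indices returned corresponds to `n_frames_used` under these assumptions.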

Units

Quantity	Unit
`probe_radius_nm`	nm
SASA values in outputs	A^2
Time sidecar arrays	ns