Post-Hoc Testing Reference
Overview
PolyzyMD performs post-hoc pairwise comparisons automatically during
polyzymd compare run. Two methods are available: BH-corrected t-tests
(default) and Tukey’s HSD. Both methods compute Cohen’s d effect sizes and
percent change for every pair.
Available Methods
| Method | posthoc_method | When to use | Assumptions |
|---|---|---|---|
| BH-corrected t-tests | ttest_bh | Specific pairs of interest or heterogeneous sample sizes. Default. | Independence between pairs; equal variance (Student) or relaxed (Welch). |
| Tukey’s HSD | tukey_hsd | All conditions are equally important; balanced design preferred. | Equal variance; equal (or similar) sample sizes across conditions. |
Configuration
Set post-hoc options in the defaults: block of comparison.yaml:
defaults:
  posthoc_method: "ttest_bh" # or "tukey_hsd"
  ttest_method: "student" # or "welch" (only used when posthoc_method is ttest_bh)
  fdr_alpha: 0.05 # significance threshold (used by both ttest_bh and tukey_hsd)
| Field | Type | Default | Description |
|---|---|---|---|
| posthoc_method | | "ttest_bh" | Selects which post-hoc procedure to use for pairwise comparisons. |
| ttest_method | | | Controls the variance assumption for the two-sample t-test. Only used when posthoc_method is "ttest_bh". |
| fdr_alpha | float (0, 1] | | Significance threshold for pairwise comparisons and ANOVA. Used as the BH false-discovery-rate threshold when posthoc_method is "ttest_bh". |
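The constraints on these three options can be sketched as a small validator. This is an illustrative helper only, not PolyzyMD's own loader: the function name `validate_posthoc_defaults` is hypothetical, and the fallback values for `ttest_method` and `fdr_alpha` are assumptions taken from the example block above.

```python
ALLOWED_POSTHOC_METHODS = {"ttest_bh", "tukey_hsd"}
ALLOWED_TTEST_METHODS = {"student", "welch"}


def validate_posthoc_defaults(defaults: dict) -> dict:
    """Check the post-hoc keys of an already-parsed defaults: mapping.

    Hypothetical helper for illustration; not part of PolyzyMD.
    """
    posthoc_method = defaults.get("posthoc_method", "ttest_bh")
    # Assumed fallbacks, copied from the example configuration.
    ttest_method = defaults.get("ttest_method", "student")
    fdr_alpha = defaults.get("fdr_alpha", 0.05)

    if posthoc_method not in ALLOWED_POSTHOC_METHODS:
        raise ValueError(f"unknown posthoc_method: {posthoc_method!r}")
    if ttest_method not in ALLOWED_TTEST_METHODS:
        raise ValueError(f"unknown ttest_method: {ttest_method!r}")
    if isinstance(fdr_alpha, bool) or not isinstance(fdr_alpha, (int, float)):
        raise TypeError("fdr_alpha must be a number")
    # The documented range is the half-open interval (0, 1].
    if not 0 < fdr_alpha <= 1:
        raise ValueError("fdr_alpha must satisfy 0 < fdr_alpha <= 1")

    return {
        "posthoc_method": posthoc_method,
        "ttest_method": ttest_method,
        "fdr_alpha": float(fdr_alpha),
    }
```

Passing the result of a YAML parser's `defaults` mapping to this function would reject a misspelled method name or an out-of-range threshold before any statistics are computed.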
Method Details
BH-Corrected t-Tests (ttest_bh)
Runs independent two-sample t-tests for each pair of conditions.
ttest_method: "student" assumes equal variances (scipy.stats.ttest_ind with equal_var=True).
ttest_method: "welch" relaxes that assumption (equal_var=False).
Raw p-values from all pairs across all metrics are collected into a single family, then adjusted via the Benjamini-Hochberg step-up procedure.
A pair is significant when p_adj <= fdr_alpha.
Cohen’s d is computed for effect size, with interpretation labels:
"negligible" (|d| < 0.2), "small" (0.2 <= |d| < 0.5), "medium" (0.5 <= |d| < 0.8), "large" (|d| >= 0.8).
Note
When a control label is set in comparison.yaml, only control-vs-treatment pairs are tested. Otherwise, all unique pairs are tested.
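The procedure described in this section can be sketched roughly as follows. This is a minimal illustration for a single metric, not PolyzyMD's implementation: the function names, the `control` argument, and the result keys `pair` and `significant` are hypothetical, and the Benjamini-Hochberg adjustment is written out by hand with NumPy.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def bh_adjust(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, in input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity starting from the largest p-value, then cap at 1.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def pairwise_ttests_bh(groups, ttest_method="student", fdr_alpha=0.05, control=None):
    """Run two-sample t-tests per pair and BH-adjust the raw p-values.

    groups maps a condition label to its replicate values (one metric only).
    """
    labels = list(groups)
    if control is not None:
        # Only control-vs-treatment pairs when a control label is given.
        pairs = [(control, b) for b in labels if b != control]
    else:
        pairs = list(combinations(labels, 2))

    equal_var = ttest_method == "student"  # "welch" relaxes equal variance
    raw = [
        stats.ttest_ind(groups[a], groups[b], equal_var=equal_var).pvalue
        for a, b in pairs
    ]
    adjusted = bh_adjust(raw)
    return [
        {
            "pair": pair,
            "p_value": float(p),
            "p_value_adjusted": float(q),
            "significant": bool(q <= fdr_alpha),
        }
        for pair, p, q in zip(pairs, raw, adjusted)
    ]
```

Because the text above states that p-values from all pairs across all metrics form one family, a faithful version would collect the raw p-values for every metric before calling `bh_adjust` once; the sketch adjusts within one metric only to stay short.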
Tukey’s HSD (tukey_hsd)
Tests all pairs simultaneously using scipy.stats.tukey_hsd.
Controls the family-wise error rate (FWER) rather than FDR.
A pair is significant when the Tukey-adjusted p_value <= fdr_alpha.
p_value_adjusted mirrors p_value for Tukey results (Tukey p-values are already family-wise corrected).
Best with balanced designs (equal replicates per condition).
Cohen’s d is still computed for each pair.
The t_statistic field is set to NaN for Tukey results since the test does not produce a t-statistic.
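A rough sketch of how the Tukey results described above could be assembled from scipy.stats.tukey_hsd is shown here. It is an assumption-laden illustration rather than PolyzyMD's code: the function name `tukey_pairs` and the result keys `pair` and `significant` are hypothetical, while `t_statistic`, `p_value`, and `p_value_adjusted` follow the field names used in this section.

```python
import math

from scipy import stats


def tukey_pairs(groups, fdr_alpha=0.05):
    """Build one result row per pair from a single Tukey HSD run.

    groups maps a condition label to its replicate values.
    """
    labels = list(groups)
    # Documented edge case: fewer than 2 groups, or fewer than 2
    # observations in any group, yields no pairs.
    if len(labels) < 2 or any(len(groups[k]) < 2 for k in labels):
        return []

    result = stats.tukey_hsd(*(groups[k] for k in labels))
    rows = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            p = float(result.pvalue[i, j])  # already family-wise corrected
            rows.append(
                {
                    "pair": (labels[i], labels[j]),
                    "t_statistic": math.nan,  # Tukey gives no t-statistic
                    "p_value": p,
                    "p_value_adjusted": p,  # mirrors p_value
                    "significant": p <= fdr_alpha,
                }
            )
    return rows
```

All pairs come from one call to `tukey_hsd`, which is what makes the p-values family-wise corrected without a separate adjustment step.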
ANOVA
A one-way ANOVA (scipy.stats.f_oneway) is always run alongside post-hoc tests when there are 3 or more conditions. It tests whether at least one condition’s mean differs from the others.
| Field | Type | Description |
|---|---|---|
| | float | F-statistic from the one-way ANOVA. |
| | float | P-value for the omnibus test. |
| | bool | Whether the omnibus test is significant at fdr_alpha. |
ANOVA does not determine which pairs differ – that is the role of post-hoc tests. ANOVA is skipped when fewer than 3 conditions are present, and returns NaN statistics if any group has fewer than 2 observations.
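The omnibus test and its two documented guards can be sketched as below. This is only an illustration: the function name `one_way_anova` and the result keys are hypothetical, and comparing the p-value to fdr_alpha with `<=` is an assumption (the source states only that fdr_alpha is the threshold).

```python
import math

from scipy import stats


def one_way_anova(groups, fdr_alpha=0.05):
    """Run scipy.stats.f_oneway with the skip and NaN guards described above.

    groups maps a condition label to its replicate values.
    Returns None when ANOVA is skipped.
    """
    samples = list(groups.values())
    if len(samples) < 3:
        return None  # skipped: fewer than 3 conditions
    if any(len(s) < 2 for s in samples):
        # Any group with fewer than 2 observations gives NaN statistics.
        return {"f_statistic": math.nan, "p_value": math.nan, "significant": False}

    f_stat, p_value = stats.f_oneway(*samples)
    return {
        "f_statistic": float(f_stat),
        "p_value": float(p_value),
        # Assumed comparison operator; see the note above.
        "significant": bool(p_value <= fdr_alpha),
    }
```

A significant result here says only that some mean differs; the pairwise rows from the post-hoc method are still needed to identify which conditions differ.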
Output Fields
These fields appear in comparison JSON files and are used by the CLI formatter.
Pairwise result fields
| Field | Type | Description |
|---|---|---|
| | str | Label of first condition (typically control). |
| | str | Label of second condition (typically treatment). |
| | str | Name of the metric being compared. |
| t_statistic | float | T-test statistic. |
| p_value | float | Raw p-value from the pairwise test. |
| p_value_adjusted | float or null | Adjusted p-value (BH-adjusted for ttest_bh; mirrors p_value for tukey_hsd). |
| | str | |
| | float | Effect size (positive = |
| | str | |
| | str | Interpreted direction of change (e.g. |
| | bool | Whether the adjusted p-value is <= fdr_alpha. |
| | float | Percent change from |
Comparison-level fields
| Field | Type | Description |
|---|---|---|
| | float | The alpha threshold used for significance (BH FDR for |
| | str | |
| | str | |
CLI Significance Markers
The CLI formatter annotates pairwise rows with significance markers:
| Marker | Meaning |
|---|---|
| | |
| | |
| | |
Note
The contacts formatter uses multi-level markers (**, ***). The default
scalar formatter uses a single * for any significant result.
Some plugins also use:
| Marker | Meaning |
|---|---|
| | Cohen’s d meets the |
Edge Cases
| Scenario | Behavior |
|---|---|
| Fewer than 2 replicates in a group | t-test returns |
| Equal values across all replicates in both groups | p-value = 1.0, Cohen’s d = 0.0 ( |
| Single condition | No pairwise tests are generated; ANOVA is skipped. |
| Two conditions | Pairwise tests run normally; ANOVA is skipped (requires >= 3 conditions). |
| Tukey HSD with fewer than 2 groups or fewer than 2 observations per group | Returns empty results (no pairs generated). |
| Zero control mean | Percent change returns |
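The effect-size labels and the identical-values case can be illustrated with a short sketch. Several details here are assumptions rather than documented behavior: the pooled-standard-deviation form of Cohen’s d, the sign convention (positive when the second group's mean is larger), and the function names `cohens_d` and `interpret_d`. Only the label thresholds and the d = 0.0 result for identical values come from this reference.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d with a pooled sample standard deviation (assumed form).

    Positive when mean(b) > mean(a); this sign convention is an assumption.
    Each group needs at least 2 observations.
    """
    diff = statistics.fmean(b) - statistics.fmean(a)
    if diff == 0.0:
        # Covers identical values in both groups, where the pooled
        # standard deviation would be zero.
        return 0.0
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    # Constant but different groups (pooled_var == 0) are not handled here
    # and would raise ZeroDivisionError.
    return diff / math.sqrt(pooled_var)


def interpret_d(d):
    """Map |d| to the interpretation labels listed under Method Details."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```

For example, `interpret_d(cohens_d([5.0, 5.0, 5.0], [5.0, 5.0, 5.0]))` gives the "negligible" label, matching the d = 0.0 case in the table above.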
See Also
comparison.yaml Schema Reference – full comparison.yaml schema reference
Comparison and Plotting Reference – comparison CLI commands and plugin summary
Statistics Best Practices for MD Analysis – autocorrelation, FDR concepts, and interpretation guidance