Statistical Power

Statistical power is the probability that your study will detect an effect when there really is an effect to detect. Formally: power = 1 − β, where β is the probability of a Type II error (missing a real effect).

If your study is underpowered, you may well miss real effects. Worse, you may ship the conclusion that there's no effect when there is one. An underpowered study isn't just inconclusive; it can be actively misleading.


The Four Interlocking Components

Power analysis sits on four numbers that are all interconnected. Specify any three, and the fourth is determined:

Effect size: the magnitude of the difference you're trying to detect.
Sample size: the number of observations.
Significance level (α): the probability of a Type I error, a false positive (usually 0.05).
Power (1 − β): the probability of avoiding a Type II error, a false negative (usually 0.80).

The most common scenario: specify effect size, α, and power → solve for minimum sample size.
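As a rough illustration of that scenario, the sketch below solves for the minimum sample size using the normal approximation for a one-sample z-test. The function name `min_sample_size` and its defaults are hypothetical, chosen for this example; a real study would typically use a t-test based calculation from a statistics package, which gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(d, alpha=0.05, power=0.80, two_sided=True):
    """Approximate minimum n for a one-sample z-test (assumed design).

    d     : standardized effect size (Cohen's d)
    alpha : significance level
    power : target power (1 - beta)
    """
    z = NormalDist()  # standard normal distribution
    tail = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail)   # critical value for the chosen alpha
    z_power = z.inv_cdf(power)      # quantile matching the target power
    # Normal-approximation formula: n = ((z_alpha + z_power) / d)^2, rounded up
    return ceil(((z_alpha + z_power) / d) ** 2)


n_two_sided = min_sample_size(d=0.5, alpha=0.05, power=0.80)
n_one_sided = min_sample_size(d=0.5, alpha=0.05, power=0.80, two_sided=False)
print(n_two_sided, n_one_sided)
```

Under the same approximation, a comparison between two independent groups needs roughly twice this many observations in each group.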

Power Analysis Explorer

The indigo curve is the null distribution (H₀); the green curve is the alternative (H₁). The dashed line is the critical value. Adjust the sliders to see how power changes.

[Interactive chart: null (H₀) and alternative (H₁) distributions with the critical value marked. Shaded regions show power (1 − β), β (Type II error), and α (Type I error). Sliders: effect size (Cohen's d, 0.10 to 1.00), sample size (n, 5 to 300), and significance level (α).]

Example reading: effect size d = 0.50, sample size n = 40 → power 93.5%, β (miss rate) 6.5%. Adequate: meets the conventional 80% threshold.

Adjust effect size, sample size, and α to see how the null (H₀) and alternative (H₁) distributions shift relative to the critical value. The green shaded region is power; the red region is β.
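The relationship between the two curves and the critical value can be sketched in a few lines. This is a minimal sketch that assumes a one-sided, one-sample z-test at α = 0.05; the page does not state which test the explorer uses, so that is a guess, though under this assumption d = 0.50 and n = 40 give a power of about 93.5%. The function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test (assumed design)."""
    z = NormalDist()  # standard normal distribution
    critical = z.inv_cdf(1 - alpha)  # critical value on the z scale under H0
    shift = d * sqrt(n)              # mean of the test statistic under H1
    # Power is the area of the H1 curve to the right of the critical value.
    return 1 - z.cdf(critical - shift)


power = z_test_power(d=0.50, n=40, alpha=0.05)
beta = 1 - power  # Type II error rate (miss rate)
print(f"power = {power:.1%}, beta = {beta:.1%}")
```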

Power gets higher when:

Sample size increases. More data = more power. This is usually your main lever.
Effect size is larger. Big effects are easier to detect.

Power gets lower when:

The significance threshold becomes stricter. Moving from α = 0.05 to α = 0.01 makes it harder to clear the bar.
Variability in the data increases. Noisy data masks signal.
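The four levers above can be checked numerically. The sketch below again assumes a one-sided, one-sample z-test and uses made-up numbers (a raw difference `delta` of 5 units and a standard deviation `sigma` of 10) purely for illustration. Variability enters through the standardized effect size d = delta / sigma, so noisier data shrinks d even when the raw difference is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def power_from_raw(delta, sigma, n, alpha):
    """One-sided, one-sample z-test power from a raw difference and noise level."""
    z = NormalDist()
    d = delta / sigma  # standardized effect size (Cohen's d)
    return 1 - z.cdf(z.inv_cdf(1 - alpha) - d * sqrt(n))


# Illustrative values only; none of these come from a real study.
baseline = power_from_raw(delta=5, sigma=10, n=25, alpha=0.05)
larger_n = power_from_raw(delta=5, sigma=10, n=50, alpha=0.05)
larger_effect = power_from_raw(delta=8, sigma=10, n=25, alpha=0.05)
stricter_alpha = power_from_raw(delta=5, sigma=10, n=25, alpha=0.01)
noisier_data = power_from_raw(delta=5, sigma=20, n=25, alpha=0.05)

for name, p in [
    ("baseline", baseline),
    ("larger sample (n=50)", larger_n),
    ("larger effect (delta=8)", larger_effect),
    ("stricter alpha (0.01)", stricter_alpha),
    ("noisier data (sigma=20)", noisier_data),
]:
    print(f"{name:26s} {p:.3f}")
```

The first two changes raise power above the baseline and the last two lower it, matching the lists above.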

Checkpoint

A study has 80% power at α = 0.05 to detect an effect of size d = 0.5. A researcher decides to use a stricter significance threshold of α = 0.01. All else equal, what happens to power?
