Power Analysis in Python

statsmodels has built-in power analysis tools for most of the statistical tests we'll cover. The two you'll use most often:

  • statsmodels.stats.power.TTestPower — one-sample or paired t-tests.
  • statsmodels.stats.power.TTestIndPower — independent-sample t-tests.

A typical workflow:

  1. Run or find a pilot study with two groups.
  2. Compute the means and pooled standard deviation.
  3. Compute Cohen's d: (mean₁ − mean₂) / pooled_std.
  4. Pass that effect size, along with α = 0.05 and power = 0.80, to the appropriate power function.
  5. Solve for required sample size per group.
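
The five steps above can be sketched end to end as follows. This is a minimal illustration, not a prescribed implementation: the helper name cohens_d and the pilot measurements are hypothetical, invented here only to make the arithmetic concrete. The pooled standard deviation weights each group's sample variance by its degrees of freedom.

```python
import math

import numpy as np
from statsmodels.stats.power import TTestIndPower


def cohens_d(group1, group2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    g1 = np.asarray(group1, dtype=float)
    g2 = np.asarray(group2, dtype=float)
    n1, n2 = len(g1), len(g2)
    # Pooled variance: sample variances (ddof=1) weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / math.sqrt(pooled_var)


# Hypothetical pilot measurements, invented purely for illustration.
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]

d = cohens_d(treatment, control)

# Leave nobs1 unset so that solve_power solves for the group size.
n_per_group = TTestIndPower().solve_power(
    effect_size=d, alpha=0.05, power=0.80, alternative="two-sided"
)
n_required = math.ceil(n_per_group)  # round up to a whole subject
print(f"Cohen's d = {d:.3f}; required sample size per group: {n_required}")
```

A pilot this small gives a very noisy estimate of d, so in practice the resulting sample size is best read as a rough guide.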

Example: Comparing Two Conversion Rates

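import math

from statsmodels.stats.power import TTestIndPower

# Pilot study results (conversion is a 0/1 outcome, so std = sqrt(p * (1 - p))):
# Control:   mean = 0.042, std = 0.201
# Treatment: mean = 0.051, std = 0.220
# Pooled std ≈ 0.211

effect_size = (0.051 - 0.042) / 0.211  # Cohen's d ≈ 0.043

analysis = TTestIndPower()
# Leave nobs1 unset so that solve_power solves for the group size.
n = analysis.solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,
    ratio=1.0,  # equal group sizes
    alternative='two-sided'
)
# solve_power returns a fractional value; round up to a whole subject.
print(f"Required sample size per group: {math.ceil(n)}")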

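solve_power returns a fractional value, so round it up to get the minimum whole-number sample size per group. For these inputs that is roughly 8,600 per group.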
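
Because a conversion is a binary outcome, a proportion-based calculation is a reasonable cross-check on the t-test figure. The sketch below is one possible alternative, not the method used above: it assumes the pilot means are conversion proportions, computes Cohen's h with statsmodels' proportion_effectsize, and uses NormalIndPower (a z-test) in place of TTestIndPower.

```python
import math

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Pilot conversion rates from the example above, treated as proportions.
p_treatment, p_control = 0.051, 0.042

# Cohen's h: difference of arcsine-transformed proportions.
h = proportion_effectsize(p_treatment, p_control)

n_prop = NormalIndPower().solve_power(
    effect_size=h, alpha=0.05, power=0.80, ratio=1.0, alternative="two-sided"
)
n_prop_required = math.ceil(n_prop)  # round up to a whole subject
print(f"Cohen's h = {h:.4f}; required sample size per group: {n_prop_required}")
```

With rates this close together, h and d are nearly equal, so the two approaches should land on sample sizes of the same order.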

When the Number You Get Is Impossible

Power analysis is a respected methodology. It is also not magic. You will sometimes get back numbers like "you need 50,000 subjects" when you have budget for 200.

Your honest options:

  • Report results as preliminary / underpowered (explicitly labeled).
  • Shrink the scope: design to detect only a larger effect.
  • Negotiate for more time or budget by showing the power analysis.
  • Decide not to run the study at all. An underpowered study can be worse than no study, because a non-significant result is easily misread as evidence that there is no effect.

What you should not do is run the underpowered study and report the result as definitive.
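
To make the "shrink the scope" option concrete, you can turn the calculation around: fix the sample size you can afford, then ask how much power it gives against the pilot effect and what the smallest detectable effect would be at 80% power. A minimal sketch, assuming the budget of 200 means 200 subjects per group (the text above does not say whether it is per group or total) and reusing the pilot effect size from the earlier example:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

n_affordable = 200  # assumption: 200 subjects per group
pilot_effect = (0.051 - 0.042) / 0.211  # Cohen's d from the pilot example

# Power to detect the pilot effect with the affordable sample size.
achieved_power = analysis.power(
    effect_size=pilot_effect, nobs1=n_affordable, alpha=0.05,
    ratio=1.0, alternative="two-sided",
)

# Smallest effect detectable with 80% power: leave effect_size unset.
min_detectable_d = float(analysis.solve_power(
    nobs1=n_affordable, alpha=0.05, power=0.80,
    ratio=1.0, alternative="two-sided",
))

print(f"Power at d = {pilot_effect:.3f}: {achieved_power:.2f}")
print(f"Minimum detectable effect at 80% power: d = {min_detectable_d:.2f}")
```

If an effect as large as the minimum detectable one is implausible for your intervention, redesigning around it will not help, and the remaining options are more budget, an explicitly preliminary report, or no study.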