Independent Samples and Welch's t-Test

Student's Independent Samples t-Test

Question: Are the means of two independent groups significantly different?

Use it when: You are comparing the means of two separate groups with no overlap between subjects. Common in A/B tests.

Assumptions: Independence, normality within each group, equal variances (homogeneity).

In Python: scipy.stats.ttest_ind(group1, group2)
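A minimal sketch of that call, assuming two made-up groups of simulated measurements (the means, spreads, sample sizes, and seed below are illustrative choices, not values from this lesson):

```python
import numpy as np
from scipy import stats

# Hypothetical data: two independent groups drawn with the same spread
rng = np.random.default_rng(seed=42)
group1 = rng.normal(loc=50.0, scale=10.0, size=80)
group2 = rng.normal(loc=53.0, scale=10.0, size=80)

# Student's t-test: equal_var=True is SciPy's default, so variances are pooled
result = stats.ttest_ind(group1, group2)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

The returned object exposes the t-statistic and a two-sided p-value; compare the p-value against your chosen α.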

Welch's t-Test (Unequal Variances)

Question: Same as Student's — but used when the equal-variance assumption may not hold.

Use it when: Two independent groups, but variance in the groups differs or you're unsure. Welch's computes an adjusted degrees of freedom (Welch-Satterthwaite equation) that accounts for unequal variances.

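Recommendation: Default to Welch's. The cost of using it when variances are actually equal is small (a slight loss of power). The cost of using Student's when they're not can be substantial: the pooled standard error is miscalibrated, so p-values can come out too small or too large, especially when the group sizes differ.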
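The recommendation can be checked with a small simulation. The sketch below assumes one specific setup chosen for illustration (a small, noisy group of n = 10 with SD = 3 against a large, tight group of n = 40 with SD = 1, both with the same true mean), so every rejection is a false positive. Passing equal_var=False to ttest_ind selects Welch's test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n_sims = 5000
alpha = 0.05

# H0 is true by construction: both groups have the same true mean (0).
# Assumed setup: the smaller group has the larger variance.
small_noisy = rng.normal(loc=0.0, scale=3.0, size=(n_sims, 10))
large_tight = rng.normal(loc=0.0, scale=1.0, size=(n_sims, 40))

# One test per simulated experiment (each row is one experiment)
student_p = stats.ttest_ind(small_noisy, large_tight, axis=1).pvalue
welch_p = stats.ttest_ind(small_noisy, large_tight, axis=1, equal_var=False).pvalue

# Share of experiments that (wrongly) reject H0 at alpha
student_rate = np.mean(student_p < alpha)
welch_rate = np.mean(welch_p < alpha)
print(f"Student's false-positive rate: {student_rate:.3f}")
print(f"Welch's false-positive rate:   {welch_rate:.3f}")
```

Under these assumptions, Student's false-positive rate should land well above the nominal 5%, while Welch's stays close to it. Other combinations of sizes and spreads will give different numbers.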

In Python: scipy.stats.ttest_ind(group1, group2, equal_var=False)
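To see the effect of the flag, here is a short sketch that runs both tests on the same data. The two groups are hypothetical, simulated with different spreads and different sizes purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical data: unequal spreads (5 vs. 20) and unequal sizes (120 vs. 40)
rng = np.random.default_rng(seed=7)
group1 = rng.normal(loc=100.0, scale=5.0, size=120)
group2 = rng.normal(loc=102.0, scale=20.0, size=40)

student = stats.ttest_ind(group1, group2)                  # pooled variance
welch = stats.ttest_ind(group1, group2, equal_var=False)   # Welch's t-test

print(f"Student's: t = {student.statistic:.3f}, p = {student.pvalue:.4f}")
print(f"Welch's:   t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

The mean difference is the same in both calls; only the standard error and degrees of freedom change, which is why the t-statistics and p-values diverge.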

Independent Samples t-Test (Welch's)


Figure: Group means and spread (±1.96 SD). Dots = means; horizontal bars = ±1.96 SD range; dashed line = mean difference.

Unequal variances detected — SD ratio ≈ 3.0×. Welch's t-test (used here) adjusts for this automatically via Welch-Satterthwaite df.

New customers: Mean (x̄₁) = 65.0, SD (s₁) = 5.0, Size (n₁) = 100

Returning customers: Mean (x̄₂) = 67.0, SD (s₂) = 15.0, Size (n₂) = 100

Figure: Welch's t-distribution (df = 120.7). Shaded tails = p-value region; orange marker = observed t-statistic.

Results

Mean diff (x̄₁ − x̄₂): -2.00

Standard error: 1.5811

t-statistic: -1.265

p-value (two-tailed): 0.2083

Interpretation (α = 0.05)

p = 0.2083 ≥ 0.05 — fail to reject H₀. No significant difference between New customers (mean = 65) and Returning customers (mean = 67) was detected.

How it's computed

SE = √(s₁²/n₁ + s₂²/n₂) = √(5²/100 + 15²/100) = 1.5811

t = (x̄₁ − x̄₂) / SE = (65 − 67) / 1.5811 = -1.265

Welch df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁−1) + (s₂²/n₂)²/(n₂−1)] = 120.7

p = 0.2083 → not significant at α = 0.05
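The same steps can be sketched in Python from the summary statistics alone. The variable names are illustrative; scipy.stats.ttest_ind_from_stats is used only as a cross-check on the hand computation.

```python
import math
from scipy import stats

# Summary statistics from the scenario above
mean1, sd1, n1 = 65.0, 5.0, 100    # new customers
mean2, sd2, n2 = 67.0, 15.0, 100   # returning customers

# Per-group variance of the mean: s^2 / n
v1 = sd1**2 / n1
v2 = sd2**2 / n2

se = math.sqrt(v1 + v2)                 # standard error of the difference
t_stat = (mean1 - mean2) / se           # Welch's t-statistic
# Welch-Satterthwaite degrees of freedom
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value

# Cross-check with SciPy's summary-statistics version of the test
check = stats.ttest_ind_from_stats(mean1, sd1, n1, mean2, sd2, n2, equal_var=False)

print(f"SE = {se:.4f}, t = {t_stat:.3f}, df = {df:.1f}, p = {p_value:.4f}")
print(f"SciPy: t = {check.statistic:.3f}, p = {check.pvalue:.4f}")
```

Both routes should agree with the figures listed above (SE, t, df, and p).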



◆

Worked Example: New vs. Returning Customers

Comparing average purchase amounts: new customers (mean = $65, SD = $5, n = 100) vs. returning customers (mean = $67, SD = $15, n = 100).

Notice the standard deviations: $5 vs. $15. That's a 3x difference in spread, so Levene's test would likely flag this, and Welch's is clearly the right choice here.
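As a rough illustration of that check, the sketch below simulates raw purchase amounts with the stated means and SDs (the lesson gives only summary statistics, so the individual values, the normal shape, and the seed are assumptions) and runs scipy.stats.levene on them:

```python
import numpy as np
from scipy import stats

# Simulated stand-ins for the raw data (assumption: roughly normal purchases)
rng = np.random.default_rng(seed=1)
new_customers = rng.normal(loc=65.0, scale=5.0, size=100)
returning_customers = rng.normal(loc=67.0, scale=15.0, size=100)

# Levene's test for equal variances (SciPy centers on the median by default)
levene = stats.levene(new_customers, returning_customers)
print(f"Levene W = {levene.statistic:.2f}, p = {levene.pvalue:.2g}")

if levene.pvalue < 0.05:
    print("Variances look unequal; prefer Welch's t-test.")
```

With a 3x gap in spread and 100 observations per group, the test should reject equal variances comfortably. If you already default to Welch's, this step is a diagnostic rather than a gate.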

Running Welch's t-test on these numbers gives p = 0.2083, so the $2 difference in means is not statistically significant at α = 0.05. The high variance among returning customers inflates the standard error of the difference, which widens the confidence interval around it. The lesson: an apparent difference in sample means is not necessarily statistically significant when variance is high.
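One way to make the width concrete is a 95% confidence interval for the mean difference, built from the Welch standard error and degrees of freedom. This is a sketch using the worked example's summary statistics; the variable names are illustrative.

```python
import math
from scipy import stats

mean1, sd1, n1 = 65.0, 5.0, 100    # new customers
mean2, sd2, n2 = 67.0, 15.0, 100   # returning customers

v1 = sd1**2 / n1
v2 = sd2**2 / n2
se = math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Two-sided 95% critical value from the t-distribution with Welch df
t_crit = stats.t.ppf(0.975, df)

diff = mean1 - mean2
ci_low = diff - t_crit * se
ci_high = diff + t_crit * se
print(f"95% CI for the mean difference: ({ci_low:.2f}, {ci_high:.2f})")
```

The interval spans zero, which matches the non-significant result: the data are consistent with no difference as well as with differences of a few dollars in either direction.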

Checkpoint

You're comparing average response time between two groups in an A/B test. Levene's test returns p = 0.01, indicating significantly unequal variances. Which test should you use?
