The Logic: Partitioning Variability
The key insight behind ANOVA is that total variability in your data can be split into two parts:
- Variability between groups — how much do the group means differ from the overall mean?
- Variability within groups — how much do individual observations vary from their own group mean?
If between-group variability is large compared to within-group variability, that's evidence the groups really differ. If it is small relative to within-group variability, you can't distinguish group differences from random noise.
Three Sums of Squares
Sum of Squares Total (SST): Total variability of all observations around the grand mean.
Sum of Squares Between (SSB): Variability of group means around the grand mean, with each group weighted by its number of observations.
Sum of Squares Within (SSW): Variability of observations around their own group mean, summed over all groups.
These three quantities satisfy: SST = SSB + SSW.
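The partition can be checked by hand on a small dataset. The following is a minimal sketch using only the standard library; the three groups and their values are made-up toy data, not taken from the lesson.

```python
from statistics import mean

# Hypothetical toy data: three groups of three observations each.
groups = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 3.0, 4.0],
    "c": [6.0, 7.0, 8.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# SST: squared deviations of every observation from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

# SSB: squared deviation of each group mean from the grand mean,
# weighted by the number of observations in that group.
ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# SSW: squared deviations of each observation from its own group mean.
ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(f"SST={sst}, SSB={ssb}, SSW={ssw}")  # SST=48.0, SSB=42.0, SSW=6.0
```

Here most of the total variability (42 of 48) sits between groups, which is the pattern that produces a large F.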
The F-Statistic
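The F-statistic compares between-group and within-group variability, normalized by degrees of freedom:
F = MSB / MSW = [SSB / (k − 1)] / [SSW / (N − k)]
Here k is the number of groups and N is the total number of observations; MSB and MSW are the between-group and within-group mean squares.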
A large F means between-group variability dominates within-group variability, which is evidence against the null hypothesis that all group means are equal. If p ≤ α, reject the null: at least one group mean differs.
In Python: scipy.stats.f_oneway(group1, group2, group3, ...)
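As a usage sketch, the call might look like the following. The three groups are made-up toy data, and the significance level `alpha = 0.05` is an assumption chosen for illustration.

```python
from scipy.stats import f_oneway

# Hypothetical toy data: three groups of three observations each.
group1 = [1.0, 2.0, 3.0]
group2 = [2.0, 3.0, 4.0]
group3 = [6.0, 7.0, 8.0]

# f_oneway returns a result object with the F-statistic and its p-value.
result = f_oneway(group1, group2, group3)
f_stat, p_value = result.statistic, result.pvalue

alpha = 0.05  # assumed significance level
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
if p_value <= alpha:
    print("Reject the null: at least one group mean differs.")
else:
    print("Fail to reject the null.")
```

For these numbers the hand calculation gives MSB = 42 / 2 = 21 and MSW = 6 / 6 = 1, so F = 21, which matches the statistic returned by the function.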
Figure: vertical colored bars mark the group means, the dashed line marks the grand mean, and brackets show the between-group distance.
| Source | SS | df | MS |
|---|---|---|---|
| Between groups (SSB) | 2400.0 | 2 | 1200.00 |
| Within groups (SSW) | 354.5 | 33 | 10.74 |
| Total (SST) | 2754.5 | 35 | — |
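The table stops at the mean squares, so the F-statistic it implies can be worked out from its rows. A minimal sketch of that arithmetic, using only the SS and df values shown in the table (the variable names are illustrative):

```python
# Values copied from the ANOVA table.
ss_between, df_between = 2400.0, 2
ss_within, df_within = 354.5, 33

# Mean square = sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# The degrees of freedom imply the design:
# df_between = k - 1 and df_within = N - k.
k = df_between + 1
n_total = df_within + k

print(f"MSB = {ms_between:.2f}, MSW = {ms_within:.2f}, F = {f_stat:.2f}")
# MSB = 1200.00, MSW = 10.74, F = 111.71
print(f"k = {k} groups, N = {n_total} observations")
# k = 3 groups, N = 36 observations
```

An F above 100 on (2, 33) degrees of freedom means the between-group mean square is more than a hundred times the within-group mean square, so the group differences here are far larger than the noise.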
Interactive demo: adjust the means and spreads of three groups and watch SSB, SSW, and the F-statistic update. Group separation relative to within-group spread determines F.
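The relationship between separation, spread, and F can also be checked numerically. The following is a minimal sketch using only the standard library; `f_statistic` and `make_groups` are hypothetical helpers written for this illustration, and the groups are deterministic toy data (three points per group) rather than random samples.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F for a list of groups (each a list of numbers)."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n_total = len(groups), len(values)
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k))


def make_groups(means, spread):
    """Hypothetical helper: each group is [m - spread, m, m + spread]."""
    return [[m - spread, m, m + spread] for m in means]


# Same spread, means further apart: F goes up.
baseline = f_statistic(make_groups([0.0, 1.0, 2.0], spread=1.0))
wider_means = f_statistic(make_groups([0.0, 2.0, 4.0], spread=1.0))

# Same means, more spread inside each group: F goes down.
wider_spread = f_statistic(make_groups([0.0, 1.0, 2.0], spread=2.0))

print(baseline, wider_means, wider_spread)  # 3.0 12.0 0.75
```

Doubling the distance between means quadruples F, and doubling the within-group spread cuts it to a quarter, because both SSB and SSW are built from squared deviations.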
You run a one-way ANOVA comparing three ML models on the same metric and get F = 0.8, p = 0.46. What can you conclude?