Wilcoxon Signed-Rank Test

Question: Does the median of the paired differences between two related groups differ significantly from zero?

Use it when: You have paired data (as with the paired t-test), but the differences are not approximately normally distributed.

How it works:

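  1. Compute the difference between each pair of observations, dropping pairs whose difference is zero.
  2. Rank the absolute differences (ignoring sign), assigning tied values the average of their ranks.
  3. Reattach the original signs to the ranks.
  4. The test statistic W is the smaller of two sums: the sum of the positive ranks and the sum of the absolute values of the negative ranks.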
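
The four steps above can be sketched in plain Python. This is an illustrative sketch, not SciPy's implementation: the function name `signed_rank_statistic` and the sample numbers are hypothetical, and rounding the differences to 10 decimals is an assumption made here so that floating-point noise does not break ties.

```python
def signed_rank_statistic(before, after):
    """Return (W, W_plus, W_minus) for two paired samples.

    Hypothetical helper for illustration; it only computes the statistic,
    not a p-value.
    """
    # Step 1: paired differences (after - before); zero differences are dropped.
    # Rounding to 10 decimals is an assumption to suppress float noise.
    diffs = [round(a - b, 10) for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]

    # Step 2: rank the absolute differences; tied values share their average rank.
    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        first = abs_sorted.index(value) + 1   # 1-based rank of the first occurrence
        count = abs_sorted.count(value)       # how many values are tied
        return first + (count - 1) / 2

    # Steps 3 and 4: group the ranks by the sign of the difference and sum them.
    w_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus


# Made-up data: one zero difference (dropped) and two tied differences of -1.
before = [10, 12, 9, 14, 11, 8]
after = [13, 11, 9, 18, 13, 7]
w, w_plus, w_minus = signed_rank_statistic(before, after)
print(w, w_plus, w_minus)
```

A quick sanity check is that `w_plus + w_minus` equals n(n + 1)/2, where n is the number of non-zero differences.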

In Python: scipy.stats.wilcoxon(before, after)
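
A minimal usage sketch of that call follows. The data are hypothetical and chosen only for illustration. SciPy forms the differences as `before - after`, so with `alternative="less"` the one-sided test asks whether the values tended to go up.

```python
from scipy.stats import wilcoxon

# Hypothetical paired measurements, for illustration only.
before = [10.0, 12.0, 9.0, 14.0, 11.0, 8.0, 15.0]
after = [13.5, 11.0, 11.5, 18.0, 13.0, 9.5, 20.0]

# Two-sided test (the default): are the paired differences centred on zero?
result = wilcoxon(before, after)
print(result.statistic, result.pvalue)

# One-sided test: "less" means before - after tends to be negative,
# i.e. the values tended to increase.
result_up = wilcoxon(before, after, alternative="less")
print(result_up.pvalue)
```

For the two-sided test, the reported statistic is the smaller of the two rank sums, matching the definition of W above.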

Wilcoxon Signed-Rank — Step-by-Step
Step 1 — Compute differences

For each pair, compute After − Before. A positive difference means the value went up; a negative difference means it went down. Pairs whose difference is exactly 0 are excluded from the test.

Pair  Before (hrs)  After (hrs)  Difference
1     6.1           7.4          +1.3
2     5.5           6.8          +1.3
3     7.2           7.5          +0.3
4     6.8           7.9          +1.1
5     5.0           6.1          +1.1
6     6.5           6.4          -0.1
7     7.0           8.2          +1.2
8     5.8           7.0          +1.2
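
Step 1 can be reproduced on the table's values with a few lines of Python. This is a sketch: the variable names are hypothetical, and rounding to one decimal is an assumption made here to match the table's precision and to avoid floating-point noise such as 1.3000000000000007.

```python
# Values copied from the table (hours).
before = [6.1, 5.5, 7.2, 6.8, 5.0, 6.5, 7.0, 5.8]
after = [7.4, 6.8, 7.5, 7.9, 6.1, 6.4, 8.2, 7.0]

# After - Before for each pair, rounded to one decimal (assumption, see above).
differences = [round(a - b, 1) for b, a in zip(before, after)]

# Pairs with a difference of exactly 0 are excluded from the test.
nonzero = [d for d in differences if d != 0]

n_up = sum(d > 0 for d in nonzero)     # pairs where the value went up
n_down = sum(d < 0 for d in nonzero)   # pairs where the value went down
n_dropped = len(differences) - len(nonzero)

print(differences)
print(f"{n_up} increased, {n_down} decreased, {n_dropped} dropped")
```

Without the rounding, the two +1.3 differences would not compare as equal in floating point, which would change how ties are ranked in the next step.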


Checkpoint

You measure user engagement scores before and after a UI redesign for 50 users. The scores are skewed with some large outliers (a few power users with very high engagement). Which test is most appropriate?