The Chi-Square Test

The tests so far compare means or distributions of continuous data. The chi-square test is fundamentally different — it's for categorical data. Are two categorical variables associated? Does a categorical variable's distribution match what you'd expect?

Chi-Square Test of Independence

Question: Are two categorical variables associated?

Use it when: You have two categorical variables and want to know whether they are related or independent.

Process:

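Build a contingency table of observed frequencies (rows × columns = one cell per category combination).
Compute the expected frequency of each cell under independence: (row total × column total) / grand total.
Compute the chi-square statistic by summing over all cells: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed count and E the expected count.
Compare the statistic to a chi-square distribution with (rows − 1) × (columns − 1) degrees of freedom to obtain a p-value.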

Examples: Is device type associated with conversion? Is gender associated with product preference? Is neighborhood associated with churn?

In Python: scipy.stats.chi2_contingency(contingency_table)
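As a minimal sketch, the call can be applied to the device-by-conversion counts used in the step-by-step walkthrough of this lesson. The variable names are illustrative. By default, `chi2_contingency` applies Yates' continuity correction only to 2×2 tables, so it has no effect on this 3×2 table.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = Mobile, Desktop, Tablet; columns = converted Yes, No.
observed = np.array([
    [45, 155],
    [98, 102],
    [22, 78],
])

# Returns the statistic, p-value, degrees of freedom, and expected counts.
chi2_stat, p_value, dof, expected = chi2_contingency(observed)

print(f"chi-square = {chi2_stat:.2f}, dof = {dof}, p = {p_value:.2e}")
print("Expected counts under independence:")
print(expected)
```

A small p-value here suggests that conversion rate differs by device type; it does not say which device drives the difference.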

Chi-Square Test of Independence — Step-by-Step

Step 1 — Build the observed contingency table

Start with a contingency table — count how many observations fall into each combination of categories. Row totals and column totals let you compute what you'd expect if the variables were truly independent.

Device ↓ / Converted →	Yes	No	Total
Mobile	45	155	200
Desktop	98	102	200
Tablet	22	78	100
Total	165	335	500
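The expected counts and per-cell contributions for this table can be worked out by hand. The following is a minimal pure-Python sketch of that arithmetic, assuming the counts in the table above; the variable names are illustrative.

```python
# Observed counts from the table above: (converted Yes, converted No) per device.
observed = {
    "Mobile": (45, 155),
    "Desktop": (98, 102),
    "Tablet": (22, 78),
}

row_totals = {device: sum(counts) for device, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

chi2_stat = 0.0
for device, counts in observed.items():
    for j, label in enumerate(("Yes", "No")):
        # Expected count under independence: row total * column total / grand total.
        expected = row_totals[device] * col_totals[j] / grand_total
        contribution = (counts[j] - expected) ** 2 / expected
        chi2_stat += contribution
        print(f"{device:8s}{label:4s} O={counts[j]:4d} E={expected:6.1f} "
              f"(O-E)^2/E={contribution:6.2f}")

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (2 - 1)
print(f"chi-square = {chi2_stat:.2f} with {dof} degrees of freedom")
```

The Desktop/Yes cell contributes the most (98 observed against 66 expected), which indicates that desktop users convert more often than independence would predict.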

Chi-Square Goodness-of-Fit

Question: Does the distribution of a single categorical variable match an expected distribution?

Examples: Are customer arrivals uniformly distributed across days of the week? Are dice rolls actually uniform? Does the demographic distribution of your users match the national distribution?

In Python: scipy.stats.chisquare(observed, expected)
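A short sketch of that call, using the weekday counts from the step-by-step walkthrough of this lesson and a uniform expectation of 60 per day. The variable names are illustrative. If `f_exp` is omitted, `chisquare` assumes all categories are equally likely.

```python
from scipy.stats import chisquare

# Observed counts for Mon, Tue, Wed, Thu, Fri (300 observations in total).
observed = [42, 38, 65, 55, 100]

# Expected counts under a uniform distribution; must sum to the same total.
expected = [60, 60, 60, 60, 60]

result = chisquare(f_obs=observed, f_exp=expected)

print(f"chi-square = {result.statistic:.2f}, p = {result.pvalue:.2e}")
```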

Chi-Square Goodness-of-Fit — Step-by-Step

Step 1 — Compare observed counts to expected counts

Compare the observed frequencies to the expected frequencies under your hypothesis. If the distribution matches, each observed count should be close to its expected count.

Figure: bar chart of observed vs. expected counts (uniform, 60 per day) for Mon through Fri.

Category	Observed (O)	Expected (E)	O − E
Mon	42	60.0	-18.0
Tue	38	60.0	-22.0
Wed	65	60.0	+5.0
Thu	55	60.0	-5.0
Fri	100	60.0	+40.0

Total observations: 300
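To see which categories drive the statistic, the per-category contributions (O − E)² / E can be computed directly from the table. This is a minimal sketch under the uniform hypothesis above; it uses `scipy.stats.chi2.sf` for the p-value, with k − 1 degrees of freedom for k categories.

```python
from scipy.stats import chi2

days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
observed = [42, 38, 65, 55, 100]

# Uniform hypothesis: the total is split evenly across the categories.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Contribution of each category to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_stat = sum(contributions)

# Degrees of freedom for goodness-of-fit: number of categories minus one.
dof = len(observed) - 1
p_value = chi2.sf(chi2_stat, dof)

for day, o, e, c in zip(days, observed, expected, contributions):
    print(f"{day}: O={o:3d} E={e:5.1f} contribution={c:5.2f}")
print(f"chi-square = {chi2_stat:.2f}, dof = {dof}, p = {p_value:.2e}")
```

Friday alone accounts for roughly two thirds of the statistic, so the departure from uniformity is mostly a Friday effect.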

Assumptions That Get Missed

Independence of observations.
Sufficient sample size: Ideally, every cell of the contingency table has an expected count of at least 5. A commonly used relaxation is that fewer than 20% of cells have expected counts below 5.
If cells have very low expected counts, use Fisher's exact test instead — it doesn't rely on the chi-square approximation and works well with small samples.
Mutually exclusive categories: Each data point must fit into exactly one category.
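The sample-size assumption can be checked before running the test. The sketch below is one possible way to do it: `low_expected_share` is a hypothetical helper, and `small_table` is made-up data for illustration only. It uses `scipy.stats.contingency.expected_freq` to get the expected counts and `scipy.stats.fisher_exact`, shown here on a 2×2 table.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact
from scipy.stats.contingency import expected_freq


def low_expected_share(table):
    """Return the fraction of cells whose expected count is below 5."""
    expected = expected_freq(np.asarray(table))
    return float((expected < 5).mean())


# Hypothetical 2x2 table with small counts, made up for illustration.
small_table = [[3, 7], [8, 2]]
share = low_expected_share(small_table)
print(f"Share of cells with expected count < 5: {share:.0%}")

if share >= 0.20:
    # Too many sparse cells: fall back to Fisher's exact test.
    odds_ratio, p_value = fisher_exact(small_table)
    print(f"Fisher's exact test: p = {p_value:.3f}")
else:
    chi2_stat, p_value, dof, _ = chi2_contingency(small_table)
    print(f"Chi-square test: p = {p_value:.3f}")
```

The 20% threshold mirrors the rule of thumb above; it is a convention, not a hard cutoff.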

Checkpoint

You want to test whether users from different countries (US, UK, Germany, France) have different rates of opting into push notifications (Yes/No). You build a 4×2 contingency table. One cell has an expected count of 3. What should you do?
