Type 1 and Type 2 Errors

Any time you run a hypothesis test, you can be wrong in two distinct ways. Understanding these errors — and which one you care about more — is one of the most important practical skills in this unit.

A Type 1 error is a false positive: you rejected a true null hypothesis. You concluded there was an effect when there wasn't one.

A Type 2 error is a false negative: you failed to reject a false null hypothesis. There really was an effect, and you missed it.
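To make the two definitions concrete, here is a minimal simulation sketch. It is not part of the original lesson's material: the two-sided z-test with known standard deviation, the significance level `alpha = 0.05`, the sample size, and the true effect of 0.5 are all assumptions chosen for illustration. It runs the same test many times, once in a world where the null is true and once where it is false, and counts how often each kind of mistake occurs.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test with known sigma. Returns True if H0 (mean == mu0) is rejected."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > z_crit

def rejection_rate(true_mean, n=30, trials=5000, seed=0):
    """Fraction of simulated experiments in which H0 (mean == 0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        rejections += z_test_rejects(sample)
    return rejections / trials

# H0 is true (no effect): every rejection is a Type 1 error.
type1_rate = rejection_rate(true_mean=0.0)

# H0 is false (assumed true effect of 0.5): every non-rejection is a Type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type 1 error rate (false positives): {type1_rate:.3f}")
print(f"Type 2 error rate (false negatives): {type2_rate:.3f}")
```

Under these assumptions the Type 1 rate should land near `alpha`, while the Type 2 rate depends on the effect size and sample size you picked.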

[Figure: Type 1 vs. Type 2 error]

✦

A Mnemonic That Sticks

Imagine you're trying to remember if it's someone's birthday.

Type 1 error: You say "happy birthday" — and it's not their birthday. (False positive.)
Type 2 error: You say nothing — and it is their birthday. (False negative.)

Which is worse depends entirely on who the person is. For a colleague you barely know, a Type 1 error (a mistaken "happy birthday") is mildly awkward. For your partner, a Type 2 error (forgetting) could be catastrophic!

Let's anchor this with a higher-stakes example: you build a machine learning model that detects cancer.

A Type 1 error means your model tells a patient they have cancer when they don't. They may face unnecessary biopsies and treatments, along with serious psychological distress.
A Type 2 error means your model fails to detect cancer that's actually there. The patient doesn't receive treatment, and the disease progresses.

If you have to favor one, you'd rather have the false alarm. Missing a real cancer is far worse than triggering a follow-up test. This shapes everything: the decision threshold, the loss function, which metric you optimize for.
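A small sketch of how the decision threshold shifts the balance between the two errors. The scores and labels below are made-up illustrative values, not real data, and `confusion_counts` is a hypothetical helper written for this example. Lowering the threshold flags more patients, which trades false negatives for false positives.

```python
def confusion_counts(scores, labels, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn

# Hypothetical model outputs: predicted probability of cancer, and the true label
# (1 = cancer present, 0 = no cancer). Values are invented for illustration.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]
labels = [0,    0,    0,    1,    0,    0,    1,    0,    1,    1]

for threshold in (0.5, 0.25):
    fp, fn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: Type 1 (false alarms)={fp}, Type 2 (missed cases)={fn}")
```

On this toy data, dropping the threshold from 0.5 to 0.25 removes the one missed case at the price of two extra false alarms.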

⚠

This Trade-off Is Everywhere in ML

Fraud detection skews the same direction: a missed fraud (Type 2) is usually worse than a false flag (Type 1) that a human can review. Medical screening tests are deliberately tuned to tolerate more Type 1 errors (false positives) so that fewer real cases are missed. Content moderation may trade off differently depending on the platform's values.

Your choice of decision threshold and your primary evaluation metric both encode an implicit answer to the Type 1 / Type 2 trade-off. Make that choice deliberately — don't let it happen by default!
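One way to make that choice explicit is to write down what each error costs and pick the threshold that minimizes total cost on held-out data. The sketch below assumes you can put rough numbers on those costs. The data, the cost values, and the helper names `total_cost` and `best_threshold` are all hypothetical and exist only for illustration.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of flagging every score >= threshold, given per-error costs."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return cost_fp * fp + cost_fn * fn

def best_threshold(scores, labels, cost_fp, cost_fn):
    """Try each observed score (plus 'never flag') and return the cheapest threshold."""
    candidates = sorted(set(scores)) + [1.01]  # 1.01 flags nothing
    return min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))

# Invented validation scores and labels (1 = positive case, 0 = negative case).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]
labels = [0,    0,    0,    1,    0,    0,    1,    0,    1,    1]

# A miss is 10x as costly as a false alarm: the chosen threshold drops.
miss_is_costly = best_threshold(scores, labels, cost_fp=1, cost_fn=10)

# A false alarm is 5x as costly as a miss: the chosen threshold rises.
alarm_is_costly = best_threshold(scores, labels, cost_fp=5, cost_fn=1)

print(f"Type 2 errors costly: threshold = {miss_is_costly}")
print(f"Type 1 errors costly: threshold = {alarm_is_costly}")
```

The same model and the same data yield different thresholds once the relative costs change, which is the sense in which the threshold encodes your answer to the trade-off.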

Checkpoint

You're building a model to detect critical equipment failures in a factory. A missed failure (no alert when failure is imminent) could cause a catastrophic accident. A false alarm (alert when no failure is coming) causes a brief, costly shutdown. Which error type should you minimize, and what does that imply about your threshold?
