Bayes' Theorem
Bayes' Theorem describes how to update the probability of a hypothesis based on new evidence:
P(A | B) = P(B | A) × P(A) / P(B)
where:
- P(A | B) — Posterior: Probability of hypothesis A given evidence B. Your updated belief after seeing data.
- P(B | A) — Likelihood: Probability of observing evidence B given that A is true. How well does the hypothesis explain the data?
- P(A) — Prior: Probability of A before seeing any evidence. Your initial belief.
- P(B) — Marginal likelihood (evidence): Probability of observing B under any hypothesis. Acts as a normalization constant.
In plain language: posterior ∝ likelihood × prior. Your updated belief is your initial belief, scaled by how well the hypothesis explains what you observed.
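As a minimal sketch of the "posterior ∝ likelihood × prior" rule, the snippet below multiplies each prior by its likelihood and then divides by the sum, which plays the role of P(B). The function name `posterior`, the dictionary layout, and the two-coin example are illustrative assumptions, not part of the original text.

```python
def posterior(priors, likelihoods):
    """Return P(hypothesis | evidence) for each hypothesis.

    priors:      dict mapping hypothesis -> P(hypothesis); assumed to sum to 1
    likelihoods: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    # Unnormalized posterior: likelihood × prior for each hypothesis.
    unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
    # Marginal likelihood P(B): total probability of the evidence.
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    # Dividing by P(B) makes the posterior probabilities sum to 1.
    return {h: u / evidence for h, u in unnormalized.items()}


# Hypothetical example: a fair coin vs. a coin that lands heads 80% of the time,
# with equal prior belief in each, after observing a single "heads".
result = posterior(
    priors={"fair": 0.5, "biased": 0.5},
    likelihoods={"fair": 0.5, "biased": 0.8},  # P(heads | hypothesis)
)
print(result)
```

After one observed head, belief shifts toward the biased coin because that hypothesis explains the observation better, while the normalization keeps the two posteriors summing to 1.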
[Interactive figure: each dot represents one person; indigo dots are true positives and amber dots are false positives. The controls adjust the prior probability (prevalence), test sensitivity, and specificity, and the display shows the resulting posterior: the positive predictive value (PPV), the probability that you actually have the condition if you test positive.]
With only 1% prevalence, most positive tests are false positives: in the figure's example, even a very accurate test has a PPV of only 16%. Prevalence (the prior) dominates the posterior.
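The dependence of PPV on prevalence can be sketched in a few lines of Python. The function name `positive_predictive_value` is hypothetical, and the 95% sensitivity and 95% specificity used here are an assumption: they are one combination that gives a PPV of about 16% at 1% prevalence, not necessarily the settings behind the figure quoted above.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(condition | positive test), computed with Bayes' Theorem."""
    # True positives: people with the condition who test positive.
    true_pos = sensitivity * prevalence
    # False positives: people without the condition who test positive.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total_pos = true_pos + false_pos
    if total_pos == 0:
        raise ValueError("a positive test has zero probability with these inputs")
    return true_pos / total_pos


# Assumed test characteristics (illustrative only): 95% sensitivity, 95% specificity.
for prevalence in (0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prevalence, sensitivity=0.95, specificity=0.95)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}")
# prevalence=1%  PPV=16.1%
# prevalence=10%  PPV=67.9%
# prevalence=50%  PPV=95.0%
```

Holding the test fixed and changing only the prevalence moves the PPV from roughly 16% to 95%, which is the sense in which the prior dominates the posterior for rare conditions.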
Intuitive Example: Medical Testing
A disease affects 1% of the population. A test is 99% accurate in both directions: its sensitivity is P(positive | disease) = 0.99 and its specificity is P(negative | no disease) = 0.99. You test positive. What's the probability you have the disease?
Intuition says ~99%. Bayes says something different:
- P(disease) = 0.01 (prior)
- P(positive | disease) = 0.99 (likelihood)
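- P(positive | no disease) = 1 − 0.99 = 0.01 (false positive rate)
- P(positive) = P(positive | disease) × P(disease) + P(positive | no disease) × P(no disease) = 0.99 × 0.01 + 0.01 × 0.99 = 0.0198 (marginal likelihood)
- P(disease | positive) = (0.99 × 0.01) / 0.0198 = 0.0099 / 0.0198 = 0.50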
A 99%-accurate test on a 1%-prevalence disease yields only ~50% probability of disease given a positive result. The low prior (rare disease) pulls against the high likelihood. This is why medical screening relies on prevalence data, not just test accuracy.
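The same result can be checked by counting people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 (a number chosen only so that every count is a whole number) and applies the 1% prevalence and 99% accuracy from the example.

```python
# Hypothetical population size, chosen so that all counts are whole numbers.
population = 10_000

sick = population * 1 // 100          # 1% prevalence -> 100 people
healthy = population - sick           # 9,900 people

true_positives = sick * 99 // 100     # 99% sensitivity -> 99 people
false_positives = healthy * 1 // 100  # 1% false positive rate -> 99 people

# Among everyone who tests positive, the fraction who are actually sick.
ppv = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives}")
print(f"false positives: {false_positives}")
print(f"P(disease | positive) = {ppv:.2f}")
```

The healthy group is 99 times larger than the sick group, so its 1% false positive rate produces as many positive results (99) as the 99% detection rate does among the sick, and a positive test is a coin flip between the two.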
In the Bayesian framework, what is the role of the prior — and what happens to its influence as data accumulates?