The Shape of Uncertainty
Every measurement, every metric, every prediction you'll ever work with has uncertainty baked in. Probability distributions are how we describe that uncertainty mathematically — what values are likely, what values are possible, and what values would be genuinely surprising.
If you understand the right distribution for your problem, you have a model of the world. If you assume the wrong distribution, your statistical conclusions can be quietly, confidently wrong.
Before we get into specific distributions, a few terms you'll see everywhere:
- Probability density function (PDF): For continuous distributions, describes the relative likelihood of a random variable taking a given value. You integrate over a range to get a probability — the PDF value itself isn't a probability.
- Probability mass function (PMF): The discrete counterpart. Gives the exact probability of a specific value. For example, the probability that a Poisson variable equals exactly 3 is a PMF value.
- Cumulative distribution function (CDF): The probability that a random variable takes a value ≤ x. It never decreases, and it rises from 0 to 1 across the full range of the variable.
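The difference between a density, a mass, and a cumulative probability is easier to see with numbers. Below is a minimal sketch using only Python's standard library; the specific distributions (a Normal with standard deviation 0.1, a Poisson with rate 2) are assumptions chosen for illustration, not values from the text.

```python
import math
from statistics import NormalDist

# Assumed example: a narrow Normal distribution (mean 0, std dev 0.1).
narrow = NormalDist(mu=0.0, sigma=0.1)

# A PDF value is a density, not a probability, so it can exceed 1.
density_at_mean = narrow.pdf(0.0)

# Probabilities come from integrating the PDF over a range,
# which is the same as taking a difference of CDF values.
prob_within_one_sigma = narrow.cdf(0.1) - narrow.cdf(-0.1)

# A PMF value is itself a probability: P(K = 3) for an assumed Poisson rate of 2.
lam = 2.0
prob_exactly_three = math.exp(-lam) * lam**3 / math.factorial(3)

print(f"density at the mean:  {density_at_mean:.3f}")
print(f"P(-0.1 <= X <= 0.1):  {prob_within_one_sigma:.3f}")
print(f"P(K = 3):             {prob_exactly_three:.3f}")
```

The density at the mean comes out well above 1, which would be impossible for a probability, while the integrated probability and the PMF value both stay between 0 and 1.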
Distributions are also summarized by two key quantities:
- Mean (expected value): The average value of the random variable, weighted by probabilities.
- Variance: How spread out the distribution is around its mean. The square root of variance is the standard deviation.
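As a small worked example of these two definitions, the sketch below computes the mean and variance of a discrete distribution directly from its PMF. The fair six-sided die is an assumed example, chosen because the arithmetic is easy to check by hand.

```python
import math

# Assumed example: a fair six-sided die, each face with probability 1/6.
pmf = {face: 1 / 6 for face in range(1, 7)}

# Mean: each value weighted by its probability.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean = {mean:.4f}, variance = {variance:.4f}, std dev = {std_dev:.4f}")
```

For a continuous distribution the sums become integrals over the PDF, but the weighting idea is the same.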
Different distributions have different relationships between their parameters and their shape. Understanding those relationships is what lets you identify the right distribution for a given situation.
The Normal distribution is bell-shaped and symmetric, described by its mean μ and standard deviation σ. Shifting μ moves the curve left or right; increasing σ flattens and widens it.
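The effect of μ and σ on the Normal curve can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the parameter values are arbitrary assumptions.

```python
from statistics import NormalDist

base = NormalDist(mu=0.0, sigma=1.0)
shifted = NormalDist(mu=2.0, sigma=1.0)  # same shape, moved to the right
wider = NormalDist(mu=0.0, sigma=2.0)    # same center, more spread

# Shifting mu moves the peak without changing its height.
print(base.pdf(0.0), shifted.pdf(2.0))

# Doubling sigma halves the peak height, because the peak
# of a Normal PDF is 1 / (sigma * sqrt(2 * pi)).
print(base.pdf(0.0), wider.pdf(0.0))
```

The total area under each curve stays at 1, so a wider curve has to be a flatter one.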
[Interactive figure: select a distribution and adjust its parameters to see how the PDF/PMF and CDF change, and how the mean, variance, and mode shift together.]
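Each curve is computed analytically, with no sampling involved. For the Normal distribution:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
For Poisson, each bar is the PMF at integer $k$:
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
For Exponential, $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. For Uniform, $f(x) = \frac{1}{b - a}$ between $a$ and $b$, zero elsewhere.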
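The analytic curves for the four distributions named above can be sketched as plain functions. This is an illustration under stated assumptions, not the code behind the original figure: the function names and the parameter names (`lam` for the rate, `a` and `b` for the Uniform bounds) are hypothetical, and the formulas are the standard textbook forms.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a Normal(mu, sigma) distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable equals the integer k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def exponential_pdf(x: float, lam: float) -> float:
    """Density of an Exponential(lam) distribution; zero for negative x."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a Uniform(a, b) distribution; zero outside [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


# Evaluate curves on a fixed grid: no random sampling anywhere.
xs = [i / 2 for i in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
normal_curve = [normal_pdf(x, mu=0.0, sigma=1.0) for x in xs]
poisson_bars = [poisson_pmf(k, lam=3.0) for k in range(10)]
```

Because every point is evaluated from a closed form, the curves are smooth and reproducible, with none of the noise a histogram of random samples would show.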
The CDF is the running integral of the PDF (or running sum of the PMF) from left to right, which is why it always starts near 0 and climbs to 1. The mean line marks $E[X]$, computed from the closed-form mean for each distribution.
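The running-sum and running-integral view of the CDF can be reproduced in a few lines. The sketch below is an approximation under assumptions: a Poisson rate of 3, a standard Normal, and a trapezoid rule on a grid from -8 to 8 standing in for the exact integral.

```python
import math
from itertools import accumulate
from statistics import NormalDist

# Discrete case: the CDF is the running sum of the PMF.
lam = 3.0  # assumed Poisson rate
pmf = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(30)]
cdf = list(accumulate(pmf))

# The mean of a Poisson variable has the closed form lam;
# the probability-weighted sum agrees with it.
mean = sum(k * p for k, p in enumerate(pmf))

# Continuous case: approximate the running integral of the PDF
# with the trapezoid rule on a fine grid.
dist = NormalDist(mu=0.0, sigma=1.0)
step = 0.001
xs = [-8.0 + i * step for i in range(16001)]  # grid from -8 to 8
ys = [dist.pdf(x) for x in xs]
areas = [(y0 + y1) / 2 * step for y0, y1 in zip(ys, ys[1:])]
numeric_cdf = [0.0] + list(accumulate(areas))

# Compare the numeric integral with the closed-form CDF near x = 1.
i = 9000  # xs[9000] is 1.0 up to floating-point rounding
print(numeric_cdf[i], dist.cdf(1.0))
print(cdf[-1], numeric_cdf[-1], mean)
```

Both running totals start near 0 and end very close to 1, and the numeric integral tracks the closed-form CDF to several decimal places.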