Distributions Across the ML Stack

Rather than memorizing every formula, you should focus on recognizing which distribution describes your situation. The skill that pays off: seeing a problem and knowing what shape the uncertainty should take.

◆

Where Each Distribution Lives in Practice

Bernoulli / Binomial: Underlie logistic regression and any binary classification problem. Click-through rates, conversion rates, fraud detection, disease diagnosis.
Poisson: Count-based features and outcomes. Call volumes, request counts, defect counts, rare event modeling. Also the foundation for Poisson regression.
Normal: The assumption behind many classical parametric tests. Residuals of linear regression (when assumptions hold). Weight initialization in neural networks.
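Exponential: Survival analysis, churn prediction, reliability modeling, time-to-failure. The simplest parametric survival model, assuming a constant hazard rate; the Cox proportional hazards model relaxes this by leaving the baseline hazard unspecified.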
Uniform: Random weight initialization, A/B test group assignment, Monte Carlo simulation, and the raw random draws that are thresholded to produce dropout masks (the masks themselves are Bernoulli).
t-distribution: Hypothesis tests about means when population variance is unknown. Confidence intervals on regression coefficients.
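
To make the list above concrete, here is a minimal sketch that draws samples from each distribution with NumPy and prints their sample mean and variance. The parameter values (p=0.1, lam=3.0, scale=2.0, df=5, and so on), the seed, and the sample size are arbitrary choices for illustration, not values taken from this lesson.

```python
import numpy as np

# Seed and sample size are arbitrary; they only make the run reproducible.
rng = np.random.default_rng(seed=0)
n = 100_000

# One draw per distribution in the list above (all parameters are illustrative).
samples = {
    "bernoulli": rng.binomial(n=1, p=0.1, size=n),       # single yes/no event
    "binomial": rng.binomial(n=20, p=0.1, size=n),       # successes in 20 trials
    "poisson": rng.poisson(lam=3.0, size=n),             # events per period
    "normal": rng.normal(loc=0.0, scale=1.0, size=n),    # symmetric noise
    "exponential": rng.exponential(scale=2.0, size=n),   # waiting time, mean 2
    "uniform": rng.uniform(low=0.0, high=1.0, size=n),   # no preferred value
    "t": rng.standard_t(df=5, size=n),                   # heavier tails than normal
}

for name, x in samples.items():
    print(f"{name:12s} mean={x.mean():.3f}  var={x.var():.3f}")
```

Comparing the printed values with the theoretical ones (for example, a Poisson's mean and variance should both be close to `lam`) is a quick way to build intuition for each shape.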

A useful heuristic for choosing:

Is the outcome binary? → Bernoulli (single event) or Binomial (count of successes).
Is the outcome a count per time period? → Poisson.
Is it time to an event? → Exponential (or Weibull for more flexibility).
Is it a continuous measurement, roughly symmetric and unbounded, with no other known structure? → Normal (especially for residuals and errors).
Do you genuinely have no reason to prefer any value? → Uniform.
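
The heuristic above can be sketched as a small lookup function. The function name `suggest_distribution`, the `kind` labels, and the `n_trials` parameter are hypothetical, invented here for illustration; the result is a starting point to check against the data, not a definitive rule.

```python
def suggest_distribution(kind: str, n_trials: int = 1) -> str:
    """Map a coarse description of the outcome to a starting-point distribution.

    `kind` labels are hypothetical names for the questions in the heuristic.
    """
    if kind == "binary":
        # One event -> Bernoulli; a count of successes over several -> Binomial.
        return "Bernoulli" if n_trials == 1 else "Binomial"
    if kind == "count_per_period":
        return "Poisson"
    if kind == "time_to_event":
        return "Exponential (or Weibull for more flexibility)"
    if kind == "continuous":
        return "Normal"
    if kind == "no_preference":
        return "Uniform"
    raise ValueError(f"unrecognized outcome kind: {kind!r}")


# Examples drawn from the use cases listed earlier in this section.
print(suggest_distribution("binary"))               # is this transaction fraudulent?
print(suggest_distribution("binary", n_trials=50))  # conversions out of 50 visitors
print(suggest_distribution("count_per_period"))     # requests per minute
```

The checks run in the same order as the questions, so the first matching description wins, just as it would when working through the list by hand.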

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting the number of support tickets a customer will submit next month.

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting whether an email is spam.

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting how long a customer will remain subscribed before canceling.
