Distributions Across the ML Stack

Rather than memorizing every formula, you should focus on recognizing which distribution describes your situation. The skill that pays off: seeing a problem and knowing what shape the uncertainty should take.

Where Each Distribution Lives in Practice

  • Bernoulli / Binomial: Underlie logistic regression and any binary classification problem. Click-through rates, conversion rates, fraud detection, disease diagnosis.
  • Poisson: Count-based features and outcomes. Call volumes, request counts, defect counts, rare event modeling. Also the foundation for Poisson regression.
  • Normal: The assumption behind most parametric statistical tests. Residuals of linear regression (when assumptions hold). Initialization weights in neural networks.
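  • Exponential: Survival analysis, churn prediction, reliability modeling, time-to-failure. The simplest parametric survival model, since it assumes a constant hazard rate; Cox proportional hazards models relax this by leaving the baseline hazard unspecified.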
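  • Uniform: Random weight initialization, A/B test group assignment, Monte Carlo simulation. Uniform draws are also the raw material for sampling other distributions, for example thresholding them to produce Bernoulli dropout masks.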
  • t-distribution: Hypothesis tests about means when population variance is unknown. Confidence intervals on regression coefficients.
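To make the list above more concrete, here is a minimal sketch that draws samples from several of these distributions and compares each sample mean with its theoretical value. It uses only the Python standard library; the parameter values (p = 0.3, a Poisson rate of 4, and so on) are arbitrary choices for illustration, and the helper names `bernoulli`, `binomial`, and `poisson` are hypothetical. The t-distribution is left out because the standard library has no sampler for it. In real projects a numerical library would normally supply these samplers.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def bernoulli(p):
    """One binary outcome: 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


def binomial(n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p) for _ in range(n))


def poisson(lam):
    """Event count per interval at average rate lam (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


N = 20_000  # samples per distribution (illustrative)

# Each entry maps a label to (sampler, theoretical mean).
samplers = {
    "Bernoulli(p=0.3)": (lambda: bernoulli(0.3), 0.3),
    "Binomial(n=10, p=0.3)": (lambda: binomial(10, 0.3), 3.0),
    "Poisson(rate=4)": (lambda: poisson(4.0), 4.0),
    "Normal(mu=0, sigma=1)": (lambda: rng.gauss(0.0, 1.0), 0.0),
    "Exponential(rate=0.5)": (lambda: rng.expovariate(0.5), 2.0),
    "Uniform(0, 1)": (lambda: rng.uniform(0.0, 1.0), 0.5),
}

sample_means = {}
for name, (draw, theoretical_mean) in samplers.items():
    sample_means[name] = statistics.fmean(draw() for _ in range(N))
    print(f"{name:<24} sample mean = {sample_means[name]:6.3f}   theory = {theoretical_mean:.3f}")
```

The sample means should land close to the theoretical ones, which is a quick sanity check that a chosen distribution behaves the way you expect before you build a model on top of it.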

A useful heuristic for choosing:

  1. Is the outcome binary? → Bernoulli (single event) or Binomial (count of successes).
  2. Is the outcome a count per time period? → Poisson.
  3. Is it time to an event? → Exponential (or Weibull for more flexibility).
  4. Is it a continuous, roughly symmetric measurement with no other known structure? → Normal (especially for residuals and errors).
  5. Do you genuinely have no reason to prefer any value? → Uniform.
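The heuristic above can be written down as a small decision function. This is only a sketch of the checklist, not a substitute for looking at the data: the function name `suggest_distribution` and the `outcome_kind` labels are hypothetical, introduced here for illustration.

```python
def suggest_distribution(outcome_kind, repeated_trials=False):
    """Return a first-guess distribution family for an outcome.

    outcome_kind is a hypothetical label describing the outcome:
    "binary", "count_per_interval", "time_to_event", "continuous",
    or "no_preference". repeated_trials only matters for binary outcomes.
    """
    if outcome_kind == "binary":
        # Single event vs. count of successes over several trials.
        return "Binomial" if repeated_trials else "Bernoulli"
    if outcome_kind == "count_per_interval":
        return "Poisson"
    if outcome_kind == "time_to_event":
        return "Exponential (or Weibull for more flexibility)"
    if outcome_kind == "continuous":
        return "Normal"
    if outcome_kind == "no_preference":
        return "Uniform"
    raise ValueError(f"Unrecognized outcome kind: {outcome_kind!r}")


# Usage on a few of the examples mentioned earlier in this section.
print(suggest_distribution("binary"))                        # one fraud flag
print(suggest_distribution("binary", repeated_trials=True))  # clicks out of n impressions
print(suggest_distribution("count_per_interval"))            # defects per batch
print(suggest_distribution("time_to_event"))                 # time to failure
```

The checks run in the same order as the numbered questions, so the first matching description decides the answer.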

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting the number of support tickets a customer will submit next month.

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting whether an email is spam.

💭Reflection

Identify the most appropriate distribution for the outcome variable and explain why: predicting how long a customer will remain subscribed before canceling.