What Statistics Buys You

Whether you are a data scientist, a research engineer, an ML engineer, an AI engineer, or some title that has yet to be invented, your job will surely have some science in it. And science means hypotheses, experiments, evidence, and the discipline of saying "I'm not sure" when you're not sure. Statistics is the language we use to do that work honestly.

Here's something worth sitting with: across careers in this field, statistics is almost always the differentiator. Not the most fashionable framework, not the latest model architecture — statistics. The colleagues who could think clearly about a hypothesis, design an experiment, and tell you whether a result actually meant anything were the ones whose work held up. They were often the only ones in the room who could do it.

Three Places Statistics Shows Up in Your Work

  • Data analysis: Before you train a single model, you're computing means, standard deviations, distributions, and outliers. Statistical descriptions are the language of exploratory data analysis.
  • Hypothesis testing: Every claim — "this model is better," "this feature matters," "Group A behaves differently than Group B" — is a statistical claim. There are rigorous ways to make them and sloppy ways. This unit is about the rigorous way.
  • Model evaluation: "The number went up" is not the same as "the number went up meaningfully." Statistics is what makes that distinction concrete.
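To make the last two bullets concrete, here is a minimal sketch of one way to ask whether a one-point accuracy gain is distinguishable from noise. Everything in it is an assumption made for illustration: the counts are invented, the helper name two_proportion_z_test is hypothetical, and the test treats the two models as if they were scored on independent test sets. When both models are scored on the same examples, a paired method such as McNemar's test is usually the better fit. The sketch uses only the standard library so it runs anywhere.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two independent proportions.

    Hypothetical helper for illustration; assumes the two samples are
    independent and large enough for the normal approximation.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented counts: the same 93% vs. 94% accuracies at two test-set sizes.
z_small, p_small = two_proportion_z_test(930, 1_000, 940, 1_000)
z_large, p_large = two_proportion_z_test(93_000, 100_000, 94_000, 100_000)

print(f"n = 1,000 per model:   z = {z_small:.2f}, p = {p_small:.3f}")
print(f"n = 100,000 per model: z = {z_large:.2f}, p = {p_large:.3g}")
```

Under these assumptions, the same one-point gap is well within noise on 1,000 examples per model and far outside it on 100,000. The size of the gain has not changed; only the evidence behind it has.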

This unit is partly remedial — most of you have seen some of this before — and partly transformative. We're going to take statistics seriously as a practical tool for the work you'll actually do: shipping models, running experiments, justifying decisions, and answering the question that gets asked in every meeting you'll ever attend, which is "how do we know?"

A few notes on how this unit is built:

  • It's practical first. Deep-theory statistics courses exist and are worth taking. This isn't one of them. We stay close to the questions you'll be asked at work.
  • Real-world consequences anchor everything. Almost every chapter ends with concrete applications. If you can't picture using the concept, you don't really have it yet.
  • Code is part of the curriculum. Python — particularly scipy.stats, statsmodels, and imbalanced-learn — handles most of the heavy computation. We'll point you to the libraries where they matter.

Checkpoint

A junior data scientist says: "The accuracy of our new model is 94%, which is higher than the old model's 93%, so we should ship it." What is the most important thing missing from this argument?