Unit 2

Statistics

This unit builds a practitioner's statistical toolkit: descriptive foundations, probability distributions, hypothesis testing, power analysis, sampling and class balancing, parametric and nonparametric tests, ANOVA, regression, Bayesian inference, and rigorous model evaluation.

Chapter 1

Practical Statistics

This chapter establishes why statistical thinking matters, introduces the core vocabulary of populations and samples, and frames the unit's central question: how much data do you actually need?

Chapter 2

Probability Distributions

Every outcome you model has a shape. Probability distributions describe that shape mathematically — what values are likely, what values are rare, and what would be genuinely surprising. This chapter builds the vocabulary you need to choose the right distribution for any data science problem.

Chapter 3

The Logic of Hypothesis Testing

Every model comparison, A/B test, and feature evaluation is hypothesis testing. This chapter establishes the formal procedure — hypotheses, test statistics, p-values, error types — and the discipline required to do it honestly: pre-registering your analysis, avoiding p-hacking, and correcting for multiple comparisons.
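As a small illustration of why multiple comparisons need correcting, the sketch below computes the chance of at least one false positive across several independent tests, then applies a Bonferroni correction. The p-values are made up for the example, and Bonferroni is only one of several possible corrections.

```python
# Family-wise error rate (FWER) and Bonferroni correction.
# The p-values below are hypothetical, chosen only for illustration.
alpha = 0.05
p_values = [0.001, 0.004, 0.03, 0.04, 0.2]
m = len(p_values)

# Probability of at least one false positive across m independent tests
# when every null hypothesis is true and each test uses level alpha.
fwer_uncorrected = 1 - (1 - alpha) ** m

# Bonferroni: test each hypothesis at alpha / m instead of alpha.
bonferroni_alpha = alpha / m
significant = [p <= bonferroni_alpha for p in p_values]

print(f"Uncorrected FWER for {m} tests: {fwer_uncorrected:.3f}")
print(f"Per-test threshold after Bonferroni: {bonferroni_alpha:.3f}")
print(f"Significant after correction: {significant}")
```

Two of the five hypothetical results fall below the naive 0.05 threshold but do not survive the corrected one.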

Chapter 4

Statistical Significance and Power Analysis

Power analysis is the principled answer to 'how much data do I need?' This chapter covers the four interlocking components of a power analysis, how to estimate effect size, and how to run a power analysis in Python — so you can give a defensible, quantitative answer to the sample size question.
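A minimal sketch of a sample size calculation, assuming the four components are the usual ones (effect size, significance level, power, and sample size) and that a two-sided, two-sample comparison of means is intended. It uses the normal approximation from the standard library rather than an exact t-based solver, so the result can be slightly lower than what a dedicated power analysis library reports. The function name `n_per_group` is invented for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample test.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (Cohen's d).
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_medium = n_per_group(0.5)  # medium effect
n_small = n_per_group(0.2)   # small effect

print(f"d = 0.5: about {n_medium} per group")
print(f"d = 0.2: about {n_small} per group")
```

Fixing any three of the four components determines the fourth; here effect size, significance level, and power are fixed and the sample size is solved for. Note how sharply the required sample grows as the effect size shrinks.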

Chapter 5

Sampling

Every claim a data scientist makes is a claim about a population, made from a sample. This chapter covers the major sampling methods and connects them to the ML workflows where sampling decisions have direct consequences: train/test splits, cross-validation, and handling class imbalance.

Chapter 6

Class Balancing

A model that predicts 'pizza' every time can be 75% accurate and completely useless: if 75% of the examples are 'pizza', it scores well without ever identifying anything else. This chapter covers why class imbalance breaks naive evaluation, and the practical toolkit (SMOTE, Tomek links, class weights) for building models that actually learn the minority class.
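The accuracy trap can be reproduced in a few lines. The sketch below assumes a hypothetical dataset of 75 'pizza' and 25 'salad' examples (the 'salad' label is invented for illustration) and a model that always predicts the majority class.

```python
# Hypothetical imbalanced dataset: 75% majority class, 25% minority class.
labels = ["pizza"] * 75 + ["salad"] * 25

# A "model" that ignores its input and always predicts the majority class.
predictions = ["pizza"] * len(labels)

# Overall accuracy looks respectable.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall on the minority class shows the model never finds it.
minority_total = sum(y == "salad" for y in labels)
minority_hits = sum(p == "salad" and y == "salad" for p, y in zip(predictions, labels))
minority_recall = minority_hits / minority_total

print(f"Accuracy: {accuracy:.2f}")
print(f"Minority-class recall: {minority_recall:.2f}")
```

Per-class metrics such as recall expose what the single accuracy number hides.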

Chapter 7

Parametric Tests

Parametric tests are powerful — but only when their assumptions hold. This chapter covers the three assumptions you must check every time, then walks through the t-test family: one-sample, independent (Student's and Welch's), and paired — with worked examples from real ML scenarios.

Chapter 8

Nonparametric Tests

When the assumptions of parametric tests break down — and they often do in real data — nonparametric tests provide valid inference. This chapter covers the Wilcoxon signed-rank test, the Mann-Whitney U test, and the chi-square test, along with a decision tree for choosing the right test for any two-group comparison.

Chapter 9

ANOVA

Comparing three or more groups with multiple t-tests inflates your false positive rate. ANOVA tests all groups simultaneously by partitioning variability into between-group and within-group components. This chapter covers one-way ANOVA, the F-statistic, and post hoc tests for identifying which groups differ.
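The partition of variability can be sketched by hand for a tiny example. The three groups below are made-up numbers, and the code computes only the F-statistic; obtaining a p-value would additionally require the F distribution, which is omitted here.

```python
# One-way ANOVA F-statistic computed from first principles.
# The group values are hypothetical, chosen to keep the arithmetic simple.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_statistic = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```

A large F means the group means differ by more than the within-group noise would suggest; it does not say which groups differ, which is the job of post hoc tests.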

Chapter 10

Regression

Regression models the relationship between variables for prediction, for understanding effects, and for hypothesis testing on coefficients. This chapter builds linear regression from the ground up using house price prediction — starting with the assumptions that make OLS valid, then working through residuals, coefficients, and multiple predictors.
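For a single predictor, the OLS coefficients and residuals can be computed directly from their definitions. The sketch below uses invented numbers standing in for house size and price; it is not the chapter's dataset.

```python
# Simple linear regression (one predictor) fitted by ordinary least squares.
# x and y are hypothetical stand-ins for house size and price.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# slope = covariance(x, y) / variance(x); the intercept forces the line
# through the point (mean_x, mean_y).
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residuals: observed minus fitted values.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

With an intercept in the model, OLS residuals always sum to zero, which makes a quick sanity check on any fit.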

Chapter 11

Bayesian Statistics

Bayesian statistics asks a different question than frequentist methods: given the data, how should I update my beliefs? This chapter introduces Bayes' Theorem, the prior-likelihood-posterior framework, and the practical ML contexts where Bayesian thinking provides a real advantage.
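The prior-likelihood-posterior update can be shown with a single application of Bayes' Theorem. The spam-filter numbers below are hypothetical and serve only to make the arithmetic concrete.

```python
# Bayes' Theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
# All probabilities are hypothetical illustration values.
prior_spam = 0.20        # prior: share of messages that are spam
p_word_given_spam = 0.60  # likelihood of the word appearing in spam
p_word_given_ham = 0.05   # likelihood of the word appearing in non-spam

# Evidence: total probability of seeing the word in any message.
p_word = p_word_given_spam * prior_spam + p_word_given_ham * (1 - prior_spam)

# Posterior: updated belief that a message is spam after seeing the word.
posterior_spam = p_word_given_spam * prior_spam / p_word

print(f"Prior P(spam) = {prior_spam:.2f}")
print(f"Posterior P(spam | word) = {posterior_spam:.2f}")
```

Observing the word moves the belief from the prior toward the hypothesis that explains the data better.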

Chapter 12

Model Evaluation

A model with good metrics is not the same as a trustworthy model. This chapter closes the unit with the statistical toolkit for robust model evaluation: goodness-of-fit measures, residual analysis, confidence intervals, Simpson's Paradox, and a checklist that synthesizes every concept from the unit into a practical evaluation workflow.