Chapter 12

Model Evaluation

A model with good metrics is not the same as a trustworthy model. This chapter closes the unit with the statistical toolkit for robust model evaluation: goodness-of-fit measures, residual analysis, confidence intervals, Simpson's Paradox, and a checklist that synthesizes every concept from the unit into a practical evaluation workflow.

1. How Good Is My Model, Actually?
2. Goodness-of-Fit: R², AIC, and BIC
3. Residual Analysis and Confidence Intervals
4. Simpson's Paradox
5. The Evaluation Checklist