Workbook

Data Science

Learn fundamental concepts in data science including data storytelling, statistics, and data/ML engineering.


Units

  1. Data Storytelling

    From raw information to actionable insight: how data is represented, sourced, explored, visualized, and prepared for modeling — and the ethical responsibilities that come with it.

  2. Statistics

    This unit builds a practitioner's statistical toolkit: descriptive foundations, hypothesis testing, power analysis, sampling, a range of statistical tests, regression, Bayesian inference, and rigorous model evaluation.

  3. Data/ML Engineering

    Where does your data actually live, how does it get to you, and how does the model you build ever escape your laptop? This unit covers data storage, data pipelines, and ML pipelines — the infrastructure layer that separates a simple class project from a production system.

About

This workbook was created and is maintained by Dr. Brinnae Bent at Duke University for AIPI 510: Data Sourcing for Analytics.

This book is a compilation of best practices, lessons learned, and concepts that are directly applicable when designing and deploying machine learning systems for production.