From Notebook to the World

Most failed ML projects don't fail because the model was bad. They fail because nobody built the operations layer, or because the model reached production and quietly broke while nobody was watching. A model without an operations layer is a project. A model with one is a product.

The MLOps Lifecycle

Phase 1

ML Design

Requirements, use case prioritization, data acquisition, problem framing.

This is where the question gets defined. A well-scoped problem statement with clear, measurable success criteria is one of the strongest predictors of whether a model ever ships.

Key question

What are we building and why?

Phase 2

Model Development

Data prep, feature engineering, training, experimentation, evaluation.

The part most modelers spend their time on. Features are engineered, experiments are tracked, and models are evaluated until one is good enough to ship.

Key question

Does the model actually work?
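One way to make "good enough to ship" concrete is to check a candidate model's evaluation metrics against the success criteria agreed on in Phase 1. The sketch below is illustrative only: the SuccessCriteria class, the good_enough_to_ship function, and the choice of precision and recall as the gating metrics are assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriteria:
    """Hypothetical thresholds agreed on during ML Design (Phase 1)."""
    min_precision: float
    min_recall: float


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def good_enough_to_ship(y_true, y_pred, criteria):
    """Return True only if every success criterion is met on held-out data."""
    precision, recall = precision_recall(y_true, y_pred)
    return precision >= criteria.min_precision and recall >= criteria.min_recall


# Held-out labels and predictions from one experiment (made-up values).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(good_enough_to_ship(y_true, y_pred, SuccessCriteria(0.6, 0.6)))
```

Writing the gate as a function of explicit criteria keeps the Phase 2 question tied to the Phase 1 definition of success, rather than to whatever number the latest experiment happened to produce.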

Phase 3

Operations

Deployment, CI/CD, monitoring, triggered retraining.

The model runs forever; the training run happened once. This phase is where most production engineering time goes — and where models quietly break without anyone noticing.

Key question

Is it still working in prod?
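The monitoring and triggered-retraining steps listed above can be sketched as a periodic check on a production metric. This is a minimal sketch under stated assumptions: the function names, the choice of false positive rate as the monitored metric, and the max_ratio threshold are all hypothetical, and a real system would also need delayed ground-truth labels and alerting around it.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), computed over the true negatives only."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def should_retrain(baseline_fpr, y_true, y_pred, max_ratio=2.0):
    """Flag retraining when the current FPR exceeds max_ratio times the baseline.

    baseline_fpr is the rate measured at deployment time; max_ratio is an
    assumed tolerance chosen for illustration, not a recommended value.
    """
    current_fpr = false_positive_rate(y_true, y_pred)
    return current_fpr > max_ratio * baseline_fpr


# Latest window of labelled production traffic (made-up values):
# ten legitimate cases, three of which the model flagged.
window_true = [0] * 10
window_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(should_retrain(baseline_fpr=0.1, y_true=window_true, y_pred=window_pred))
```

Run on a schedule, a check like this turns "is it still working in prod?" into a question the system asks itself instead of one that waits for someone to notice.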

In practice, Phase 3 often consumes more engineering time than Phases 1 and 2 combined: training happens once, but the production system has to keep running.

Checkpoint

Six months after deploying a fraud detection model, the ops team reports that the false positive rate has tripled: the model now incorrectly flags three times as many legitimate transactions as fraudulent as it did at launch. The model hasn't been retrained. What is the most likely explanation?
