Cross-Validation
Cross-validation is a more data-efficient approach to robust evaluation than a single hold-out split. It matters most with small datasets, where you can't afford to set aside 20% just for validation.
K-fold cross-validation splits your data into k folds of (roughly) equal size. Each fold in turn serves as the test set while the remaining k−1 folds are used for training. You train and evaluate k separate times, then aggregate the metrics, usually by averaging them across folds. Common values: k = 5 or 10.
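The procedure above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the helper names `k_fold_indices` and `cross_validate`, the toy "predict the training mean" model, and the synthetic data are all assumptions made for the example. In practice a library such as scikit-learn provides equivalent, better-tested splitters.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds are random
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(X, y, k, fit, score, seed=0):
    """Train and evaluate k times; return the list of per-fold scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return scores


# Toy usage (hypothetical model): always predict the training mean, score by MSE.
def fit_mean(X_train, y_train):
    return mean(y_train)


def mse(model, X_test, y_test):
    return mean((yi - model) ** 2 for yi in y_test)


X = list(range(20))
y = [2.0 * x + 1.0 for x in X]
fold_scores = cross_validate(X, y, k=5, fit=fit_mean, score=mse)
print("per-fold MSE:", fold_scores)
print("mean MSE:", mean(fold_scores))
```

Reporting the per-fold scores alongside their mean is useful because the spread across folds gives a rough sense of how stable the estimate is.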
Stratified k-fold applies the same idea but ensures each fold's class distribution matches the overall dataset — critical for imbalanced classification where a random split might leave one fold with no minority-class examples at all.
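One simple way to get stratified folds is to deal each class's samples out to the folds in turn, like dealing cards. The sketch below assumes that approach; the function name `stratified_k_fold_indices` and the 95/5 toy label set are illustrative assumptions, and real libraries may assign samples differently.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold_indices(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full dataset."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class round-robin across the folds. The running offset keeps
    # total fold sizes balanced when class counts are not multiples of k.
    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        for position, idx in enumerate(members):
            folds[(offset + position) % k].append(idx)
        offset += len(members)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train_idx, test_idx


# Toy usage: 100 samples, 5 of them in the minority class (label 1).
labels = [0] * 95 + [1] * 5
for fold_number, (train_idx, test_idx) in enumerate(stratified_k_fold_indices(labels, k=5), 1):
    print(fold_number, Counter(labels[i] for i in test_idx))
```

With this label set, every test fold ends up with exactly one minority-class sample, which a purely random split does not guarantee.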
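Leave-one-out (LOOCV) sets k equal to the number of samples, so each "fold" is a single observation. Because every model trains on all but one sample, the resulting estimate has low bias, but it can have high variance and it requires one model fit per sample, which makes it extremely computationally expensive. Use it when your dataset is small and you can afford the compute.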
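To make the cost of LOOCV concrete, here is a small sketch in which the "model" is refit once per sample. The helper `leave_one_out_indices`, the five-value dataset, and the mean-predictor model are assumptions chosen only to keep the example short.

```python
from statistics import mean


def leave_one_out_indices(n_samples):
    """Yield (train_idx, test_idx) pairs where each test set is one sample."""
    for i in range(n_samples):
        train_idx = [j for j in range(n_samples) if j != i]
        yield train_idx, [i]


# Toy data and a toy "model" that predicts the mean of its training targets.
y = [3.0, 5.0, 4.0, 6.0, 2.0]

squared_errors = []
for train_idx, test_idx in leave_one_out_indices(len(y)):
    prediction = mean(y[j] for j in train_idx)  # "fit" on all but one sample
    held_out = y[test_idx[0]]
    squared_errors.append((held_out - prediction) ** 2)

loocv_mse = mean(squared_errors)
print("model fits:", len(squared_errors))
print("LOOCV MSE:", loocv_mse)
```

The number of model fits equals the number of samples, which is why LOOCV becomes impractical once fitting a single model is slow or the dataset is large.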
For example, with standard k-fold and k = 5: in iteration 1, fold 1 is held out for testing and the model trains on the other 4 folds. After all 5 iterations, every sample has been in the test set exactly once.
Cross-Validation Doesn't Replace a Held-Out Test Set
Even with cross-validation, hold out an external test set if you can. Cross-validation gives you a robust estimate of generalization during model development. A final held-out test set gives you an honest, untouched evaluation at the end. Use both when possible.
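The combined workflow can be sketched as: set the test set aside first, run cross-validation only on what remains, and touch the test set once at the end. The function names `split_off_test_set` and `k_fold_over`, the 20% test fraction, and the sample count are illustrative assumptions.

```python
import random


def split_off_test_set(n_samples, test_fraction=0.2, seed=0):
    """Return (dev_idx, test_idx); the test indices are never used during CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(n_samples * test_fraction)))
    return indices[n_test:], indices[:n_test]


def k_fold_over(dev_indices, k):
    """Yield (train_idx, val_idx) pairs drawn only from the development indices."""
    for i in range(k):
        val_idx = dev_indices[i::k]  # every k-th development sample, starting at i
        train_idx = [idx for pos, idx in enumerate(dev_indices) if pos % k != i]
        yield train_idx, val_idx


dev_idx, test_idx = split_off_test_set(100, test_fraction=0.2)

# Model development: compare models and tune settings using these folds only.
for train_idx, val_idx in k_fold_over(dev_idx, k=5):
    print(len(train_idx), "train /", len(val_idx), "validation")

# Final step: retrain the chosen model on dev_idx and evaluate once on test_idx.
print(len(test_idx), "samples reserved for the final evaluation")
```

Splitting off the test indices before any folds are built is the key design choice: no decision made during cross-validation can be influenced by the final evaluation data.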
When to Use Which
- Large dataset, plentiful compute → a simple train/validation/test split is usually sufficient and is the most common choice in industry.
- Small dataset → k-fold or LOOCV extracts the most signal from limited data.
- Imbalanced classes → stratified k-fold is essential to ensure minority classes appear in every fold.
You're classifying medical records where only 3% of cases are positive (the condition you're trying to detect). You use standard k-fold cross-validation with k=10. What's the risk?