Objective
This lesson trains your "skeptic mode". You should be able to look at any result and ask whether it will survive real traffic.
Core ideas
- Overfitting: model memorizes noise, fails on new data.
- Underfitting: model too simple, misses signal.
- Train split: used to fit model parameters.
- Validation split: used to tune modeling choices (hyperparameters, features).
- Test split: used once, for a final unbiased estimate (a minimal split sketch follows this list).
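A minimal sketch of that three-way split, assuming a hypothetical pandas DataFrame `df` with a `target` column, using scikit-learn's `train_test_split` twice:

```python
# Minimal sketch of a train / validation / test split.
# Assumes a DataFrame `df` with a `target` column (hypothetical names).
from sklearn.model_selection import train_test_split

X = df.drop(columns=["target"])
y = df["target"]

# First carve off the test set (20% of the data) ...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ... then split the remainder into train (60% overall) and validation (20% overall).
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

# Fit parameters on train, tune choices on validation, report once on test.
```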
Worked example (online store)
Imagine predicting delivery times with a feature set that accidentally includes the "actual delivered timestamp" column. The model looks brilliant because it indirectly sees the answer. At prediction time that column does not exist yet, so performance collapses in production.
This is leakage: clues about the answer hiding in the training data.
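A toy illustration of that leak, with made-up column names: the delivered timestamp is only known after the fact, so any feature derived from it quietly encodes the target.

```python
# Toy illustration of leakage (all column names are hypothetical).
import pandas as pd

orders = pd.DataFrame({
    "order_placed_at":       pd.to_datetime(["2024-01-01 10:00", "2024-01-01 12:00"]),
    "delivered_at":          pd.to_datetime(["2024-01-03 09:00", "2024-01-02 15:00"]),
    "warehouse_distance_km": [120.0, 35.0],
})

# Target: delivery time in hours.
orders["delivery_hours"] = (orders["delivered_at"] - orders["order_placed_at"]).dt.total_seconds() / 3600

# LEAK: built from `delivered_at`, which is unknown when the prediction is made.
orders["hours_until_delivered"] = orders["delivery_hours"]  # effectively the answer itself

features = ["warehouse_distance_km"]        # known at order time: safe
leaky_features = ["hours_until_delivered"]  # do not train on this
```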
Spot-the-leak mindset
Ask for each feature:
- Is it available at prediction time?
- Is it derived from the target?
- Could it include future information?
If the answer to any of these is yes, investigate; the audit sketch below shows one way to do it.
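One way to make this checklist mechanical is to diff the training columns against the schema the serving system can actually provide at prediction time; the column names below are hypothetical.

```python
# Sketch of a feature audit (hypothetical column lists).
training_columns = {
    "warehouse_distance_km",
    "order_placed_at",
    "carrier",
    "hours_until_delivered",  # suspicious: sounds target-derived
}

# Columns the serving API can actually provide when a prediction is requested.
available_at_prediction_time = {
    "warehouse_distance_km",
    "order_placed_at",
    "carrier",
}

suspects = training_columns - available_at_prediction_time
if suspects:
    print(f"Investigate possible leakage: {sorted(suspects)}")
```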
Practical loop
- Build a simple baseline.
- Evaluate candidates on the validation split.
- Lock in the winning choices.
- Report once on the test split.
Never tune after seeing test metrics unless you intentionally reset the experimental protocol.
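A condensed sketch of that loop, reusing the hypothetical split variables from the earlier sketch: each candidate is scored on the validation split only, the winner is locked in, and the test split is touched exactly once.

```python
# Sketch of the evaluate-lock-report loop. X_train, y_train, X_val, y_val, X_test,
# y_test come from the earlier split sketch; the model choices are illustrative.
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

candidates = {alpha: Ridge(alpha=alpha) for alpha in [0.1, 1.0, 10.0]}

# Tune choices on the validation split only.
val_scores = {}
for alpha, model in candidates.items():
    model.fit(X_train, y_train)
    val_scores[alpha] = mean_absolute_error(y_val, model.predict(X_val))

best_alpha = min(val_scores, key=val_scores.get)  # lock the choice

# Report once on the test split; do not go back and tune after this.
final_model = Ridge(alpha=best_alpha).fit(X_train, y_train)
print("Test MAE:", mean_absolute_error(y_test, final_model.predict(X_test)))
```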
Three takeaways
- High scores are meaningless without split discipline.
- Leakage is common and expensive.
- Honest evaluation beats impressive dashboards.