EP02 · 8 min

Workflow: data → X/y → model → training vs inference → evaluation

Learn the end-to-end machine learning workflow and where mistakes usually happen.

Simple definition
A model workflow turns raw data into predictions through training and evaluation.
Precise definition
The ML lifecycle transforms collected observations into feature-target pairs, fits model parameters on training data, serves predictions at inference time, and validates results with task-specific metrics.

Objective

You will trace one complete loop from raw data to model decision. The goal is to know exactly where to debug when performance drops.

Workflow overview

  1. Collect data.
  2. Clean and structure data.
  3. Split into features (X) and target (y).
  4. Train a model on historical examples.
  5. Run inference on new, unseen examples.
  6. Evaluate with metrics aligned to business risk.
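The six stages above can be sketched as one runnable loop. This is a toy threshold "model" with made-up data, not a real classifier, so the whole pipeline fits in one file:

```python
# Minimal sketch of the six workflow stages with a toy threshold
# "model". All data and logic here are illustrative.

def collect() -> list[dict]:
    # Stages 1-2: collected and already-cleaned observations.
    return [
        {"text_len": 120, "spam": 1},
        {"text_len": 35,  "spam": 0},
        {"text_len": 150, "spam": 1},
        {"text_len": 20,  "spam": 0},
    ]

def split_xy(rows):
    # Stage 3: features (X) and target (y).
    X = [[r["text_len"]] for r in rows]
    y = [r["spam"] for r in rows]
    return X, y

def train(X, y) -> float:
    # Stage 4: "learn" a length threshold from labeled history
    # (midpoint between mean spam length and mean non-spam length).
    spam = [x[0] for x, label in zip(X, y) if label == 1]
    ham = [x[0] for x, label in zip(X, y) if label == 0]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2

def infer(threshold: float, x) -> int:
    # Stage 5: classify a new, unseen example.
    return 1 if x[0] > threshold else 0

def evaluate(threshold: float, X, y) -> float:
    # Stage 6: accuracy as a stand-in for a task-specific metric.
    hits = sum(infer(threshold, x) == label for x, label in zip(X, y))
    return hits / len(y)

X, y = split_xy(collect())
threshold = train(X, y)
print(evaluate(threshold, X, y))
```

A real project would replace the threshold with a learned model and accuracy with a risk-aligned metric, but the stage boundaries stay exactly where they are here.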

Worked example (online store)

For spam detection:

  • Data: historical support messages.
  • X: message text length, sender domain, presence of suspicious phrases.
  • y: spam/not spam label.
  • Training: learn weights from historical labeled messages.
  • Inference: classify a new incoming message in real time.
  • Evaluation: precision/recall tradeoff because both false positives and false negatives hurt.
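The X row for the spam example can be made concrete with a small feature extractor. The phrase list and function name below are hypothetical, chosen only to mirror the three features listed above:

```python
# Hypothetical feature extractor: turns one raw message into the
# X row from the spam example (length, sender domain, suspicious
# phrases). The SUSPICIOUS list is illustrative.
SUSPICIOUS = {"free", "winner", "click now"}

def to_features(text: str, sender: str) -> list:
    domain = sender.split("@")[-1]       # sender domain
    lowered = text.lower()
    return [
        len(text),                                # message text length
        domain,                                   # sender domain
        any(p in lowered for p in SUSPICIOUS),    # suspicious phrases?
    ]

row = to_features("You are a WINNER, click now!", "promo@deals.example")
```

The same extractor must run in both training and inference; if the two paths compute features differently, the model sees inputs it never learned from.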

For delivery time:

  • X: distance, courier load, weather conditions.
  • y: actual delivery hours.

Different target, same workflow.
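The delivery-time version swaps the classifier for a regressor but keeps the same train-then-infer shape. A minimal sketch with one feature (distance) and illustrative numbers, using closed-form least squares:

```python
# Same workflow, regression target: predict delivery hours from
# distance in km. Single-feature least squares, pure Python;
# the numbers are illustrative.

def fit_line(xs, ys):
    # Training: closed-form slope a and intercept b for y ~ a*x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def predict(model, x):
    # Inference: the same formula applied to a new distance.
    a, b = model
    return a * x + b

distances = [2, 5, 9, 12]         # X: distance in km
hours = [1.0, 1.6, 2.4, 3.0]      # y: actual delivery hours

model = fit_line(distances, hours)
print(predict(model, 7))          # estimated hours for a 7 km delivery
```

Only `fit_line` and the target change between the two examples; collection, X/y splitting, inference, and evaluation keep their places.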

Why teams fail here

Many teams jump from "we have data" to "let's deploy". They skip labeling quality checks, split strategy, and metric choice. That creates dashboards with nice numbers and poor user outcomes.

Quick check guidance

In the quiz, ask: "Am I in training mode or inference mode?" If you cannot answer that quickly, your architecture is probably tangled.
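One way to keep the two modes untangled in code: anything learned during training is frozen into an object, and inference only reads that state. The class below is an illustrative sketch, not a specific library API:

```python
# Training mode writes state; inference mode only reads it.
# Illustrative sketch, not a real library class.

class MeanScaler:
    def fit(self, values):
        # Training mode: learn a statistic from training data only.
        self.mean = sum(values) / len(values)
        return self

    def transform(self, value):
        # Inference mode: apply the frozen statistic, never recompute it.
        return value - self.mean

scaler = MeanScaler().fit([10, 20, 30])   # statistics from training data
print(scaler.transform(40))               # new example uses the frozen mean
```

If `transform` ever recomputed the mean from incoming data, training and inference would be tangled, which is exactly the smell the quiz question is probing for.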

Three takeaways

  • Treat data flow as a product surface, not hidden plumbing.
  • Inference should mirror real-world conditions, not training shortcuts.
  • Evaluation must connect to user impact and business constraints.


Visual walkthrough: ML pipeline


Step Insight

Gather representative historical examples aligned to your deployment scenario.

Common traps
  • Mixing training and production data paths.
  • Confusing model training with runtime inference.
  • Shipping without defining evaluation criteria up front.
Three takeaways
  • Clear separation of data prep, training, inference, and evaluation prevents hidden bugs.
  • X contains inputs, y contains targets.
  • Evaluation is not optional; it is the decision gate for deployment.