DSPython

Supervised Learning - Introduction

Understand the fundamentals of supervised machine learning and its algorithm categories.

ML Basics | Beginner | 40 min

Topic 1: What is Supervised Learning?

Supervised Learning is a Machine Learning technique where the model learns patterns from labeled data. Labeled data means the input (X) and the correct output (y) are provided during training.

Goal: Learn a mapping function f(X) → y

In simple words → you provide inputs together with their correct answers during training, and the algorithm learns from them how to predict the output for new, unseen inputs.


from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)

🧠 Real-life examples

  • Email → Spam or Not Spam
  • House details → Price
  • Loan → Approve or Reject
  • Photo → Cat or Dog

📌 Training vs Prediction

Training:
   Input (X) + Output (y) → Model learns

Prediction:
   Input (X_new) → Model predicts y
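
The two phases above can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn's `KNeighborsClassifier` as the model (any classifier with `fit` and `predict` would do); the values in `X_new` are measurements picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Training: the model sees the inputs X together with the correct labels y
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Prediction: only a new input is given; the model supplies the label
X_new = [[5.1, 3.5, 1.4, 0.2]]  # illustrative flower measurements
y_new = model.predict(X_new)
print(y_new)
```

Note that `y` appears only in the `fit` call; `predict` receives inputs alone.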
        

🧠 Key mindset

If you show the machine enough labeled examples, it learns the pattern on its own.

Topic 2: Types of Supervised Learning

Supervised learning is divided into two categories:

  1. Regression → Predict continuous values
  2. Classification → Predict classes (categories)

1️⃣ Regression

Regression models predict a numeric (continuous) output. Example → Predicting:

  • House price
  • Temperature
  • Salary

Example: Price = f(size, location, bedrooms, ...)


# Example: Regression
y = [100, 110, 200, 130, 120]   # continuous output

📈 Regression visualization

X ------------------------------→ 
         *
      *     *
   *           *
y ------------------------------
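
As a rough sketch of regression in practice, the snippet below fits a straight line to a tiny made-up dataset. The sizes and prices are invented for illustration (prices follow exactly 2 * size + 10), and `LinearRegression` is just one possible regression model.

```python
from sklearn.linear_model import LinearRegression

# Made-up data: house size -> price (arbitrary units)
sizes = [[50], [60], [80], [100], [120]]
prices = [110, 130, 170, 210, 250]  # exactly 2 * size + 10

reg = LinearRegression()
reg.fit(sizes, prices)

# Predict the price of a house size the model has not seen
predicted_price = reg.predict([[90]])[0]
print(round(predicted_price, 2))
```

Because the toy data lies on a perfect line, the prediction for size 90 comes out at 190; real data is noisy, so predictions there are only approximate.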
        

2️⃣ Classification

Classification models predict a class (category). Example → Predicting:

  • Spam / Not Spam
  • Male / Female
  • Pass / Fail

Example: y ∈ {0,1} or {Cat, Dog}


# Example: Classification
y = [0, 1, 1, 0, 1, 2]   # class labels (three classes: 0, 1 and 2)

📊 Classification visualization

         ● ● ●          (Class A)
                ■ ■ ■   (Class B)
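
A small classification sketch, assuming a made-up Pass / Fail dataset where the only feature is hours studied. `LogisticRegression` is used here as one example of a classifier; the data and variable names are hypothetical.

```python
from sklearn.linear_model import LogisticRegression

# Made-up data: hours studied -> 0 (Fail) or 1 (Pass)
hours = [[1], [2], [3], [7], [8], [9]]
passed = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression()
clf.fit(hours, passed)

# The output is a class label, not a continuous number
predictions = clf.predict([[2.5], [8.5]])
print(predictions)
```

The model assigns 2.5 hours to class 0 and 8.5 hours to class 1, since each falls clearly on one side of the two groups.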
        

Topic 3: Basic Flow of Supervised ML

🔁 Steps:

  1. Collect Data
  2. Split into Train/Test
  3. Train the Model
  4. Predict
  5. Evaluate Performance

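from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Collect data: iris labels are 3 flower classes, so use a classifier
X, y = load_iris(return_X_y=True)

# 2. Split into train/test (random_state makes the split reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Train the model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 4. Predict
pred = model.predict(X_test)
print(pred[:5])

# 5. Evaluate performance
print(accuracy_score(y_test, pred))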

⚙️ Train-test split

We should not train on all of the data. Some of it must be held out so the model can be tested on examples it has never seen.

Train : Learn Pattern
Test  : Check Accuracy
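
To make the split concrete, here is a short sketch that checks the sizes of the two parts. The `random_state=42` and `stratify=y` arguments are choices made for this illustration: the first makes the split reproducible, the second keeps the class proportions the same in both parts.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples, 3 classes of 50 each

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # 80% for training, 20% for testing
print(np.bincount(y_test))          # samples per class in the test set
```

With 150 samples and `test_size=0.2`, 120 rows go to training and 30 to testing, and stratification puts 10 samples of each class in the test set.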
        

Topic 4: Evaluation Metrics

✅ Regression Metrics

  • MAE (Mean Absolute Error)
  • MSE (Mean Squared Error)
  • RMSE (Root Mean Squared Error)
  • R² score
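
The regression metrics above can be computed with `sklearn.metrics`, as in the sketch below. The `y_pred` values are invented for illustration and do not come from a trained model; RMSE is taken here as the square root of MSE.

```python
import math
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [100, 110, 200, 130, 120]  # actual values
y_pred = [110, 100, 190, 130, 130]  # made-up predictions

mae = mean_absolute_error(y_true, y_pred)  # average of |error|
mse = mean_squared_error(y_true, y_pred)   # average of error squared
rmse = math.sqrt(mse)                      # back in the same units as y
r2 = r2_score(y_true, y_pred)              # 1.0 would be a perfect fit

print(mae, mse, round(rmse, 2), round(r2, 3))
```

For these numbers the absolute errors are 10, 10, 10, 0 and 10, so MAE is 8 and MSE is 80. Lower is better for MAE, MSE and RMSE, while R² is better the closer it gets to 1.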

✅ Classification Metrics

  • Accuracy
  • Precision
  • Recall
  • F1 Score
  • Confusion Matrix
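
The classification metrics can be computed the same way. In this sketch the labels are binary (1 = positive class) and the predictions are invented for illustration rather than produced by a trained model.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]  # made-up predictions

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # of predicted 1s, how many are right
recall = recall_score(y_true, y_pred)        # of actual 1s, how many were found
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)        # rows: actual, columns: predicted

print(accuracy, precision, recall, round(f1, 3))
print(cm)
```

Here there are 3 true positives, 2 true negatives, 2 false positives and 1 false negative, giving accuracy 5/8, precision 3/5 and recall 3/4.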

Topic 5: When to use supervised learning?

Use supervised learning when:

  • You have labeled data
  • You know the expected output for each example
  • You want to predict a value (regression) or a category (classification)