DSPython - Supervised Learning Intro

Topic 1: What is Supervised Learning?

Supervised Learning is a Machine Learning technique where the model learns patterns from labeled data. Labeled data means the input (X) and the correct output (y) are provided during training.

Goal: Learn a mapping function f(X) → y

In simple words: you provide inputs together with their correct answers during training, and the algorithm learns from them how to predict the output for future, unseen inputs.


from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)

🧠 Real-life examples

Email → Spam or Not Spam
House features → Price
Loan → Approve or Reject
Photo → Cat or Dog

📌 Training vs Prediction

Training:
   Input (X) + Output (y) → Model learns

Prediction:
   Input (X_new) → Model predicts y
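
The two phases can be sketched in a few lines. This is a minimal illustration, not the only way to do it: the choice of `KNeighborsClassifier` and the values in `X_new` are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Training: the model sees both the inputs (X) and the correct outputs (y)
X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3)  # classifier choice is an assumption
model.fit(X, y)

# Prediction: only a new input is given; the model supplies the label
X_new = [[5.1, 3.5, 1.4, 0.2]]  # illustrative flower measurements
y_new = model.predict(X_new)
print(y_new)
```

Note that `fit` receives both X and y, while `predict` receives only inputs.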

🧠 Key mindset

Show the machine enough labeled examples and it learns the pattern by itself.

Topic 2: Types of Supervised Learning

Supervised learning is divided into two categories:

Regression → Predict continuous values
Classification → Predict classes (categories)

1️⃣ Regression

Regression models predict a numeric (continuous) output. Examples of what they predict:

House price
Temperature
Salary

Example: Price = f(size, location, bedrooms, ...)


# Example: Regression
y = [100, 110, 200, 130, 120]   # continuous output

📈 Regression visualization

y
|        *
|     *     *
|  *           *
+------------------------------→ X
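
As a rough sketch of regression in practice, the snippet below fits a straight line to a tiny, made-up dataset. The sizes and prices are hypothetical and chosen so that price is exactly twice the size, which makes the result easy to check by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: house size -> price (made-up units)
size = np.array([[50], [60], [80], [100], [120]])
price = np.array([100, 120, 160, 200, 240])  # exactly 2 * size, for illustration

reg = LinearRegression()
reg.fit(size, price)

# Predict the price of an unseen house of size 90
predicted_price = reg.predict(np.array([[90]]))
print(predicted_price)
```

The output is a continuous number rather than a category, which is what makes this a regression task.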

2️⃣ Classification

Classification models predict a class (category). Examples of what they predict:

Spam / Not Spam
Male / Female
Pass / Fail

Example: y ∈ {0,1} or {Cat, Dog}


# Example: Classification
y = [0, 1, 1, 0, 1, 2]   # class labels

📊 Classification visualization

         ● ● ●          (Class A)
                ■ ■ ■   (Class B)
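
A small classification sketch, using the Pass / Fail example from above. The data (hours studied) and the choice of `LogisticRegression` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied -> Fail (0) or Pass (1)
hours = np.array([[1], [2], [3], [7], [8], [9]])
passed = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(hours, passed)

# Predict the class for two unseen students
labels = clf.predict(np.array([[1.5], [8.5]]))
print(labels)
```

Here the model returns one of a fixed set of labels, not a number on a continuous scale.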

Topic 3: Basic Flow of Supervised ML

🔁 Steps:

Collect Data
Split into Train/Test
Train the Model
Predict
Evaluate Performance


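from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1. Collect data: iris has class labels, so this is a classification task
X, y = load_iris(return_X_y=True)

# 2. Split into train/test (fixed seed so the split is reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Train the model (a classifier, because y holds class labels)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 4. Predict on data the model has not seen
pred = model.predict(X_test)
print(pred[:5])

# 5. Evaluate performance (accuracy on the test set)
print(model.score(X_test, y_test))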

⚙️ Train-test split

We should not train on all of the data; some of it must be held out to test the model on examples it has not seen.

Train : Learn Pattern
Test  : Check Accuracy
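
To make the split concrete, the sketch below checks how many samples land in each part. The `random_state` and `stratify` arguments are optional choices added here for reproducibility and balanced classes; they are not required.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
print(X_train.shape, X_test.shape)
```

With 150 samples and `test_size=0.2`, 120 samples go to training and 30 to testing.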

Topic 4: Evaluation Metrics

✅ Regression Metrics

MAE (Mean Absolute Error)
MSE (Mean Squared Error)
RMSE (Root Mean Squared Error)
R² score (coefficient of determination)
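
These metrics can be computed with `sklearn.metrics`. In the sketch below, `y_true` reuses the values from the regression example earlier, while `y_pred` is a set of hypothetical predictions invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([100, 110, 200, 130, 120])  # actual values
y_pred = np.array([110, 100, 190, 140, 110])  # hypothetical predictions

mae = mean_absolute_error(y_true, y_pred)  # average of |error|
mse = mean_squared_error(y_true, y_pred)   # average of error squared
rmse = np.sqrt(mse)                        # square root of MSE, same units as y
r2 = r2_score(y_true, y_pred)              # 1.0 means a perfect fit
print(mae, mse, rmse, r2)
```

Every prediction here is off by exactly 10, so MAE and RMSE both come out to 10 and MSE to 100.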

✅ Classification Metrics

Accuracy
Precision
Recall
F1 Score
Confusion Matrix
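
A short sketch of how these are computed with `sklearn.metrics`. The labels and predictions are hypothetical (1 = Spam, 0 = Not Spam) and exist only to show the calls.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # hypothetical true labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]  # hypothetical predictions

acc = accuracy_score(y_true, y_pred)    # fraction of correct predictions
prec = precision_score(y_true, y_pred)  # of predicted spam, how much is spam
rec = recall_score(y_true, y_pred)      # of actual spam, how much was caught
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)   # rows: true class, columns: predicted
print(acc, prec, rec, f1)
print(cm)
```

In the confusion matrix, the first row counts true class 0 (correct, then wrongly flagged) and the second row counts true class 1 (missed, then correct).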

Topic 5: When to use supervised learning?

Use supervised learning when:

You have labeled data
You know the expected output
You want to predict a value (regression) or a class (classification)
