Supervised Learning - Introduction
Understand the fundamentals of supervised machine learning and its two main algorithm categories.
Topic 1: What is Supervised Learning?
Supervised Learning is a Machine Learning technique where the model learns patterns from labeled data. Labeled data means the input (X) and the correct output (y) are provided during training.
Goal: Learn a mapping function f(X) → y
In simple terms: when you provide inputs together with their correct answers during training, the algorithm studies them and learns to predict the output for future, unseen inputs.
from sklearn.datasets import load_iris
# X holds the input features, y holds the correct label for each row
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)  # (150, 4) (150,)
🧠 Real-life examples
- Email → Spam or Not Spam
- House features → Price
- Loan → Approve or Reject
- Photo → Cat or Dog
📌 Training vs Prediction
Training:
Input (X) + Output (y) → Model learns
Prediction:
Input (X_new) → Model predicts y
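The two phases above can be sketched in a few lines. This is only an illustration: the choice of `KNeighborsClassifier` and the values in `X_new` are assumptions made for the example, not something the lesson prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Training: the model sees inputs (X) together with the correct outputs (y).
X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3)  # illustrative model choice
model.fit(X, y)

# Prediction: only a new input is given; the model supplies the output.
# X_new uses the same four-feature format as the iris rows.
X_new = [[5.1, 3.5, 1.4, 0.2]]
y_new = model.predict(X_new)
print(y_new)
```

Note that `fit` receives both X and y, while `predict` receives only inputs.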
🧠 Key mindset
Show the machine enough labeled examples and it learns the pattern by itself.
Topic 2: Types of Supervised Learning
Supervised learning is divided into two categories:
- Regression → Predict continuous values
- Classification → Predict classes (categories)
1️⃣ Regression
Regression models predict numeric/continuous output. Example → Predicting:
- House price
- Temperature
- Salary
Example: Price = f(size, location, bedrooms, ...)
# Example: Regression
y = [100, 110, 200, 130, 120] # continuous output
📈 Regression visualization
X ------------------------------→
*
* *
* *
y ------------------------------
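A minimal regression sketch, assuming made-up house sizes and prices (the numbers below are invented for illustration and follow an exact straight line, so the fit is perfect):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: house size -> price, in arbitrary units.
size = np.array([[50], [60], [80], [100], [120]])
price = np.array([100, 120, 160, 200, 240])  # exactly 2 * size

reg = LinearRegression()
reg.fit(size, price)

# The output is a continuous number, not a category.
predicted = reg.predict(np.array([[90]]))
print(predicted)
```

Real data is rarely this clean; the point here is only that the prediction is a numeric value on a continuous scale.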
2️⃣ Classification
Classification models predict classes or categories. Example → Predicting:
- Spam / Not Spam
- Male / Female
- Pass / Fail
Example: y ∈ {0,1} or {Cat, Dog}
# Example: Classification
y = [0, 1, 1, 0, 1, 2] # class labels (here three classes: 0, 1, 2)
📊 Classification visualization
● ● ● (Class A)
■ ■ ■ (Class B)
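A minimal classification sketch for the Pass / Fail case, assuming made-up study hours as the single input feature. `LogisticRegression` is one possible classifier, chosen here only for illustration.

```python
from sklearn.linear_model import LogisticRegression

# Made-up data: hours studied -> 0 = Fail, 1 = Pass
hours = [[1], [2], [3], [4], [6], [7], [8], [9]]
result = [0, 0, 0, 0, 1, 1, 1, 1]

clf = LogisticRegression()
clf.fit(hours, result)

# The output is a class label; predict_proba gives the probability per class.
labels = clf.predict([[2], [8]])
probs = clf.predict_proba([[2], [8]])
print(labels)
print(probs)
```

Unlike regression, the prediction is one of a fixed set of labels, optionally with a probability for each.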
Topic 3: Basic Flow of Supervised ML
🔁 Steps:
- Collect Data
- Split into Train/Test
- Train the Model
- Predict
- Evaluate Performance
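from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
# 1. Collect data (iris labels are classes, so a classifier is used here)
X, y = load_iris(return_X_y=True)
# 2. Split into train/test (random_state makes the split repeatable)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 3. Train the model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# 4. Predict
pred = model.predict(X_test)
print(pred[:5])
# 5. Evaluate performance (accuracy on the held-out test set)
print(model.score(X_test, y_test))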
⚙️ Train-test split
We should not train on all of the data; some of it must be held back to test the model on examples it has never seen.
Train : Learn Pattern
Test : Check Accuracy
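As a rough sketch of what the split produces, the snippet below holds back 20% of the iris data. The `random_state` and `stratify` arguments are optional extras assumed here: the first makes the split repeatable, the second keeps the class proportions the same in both parts.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# 80% of the rows go to training, 20% are kept aside for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

With 150 rows, this gives 120 rows for training and 30 for testing.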
Topic 4: Evaluation Metrics
✅ Regression Metrics
- MAE (Mean Absolute Error)
- MSE (Mean Squared Error)
- RMSE (Root Mean Squared Error)
- R² score
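The regression metrics listed above can be computed with `sklearn.metrics`. In this sketch `y_true` reuses the continuous values from the regression example earlier, while `y_pred` is a made-up set of predictions used only for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([100, 110, 200, 130, 120])  # actual values
y_pred = np.array([90, 120, 190, 140, 120])   # made-up predictions

mae = mean_absolute_error(y_true, y_pred)  # average absolute error
mse = mean_squared_error(y_true, y_pred)   # average squared error
rmse = np.sqrt(mse)                        # square root of MSE, same units as y
r2 = r2_score(y_true, y_pred)              # 1.0 means a perfect fit

print(mae, mse, rmse, r2)
```

For MAE, MSE, and RMSE lower is better; for R² a value closer to 1 is better.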
✅ Classification Metrics
- Accuracy
- Precision
- Recall
- F1 Score
- Confusion Matrix
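The classification metrics listed above are also available in `sklearn.metrics`. The labels below are made up for a two-class problem (1 = positive, 0 = negative) purely to illustrate the calls.

```python
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    confusion_matrix,
)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions

acc = accuracy_score(y_true, y_pred)    # share of correct predictions
prec = precision_score(y_true, y_pred)  # of predicted positives, how many are right
rec = recall_score(y_true, y_pred)      # of actual positives, how many were found
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)   # rows = actual class, columns = predicted

print(acc, prec, rec, f1)
print(cm)
```

In the confusion matrix for two classes, the layout is [[TN, FP], [FN, TP]].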
Topic 5: When to use supervised learning?
Use supervised learning when:
- You have labeled data
- You know the expected output for each training example
- You want to predict a value (regression) or a class (classification) for new inputs