DSPython
Statistics for Data Science

Linear Regression

The first step towards Machine Learning – predicting numbers using data 📈

1️⃣ What is Linear Regression?

Linear Regression is a technique used to predict a number using past data.

👉 It finds a straight-line relationship between:

  • Input (X) – what we know
  • Output (Y) – what we want to predict

Example: If we know hours studied, can we predict marks? ➡️ Yes, using Linear Regression.

2️⃣ Real-World Examples (Easy to Remember)

🏠

House Price

X = Size (sq.ft)
Y = Price
Bigger house → higher price

🚕

Cab Fare

X = Distance
Y = Fare
More distance → more cost

📚

Exam Marks

X = Study Hours
Y = Marks
More study → better marks

3️⃣ How does Linear Regression work? (Intuition)

Imagine you plotted many points on a graph (each point is one student / house / trip).

Now your task is to draw one straight line such that:

  • The line is close to most points
  • The total error (the sum of the squared gaps between the points and the line) is as small as possible

That line is called the Best Fit Line.
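To make "total error" concrete, here is a minimal sketch that scores three candidate lines against the same points. The data and the candidate lines are made up for illustration, and the error is measured as the sum of squared gaps, which is the usual choice. In practice the best fit line is computed from a formula rather than found by trying guesses.

```python
# Hypothetical data: hours studied (X) and marks scored (Y) for five students.
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 67, 71, 80]

def total_squared_error(m, c, xs, ys):
    """Sum of squared gaps between each actual Y and the line's prediction m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

# Three candidate lines, each written as (slope, intercept). These are guesses.
candidates = [(5, 50), (7, 45), (10, 40)]

errors = {line: total_squared_error(line[0], line[1], hours, marks) for line in candidates}
best_line = min(errors, key=errors.get)

for (m, c), err in errors.items():
    print(f"y = {m}x + {c}: total squared error = {err}")
print("Closest line among these candidates:", best_line)
```

The line with the smallest total is the one that hugs the points most closely. The true Best Fit Line is the one that beats every possible line, not just these three.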

4️⃣ Visual Explanation

🔵 Blue line = Prediction
🔴 Red lines = Errors (difference between actual and predicted values)

[Figure: scatter plot of X (Input Feature) against Y (Target Value), showing the Best Fit Line and the Error (Residual) for each point.]
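The red error lines can also be read as plain numbers. The sketch below computes each residual as actual minus predicted. The house data and the line itself are assumptions picked by hand for illustration; the line is not fitted here.

```python
# Hypothetical houses: size in hundreds of sq.ft (X) and price in arbitrary units (Y).
sizes = [10, 15, 20, 25]
prices = [52, 70, 103, 125]

# An assumed line, chosen by hand for illustration: price = 5 * size + 0
slope, intercept = 5, 0

residuals = []
for x, y in zip(sizes, prices):
    predicted = slope * x + intercept  # point on the blue line
    residual = y - predicted           # signed length of the red line
    residuals.append(residual)
    if residual > 0:
        position = "above the line"
    elif residual < 0:
        position = "below the line"
    else:
        position = "on the line"
    print(f"size={x}: actual={y}, predicted={predicted}, residual={residual} ({position})")
```

A positive residual means the actual point sits above the line, a negative one means it sits below.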

5️⃣ Formula (Don’t panic 😄)

Once intuition is clear, math becomes easy.

Linear Equation

y = m x + c

(Also written as y = β₀ + β₁x, where β₀ is the intercept c and β₁ is the slope m)

  • m (Slope): How much Y changes when X increases by 1
  • c (Intercept): The predicted value of Y when X = 0
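As a rough sketch of where m and c come from, the code below uses the standard least-squares formulas, which the text above does not spell out: the slope is how X and Y vary together divided by how much X varies, and the intercept makes the line pass through the averages. The function name fit_line and the cab data are hypothetical.

```python
def fit_line(xs, ys):
    """Return the least-squares slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How X and Y move together, and how much X moves on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x  # forces the line through (mean_x, mean_y)
    return m, c

# Hypothetical cab trips: distance in km (X) and fare (Y).
# The fares are built to lie exactly on fare = 15 * distance + 45.
distance = [1, 2, 3, 4, 5]
fare = [60, 75, 90, 105, 120]

m, c = fit_line(distance, fare)
print(f"slope m = {m}, intercept c = {c}")
print("Predicted fare for 8 km:", m * 8 + c)
```

Here the slope reads as "each extra km adds 15 to the fare" and the intercept as "the fare for a 0 km trip", matching the two bullet points above.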

Interview Checkpoint 🎯

What is Best Fit Line?

It is the line with the smallest total error between predicted and actual values, usually measured as the sum of squared errors.

What is R-Squared?

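It measures how well the line fits the data: the share of the variation in Y that the model explains.
0 → no better than always predicting the average of Y
1 → perfect prediction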
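A minimal sketch of how R-Squared can be computed by hand, using the common definition 1 − (squared error of the model) / (squared error of always guessing the average). The marks data and the line y = 7x + 45 are made up for illustration.

```python
def r_squared(actual, predicted):
    """1 minus (model's squared error / squared error of guessing the mean)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

# Hypothetical data: hours studied (X) and marks scored (Y).
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 67, 71, 80]

perfect = list(marks)                               # predictions that match exactly
mean_only = [sum(marks) / len(marks)] * len(marks)  # always guess the average
line = [7 * h + 45 for h in hours]                  # an assumed line: y = 7x + 45

score_perfect = r_squared(marks, perfect)
score_mean = r_squared(marks, mean_only)
score_line = r_squared(marks, line)

print("Perfect predictions:", score_perfect)
print("Always the average: ", score_mean)
print("Line y = 7x + 45:   ", round(score_line, 4))
```

The two extreme cases land exactly on 1 and 0, and a line that follows the points closely scores near 1.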

Can Linear Regression handle curves?

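Not with a plain straight line ❌ The model y = mx + c can only capture straight-line relationships; curved data needs transformed inputs (for example, adding x² as a feature).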
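To see what this looks like in numbers, the sketch below fits a straight line to points that lie on a curve (y = x²). The data is made up, and the helper functions fit_line and r_squared are hypothetical names using the standard least-squares and R-Squared formulas.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def r_squared(actual, predicted):
    """1 minus (model's squared error / squared error of guessing the mean)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

# Hypothetical curved data: a U-shape, y = x squared.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

m, c = fit_line(xs, ys)
predictions = [m * x + c for x in xs]
score = r_squared(ys, predictions)

print(f"Best straight line: y = {m}x + {c}")
print("R-Squared:", score)
```

For this U-shaped data the best straight line comes out flat, so it explains none of the pattern even though Y is completely determined by X.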
