Probability Basics
Master the mathematics of uncertainty - from basic concepts to conditional probability and real-world data science applications.
🎯 What is Probability?
Definition
Probability is the branch of mathematics that measures the likelihood or chance of events occurring.
It quantifies uncertainty using numbers between 0 (impossible) and 1 (certain).
Why It Matters
- Make predictions under uncertainty
- Quantify risk and confidence
- Build statistical models
- Make data-driven decisions
📊 Probability Scale: 0 to 1
0 = Impossible, 0.5 = Equally Likely, 1 = Certain
Machine learning model outputs such as classification probabilities and confidence scores are fundamentally grounded in probability theory.
🔑 Basic Probability Concepts
Experiment
Definition: Any process that generates outcomes
- Tossing a coin
- Rolling a die
- Drawing a card
Outcome
Definition: A single result of an experiment
- Coin: Heads
- Die: 4
- Card: Ace of Spades
Sample Space (S)
Definition: Set of all possible outcomes
- Coin: S = {H, T}
- Die: S = {1,2,3,4,5,6}
- Cards: S = {52 cards}
Event (A, B, C...)
Definition: Subset of sample space
- Die: A = "Even number" = {2,4,6}
- Cards: B = "Heart" = {13 hearts}
- Coin: C = "Heads" = {H}
🎲 Dice Rolling Example
Sample Space (S) = {1, 2, 3, 4, 5, 6}
Event A: "Even number" = {2, 4, 6}
Event B: "Number > 3" = {4, 5, 6}
📐 Core Probability Formulas
Classical Probability
P(A) = favorable / total
Where:
favorable = Number of outcomes in A
total = Total outcomes in sample space
Example (die, A = "Even"): favorable = |{2,4,6}| = 3, total = 6, so P(A) = 3/6
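As a quick check, the classical formula can be evaluated by enumerating the die's sample space; a minimal Python sketch (names are illustrative):

```python
# Classical probability: favorable outcomes / total outcomes
sample_space = {1, 2, 3, 4, 5, 6}   # fair six-sided die
A = {2, 4, 6}                       # event "even number"

p_A = len(A) / len(sample_space)
print(p_A)  # 0.5
```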
Complement Rule
P(A′) = 1 - P(A)
Where:
A′ = "Not A" or complement of A
P(A) + P(A′) = 1
Example: if P(rain) = 0.3, then P(no rain) = 1 - 0.3 = 0.7
Independent Events
P(A ∩ B) = P(A) × P(B)
Where:
A ∩ B = "A and B" (intersection)
Events don't affect each other
Example (two coin tosses): P(H and H) = 0.5 × 0.5 = 0.25
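A quick simulation illustrates the multiplication rule for two fair coin tosses; a sketch (100,000 trials is an arbitrary choice):

```python
import random

trials = 100_000
both_heads = 0
for _ in range(trials):
    first = random.random() < 0.5    # first coin is heads
    second = random.random() < 0.5   # second coin is heads
    if first and second:
        both_heads += 1

print(both_heads / trials)  # ≈ 0.25 = 0.5 × 0.5
```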
Conditional Probability
P(A|B) = P(A ∩ B) / P(B)
Where:
P(A|B) = "Probability of A given B"
B has already occurred
Example: P(Ace|Heart) = (1/52) / (13/52) = 1/13
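The card example can be verified against the formula by enumerating a 52-card deck; a sketch (the rank/suit encoding is just one convenient choice):

```python
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))   # 52 (rank, suit) pairs

p_heart = sum(1 for _, s in deck if s == 'hearts') / len(deck)                        # 13/52
p_ace_and_heart = sum(1 for r, s in deck if r == 'A' and s == 'hearts') / len(deck)   # 1/52

print(p_ace_and_heart / p_heart)  # P(Ace|Heart) = 1/13 ≈ 0.0769
```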
Dependent Events
P(A ∩ B) = P(A) × P(B|A)
Where:
Events affect each other
P(B|A) ≠ P(B)
Example (two cards, no replacement): P(2 Aces) = (4/52) × (3/51) ≈ 0.0045
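Drawing two cards without replacement can be simulated to confirm the dependent-events formula; a sketch (random.sample draws without replacement, and the trial count is arbitrary):

```python
import random

deck = ['A'] * 4 + ['x'] * 48   # 4 aces plus 48 other cards
trials = 200_000
two_aces = sum(random.sample(deck, 2) == ['A', 'A'] for _ in range(trials))

print(two_aces / trials)        # ≈ 0.0045 (simulated)
print((4 / 52) * (3 / 51))      # exact: P(A) × P(B|A) ≈ 0.00452
```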
Addition Rule
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Where:
A ∪ B = "A or B" (union)
Subtracting P(A ∩ B) prevents double-counting the intersection
Example (die, A = "Even", B = "> 3"): P(A∪B) = 3/6 + 3/6 - 2/6 = 4/6
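Python's set operations make the addition rule easy to verify on the dice events from earlier; a minimal sketch:

```python
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}    # even number
B = {4, 5, 6}    # number > 3

def p(event):
    return len(event) / len(sample_space)

# P(A ∪ B) = P(A) + P(B) - P(A ∩ B): subtract the overlap {4, 6}
print(p(A | B))                  # 4/6 ≈ 0.667
print(p(A) + p(B) - p(A & B))    # same value, no double-counting
```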
⚖️ Independent vs Dependent Events
📋 Key Differences
| Aspect | Independent Events | Dependent Events |
|---|---|---|
| Definition | Events don't affect each other | Events affect each other |
| Conditional Probability | P(A|B) = P(A) | P(A|B) ≠ P(A) |
| Multiplication Rule | P(A∩B) = P(A) × P(B) | P(A∩B) = P(A) × P(B|A) |
| Examples | Coin tosses, Dice rolls | Cards without replacement |
🎲 Independent Events Example
- A = First coin is Heads
- B = Second coin is Heads
- P(A∩B) = P(A) × P(B) = 0.5 × 0.5 = 0.25
🃏 Dependent Events Example
- A = First card is Ace
- B = Second card is Ace (drawn without replacement)
- P(A∩B) = P(A) × P(B|A) = (4/52) × (3/51) ≈ 0.0045
🧪 Test for Independence
Events A and B are independent if any of the following holds:
P(A|B) = P(A)
P(B|A) = P(B)
P(A∩B) = P(A) × P(B)
Example: P(A) = 0.4, P(B) = 0.5, P(A∩B) = 0.2
0.4 × 0.5 = 0.2 ✓
∴ Events are independent
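The same check can be wrapped in a small helper; a sketch (the tolerance is an arbitrary allowance for floating-point rounding):

```python
def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True if P(A ∩ B) equals P(A) × P(B) within tolerance."""
    return abs(p_a * p_b - p_a_and_b) < tol

print(are_independent(0.4, 0.5, 0.2))   # True  -> independent
print(are_independent(0.4, 0.5, 0.3))   # False -> dependent
```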
🎯 Conditional Probability
📖 Understanding P(A|B)
"Given that event B has already occurred, what's the probability of A?"
🏥 Medical Testing Example
Question: If a person tests positive for a rare disease, what is the probability they actually have it?
- Disease prevalence = 1% (P(D) = 0.01)
- Test accuracy = 95% (P(+|D) = 0.95, P(-|healthy) = 0.95)
Answer: Using Bayes' theorem, P(D|+) ≈ 16% (counterintuitive!)
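The ≈16% answer follows directly from Bayes' theorem; a minimal sketch using the numbers above:

```python
p_disease = 0.01             # prevalence P(D)
p_pos_given_d = 0.95         # sensitivity P(+|D)
p_pos_given_healthy = 0.05   # false-positive rate = 1 - specificity

# Total probability of a positive test
p_pos = p_pos_given_d * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(D|+) = P(+|D) × P(D) / P(+)
p_d_given_pos = p_pos_given_d * p_disease / p_pos
print(round(p_d_given_pos, 3))   # ≈ 0.161, i.e. about 16%
```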
📊 Venn Diagram Interpretation
Focus only on the B circle
What fraction of B is also in A?
P(A|B) = Area(A∩B) / Area(B)
= Size(intersection) / Size(B)
🧠 Data Science Applications
Model Confidence
- Model predicts P(spam|email) = 0.87
- Decision threshold at 0.5
- Classify as spam if > 0.5
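In code, the decision rule is just a comparison against the threshold; a sketch with made-up scores:

```python
# Hypothetical model outputs: P(spam | email) for three emails
scores = [0.87, 0.42, 0.65]
threshold = 0.5

labels = ['spam' if p > threshold else 'not spam' for p in scores]
print(labels)   # ['spam', 'not spam', 'spam']
```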
A/B Testing Logic
- Calculate the p-value from the test
- p-value = P(observing data at least this extreme | null hypothesis is true)
- If p < 0.05, reject null
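A sketch of this workflow with SciPy on synthetic data (the normal distributions and the two-sample t-test are illustrative assumptions, not a prescription):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(0.10, 0.02, 500)     # metric under variant A
treatment = rng.normal(0.11, 0.02, 500)   # metric under variant B

t_stat, p_value = stats.ttest_ind(treatment, control)
print(p_value < 0.05)   # True -> reject the null hypothesis
```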
Risk Estimation
- Credit scoring: P(default|features)
- Fraud detection: P(fraud|transaction)
- Insurance pricing
🏢 Case Study: Recommendation Systems
🎯 Problem
Predict which products a user will like based on past behavior
📐 Probability Approach
Calculate P(purchase|user features, product features)
🧠 ML Models
Logistic regression and neural networks output probabilities
P(user watches show|viewing history) = 0.78 → Highly recommend
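As an illustration only (synthetic data and scikit-learn are assumptions here, not the case study's actual pipeline), a logistic regression exposes the purchase probability via predict_proba:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic example: two user/product features -> purchased (1) or not (0)
X = np.array([[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]])
y = np.array([1, 1, 0, 0])

model = LogisticRegression().fit(X, y)

new_pair = np.array([[0.85, 0.2]])          # a new user-product feature pair
print(model.predict_proba(new_pair)[0, 1])  # P(purchase | features)
```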
✅ Chapter Summary & Cheatsheet
Basic Concepts
Experiment → Outcome → Sample Space → Event
Independence
P(A|B) = P(A) or P(A∩B) = P(A)×P(B)
Conditional
P(A|B) = P(A∩B)/P(B)