DSPython

Probability Basics

Master the mathematics of uncertainty - from basic concepts to conditional probability and real-world data science applications.

Uncertainty Mathematics · Intermediate Level · 45 min

🎯 What is Probability?

Definition

Probability is the branch of mathematics that measures the likelihood or chance of events occurring.

It quantifies uncertainty using numbers between 0 (impossible) and 1 (certain).

Why It Matters

  • Make predictions under uncertainty
  • Quantify risk and confidence
  • Build statistical models
  • Make data-driven decisions

📊 Probability Scale: 0 to 1

  • 0 (Impossible): Sun rises in the west
  • 1/6 ≈ 0.17: Rolling a 3 on a die
  • 0.5 (Equally Likely): Coin toss
  • 1 (Certain): Sun rises tomorrow

Data Science Insight:
The probabilistic outputs of machine learning models, such as class probabilities and confidence scores, are grounded in probability theory.

🔑 Basic Probability Concepts

1. Experiment

Definition: Any process that generates outcomes

Examples:
  • Tossing a coin
  • Rolling a die
  • Drawing a card

2. Outcome

Definition: A single result of an experiment

Examples:
  • Coin: Heads
  • Die: 4
  • Card: Ace of Spades

3. Sample Space (S)

Definition: Set of all possible outcomes

Examples:
  • Coin: S = {H, T}
  • Die: S = {1,2,3,4,5,6}
  • Cards: S = {52 cards}

4. Event (A, B, C...)

Definition: Subset of sample space

Examples:
  • Die: A = "Even number" = {2,4,6}
  • Cards: B = "Heart" = {13 hearts}
  • Coin: C = "Heads" = {H}

🎲 Dice Rolling Example

Sample Space: S = {1, 2, 3, 4, 5, 6} (all possible outcomes)

Event A: "Even number" = {2, 4, 6} (subset of sample space)

Event B: "Number > 3" = {4, 5, 6} (another subset)
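
The dice example maps directly onto Python's built-in set type. The following is a minimal sketch; the variable names are chosen here for illustration and do not come from the lesson.

```python
# Sample space and events for one roll of a six-sided die, as Python sets.
sample_space = {1, 2, 3, 4, 5, 6}

event_a = {x for x in sample_space if x % 2 == 0}  # "Even number"
event_b = {x for x in sample_space if x > 3}       # "Number > 3"

# Every event is a subset of the sample space.
assert event_a <= sample_space and event_b <= sample_space

print(sorted(event_a))            # [2, 4, 6]
print(sorted(event_b))            # [4, 5, 6]
print(sorted(event_a & event_b))  # outcomes in both events: [4, 6]
```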

📐 Core Probability Formulas

1. Classical Probability

P(A) = favorable / total

Where:
favorable = Number of outcomes in A
total = Total outcomes in sample space
Assumes all outcomes are equally likely

Example: P(even on die) = 3/6 = 0.5
favorable = |{2,4,6}| = 3, total = 6
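
The classical formula can be sketched in a few lines of Python. `classical_probability` is a hypothetical helper written for this example, and it is only valid when every outcome in the sample space is equally likely. `fractions.Fraction` keeps the result exact.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(event) = favorable / total, assuming equally likely outcomes."""
    favorable = len(event & sample_space)  # ignore anything outside S
    return Fraction(favorable, len(sample_space))

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = classical_probability(even, die)
print(p_even, float(p_even))  # 1/2 0.5
```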

2. Complement Rule

P(A′) = 1 − P(A)

Where:
A′ = "Not A" or complement of A
P(A) + P(A′) = 1

Example: P(rain) = 0.3
P(no rain) = 1 - 0.3 = 0.7
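
The complement rule is most useful for "at least one" questions, where the complement is easier to compute than the event itself. The sketch below first reproduces the rain example, then adds an illustrative dice question that is not part of the lesson and assumes the rolls are independent.

```python
from fractions import Fraction

# Lesson example: P(rain) = 0.3, so P(no rain) = 1 - P(rain).
p_rain = Fraction(3, 10)
p_no_rain = 1 - p_rain
print(p_no_rain)  # 7/10

# Illustrative extra (assumes independent rolls of a fair die):
# P(at least one six in 4 rolls) = 1 - P(no six in 4 rolls).
p_no_six = Fraction(5, 6) ** 4
p_at_least_one_six = 1 - p_no_six
print(p_at_least_one_six, round(float(p_at_least_one_six), 4))  # 671/1296 0.5177
```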

3. Independent Events

P(A ∩ B) = P(A) × P(B)

Where:
A ∩ B = "A and B" (intersection)
Valid only when A and B are independent (neither event affects the other)

Example: Two coin tosses
P(H and H) = 0.5 × 0.5 = 0.25
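
One way to check the multiplication rule for the two-coin example is to enumerate the sample space and count. This is a sketch assuming fair coins; the `prob` helper is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Fraction of outcomes for which the predicate `event` is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_first_heads = prob(lambda o: o[0] == "H")
p_second_heads = prob(lambda o: o[1] == "H")
p_both_heads = prob(lambda o: o == ("H", "H"))

print(p_both_heads)                                    # 1/4
print(p_both_heads == p_first_heads * p_second_heads)  # True
```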

4. Conditional Probability

P(A|B) = P(A ∩ B) / P(B)

Where:
P(A|B) = "Probability of A given B"
B has already occurred

Example: Draw card from deck
P(Ace|Heart) = (1/52) / (13/52) = 1/13
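
The card example can be reproduced by restricting the deck to the outcomes where B holds and counting how many of those also satisfy A. The helper and predicate names below are hypothetical, and the sketch assumes a standard 52-card deck with every card equally likely.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(RANKS, SUITS))  # 52 equally likely (rank, suit) cards

def conditional_probability(is_a, is_b, space):
    """P(A|B) = |A and B| / |B| for equally likely outcomes; needs P(B) > 0."""
    b_outcomes = [x for x in space if is_b(x)]
    if not b_outcomes:
        raise ValueError("P(B) = 0, so P(A|B) is undefined")
    a_and_b = [x for x in b_outcomes if is_a(x)]
    return Fraction(len(a_and_b), len(b_outcomes))

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "hearts"

p_ace_given_heart = conditional_probability(is_ace, is_heart, deck)
print(p_ace_given_heart)  # 1/13
```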

5. Dependent Events

P(A ∩ B) = P(A) × P(B|A)

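Where:
Events affect each other
P(B|A) ≠ P(B)
The formula holds for any two events; when A and B are independent, P(B|A) = P(B) and it reduces to P(A) × P(B)

Example: Draw cards without replacement
P(2 Aces) = (4/52) × (3/51) = 1/221 ≈ 0.0045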
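
The two-aces calculation can be checked both with the formula and by brute force over every ordered pair of distinct cards. This is a sketch; the deck encoding (one character per rank, one per suit) is an arbitrary choice made for the example.

```python
from fractions import Fraction
from itertools import permutations

# 52 cards as (rank, suit) pairs; "T" stands for ten.
deck = [(rank, suit) for rank in "A23456789TJQK" for suit in "SHDC"]

# Formula: P(A and B) = P(A) * P(B|A)
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # one ace and one card already gone
p_two_aces = p_first_ace * p_second_ace_given_first
print(p_two_aces)  # 1/221

# Cross-check: count ordered draws of two distinct cards that are both aces.
draws = list(permutations(deck, 2))
both_aces = sum(1 for first, second in draws
                if first[0] == "A" and second[0] == "A")
print(Fraction(both_aces, len(draws)) == p_two_aces)  # True
```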

6. Addition Rule

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Where:
A ∪ B = "A or B" (union)
Prevents double-counting intersection

Example: Die: A = even = {2,4,6}, B = 4 or more = {4,5,6}, so A∩B = {4,6}
P(A∪B) = 3/6 + 3/6 - 2/6 = 4/6 = 2/3
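
The addition rule can be verified on the die example by comparing a direct count of the union against the formula. This is a small sketch with a hypothetical helper `p`, assuming a fair die.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
a = {2, 4, 6}                    # even
b = {x for x in die if x >= 4}   # 4 or more

def p(event):
    """Classical probability on a fair die."""
    return Fraction(len(event), len(die))

direct = p(a | b)                    # count the union directly
by_rule = p(a) + p(b) - p(a & b)     # subtract the double-counted overlap
print(direct, by_rule)  # 2/3 2/3
```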

⚖️ Independent vs Dependent Events

📋 Key Differences

Aspect                    Independent Events              Dependent Events
Definition                Events don't affect each other  Events affect each other
Conditional Probability   P(A|B) = P(A)                   P(A|B) ≠ P(A)
Multiplication Rule       P(A∩B) = P(A) × P(B)            P(A∩B) = P(A) × P(B|A)
Examples                  Coin tosses, dice rolls         Cards without replacement

🎲 Independent Events Example

Scenario: Tossing two fair coins
Events:
  • A = First coin is Heads
  • B = Second coin is Heads
P(A) = 0.5
P(B) = 0.5
P(A ∩ B) = 0.5 × 0.5 = 0.25
Why Independent: First toss doesn't affect second toss

🃏 Dependent Events Example

Scenario: Drawing two cards from deck without replacement
Events:
  • A = First card is Ace
  • B = Second card is Ace
P(A) = 4/52
P(B|A) = 3/51
P(A ∩ B) = (4/52) × (3/51) = 12/2652 = 1/221 ≈ 0.0045
Why Dependent: First draw changes deck composition

🧪 Test for Independence

Check Condition:
A and B are independent if any one of these holds (they are equivalent when P(A) > 0 and P(B) > 0):
  • P(A|B) = P(A)
  • P(B|A) = P(B)
  • P(A∩B) = P(A) × P(B)
Example Check:
P(A) = 0.4, P(B) = 0.5
P(A∩B) = 0.2
0.4 × 0.5 = 0.2 ✓
∴ Events are independent
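
The product check translates into a one-line function. The sketch below uses a hypothetical helper, `are_independent`, and compares with a tolerance because probabilities stored as floats, or estimated from data, rarely match exactly. The tolerance value is an assumption, not a standard.

```python
import math

def are_independent(p_a, p_b, p_a_and_b, rel_tol=1e-9):
    """Return True if P(A and B) equals P(A) * P(B) within a tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, rel_tol=rel_tol)

print(are_independent(0.4, 0.5, 0.2))  # True: 0.4 * 0.5 = 0.2
print(are_independent(0.4, 0.5, 0.3))  # False: 0.3 != 0.2
```

With probabilities estimated from a finite sample, an exact or near-exact match is unlikely even for truly independent events, so a statistical test is usually a better fit than this direct comparison.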

🎯 Conditional Probability

📖 Understanding P(A|B)

P(A|B)
"Probability of A given B"
Interpretation:
"Given that event B has already occurred, what's the probability of A?"
P(A|B) = P(A ∩ B) / P(B)
Where P(B) > 0 (B must be possible)

🏥 Medical Testing Example

Scenario: Disease testing where:
  • Disease prevalence = 1% (P(D) = 0.01)
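  • Test accuracy = 95% for both sick and healthy people (P(+|D) = 0.95, P(-|H) = 0.95, where H = healthy)
P(D) = 0.01 (prior probability)
P(+|D) = 0.95 (sensitivity)
P(-|H) = 0.95 (specificity), so P(+|H) = 0.05
Question: P(D|+) = ? (Probability of disease given positive test)
Answer: By Bayes' theorem,
P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|H)P(H)]
= (0.95 × 0.01) / (0.95 × 0.01 + 0.05 × 0.99)
= 0.0095 / 0.059 ≈ 0.161, about 16% (Counterintuitive!)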
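
The 16% figure can be reproduced with a short function. This is a sketch with a hypothetical helper name; it applies Bayes' theorem to the three numbers given in the scenario.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(D|+) via Bayes' theorem."""
    true_positive = prevalence * sensitivity               # P(+|D) * P(D)
    false_positive = (1 - prevalence) * (1 - specificity)  # P(+|H) * P(H)
    return true_positive / (true_positive + false_positive)

p_disease_given_positive = posterior_given_positive(
    prevalence=0.01, sensitivity=0.95, specificity=0.95
)
print(round(p_disease_given_positive, 3))  # 0.161
```

Out of every 1,000 people tested, about 9.5 are true positives and about 49.5 are false positives, which is why most positive results come from healthy people.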

📊 Venn Diagram Interpretation

[Venn diagram: two overlapping circles A and B; the overlap is A∩B]
P(A|B) in Venn Diagram:
Focus only on the B circle
What fraction of B is also in A?
Calculation:
P(A|B) = Area(A∩B) / Area(B)
= Size(intersection) / Size(B)

🧠 Data Science Applications

🤖

Model Confidence

Classification Models: Output probability scores
Example: Email spam detection
  • Model predicts P(spam|email) = 0.87
  • Decision threshold at 0.5
  • Classify as spam if > 0.5
Why Important: Quantifies prediction certainty
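
Turning a probability score into a label is a single comparison. In the sketch below, the `classify` function and the 0.9 alternative threshold are illustrative assumptions; only the 0.87 score and the 0.5 threshold come from the example above.

```python
def classify(p_spam, threshold=0.5):
    """Label an email as spam when its predicted probability exceeds the threshold."""
    return "spam" if p_spam > threshold else "not spam"

print(classify(0.87))                 # spam
print(classify(0.87, threshold=0.9))  # not spam
```

Raising the threshold trades missed spam for fewer legitimate emails flagged by mistake, so the right value depends on which error is more costly.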
📊

A/B Testing Logic

Statistical Significance: Based on probability theory
Process:
  • Calculate the p-value from the test
  • p-value = P(data at least as extreme as observed | null hypothesis is true)
  • If p < 0.05 (a conventional threshold), reject the null
Example: Website conversion rate test
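
For a conversion rate test, one common choice is a two-proportion z-test. The sketch below is one possible implementation using only the standard library, not the lesson's own method. The visitor and conversion counts are made up for illustration, and the normal approximation assumes reasonably large samples.

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both variants share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_b - p_a) / se
    # P(|Z| >= |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: variant A converts 200 of 2000 visitors, B converts 260 of 2000.
p_value = two_proportion_p_value(200, 2000, 260, 2000)
print(p_value < 0.05)  # True: reject the null at the 5% level
```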
⚠️

Risk Estimation

Financial/Cybersecurity: Quantifying risks
Applications:
  • Credit scoring: P(default|features)
  • Fraud detection: P(fraud|transaction)
  • Insurance pricing
Key Formula: Expected Loss = Σ (P(outcome) × Loss(outcome)), summed over all possible outcomes
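
The expected-loss sum is a probability-weighted average. The scenarios and amounts in the sketch below are invented for illustration; the only requirement is that the probabilities cover every outcome and sum to 1.

```python
# Hypothetical loss scenarios for one insurance policy: (probability, loss).
scenarios = [
    (0.90, 0),       # no claim
    (0.08, 1_000),   # minor claim
    (0.02, 20_000),  # major claim
]

# The probabilities must describe a complete set of outcomes.
assert abs(sum(p for p, _ in scenarios) - 1) < 1e-9

expected_loss = sum(p * loss for p, loss in scenarios)
print(round(expected_loss, 2))  # 480.0
```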

🏢 Case Study: Recommendation Systems

🎯 Problem

Predict which products a user will like based on past behavior

📐 Probability Approach

Calculate P(purchase|user features, product features)

🧠 ML Models

Logistic regression, Neural networks output probabilities

Illustrative Example: Netflix-style recommendation
P(user watches show|viewing history) = 0.78 → Highly recommend
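
Logistic regression produces such a probability by passing a weighted sum of features through the sigmoid function. In the sketch below, the feature names, weights, and bias are entirely hypothetical; in a real system they would be learned from data.

```python
import math

def sigmoid(z):
    """Map a real-valued score into the range 0..1, avoiding overflow for large |z|."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

# Hypothetical features and weights, for illustration only.
features = {"watched_similar_shows": 1.0, "hours_last_week": 0.5}
weights = {"watched_similar_shows": 1.5, "hours_last_week": 0.8}
bias = -0.6

score = bias + sum(weights[name] * value for name, value in features.items())
p_watch = sigmoid(score)
print(round(p_watch, 2))  # 0.79
```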

🎯 Interview Questions Preview

Q: If P(A) = 0.3, P(B) = 0.4, and P(A∩B) = 0.12, are A and B independent?
A: Yes, because P(A)×P(B) = 0.3×0.4 = 0.12 = P(A∩B)
Q: In a medical test with 95% accuracy, why might P(disease|positive) be low?
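A: Because disease prevalence is low, false positives from the large healthy group outnumber true positives from the small sick group. Ignoring this is the base rate fallacy.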

✅ Chapter Summary & Cheatsheet

🎯

Basic Concepts

Experiment → Outcome → Sample Space → Event

0 ≤ P(A) ≤ 1
⚖️

Independence

P(A|B) = P(A) or P(A∩B) = P(A)×P(B)

P(A∩B) = P(A)×P(B)
🎯

Conditional

P(A|B) = P(A∩B)/P(B)


⚡ Probability Rules Cheatsheet

P(A) = favorable/total
P(A') = 1 - P(A)
P(A∩B) = P(A)×P(B) (independent events only)
P(A|B) = P(A∩B)/P(B)
P(A∪B) = P(A)+P(B)-P(A∩B)

📐 Must-Know Formulas

Classical
P(A) = favorable/total
Complement
P(A') = 1 - P(A)
Independent
P(A∩B) = P(A)×P(B)
Conditional
P(A|B) = P(A∩B)/P(B)