DSPython
Statistics for Data Science

Hypothesis Testing

Make decisions using data — with confidence, not guesses 🧠📊

1️⃣ What is Hypothesis Testing?

Hypothesis Testing is a statistical method to decide whether a claim (about a population) is supported by sample data.

We never "prove" with 100% certainty — we measure how strong the evidence is and then decide.

  • 🔹 State two opposite statements: Null (H₀) and Alternative (H₁).
  • 🔹 Compute a test statistic from sample data.
  • 🔹 Use p-value or critical values to decide: reject or fail to reject H₀.

2️⃣ Null & Alternative Hypothesis (Quick)

One-sample Mean

Is the average equal to a given value (here, 70)?
H₀: μ = 70
H₁: μ ≠ 70

Two-sample Difference

Are two groups different?
H₀: μ₁ = μ₂
H₁: μ₁ ≠ μ₂
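
As a rough illustration of the two-sample case, the sketch below runs a two-tailed t-test with `scipy.stats.ttest_ind`. The two score lists are invented for this example, and using Welch's variant (`equal_var=False`) is an assumption on our part, not something this lesson prescribes.

```python
from scipy import stats

# Hypothetical scores for two groups (made-up numbers, for illustration only)
group_a = [72, 75, 68, 70, 74, 71, 69, 73]
group_b = [78, 74, 80, 77, 79, 75, 81, 76]

# H0: mu1 = mu2, H1: mu1 != mu2 (two-tailed is the default)
# equal_var=False selects Welch's t-test, which does not assume equal variances
t_stat, p_val = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_val:.4f}")
```

A negative t-statistic here simply means the first group's mean is below the second's; the two-tailed p-value ignores the sign.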

Proportion Test

Is success rate > 50%?
H₀: p = 0.5
H₁: p > 0.5
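
One possible way to test this hypothesis is an exact binomial test, sketched below with only the standard library. The counts (60 successes in 100 trials) are made up for illustration, and the variable names are our own.

```python
from math import comb

# Hypothetical sample: 60 successes in 100 trials (made-up numbers)
successes, n, p0 = 60, 100, 0.5

# Right-tailed p-value: P(X >= successes) if H0 (p = 0.5) is true,
# summed directly from the binomial probability formula
p_val = sum(
    comb(n, k) * p0**k * (1 - p0) ** (n - k)
    for k in range(successes, n + 1)
)

print(f"P-value: {p_val:.4f}")
```

Only the upper tail is summed because H₁ is directional (p > 0.5).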

3️⃣ One-tailed vs Two-tailed (Visual)

Which side(s) of the distribution represent extreme values?

Two-Tailed Test (≠)

[Figure: bell curve with a "Fail to reject H₀" region in the center and a "Reject" region in each tail]

Rejection region on both sides

One-Tailed Test (Right >)

[Figure: bell curve with a "Fail to reject H₀" region on the left and a single "Reject" region in the right tail]

Rejection region on one side only
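
To make the two rejection regions concrete, here is a small sketch that computes where they begin for a t-distribution. The values `alpha = 0.05` and `df = 9` are assumptions chosen for illustration.

```python
from scipy import stats

alpha = 0.05
df = 9  # degrees of freedom, e.g. a sample of 10 values

# Two-tailed: alpha is split evenly between both tails
two_tailed_crit = stats.t.ppf(1 - alpha / 2, df)

# Right-tailed: all of alpha sits in the upper tail
right_tailed_crit = stats.t.ppf(1 - alpha, df)

print(f"Two-tailed:   reject H0 if |t| > {two_tailed_crit:.3f}")
print(f"Right-tailed: reject H0 if  t  > {right_tailed_crit:.3f}")
```

The one-tailed cutoff is lower because the whole 5% is concentrated on one side, which is why the direction should be chosen before looking at the data.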

4️⃣ P-value — Intuition

P-value = "If H₀ is true, how likely is the observed result (or more extreme)?"

Rule of Thumb:

  • 📉 Small p-value (< 0.05) → Strong evidence against H₀ (Reject).
  • 📈 Large p-value (≥ 0.05) → Not enough evidence against H₀ (Fail to Reject).
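
The p-value is a tail area, and a short sketch can show how it is read off a t-distribution. The test statistic and degrees of freedom below are hypothetical numbers picked for illustration.

```python
from scipy import stats

t_stat = 2.0  # hypothetical test statistic
df = 9        # hypothetical degrees of freedom

# Survival function sf(t) = P(T > t): the area in the right tail
p_right = stats.t.sf(t_stat, df)

# Two-tailed: count the same area in both tails
p_two = 2 * stats.t.sf(abs(t_stat), df)

print(f"Right-tailed p-value: {p_right:.4f}")
print(f"Two-tailed p-value:   {p_two:.4f}")
```

With these made-up numbers the right-tailed p-value falls below 0.05 while the two-tailed one does not, so the same statistic can lead to different decisions depending on how H₁ is stated.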

Visualizing P-value

[Figure: sampling distribution under H₀; the shaded tail area beyond the observed result (x̄) is the p-value]

5️⃣ Case Study: The "30-Min Delivery" Claim

Manual Calculation vs Python Code

The Problem

A pizza shop claims its average delivery time is 30 mins. You suspect it is longer, so you test H₀: μ = 30 against H₁: μ > 30 (one-tailed, α = 0.05). You collect 10 random delivery times:
[32, 35, 29, 34, 33, 36, 30, 32, 31, 33]

Mean (x̄) = 32.5 StdDev (s) = 2.17 n = 10

Here s is the sample standard deviation (n - 1 in the denominator).

Maths Breakdown 🧮

Step 1: Standard Error

SE = s / √n = 2.17 / 3.16 = 0.687

Step 2: t-statistic

t = (x̄ - μ) / SE
t = (32.5 - 30) / 0.687
t = 2.5 / 0.687 = 3.64

Step 3: Decision

For df = n - 1 = 9 and a one-tailed test at α = 0.05, critical t is 1.83.
Since 3.64 > 1.83, we Reject H₀.

Python Solution scipy.stats
import scipy.stats as stats

# 1. The Data
data = [32, 35, 29, 34, 33, 36, 30, 32, 31, 33]

# 2. Perform t-test
# 'greater' because we test if time > 30
# (the 'alternative' argument needs SciPy 1.6 or newer)
t_stat, p_val = stats.ttest_1samp(
    data,
    popmean=30,
    alternative='greater'
)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_val:.4f}")
Output
> T-statistic: 3.64
> P-value: 0.0027

Conclusion: p-value (0.0027) is less than 0.05, so we reject H₀. The data gives strong evidence that the shop's average delivery time is longer than 30 mins.
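
The manual steps can also be reproduced with the standard library alone, which is a handy cross-check on hand arithmetic. This is a minimal sketch; the name `mu_0` for the claimed mean is our own choice.

```python
import math
import statistics

data = [32, 35, 29, 34, 33, 36, 30, 32, 31, 33]
mu_0 = 30  # claimed mean delivery time under H0

n = len(data)
x_bar = statistics.mean(data)
s = statistics.stdev(data)      # sample standard deviation (n - 1 denominator)

se = s / math.sqrt(n)           # Step 1: standard error
t_stat = (x_bar - mu_0) / se    # Step 2: t-statistic

print(f"Mean: {x_bar:.2f}  StdDev: {s:.2f}  SE: {se:.3f}")
print(f"T-statistic: {t_stat:.2f}")
```

Note that `statistics.stdev` uses the sample formula, while `statistics.pstdev` divides by n and would give a different t-statistic.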

6️⃣ Type I & Type II Errors (Visual)

This graph shows the trade-off between the two error types.

[Figure: H₀ (Null) and H₁ (Alternative) distributions with the critical value marked; shaded areas show α (Type I) and β (Type II)]
Type I (False Positive): Rejecting H₀ when it's actually True.
Type II (False Negative): Failing to reject H₀ when it's actually False.
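
Both error rates can be estimated by simulation. The sketch below is one way to do it, under assumptions we picked for illustration: normally distributed data with standard deviation 2, samples of size 10, a two-tailed one-sample t-test, and a true mean of 31 for the "H₀ is false" scenario.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha, n, n_sims = 0.05, 10, 5000

# Scenario 1: H0 is true (the population mean really is 30)
null_samples = rng.normal(loc=30, scale=2, size=(n_sims, n))
p_null = stats.ttest_1samp(null_samples, popmean=30, axis=1).pvalue
type_1_rate = np.mean(p_null < alpha)   # rejected a true H0

# Scenario 2: H0 is false (the population mean is actually 31)
alt_samples = rng.normal(loc=31, scale=2, size=(n_sims, n))
p_alt = stats.ttest_1samp(alt_samples, popmean=30, axis=1).pvalue
type_2_rate = np.mean(p_alt >= alpha)   # failed to reject a false H0

print(f"Estimated Type I rate (alpha): {type_1_rate:.3f}")
print(f"Estimated Type II rate (beta): {type_2_rate:.3f}")
```

The Type I rate should land close to the chosen α, while the Type II rate depends on the sample size and on how far the true mean is from the value in H₀.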

7️⃣ Decision Flowchart

1. Define H₀ & H₁
2. Set α (0.05)
3. Find P-value
4. P < α ?
   YES: Reject H₀
   NO: Fail to Reject H₀
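
The final branch of the flowchart fits in a few lines of code. The helper name `decide` is hypothetical and only illustrates the rule.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the flowchart's last step: compare the p-value to alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


print(decide(0.03))   # p < alpha
print(decide(0.20))   # p >= alpha
```

Setting `alpha` before computing the p-value, as the flowchart orders the steps, keeps the threshold from being tuned to the result.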

Interview Checkpoint 🎯 (5)

1. What is a p-value in simple terms?
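It is the probability of seeing the observed results (or something more extreme) if the Null Hypothesis were true. A tiny p-value means the data would be surprising under the Null Hypothesis, so it counts as evidence against it. It is not the probability that the Null Hypothesis is true.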
2. When do you use a one-tailed test?
Use a one-tailed test when the question is directional: you care only whether a metric is greater than a value, or only whether it is less than it. Use a two-tailed test if you just want to know whether it is different.
3. What is the difference between Type I and Type II errors?
Type I (α): False Positive (You rejected H₀, but H₀ was true).
Type II (β): False Negative (You failed to reject H₀, but H₀ was false).
4. If p = 0.03 and alpha = 0.05, what is the conclusion?
Since p (0.03) < alpha (0.05), we Reject the Null Hypothesis. The result is statistically significant.