Statistics for ML
Master ML-relevant statistics: estimators, confidence intervals, hypothesis tests, p-values, multiple comparisons, and common gotchas.
Students can compute and interpret confidence intervals and tests, understand failure modes (p-hacking, confounding), and connect stats to ML evaluation.
Progress: 0/6 tasks
Interview Angles
- Why does a 95% CI have ~95% coverage by design?
- Slicing dashboards can create accidental p-hacking.
FAANG Gotchas
- A CI is for the mean, not individual outcomes.
- A p-value is not P(H0 is true).
Statistics for ML: FAANG-Level Lab
Goal: Confidence intervals, hypothesis testing, and interpretation for ML engineering.
Outcome: You can quantify uncertainty and avoid p-value traps.
Section 1: Estimators (Mean/Variance)
Task 1.1: Unbiased sample variance
Implement sample mean and unbiased sample variance (ddof=1) without calling np.var(..., ddof=1).
- mean = sum(x) / n
- unbiased var = sum((x - mean)^2) / (n - 1)
Explain: Why divide by (n-1) instead of n?
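A minimal sketch of Task 1.1 (function names are my own choice); dividing by n - 1 compensates for the fact that deviations are measured from the sample mean rather than the true mean, which would otherwise bias the variance downward:

```python
import numpy as np

def sample_mean(x):
    """Sample mean: sum(x) / n."""
    x = np.asarray(x, dtype=float)
    return x.sum() / x.size

def unbiased_var(x):
    """Unbiased sample variance: sum((x - mean)^2) / (n - 1)."""
    x = np.asarray(x, dtype=float)
    m = sample_mean(x)
    return ((x - m) ** 2).sum() / (x.size - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data))   # 5.0
print(unbiased_var(data))  # 32/7, matches np.var(data, ddof=1)
```

You can sanity-check the result against `np.var(data, ddof=1)`, which the task asks you not to call inside your implementation.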
Section 2: Confidence Interval for Mean (Normal approx)
Task 2.1: 95% CI for mean
Compute a 95% CI for the mean using the normal approximation: CI = mean ± z * s/sqrt(n), where z ≈ 1.96.
- Use the unbiased sample std
FAANG gotcha: CI is about the mean, not individual outcomes.
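One possible sketch of Task 2.1, using the formula above (the function name is mine):

```python
import numpy as np

def ci95_mean(x):
    """95% CI for the mean via normal approximation: mean ± 1.96 * s / sqrt(n)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = x.mean()
    s = x.std(ddof=1)              # unbiased sample std
    half = 1.96 * s / np.sqrt(n)   # half-width of the interval
    return m - half, m + half

lo, hi = ci95_mean([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(lo, hi)  # interval centered at the sample mean 5.0
```

Note the gotcha: this interval quantifies uncertainty about the *mean*, so individual observations routinely fall outside it.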
Task 2.2: Coverage simulation
Simulate repeated sampling from Normal(mu=0, sigma=1). Estimate how often the 95% CI contains the true mean.
- Run many trials
- Count coverage
Explain: Why isn't coverage exactly 0.95 in finite simulation?
Section 3: Hypothesis Testing (Two-sample test intuition)
Task 3.1: Permutation test for A/B (no scipy)
Given samples A and B, test whether mean(B) - mean(A) is significant via permutation.
- Combine the samples
- Shuffle and split
- Compute the diff distribution
- p-value = fraction of permuted diffs at least as extreme as the observed one (use |diff| for two-sided)
FAANG gotcha: p-value is not P(H0 true).
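The steps above can be sketched as follows (no scipy; function and parameter names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_test(a, b, n_perm=5000):
    """Two-sided permutation test for mean(b) - mean(a)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = b.mean() - a.mean()
    pooled = np.concatenate([a, b])   # combine samples
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)           # shuffle and split
        diff = pooled[len(a):].mean() - pooled[:len(a)].mean()
        if abs(diff) >= abs(observed):  # two-sided comparison
            count += 1
    return count / n_perm
```

The resulting p-value is the probability of seeing a difference at least this extreme *if the null were true*; it is not the probability that the null is true.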
Section 4: Multiple Comparisons (Gotcha)
Task 4.1: Bonferroni correction
If you run m tests at alpha=0.05, Bonferroni uses alpha/m per test.
Compute adjusted alpha for m=20 and explain why this matters in feature slicing / metric dashboards.
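A quick sketch of Task 4.1 (the helper name is mine), including the family-wise error rate that motivates the correction when slicing dashboards into many metrics:

```python
def bonferroni_alpha(alpha, m):
    """Per-test significance level under Bonferroni correction."""
    return alpha / m

m, alpha = 20, 0.05
print(bonferroni_alpha(alpha, m))        # 0.0025

# Without correction, the chance of at least one false positive across
# m independent tests is 1 - (1 - alpha)^m.
fwer_uncorrected = 1 - (1 - alpha) ** m
print(round(fwer_uncorrected, 2))        # ~0.64
```

With 20 uncorrected tests at alpha = 0.05, you expect a false positive roughly two times out of three, which is why dashboards sliced many ways produce spurious "significant" segments.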