S30 AI Lab (www.thes30.com)
#7

Statistics for ML

Easy · Math for ML · W2 D4

Master ML-relevant statistics: estimators, confidence intervals, hypothesis tests, p-values, multiple comparisons, and common gotchas.

Students can compute and interpret confidence intervals and tests, understand failure modes (p-hacking, confounding), and connect stats to ML evaluation.

Interview Angles

  • Why does a CI have ~95% coverage by design?
  • Slicing dashboards can create accidental p-hacking.

FAANG Gotchas

  • A CI is for the mean, not individual outcomes.
  • A p-value is not P(H0 true).

Asked At

Google
GitHub
Python 3 — Notebook
Dataset & Setup

Statistics for ML — FAANG-Level Lab

Goal: Confidence intervals, hypothesis testing, and interpretation for ML engineering.

Outcome: You can quantify uncertainty and avoid p-value traps.

Section 1 — Estimators (Mean/Variance)

Task 1.1: Unbiased sample variance

Implement sample mean and unbiased sample variance (ddof=1) without calling np.var(..., ddof=1).

  • mean = sum(x)/n
  • unbiased var = sum((x-mean)^2)/(n-1)

Explain: Why divide by (n-1) instead of n?
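
One possible way to approach this task is sketched below. The function names `sample_mean` and `unbiased_variance` and the example data are illustrative choices, not part of the lab's starter code.

```python
import numpy as np


def sample_mean(x):
    """Arithmetic mean: sum(x) / n."""
    x = np.asarray(x, dtype=float)
    return x.sum() / x.size


def unbiased_variance(x):
    """Sample variance with the n-1 denominator (equivalent to ddof=1)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    m = sample_mean(x)
    # Squared deviations from the sample mean, divided by n-1 rather than n.
    return ((x - m) ** 2).sum() / (n - 1)


# Illustrative data (hypothetical, chosen so the arithmetic is easy to check).
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(x))        # 5.0
print(unbiased_variance(x))  # 32/7, about 4.571
```

On the "why n-1" question: the deviations are measured from the sample mean, which is itself fitted to the data and minimizes the sum of squared deviations. That sum is therefore too small on average, with expectation (n-1)·sigma^2, so dividing by n-1 removes the bias.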

Section 2 — Confidence Interval for Mean (Normal approx)

Task 2.1: 95% CI for mean

Compute a 95% CI for the mean using the normal approximation: CI = mean ± z * s/sqrt(n), where z ≈ 1.96 and s is the sample standard deviation.

  • Use the unbiased sample std (ddof=1)

FAANG gotcha: CI is about the mean, not individual outcomes.
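
A minimal sketch of the formula above, assuming NumPy is available. The helper name `mean_ci_normal` and the simulated sample are hypothetical, introduced only for illustration.

```python
import math

import numpy as np


def mean_ci_normal(x, z=1.96):
    """Normal-approximation CI for the mean: mean ± z * s / sqrt(n)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.sum() / n
    # Sample std with the n-1 denominator, as the task requires.
    s = math.sqrt(((x - mean) ** 2).sum() / (n - 1))
    half_width = z * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical data: 200 draws from Normal(10, 2).
rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=200)
lo, hi = mean_ci_normal(sample)
print(f"95% CI for the mean: [{lo:.3f}, {hi:.3f}]")
```

The interval narrows like 1/sqrt(n) as the sample grows, while the spread of individual outcomes stays near s. That is why the interval describes uncertainty about the mean and not where a single observation will fall. For small n, a t quantile in place of 1.96 gives a slightly wider interval.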

Task 2.2: Coverage simulation

Simulate repeated sampling from Normal(mu=0, sigma=1). Estimate how often the 95% CI contains the true mean.

  • Run many trials
  • Count coverage

Explain: Why isn't coverage exactly 0.95 in finite simulation?
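
The simulation could look roughly like the sketch below. The function name `coverage` and the defaults (n=50, 10,000 trials, seed 0) are assumptions made for illustration, not values given by the lab.

```python
import numpy as np


def coverage(n=50, trials=10_000, mu=0.0, sigma=1.0, z=1.96, seed=0):
    """Fraction of trials whose normal-approximation CI contains mu."""
    rng = np.random.default_rng(seed)
    # One row per trial, n observations per row.
    samples = rng.normal(mu, sigma, size=(trials, n))
    means = samples.mean(axis=1)
    stds = samples.std(axis=1, ddof=1)  # unbiased-variance std
    half_width = z * stds / np.sqrt(n)
    covered = (means - half_width <= mu) & (mu <= means + half_width)
    return covered.mean()


rate = coverage()
print(f"empirical coverage: {rate:.4f}")
```

Two effects keep the result from being exactly 0.95. First, the estimate is itself a Monte Carlo average, with standard error of roughly sqrt(0.95 * 0.05 / trials). Second, using z = 1.96 with an estimated s slightly undercovers at small n, because the standardized mean follows a t distribution rather than a normal one.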

Section 3 — Hypothesis Testing (Two-sample test intuition)

Task 3.1: Permutation test for A/B (no scipy)

Given samples A and B, test whether mean(B) - mean(A) is significant via permutation.

  • Combine samples
  • Shuffle and split
  • Compute diff distribution
  • p-value = fraction of permuted diffs >= observed diff (one-sided); for a two-sided test, use |diff| >= |observed|

FAANG gotcha: p-value is not P(H0 true).
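
The steps above can be sketched as follows, using only NumPy. The function name `permutation_test`, the default of 10,000 permutations, and the simulated groups are illustrative assumptions. The `+1` in the p-value is a common adjustment that counts the observed labeling as one permutation. It is not part of the task statement and can be dropped to match the plain "fraction of diffs" definition.

```python
import numpy as np


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for mean(b) - mean(a)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = b.mean() - a.mean()

    pooled = np.concatenate([a, b])  # combine samples
    diffs = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(pooled)  # shuffle
        # Split back into groups of the original sizes.
        diffs[i] = perm[a.size:].mean() - perm[:a.size].mean()

    # Two-sided: count permuted diffs at least as extreme as the observed one.
    # The +1 terms are an assumed adjustment so the p-value is never exactly 0.
    p_value = (np.sum(np.abs(diffs) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical A/B data: B has a slightly higher true mean than A.
rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 1.0, size=100)
group_b = rng.normal(0.5, 1.0, size=100)
diff, p = permutation_test(group_a, group_b)
print(f"observed diff = {diff:.3f}, p-value = {p:.4f}")
```

The p-value here is the probability, under shuffled labels, of seeing a difference at least this large. It says how surprising the data are if H0 holds, not how likely H0 is to be true.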

Section 4 — Multiple Comparisons (Gotcha)

Task 4.1: Bonferroni correction

If you run m tests and want the overall (family-wise) false positive rate to stay at or below alpha=0.05, Bonferroni tests each one at alpha/m.

Compute adjusted alpha for m=20 and explain why this matters in feature slicing / metric dashboards.
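
A short worked calculation for m=20 follows. The family-wise error rate figures assume the 20 tests are independent, which is a simplification. Bonferroni's guarantee itself does not need independence.

```python
alpha = 0.05
m = 20

# Bonferroni: each individual test is run at alpha / m.
alpha_per_test = alpha / m
print(alpha_per_test)  # 0.0025

# Chance of at least one false positive across m independent true-null tests.
fwer_uncorrected = 1 - (1 - alpha) ** m
fwer_bonferroni = 1 - (1 - alpha_per_test) ** m
print(round(fwer_uncorrected, 3))  # about 0.642
print(round(fwer_bonferroni, 3))   # about 0.049
```

With 20 uncorrected slices of a dashboard, there is roughly a 64% chance that at least one looks "significant" by noise alone. Picking out that slice after the fact is p-hacking, even when unintentional.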
