Data Science Interview | Data For Dummies

Free sample questions

1

What is the bias–variance tradeoff? Normal

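Bias is error from overly simple assumptions: the model cannot capture the true pattern, so it underfits and performs poorly on training and test data alike. Variance is error from sensitivity to the particular training set: the model fits noise, so it overfits and performs well on training data but poorly on new data. Increasing model complexity typically lowers bias and raises variance, so the goal is the level of complexity that minimizes error on held-out data. Cross-validation estimates that error; regularization and direct tuning of model complexity move the model along the tradeoff.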
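
The tradeoff can be seen in a small experiment. The sketch below is an illustration only, not a recommended workflow: it assumes a synthetic sine-shaped target with Gaussian noise and uses a hand-written k-nearest-neighbours regressor, where k controls model complexity. The names `knn_predict`, `mse`, and `noisy_target` are made up for this example.

```python
import math
import random


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def noisy_target(x):
    # Assumed ground truth for the demo: a sine wave plus Gaussian noise.
    return math.sin(2 * math.pi * x) + random.gauss(0, 0.3)


random.seed(0)
n = 100
train_x = [i / n for i in range(n)]        # evenly spaced training inputs
train_y = [noisy_target(x) for x in train_x]
val_x = [(i + 0.5) / n for i in range(n)]  # held-out inputs between them
val_y = [noisy_target(x) for x in val_x]

results = {}
for k in (1, 10, n):
    train_err = mse(train_x, train_y, train_x, train_y, k)
    val_err = mse(train_x, train_y, val_x, val_y, k)
    results[k] = (train_err, val_err)
    print(f"k={k:3d}  train MSE={train_err:.3f}  held-out MSE={val_err:.3f}")
```

With k=1 the model memorizes the training set, so its training error is zero while its held-out error is not (high variance). With k=n it predicts the overall mean everywhere and misses the sine shape entirely (high bias). An intermediate k would normally be chosen by comparing held-out or cross-validated error.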

2

Precision vs recall — when do you optimize for which? Normal

Precision: Of all predicted positives, how many are correct? That is TP / (TP + FP), so raising it means reducing false positives. Recall: Of all actual positives, how many did we find? That is TP / (TP + FN), so raising it means reducing false negatives. Optimize for precision when false positives are costly (e.g. a spam filter flagging legitimate email). Optimize for recall when missing a positive is costly (e.g. disease screening). F1, the harmonic mean of precision and recall, balances both.
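
As a rough illustration, the sketch below computes the three metrics from raw labels and shows how moving the decision threshold trades one for the other. The labels, scores, and thresholds are invented for the example, and `precision_recall_f1` is a hypothetical helper, not a library function.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up data: true labels and a classifier's predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.15, 0.10, 0.05]

# A strict threshold favours precision; a lenient one favours recall.
strict = precision_recall_f1(y_true, [1 if s >= 0.75 else 0 for s in scores])
lenient = precision_recall_f1(y_true, [1 if s >= 0.35 else 0 for s in scores])

print("threshold 0.75:", strict)
print("threshold 0.35:", lenient)
```

In this toy data the strict threshold makes no false-positive mistakes but misses half the positives, while the lenient threshold finds every positive at the cost of one false alarm. Which setting is better depends on the relative cost of the two error types.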

ML fundamentals

3

Supervised vs unsupervised vs semi-supervised learning? Normal

4

Write a simple train/test split and fit a logistic regression in Python. Code

5

How would you handle severely imbalanced classes in a binary classifier? Logic

Regression & regularization

6

L1 vs L2 regularization — difference and when to use each? Normal

7

Implement k-fold cross-validation from scratch (pseudo-code or Python). Code

8

How does gradient descent work? Batch vs mini-batch vs stochastic? Logic

Classification & trees

9

How does a decision tree choose splits? (e.g. Gini, entropy) Normal

10

Write code to train a random forest and get feature importances. Code

11

When would you use XGBoost vs logistic regression? Logic

Statistics & experimentation

12

What is a p-value, and what is the difference between a Type I and a Type II error? Normal

13

Design an A/B test: sample size, metrics, and how you’d analyze results. Code

14

How would you approach a “predict churn” project end-to-end? Logic
