👶 ABSOLUTE BEGINNER FRIENDLY

🧪 A/B Testing

Learn how companies like Facebook, Netflix, and Amazon test new features to make data-driven decisions!

Chapter 1: What is A/B Testing?

👶 Explain Like I'm 5

Imagine you're selling lemonade and want to know which sign attracts more customers:

  • Sign A: "Fresh Lemonade - $1"
  • Sign B: "Ice Cold Lemonade - Only $1!"

You show Sign A to half the people walking by, and Sign B to the other half.

After counting who bought more, you know which sign is better! 🍋

That's A/B Testing!

📌 In One Sentence

A/B testing means randomly showing one version (A = control) to some users and another version (B = variant) to others, then comparing a metric (e.g. conversion rate) and using a statistical test (e.g. t-test) to decide if the difference is real or just luck. If p < 0.05, we say the result is significant and we can choose the winner.

🔬 A/B Testing in Action

                    100% of Users
                          |
            ┌─────────────┴─────────────┐
            ▼                           ▼
       Version A                   Version B
    (Current Design)            (New Design)
     [Blue Button]              [Red Button]
            |                           |
         50 Users                   50 Users
            |                           |
      5 Purchases                12 Purchases
     (10% Convert)              (24% Convert)
            |                           |
            └─────────────┬─────────────┘
                          ▼
                 🏆 VERSION B WINS!

🌍 Companies That Use A/B Testing

  • Facebook: Tests which news feed layout keeps users engaged longer
  • Netflix: Tests different thumbnail images for movies
  • Amazon: Tests button colors, prices, and product recommendations
  • Google: Tests search result layouts and ad placements
  • Booking.com: Runs over 1,000 experiments simultaneously!

Chapter 2: How A/B Testing Works

📋 The A/B Testing Process

1. Ask a Question

"Will changing the button color from blue to green increase sign-ups?"

2. Create Two Versions

Version A (Control): The current design (blue button)

Version B (Variant): The new design (green button)

3. Split Your Users Randomly

50% see Version A, 50% see Version B (randomly assigned; a code sketch of this split follows step 5)

4. Measure Results

Count conversions (sign-ups, purchases, clicks) for each version

5. Analyze Statistically

Use a T-test to check if the difference is REAL or just random chance
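
As promised in step 3, here is a minimal sketch of the random 50/50 split (the user_id and version column names are made up for illustration):

import numpy as np
import pandas as pd

rng = np.random.default_rng(123)  # fixed seed so the example is reproducible

users = pd.DataFrame({'user_id': range(1, 11)})
# Flip a fair coin for each user: heads -> Version A, tails -> Version B
users['version'] = np.where(rng.random(len(users)) < 0.5, 'A', 'B')
print(users)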

🅰️ Version A (Control)

📱

Current Website Design

Blue "Sign Up" Button

This is what users see now

🅱️ Version B (Variant)

📱

New Website Design

Green "Sign Up" Button

This is what we're testing

Chapter 3: The Statistics (Super Simple!)

🤔 The Big Question

Version A had 10% conversion rate. Version B had 12% conversion rate.

But wait! Is that 2% difference REAL, or just random luck?

Maybe if we tested again, A might do better? 🤷

That's why we use statistics!
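
You can see this "just luck" effect with a tiny simulation (a sketch, not course code): give two identical versions the same true 10% rate and watch the measured rates differ anyway:

import numpy as np

rng = np.random.default_rng(0)
true_rate = 0.10  # BOTH versions convert at exactly 10%

for trial in range(1, 4):
    a = rng.random(500) < true_rate  # 500 simulated users per version
    b = rng.random(500) < true_rate
    print(f"Trial {trial}: A = {a.mean():.1%}, B = {b.mean():.1%}")

# The versions are identical, yet every trial shows a different gap.
# That's sampling noise, and it's exactly what a statistical test guards against.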

The T-Test

👶 What is a T-Test?

The T-Test answers: "Is the difference between two groups REAL or just coincidence?"

It gives us a p-value:

  • p < 0.05: "The difference is probably REAL!" ✅ (less than a 5% chance a gap this big would appear by pure luck)
  • p ≥ 0.05: "It could just be random luck" ❌

p-value     Meaning                Decision
p < 0.01    Very strong evidence   ✅✅ Definitely implement B!
p < 0.05    Strong evidence        ✅ Safe to implement B
p < 0.10    Weak evidence          🤔 Maybe test longer
p ≥ 0.10    No evidence            ❌ Probably no real difference
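
If you prefer code to tables, here are the same thresholds as a small helper (evidence_level is our own illustrative name; the cutoffs are common conventions, not hard laws):

def evidence_level(p):
    # Thresholds mirror the table above
    if p < 0.01:
        return "Very strong evidence: definitely implement B!"
    elif p < 0.05:
        return "Strong evidence: safe to implement B"
    elif p < 0.10:
        return "Weak evidence: maybe test longer"
    else:
        return "No evidence: probably no real difference"

print(evidence_level(0.003))  # Very strong evidence: definitely implement B!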

Chapter 4: A/B Testing in Python

📥 Download the A/B Testing Dataset!

Download this CSV (AB_testing_data.csv, 525 bytes) to follow along with the code examples below.

Step 1: Load the Data

import pandas as pd
from scipy import stats

# Load A/B test data (download from link above!)
# This data has conversion rates for 35 days
data = pd.read_csv("AB_testing_data.csv")

# Let's look at the data
print(data.head(10))

# Output:
#    Day  Conversion fraction A  Conversion fraction B
# 0    1                  0.102                  0.189
# 1    2                  0.095                  0.178
# 2    3                  0.108                  0.192
# ...

What each line does (in simple words)

import pandas as pd – lets us use DataFrames and read CSV files.

from scipy import stats – gives us the t-test we run later.

pd.read_csv("AB_testing_data.csv") – loads the A/B test file; its columns are Day, Conversion fraction A, and Conversion fraction B.

print(data.head(10)) – shows the first 10 rows so you can see the conversion rates for each day.

Step 2: Look at the Averages

# Calculate average conversion rate for each version
avg_A = data['Conversion fraction A'].mean()
avg_B = data['Conversion fraction B'].mean()

print(f"Version A average conversion: {avg_A:.1%}")
print(f"Version B average conversion: {avg_B:.1%}")
print(f"Difference: {(avg_B - avg_A):.1%}")

# Output:
# Version A average conversion: 10.2%
# Version B average conversion: 18.5%
# Difference: 8.3%

# Wow! B looks much better! But is it STATISTICALLY significant?
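
Before testing, it can help to eyeball the data. A quick plotting sketch (this assumes matplotlib is installed and uses the same column names as above):

import matplotlib.pyplot as plt

plt.plot(data['Day'], data['Conversion fraction A'], marker='o', label='Version A')
plt.plot(data['Day'], data['Conversion fraction B'], marker='o', label='Version B')
plt.xlabel('Day')
plt.ylabel('Conversion fraction')
plt.title('Daily conversion rates: A vs B')
plt.legend()
plt.show()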

Step 3: Run the T-Test

# Get the conversion rates for each version
group_A = data['Conversion fraction A']
group_B = data['Conversion fraction B']

# Run the T-Test!
t_stat, p_value = stats.ttest_ind(group_A, group_B)

print("=" * 50)
print("       A/B TEST RESULTS")
print("=" * 50)
print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.6f}")
print("=" * 50)

# Output:
# ==================================================
#        A/B TEST RESULTS
# ==================================================
# T-statistic: -3.74
# P-value: 0.000347
# ==================================================

Step 4: Interpret the Results

# Make a decision based on p-value
alpha = 0.05  # Significance threshold (5%)

if p_value < alpha:
    print("โœ… STATISTICALLY SIGNIFICANT!")
    print("The difference is REAL, not random luck.")
    print(f"We are {(1 - p_value) * 100:.2f}% confident Version B is better!")
    print("\n๐Ÿ‘‰ RECOMMENDATION: Implement Version B!")
else:
    print("โŒ NOT statistically significant.")
    print("The difference might just be random chance.")
    print("\n๐Ÿ‘‰ RECOMMENDATION: Keep Version A or test longer.")

# Output:
# โœ… STATISTICALLY SIGNIFICANT!
# The difference is REAL, not random luck.
# A gap this big would appear by pure luck only 0.03% of the time.
#
# ๐Ÿ‘‰ RECOMMENDATION: Implement Version B!

๐Ÿ† VERSION B WINS!

With p-value = 0.000347 (much less than 0.05), we can confidently say:

"Version B truly performs better - it's not just luck!"

Chapter 5: Common Mistakes to Avoid

Mistake                    Why It's Bad                                  What to Do Instead
Stopping too early         Small sample = unreliable results             Wait for enough data (usually 1000+ users per version)
Testing too many things    Can't tell which change made the difference   Change ONE thing at a time
Peeking at results         Leads to false positives                      Set a fixed end date before starting
Not randomizing properly   Biased groups                                 Use proper random assignment
Ignoring seasonality       Weekend vs weekday behavior differs           Test for at least 1-2 full weeks
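
The "peeking" mistake deserves a demonstration. This simulation (a sketch with made-up settings, not course code) runs many A/A tests, where both versions are identical, and peeks at the p-value at several interim sample sizes; stopping at the first p < 0.05 triggers far more than 5% false alarms:

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_rate = 0.10                  # both versions identical: any "winner" is a false alarm
looks = [250, 500, 750, 1000]     # interim sample sizes where we "peek"
n_sims = 2000
false_alarms = 0

for _ in range(n_sims):
    a = rng.random(1000) < true_rate
    b = rng.random(1000) < true_rate
    for n in looks:
        _, p = stats.ttest_ind(a[:n], b[:n])
        if p < 0.05:              # stop as soon as anything looks "significant"
            false_alarms += 1
            break

print(f"False positive rate with peeking: {false_alarms / n_sims:.1%}")
# Well above the 5% we thought we were running: peeking inflates errors.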


📘 From the course notebook (A/B Testing and Market Basket)

The course source works with ab_testing_data.csv (or a similarly named file): a control group and a variant group with a conversion metric. The key steps are to split the data by group, compute each group's conversion rate, and run a t-test or z-test for significance. Download ab_testing_data.csv from the datasets page, and see AB testing and Market Basket Analysis.pdf in the course source for the slides.

Complete code from course notebook: ab_test.ipynb

Every line of code (verbatim).

# --- Code cell 1 ---
from IPython.core.display import HTML

HTML("""
<style>

h2 { color: blue !important; }
h3 { color: green !important; }
</style>
""")

# --- Code cell 4 ---
import pandas as pd
from scipy import stats

# --- Code cell 5 ---
data = pd.read_csv("AB_testing_data.csv")

# --- Code cell 6 ---
len(data)

# --- Code cell 7 ---
data.head(10)

# --- Code cell 8 ---
data.info()

# --- Code cell 9 ---
data.describe()

# --- Code cell 11 ---
samples_set1 = data['Conversion fraction A']
samples_set2 = data['Conversion fraction B']
stat, p = stats.ttest_ind(samples_set1, samples_set2,equal_var = True)

print("AB test results: ")
print("p-value : ", p)
print("")
print("")

# --- Code cell 12 ---
1-0.00034704350989135126

# --- Code cell 13 ---
1-0.05

# --- Code cell 14 ---
# p value < 0.05 so two versions of website have different means for conversion rate - more than 95% confidence

💭 Short reflection

In one sentence: why is it important to run an A/B test for at least one full week (or more) before deciding a winner?


Chapter 6: Summary

📋 A/B Testing Checklist

  1. โ“ Define what you want to test (hypothesis)
  2. ๐Ÿ…ฐ๏ธ๐Ÿ…ฑ๏ธ Create two versions (A = current, B = new)
  3. ๐Ÿ‘ฅ Split users randomly 50/50
  4. ๐Ÿ“Š Collect enough data (be patient!)
  5. ๐Ÿงฎ Run a T-test to check statistical significance
  6. ๐Ÿ“ˆ If p < 0.05 โ†’ Implement the winner!
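
To wrap the checklist into something reusable, here is a hypothetical helper (run_ab_test is our own name, not from the course) covering steps 5 and 6:

from scipy import stats

def run_ab_test(group_a, group_b, alpha=0.05):
    """Compare two samples and report a plain-language verdict."""
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    if p_value >= alpha:
        return f"Not significant (p = {p_value:.4f}): keep Version A or test longer"
    winner = "B" if group_b.mean() > group_a.mean() else "A"
    return f"Significant (p = {p_value:.4f}): implement Version {winner}!"

# Example with the Chapter 4 dataset:
# print(run_ab_test(data['Conversion fraction A'], data['Conversion fraction B']))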

Concept                    Simple Explanation
A/B Test                   Comparing two versions to see which performs better
Control (A)                The current version (what we're comparing against)
Variant (B)                The new version we're testing
Conversion Rate            % of users who took the desired action
p-value                    How likely a gap this big would appear by pure luck alone
Significance (p < 0.05)    Less than a 5% chance it's just luck → real difference!

🎉 Congratulations!

You now understand A/B testing - a skill used by data scientists at top tech companies!