GitHub - welpo/stop-early-stopping: Why you shouldn't peek at significance levels to decide when to stop an experiment

Why you shouldn't peek at significance in A/B tests

What is Early Stopping?

Early stopping occurs when you repeatedly check your A/B test results and stop the experiment as soon as you see a "statistically significant" result. This is a form of p-hacking that dramatically increases false positive rates.

For example, with a significance level of 0.05 (5%), you expect a 5% false positive rate when checking just once at the end. But if you check results daily during a 30-day test, you could see false positive rates as high as 30%, meaning many of your "winning" tests are actually detecting nothing at all!

Features

Interactive simulation: See how early stopping affects false positive rates in real-time
Adjustable parameters: Test different experiment durations, checking frequencies, and significance levels
Visual comparison: Directly compare the false positive rates of proper testing vs early stopping

How it works

This tool simulates A/B tests where both variants have identical performance (a true "null hypothesis" scenario):

Each simulation generates two identical variants with:
- 250 visitors per day per variant
- 10% conversion rate for both variants
- No actual difference between variants
The tool runs 625 simulated tests with two different stopping criteria:
- Left grid: Tests that only check for significance once, after the full duration
- Right grid: Tests that check at your specified frequency and stop when significance is found
Results are color-coded:
- 🟢: tests correctly showing no significant difference
- 🟥: tests incorrectly showing a significant difference (false positives)

Statistical significance is determined using a two-proportion Z-test, mathematically equivalent to a chi-squared test for this 2×2 case.

Contributing

Please do! I'd appreciate bug reports, improvements (however minor), suggestions…

The simulator uses vanilla JavaScript, HTML, and CSS. To run locally:

Clone the repository: git clone https://github.com/welpo/stop-early-stopping.git
Navigate to the project directory: cd stop-early-stopping
Start a local server: python3 -m http.server
Visit http://localhost:8000 in your browser

The important files are:

index.html: Basic structure
styles.css: Styles
app.js: Main UI logic
simulationWorker.js: Web worker that runs simulations in background

Need help?

Something not working? Have an idea? Let me know!

Questions or ideas → Start a discussion
Found a bug? → Report it here
Feature request? → Let me know

License

This simulator is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.githooks		.githooks
.github		.github
app		app
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
COPYING		COPYING
README.md		README.md
robot.png		robot.png
screenshot.png		screenshot.png
sumi.toml		sumi.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Repository files navigation

Why you shouldn't peek at significance in A/B tests

What is Early Stopping?

Features

How it works

Contributing

Need help?

License

About

Uh oh!

Sponsor this project

Uh oh!

Languages

Uh oh!

License

welpo/stop-early-stopping

Folders and files

Latest commit

History

Repository files navigation

Why you shouldn't peek at significance in A/B tests

What is Early Stopping?

Features

How it works

Contributing

Need help?

License

About

Topics

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Sponsor this project

Uh oh!

Languages