diff --git a/.gitignore b/.gitignore
index fd1a6a2..e40cd46 100644
--- a/.gitignore
+++ b/.gitignore
@@ -170,3 +170,4 @@ todos.txt
 # experiments/
+cluster-experiments.code-workspace
diff --git a/README.md b/README.md
index f770fc7..aad07e4 100644
--- a/README.md
+++ b/README.md
@@ -11,89 +11,40 @@
 https://codecov.io/gh/david26694/cluster-experiments/branch/main/graph/badge.svg
 ![License](https://img.shields.io/github/license/david26694/cluster-experiments)
 [![Pypi version](https://img.shields.io/pypi/pyversions/cluster-experiments.svg)](https://pypi.python.org/pypi/cluster-experiments)
-A Python library for end-to-end A/B testing workflows, featuring:
-- Experiment analysis and scorecards
-- Power analysis (simulation-based and normal approximation)
-- Variance reduction techniques (CUPED, CUPAC)
-- Support for complex experimental designs (cluster randomization, switchback experiments)
-
-## Key Features
-
-### 1. Power Analysis
-- **Simulation-based**: Run Monte Carlo simulations to estimate power
-- **Normal approximation**: Fast power estimation using CLT
-- **Minimum Detectable Effect**: Calculate required effect sizes
-- **Multiple designs**: Support for:
-  - Simple randomization
-  - Variance reduction techniques in power analysis
-  - Cluster randomization
-  - Switchback experiments
-- **Dict config**: Easy to configure power analysis with a dictionary
-
-### 2. Experiment Analysis
-- **Analysis Plans**: Define structured analysis plans
-- **Metrics**:
-  - Simple metrics
-  - Ratio metrics
-- **Dimensions**: Slice results by dimensions
-- **Statistical Methods**:
-  - GEE
-  - Mixed Linear Models
-  - Clustered / regular OLS
-  - T-tests
-  - Synthetic Control
-- **Dict config**: Easy to define analysis plans with a dictionary
-
-### 3. Variance Reduction
-- **CUPED** (Controlled-experiment Using Pre-Experiment Data):
-  - Use historical outcome data to reduce variance, choose any granularity
-  - Support for several covariates
-- **CUPAC** (Control Using Predictors as Covariates):
-  - Use any scikit-learn compatible estimator to predict the outcome with pre-experiment data
-
-## Quick Start
-
-### Power Analysis Example
+**`cluster-experiments`** is a comprehensive Python library for end-to-end A/B testing workflows.
-```python
-import numpy as np
-import pandas as pd
-from cluster_experiments import PowerAnalysis, NormalPowerAnalysis
+
+---
-# Create sample data
-N = 1_000
-df = pd.DataFrame({
-    "target": np.random.normal(0, 1, size=N),
-    "date": pd.to_datetime(
-        np.random.randint(
-            pd.Timestamp("2024-01-01").value,
-            pd.Timestamp("2024-01-31").value,
-            size=N,
-        )
-    ),
-})
+
+## πŸš€ Key Features
-# Simulation-based power analysis with CUPED
-config = {
-    "analysis": "ols",
-    "perturbator": "constant",
-    "splitter": "non_clustered",
-    "n_simulations": 50,
-}
-pw = PowerAnalysis.from_dict(config)
-power = pw.power_analysis(df, average_effect=0.1)
-
-# Normal approximation (faster)
-npw = NormalPowerAnalysis.from_dict({
-    "analysis": "ols",
-    "splitter": "non_clustered",
-    "n_simulations": 5,
-    "time_col": "date",
-})
-power_normal = npw.power_analysis(df, average_effect=0.1)
-power_line_normal = npw.power_line(df, average_effects=[0.1, 0.2, 0.3])
+
+### πŸ“Œ Experiment Design & Planning
+- **Power analysis** and **Minimum Detectable Effect (MDE)** estimation
+  - **Normal Approximation (CLT-based)**: Fast, analytical formulas assuming approximate normality
+    - Best for large sample sizes and standard A/B tests
+  - **Monte Carlo Simulation**: Empirically estimate power or MDE by simulating many experiments
+    - Ideal for complex or non-standard designs (e.g., clustering, non-normal outcomes)
+- Supports complex **experimental designs**, including:
+  - 🏒 **Cluster randomization**
+  - πŸ”„ **Switchback experiments**
+  - πŸ“Š **Observational studies**, including **synthetic control**
+
+### πŸ§ͺ Statistical Methods for Analysis
+- πŸ“Œ **Ordinary Least Squares (OLS)** and **Clustered OLS**, with support for covariates
+- 🎯 **Variance Reduction Techniques**: **CUPED** and **CUPAC**
+
+### πŸ“ˆ Scalable Experiment Analysis with Scorecards
+- Generate **Scorecards** that summarize experiment results across multiple metrics
+- Scorecards include **confidence intervals, relative and absolute effect sizes, and p-values**
+
+`cluster-experiments` empowers analysts and data scientists with **scalable, reproducible, and statistically robust** A/B testing workflows.
+
+πŸ”— **Get Started:** [Documentation](https://david26694.github.io/cluster-experiments/)
+
+πŸ“¦ **Installation:**
+```sh
+pip install cluster-experiments
+```
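+
+⚑ **Quick example** (an illustrative sketch of a simulation-based power analysis; the column name, splitter and effect size are placeholders to adapt to your own data):
+
+```python
+import numpy as np
+import pandas as pd
+from cluster_experiments import PowerAnalysis
+
+# Toy data: a single outcome column for 1,000 units
+df = pd.DataFrame({"target": np.random.normal(0, 1, size=1_000)})
+
+# Simulation-based power for a hypothetical absolute effect of 0.1
+pw = PowerAnalysis.from_dict(
+    {
+        "analysis": "ols",
+        "perturbator": "constant",
+        "splitter": "non_clustered",
+        "n_simulations": 50,
+    }
+)
+print(pw.power_analysis(df, average_effect=0.1))
+```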
 # MDE calculation
 mde = npw.mde(df, power=0.8)
@@ -106,7 +57,6 @@ mde_timeline = npw.mde_time_line(
 
 print(power, power_line_normal, power_normal, mde, mde_timeline)
 ```
-
 ### Experiment Analysis Example
 
 ```python
diff --git a/docs/quickstart.md b/docs/quickstart.md
new file mode 100644
index 0000000..2b10e45
--- /dev/null
+++ b/docs/quickstart.md
@@ -0,0 +1,134 @@
+# Quickstart
+
+## Installation
+
+You can install **Cluster Experiments** via pip:
+
+```bash
+pip install cluster-experiments
+```
+
+!!! info "Python Version Support"
+    **Cluster Experiments** requires **Python 3.9 or higher**. Make sure your environment meets this requirement before proceeding with the installation.
+
+---
+
+## Usage
+
+Designing and analyzing experiments can feel overwhelming at times. After formulating a testable hypothesis,
+you're faced with a series of routine tasks: collecting and transforming raw data, measuring the statistical significance of your results, and constructing confidence intervals.
+It can quickly become a repetitive and error-prone process.
+*Cluster Experiments* is here to change that.
+
+Built on top of well-known packages like `pandas`, `numpy`, `scipy` and `statsmodels`, it automates the core steps of an experiment, streamlining your workflow and saving you time and effort while maintaining statistical rigor.
+
+## Key Features
+
+- **Modular Design**: Each component (`Splitter`, `Perturbator`, and `Analysis`) is independent, reusable, and can be combined in any way you need.
+- **Flexibility**: Whether you're conducting a simple A/B test or a complex clustered experiment, Cluster Experiments adapts to your needs.
+- **Statistical Rigor**: Built-in support for advanced statistical methods, including clustered standard errors and variance reduction techniques like CUPED and CUPAC, ensures that your experiments maintain high standards.
+
+The core functionality of *Cluster Experiments* revolves around several intuitive, self-contained classes and methods:
+
+- **Splitter**: Define how your control and treatment groups are split.
+- **Perturbator**: Specify the type of effect you want to test.
+- **Analysis**: Perform statistical inference to measure the impact of your experiment.
+
+
+---
+
+### `Splitter`: Defining Control and Treatment Groups
+
+The `Splitter` classes are responsible for dividing your data into control and treatment groups. The way you split your data depends on the **metric** (e.g., simple, ratio) you want to observe and the unit of observation (e.g., users, sessions, time periods).
+
+#### Features:
+
+- **Randomized Splits**: Simple random assignment of units to control and treatment groups.
+- **Stratified Splits**: Ensure balanced representation of key segments (e.g., geographic regions, user cohorts).
+- **Time-Based Splits**: Useful for switchback experiments or time-series data.
+
+```python
+from cluster_experiments import RandomSplitter
+
+splitter = RandomSplitter(
+    cluster_cols=["cluster_id"],  # Split by clusters
+    treatment_col="treatment",  # Name of the treatment column
+)
+```
+
+---
+
+### `Perturbator`: Simulating the Treatment Effect
+
+The `Perturbator` classes define the type of effect you want to test. A perturbator simulates the treatment effect on your data, allowing you to evaluate the impact of your experiment.
+
+#### Features:
+
+- **Absolute Effects**: Add a fixed uplift to the treatment group.
+- **Relative Effects**: Apply a percentage-based uplift to the treatment group.
+- **Custom Effects**: Define your own effect size or distribution.
+
+```python
+from cluster_experiments import ConstantPerturbator
+
+perturbator = ConstantPerturbator(
+    average_effect=5.0  # Add a constant (absolute) effect of 5.0 to the treatment group
+)
+```
+
+---
+
+### `Analysis`: Measuring the Impact
+
+Once your data is split and the treatment effect is applied, the `Analysis` component helps you measure the statistical significance of the experiment results. It provides tools for calculating effects, confidence intervals, and p-values.
+
+You can use it for both **experiment design** (pre-experiment phase) and **analysis** (post-experiment phase).
+
+#### Features:
+
+- **Statistical Tests**: Perform t-tests, OLS regression, and other hypothesis tests.
+- **Effect Size**: Calculate both absolute and relative effects.
+- **Confidence Intervals**: Construct confidence intervals for your results.
+
+Example:
+
+```python
+from cluster_experiments import TTestClusteredAnalysis
+
+analysis = TTestClusteredAnalysis(
+    cluster_cols=["cluster_id"],  # Cluster-level analysis
+    treatment_col="treatment",  # Name of the treatment column
+    target_col="outcome"  # Metric to analyze
+)
+```
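+
+For post-experiment analysis, the same object can be applied directly to your experiment data. The sketch below is illustrative only: it assumes a dataframe with `cluster_id`, `treatment` and `outcome` columns, uses `"A"`/`"B"` as control/treatment labels, and calls a `get_pvalue` method; treat these names as assumptions and check them against the API reference before relying on them.
+
+```python
+import numpy as np
+import pandas as pd
+from cluster_experiments import TTestClusteredAnalysis
+
+# Toy post-experiment data: 10 clusters, odd-numbered clusters treated ("B")
+rng = np.random.default_rng(42)
+n_clusters, n_per_cluster = 10, 50
+cluster = [f"cluster_{i}" for i in range(n_clusters) for _ in range(n_per_cluster)]
+treatment = ["B" if i % 2 else "A" for i in range(n_clusters) for _ in range(n_per_cluster)]
+outcome = rng.normal(0, 1, n_clusters * n_per_cluster) + np.where(np.array(treatment) == "B", 0.3, 0.0)
+df = pd.DataFrame({"cluster_id": cluster, "treatment": treatment, "outcome": outcome})
+
+analysis = TTestClusteredAnalysis(
+    cluster_cols=["cluster_id"],
+    treatment_col="treatment",
+    target_col="outcome",
+)
+
+# Assumed method name: aggregates per cluster and runs the t-test
+print(analysis.get_pvalue(df))
+```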
+
+---
+
+### Putting It All Together for Experiment Design
+
+You can combine all of these classes as inputs to the `PowerAnalysis` class, which lets you evaluate different experiment settings, compute power lines, and estimate Minimum Detectable Effects (MDEs).
+
+```python
+from cluster_experiments import PowerAnalysis
+from cluster_experiments import RandomSplitter, ConstantPerturbator, TTestClusteredAnalysis
+
+# Define the components
+splitter = RandomSplitter(cluster_cols=["cluster_id"], treatment_col="treatment")
+perturbator = ConstantPerturbator(average_effect=0.1)
+analysis = TTestClusteredAnalysis(cluster_cols=["cluster_id"], treatment_col="treatment", target_col="outcome")
+
+# Create the experiment
+experiment = PowerAnalysis(
+    perturbator=perturbator,
+    splitter=splitter,
+    analysis=analysis,
+    target_col="outcome",
+    treatment_col="treatment"
+)
+
+# Run the power analysis on your pre-experiment data
+# (df is a pandas DataFrame with the outcome and cluster columns)
+power = experiment.power_analysis(df, average_effect=0.1)
+```
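+
+The snippet above is schematic: `power_analysis` needs a pandas DataFrame with your pre-experiment data. Here is a minimal, self-contained sketch on simulated clustered data, reusing the same classes as above; the number of clusters, column names and effect size are illustrative only.
+
+```python
+import numpy as np
+import pandas as pd
+from cluster_experiments import PowerAnalysis
+from cluster_experiments import RandomSplitter, ConstantPerturbator, TTestClusteredAnalysis
+
+# Simulated pre-experiment data: 20 clusters with 100 observations each
+rng = np.random.default_rng(0)
+n_clusters, n_obs = 20, 100
+df = pd.DataFrame(
+    {
+        "cluster_id": np.repeat([f"cluster_{i}" for i in range(n_clusters)], n_obs),
+        "outcome": rng.normal(0, 1, n_clusters * n_obs),
+    }
+)
+
+experiment = PowerAnalysis(
+    perturbator=ConstantPerturbator(average_effect=0.1),
+    splitter=RandomSplitter(cluster_cols=["cluster_id"], treatment_col="treatment"),
+    analysis=TTestClusteredAnalysis(
+        cluster_cols=["cluster_id"], treatment_col="treatment", target_col="outcome"
+    ),
+    target_col="outcome",
+    treatment_col="treatment",
+    n_simulations=100,
+)
+
+# Estimated power to detect a hypothetical absolute effect of 0.1 on the outcome
+print(experiment.power_analysis(df, average_effect=0.1))
+```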
+
+---
+
+## Next Steps
+
+- Explore the **Core Documentation** for detailed explanations of each component.
+- Check out the **Usage Examples** for practical applications of the package.
diff --git a/docs/stylesheets/overrides.css b/docs/stylesheets/overrides.css
new file mode 100644
index 0000000..bf676b7
--- /dev/null
+++ b/docs/stylesheets/overrides.css
@@ -0,0 +1,13 @@
+/* Custom admonition styling */
+.md-typeset .admonition {
+  border-radius: 8px;
+  border-left: 4px solid var(--md-primary-fg-color);
+  box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
+}
+
+/* Code block styling */
+.md-typeset pre {
+  border-radius: 8px;
+  background-color: var(--md-code-bg-color);
+  box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
+}
diff --git a/docs/stylesheets/style.css b/docs/stylesheets/style.css
new file mode 100644
index 0000000..c22863b
--- /dev/null
+++ b/docs/stylesheets/style.css
@@ -0,0 +1,9 @@
+/* Apply text justification to all paragraphs in the documentation */
+.md-content p {
+  text-align: justify;
+}
+
+/* Optionally, justify lists or other specific elements */
+.md-content ul, .md-content ol {
+  text-align: justify;
+}
diff --git a/mkdocs.yml b/mkdocs.yml
index 6ded671..db41ed4 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,62 +1,90 @@
 site_name: Cluster Experiments Docs
-extra_css: [style.css]
 repo_url: https://github.com/david26694/cluster-experiments
 site_url: https://david26694.github.io/cluster-experiments/
 site_description: Functions to design and run clustered experiments
 site_author: David Masip
 use_directory_urls: false
 edit_uri: blob/main/docs/
+docs_dir: docs
+site_dir: site
+
 nav:
-  - Home:
-    - Index: index.md
-    - Cupac example: cupac_example.ipynb
-    - Custom classes: create_custom_classes.ipynb
-    - Switchback:
-      - Stratified switchback: switchback.ipynb
-      - Switchback calendar visualization: plot_calendars.ipynb
-      - Visualization - 4-hour switches: plot_calendars_hours.ipynb
-    - Multiple treatments: multivariate.ipynb
-    - AA test clustered: aa_test.ipynb
-    - Paired T test: paired_ttest.ipynb
-    - Different hypotheses tests: analysis_with_different_hypotheses.ipynb
-    - Washover: washover_example.ipynb
-    - Normal Power:
-      - Compare with simulation: normal_power.ipynb
-      - Time-lines: normal_power_lines.ipynb
-    - Synthetic control: synthetic_control.ipynb
-    - Experiment analysis workflow:
-        experiment_analysis.ipynb
-    - Delta Method Analysis: delta_method.ipynb
-  - API:
-    - Experiment analysis methods: api/experiment_analysis.md
-    - Perturbators: api/perturbator.md
-    - Splitter: api/random_splitter.md
-    - Pre experiment outcome model: api/cupac_model.md
-    - Power config: api/power_config.md
-    - Power analysis: api/power_analysis.md
-    - Washover: api/washover.md
-    - Metric: api/metric.md
-    - Variant: api/variant.md
-    - Dimension: api/dimension.md
-    - Hypothesis Test: api/hypothesis_test.md
-    - Analysis Plan: api/analysis_plan.md
+  - Home: ../README.md
+  - Quickstart: quickstart.md
+  - Core Documentation:
+    - API:
+      - Experiment analysis methods: api/experiment_analysis.md
+      - Perturbators: api/perturbator.md
+      - Splitter: api/random_splitter.md
+      - Pre experiment outcome model: api/cupac_model.md
+      - Power config: api/power_config.md
+      - Power analysis: api/power_analysis.md
+      - Washover: api/washover.md
+      - Metric: api/metric.md
+      - Variant: api/variant.md
+      - Dimension: api/dimension.md
+      - Hypothesis Test: api/hypothesis_test.md
+      - Analysis Plan: api/analysis_plan.md
+  - Usage Examples:
+    - Variance Reduction:
+      - CUPAC: cupac_example.ipynb
+    - Switchback:
+      - Stratified switchback: switchback.ipynb
+      - Switchback calendar visualization: plot_calendars.ipynb
+      - Visualization - 4-hour switches: plot_calendars_hours.ipynb
+    - Multiple treatments: multivariate.ipynb
+    - AA test clustered: aa_test.ipynb
+    - Paired T test: paired_ttest.ipynb
+    - Different hypotheses tests: analysis_with_different_hypotheses.ipynb
+    - Washover: washover_example.ipynb
+    - Normal Power:
+      - Compare with simulation: normal_power.ipynb
+      - Time-lines: normal_power_lines.ipynb
+    - Synthetic control: synthetic_control.ipynb
+    - Delta Method Analysis: delta_method.ipynb
+    - Experiment analysis workflow: experiment_analysis.ipynb
+  - Contribute:
+    - Contributing Guidelines: development/contributing.md
+    - Code Structure: development/code_structure.md
+    - Testing: development/testing.md
+    - Building Documentation: development/building_docs.md
+
+extra:
+  social:
+    - icon: fontawesome/brands/github
+      link: https://github.com/david26694/cluster-experiments
+
 plugins:
   - mkdocstrings:
       watch:
        - cluster_experiments
   - mkdocs-jupyter
   - search
+
+extra_css:
+  - stylesheets/overrides.css
+  - stylesheets/style.css
+
 copyright: Copyright © 2022 Maintained by David Masip.
+
 theme:
   name: material
   font:
     text: Ubuntu
     code: Ubuntu Mono
-  feature:
-    tabs: true
+  features:
+    - content.tabs
+    - content.code.annotate
+    - navigation.instant
+    - navigation.tracking
+    - navigation.sections
+    - navigation.top
   palette:
     primary: indigo
     accent: blue
+
 markdown_extensions:
+  - admonition
   - codehilite
   - pymdownx.inlinehilite
   - pymdownx.superfences