From 04e31df21d5f1b8dc73cb6df17b1710030c507e4 Mon Sep 17 00:00:00 2001
From: "Benjamin T. Vincent"
Date: Thu, 17 Jul 2025 21:27:02 +0100
Subject: [PATCH] flesh out more detail about causal impact in the interrupted time series setting

---
 docs/source/notebooks/its_pymc.ipynb | 60 +++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 15 deletions(-)

diff --git a/docs/source/notebooks/its_pymc.ipynb b/docs/source/notebooks/its_pymc.ipynb
index bc9d8dc3..cbd05801 100644
--- a/docs/source/notebooks/its_pymc.ipynb
+++ b/docs/source/notebooks/its_pymc.ipynb
@@ -7,7 +7,45 @@
    "source": [
     "# Example Interrupted Time Series (ITS) with `pymc` models\n",
     "\n",
-    "This notebook shows an example of using interrupted time series, where we do not have untreated control units of a similar nature to the treated unit and we just have a single time series of observations and the predictor variables are simply time and month."
+    "Interrupted Time Series (ITS) analysis is a powerful approach for estimating the causal impact of an intervention or treatment when you have a single time series of observations. The key idea is to compare what actually happened after the intervention to what would have happened in the absence of the intervention (the \"counterfactual\"). To do this, we train a statistical model on the pre-intervention data (when no treatment has occurred) and then use this model to forecast the expected outcomes into the post-intervention period. The difference between the observed outcomes and these model-based counterfactual predictions provides an estimate of the causal effect of the intervention, along with a measure of uncertainty if using a Bayesian approach.\n",
+    "\n",
+    "This notebook shows an example of using interrupted time series, where we do not have untreated control units of a similar nature to the treated unit and we just have a single time series of observations and the predictor variables are simply time and month. So the only real way to estimate the counterfactual is by training a model on the pre-intervention data and then using this model to forecast the expected outcomes into the post-intervention period."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## What do we mean by \"causal impact\" in Interrupted Time Series?\n",
+    "\n",
+    "In the context of interrupted time series (ITS) analysis, especially when using Bayesian models, the term **causal impact** refers to the estimated effect of an intervention or event on an outcome of interest.\n",
+    "\n",
+    "### Instantaneous and Cumulative Bayesian Causal Effects\n",
+    "\n",
+    "- The **Instantaneous Bayesian Causal Effect** at each time point is the difference between the observed outcome and the model's posterior predictive distribution for the counterfactual (i.e., what would have happened without the intervention). This is not just a single number, but a full probability distribution that reflects our uncertainty.\n",
+    "- The **Cumulative Bayesian Causal Impact** is the sum of these instantaneous effects over the post-intervention period, again represented as a distribution.\n",
+    "\n",
+    "### Mathematical expression\n",
+    "Let $y_t$ be the observed outcome at time $t$ (after the intervention), and $\\tilde{y}_t$ be the model's counterfactual prediction for the same time point. Then:\n",
+    "- **Instantaneous effect:** $\\Delta_t = y_t - \\tilde{y}_t$\n",
+    "- **Cumulative effect (up to time $T$):** $C_T = \\sum_{t=1}^T \\Delta_t$\n",
+    "\n",
+    "In Bayesian analysis, both $\\tilde{y}_t$ and $\\Delta_t$ are distributions, not just point estimates.\n",
+    "\n",
+    "### Why does this matter for decision making?\n",
+    "These metrics allow you to answer questions like:\n",
+    "- \"How much did the intervention change the outcome, compared to what would have happened otherwise?\"\n",
+    "- \"What is the total effect of the intervention over time, and how certain are we about it?\"\n",
+    "\n",
+    "By providing both instantaneous and cumulative effects, along with their uncertainty, you can make more informed business decisions and better understand the impact of your interventions."
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Interrupted Time Series (ITS) Example"
    ]
   },
   {
@@ -34,14 +72,6 @@
     "seed = 42"
    ]
   },
-  {
-   "attachments": {},
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Interrupted Time Series (ITS) Example"
-   ]
-  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -170,7 +200,11 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {},
+   "metadata": {
+    "tags": [
+     "hide-output"
+    ]
+   },
    "outputs": [
     {
      "name": "stderr",
@@ -304,11 +338,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "As well as the model coefficients, we might be interested in the average causal impact and average cumulative causal impact.\n",
-    "\n",
-    ":::{note}\n",
-    "Better output for the summary statistics are in progress!\n",
-    ":::"
+    "As well as the model coefficients, we might be interested in the average causal impact and average cumulative causal impact."
    ]
   },
   {
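
Note on the "Mathematical expression" cell added by this patch: the two formulas can be illustrated with a minimal NumPy sketch. It is not taken from the notebook and does not use the library's own API. The helper names `causal_effects` and `summarize`, the array shapes, and the simulated numbers are all assumptions made for illustration. The counterfactual $\tilde{y}_t$ is represented by posterior predictive draws, so $\Delta_t$ and $C_T$ come out as distributions (one value per draw) rather than point estimates.

```python
import numpy as np


def causal_effects(y_obs, y_cf_draws):
    """Return per-draw instantaneous and cumulative effects.

    y_obs:      observed post-intervention outcomes, shape (T,)
    y_cf_draws: counterfactual posterior predictive draws, shape (n_draws, T)
    (Both shapes are assumptions of this sketch.)
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_cf_draws = np.asarray(y_cf_draws, dtype=float)
    # Instantaneous effect: Delta_t = y_t - y~_t, computed for every draw.
    delta = y_obs[None, :] - y_cf_draws
    # Cumulative effect: C_T = sum_{t=1}^{T} Delta_t, for every T and every draw.
    cumulative = np.cumsum(delta, axis=1)
    return delta, cumulative


def summarize(draws, prob=0.94):
    """Posterior mean and equal-tailed interval per time point (not an HDI)."""
    tail = 100 * (1 - prob) / 2
    lower, upper = np.percentile(draws, [tail, 100 - tail], axis=0)
    return draws.mean(axis=0), lower, upper


# Simulated placeholder inputs, purely illustrative (not results from the notebook).
rng = np.random.default_rng(42)
n_draws, T = 2000, 12
y_cf_draws = rng.normal(loc=50.0, scale=2.0, size=(n_draws, T))
y_obs = np.full(T, 55.0)

delta, cumulative = causal_effects(y_obs, y_cf_draws)
mean_c, lower_c, upper_c = summarize(cumulative)
print(f"Cumulative effect at T={T}: mean={mean_c[-1]:.1f}, "
      f"94% interval=({lower_c[-1]:.1f}, {upper_c[-1]:.1f})")
```

Keeping the draw dimension until the final summary is what lets the cumulative effect carry its own uncertainty: summing per-draw differences first and summarizing afterwards is not the same as summing per-time-point interval bounds.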