Incorrect Code in Chapter 20 (and theoretical nitpicking) #402

aliquod opened this issue Aug 2, 2024 · 1 comment
aliquod commented Aug 2, 2024

First of all, thank you for making this very accessible book!

In the section about continuous treatment in chapter 20, you defined

$$Y^*_i := (Y_i- \bar{Y})\dfrac{(T_i - M(T_i))}{(T_i - M(T_i))^2}$$

to be the pseudo-outcome¹, and then you dropped the denominator on the grounds that you only care about comparing treatment effects, not their absolute values. But dropping it does not preserve the order². Why not cancel the common factor instead and simplify it to

$$Y^*_i = \dfrac{Y_i- \bar{Y}}{T_i - M(T_i)}?$$
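
To make the ordering point concrete, here is a minimal sketch with two hypothetical units whose residuals are made-up numbers (they are not taken from the book's data). It compares the pseudo-outcome with the denominator kept against the version with the denominator dropped.

```python
import numpy as np

# Hypothetical residuals for two units (made-up numbers, for illustration only).
# Unit A has a small treatment residual, unit B a large one.
t_res = np.array([0.1, 2.0])    # T_i - M(T_i)
y_res = np.array([0.05, 0.4])   # Y_i - Y_bar

ratio = y_res / t_res      # denominator kept: (Y_i - Y_bar) / (T_i - M(T_i))
product = y_res * t_res    # denominator dropped: (Y_i - Y_bar) * (T_i - M(T_i))

print("ratio:  ", ratio)    # unit A ranks above unit B
print("product:", product)  # unit B ranks above unit A
```

Because each unit is rescaled by its own $(T_i - M(T_i))^2$, the two versions can rank the same units in opposite order, which is what this toy case shows.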

Now onto the actual issue: the code block that came after

$$Y^*_i = (Y_i- \bar{Y})(T_i - M(T_i))$$

is

y_star_cont = (train["price"] - train["price"].mean()
               *train["sales"] - train["sales"].mean())

but this is missing some parentheses, so it actually computes

$$Y^*_i \overset{???}{=} Y_i- (\bar{Y} \times T_i) - M(T_i).$$
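
A minimal sketch of the precedence problem and one possible fix follows. The tiny `train` DataFrame is a made-up stand-in for the book's data, and treating `price` as the treatment and `sales` as the outcome is an assumption. The product is symmetric in the two centered columns, so the fix is the same either way.

```python
import pandas as pd

# Toy stand-in for the book's `train` DataFrame (values are made up).
train = pd.DataFrame({
    "price": [1.0, 2.0, 3.0, 6.0],
    "sales": [10.0, 8.0, 7.0, 3.0],
})

# As quoted: `*` binds tighter than `-`, so only
# train["price"].mean() * train["sales"] is multiplied.
y_star_as_written = (train["price"] - train["price"].mean()
                     * train["sales"] - train["sales"].mean())

# Parenthesized: the product of the two centered columns.
y_star_fixed = ((train["price"] - train["price"].mean())
                * (train["sales"] - train["sales"].mean()))

print(y_star_as_written.tolist())  # [-36.0, -29.0, -25.0, -10.0]
print(y_star_fixed.tolist())       # [-6.0, -1.0, 0.0, -12.0]
```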

Footnotes

  1. I assume the denominator is an estimate of the conditional variance $\mathrm{Var}(T \mid X)$, but for most regression methods the squared residual tends to underestimate it.

  2. In the end we average these values to estimate the CATE. In the randomized-treatment case every term is scaled by the same σ², so the scaling can be undone without changing the order. Here each term has a different factor.

diepala commented May 2, 2025

Adding to this: the proof shows that subtracting $\bar{Y}$ from $Y_i$ is not strictly necessary, but I think doing so reduces variance. I believe the variance could be reduced further by replacing $\bar{Y}$ with a model of $Y$ as a function of the confounders, $f(x) \approx \mathbb{E}[Y \mid X = x]$, giving

$$Y^*_i := (Y_i - f(X_i))\dfrac{T_i - M(T_i)}{(T_i - M(T_i))^2}$$

The proof that this estimates the pseudo CATE would still go through, but the variance would be lower because the first factor is (generally) smaller.
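
The variance claim can be checked informally with a small simulation. The sketch below rests on several assumptions that are not from the thread: a synthetic data-generating process with a constant effect `tau`, oracle versions of $M$ and $f$ read off that process rather than fitted models, and aggregation of the pseudo-outcomes with weights $(T_i - M)^2$. It illustrates the idea and is not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, tau = 500, 200, 2.0   # sample size, replications, true effect (all made up)

est_mean, est_model = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    t = x + rng.normal(size=n)                  # so E[T | X = x] = x
    y = 5.0 * x + tau * t + rng.normal(size=n)  # so E[Y | X = x] = (5 + tau) * x

    t_res = t - x        # oracle treatment residual T_i - M
    w = t_res ** 2       # weights (T_i - M)^2

    # A w-weighted average of (Y_i - c_i) / t_res_i simplifies to
    # sum((Y_i - c_i) * t_res_i) / sum(w_i).
    est_mean.append(np.sum((y - y.mean()) * t_res) / np.sum(w))          # c_i = Y_bar
    est_model.append(np.sum((y - (5.0 + tau) * x) * t_res) / np.sum(w))  # c_i = f(X_i)

print("centered on Y_bar: mean, std =", np.mean(est_mean), np.std(est_mean))
print("centered on f(X):  mean, std =", np.mean(est_model), np.std(est_model))
```

In this setup both versions average out near `tau`, but the spread across replications is noticeably smaller when the outcome is centered on $f(X_i)$ instead of $\bar{Y}$.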
