Method in diagnostics tutorial over-rejects #1501

Open
@mollyow

Description

This issue is about a method proposed in the tutorial, Evaluating a causal forest fit.

One heuristic method for detecting heterogeneity described in the tutorial over-rejects under the null, I believe due to the winner's curse: the same model is used both to define the subgroups and to estimate their effects. Cross-fitting resolves this problem, and I have proposed a small modification to the tutorial suggesting it in a pull request (#1502).

Description of the bug

tau.hat <- predict(cf)$predictions        # out-of-bag CATE estimates from the fitted forest
high.effect <- tau.hat > median(tau.hat)  # split the sample at the median predicted effect
ate.high <- average_treatment_effect(cf, subset = high.effect)
ate.low <- average_treatment_effect(cf, subset = !high.effect)
# 95% CI for the difference in subgroup ATEs
ate.high[["estimate"]] - ate.low[["estimate"]] +
  c(-1, 1) * qnorm(0.975) * sqrt(ate.high[["std.err"]]^2 + ate.low[["std.err"]]^2)
#> [1] 0.6591796 1.0443646

(This is just the method as presented in the tutorial; the bug is the elevated rejection rate under the null, not this particular estimate.)
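The cross-fitting idea can be sketched as below. This is a minimal illustration, not necessarily the exact code proposed in #1502: `X`, `Y`, and `W` are placeholder data objects, and the single 50/50 split is illustrative. The key point is that the forest used to rank units is never the forest used to estimate their subgroup effects.

```r
library(grf)

# Hypothetical data: covariate matrix X, outcome Y, binary treatment W.
n <- nrow(X)
split <- sample(n, floor(n / 2))

# Forest A: used only to rank units in the held-out half.
cf.a <- causal_forest(X[split, ], Y[split], W[split])
tau.b <- predict(cf.a, X[-split, ])$predictions
high.b <- tau.b > median(tau.b)

# Forest B: fit on the held-out half, used only for estimation.
cf.b <- causal_forest(X[-split, ], Y[-split], W[-split])
ate.high <- average_treatment_effect(cf.b, subset = high.b)
ate.low  <- average_treatment_effect(cf.b, subset = !high.b)

# 95% CI for the high-minus-low difference, as in the tutorial.
ate.high[["estimate"]] - ate.low[["estimate"]] +
  c(-1, 1) * qnorm(0.975) * sqrt(ate.high[["std.err"]]^2 + ate.low[["std.err"]]^2)
```

Because the subgroup labels in each half are independent of the estimation noise in that half, the winner's-curse bias in the naive version does not arise.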

Steps to reproduce
Even when the sharp null is true (i.e., treatment is not associated with outcomes for any unit), this method rejects at higher-than-nominal rates.

I used the code from the tutorial and created a gist showing the over-rejection here:

https://gist.github.com/mollyow/c1690ac8fd4a8d333d61cdefeeef82a9

A longer write-up is available here: https://alexandercoppock.com/testing_with_grf.pdf
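The design of the simulation can be summarized as follows. This is my own compact rendering of the idea, not the gist's exact code; the data-generating process and the number of replications are illustrative.

```r
library(grf)
set.seed(1)

rejections <- replicate(200, {
  # Simulate data where the sharp null holds: W has no effect on Y.
  n <- 1000
  X <- matrix(rnorm(n * 5), n, 5)
  W <- rbinom(n, 1, 0.5)
  Y <- X[, 1] + rnorm(n)  # outcome does not depend on W

  # Apply the tutorial's heuristic verbatim.
  cf <- causal_forest(X, Y, W)
  tau.hat <- predict(cf)$predictions
  high <- tau.hat > median(tau.hat)
  ate.h <- average_treatment_effect(cf, subset = high)
  ate.l <- average_treatment_effect(cf, subset = !high)
  ci <- ate.h[["estimate"]] - ate.l[["estimate"]] +
    c(-1, 1) * qnorm(0.975) * sqrt(ate.h[["std.err"]]^2 + ate.l[["std.err"]]^2)

  ci[1] > 0 || ci[2] < 0  # TRUE when the 95% CI excludes zero
})

# Nominal level is 0.05; the gist reports substantially higher rates.
mean(rejections)
```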

GRF version
2.4.0 (though the issue concerns the tutorial, not the underlying package code).
