Description
This issue is about a method proposed in the tutorial *Evaluating a causal forest fit*.
One heuristic method for detecting heterogeneity described in the tutorial over-rejects under the null, I believe due to the winner's curse: the same model is used both to determine the subgroups and to estimate their effects. Cross-fitting resolves this problem, and I have proposed a small modification to the tutorial that adds this suggestion in pull request #1502.
Description of the bug
```r
# Rank units by their estimated CATEs and split at the median.
tau.hat <- predict(cf)$predictions
high.effect <- tau.hat > median(tau.hat)

# Estimate the ATE within each subgroup using the same forest.
ate.high <- average_treatment_effect(cf, subset = high.effect)
ate.low <- average_treatment_effect(cf, subset = !high.effect)

# 95% confidence interval for the difference in subgroup ATEs.
ate.high[["estimate"]] - ate.low[["estimate"]] +
  c(-1, 1) * qnorm(0.975) * sqrt(ate.high[["std.err"]]^2 + ate.low[["std.err"]]^2)
#> [1] 0.6591796 1.0443646
```
(This is just the method as given in the tutorial; the bug is the rejection rate of the resulting interval under the null.)
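A cross-fit variant avoids the winner's curse by using different data to define the subgroups and to estimate their effects. A minimal sketch of one such split-sample approach (the fold construction and variable names here are illustrative, not necessarily the version proposed in the pull request):

```r
# Sketch: split-sample cross-fitting (illustrative, assumes X, Y, W as in the tutorial).
n <- length(Y)
fold <- sample(rep(1:2, length.out = n))

# Fit a forest on fold 1 only.
cf1 <- causal_forest(X[fold == 1, ], Y[fold == 1], W[fold == 1])

# Rank fold-2 observations using the fold-1 forest ...
tau.hat.2 <- predict(cf1, X[fold == 2, ])$predictions
high.effect <- tau.hat.2 > median(tau.hat.2)

# ... and estimate the subgroup ATEs with a forest fit on fold 2,
# so subgroup selection and estimation never share data.
cf2 <- causal_forest(X[fold == 2, ], Y[fold == 2], W[fold == 2])
ate.high <- average_treatment_effect(cf2, subset = high.effect)
ate.low <- average_treatment_effect(cf2, subset = !high.effect)
```

Swapping the roles of the two folds and averaging would recover some of the lost efficiency.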
Steps to reproduce
Even when the sharp null is true (i.e., treatment is not associated with outcomes), this method rejects at higher-than-nominal rates.
I used the code from the tutorial and created a gist showing the over-rejection here:
https://gist.github.com/mollyow/c1690ac8fd4a8d333d61cdefeeef82a9
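The over-rejection can be checked with a simple Monte Carlo under the sharp null (outcome independent of treatment). This sketch is illustrative and not the exact code in the gist; the sample size and number of replications are arbitrary:

```r
# Illustrative simulation: rejection rate of the tutorial's heuristic
# under the sharp null (Y generated independently of W).
library(grf)
set.seed(1)

rejects <- replicate(200, {
  n <- 1000; p <- 5
  X <- matrix(rnorm(n * p), n, p)
  W <- rbinom(n, 1, 0.5)
  Y <- rnorm(n)  # sharp null: no treatment effect for any unit

  cf <- causal_forest(X, Y, W)
  tau.hat <- predict(cf)$predictions
  high <- tau.hat > median(tau.hat)

  hi <- average_treatment_effect(cf, subset = high)
  lo <- average_treatment_effect(cf, subset = !high)
  z <- (hi[["estimate"]] - lo[["estimate"]]) /
    sqrt(hi[["std.err"]]^2 + lo[["std.err"]]^2)
  abs(z) > qnorm(0.975)
})

mean(rejects)  # a valid 5% test should reject at a rate near 0.05
```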
A longer write-up is available here: https://alexandercoppock.com/testing_with_grf.pdf
GRF version
2.4.0 (but the issue concerns only the tutorial, not the underlying code).