Allowing inclusion of multiple 'sites' as treatment group for synthetic control #165

Closed
kashramli opened this issue Feb 8, 2023 · 6 comments · Fixed by #338
Labels
enhancement New feature or request geo project Related to geo-testing

Comments

@kashramli

I'm not sure if this is already possible and I haven't worked it out yet, but it would be great to have a way to easily include multiple 'sites' to make up the treatment group. In the example of geolift experiments, sometimes you would want to include multiple cities as the treatment group.

This is something easily done in Meta's GeoLift, so it would be great to see it here as well.
And if it is already possible....it would be great to have instructions included in the documentation

Thanks

@drbenvincent
Collaborator

Hi @kashramli, thanks for this suggestion. I think this is something we could do relatively easily. As far as I understand, the way this is done is to create a new aggregate unit which is simply the sum of the individual treated units.

So one solution would be to do that as a manual pre-processing step. Though I can imagine that might result in an aggregate treated unit with values much higher than those of the individual untreated units. That may cause problems because the synthetic control weights sum to 1, so the model can only interpolate between the untreated units and cannot reach a treated series that lies above all of them.
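To make the concern concrete, here is a minimal sketch of the sum-aggregation step and the resulting scale mismatch. The city names and numbers are made up for illustration, and the sketch assumes the usual synthetic control setup in which the weights are non-negative as well as summing to 1.

```python
# Hypothetical weekly values for two treated cities and three control cities.
treated = {
    "city_a": [100.0, 110.0, 120.0],
    "city_b": [90.0, 95.0, 105.0],
}
controls = {
    "city_c": [80.0, 85.0, 90.0],
    "city_d": [120.0, 125.0, 130.0],
    "city_e": [100.0, 100.0, 110.0],
}

# Aggregate treated unit: element-wise sum across the treated cities.
treated_sum = [sum(values) for values in zip(*treated.values())]

# Assumption: weights are non-negative and sum to 1. The synthetic control at
# each time point is then a convex combination of the controls, so it can
# never exceed the largest control value at that time point.
upper_bound = [max(values) for values in zip(*controls.values())]

# True wherever the summed treated series is above anything the weights can reach.
out_of_reach = [t > u for t, u in zip(treated_sum, upper_bound)]

print(treated_sum)   # [190.0, 205.0, 225.0]
print(upper_bound)   # [120.0, 125.0, 130.0]
print(out_of_reach)  # [True, True, True]
```

In this toy example the summed series sits above every control at every time point, so no set of weights summing to 1 could reproduce it.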

I had a quick look on the GeoLift website but didn't have luck finding anything specific on this issue of multiple treated units. Would you be able to point me in the right direction so I can look at the dataset and approach they take?

@drbenvincent drbenvincent added the enhancement New feature or request label Feb 9, 2023
@kashramli
Author

Hi @drbenvincent,

I can't find any explanation of how Meta's GeoLift does it specifically, but I have found some papers that introduce the concept of having multiple treated units. I'm not quite confident enough to evaluate them and determine which approach (if any) is best.

Extending synthetic control method for multiple treated units: an application to environmental intervention
https://www.tandfonline.com/doi/full/10.1080/1331677X.2020.1782764

Examination of the synthetic control method for evaluating health policies with multiple treated units (p. 1519)
https://onlinelibrary.wiley.com/doi/pdf/10.1002/hec.3258

The inclusive synthetic control method.
Note: This seems to be more about including treated units to improve the control rather than having multiple treatment units for experimentation.
https://tinyurl.com/2p9827v9

Inference for Synthetic Control Methods with Multiple Treated Units
https://arxiv.org/pdf/1912.00568.pdf

I'll keep looking and will report back if I find something more concrete.

@kashramli
Author

kashramli commented Feb 22, 2023

Hello again.....so I just asked them directly on their FB group, and they were super helpful in talking through their solutions. Here is the post: https://www.facebook.com/groups/fbgeolift/posts/1578515999315617/?comment_id=1581575829009634&notif_id=1676376929860659&notif_t=group_comment_mention

@drbenvincent
Collaborator

Thanks @kashramli. Sorry for the delayed reply. I think their answer (of using the mean of the treated units) perhaps answers most (or all?) questions. Do you feel that you could solve your problem yourself by manually calculating a new mean-of-treated-units series? It seems like that would be sufficient for simple synthetic control cases, but perhaps there's more to it?

Or do you think that it would be highly useful for CausalPy to include some specific functionality to do that?

@pamant22

I'm also trying to mimic the Meta GeoLift approach (I'd much prefer a Bayesian answer with interpretable confidence intervals!). If you like I can try to code some functionality for detecting when an average of multiple units is needed.

But I'm getting a different issue when using the mean of the treated units, and I'm not sure what the problem is. The error I get is:
The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize
I'll try the suggestion in the error, but is there something obvious I'm missing? I tried with a single test unit and it ran OK.
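For reference, the two sampler settings named in that warning can be written as keyword arguments for PyMC's sampler. This is only a sketch: the values are illustrative, and the commented usage assumes that the CausalPy model class accepts a `sample_kwargs` dict which it forwards to `pymc.sample`.

```python
# Keyword arguments intended for pymc.sample(); the values are illustrative.
sample_kwargs = {
    "target_accept": 0.95,  # NUTS default is 0.8; higher values give smaller steps
    "max_treedepth": 12,    # NUTS default is 10; higher values allow longer trajectories
}

# Assumed usage (not run here), based on the assumption stated above:
#   model = cp.pymc_models.WeightedSumFitter(sample_kwargs=sample_kwargs)
```

Note that a higher `target_accept` shrinks the step size, which by itself tends to make trajectories longer, so `max_treedepth` is the more direct knob for this particular warning. Neither setting addresses the third suggestion in the message, reparameterizing the model.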

@drbenvincent
Collaborator

drbenvincent commented May 8, 2024

Quick update @kashramli / @pamant22. I know it's been a while, but we're finally getting around to focusing on geo testing. I have a work-in-progress PR which will demonstrate how to analyse data with multiple treated units. It will demonstrate 2 approaches. The first is a pooled approach, which simply aggregates the treated geos and then uses the current functionality (i.e. the SyntheticControl class). The other approach is to model each treated geo independently. See the PR for more info.
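As a rough illustration of how the data might be arranged for the two approaches, here is a minimal sketch. It is not taken from the PR: the function names and geo names are hypothetical, and the pooled version assumes the aggregate is the mean of the treated geos (the option discussed earlier in this thread).

```python
from statistics import fmean


def pooled_dataset(treated, controls):
    """Pooled: average the treated geos into one series (mean is an assumption)."""
    pooled = [fmean(values) for values in zip(*treated.values())]
    return {"treated": pooled, **controls}


def unpooled_datasets(treated, controls):
    """Unpooled: one dataset per treated geo, each paired with the same controls."""
    return {geo: {"treated": series, **controls} for geo, series in treated.items()}


# Hypothetical data: two treated geos and two control geos.
treated = {"geo_a": [100.0, 110.0], "geo_b": [90.0, 95.0]}
controls = {"geo_c": [80.0, 85.0], "geo_d": [120.0, 125.0]}

pooled = pooled_dataset(treated, controls)
per_geo = unpooled_datasets(treated, controls)

print(pooled["treated"])            # [95.0, 102.5]
print(sorted(per_geo))              # ['geo_a', 'geo_b']
print(per_geo["geo_b"]["treated"])  # [90.0, 95.0]
```

The pooled dataset would be analysed once as a single treated unit, while each entry of `per_geo` would be analysed separately.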

I've set it so that closing #338 will also close this issue. But I realise that this topic is quite rich. If there are more specific aspects of multi-cell geo lift testing that you'd like to see worked on, then feel free to create a new issue, perhaps after #338 is done.

@drbenvincent drbenvincent added geo project Related to geo-testing and removed feature request labels Jun 21, 2024
3 participants