Repository structure:
sarbecovirus_phylogeography/
├── data
└── analysis
    ├── recombination_analysis
    ├── SARS-CoV-1-like_viruses
    │   ├── Clock_calibration
    │   ├── Bayesian_divergence_time_estimation_and_recCA_inference
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_extra_genomes
    │   ├── tip-dated_phylogeography
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_withTwoGhostLineages
    │   └── PoW-transformed_phylogeography
    │       └── Primary_analysis
    ├── SARS-CoV-2-like_viruses
    │   ├── Clock_calibration
    │   ├── Bayesian_divergence_time_estimation_and_recCA_inference
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_early2020
    │   ├── tip-dated_phylogeography
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_withTwoGhostLineages
    │   └── PoW-transformed_phylogeography
    │       └── Primary_analysis
    └── post_hoc_analyses
The input data that we are able to share publicly is in the data directory.
The analysis directory contains the XMLs and resulting MCC trees for each of the analyses. The Clock_calibration analyses also include the subsampled tree files that were used for the empirical tree distribution in the clock calibration inference across each non-recombinant region. The post_hoc_analyses subdirectory includes Jupyter notebooks that generate most of the main text figures, the script for applying the PoW transformation to the output trees generated by the XMLs in the PoW-transformed_phylogeography analyses, and the scripts for generating the lineage dispersal rates and phylogeography figures. Refer to the methods in the manuscript for more details.
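To get an overview of which XMLs belong to which analysis, the directory layout above can be walked with a few lines of Python. This is only a minimal sketch and not part of the repository: the function name `xmls_by_analysis` is hypothetical, and it assumes that the XMLs carry a `.xml` extension and that the repository has been cloned locally under its own name.

```python
from collections import defaultdict
from pathlib import Path


def xmls_by_analysis(repo_root):
    """Map each directory under analysis/ to the sorted names of its .xml files.

    Hypothetical helper: assumes the XMLs use a .xml extension.
    """
    analysis_dir = Path(repo_root) / "analysis"
    found = defaultdict(list)
    # rglob descends through all nested analysis subdirectories
    for xml_path in sorted(analysis_dir.rglob("*.xml")):
        rel_dir = xml_path.parent.relative_to(analysis_dir).as_posix()
        found[rel_dir].append(xml_path.name)
    return dict(found)


if __name__ == "__main__":
    # Assumed local clone location; adjust the path as needed.
    for analysis, xmls in xmls_by_analysis("sarbecovirus_phylogeography").items():
        print(analysis, xmls)
```

Keys are paths relative to the analysis directory, so the primary and sensitivity analyses of each virus group show up as separate entries.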