7.0.0 - Remove Tree Nine (but for good reason!!)

@aofarrel aofarrel released this 01 Feb 01:01
· 12 commits to main since this release

This project has gone through several iterations, and ultimately has two different flavors:

  1. What is most practical for running on SRA data
  2. What is most practical for running on health department data

Flavor 1, myco_sra, is designed to run multiple samples at once in a single instance of the workflow. When run in this way, it makes perfect sense for variant calling to feed directly into phylogenetic analysis -- all the samples are already there, so you can put everything on the tree at once.

Flavor 2, myco_raw, is also capable of running in that way, but when run on what it was actually designed for (CDPH's data tables on the cloud platform Terra), we cannot combine variant calling and phylogenetics in the same workflow. Writing results back to the original data tables requires each row of the data table (i.e., each sample) to run its own instance of the workflow, which prevents samples from having "knowledge" of each other. Variant calling does not require "knowledge" of any other samples, but phylogenetics, by definition, does.
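
The difference between the two run modes can be illustrated with a toy Python sketch. Every name here (`call_variants`, `build_tree`, `run_per_row`, `run_batch`) is a hypothetical stand-in, not code from myco or Tree Nine, and the "tree" is just a placeholder string; the point is only that a per-row run never has all samples in hand at once.

```python
# Toy illustration only: these functions are hypothetical stand-ins,
# not part of myco, Tree Nine, or Terra.

def call_variants(sample: str) -> str:
    """Per-sample step: needs nothing but the sample itself."""
    return f"{sample}.vcf"

def build_tree(vcfs: list[str]) -> str:
    """Cross-sample step: needs every sample's variants at the same time."""
    return "(" + ",".join(sorted(vcfs)) + ");"

def run_per_row(samples: list[str]) -> dict[str, str]:
    """One workflow instance per data-table row.

    Each call_variants() invocation sees exactly one sample, so there is
    no point inside an instance where build_tree() could be given all of them.
    """
    return {sample: call_variants(sample) for sample in samples}

def run_batch(samples: list[str]) -> str:
    """One workflow instance for all samples: the tree can follow directly."""
    return build_tree([call_variants(sample) for sample in samples])

samples = ["sampleA", "sampleB", "sampleC"]
per_row = run_per_row(samples)  # results written back row by row
tree = run_batch(samples)       # only possible when all samples are together
```

In the per-row case, building a tree has to happen in a separate step that gathers the per-sample outputs afterwards.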

With that in mind:

  • myco is now exclusively an SRA download/FASTQ cleaner/TBProfiler/decontamination/variant-calling/QC check pipeline
  • Tree Nine is, as it always was, exclusively focused on phylogenetics
  • TB-D is the name for the "overall system", which will include a separate WDL file linking myco directly to Tree Nine for those who desire that functionality

For this to actually make sense, it's necessary to remove the remnants of Tree Nine from the myco pipeline -- especially as Tree Nine has changed more and more to fall into line with CDPH's specific needs.

Full Changelog: 6.4.1...7.0.0