Description
The need
Just as we can bin items into collections, there are a few good reasons to implement a similar feature for layers. The interactive interface already allows the user to select layers, either by right-clicking or by selecting a branch in a layers dendrogram, and the selected layers are then highlighted in the main panel. Layer bins would be useful for coloring groups of genomes or metagenomes in the pan or classic interface.
Enrichment analysis: one could select groups of genomes or metagenomes and later perform any type of functional enrichment by selecting two bins in a layer collection.
Refined search options: in a pangenome, one could constrain the search-and-filter of GCs to a particular group of genomes and get their unique GCs, their SCGs, filter for homogeneity, etc. This has been a recurrent request: automatically get the accessory GCs of a user-defined group of genomes. The same idea applies to the classic interface when searching for functions.
One could then split a pan or profile database based on these layer bins, to focus on a set of metagenomes or a set of genomes. One could always reduce the height of these layers to zero, but a proper split would imply redoing the clustering of items and layers.
Finally, the output of anvi-summary could include summary numbers for a group of layers: total GCs for a group of genomes, etc.
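To make the "GCs of a user-defined group of genomes" request concrete, here is a minimal sketch in plain Python. It does not use any anvi'o API: the `gc_presence` dictionary, the function name, and the genome and GC names are all hypothetical, and "exclusive" is assumed to mean that a GC occurs in no genome outside the bin.

```python
from typing import Dict, Iterable, Set, Tuple


def gcs_exclusive_to_group(
    gc_presence: Dict[str, Set[str]], group: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (exclusive, exclusive_core) gene clusters for a bin of genomes.

    gc_presence maps a gene cluster name to the set of genomes it occurs in
    (a hypothetical input, not an anvi'o data structure).
    exclusive: GCs found only in genomes of the bin.
    exclusive_core: exclusive GCs found in every genome of the bin.
    """
    group = set(group)
    exclusive = {
        gc for gc, genomes in gc_presence.items() if genomes and genomes <= group
    }
    exclusive_core = {gc for gc in exclusive if gc_presence[gc] == group}
    return exclusive, exclusive_core


# Toy pangenome with four genomes and five gene clusters.
gc_presence = {
    "GC_0001": {"G1", "G2", "G3", "G4"},
    "GC_0002": {"G1", "G2"},
    "GC_0003": {"G1"},
    "GC_0004": {"G3"},
    "GC_0005": {"G2", "G3"},
}

layer_bin = {"G1", "G2"}  # a hypothetical bin of layers (genomes)
exclusive, exclusive_core = gcs_exclusive_to_group(gc_presence, layer_bin)
```

With a layer bin available in the interface, the same kind of set logic could back the search-and-filter panel instead of requiring manual selection of genomes.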
The solution
We need either a new artifact, similar to collections, or to make collections (already a generic term) more versatile.
I know that groups-txt does something seemingly similar, but by nature it accepts any 'items/layer/source' to be matched to a group's name (a very versatile artifact that has its uses).
That artifact, which I will call a layer collection for now, would be stored in the pan.db or profile.db, as it refers to the layers (side note: item collections should be in the contigs.db or pan.db, which is already the case). I would be happy to move and/or create that new artifact in a similar fashion to the recent samples-txt class.
In the main panel of the interface, we have three drop-down menus for items, layers, and legend. The bin panel could have 'items' and 'layers' drop-downs as well. When right-clicking anywhere in the main matrix of the interface, the user would be able to either add items to a bin (existing behavior) or add layers to a bin.
The search panel could use a small drop-down or pop-up that allows the user to select a bin of layers (and, while I am thinking of this, it could also select a bin of items); the search would then be constrained to these layers/items for the expression search, the function search, and the GC filter.
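As a rough illustration of what storing such an artifact could look like, here is a sketch using only Python's standard `sqlite3` module. The table name `layer_collections`, its columns, and both helper functions are assumptions made for this example; they are not the existing anvi'o schema or API.

```python
import sqlite3
from typing import Dict, List


def store_layer_collection(
    conn: sqlite3.Connection, collection_name: str, bins: Dict[str, List[str]]
) -> None:
    """Store a collection of layer bins (hypothetical table layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS layer_collections "
        "(collection_name TEXT, bin_name TEXT, layer_name TEXT)"
    )
    # Replace any previous version of the same collection.
    conn.execute(
        "DELETE FROM layer_collections WHERE collection_name = ?",
        (collection_name,),
    )
    rows = [
        (collection_name, bin_name, layer)
        for bin_name, layers in bins.items()
        for layer in layers
    ]
    conn.executemany("INSERT INTO layer_collections VALUES (?, ?, ?)", rows)
    conn.commit()


def load_layer_collection(
    conn: sqlite3.Connection, collection_name: str
) -> Dict[str, List[str]]:
    """Read a collection back as {bin_name: [layer_name, ...]}."""
    bins: Dict[str, List[str]] = {}
    query = (
        "SELECT bin_name, layer_name FROM layer_collections "
        "WHERE collection_name = ? ORDER BY bin_name, layer_name"
    )
    for bin_name, layer_name in conn.execute(query, (collection_name,)):
        bins.setdefault(bin_name, []).append(layer_name)
    return bins


# An in-memory database stands in for a pan.db or profile.db here.
conn = sqlite3.connect(":memory:")
store_layer_collection(conn, "habitat", {"gut": ["G1", "G2"], "soil": ["G3"]})
loaded = load_layer_collection(conn, "habitat")
```

One row per (collection, bin, layer) mirrors the general idea of bins grouped under a named collection, which would keep layer bins and item bins conceptually parallel.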
Beneficiaries
As I mentioned, this feature would help users quickly extract the accessory GCs of a group of genomes in a pangenome. It would also improve searches in the interfaces and make it possible to split databases in a new way.