[FEATURE REQUEST] Binning layers #2484

@FlorianTrigodet

The need

Just like we can bin items into collections, there are a few good reasons to implement a similar feature for layers. The interactive interface already allows the user to select layers, either by right-clicking or by selecting a branch in a layers dendrogram, and those layers are then selected in the main panel. This is useful for coloring groups of genomes or metagenomes in the pan or classic interface.

Enrichment analysis: one could select groups of genomes or metagenomes and later perform any type of functional enrichment by selecting two bins in a layer collection.

Refined search options: in a pangenome, one could constrain the search-and-filter of GCs to a particular group of genomes and get their unique GCs, their SCGs, filter for homogeneity, etc. This has been a recurrent request: automatically get the accessory GCs of a user-defined group of genomes. The same idea applies to the classic interface, where we search for functions.
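To make the "unique GCs of a user-defined group of genomes" request concrete, here is a minimal sketch. It is not anvi'o code: the function name `gcs_exclusive_to_bin` and the `gc_to_genomes` mapping (gene cluster name to the set of genomes it occurs in) are hypothetical stand-ins for whatever the pan database would provide.

```python
from typing import Dict, Set


def gcs_exclusive_to_bin(
    gc_to_genomes: Dict[str, Set[str]],
    layer_bin: Set[str],
    require_all: bool = False,
) -> Set[str]:
    """Return gene clusters that occur only in genomes of `layer_bin`.

    With require_all=True, a gene cluster must also be present in
    every genome of the bin (i.e. it is core to the bin).
    """
    selected = set()
    for gc_name, genomes in gc_to_genomes.items():
        if not genomes:
            continue  # skip empty gene clusters
        if not genomes <= layer_bin:
            continue  # found in at least one genome outside the bin
        if require_all and genomes != layer_bin:
            continue  # missing from some genome of the bin
        selected.add(gc_name)
    return selected


# Hypothetical toy pangenome: four genomes, four gene clusters.
gc_to_genomes = {
    "GC_0001": {"G1", "G2", "G3", "G4"},
    "GC_0002": {"G1", "G2"},
    "GC_0003": {"G1"},
    "GC_0004": {"G3"},
}
layer_bin = {"G1", "G2"}  # a bin of layers (genomes) picked by the user

exclusive = gcs_exclusive_to_bin(gc_to_genomes, layer_bin)
core_exclusive = gcs_exclusive_to_bin(gc_to_genomes, layer_bin, require_all=True)
print(sorted(exclusive))       # ['GC_0002', 'GC_0003']
print(sorted(core_exclusive))  # ['GC_0002']
```

The same bin could be passed to the existing filters (SCGs, homogeneity) so that each one is evaluated only over the selected layers.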

One could then split a pan or profile database based on these layer bins, to focus on a set of metagenomes or a set of genomes. One could always reduce the height of the unwanted layers to zero, but a proper split would imply redoing the clustering of both items and layers.

Finally, the output of anvi-summary could include summary numbers for a group of layers: total GCs for a group of genomes, etc.
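As an illustration of what per-group summary numbers could look like, here is a small sketch. The function `summarize_layer_bins`, the dictionary keys, and the toy data are all assumptions made for this example, not existing anvi'o output.

```python
from typing import Dict, Set


def summarize_layer_bins(
    gc_to_genomes: Dict[str, Set[str]],
    layer_bins: Dict[str, Set[str]],
) -> Dict[str, Dict[str, int]]:
    """Compute simple per-bin numbers for a pangenome.

    total_gcs: gene clusters present in at least one genome of the bin.
    core_gcs:  gene clusters present in every genome of the bin.
    """
    summary = {}
    for bin_name, genomes in layer_bins.items():
        total = sum(1 for g in gc_to_genomes.values() if g & genomes)
        core = sum(1 for g in gc_to_genomes.values() if genomes <= g)
        summary[bin_name] = {
            "num_genomes": len(genomes),
            "total_gcs": total,
            "core_gcs": core,
        }
    return summary


# Hypothetical toy data: gene cluster -> genomes in which it occurs.
gc_to_genomes = {
    "GC_0001": {"G1", "G2", "G3", "G4"},
    "GC_0002": {"G1", "G2"},
    "GC_0003": {"G1"},
    "GC_0004": {"G3"},
}
layer_bins = {"clade_A": {"G1", "G2"}, "clade_B": {"G3", "G4"}}

summary = summarize_layer_bins(gc_to_genomes, layer_bins)
for bin_name, numbers in summary.items():
    print(bin_name, numbers)
```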

The solution

We need either a new artifact, similar to collections, or to make collections (already a generic term) more versatile.
I know that groups-txt does something seemingly similar, but by nature it accepts any 'items/layer/source' to be matched to a group name (a very versatile artifact that has its uses).

That artifact, which I will call a layer collection for now, would be stored in the pan.db or profile.db, as it refers to the layers (side note: item collections should be in the contigs.db or the pan.db, which is already the case). I would be happy to move and/or create that new artifact in a similar fashion to the recent samples-txt class.
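One possible shape for storing such a layer collection is sketched below, using Python's built-in `sqlite3` with an in-memory database. The table name `layer_collections`, its columns, and both helper functions are hypothetical; nothing here reflects the actual schema of a pan.db or profile.db.

```python
import sqlite3
from typing import Dict, Set


def store_layer_collection(
    conn: sqlite3.Connection, collection_name: str, bins: Dict[str, Set[str]]
) -> None:
    """Write a layer collection as (collection, bin, layer) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS layer_collections "
        "(collection_name TEXT, bin_name TEXT, layer_name TEXT)"
    )
    # Overwrite any previous version of the same collection.
    conn.execute(
        "DELETE FROM layer_collections WHERE collection_name = ?",
        (collection_name,),
    )
    rows = [
        (collection_name, bin_name, layer)
        for bin_name, layers in bins.items()
        for layer in sorted(layers)
    ]
    conn.executemany("INSERT INTO layer_collections VALUES (?, ?, ?)", rows)
    conn.commit()


def load_layer_collection(
    conn: sqlite3.Connection, collection_name: str
) -> Dict[str, Set[str]]:
    """Read a layer collection back as {bin_name: {layer_name, ...}}."""
    bins: Dict[str, Set[str]] = {}
    query = (
        "SELECT bin_name, layer_name FROM layer_collections "
        "WHERE collection_name = ?"
    )
    for bin_name, layer in conn.execute(query, (collection_name,)):
        bins.setdefault(bin_name, set()).add(layer)
    return bins


conn = sqlite3.connect(":memory:")  # stand-in for a pan.db / profile.db
store_layer_collection(
    conn, "default", {"clade_A": {"G1", "G2"}, "clade_B": {"G3"}}
)
loaded = load_layer_collection(conn, "default")
print(loaded)
```

Keeping one row per layer mirrors the idea of binning items, so a layer collection could be handled with the same kind of logic as an item collection.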

In the interface, in the main panel, we have three drop-down menus for items, layers, and legend. In the bin panel, we could have an 'items' and 'layers' drop down as well. When right-clicking anywhere in the main matrix of the interface the user would be able to either add items to bin (existing behavior) or add layers to bin.

The search panel could use a small drop-down or pop-up that allows the user to select a bin of layers (and, while I am thinking of this, possibly also a bin of items). The search would then be constrained to these layers/items, for the expression search, the function search, and the GC filter.

Beneficiaries

As I mentioned, this feature would help users quickly extract the accessory GCs of a group of genomes in a pangenome. It would also improve searches in the interfaces and make it possible to split databases in a new way.
