I'm developing a k-mer counting Python package for internal use, with needletail as the backend. While developing it, I noticed that Kmers and CanonicalKmers handle non-ACGT characters inconsistently: Kmers yields k-mers that contain them, whereas CanonicalKmers skips those k-mers (understandably so).
Because of that, my function uses CanonicalKmers in every case, even when counting non-canonical k-mers (I just reverse complement the sequence if the canonical boolean is true), which adds computational overhead.
I don't know whether this difference is by design, but perhaps Kmers could take an argument that lets the user choose whether k-mers containing non-ACGT characters are skipped.
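
To make the request concrete, here is a minimal pure-Python sketch of the behaviour I have in mind. It is not needletail's API: `iter_kmers` and `skip_invalid` are hypothetical names used only for illustration, and I am assuming that "valid" means uppercase A, C, G, and T.

```python
from typing import Iterator, Tuple

# Assumption: only uppercase A, C, G, T count as valid bases.
VALID_BASES = frozenset("ACGT")


def iter_kmers(seq: str, k: int, skip_invalid: bool = False) -> Iterator[Tuple[int, str]]:
    """Yield (position, k-mer) for each window of length k in seq.

    With skip_invalid=True, any window containing a non-ACGT character
    is skipped. With skip_invalid=False, every window is yielded.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Smallest start position whose window contains no invalid base seen so far.
    min_start = 0
    for end, base in enumerate(seq):
        if skip_invalid and base not in VALID_BASES:
            min_start = end + 1
        start = end - k + 1
        if start >= 0 and start >= min_start:
            yield start, seq[start:end + 1]


if __name__ == "__main__":
    seq = "ACGNTACG"
    print(list(iter_kmers(seq, 3)))                     # all windows
    print(list(iter_kmers(seq, 3, skip_invalid=True)))  # windows without N
```

With a flag like this, a single iterator could cover both cases, so non-canonical counting would not need to go through CanonicalKmers at all.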