kmer-model format #29

@jsthv

I am trying to simulate signal data by creating a k-mer model for a modified base. My data comes from an R10.4.1 flow cell and thus needs a 9-mer model. I have a sequence with a site-specific modification that causes a large decrease in the current. All I want to do is simulate the signal by varying the parameters only for the 9-mers that overlap the modification. It would have been very straightforward if I could take the ONT R10.4.1 400 bps two-column file and append 9 lines for the 9 modified 9-mers in my sequence. It seems that I cannot do that, or that I am not doing it correctly.

The manual says that I have to use a file formatted in the same way as for f5c (r9.4_450bps.nucleotide.6mer.template.model). When I look at that file, it is a 6-mer library with a 6-column structure. The file r9.4_450bps.cpg.6mer.template.model has a 5-column structure. There are no examples of 9-mer libraries. Is the problem that I would need a 5^9-line file for A, C, G, T, X, which would be unwieldy? Is that why I have to use a 6-mer library, which translates to 5^6 lines? Strangely, the errors I get seem to indicate that it could handle 9-mer libraries, but I don't know how to get it to do so.

For example, I get the error:
[read_model::INFO] k-mer size in file /home/jst/models/9mer_XX_CPD_f5c.txt is 9
[read_model::ERROR] File /home/jst/models/9mer_XX_CPD_f5c.txt has too many entries. Expected 262144 kmers in the model, but file had more than that At src/model.c:114

This is what I had in the header:

#model_name R10.4.1_400bps_9mer_custom
#kit R10.4.1_400bps
#strand template
#k 9
#alphabet dnaX

If it had recognized the file as a 9-mer library with a modified base X, it should have expected 5^9 = 1953125 9-mers, not 4^9 = 262144.
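To make the arithmetic behind that expectation concrete, here is a minimal sketch of how the entry count follows from alphabet size and k. The function name `expected_kmers` is hypothetical and is not taken from the tool's source; it only illustrates why 262144 corresponds to a 4-letter alphabet.

```python
def expected_kmers(alphabet: str, k: int) -> int:
    """Number of distinct k-mers over the given alphabet (alphabet_size ** k)."""
    return len(alphabet) ** k


# Plain DNA alphabet: matches the count reported in the error message.
print(expected_kmers("ACGT", 9))   # 262144

# DNA plus one modified base X: what a full 5-letter 9-mer model would need.
print(expected_kmers("ACGTX", 9))  # 1953125
```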

Is there any way to simply use the two-column ONT 9-mer library with a few added modified 9-mers? My case is a bit nonstandard because my modifications run in tandem, i.e., XX.
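For illustration, the modified 9-mers in question can be listed with a short sketch like the one below. The sequence and the function name `modified_kmers` are hypothetical placeholders, and X is assumed to be the symbol for the modified base, as in the header above.

```python
def modified_kmers(seq: str, k: int = 9, mod: str = "X") -> list[str]:
    """Return, in order, every k-mer window of seq that contains the modified base."""
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return [w for w in windows if mod in w]


# Hypothetical sequence with a tandem modification (XX) in the middle.
seq = "ACGTACGTACXXACGTACGTAC"
for kmer in modified_kmers(seq):
    print(kmer)
```

With a single X well inside the sequence, 9 windows overlap it; with the tandem XX, the count is 10, so those are the entries that would need custom parameters.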
