Read mapping is very slow on diploid human genome assembly #28

I tried to use VerityMap to validate a diploid human genome assembly using HiFi reads, but on my data it was too slow to be practical. I let it run for more than 3 weeks on 16 threads, and it only mapped up to about 4x coverage. Is this speed expected? Are there any tweaks I can make to speed it up?

The command I ran was:

```
python3 main.py --reads reads.fastq.gz -o verity_map_output -t 16 -d hifi-diploid \
    assembly.haplotype1.fasta assembly.haplotype2.fasta
```

Another question/request: I understand from the paper that VerityMap also includes analysis modules to detect the location of misassemblies. As far as I can see, these can only be accessed after read mapping concludes (I believe the relevant code is [here](https://github.com/ablab/VerityMap/blob/master/veritymap/py_src/mapper.py#L150-L154)). Is this correct? It would be useful if the interface allowed a more modular option that could be run independently of mapping, especially since it seems like I will need to troubleshoot the mapping stage.
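For troubleshooting the mapping stage, one option on my side would be to test on a small random subset of the reads first. Below is a minimal sketch of how that could be done, assuming the reads are in standard 4-line FASTQ records (no wrapped sequence lines). The function name `subsample_fastq` and the output file name are my own hypothetical choices and are not part of VerityMap.

```python
import gzip
import random


def subsample_fastq(in_path, out_path, fraction, seed=0):
    """Copy a random subset of reads from one gzipped FASTQ to another.

    Assumes each record is exactly 4 lines (header, sequence, '+', qualities).
    Each read is kept independently with probability `fraction`.
    Returns a (kept, total) tuple of read counts.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = 0
    total = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        while True:
            # Read one 4-line FASTQ record at a time to keep memory use low.
            record = [src.readline() for _ in range(4)]
            if not record[0]:  # empty string means end of file
                break
            total += 1
            if rng.random() < fraction:
                dst.writelines(record)
                kept += 1
    return kept, total
```

For example, `subsample_fastq("reads.fastq.gz", "reads.subsample.fastq.gz", fraction=0.05)` would keep roughly 5% of the reads, and the resulting file could be passed to `--reads` in the command above to check whether the mapping rate scales as expected.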
