Issue checks
- I have read the Dorado Documentation.
- I did not find an existing issue.
Dorado version
1.1.1+e72f1492
Dorado subcommand
Polish
The issue
Dorado polish can sometimes show inconsistent GPU utilisation when processing BAM files that contain many short regions with uneven coverage (in my case >260,000 regions of ~1.5 kbp for amplicon polishing), or BAMs with very different region lengths. I have mostly seen this on our cluster, which has a very fast GPU but sometimes low I/O speeds. In this case the infer threads appear to be faster than the encoder threads, and the trace log shows a queue size of 0:
`[consumer x] Popped data: item.samples.size() = 1, queue size: 0`
Increasing the number of encoding threads doesn't seem to help either.
I've noticed that CPU usage rises after the debug message
`[debug] Starting to encode regions for X windows using X threads`
and then gradually drops until the next batch of windows is encoded. This suggests that the computational load is not evenly distributed among the encoder threads, and that they all wait for an entire batch of windows to finish before pushing it into the infer queue and moving on to the next batch. Providing more encoding threads cannot fix this, because they all wait for the slowest thread to complete.
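For illustration, here is a minimal C++ sketch of the pattern I suspect (this is not Dorado's actual code; `encode_window`, `infer_queue`, the thread/window counts and the per-window costs are all made up): each encoder works through its share of a batch, but nothing reaches the infer queue until every encoder has joined, so a fast consumer drains the queue and then idles until the slowest thread finishes.

```cpp
// Sketch of a batch-barrier encoding stage (hypothetical, not Dorado code).
#include <chrono>
#include <iostream>
#include <mutex>
#include <queue>
#include <random>
#include <thread>
#include <vector>

std::queue<int> infer_queue;  // stands in for the encoded-sample queue
std::mutex queue_mtx;

// Hypothetical per-window encoding; runtime varies with coverage/region length.
int encode_window(int id) {
    static thread_local std::mt19937 rng(id);
    std::uniform_int_distribution<int> cost_ms(5, 200);
    std::this_thread::sleep_for(std::chrono::milliseconds(cost_ms(rng)));
    return id;
}

int main() {
    const int num_threads = 4;
    const int windows_per_batch = 16;

    for (int batch = 0; batch < 3; ++batch) {
        std::vector<std::thread> encoders;
        std::vector<std::vector<int>> results(num_threads);

        // Each encoder works on its share of the current batch...
        for (int t = 0; t < num_threads; ++t) {
            encoders.emplace_back([&, t] {
                for (int w = t; w < windows_per_batch; w += num_threads)
                    results[t].push_back(encode_window(batch * windows_per_batch + w));
            });
        }
        // ...but nothing is pushed until *all* encoders have joined, so a fast
        // consumer (not modelled here) empties the queue and idles meanwhile.
        for (auto& e : encoders) e.join();

        {
            std::lock_guard<std::mutex> lk(queue_mtx);
            for (auto& r : results)
                for (int sample : r) infer_queue.push(sample);
        }
        std::cout << "batch " << batch << " pushed; queue size now "
                  << infer_queue.size() << "\n";
    }
}
```

If the encoders instead pushed each window to the queue as soon as it was encoded (a streaming rather than batch-barrier design), the slowest thread would only delay its own windows and the infer queue would be much less likely to run dry.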
System specifications
Operating System: Ubuntu 22.04.4 LTS
CPU: AMD EPYC 7742 64-Core Processor (64 cores / 64 threads)
RAM: 120 GB
GPU: NVIDIA A100-SXM4-40GB / NVIDIA H100-80GB
Disk: Sometimes very slow