Issue with SUP basecalling using RTX3080 10GB

### New issue checks

- [x] I have read the [Dorado Documentation](https://dorado-docs.readthedocs.io/en/latest/).
- [x] I did not find an [existing issue](https://github.com/nanoporetech/dorado/issues?q=is%3Aissue).

### Dorado version

```
1.1.1
```

### Dorado subcommand

Basecaller

### The issue

dorado basecaller sup input.pod5 > basecalled.bam

Output:
[2025-09-04 15:12:33.631] [info] > Creating basecall pipeline
[2025-09-04 15:12:34.581] [info] Using CUDA devices:
[2025-09-04 15:12:34.581] [info] cuda:0 - NVIDIA GeForce RTX 3080
[2025-09-04 15:12:35.444] [info] Calculating optimized batch size for GPU "NVIDIA GeForce RTX 3080" and model dna_r10.4.1_e8.2_400bps_sup@v5.2.0. Full benchmarking will run for this device, which may take some time.
[2025-09-04 15:12:36.267] [info] cuda:0 using chunk size 12288, batch size 96
[2025-09-04 15:12:36.475] [info] cuda:0 using chunk size 6144, batch size 192
[2025-09-04 15:14:27.768] [warning] Caught Torch error 'CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
', clearing CUDA cache and retrying.
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: unspecified launch failure
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at /builds/machine-learning/torch-builds/pytorch/c10/cuda/CUDAException.cpp:43 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xb0 (0x7f220d205390 in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xfa (0x7f2203da5896 in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x3cc (0x7f220d1a639c in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #3: <unknown function> + 0xb245e1d (0x7f220d181e1d in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #4: <unknown function> + 0xb24677e (0x7f220d18277e in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #5: <unknown function> + 0xb259f98 (0x7f220d195f98 in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #6: /mnt/SSD1/dorado-1.1.1-linux-x64/bin/dorado() [0x54d7d4]
frame #7: <unknown function> + 0xc2b23 (0x7f2201bf7b23 in /mnt/SSD1/dorado-1.1.1-linux-x64/bin/../lib/libstdc++.so.6)
frame #8: <unknown function> + 0x8609 (0x7f2201ed2609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #9: clone + 0x43 (0x7f22018f6353 in /lib/x86_64-linux-gnu/libc.so.6)


### System specifications

Operating system: Ubuntu 20.04.6 LTS
CPU: AMD Ryzen 9 3950X 16-Core Processor
GPU: NVIDIA GeForce RTX 3080 10 GB
Storage: SSD

Hi,

I am facing an issue when I try to basecall my reads with the sup model: a few minutes after starting the command, the GPU crashes, and I need to reboot the system before I can use the GPU again.
I also tried lowering the batch size with -b 12, and the same error occurs.
Basecalling the same pod5 files with HAC on the same system works without any issue.
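
For reference, the reduced-batch run can be reproduced roughly as follows. This is only a sketch: `CUDA_LAUNCH_BLOCKING=1` is added because the Torch error message suggests it for debugging, `input.pod5` stands in for my actual files, and `dorado_debug.log` is a hypothetical filename for the captured stderr.

```shell
# Sketch of a debug rerun: same command as above, with the smaller batch size
# (-b 12) and synchronous CUDA error reporting as hinted by the Torch message.
# input.pod5 and dorado_debug.log are placeholder names.
export CUDA_LAUNCH_BLOCKING=1   # kernel launches become synchronous, so the run is slower

if command -v dorado >/dev/null 2>&1; then
    # stdout carries the BAM; stderr (log and stack trace) goes to a separate file
    dorado basecaller sup input.pod5 -b 12 > basecalled.bam 2> dorado_debug.log
else
    echo "dorado not found on PATH" >&2
fi
```

With synchronous launches, the stack trace in the log should point closer to the kernel that actually fails, which may help narrow down whether this is specific to the sup model.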

Thank you

Issue with SUP basecalling using RTX3080 10GB #1489
