Description
New issue checks
- I have read the Dorado Documentation.
- I did not find an existing issue.
Dorado version
1.1.1
Dorado subcommand
Basecaller
The issue
I have now experienced several crashes, so here is a report. The basecaller starts up normally and runs for a long time before crashing; in the run logged below, the crash came about 11 hours after launch. Nothing else is running on the computer.
[2025-09-26 13:50:19.756] [info] Running: "basecaller" "--recursive" "--device" "cuda:all" "sup" "./PBE71308/" "--modified-bases" "4mC_5mC" "6mA" "--resume-from" "PBE71308.dorado1.1.1.bm5.2.0_sup.sim.mod4mC_5mC_6mA.bam"
[2025-09-26 13:50:20.260] [info] - downloading dna_r10.4.1_e8.2_400bps_sup@v5.2.0 with httplib
[2025-09-26 13:50:22.560] [info] - downloading dna_r10.4.1_e8.2_400bps_sup@v5.2.0_4mC_5mC@v1 with httplib
[2025-09-26 13:50:23.348] [info] - downloading dna_r10.4.1_e8.2_400bps_sup@v5.2.0_6mA@v1 with httplib
[2025-09-26 13:50:24.185] [info] > Creating basecall pipeline
[2025-09-26 13:50:25.687] [info] Using CUDA devices:
[2025-09-26 13:50:25.687] [info] cuda:0 - NVIDIA GeForce RTX 5090
[2025-09-26 13:50:25.687] [info] cuda:1 - NVIDIA GeForce RTX 5090
[2025-09-26 13:50:27.018] [info] Calculating optimized batch size for GPU "NVIDIA GeForce RTX 5090" and model dna_r10.4.1_e8.2_400bps_sup@v5.2.0. Full benchmarking will run for this device, which may take some time.
[2025-09-26 13:50:27.104] [info] Calculating optimized batch size for GPU "NVIDIA GeForce RTX 5090" and model dna_r10.4.1_e8.2_400bps_sup@v5.2.0. Full benchmarking will run for this device, which may take some time.
[2025-09-26 13:50:28.452] [info] cuda:0 using chunk size 12288, batch size 224
[2025-09-26 13:50:28.452] [info] cuda:1 using chunk size 12288, batch size 288
[2025-09-26 13:50:28.603] [info] cuda:0 using chunk size 6144, batch size 448
[2025-09-26 13:50:28.638] [info] cuda:1 using chunk size 6144, batch size 288
[2025-09-26 13:50:28.848] [info] > Inspecting resume file...
[2025-09-26 13:50:28.854] [info] Resuming from file PBE71308.dorado1.1.1.bm5.2.0_sup.sim.mod4mC_5mC_6mA.bam...
[2025-09-26 14:48:59.831] [info] > 12231628 original read ids found in resume file.
Koi RMSNorm residual: failed to set smem size 184324h:16m:12s] Basecalling
[2025-09-27 00:51:55.231] [error] Koi tiled path failed 7
terminate called after throwing an instance of 'std::runtime_error'
what(): Koi convolution (host_window_ntwc_f16) failed with in size 16
System specifications
Ubuntu 24
2x RTX 5090, AMD Ryzen Threadripper 7960X 24-Cores, 192 GB RAM, NVMe SSD