Description
Hello,
This bug only happens in ~1-10 out of ~5000 themisto build runs. I am running themisto on ~4500 genomes, calling themisto build on each genome separately and using HPC to schedule the runs in parallel. Sometimes I see ~5 runs hang indefinitely; most recently I saw 1 run hang indefinitely. If I re-run a failed run, themisto finishes normally. So this bug does not seem to be caused by the specific data, and it occurs in roughly 0.02-0.2% of themisto build runs.
Pure speculation on my part, but perhaps caused by some kind of rare race condition?
Here is how I am calling themisto on HPC:
sbatch -p scavenger --mem=2G --cpus-per-task=4 --wrap="themisto build -k 31 -i ../results/themisto_replicon_references/GCF_017165095.1_ASM1716509v1_genomic/GCF_017165095.1_ASM1716509v1_genomic.txt --index-prefix ../results/themisto_replicon_indices/GCF_017165095.1_ASM1716509v1_genomic --temp-dir ../results/themisto_replicon_indices/temp --mem-gigas 2 --n-threads 4 --file-colors"
9 hours later, this is what the log file looked like:
failed-slurm-11363085.out.txt
When I rerun this command, the run finishes in 8 seconds. Here is the log file:
rerun-slurm-11372781.out.txt
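Since a rerun finishes normally, a possible stopgap on my side is to run each job under a time limit and retry it if the limit is hit. Below is a minimal sketch, not something I have tested against themisto itself. It assumes bash and GNU coreutils `timeout` are available on the compute nodes; `run_with_retry` is a hypothetical helper name, and the limit and attempt count are guesses that would need tuning.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of themisto or Slurm):
# run a command under a time limit and retry it if it fails or times out.
# Usage: run_with_retry <limit_seconds> <max_attempts> <command> [args...]
run_with_retry() {
    local limit="$1" max_attempts="$2"
    shift 2
    local attempt status
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        # GNU timeout kills the command once the limit is exceeded
        timeout "$limit" "$@"
        status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        # GNU timeout exits with status 124 when the limit was hit
        if [ "$status" -eq 124 ]; then
            echo "attempt $attempt timed out after ${limit}s" >&2
        else
            echo "attempt $attempt failed with exit status $status" >&2
        fi
    done
    # Every attempt failed or timed out
    return 1
}
```

In a batch script, the themisto build command above would be passed as the command, for example with a limit of 600 seconds and 3 attempts. I have not checked whether a killed run leaves files behind in the temp dir, so a separate temp dir per attempt may be safer.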