88 commits
bb40365
acuda stubs
vellamike Jun 7, 2018
26f042d
Sending event means to device
vellamike Jun 7, 2018
381c3c5
estimating emission probabilities
vellamike Jun 11, 2018
8b74a57
estimating emission probabilities
vellamike Jun 11, 2018
9add07a
kermel Executing to completion but incomplete -WIP
vellamike Jun 11, 2018
2fb2b0b
Sending correct sequences to GPU
vellamike Jun 12, 2018
b614313
Correct grid size
vellamike Jun 12, 2018
aa43bc9
Match state almost working
vellamike Jun 15, 2018
a4dbf43
All states now being updated, but no terminal kmer or scaling
vellamike Jun 15, 2018
46e6ead
diagnosing issue
vellamike Jun 25, 2018
e148c87
Dynamic Programming Table the same for GPU and CPU except end
vellamike Jun 27, 2018
6dddd48
first two base scores correct, bug for other ones
vellamike Jul 2, 2018
03ffaba
GPU and CPU versions now giving same results
vellamike Jul 3, 2018
ac82456
Removed print statements
vellamike Jul 3, 2018
10db85a
Fixed bug with overly-large host allocations
vellamike Jul 4, 2018
f5c0b4a
removed some print statements
vellamike Jul 4, 2018
5a203a4
removed some print statements
vellamike Jul 4, 2018
458a84c
Sharing a lot more memory
vellamike Jul 4, 2018
eae79cb
Kernel now fast but some numerical errors remain
vellamike Jul 5, 2018
348fcf0
Fixed bug which was causing incorrect forward strand results
vellamike Jul 6, 2018
0719a9b
tidyup
vellamike Jul 6, 2018
712e068
some performance improvments
vellamike Jul 6, 2018
d6be1c6
Fix error and tidy up
vellamike Jul 9, 2018
ad39f6a
Merge branch 'master' into benchmark
vellamike Jul 10, 2018
ca3af6e
tidy up
vellamike Jul 10, 2018
27fe627
small performance improvments
vellamike Jul 10, 2018
0e7fdcb
tidyup
vellamike Jul 10, 2018
a0cce8f
Update README.md
vellamike Jul 10, 2018
677c94b
Update README.md
vellamike Jul 10, 2018
213b8eb
Update README.md
vellamike Jul 10, 2018
33d3b56
tidup
vellamike Jul 10, 2018
9dc268b
Merge branch 'benchmark' of https://github.com/nanoporetech/nanopolis…
vellamike Jul 10, 2018
dbd7906
typo fix
vellamike Jul 10, 2018
29bf060
Storing kmer ranks in one buffer
vellamike Jul 11, 2018
39fec2b
fixed a makefile error
vellamike Jul 11, 2018
c6414cc
Some simple CUDA API error reporting
vellamike Jul 12, 2018
188de17
One buffer for pore model
vellamike Jul 12, 2018
a2e8c2f
One buffer for pore model
vellamike Jul 12, 2018
9b8f029
Keeping pore model in registers
vellamike Jul 12, 2018
20eca32
Removed print statement
vellamike Jul 12, 2018
ca1796f
Async kernel invocations for improved occupancy
vellamike Jul 13, 2018
58bf7b7
Adding restrict flag to nvcc
vellamike Jul 16, 2018
8b020be
transferring means data and pre/post-flanks
vellamike Jul 16, 2018
05a8896
WIP - modifications to kernel for performance improvments
vellamike Jul 19, 2018
8acc6d2
Both Kernels giving similar but not identical results
vellamike Jul 20, 2018
409fb3a
Split work into smaller threadBlocks
vellamike Jul 20, 2018
8136fa6
New Kernel working in multi-base mode. Code needs big refactor and te…
vellamike Jul 21, 2018
9d43239
Increased buffer sizes
vellamike Jul 23, 2018
d7f2e31
Fixed issue with bases at end not being corrected
vellamike Jul 23, 2018
18effc5
16 workers - better on V100 for now
vellamike Jul 23, 2018
5b09cbc
Refactor of nanopolish_call_variants.cpp
vellamike Jul 24, 2018
68fb38e
Fewer and bigger streams
vellamike Jul 24, 2018
bb69f2e
fixing a memory leak
vellamike Jul 25, 2018
2b14a68
40x coverage
vellamike Jul 25, 2018
f4d53cc
added max coverage
vellamike Jul 26, 2018
cf5be6a
Finding good max coverage to use
vellamike Jul 26, 2018
6e22a85
Performance tuning for V100
vellamike Jul 26, 2018
e2a3525
set sleep to 100us
vellamike Jul 27, 2018
1adf4b8
Merged upstream master
vellamike Aug 2, 2018
2856bbb
Adding files for VCF handling which for some reason are absent
vellamike Aug 15, 2018
489c42a
resolved merge conflict
vellamike Sep 20, 2018
ca5f7a2
tidying makefile
vellamike Sep 20, 2018
5ecf066
tidying
vellamike Sep 20, 2018
075cee3
GPU acceleration of nanopolish consensus
vellamike Sep 20, 2018
9794750
removed spurious comment
vellamike Sep 20, 2018
3fc628e
setting indentation to 4 to match rest of nanopolish
vellamike Sep 20, 2018
b2fb309
removed some outdated comments
vellamike Sep 20, 2018
186ac5d
removed old debug code
vellamike Sep 20, 2018
e823003
removed deprecated code
vellamike Sep 20, 2018
551cd23
removed old debug code
vellamike Sep 20, 2018
5d67b61
revert typo
vellamike Sep 20, 2018
27f4d5c
Made indentation consistent
vellamike Sep 20, 2018
585302a
fixed indentation
vellamike Sep 20, 2018
0cec8f9
Merge branch 'master' into candidate-scoring-gpu
vellamike Sep 20, 2018
f3bf3e1
changes to the makefile to get it compiled
hasindu2008 Sep 25, 2019
e484b29
cleaned up the make file and added cuda support as an option with min…
hasindu2008 Sep 27, 2019
c0fe717
Merge remote-tracking branch 'upstream/master' into gpu-varcall
hasindu2008 Sep 27, 2019
f19f9b8
cleaned up to be consistent with the original code
hasindu2008 Sep 27, 2019
5658597
restructured to minimise changes to the original source code
hasindu2008 Sep 28, 2019
8338b92
make the --gpu more clear
hasindu2008 Sep 28, 2019
3c8677b
Merge remote-tracking branch 'upstream/master' into gpu-varcall-update
hasindu2008 Sep 28, 2019
c05733b
set to cuda static runtime library
hasindu2008 Sep 28, 2019
8964db0
removed .gitignore in test/
hasindu2008 Sep 28, 2019
896b806
add cida object file to make file clean option
hasindu2008 Oct 3, 2019
91deb0b
Merge remote-tracking branch 'upstream/master' into gpu-varcall-update
hasindu2008 Mar 12, 2020
09df08d
implementation of the methylation aware polishing option for the GPU
hasindu2008 Mar 13, 2020
2e8b6b9
Merge remote-tracking branch 'upstream/master' into gpu-varcall-update
hasindu2008 Mar 22, 2020
b13ab2d
Fixed edge case causing segfault when no reads are present in a scoreSet
vellamike May 1, 2020
5 changes: 5 additions & 0 deletions Makefile
@@ -142,6 +142,10 @@ EXE_SRC = src/main/nanopolish.cpp src/test/nanopolish_test.cpp
CPP_OBJ = $(CPP_SRC:.cpp=.o)
C_OBJ = $(C_SRC:.c=.o)

ifdef cuda
include cuda.mk
endif

# Generate dependencies
.PHONY: depend
depend: .depend
@@ -172,4 +176,5 @@ test: $(TEST_PROGRAM)
.PHONY: clean
clean:
	rm -f $(PROGRAM) $(TEST_PROGRAM) $(CPP_OBJ) $(C_OBJ) \
		src/cuda_kernels/gpu_aligner.o \
		src/main/nanopolish.o src/test/nanopolish_test.o
8 changes: 8 additions & 0 deletions README.md
@@ -131,6 +131,14 @@ Then you can run nanopolish from the image:
docker run -v /path/to/local/data/data/:/data/ -it :image_id ./nanopolish eventalign -r /data/reads.fa -b /data/alignments.sorted.bam -g /data/ref.fa
```

## GPU acceleration

The nanopolish consensus improvement algorithm can run faster with CUDA-enabled GPU acceleration. This feature is experimental; to try it, run with the `--gpu=1` flag, e.g.:
```
nanopolish variants --consensus polished_gpu.fa -w "tig00000001:200000-230000" -r reads.fasta -b reads.sorted.bam -g draft.fa --threads=8 --gpu=1
```
Note that this feature requires nanopolish to be compiled with `make cuda=1`. You should have the [CUDA toolkit installed and configured](https://docs.nvidia.com/cuda/cuda-quick-start-guide/). If your CUDA installation is not in the default location (`/usr/local/cuda`), pass the paths to make: `make cuda=1 NVCC=/path/to/nvidia_c_compiler CUDA_LIB=/path/to/cuda/lib CUDA_INCLUDE=/path/to/cuda/include`.

## Credits and Thanks

The fast table-driven logsum implementation was provided by Sean Eddy as public domain code. This code was originally part of [hmmer3](http://hmmer.janelia.org/). Nanopolish also includes code from Oxford Nanopore's [scrappie](https://github.com/nanoporetech/scrappie) basecaller. This code is licensed under the MPL.
27 changes: 27 additions & 0 deletions cuda.mk
@@ -0,0 +1,27 @@
#Make file options for CUDA support

NVCC ?= nvcc
CUDA_ROOT = /usr/local/cuda
CUDA_LIB ?= $(CUDA_ROOT)/lib64
CUDA_INCLUDE ?= $(CUDA_ROOT)/include
CURTFLAGS = -L$(CUDA_LIB) -lcudart_static -lrt
NVCCFLAGS ?= -g -lineinfo -std=c++11 -I. -I$(CUDA_INCLUDE) -O3 -use_fast_math --default-stream per-thread -restrict

CPPFLAGS += -I$(CUDA_INCLUDE)
CPPFLAGS += -DHAVE_CUDA=1

# Sub directories containing CUDA source code
SUBDIRS += src/cuda_kernels
# Find the source files by searching subdirectories
CU_SRC := $(foreach dir, $(SUBDIRS), $(wildcard $(dir)/*.cu))
# Automatically generated object names
CU_OBJ = $(CU_SRC:.cu=.o)
CPP_OBJ += $(CU_OBJ)
LDFLAGS += $(CURTFLAGS)

.SUFFIXES: .cu

# Compile objects
.cu.o:
	$(NVCC) -o $@ -c $(NVCCFLAGS) $(CPPFLAGS) $<
