make LLAMA_CUBLAS=1 - nvcc fatal : Value 'native' is not defined for option 'gpu-architecture' #2142
-
I had the same error, so I followed the cmake instructions instead and it built fine.
-
I found a solution to this error and wanted to share it, though I'm not sure this is the right place. In the Makefile, change the line: to read: There might be some warnings about deprecation, but it compiled for me. For reference, I found the solution here: ggml-org/whisper.cpp#876

EDIT: After a little more tinkering, I've realized that 'all-major' is a bit hacky and can be narrowed down to the specific Nvidia card. First, find the card in question on this page: Once you've found the card, take its 'compute capability', drop the decimal point, and prefix it with 'sm_' in your Makefile. For example, I have a GeForce 2070 Max-Q. On the reference page, the GeForce 2070 compute capability is 7.5, so in my Makefile I changed the line to: and it compiles!

'compute_75' may also work, but I didn't try it (you can get a list of available options with nvcc --list-gpu-arch). Of course, 'all-major' should work just fine; I'm just going down the Makefile rabbit hole and thought I'd share in case it helps someone, somewhere, at some point.
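The compute-capability-to-architecture rule described above can be sketched as a small shell helper. This is only an illustration: the function name cc_to_arch is hypothetical and not part of llama.cpp or the CUDA toolkit. It just drops the decimal point and adds the 'sm_' prefix.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp or CUDA): turn a compute
# capability such as "7.5" into the matching nvcc architecture name.
cc_to_arch() {
    # Remove the decimal point, then prefix the result with "sm_".
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# The GeForce 2070 example from the comment above (compute capability 7.5).
cc_to_arch 7.5
```

The printed value is what goes after -arch= in place of 'native' in the nvcc command shown in the build log.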
-
The -arch=native option was introduced with this nvcc version: https://docs.nvidia.com/cuda/archive/11.6.0/cuda-toolkit-release-notes/index.html#cuda-compiler-new-features

Anyone who installs cuda-toolkit with apt on Ubuntu 22.04 LTS gets a version from the default repository that is just a little too old. So one has to either get a newer cuda-toolkit/nvcc or change the Makefile as proposed before. Maybe there should be a hint somewhere in the installation guide, because this happens on the LTS version of Ubuntu, which is probably used quite often.
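A rough way to check which case applies is to compare the installed nvcc release against 11.6 and fall back to an explicit architecture when it is older. The sketch below makes several assumptions: the 11.6 threshold is taken from the release notes linked above, the helper names version_ge and pick_arch_flag are hypothetical, the fallback sm_75 is just the example value from the earlier comment, and the comparison relies on GNU sort -V.

```shell
#!/bin/sh
# Hypothetical helpers, not part of llama.cpp or the CUDA toolkit.

# Succeeds when version $1 is greater than or equal to version $2.
# Assumes GNU sort with -V (version sort), as shipped on Ubuntu.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# $1: nvcc release (for example "11.5"), $2: fallback arch (for example "sm_75").
# The 11.6 threshold is an assumption based on the linked release notes.
pick_arch_flag() {
    if version_ge "$1" "11.6"; then
        echo "-arch=native"
    else
        echo "-arch=$2"
    fi
}

# Only query nvcc when it is actually installed.
if command -v nvcc >/dev/null 2>&1; then
    # "nvcc --version" reports a line containing "release X.Y,".
    release=$(nvcc --version | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p')
    pick_arch_flag "$release" "sm_75"
fi
```

On a toolkit older than 11.6 this prints the explicit flag, which matches the Makefile change proposed before; on a newer toolkit the default -arch=native should work unchanged.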
-
I was wondering if anyone else has come across this issue before when trying to compile llama with cublas.
I'm getting the error "Value 'native' is not defined for option 'gpu-architecture'".
I'm trying to compile llama on Ubuntu 22.04, and I have installed 5x Nvidia P40 (24 GB) and 2x Nvidia P100 (one with 12 GB and one with 16 GB).
I tried to follow what was being said here, but it appears they're trying to figure it out for Windows and never came to a definite conclusion.
#1070
Here is my log.
root@devops:/ai/llm/llama.cpp.gpu# make clean
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
I LDFLAGS:
I CC: cc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
I CXX: g++ (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
rm -vf *.o *.so main quantize quantize-stats perplexity embedding benchmark-matmult save-load-state server simple vdot train-text-from-scratch embd-input-test build-info.h
removed 'common.o'
removed 'ggml.o'
removed 'k_quants.o'
removed 'llama.o'
removed 'build-info.h'
root@devops:/ai/llm/llama.cpp.gpu# make LLAMA_CUBLAS=1
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I LDFLAGS: -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib
I CC: cc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
I CXX: g++ (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -c llama.cpp -o llama.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -c examples/common.cpp -o common.o
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -c -o k_quants.o k_quants.c
nvcc --forward-unknown-to-host-compiler -arch=native -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -Wno-pedantic -c ggml-cuda.cu -o ggml-cuda.o
nvcc fatal : Value 'native' is not defined for option 'gpu-architecture'
make: *** [Makefile:191: ggml-cuda.o] Error 1