Open
Labels
zluda_trace logs: zluda_trace log files for a particular application
zluda_trace logs (tarball/zip file)
Description
Whisper.cpp, when compiled with CUDA support, doesn't work with ZLUDA: initialization fails with "CUDA driver version is insufficient for CUDA runtime version". The same build works without ZLUDA on an NVIDIA card.
whisper_init_from_file_with_params_no_state: loading model from '/home/asdf/Downloads/whisper.cpp-master/models/ggml-base.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
ggml_cuda_init: failed to initialize CUDA: CUDA driver version is insufficient for CUDA runtime version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 3060 Laptop GPU (NVIDIA) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2
whisper_init_with_params_no_state: devices = 2
whisper_init_with_params_no_state: backends = 3
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: Vulkan0 total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_backend_init_gpu: using Vulkan0 backend
whisper_init_state: kv self size = 6.29 MB
whisper_init_state: kv cross size = 18.87 MB
whisper_init_state: kv pad size = 3.15 MB
whisper_init_state: compute buffer (conv) = 17.24 MB
whisper_init_state: compute buffer (encode) = 85.88 MB
whisper_init_state: compute buffer (cross) = 4.66 MB
whisper_init_state: compute buffer (decode) = 97.29 MB
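The "driver version is insufficient" error generally means the CUDA runtime bound to a libcuda.so other than the intended one (the system driver instead of ZLUDA's replacement, or none at all). A quick diagnostic sketch, assuming a Linux system where ldconfig is available:

```shell
# List every libcuda.so the dynamic loader knows about; if ZLUDA's copy
# is not listed (or does not take precedence on the search path), the
# runtime binds to the system driver instead.
ldconfig -p 2>/dev/null | grep libcuda || echo "no libcuda.so registered with the loader"
```

Comparing this output with and without ZLUDA on the library path shows which driver library whisper-cli actually loads.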
Steps to reproduce
From the whisper.cpp README:
First clone the repository:
git clone https://github.com/ggml-org/whisper.cpp.git
Navigate into the directory:
cd whisper.cpp
Then, download one of the Whisper models converted in ggml format. For example:
sh ./models/download-ggml-model.sh base.en
Now build the whisper-cli example and transcribe an audio file like this:
# build the project
cmake --fresh -B build -D WHISPER_FFMPEG=yes -DGGML_CUDA=1 -DGGML_VULKAN=1
cmake --build build -j --config Release
# transcribe an audio file
./build/bin/whisper-cli -f samples/jfk.wav
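For completeness, a minimal sketch of how the reproducer is run under ZLUDA on Linux, with ZLUDA's libraries taking precedence over the system driver. `/opt/zluda` is a hypothetical install location; substitute wherever ZLUDA's libcuda.so actually lives:

```shell
# Hypothetical ZLUDA install directory; adjust to your local path.
ZLUDA_DIR=/opt/zluda
# Prepend ZLUDA's libcuda.so to the loader search path for this one process,
# so the CUDA runtime inside whisper-cli resolves it first.
LD_LIBRARY_PATH="$ZLUDA_DIR:$LD_LIBRARY_PATH" ./build/bin/whisper-cli -f samples/jfk.wav
```

If the same error appears with this invocation, the runtime is still rejecting the version that ZLUDA's libcuda.so reports, rather than falling back to the system driver.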
ZLUDA version
5
Operating System
EndeavourOS (rolling)
GPU
AMD Radeon RX Vega 7