-
I want to build the same thing and just found your post. I got it running natively today on a Raspberry Pi 5 with Debian 12 and a recompiled, patched 6.6.y kernel, connected over a PCIe OCuLink adapter to an AMD GPU. It runs great. Now I would like to containerise the build for Docker and eventually K8s. https://www.jeffgeerling.com/blog/2024/llms-accelerated-egpu-on-raspberry-pi-5
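Roughly what I have in mind for the container build is sketched below. This is untested: the Debian package names, the GGML_VULKAN CMake flag, and the ggml-org/llama.cpp repo location are my assumptions and may need adjusting for arm64.

```dockerfile
# Untested sketch: Debian bookworm arm64 base, Mesa/RADV Vulkan, llama.cpp built with the Vulkan backend.
FROM debian:bookworm

# Build toolchain plus Vulkan headers, the glslc shader compiler, and the RADV driver.
RUN apt-get update && apt-get install -y \
        build-essential cmake git \
        libvulkan-dev glslc vulkan-tools mesa-vulkan-drivers && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /opt
RUN git clone https://github.com/ggml-org/llama.cpp.git && \
    cmake -S llama.cpp -B llama.cpp/build -DGGML_VULKAN=ON && \
    cmake --build llama.cpp/build --config Release -j"$(nproc)"

ENTRYPOINT ["/opt/llama.cpp/build/bin/llama-cli"]
```

The host only needs to expose the GPU device nodes at run time; the kernel/amdgpu side stays on the host, while the Mesa userspace lives in the image.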
-
Hi,
My arm64 (Raspberry Pi 5) + Vulkan + amdgpu (RX 6700 XT) setup works fine on the physical machine with Raspberry Pi OS (6.6.y). The problem is the Docker container based on that setup. I built a container to run llama.cpp (Debian bookworm arm64 + Vulkan + amdgpu RX 6700 XT) with /dev/dri mapped in, but it fails with the output below. What does "Bus error (core dumped)" mean?
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6700 XT (RADV NAVI22) (radv) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 65536 | matrix cores: none
build: 4984 (5d01670) with cc (Debian 12.2.0-14) 12.2.0 for aarch64-linux-gnu
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_load_from_file_impl: using device Vulkan0 (AMD Radeon RX 6700 XT (RADV NAVI22)) - 12032 MiB free
llama_model_loader: loaded meta data with 35 key-value pairs and 255 tensors from models/Llama-3.2-3B-Instruct-Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
...........
load_tensors: loading model tensors, this can take a while... (mmap = true)
make_cpu_buft_list: disabling extra buffer types (i.e. repacking) since a GPU device is available
load_tensors: offloading 28 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 29/29 layers to GPU
load_tensors: Vulkan0 model buffer size = 1918.35 MiB
load_tensors: CPU_Mapped model buffer size = 308.23 MiB
...........................................................................
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 4096
llama_context: n_ctx_per_seq = 4096
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: freq_base = 500000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
Bus error (core dumped)
vulkaninfo shows:
VULKANINFO
Vulkan Instance Version: 1.3.239
Instance Extensions: count = 20
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
Instance Layers: count = 3
VK_LAYER_KHRONOS_validation Khronos Validation Layer 1.3.239 version 1
VK_LAYER_MESA_device_select Linux device selection layer 1.3.211 version 1
VK_LAYER_MESA_overlay Mesa Overlay layer 1.3.211 version 1
Devices:
GPU0:
apiVersion = 1.3.230
driverVersion = 22.3.6
vendorID = 0x1002
deviceID = 0x73df
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = AMD Radeon RX 6700 XT (RADV NAVI22)
driverID = DRIVER_ID_MESA_RADV
driverName = radv
driverInfo = Mesa 22.3.6
conformanceVersion = 1.3.0.0
deviceUUID = 00000000-0300-0000-0000-000000000000
driverUUID = 414d442d-4d45-5341-2d44-525600000000
The GPU usage monitor shows a peak.
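For context, the device mapping is done along these lines (a sketch: the image tag, model directory, and render-node group lookup are placeholders; the model name and layer count are taken from the log above):

```sh
# Map the DRM nodes into the container and add the host's render group so the
# container user can open /dev/dri/renderD128.
docker run --rm -it \
  --device /dev/dri \
  --group-add "$(stat -c '%g' /dev/dri/renderD128)" \
  -v "$PWD/models:/models" \
  llama-cpp-vulkan:arm64 \
  -m /models/Llama-3.2-3B-Instruct-Q4_K_M.gguf -ngl 29 -c 4096
```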
