failed to allocate compute buffers #9808
-
I'm on an M1 MacBook Air, trying to run Llama 3.2 3B Instruct, obtained from this link, but I get an allocation error.
Earlier I tried running a Q8 quant of the model and got the same error, so I thought a smaller quant would work better, but I cannot run that either. What should I do? Could somebody explain the error? Thanks in advance.
Replies: 2 comments
-
Same problem here... Mac M3 w/ 16GB unified memory trying to run
ggml_metal_init: recommendedMaxWorkingSetSize = 11453.25 MB
llama_kv_cache_init: Metal KV buffer size = 4096.00 MiB
llama_new_context_with_model: KV self size = 4096.00 MiB, K (f16): 2048.00 MiB, V (f16): 2048.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.49 MiB
ggml_gallocr_reserve_n: failed to allocate Metal buffer of size 8875151360
ggml_backend_metal_buffer_type_alloc_buffer: error: failed to allocate buffer, size = 8464.02 MiB
llama_new_context_with_model: failed to allocate compute buffers
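For anyone trying to read the log above, here is a rough sanity check of the numbers, as a sketch rather than a statement of how llama.cpp accounts for memory. It assumes that "MB" in the first line means 10^6 bytes and "MiB" means 2^20 bytes, and it only adds up the two buffers that appear in the log; any other Metal buffers (model weights, for example) would come on top.

```python
# Figures copied from the log above; the unit interpretation is an assumption.
MIB = 1024 * 1024

recommended_max_mb = 11453.25        # recommendedMaxWorkingSetSize, printed as "MB"
kv_cache_mib = 4096.00               # Metal KV buffer size
compute_request_bytes = 8875151360   # buffer size that failed to allocate

# Assumption: "MB" is 10^6 bytes, "MiB" is 2^20 bytes.
limit_mib = recommended_max_mb * 1_000_000 / MIB
compute_mib = compute_request_bytes / MIB
needed_mib = kv_cache_mib + compute_mib
over_budget = needed_mib > limit_mib

print(f"working-set limit : {limit_mib:9.2f} MiB")
print(f"KV cache          : {kv_cache_mib:9.2f} MiB")
print(f"compute buffer    : {compute_mib:9.2f} MiB")
print(f"KV + compute      : {needed_mib:9.2f} MiB")
print(f"exceeds the limit : {over_budget}")
```

Under these assumptions the KV cache and the requested compute buffer together are larger than the working-set size Metal recommends on this machine, which is consistent with the allocation failing.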
-
Update - fixed by adding the -ngl 0 flag.
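For context, -ngl sets the number of model layers llama.cpp offloads to the GPU, so 0 keeps the layers on the CPU instead of in Metal buffers. A minimal sketch of assembling such a command follows; the binary name llama-cli and the model path are placeholders I am assuming, not details from this thread, and build_llama_cmd is a hypothetical helper.

```python
import shlex

def build_llama_cmd(model_path, n_gpu_layers=0, binary="llama-cli"):
    """Build an argv list for llama.cpp with an explicit GPU layer count.

    `binary` and `model_path` are placeholders; adjust them to your setup.
    """
    return [binary, "-m", model_path, "-ngl", str(n_gpu_layers)]

# Hypothetical model path, for illustration only.
cmd = build_llama_cmd("models/model.gguf")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run, or the printed line can be pasted into a terminal.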