failed to allocate buffer for deepseekv2 model Segmentation fault (core dumped) #8520

Answered by compilade
Ramzee-S asked this question in Q&A

> I tried to load a large model (DeepSeek-V2) on a large computer with 512 GB of DDR5 memory.
>
> llama_new_context_with_model: n_ctx = 163840
> ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 805306368032

It tried to allocate 805306368032 bytes (750 GiB).
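
As a quick sanity check of that figure (a minimal sketch, treating the 512 GB of RAM mentioned in the question as 512 GiB of usable memory):

```python
# Rough sanity check of the failed allocation (a sketch, not llama.cpp code).
requested = 805_306_368_032              # bytes, from the ggml log line above
print(f"{requested / 1024**3:.0f} GiB")  # -> 750 GiB

installed = 512 * 1024**3                # 512 GiB of RAM, from the question
print(requested > installed)             # True: this buffer alone exceeds installed memory
```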

> But maybe the buffer is so big it does not even try.

Yes, this is likely what happened.

> Is there any setting I can try to change?

Try using a smaller context size, with the -c flag.

For example (extending the command you've used):

$ ./llama-server -m ./models/deepseek.gguf --port 8080 -c 32768

This should make the huge buffer from before much smaller, at around 150 GiB (since 32768 is 5 times smaller than 163840, and 150 GiB is 5 times smaller than 750 GiB).
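
To see how that estimate falls out, here is a minimal sketch that scales the failing buffer size linearly with the context length. The exact size in llama.cpp also depends on the model architecture and the KV cache type, so treat these numbers as rough estimates:

```python
# Estimate the context buffer for smaller -c values, assuming it scales
# linearly with n_ctx (a simplification of llama.cpp's actual accounting).
full_ctx = 163_840            # n_ctx from the failing run
full_bytes = 805_306_368_032  # buffer size from the failing run

for new_ctx in (32_768, 16_384, 8_192):
    est = full_bytes * new_ctx / full_ctx
    print(f"-c {new_ctx}: ~{est / 1024**3:.0f} GiB")
# -c 32768: ~150 GiB
# -c 16384: ~75 GiB
# -c 8192:  ~38 GiB
```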
