Replies: 1 comment 1 reply
-
This can happen when using mmap. The CPU buffer size in this case represents the size of the memory-mapped file; it is not really a separately allocated buffer. Under Linux the offloaded portions of the model will be unmapped after loading, but that cannot be done on Windows.
-
I have 3 GPUs, two 24 GB and one 12 GB, for a total of 60 GB of VRAM. I'm trying Goliath, which is 66 GB, so I figure roughly 58 GB in VRAM and 10-12 GB at most on the CPU. I'm running with a context size of 512. How come the CPU buffer size is still 67 GB after almost filling up the VRAM?
```
66G Feb 9 07:58 goliath-120b.Q4_K_M.gguf
llm_load_tensors: offloading 110 repeating layers to GPU
llm_load_tensors: offloaded 110/138 layers to GPU
llm_load_tensors: CPU buffer size   = 67364.36 MiB
llm_load_tensors: CUDA0 buffer size = 10584.06 MiB
llm_load_tensors: CUDA1 buffer size = 20978.38 MiB
llm_load_tensors: CUDA2 buffer size = 21809.56 MiB
```