
GGUF vs bin? #6068

Closed Answered by Jeximo
VadimBoev asked this question in Q&A
Mar 14, 2024 · 2 comments

I am interested in maximum performance. Which is better, GGUF or bin?

Same performance.

Also, I saw a topic in which ggerganov advised disabling mmap and recompiling llama.cpp after that change.
Does that really work?

Passing --no-mmap on the command line may help performance, depending on your system RAM.

And how can I set parameters for high performance?

Add -t N, --threads N

If you build with GPU support, then also add -ngl N, --n-gpu-layers N
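
For anyone scripting this, here is a rough Python sketch of how the flags mentioned above could be put together into one command. It only builds and prints the argument list; it does not run anything. The binary path `./main`, the model file `model.gguf`, and the helper name `build_llama_command` are assumptions for illustration, and defaulting the thread count to the logical CPU count is just a starting guess to tune from, not a recommendation from this thread.

```python
import os
import shlex


def build_llama_command(
    binary="./main",        # hypothetical path; the executable name depends on your build
    model="model.gguf",     # hypothetical model file
    threads=None,
    gpu_layers=0,
    no_mmap=False,
):
    """Assemble a llama.cpp argument list from the flags discussed above."""
    if threads is None:
        # Assumption: start from the logical CPU count and tune from there.
        threads = os.cpu_count() or 1
    args = [binary, "-m", model, "--threads", str(threads)]
    if gpu_layers > 0:
        # Only meaningful when llama.cpp was built with GPU support.
        args += ["--n-gpu-layers", str(gpu_layers)]
    if no_mmap:
        # May or may not help, depending on system RAM.
        args.append("--no-mmap")
    return args


if __name__ == "__main__":
    cmd = build_llama_command(threads=8, gpu_layers=32, no_mmap=True)
    print(shlex.join(cmd))
    # prints: ./main -m model.gguf --threads 8 --n-gpu-layers 32 --no-mmap
```

The values 8 and 32 are placeholders. The best thread count and number of offloaded layers depend on your CPU, GPU memory, and model size, so it is worth timing a few combinations.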

Answer selected by VadimBoev