When not to use the LLAMAFILE CPP flag #10338

bertsons · 2024-11-16T16:20:05Z

By default, both the Makefile and the CMake files pass the GGML_USE_LLAMAFILE preprocessor (CPP) flag to the compiler. Justine Tunney remarks that "the speedup works best for prompts having fewer than 1,000 tokens" (https://justine.lol/matmul/).

Is there a prompt size beyond which performance with the GGML_USE_LLAMAFILE flag becomes worse than without it?
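To make the question concrete, here is a minimal sketch of how such a crossover could be located once both builds have been benchmarked. It assumes you have already measured prompt-processing throughput (tokens per second) at several prompt sizes for a build with the flag and a build without it. The function name `first_slower_prompt_size` and the dictionary layout are hypothetical, chosen only for illustration, and the sketch does not run any benchmark itself.

```python
from typing import Mapping, Optional


def first_slower_prompt_size(
    with_flag: Mapping[int, float],
    without_flag: Mapping[int, float],
) -> Optional[int]:
    """Return the smallest prompt size at which the build compiled with
    GGML_USE_LLAMAFILE is slower than the build without it.

    Both mappings go from prompt size (tokens) to measured prompt-processing
    throughput (tokens per second). Only prompt sizes present in both
    mappings are compared. Returns None if the flagged build is never slower.
    """
    # Compare only the prompt sizes that were measured for both builds.
    common_sizes = sorted(set(with_flag) & set(without_flag))
    for size in common_sizes:
        if with_flag[size] < without_flag[size]:
            return size
    return None
```

The inputs would have to come from your own measurements on your own hardware, for example by running the same model at a range of prompt sizes against each build. A result of `None` only means that no slowdown showed up at the sizes you tested, not that none exists.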

Replies: 0 comments
