-
Just wondering what the GGML_LLAMAFILE CMake build option actually does? I presume it doesn't output a llamafile of some sort?
Replies: 2 comments
-
This is a build flag related to the CPU backend. It enables a small optimized BLAS (Basic Linear Algebra Subprograms) implementation originally written by Justine Tunney for llamafile. It was contributed in #6414. There is a nice blog post about it here: https://justine.lol/matmul/

We can see where this is used in the CPU backend, in src/ggml-cpu/CMakeLists.txt:

```cmake
if (GGML_LLAMAFILE)
    message(STATUS "Using llamafile")
    add_compile_definitions(GGML_USE_LLAMAFILE)
    target_sources(ggml-cpu PRIVATE
                   llamafile/sgemm.cpp
                   llamafile/sgemm.h)
endif()
```

The header declares one function, a single-precision general matrix multiplication (sgemm):

```cpp
bool llamafile_sgemm(int64_t, int64_t, int64_t, const void *, int64_t,
                     const void *, int64_t, void *, int64_t, int, int,
                     int, int, int);
```

It is later used in the CPU backend.
-
Thanks for the info. I think calling this flag LLAMAFILE is quite confusing to newcomers to llama.cpp. Surely TINYBLAS would be more suitable & direct?
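For anyone landing here who wants to compare: the flag is an ordinary CMake option, so it can be toggled at configure time. A sketch (the default value and exact build steps may vary by llama.cpp version):

```shell
# Configure llama.cpp with the llamafile sgemm kernels disabled,
# then build; drop -DGGML_LLAMAFILE=OFF to use the default.
cmake -B build -DGGML_LLAMAFILE=OFF
cmake --build build --config Release
```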