-
Just wondering what the GGML_LLAMAFILE CMake build option actually does? I presume it doesn't output a llamafile of some sort?
Replies: 2 comments
-
This is a build flag related to the CPU backend. It enables a small optimized BLAS (Basic Linear Algebra Subprograms) implementation originally written by Justine Tunney for llamafile. It was contributed in #6414. There is a nice blog post about it here: https://justine.lol/matmul/

We can see where this is used in the CPU backend, in src/ggml-cpu/CMakeLists.txt:

```cmake
if (GGML_LLAMAFILE)
    message(STATUS "Using llamafile")
    add_compile_definitions(GGML_USE_LLAMAFILE)
    target_sources(ggml-cpu PRIVATE
                   llamafile/sgemm.cpp
                   llamafile/sgemm.h)
endif()
```

The header declares one function, a single-precision general matrix multiplication (sgemm):

```cpp
bool llamafile_sgemm(int64_t, int64_t, int64_t, const void *, int64_t,
                     const void *, int64_t, void *, int64_t, int, int,
                     int, int, int);
```

It is later used in the CPU backend.
-
Thanks for the info. I think calling this flag LLAMAFILE is quite confusing to newcomers to llama.cpp. Surely TINYBLAS would be more suitable & direct?
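For anyone landing here who wants to compare: the flag is an ordinary CMake option, so it can be toggled at configure time. A sketch (the default value and exact build steps may vary by llama.cpp version):

```shell
# Configure llama.cpp with the llamafile sgemm kernels disabled,
# then build; drop -DGGML_LLAMAFILE=OFF to use the default.
cmake -B build -DGGML_LLAMAFILE=OFF
cmake --build build --config Release
```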