Releases: stevenkuang-tencent/llama.cpp

b5929

18 Jul 06:32
8f974bc
graph : refactor context to not pass gf explicitly (#14629)

ggml-ci

b5896

14 Jul 15:18
55c509d
ggml : refactor llamafile_sgemm PPC code (#14673)

Remove unnecessary templates from the class definition and packing functions
Reduce deeply nested conditionals and if-else switching in the mnpack function
Replace repetitive code in the packing functions with inline functions

2% to 7% improvement on a Q8 model
15% to 50% improvement on a Q4 model

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>