Releases · stevenkuang-tencent/llama.cpp

18 Jul 06:32

8f974bc

b5929 Latest

graph : refactor context to not pass gf explicitly (#14629)

ggml-ci

Assets 15

cudart-llama-bin-win-cuda-12.4-x64.zip
sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6
373 MB 2025-07-18T06:32:02Z

llama-b5929-bin-macos-arm64.zip
sha256:f5944238902cab86640e6d8b8581407d62ebf934f1fa55267bda8c5b58ca7fdf
10.6 MB 2025-07-18T06:32:13Z

llama-b5929-bin-macos-x64.zip
sha256:efad7b8894c71c722551ba0290fa03a43768cdd37ed2271f5d62ecb6c4f49a05
27.2 MB 2025-07-18T06:32:14Z

llama-b5929-bin-ubuntu-vulkan-x64.zip
sha256:39b455109681212ca1b20e01007d91214367772bf280a11b1f555daf22668a6f
20.9 MB 2025-07-18T06:32:15Z

llama-b5929-bin-ubuntu-x64.zip
sha256:6e57b79a86c1cd54f856925e74f2a9d0f5b28336fbbc9b06e9f854a19a57c2b0
12.5 MB 2025-07-18T06:32:16Z

llama-b5929-bin-win-cpu-arm64.zip
sha256:a5b71101092e3077ed60576d5a24a11465f770e0dd711ada13f1fc41a9e3829e
10.9 MB 2025-07-18T06:32:17Z

llama-b5929-bin-win-cpu-x64.zip
sha256:92416523cbaf92ce213e511f5c6d110771810d162834dfc8f40f590eceb2b505
13.7 MB 2025-07-18T06:32:18Z

llama-b5929-bin-win-cuda-12.4-x64.zip
sha256:616153f65cc2f0528c12fc4d0c49f2ecb694e38aedcd3e344ffb8eb17512db16
129 MB 2025-07-18T06:32:19Z

llama-b5929-bin-win-hip-radeon-x64.zip
sha256:8215bbcf2b30aed1e49a7579d24d78130659495658bec3c4a66383acf93c22ab
299 MB 2025-07-18T06:32:24Z

llama-b5929-bin-win-opencl-adreno-arm64.zip
sha256:5fe749ed76a03bc072df563ca53a69b2a5aff83a0cf2ccdc0fe40de1fe413821
11.2 MB 2025-07-18T06:32:31Z

Source code (zip)
2025-07-18T05:29:28Z

Source code (tar.gz)
2025-07-18T05:29:28Z
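
Each binary asset above is published with a sha256 digest, so a download can be checked before it is unpacked. The following is a minimal sketch of one way to do that with Python's standard library; the helper names `sha256_of` and `matches_published` are hypothetical and are not part of llama.cpp or of any GitHub tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published):
    """Compare a local file against a published 'sha256:<hex>' value."""
    expected = published.strip()
    # The release page prefixes the digest with 'sha256:'; drop it if present.
    if expected.startswith("sha256:"):
        expected = expected[len("sha256:"):]
    return sha256_of(path) == expected.lower()
```

Assuming llama-b5929-bin-ubuntu-x64.zip has been downloaded to the working directory, `matches_published("llama-b5929-bin-ubuntu-x64.zip", "sha256:6e57b79a86c1cd54f856925e74f2a9d0f5b28336fbbc9b06e9f854a19a57c2b0")` should return `True` for an intact file and `False` otherwise.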

14 Jul 15:18

github-actions

b5896

55c509d

ggml : refactor llamafile_sgemm PPC code (#14673)

Remove unnecessary templates from class definition and packing functions
Reduce deeply nested conditionals and if-else switching in the mnpack function
Replace repetitive code with inline functions in packing functions

2-7% improvement with the Q8 model
15-50% improvement with the Q4 model

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>

Assets 15
