Releases: bartowski1182/llama.cpp

b2943 · 20 May 00:19 · commit b442ab0
Merge branch 'ggerganov:master' into master

b2940 · 19 May 17:52 · commit 063b0d4
Merge branch 'ggerganov:master' into master

b2937 · 19 May 17:31
Add Smaug tokenizer support

b2936 · 19 May 17:28 · commit 5ca49cb
ggml: implement quantized KV cache for FA (#7372)
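The quantized KV cache for flash attention referenced in this release can be exercised from the llama.cpp command line. A minimal sketch, assuming the `-fa`, `-ctk`, and `-ctv` flags present in llama.cpp builds of this period and a hypothetical local GGUF model path:

```shell
# Sketch, not a definitive invocation: the model path is hypothetical, and
# the flag names are assumed from llama.cpp builds of this era.
# -fa enables flash attention, which the quantized KV cache relies on;
# -ctk / -ctv set the K and V cache types to a quantized format (e.g. q8_0)
# instead of the default f16, reducing KV cache memory use at long contexts.
./main -m ./models/model.gguf -fa -ctk q8_0 -ctv q8_0 -p "Hello" -n 32
```

Whether a given quantized cache type is supported with flash attention depends on the build and backend, so checking the release's own `--help` output is advisable.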