Releases · furyhawk/llama.cpp
b5787
b5599
gguf-py : add add_classifier_output_labels method to writer (#14031)
* add add_classifier_output_labels
* use add_classifier_output_labels
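The release note above names the new gguf-py writer method. A minimal sketch of how it might be used when building a GGUF file for a classifier; the output path, architecture, and label strings are hypothetical, and the method is assumed (per #14031) to take a sequence of label strings:

```python
from gguf import GGUFWriter

# Hypothetical output file and architecture, for illustration only.
writer = GGUFWriter("classifier.gguf", arch="bert")

# Attach human-readable class names for the model's classification head
# (assumed signature from #14031: a sequence of strings).
writer.add_classifier_output_labels(["negative", "neutral", "positive"])

# Usual GGUFWriter finalization order: header, KV metadata, then tensors.
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```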
b5572
gguf: fix failure on version == 0 (#13956)
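A GGUF file begins with the magic bytes "GGUF" followed by a little-endian uint32 version, and the fix above rejects files that report version 0. A minimal Python sketch of that header check (illustrative only, not ggml's actual C/C++ reader):

```python
import struct

def read_gguf_version(path: str) -> int:
    # GGUF layout: 4 magic bytes "GGUF", then a little-endian uint32 version.
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
    # Mirror of the fix referenced above: version 0 is invalid and must be
    # rejected rather than parsed as if it were a readable file.
    if version == 0:
        raise ValueError("invalid GGUF version 0")
    return version
```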
b5568
sync : ggml
ggml-ci
b2998
train : change default FA argument (#7528)
b2961
llama : add phi3 128K model support (#7225)
* add phi3 128k support in convert-hf-to-gguf
* add phi3 128k support in cuda
* address build warnings on llama.cpp
* adjust index value in cuda long rope freq factors
* add long rope support in ggml cpu backend
* make freq factors only depend on ctx size
* remove unused rope scaling type 'su' from gguf converter
* fix lint warnings on convert-hf-to-gguf.py
* set to the short freq factor when context size is smaller than the trained context size
* add one line of comments
* metal : support rope freq_factors
* ggml : update ggml_rope_ext API to support freq. factors
* backends : add dev messages to support rope freq. factors
* minor : style
* tests : update to use new rope API
* backends : fix pragma semicolons
* minor : cleanup
* llama : move rope factors from KV header to tensors
* llama : remove tmp assert
* cuda : fix compile warning
* convert : read/write n_head_kv
* llama : fix uninitialized tensors

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
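Several items in the list above concern the long/short rope frequency factors; per the notes, the choice between the two sets depends only on the requested context size relative to the trained one. A toy sketch of that selection logic (all names and numbers here are illustrative, not llama.cpp internals):

```python
import numpy as np

n_ctx_orig = 4096      # trained context size (assumed value for illustration)
n_ctx      = 131072    # context size requested at load time
head_dim   = 96        # per-head dimension, also illustrative

# Two per-dimension factor sets shipped with the model (dummy data here);
# per the notes above, these now live as tensors rather than KV header entries.
rope_factors_long  = np.linspace(1.0, 32.0, head_dim // 2, dtype=np.float32)
rope_factors_short = np.ones(head_dim // 2, dtype=np.float32)

# Short factors when the requested context does not exceed the trained
# context size, long factors otherwise.
freq_factors = rope_factors_short if n_ctx <= n_ctx_orig else rope_factors_long
print(freq_factors[:4])
```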
b2918
ggml : fix quants nans when all the group weights are very close to zero
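For context on the failure mode: symmetric quantization scales a group by its absolute maximum, so when every weight in the group is (near) zero the scale is zero and dividing by it produces 0/0 = NaN. A toy guard in Python (illustrative only, not ggml's quantizer):

```python
import numpy as np

def quantize_group_q4(weights: np.ndarray) -> np.ndarray:
    """Toy symmetric 4-bit quantizer for a single weight group."""
    amax = float(np.abs(weights).max())
    # If the whole group is (near) zero, the scale below would be zero and
    # weights / scale would yield NaNs; emit an all-zero group instead.
    if amax < 1e-30:
        return np.zeros(weights.shape, dtype=np.int8)
    scale = amax / 7.0  # signed 4-bit range: [-7, 7]
    return np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
```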
b2824
cmake : fix typo (#7151)