Releases: tinglou/llama.cpp

b4779

26 Feb 03:37
d7cfe1f
docs: add docs/function-calling.md to lighten server/README.md's plig…

b4776

25 Feb 12:34
c132239
add OP sigmoid (#12056)

Co-authored-by: Judd <foldl@boxvest.com>
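
The op this release adds is the elementwise logistic sigmoid. As a plain sketch of what the kernel computes (not the ggml implementation itself, which operates on tensors):

```python
import math

def sigmoid(xs):
    """Elementwise logistic sigmoid: 1 / (1 + e^-x)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in xs]

print(sigmoid([0.0]))  # prints [0.5], since 1 / (1 + e^0) = 0.5
```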

b4769

25 Feb 01:56
34a846b
opencl: fix for small models (#11950)

* opencl: fix small shape gemv, remove unused extensions

* opencl: fix `transpose_16`, `dump_tensor`, enforce subgroup size

* opencl: fix for token length < 4

* opencl: use wave size of 64 for all Adreno GPUs

---------

Co-authored-by: Shawn Gu <quic_shawngu@quicinc.com>
Co-authored-by: Skyler Szot <quic_sszot@quicinc.com>

b4764

24 Feb 03:32
7ad0779
run: allow to customize prompt by env var LLAMA_PROMPT_PREFIX (#12041)

Signed-off-by: Florent Benoit <fbenoit@redhat.com>
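
llama-run itself is C++, but the env-var override pattern this change adds can be sketched as follows (the default prefix shown here is illustrative, not the one llama-run uses):

```python
import os

def get_prompt_prefix(default="> "):
    """Return the prompt prefix, letting LLAMA_PROMPT_PREFIX override
    the built-in default when the variable is set."""
    return os.environ.get("LLAMA_PROMPT_PREFIX", default)
```

Usage: `LLAMA_PROMPT_PREFIX="llama> " llama-run …` would then customize the interactive prompt without a command-line flag.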

b4735

22 Feb 11:32
02203f7
Apply suggestions from code review

b4734

21 Feb 08:14
Merge branch 'master' of github.com:tinglou/llama.cpp

b4732

17 Feb 09:38
2eea03d
vulkan: implement several ops relevant for ggml_opt (#11769)

* vulkan: support memset_tensor

* vulkan: support GGML_OP_SUM

* vulkan: implement GGML_OP_ARGMAX

* vulkan: implement GGML_OP_SUB

* vulkan: implement GGML_OP_COUNT_EQUAL

* vulkan: implement GGML_OP_OPT_STEP_ADAMW

* vulkan: fix check_results RWKV_WKV6 crash and memory leaks

* vulkan: implement GGML_OP_REPEAT_BACK

* tests: remove invalid test-backend-ops REPEAT_BACK tests

* vulkan: fix COUNT_EQUAL memset using a fillBuffer command
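
Of the ops above, GGML_OP_OPT_STEP_ADAMW is the one with real arithmetic behind it: a fused AdamW optimizer step. A scalar sketch of the standard AdamW update (hyperparameter names are the usual Adam ones; the actual ggml kernel signature may differ):

```python
import math

def adamw_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter at step t >= 1.

    Returns the updated (param, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment EMA
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment EMA
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    # Decoupled weight decay: applied to the parameter, not the gradient.
    param -= lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * param)
    return param, m, v
```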

b4705

13 Feb 08:16
27e8a23
sampling: add Top-nσ sampler (#11223)

* initial sampling changes:

* completed top nsigma sampler implementation

* apply parameter to only llama-cli

* updated readme

* added tests and fixed nsigma impl

* cleaned up pr

* format

* format

* format

* removed commented tests

* cleanup pr and remove explicit floats

* added top-k sampler to improve performance

* changed sigma to float

* fixed string format to float

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update common/sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* added llama_sampler_init

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
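
The idea behind Top-nσ sampling is to keep only tokens whose logits lie within n standard deviations of the maximum logit, masking the rest before softmax. A simplified sketch of that filter (the llama.cpp implementation differs in detail, e.g. its interaction with top-k):

```python
import math

def top_n_sigma(logits, n=1.0):
    """Mask to -inf every logit more than n standard deviations
    below the maximum logit; keep the rest unchanged."""
    mean = sum(logits) / len(logits)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    cutoff = max(logits) - n * sigma
    return [x if x >= cutoff else float("-inf") for x in logits]
```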

b4677

10 Feb 03:32
19d3c82
There's a better way of clearing lines (#11756)

Use the ANSI escape code for clearing a line.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
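
The escape sequence in question is ECMA-48 "Erase in Line" (`ESC [ 2 K`), which clears the whole current line; combined with a carriage return it lets a progress line be overwritten in place rather than padded with spaces. A minimal illustration:

```python
import sys

# "\x1b[2K" clears the entire current line; "\r" returns the cursor
# to column 0 so the next write starts at the beginning of the line.
CLEAR_LINE = "\x1b[2K\r"

def overwrite_line(text):
    """Clear the current terminal line and write `text` in its place."""
    sys.stdout.write(CLEAR_LINE + text)
    sys.stdout.flush()
```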

b4539

24 Jan 03:46
564804b
tests: fix some mul_mat test gaps (#11375)

Now that we have batched mat-vec mul Vulkan shaders for up to n==8,
these tests weren't actually exercising the mat-mat mul path. Test
n==9 as well. Also, change to use all_types.
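
The gap being closed: a mat-mat multiply whose second operand has n ≤ 8 columns was dispatched to the batched mat-vec shaders, so only n == 9 and above exercises the true mat-mat path. Both paths must agree, since a mat-mat product is just n independent mat-vec products over the columns of B. A sketch of that equivalence:

```python
def matvec(A, x):
    """A (m x k) times vector x (length k) -> vector of length m."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmat(A, B):
    """A (m x k) times B (k x n), computed column-by-column as n
    mat-vec products -- the batched path specialized for small n."""
    cols = list(zip(*B))                      # columns of B
    out_cols = [matvec(A, list(c)) for c in cols]
    return [list(r) for r in zip(*out_cols)]  # reassemble as rows
```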