
Commit 3fe8e9a

Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2 parents: 9dc5e20 + 1547202

6 files changed: +11 −4 lines


CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.2.73]
+
+- feat: Update llama.cpp to ggerganov/llama.cpp@25c6e82e7a1ad25a42b0894e87d9b5c557409516
+- fix: Clear kv cache at beginning of image chat formats to avoid bug when image is evaluated first by @abetlen in ac55d0a175115d1e719672ce1cb1bec776c738b1
+
 ## [0.2.72]
 
 - fix(security): Remote Code Execution by Server-Side Template Injection in Model Metadata by @retr0reg in b454f40a9a1787b2b5659cd2cb00819d983185df

CMakeLists.txt

Lines changed: 2 additions & 1 deletion
@@ -51,8 +51,9 @@ if (LLAMA_BUILD)
     )
 
     if (LLAVA_BUILD)
-        if (LLAMA_CUBLAS)
+        if (LLAMA_CUBLAS OR LLAMA_CUDA)
             add_compile_definitions(GGML_USE_CUBLAS)
+            add_compile_definitions(GGML_USE_CUDA)
         endif()
 
         if (LLAMA_METAL)

Makefile

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ build.debug:
 	CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Debug" python3 -m pip install --verbose --config-settings=cmake.verbose=true --config-settings=logging.level=INFO --config-settings=install.strip=false --editable .
 
 build.cuda:
-	CMAKE_ARGS="-DLLAMA_CUBLAS=on" python3 -m pip install --verbose -e .
+	CMAKE_ARGS="-DLLAMA_CUDA=on" python3 -m pip install --verbose -e .
 
 build.opencl:
 	CMAKE_ARGS="-DLLAMA_CLBLAST=on" python3 -m pip install --verbose -e .

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -550,7 +550,7 @@ llm = Llama.from_pretrained(
     n_ctx=2048, # n_ctx should be increased to accommodate the image embedding
 )
 
-respoonse = llm.create_chat_completion(
+response = llm.create_chat_completion(
     messages = [
         {
             "role": "user",

llama_cpp/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 from .llama_cpp import *
 from .llama import *
 
-__version__ = "0.2.72"
+__version__ = "0.2.73"

llama_cpp/llama_chat_format.py

Lines changed: 1 addition & 0 deletions
@@ -2637,6 +2637,7 @@ def embed_image_bytes(image_bytes: bytes):
 
         # Evaluate prompt
         llama.reset()
+        llama._ctx.kv_cache_clear()
         for type_, value in split_text:
             if type_ == "text":
                 tokens = llama.tokenize(value.encode("utf8"), add_bos=False, special=True)
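
The added kv_cache_clear() targets the changelog's "image evaluated first" bug: llama.reset() rewinds the Python-side token counter, but on its own it can leave stale attention state in the underlying llama.cpp context, which matters when the first item in split_text is an image embedding rather than text. A minimal sketch of the intended control flow, with the wrapper function name chosen purely for illustration:

    # Sketch of the evaluation preamble after this change. `split_text` and
    # `embed_image_bytes` come from the surrounding handler; the function
    # name `evaluate_prompt` is hypothetical and only illustrates the flow.
    def evaluate_prompt(llama, split_text, embed_image_bytes):
        llama.reset()                # rewind Python-side token bookkeeping
        llama._ctx.kv_cache_clear()  # the fix: drop any stale kv cache entries
        for type_, value in split_text:
            if type_ == "text":
                tokens = llama.tokenize(value.encode("utf8"), add_bos=False, special=True)
                llama.eval(tokens)   # text tokens go through normal evaluation
            else:
                # image path: fetch bytes for `value`, embed them with
                # embed_image_bytes, and evaluate the embedding into the
                # context (details elided in this sketch)
                ...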
