Labels
bug (Something isn't working) · critical severity (Used to report critical severity bugs in llama.cpp, e.g. Crashing, Corrupted, Dataloss)
Description
What happened?
It seems that loading llava models crashes llama.cpp entirely. I can reproduce the crash 100% of the time with moondream models.
This issue has already been discussed in #9066 (comment) and #9294 (comment); this ticket is just a tracker to discuss it.
Name and Version
Commit still working here: 815b1fb
Commit which is not working: e6b7801 (which includes #9082); daa9623, which is older, is also not working.
What operating system are you seeing the problem on?
Linux
Relevant log output
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stderr /home/mudler/_git/LocalAI/backend/cpp/llama-avx2/llama.cpp/ggml/src/ggml.c:13835: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout [Thread debugging using libthread_db enabled]
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout Using host libthread_db library "/lib64/libthread_db.so.1".
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout 0x00007f989b8e94a3 in ?? () from /lib64/libgomp.so.1
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #0 0x00007f989b8e94a3 in ?? () from /lib64/libgomp.so.1
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #1 0x00000000008222e5 in ggml_graph_compute_thread.isra ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #2 0x00007f989b8dcd16 in GOMP_parallel () from /lib64/libgomp.so.1
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #3 0x0000000000825a2a in ggml_graph_compute ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #4 0x0000000000834010 in ggml_backend_cpu_graph_compute ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #5 0x000000000083784c in ggml_backend_graph_compute ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #6 0x0000000000652b63 in clip_image_batch_encode.constprop ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #7 0x0000000000653553 in clip_image_encode ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #8 0x0000000000657ac8 in llava_image_embed_make_with_clip_img ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #9 0x00000000004e2c09 in llama_server_context::update_slots() [clone .isra.0] ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #10 0x00000000004d7629 in llama_server_queue::start_loop() ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout #11 0x000000000048b040 in main ()
10:25PM DBG GRPC(moondream2-text-model-f16.gguf-127.0.0.1:42747): stdout [Inferior 1 (process 13029) detached]
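For context: the assert that fires, GGML_ASSERT(i01 >= 0 && i01 < ne01), is a row-index bounds check (i01 is the requested row, ne01 the number of rows available), and the backtrace shows it fires while encoding the image in clip_image_batch_encode. A minimal standalone sketch of the failure mode, assuming the "patches" indices feed a row-gather over the patch embeddings (hypothetical illustration, not the actual ggml code):

#include <cassert>
#include <vector>

// Hypothetical sketch of the check behind GGML_ASSERT(i01 >= 0 && i01 < ne01):
// gather rows of `src` by index, where ne01 is the number of rows available.
static std::vector<float> get_rows(const std::vector<std::vector<float>> & src,
                                   const std::vector<int> & indices) {
    const int ne01 = (int) src.size();
    std::vector<float> out;
    for (int i01 : indices) {
        assert(i01 >= 0 && i01 < ne01); // the check that fails in the log above
        out.insert(out.end(), src[i01].begin(), src[i01].end());
    }
    return out;
}

int main() {
    // Say the vision tower produced 3 patch embeddings: valid rows are 0..2.
    const std::vector<std::vector<float>> embeddings = {{0.f}, {1.f}, {2.f}};

    const std::vector<int> with_workaround = {0, 1, 2}; // patches_data[i] = i
    const std::vector<int> current_code    = {1, 2, 3}; // patches_data[i] = i + 1

    get_rows(embeddings, with_workaround); // fine
    get_rows(embeddings, current_code);    // 3 == ne01 -> assert fires, as in the crash
    return 0;
}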
Note
- Flagged as critical as it completely crashes llama.cpp; llama.cpp is being used as a library here (chore(deps): update llama.cpp mudler/LocalAI#3497).
- Applying the suggestion described in #9066 (comment) (Bug: MiniCPM-V-2.6 commit d565bb2fd5a2a58b9924a7a34e77a87c78c52137 causing crash in moondream) seems to work around the issue for me:
diff --git a/examples/llava/clip.cpp b/examples/llava/clip.cpp
index 342042ff..224db9b5 100644
--- a/examples/llava/clip.cpp
+++ b/examples/llava/clip.cpp
@@ -2419,7 +2419,7 @@ bool clip_image_batch_encode(clip_ctx * ctx, const int n_threads, const clip_ima
struct ggml_tensor * patches = ggml_graph_get_tensor(gf, "patches");
int* patches_data = (int*)malloc(ggml_nbytes(patches));
for (int i = 0; i < num_patches; i++) {
- patches_data[i] = i + 1;
+ patches_data[i] = i;
}
ggml_backend_tensor_set(patches, patches_data, 0, ggml_nbytes(patches));
free(patches_data);
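If I understand the history correctly (this is my assumption, not verified against the model code): the i + 1 offset skips row 0 of the embeddings, which in CLIP-style vision towers holds the class-token embedding, so patch rows start at index 1. Moondream's vision encoder apparently has no class-token row, so the embeddings tensor has exactly num_patches rows and the last index i + 1 == num_patches lands one past the end, tripping the GGML_ASSERT above. With patches_data[i] = i the indices stay in [0, num_patches).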