
Commit 5c149d2

fix: Fix indexing into k_l for recurrent cache with filter

Branch: HybridCache
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
1 parent a886cc1

File tree

1 file changed (+2 −2 lines)


src/llama-kv-cache.cpp (2 additions, 2 deletions)

```diff
@@ -2039,8 +2039,8 @@ llama_kv_cache_recurrent::llama_kv_cache_recurrent(
         ggml_tensor * v = ggml_new_tensor_1d(ctx, type_v, n_embd_v_gqa*kv_size);
         ggml_format_name(k, "cache_k_l%d", i);
         ggml_format_name(v, "cache_v_l%d", i);
-        k_l.push_back(k);
-        v_l.push_back(v);
+        k_l[i] = k;
+        v_l[i] = v;
     }

     // allocate tensors and initialize the buffers to avoid NaNs in the padding
```
