Commit 73f8984
fix: Fix indexing into k_l for recurrent cache with filter
Branch: HybridCache
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
1 parent ebd34d0 commit 73f8984

1 file changed: +2 −2 lines changed

src/llama-kv-cache.cpp

Lines changed: 2 additions & 2 deletions
@@ -2231,8 +2231,8 @@ llama_kv_cache_recurrent::llama_kv_cache_recurrent(
         ggml_tensor * v = ggml_new_tensor_1d(ctx, type_v, n_embd_v_gqa*kv_size);
         ggml_format_name(k, "cache_k_l%d", i);
         ggml_format_name(v, "cache_v_l%d", i);
-        k_l.push_back(k);
-        v_l.push_back(v);
+        k_l[i] = k;
+        v_l[i] = v;
     }

         // allocate tensors and initialize the buffers to avoid NaNs in the padding
