Commit 71ce48e

fix: Fix shift logic to defer to unified cache
Branch: HybridRecurrentCache
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
1 parent d224255 commit 71ce48e

1 file changed: +2 -2 lines changed

src/llama-kv-cache-hybrid-recurrent.cpp

Lines changed: 2 additions & 2 deletions
@@ -150,8 +150,8 @@ void llama_kv_cache_hybrid_recurrent::defrag_sched(float thold) {
 }
 
 bool llama_kv_cache_hybrid_recurrent::get_can_shift() const {
-    // TODO: Should this return true if the attention cache can shift?
-    return false;
+    // Shifting is trivially supported for recurrent
+    return kv_attn->get_can_shift();
 }
 
 void llama_kv_cache_hybrid_recurrent::state_write(llama_io_write_i & io, llama_seq_id seq_id) const {
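
The delegation introduced by this change can be illustrated with a minimal, standalone sketch. It is not the llama.cpp implementation: `cache_i`, `attn_cache`, `recurrent_cache`, and `hybrid_cache` are hypothetical stand-ins, and only the member name `kv_attn` and the method name `get_can_shift` are taken from the diff. The recurrent stand-in returns `true` on the assumption stated in the new code comment, namely that shifting is trivially supported for recurrent state. The hybrid cache therefore reports whatever its attention child reports.

```cpp
#include <memory>

// Hypothetical minimal interface; the real cache classes carry far more state.
struct cache_i {
    virtual ~cache_i() = default;
    virtual bool get_can_shift() const = 0;
};

// Stand-in for the attention (unified) cache: shift support depends on its setup.
struct attn_cache : cache_i {
    explicit attn_cache(bool shiftable) : shiftable(shiftable) {}
    bool get_can_shift() const override { return shiftable; }
    bool shiftable;
};

// Stand-in for the recurrent cache. Assumption (from the commit's comment):
// shifting is trivially supported, so it never restricts the hybrid cache.
struct recurrent_cache : cache_i {
    bool get_can_shift() const override { return true; }
};

// Hybrid cache that owns one child of each kind.
struct hybrid_cache : cache_i {
    explicit hybrid_cache(bool attn_shiftable)
        : kv_attn(std::make_unique<attn_cache>(attn_shiftable)),
          kv_recurrent(std::make_unique<recurrent_cache>()) {}

    // Mirrors the changed line: defer entirely to the attention child,
    // since the recurrent child is assumed to always allow shifting.
    bool get_can_shift() const override {
        return kv_attn->get_can_shift();
    }

    std::unique_ptr<attn_cache>      kv_attn;
    std::unique_ptr<recurrent_cache> kv_recurrent;
};
```

With these stand-ins, `hybrid_cache(true).get_can_shift()` yields `true` and `hybrid_cache(false).get_can_shift()` yields `false`, whereas the pre-change behavior returned `false` in both cases.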
