1 parent 676b2db commit b81e2e4
src/llama-kv-cache.cpp
@@ -34,8 +34,6 @@ llama_kv_cache_unified::llama_kv_cache_unified(
     const bool is_mla = (hparams.n_embd_head_k_mla != 0 && hparams.n_embd_head_v_mla != 0);
 
-    is_mla_with_fa = model.arch != LLM_ARCH_DEEPSEEK2 || v_trans
-
     has_shift = false;
     can_shift = !is_mla || v_trans; // TODO: allow context shifting for MLA with flash attention