Commit 230314b (parent: 7a1a22c)

feat: Move layer_filter_cb up to llama_kv_cache

This will be needed by other cache types as well, so centralizing the definition will make it more reusable.

Branch: HybridCache
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

File tree: 2 files changed (+6, -3 lines)
src/llama-kv-cache-unified.h

Lines changed: 0 additions & 3 deletions

@@ -21,9 +21,6 @@ class llama_kv_cache_unified : public llama_kv_cache {
 public:
     static uint32_t get_padding(const llama_cparams & cparams);
 
-    // this callback is used to filter out layers that should not be included in the cache
-    using layer_filter_cb = std::function<bool(int32_t il)>;
-
     llama_kv_cache_unified(
             const llama_model & model,
                 layer_filter_cb && filter,

src/llama-kv-cache.h

Lines changed: 6 additions & 0 deletions

@@ -4,7 +4,13 @@
 #include "llama-io.h"
 #include "llama-memory.h"
 
+#include <functional>
+
 struct llama_kv_cache : public llama_memory_i {
+
+    // this callback is used to filter out layers that should not be included in the cache
+    using layer_filter_cb = std::function<bool(int32_t il)>;
+
     virtual ~llama_kv_cache() = default;
 
     // split the input batch into a set of ubatches and verify that they can fit into the cache
