-
Hello, I'm using node-llama-cpp to develop an application, and I'd like to process each input prompt independently, without any influence from previous inputs. In the C++ version of llama.cpp I was able to achieve this by calling `llama_kv_cache_clear(ctx);` to clear the KV cache, effectively resetting the context.

Is there a way to perform a similar operation in node-llama-cpp without recreating the LlamaContext each time? Currently I create a new LlamaContext for every input so that previous contexts can't interfere, but I suspect this approach is inefficient in terms of both performance and resource management.

My application requires that each response depend only on its own prompt; context from previous inputs must not carry over into new ones. I'd appreciate any guidance on whether there is an existing method to clear the KV cache of a LlamaContext, or whether there are plans to support this functionality in the future. Thank you for your help!
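For context, here is roughly what I'm doing today. This is a minimal sketch only: the model path is a placeholder, and the helper name `respond` is just for illustration.

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path

// Recreating the whole context for every prompt guarantees isolation,
// but allocating and tearing down a fresh KV cache each time is costly.
async function respond(prompt: string): Promise<string> {
    const context = await model.createContext();
    const session = new LlamaChatSession({contextSequence: context.getSequence()});
    const response = await session.prompt(prompt);
    await context.dispose();
    return response;
}
```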
-
I have upgraded to version v3.0.1 and discovered the Sequence feature. This solves my problem, so this issue can be closed. Thank you.
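For anyone who lands here later, this is roughly the pattern I ended up with. It's a sketch under a few assumptions: the model path is a placeholder, the helper name `respond` is illustrative, and `clearHistory()` on a `LlamaContextSequence` resets the sequence's evaluated tokens (its slice of the KV cache) as the v3 docs describe.

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path

// One context, created once and reused for every prompt.
const context = await model.createContext();
const sequence = context.getSequence();

async function respond(prompt: string): Promise<string> {
    // Dropping the sequence's evaluation history clears the tokens it holds
    // in the KV cache, so earlier prompts can't leak into this one.
    await sequence.clearHistory();

    // A fresh chat session per prompt keeps the chat history isolated as well.
    const session = new LlamaChatSession({contextSequence: sequence});
    return await session.prompt(prompt);
}

console.log(await respond("First prompt"));
console.log(await respond("Second prompt")); // unaffected by the first
```

The win over recreating the context each time is that the context's memory is allocated once; resetting a sequence is cheap bookkeeping by comparison.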