Replies: 1 comment
@mseddon I'm currently working on #85, which will include better support for infinite text generation and automatic context swapping. I aim to make the API as high-level as possible with good defaults, so while you can still configure more advanced parameters, using this library won't require too much technical understanding of the underlying code.
-
In llama.cpp, the `main` command supports retaining the initial prompt via the `--keep` flag, which is handy for long-running chat sessions. Is there a way to simulate this with the current API? After a few exchanges, it appears the context fills up and the whole system goes bananas.

What I really need is access to `llama_kv_cache_seq_shift` and `llama_kv_cache_seq_rm`.