How to handle rewinding with the grammar sampler? #3211
-
Sorry to bug you with the mention (anyone else who knows the answer is welcome to reply as well). I've been adding backtracking support in my seqrep project (over here: #2593), and if there's a reasonable way to accomplish it, I'd like to support grammar sampling too. By rewind, I mean undoing some number of tokens and restarting generation from an earlier point. The naive way to handle it would probably be to reset the grammar sampler state somehow and then feed it tokens from the very beginning up to the rewind point. That is likely to be pretty slow, though. Another possible approach would be to save the grammar state at each step and just reload it when rewinding back to that point. That also might be kind of slow and memory-intensive. Any ideas for a better approach?
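
For illustration, a minimal sketch of the naive "reset and replay" version, assuming the grammar calls in llama.h (`llama_grammar_free`, `llama_grammar_accept_token`); `rebuild_grammar()` here is just a hypothetical stand-in for however the grammar was originally constructed (e.g. via `llama_grammar_init` from the parsed rules):

```cpp
#include <vector>
#include "llama.h"

// Hypothetical helper: re-create a fresh grammar from the original parsed
// rules. The details depend on how the grammar was built in the first place.
struct llama_grammar * rebuild_grammar();

// Naive rewind: throw away the current grammar state and replay every kept
// token from the start so the grammar ends up where it would have been.
static struct llama_grammar * rewind_grammar_naive(
        struct llama_context           * ctx,
        struct llama_grammar           * grammar,
        const std::vector<llama_token> & kept_tokens) {
    llama_grammar_free(grammar);
    struct llama_grammar * fresh = rebuild_grammar();
    for (llama_token tok : kept_tokens) {
        llama_grammar_accept_token(ctx, fresh, tok); // O(n) replay, slow for long outputs
    }
    return fresh;
}
```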
-
Hey, yeah, #2593 looks really neat! Both of those approaches seem valid. I suppose you could also blend the two and checkpoint the grammar state every N tokens.

For the copying, I don't know if you saw there's now a `llama_grammar_copy` API that would support that. Currently it copies the rules and has to relocate pointers, but we've discussed that we can avoid this with a shared reference. Once that's done, `llama_grammar_copy` will just be copying the stacks, which is done multiple times on each sample anyway. So I believe it should not really be too bad in terms of run time; memory, not sure.
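
To make the blended idea concrete, here's a rough sketch of checkpointing every N tokens with `llama_grammar_copy` and replaying only the tail on a rewind. The `grammar_checkpointer` wrapper and the token history it keeps are made up for illustration; it assumes the existing `llama_grammar_copy`, `llama_grammar_accept_token`, and `llama_grammar_free` calls:

```cpp
#include <cstddef>
#include <vector>
#include "llama.h"

// Illustrative wrapper (not part of llama.cpp): checkpoint the grammar every
// `interval` accepted tokens so a rewind only replays at most interval-1
// tokens from the nearest checkpoint instead of the whole sequence.
struct grammar_checkpointer {
    struct llama_context * ctx;
    struct llama_grammar * grammar;                    // live grammar state
    std::vector<llama_token>            history;       // all accepted tokens
    std::vector<struct llama_grammar *> checkpoints;   // one copy per `interval` tokens
    size_t interval;

    grammar_checkpointer(struct llama_context * ctx, struct llama_grammar * g, size_t n)
        : ctx(ctx), grammar(g), interval(n) {
        // checkpoint 0 corresponds to "no tokens accepted yet"
        checkpoints.push_back(llama_grammar_copy(grammar));
    }

    ~grammar_checkpointer() {
        for (auto * g : checkpoints) llama_grammar_free(g);
        llama_grammar_free(grammar);
    }

    // feed one sampled token to the grammar and checkpoint on the interval
    void accept(llama_token tok) {
        llama_grammar_accept_token(ctx, grammar, tok);
        history.push_back(tok);
        if (history.size() % interval == 0) {
            checkpoints.push_back(llama_grammar_copy(grammar));
        }
    }

    // rewind so that only the first `n_keep` tokens remain accepted
    void rewind(size_t n_keep) {
        // drop checkpoints taken after the rewind point
        const size_t cp_idx = n_keep / interval;   // nearest checkpoint at or before n_keep
        while (checkpoints.size() > cp_idx + 1) {
            llama_grammar_free(checkpoints.back());
            checkpoints.pop_back();
        }
        // restore from that checkpoint and replay the few remaining tokens
        llama_grammar_free(grammar);
        grammar = llama_grammar_copy(checkpoints[cp_idx]);
        for (size_t i = cp_idx * interval; i < n_keep; ++i) {
            llama_grammar_accept_token(ctx, grammar, history[i]);
        }
        history.resize(n_keep);
    }
};
```

Worst case a rewind replays interval-1 tokens through the grammar, and memory grows by one grammar copy per interval tokens, so the interval just trades replay time against memory.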