How to handle rewinding with the grammar sampler? #3211

Answered by ejones
KerfuffleV2 asked this question in Q&A

Hey, yeah, #2593 looks really neat! Both of those approaches seem valid. I suppose you could also blend the two and checkpoint the grammar state every N tokens.

For the copying, in case you haven't seen it, there's now a llama_grammar_copy API that would support that. Currently it copies the rules and has to relocate pointers, but we've discussed avoiding that with a shared reference. Once that's done, llama_grammar_copy will only be copying the stacks, which already happens multiple times on each sample anyway. So I believe the run-time cost should not be too bad; memory, I'm not sure about.
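
To make the blended approach concrete, here is a minimal sketch of checkpointing every N tokens and rewinding by restoring the nearest checkpoint and replaying the tokens after it. It does not use llama.cpp: `GrammarState`, `AcceptFn`, and `CheckpointedGrammar` are hypothetical names made up for illustration, and a plain value copy stands in for whatever a call like llama_grammar_copy would do on the real grammar object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the grammar's mutable state (its parse stacks).
// This is NOT the llama.cpp type; it only needs to be copyable.
struct GrammarState {
    std::vector<std::vector<int>> stacks;
};

// Hypothetical stand-in for "advance the grammar by one accepted token".
using AcceptFn = std::function<void(GrammarState &, int)>;

class CheckpointedGrammar {
public:
    CheckpointedGrammar(GrammarState initial, AcceptFn accept, std::size_t interval)
        : state_(std::move(initial)), accept_(std::move(accept)), interval_(interval) {
        assert(interval_ > 0);
        checkpoints_[0] = state_;  // position 0 is always restorable
    }

    // Accept one token and snapshot the state every `interval_` tokens.
    void accept(int token) {
        accept_(state_, token);
        tokens_.push_back(token);
        if (tokens_.size() % interval_ == 0) {
            checkpoints_[tokens_.size()] = state_;
        }
    }

    // Rewind so that only the first `n_keep` tokens remain accepted.
    void rewind(std::size_t n_keep) {
        assert(n_keep <= tokens_.size());
        // Latest checkpoint at or before n_keep; key 0 guarantees one exists.
        auto it = std::prev(checkpoints_.upper_bound(n_keep));
        state_ = it->second;
        // Replay the tokens between that checkpoint and the rewind target.
        for (std::size_t i = it->first; i < n_keep; ++i) {
            accept_(state_, tokens_[i]);
        }
        tokens_.resize(n_keep);
        // Checkpoints past the rewind target describe discarded tokens.
        checkpoints_.erase(checkpoints_.upper_bound(n_keep), checkpoints_.end());
    }

    const GrammarState & state() const { return state_; }
    std::size_t size() const { return tokens_.size(); }
    std::size_t checkpoint_count() const { return checkpoints_.size(); }

private:
    GrammarState state_;
    AcceptFn accept_;
    std::size_t interval_;
    std::vector<int> tokens_;                         // accepted token history
    std::map<std::size_t, GrammarState> checkpoints_; // token count -> snapshot
};
```

The interval is the knob here: a small N means more stored copies but fewer tokens to replay on a rewind, while a large N saves memory at the cost of replaying up to N - 1 tokens.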

KerfuffleV2 Sep 19, 2023
Collaborator Author

Answer selected by KerfuffleV2