rerun model question #9923

Unanswered

nigelzzzzzzz asked this question in Q&A

nigelzzzzzzz
Oct 17, 2024

My scenario is a two-model pipeline, e.g.:
input prompt -> model -> decode -> output_str -> tokenize output_str into a new input prompt -> second model -> decode -> final output_str.
At the stage where output_str is tokenized into the new input prompt, do I need to do anything, e.g. clear the KV cache or call sampler_free?
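
To make the data flow concrete, here is a minimal toy sketch of the pipeline in Python. Everything in it is a hypothetical stand-in (`ToyModel`, `generate`, `reset`, `run_pipeline`); none of it is the llama.cpp API, and the "generation" just reverses the tokens. It only illustrates the shape of the question: the text handed from the first stage to the second is re-tokenized with the second stage's own tokenizer, and each stage holds its own state. Whether the real library needs that state cleared or the sampler freed between stages is exactly what I am asking.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for one model with its own tokenizer and state."""
    name: str
    vocab: dict = field(default_factory=dict)   # word -> id, private to this model
    cache: list = field(default_factory=list)   # stands in for per-model state

    def tokenize(self, text: str) -> list:
        # Assign an id the first time a word is seen.
        return [self.vocab.setdefault(w, len(self.vocab)) for w in text.split()]

    def detokenize(self, ids: list) -> str:
        inverse = {i: w for w, i in self.vocab.items()}
        return " ".join(inverse[i] for i in ids)

    def generate(self, prompt: str) -> str:
        ids = self.tokenize(prompt)
        self.cache.extend(ids)            # state accumulates across calls
        out_ids = list(reversed(ids))     # toy "generation": reverse the tokens
        return self.detokenize(out_ids)

    def reset(self) -> None:
        # Toy equivalent of clearing per-model state between runs.
        self.cache.clear()


def run_pipeline(first: ToyModel, second: ToyModel, prompt: str) -> str:
    output_str = first.generate(prompt)   # input prompt -> model -> decode
    return second.generate(output_str)    # re-tokenize -> second model -> decode


first = ToyModel("first")
second = ToyModel("second")
final_output_str = run_pipeline(first, second, "a b c")
```

In this toy version the two stages never share state, so calling `first.reset()` does not touch `second.cache`. The open question is whether the same separation holds when both stages are real models, or when the same context is reused for a second run.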

Replies: 0 comments
