Load model tokenizer and use it inside OWUI pipeline to clip conversation history #446

bartmch · 2025-02-26T17:52:13Z

I would like to automatically load the tokenizer of the model that the user selected in the UI. My goal is to use this tokenizer to count tokens and clip the conversation history so that the prompt stays within the LLM's context window. I do not want to use a generic library such as tiktoken, whose token counts might be off for the selected model. I am using vLLM as the backend inference engine.
Any alternative (easier) ideas are more than welcome!
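
A minimal sketch of the clipping idea described above, not a tested OWUI pipeline. It assumes the model name selected in the UI is also a Hugging Face repo id that `transformers.AutoTokenizer` can load, and that the model ships a chat template. The helper names `make_hf_counter` and `clip_history`, and the budget parameter `max_prompt_tokens`, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Counter = Callable[[List[Message]], int]


def make_hf_counter(model_id: str) -> Counter:
    """Build a token counter from the tokenizer of `model_id`.

    Assumption: `model_id` is the Hugging Face repo id of the model vLLM serves.
    """
    # Imported lazily so the clipping logic below works without `transformers`.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    def count(messages: List[Message]) -> int:
        # Render the chat template as text first, so the count includes the
        # role markers and generation prompt that the model actually sees.
        text = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
        return len(tokenizer(text, add_special_tokens=False)["input_ids"])

    return count


def clip_history(
    messages: List[Message], count_tokens: Counter, max_prompt_tokens: int
) -> List[Message]:
    """Drop the oldest non-system messages until the prompt fits the budget.

    System messages are kept and moved to the front; the most recent message
    is never dropped, even if it alone exceeds the budget.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while len(rest) > 1 and count_tokens(system + rest) > max_prompt_tokens:
        rest.pop(0)  # remove the oldest turn
    return system + rest
```

Inside a pipeline filter, this could be called on the incoming request as `clip_history(body["messages"], make_hf_counter(body["model"]), budget)`, assuming the request body carries the selected model under `"model"` and the history under `"messages"`. The budget should leave room for the completion, for example the context length minus the maximum number of generated tokens. Some chat templates require the history to start with a user turn, so a real implementation may need to drop turns in user/assistant pairs. Caching one counter per model id avoids reloading the tokenizer on every request.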

Replies: 0 comments
