Using local models, my agent is very slow when it's processing the previous action + observation.
I've run the same model with the same parameters using llama.cpp (the ./main script) on my agent's prompt minus the Final Answer: to see how long prompt processing and generation should take at the last step. It was very fast, within a few seconds.
This led me to suspect Agents aren't caching the prompt tokens to avoid re-processing them, which leads to the lag between text generations. Is this true?
And if so, are there any downsides to saving the prompt tokens in the agent and appending the agent/tool outputs to them?
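To illustrate what I mean, here's a minimal sketch of the "grow the prompt and reuse the cached prefix" idea, assuming llama-cpp-python (Llama, LlamaRAMCache, set_cache); the model path, run_tool, and the prompt layout are placeholders, not my actual setup:

```python
from llama_cpp import Llama, LlamaRAMCache

# Hypothetical model path and context size.
llm = Llama(model_path="./models/model.gguf", n_ctx=4096)
# Keep the KV state between calls so a shared prompt prefix isn't re-evaluated.
llm.set_cache(LlamaRAMCache(capacity_bytes=2 << 30))

def run_tool(action_text: str) -> str:
    # Placeholder for the agent's tool dispatch.
    return "tool output"

prompt = "You are an agent. Use tools to answer the question.\n"
max_steps = 5

for step in range(max_steps):
    out = llm(prompt, max_tokens=256, stop=["Observation:"])
    action_text = out["choices"][0]["text"]
    if "Final Answer:" in action_text:
        break
    observation = run_tool(action_text)
    # Append to the existing prompt instead of rebuilding it, so the
    # cached prefix still matches and only the new suffix is processed.
    prompt += action_text + "\nObservation: " + observation + "\n"
```

This is the behaviour I'd expect the agent to get for free if it kept the earlier prompt tokens around between steps.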