Replies: 2 comments
-
Would you consider running the inference as part of the bootstrapping process of your app? That should cache the prompts.
-
@calvintwr This post was made back in April, when prompt caching was not supported, but it was implemented a short time after.
-
When you start the program, it has to load the model, then tokenize the prompt and run it through the model. The time to load the model has been mostly eliminated. Tokenizing is also very fast. It would be nice if it were possible to cache the result of running the prompt through the model and load it on startup as well.
I write long prompts because they give better results, but it can take a few minutes for the whole thing to process before the window is responsive. With this change, it should be nearly instant.
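To illustrate the idea end to end, here is a minimal sketch assuming the llama-cpp-python bindings (its `Llama` class exposes `save_state()` and `load_state()`); the model path, cache file name, and prompt below are placeholders, and pickling the state object is just one way to persist it:

```python
# Minimal sketch: evaluate a long prompt once, persist the resulting model
# state, and restore it on later startups instead of re-processing the prompt.
# Assumes the llama-cpp-python bindings; paths and the prompt are placeholders.
import os
import pickle

from llama_cpp import Llama

MODEL_PATH = "model.gguf"        # placeholder model file
STATE_PATH = "prompt_state.pkl"  # on-disk cache of the evaluated prompt
LONG_PROMPT = "You are a helpful assistant. <long instructions here>"

llm = Llama(model_path=MODEL_PATH, n_ctx=4096)

if os.path.exists(STATE_PATH):
    # Fast path: restore the state saved by a previous run, skipping the
    # minutes-long prompt evaluation entirely.
    with open(STATE_PATH, "rb") as f:
        llm.load_state(pickle.load(f))
else:
    # Slow path (first run only): tokenize and evaluate the prompt, then
    # save the state so the next startup is nearly instant.
    llm.eval(llm.tokenize(LONG_PROMPT.encode("utf-8")))
    with open(STATE_PATH, "wb") as f:
        pickle.dump(llm.save_state(), f)

# Generation can now continue from the cached prompt without re-evaluating it.
```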