Caching #4341
-
I feel like I'm being a pest, but that's not my intention. I just want to know whether the new OAI-like endpoint supports prompt caching. Thanks!
-
It looks like it always just turns prompt caching on: https://github.com/ggerganov/llama.cpp/blob/5f6e0c0dff1e7a89331e6b25eca9a9fd71324069/examples/server/api_like_OAI.py#L80-L84 Whether it actually works or not, I don't know. You could modify what that script sends to the server example there if you need different behaviour. A rough sketch of the idea is below.
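For reference, a minimal sketch of the kind of payload that proxy script forwards to the server's native /completion endpoint, with prompt caching forced on. Everything except cache_prompt (the port, n_predict, temperature values) is an illustrative assumption, not copied from the script:

```python
import requests

# Hypothetical, trimmed-down version of what api_like_OAI.py does:
# build a llama.cpp /completion request and always enable prompt caching.
LLAMA_SERVER = "http://127.0.0.1:8080"  # assumes the server example runs locally


def make_post_data(prompt: str) -> dict:
    return {
        "prompt": prompt,
        "n_predict": 256,       # illustrative default
        "temperature": 0.7,     # illustrative default
        "cache_prompt": True,   # always on, as in the linked lines
    }


resp = requests.post(f"{LLAMA_SERVER}/completion", json=make_post_data("Hello!"))
print(resp.json()["content"])
```

Dropping or flipping the cache_prompt flag in that dict is the place to experiment if caching is causing problems.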
-
@ggerganov has fixed it in branch gg/server-oai-cache-prompt. Works very well now. See #4329. Makes it feasible to work on large-ish docs and chats interactively with 7B models running on my Mac mini. |
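For anyone who wants to try it, here is a minimal sketch of a chat request against the server's built-in OAI-compatible endpoint. It assumes the branch honours a cache_prompt field in the request body the same way the native /completion endpoint does; check #4329 for the exact behaviour, since caching may simply be enabled by default there:

```python
import requests

LLAMA_SERVER = "http://127.0.0.1:8080"  # server example running locally

# OpenAI-style chat payload; the extra cache_prompt field is an assumption
# carried over from the /completion API and may be unnecessary on the branch.
payload = {
    "model": "local",  # shape compatibility only; the server uses its loaded model
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise the document I pasted earlier."},
    ],
    "cache_prompt": True,
}

resp = requests.post(f"{LLAMA_SERVER}/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```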