Replies: 1 comment
The design is to use one
Important concept: Are llama_context and llama_batch shared by multiple users?
In the server scenario, when serving multiple clients at the same time, is a separate llama_context and llama_batch created for each client, or do all clients share a single llama_context and llama_batch? I noticed that the server.cpp example appears to share one, which is why I came here to ask the maintainers to confirm. This point is critical for me:
struct server_context {
    llama_model   * model = nullptr;
    llama_context * ctx   = nullptr;

    gpt_params params;

    llama_batch batch;

    // ...
};
My understanding was that there should be one llama_context and one llama_batch per client; that logic seemed clear to me. Am I wrong? An authoritative answer from the maintainers would be much appreciated. Thank you, this is very important to me.
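For reference, here is a minimal sketch (not the official server code) of how a single shared llama_context and llama_batch could serve two clients in one llama_decode call, by tagging each token with a per-client sequence ID. It assumes the core llama.h API (llama_batch_init, llama_decode, llama_batch_free); the context is assumed to be already created, the add_token helper is my own, and the token values are placeholders.

#include "llama.h"

// Append one token to a shared batch, tagged with the sequence ID of the
// client/slot it belongs to.
static void add_token(llama_batch & batch, llama_token tok, llama_pos pos,
                      llama_seq_id seq, bool want_logits) {
    const int i = batch.n_tokens;
    batch.token   [i]    = tok;
    batch.pos     [i]    = pos;
    batch.n_seq_id[i]    = 1;
    batch.seq_id  [i][0] = seq;          // which client this token belongs to
    batch.logits  [i]    = want_logits;  // request logits only where needed
    batch.n_tokens++;
}

void decode_two_clients(llama_context * ctx) {
    // room for up to 512 tokens, no embedding input, up to 2 seq IDs per token
    llama_batch batch = llama_batch_init(512, 0, 2);

    // client 0 -> sequence 0, client 1 -> sequence 1 (placeholder tokens)
    add_token(batch, /*tok=*/1, /*pos=*/0, /*seq=*/0, false);
    add_token(batch, /*tok=*/2, /*pos=*/1, /*seq=*/0, true);

    add_token(batch, /*tok=*/1, /*pos=*/0, /*seq=*/1, false);
    add_token(batch, /*tok=*/3, /*pos=*/1, /*seq=*/1, true);

    // one decode call processes both clients' tokens; the KV cache keeps them
    // separate because every token carries its own seq_id
    if (llama_decode(ctx, batch) != 0) {
        // handle decode failure
    }

    llama_batch_free(batch);
}

If this is indeed the intended design, a single context and batch are shared across all clients and isolation between requests comes from the sequence IDs in the KV cache, rather than from per-client contexts.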