Replies: 1 comment
The design is to use one
Important concept: Are llama_context and llama_batch shared by multiple users?
In the server scenario, when serving multiple clients at the same time, is a separate llama_context and llama_batch created for each client, or do all clients share a single llama_context and llama_batch? I noticed that the server.cpp example appears to share one, which is why I came here to ask the maintainers to confirm. This point is critical for me:
struct server_context {
    llama_model   * model = nullptr;
    llama_context * ctx   = nullptr;

    gpt_params params;

    llama_batch batch;

    // ...
};
My understanding was that there should be one llama_context and one llama_batch per client; that logic seemed clear to me. Am I wrong? An authoritative answer from the maintainers would be much appreciated. Thank you, this is very important to me.
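For reference, here is a minimal sketch (not the official server code) of how a single shared llama_context and llama_batch could serve two clients in one llama_decode call, by tagging each token with a per-client sequence ID. It assumes the core llama.h API (llama_batch_init, llama_decode, llama_batch_free); the context is assumed to be already created, the add_token helper is my own, and the token values are placeholders.

#include "llama.h"

// Append one token to a shared batch, tagged with the sequence ID of the
// client/slot it belongs to.
static void add_token(llama_batch & batch, llama_token tok, llama_pos pos,
                      llama_seq_id seq, bool want_logits) {
    const int i = batch.n_tokens;
    batch.token   [i]    = tok;
    batch.pos     [i]    = pos;
    batch.n_seq_id[i]    = 1;
    batch.seq_id  [i][0] = seq;          // which client this token belongs to
    batch.logits  [i]    = want_logits;  // request logits only where needed
    batch.n_tokens++;
}

void decode_two_clients(llama_context * ctx) {
    // room for up to 512 tokens, no embedding input, up to 2 seq IDs per token
    llama_batch batch = llama_batch_init(512, 0, 2);

    // client 0 -> sequence 0, client 1 -> sequence 1 (placeholder tokens)
    add_token(batch, /*tok=*/1, /*pos=*/0, /*seq=*/0, false);
    add_token(batch, /*tok=*/2, /*pos=*/1, /*seq=*/0, true);

    add_token(batch, /*tok=*/1, /*pos=*/0, /*seq=*/1, false);
    add_token(batch, /*tok=*/3, /*pos=*/1, /*seq=*/1, true);

    // one decode call processes both clients' tokens; the KV cache keeps them
    // separate because every token carries its own seq_id
    if (llama_decode(ctx, batch) != 0) {
        // handle decode failure
    }

    llama_batch_free(batch);
}

If this is indeed the intended design, a single context and batch are shared across all clients and isolation between requests comes from the sequence IDs in the KV cache, rather than from per-client contexts.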