Replies: 1 comment
-
When you start the server, you should see lines like this:
A value slightly higher than the sum of these numbers should be a good starting point for the combination of model, context size, and context data type you use.
-
I need to measure how much RAM llama-server needs, so I run this script to print its VSZ and RSS.
I sent requests to the server continuously while monitoring these metrics, and I see that they increase gradually and never decrease.
Is this normal? Is there any way to determine the RAM required, or to allocate an exact amount of RAM for llama-server?
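A minimal sketch of a monitor of this kind, not the original script. It assumes Linux, where VSZ and RSS are reported as `VmSize` and `VmRSS` in `/proc/<pid>/status`. The function names `parse_vm_fields`, `read_memory`, and `monitor` are hypothetical and chosen only for illustration.

```python
import re
import time


def parse_vm_fields(status_text):
    """Extract VmSize (VSZ) and VmRSS (RSS), in KiB, from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_memory(pid):
    """Read the current VSZ/RSS of a process (assumption: Linux procfs is available)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())


def monitor(pid, interval_s=5.0, samples=3):
    """Print VSZ and RSS in MiB a fixed number of times, pausing between samples."""
    for i in range(samples):
        mem = read_memory(pid)
        vsz_mib = mem.get("VmSize", 0) / 1024
        rss_mib = mem.get("VmRSS", 0) / 1024
        print(f"VSZ={vsz_mib:.1f} MiB  RSS={rss_mib:.1f} MiB", flush=True)
        if i < samples - 1:
            time.sleep(interval_s)
```

Calling `monitor(pid)` with the PID of the running llama-server process prints one line per sample. Raising `samples` while requests are being sent gives a rough view of how the two values change over time.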
Thank you so much