Help! llama.cpp server: stream freezes the current request and continues only after processing the new request #9367
-
I'm new to the llama.cpp server. I built my GUI with Streamlit, but it behaves like this:
(video attachment: Streamlit.mp4)
-
I also see server freezes after the https://github.com/ggerganov/llama.cpp/releases/tag/b3655 and https://github.com/ggerganov/llama.cpp/releases/tag/b3678 patches modified the server scheduling. This should probably be converted to a bug issue.
You can also try lowering the batch size, e.g. `-b 32`. Be careful: a lower batch size can have a big impact on performance. Also, it seems you're running on CPU, so the default batch size of 2048 takes significantly long to process.
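A minimal sketch of what that launch command might look like (the model path and port here are placeholders, not from the original thread; `-b`/`--batch-size` is llama.cpp's batch-size flag, which defaults to 2048):

```shell
# Hypothetical example: start llama-server with a smaller batch size
# so long CPU-bound batch processing doesn't stall the stream as badly.
# models/model.gguf and the port are placeholders; adjust for your setup.
./llama-server -m models/model.gguf -b 32 --port 8080
```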