Apparent neverending response to simple prompt #12919
Unanswered
mcondarelli asked this question in Q&A
Replies: 1 comment 1 reply
-
When using the web interface, you have to set the model parameters by clicking the settings icon in the top right corner. For DeepSeek, the temperature should be set to 0.6; each model comes with its own set of parameters for optimum performance. Also, even with the parameters set, reasoning models can take a long time to finish (I have seen a response take 70 minutes), depending on your hardware.
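If the web UI settings don't stick, the same parameters can also be passed per request through the server's OpenAI-compatible API. A minimal sketch in Python, assuming the default localhost:8080 address and the /v1/chat/completions endpoint; the max_tokens cap is just an extra safeguard against runaway generations:

```python
import json
import urllib.request

# Send a chat request to llama-server with explicit sampling parameters
# instead of relying on the web-UI defaults. Assumes the server listens on
# localhost:8080 and exposes the OpenAI-compatible /v1/chat/completions route.
payload = {
    "messages": [
        {
            "role": "user",
            "content": "in modern python: how to start a subprocess in daemon "
                       "mode, redirecting output to file and saving pid to file?",
        }
    ],
    "temperature": 0.6,   # value suggested above for DeepSeek-style models
    "max_tokens": 2048,   # hard cap so a runaway generation cannot go on forever
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    answer = json.load(resp)

print(answer["choices"][0]["message"]["content"])
```

If memory serves, a similar cap can also be set server-side with llama-server's -n / --n-predict option.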
-
This looks like a bug, but it could also be some bad setting on my side (or even a model bug, though that would be very strange).
Please advise.
I compiled from the latest git (commit: 8b9cc7c) using the following options:
It compiles without apparent errors.
I started llama-server as follows (the model was downloaded from HuggingFace):
I then connected to the web server (using Firefox, if relevant) and sent a test question: "in modern python: how to start a subprocess in daemon mode, redirecting output to file and saving pid to file?"
The model actually answered the question correctly, but it didn't stop: it kept rambling to itself until I hit the "Stop" button.
I attach the actual output: conversation_conv-1744465379214.json.
I also attach the log, in which I see some worrying messages I don't know how to interpret: llama-server.log
It seems there is some mismatch between the prompt format expected by the model and what is actually sent by llama-server.
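One way to check for such a mismatch would be to ask the running server which chat template it is actually applying and compare that against what the model card expects. A minimal sketch, assuming this build of llama-server exposes the /props endpoint on the default localhost:8080 address:

```python
import json
import urllib.request

# Ask the running server for its properties; recent llama-server builds report
# the chat template in use here. Key names vary between versions, so dump
# everything rather than assuming a particular field.
with urllib.request.urlopen("http://localhost:8080/props") as resp:
    props = json.load(resp)

print(json.dumps(props, indent=2))
```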