Summary
In certain requests, the `model.respond(code_chat, response_format=response_format)` call in the LM Studio Python SDK hangs indefinitely. The model appears to be running, but the call never returns a response and never raises an exception.
Steps to Reproduce
1. Load a model via LM Studio's Python SDK.
2. Construct a prompt and call `model.respond(...)` with a `Chat` object and a `response_format`.
3. Occasionally, the call does not return or raise an error.
4. The Python process remains stuck, and the model engine keeps running in the background.
5. The only recovery options are:
   - Killing the Python process.
   - Manually stopping the running model in LM Studio.
Expected Behavior
There should be a way to set a timeout for the synchronous API (`model.respond(...)`). If the model takes too long to respond, the SDK should either:
- Raise a timeout exception, or
- Provide a configurable timeout parameter (e.g., `timeout=180`).
Actual Behavior
- No timeout mechanism exists.
- The process hangs indefinitely on certain inputs.
- This creates challenges for production or automated workflows where stuck calls block the system.
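Until the SDK offers a native timeout, one client-side workaround is to run the blocking call in a worker thread and bound the wait with `concurrent.futures`. This is a sketch, not part of the LM Studio SDK; `respond_with_timeout` is a hypothetical helper name, and it cannot kill the hung worker thread, so a stuck call may still delay interpreter exit.

```python
import concurrent.futures

def respond_with_timeout(fn, timeout=180.0):
    """Run a blocking call in a worker thread; raise TimeoutError if it
    exceeds `timeout` seconds. The worker thread itself cannot be killed,
    so a hung call keeps running in the background after the timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(fn)
        return future.result(timeout=timeout)
    finally:
        # Do not wait for the (possibly hung) worker thread on cleanup.
        pool.shutdown(wait=False)

# Assumed usage with the SDK call from this issue:
# result = respond_with_timeout(
#     lambda: model.respond(code_chat, response_format=response_format),
#     timeout=180,
# )
```

This only unblocks the caller; as noted above, the model engine may keep running until the process is killed or the model is stopped manually in LM Studio, which is why a timeout built into the SDK itself would be preferable.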
Environment
- OS: Windows 11
- Python: 3.11.9
- LM Studio Version: 0.3.17 (build 11)
- Model: qwen2.5-14b-instruct