Description
Search before asking
- I had searched in the issues and found no similar issues.
Operating system information
Linux
What happened
Using version 0.8 deployed with Docker Compose: in the model configuration, adding SiliconFlow's deepseek-v3 model works fine, but configuring a locally deployed qwen3-32b model served by vLLM fails (the vLLM deployment itself is healthy; other applications are using it). The Docker container shows the following log: Caused by: pemja.core.PythonException: <class 'RuntimeError'>: invalid llm config: {'api_key': 'ha', 'base_url': 'http://172.29.85.208:9000/v1', 'model': 'qwen3-32b', 'modelType': 'chat', 'type': 'maas', 'customize': {}}, for details: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 8192 tokens. However, you requested 8218 tokens (26 in the messages, 8192 in the completion). Please reduce the length of the messages or completion. None", 'type': 'BadRequestError', 'param': None, 'code': 400}
I tried adding a custom max_tokens parameter set to 4000. The POST request body does contain this custom parameter, but the same error still occurs: 8218 tokens......
The web page shows a similar error: unknown error
PemjaUtils.invoke Exception:pemja.core.PythonException: <class 'RuntimeError'>: invalid llm config: {'api_key': 'ha', 'base_url': 'http://172.29.85.208:9000/v1', 'model': 'qwen3-32b', 'modelType': 'chat', 'type': 'maas', 'customize': {}}, for details: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 8192 tokens. However, you requested 8218 tokens (26 in the messages, 8192 in the completion). Please reduce the length of the messages or completion. None", 'type': 'BadRequestError', 'param': None, 'code': 400}
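From the error message, the connectivity check appears to request max_tokens equal to the model's full 8192-token context window, so prompt (26 tokens) + completion (8192 tokens) = 8218 tokens exceeds the window, and vLLM rejects the request with a 400. A minimal sketch of the arithmetic, with a hypothetical `clamp_max_tokens` helper (not part of the project's code) showing how a valid request could be formed:

```python
def clamp_max_tokens(context_length: int, prompt_tokens: int, requested: int) -> int:
    """Keep prompt + completion within the model's context window."""
    budget = context_length - prompt_tokens
    return min(requested, budget)

context_length = 8192  # qwen3-32b window, as reported by the vLLM server
prompt_tokens = 26     # tokens in the validation message, per the error
requested = 8192       # max_tokens the check apparently sent

# 26 + 8192 = 8218 > 8192, which is exactly the BadRequestError above
assert prompt_tokens + requested > context_length

safe = clamp_max_tokens(context_length, prompt_tokens, requested)
print(safe)  # 8166: prompt + completion now fits the window exactly
```

This also suggests why setting a custom max_tokens of 4000 may not help: if the validation call builds its own request with its own max_tokens, the user-supplied value would be ignored during the connectivity check.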
How to reproduce
Deploy with docker-compose (image pulled 2025-08-01). In the model configuration, choose to add a model of the OpenAI type, fill in the parameters of the local model, and click Confirm; the error then occurs.
Are you willing to submit PR?
- Yes I am willing to submit a PR!