[Usage]: Deploying deepseek r1/v3 on 4*910B2 reports an error #485
Comments
This log shows you're using a quantized DeepSeek weight, but quantization is not supported yet in vllm-ascend. The weight map also appears to be mismatched with your weight parameters. Could you use the bf16 DeepSeek weights and check that the weight files match?
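The "weight files' match" check above can be sketched as a small script. This is a hypothetical helper, not part of vllm-ascend; it assumes a standard Hugging Face checkpoint layout where `model.safetensors.index.json` maps tensor names to shard filenames:

```python
import json
import os


def check_weight_map(model_dir):
    """Compare model.safetensors.index.json against the shard files on disk.

    Returns (missing_shards, listed_tensor_names):
      - missing_shards: shard filenames referenced by the weight map but
        absent from model_dir (a common cause of weight-loading mismatches)
      - listed_tensor_names: all tensor names the index claims to provide,
        which can be diffed against the names the model actually requests
    """
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    with open(index_path) as f:
        index = json.load(f)
    weight_map = index["weight_map"]  # tensor name -> shard filename
    shards = set(weight_map.values())
    missing = sorted(
        s for s in shards if not os.path.exists(os.path.join(model_dir, s))
    )
    return missing, sorted(weight_map)
```

If `missing_shards` is non-empty, the download is incomplete; if the tensor names don't match what the model expects, the checkpoint and the model implementation disagree.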
My model is w8a8-quantized. How do I specify a particular quantization method with vllm serve?
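For reference, upstream vLLM selects the quantization method at serve time via the `--quantization` flag. Whether vllm-ascend accepts a value for its w8a8 path (the `ascend` value below is an assumption) should be verified against the project's documentation; per the maintainer comment above, quantization was not yet supported at the time of this issue:

```shell
# Hypothetical invocation: --quantization names the method; "ascend" is an
# assumed value for vllm-ascend's w8a8 support -- check the project docs.
vllm serve /path/to/DeepSeek-R1-w8a8 \
    --tensor-parallel-size 8 \
    --quantization ascend
```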
When using Ray to launch multi-node inference of a large model, isn't it very slow?
@zhanglzu Thanks for the comment. We are currently working on distributed-inference enhancements, in cooperation with vLLM and Ray:
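For context, a multi-node setup with Ray typically looks like the sketch below. The flags are upstream vLLM conventions; the parallelism sizes are illustrative placeholders, and `<head-ip>` must be filled in:

```shell
# On the head node: start the Ray cluster.
ray start --head --port=6379

# On each worker node: join the cluster.
ray start --address=<head-ip>:6379

# Then launch once on the head node; tensor x pipeline parallelism
# spans the devices of all joined nodes.
vllm serve /path/to/DeepSeek-V3 \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 4 \
    --distributed-executor-backend ray
```

Ray itself mainly coordinates process placement; per-token latency is dominated by the interconnect between nodes, so multi-node inference is slower than single-node mostly when cross-node communication is the bottleneck.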
Your current environment
How would you like to use vllm on ascend
Deploying deepseek r1/v3 on 4*910B2 reports an error
V3-0324:
R1: