[aclgraph] Implement NPUPiecewiseBackend to enable aclgraph #836
I believe this is acceptable overall; however, we need to standardize our use of either Additionally, some comments still reference “cuda,” which should be updated to “ascend” or “npu.”
Good catch, the latest commit has changed to use
23e1738 to c2c5403
@yiz-liu @wangxiyuan could you take a look at the latest code? Currently, aclgraph is enabled in V1 by default; it raises an error when running deepseek and emits a warning when running any model other than qwen.
Please rebase after #952 is merged.
Signed-off-by: MengqingCao <cmq0113@163.com>
@wangxiyuan thanks for your review; all the comments have been addressed. Could this be merged now?
… main

* 'main' of https://github.com/raindaywhu/vllm-ascend:
  [aclgraph] implentment NPUPiecewiseBackend to enable aclgraph (vllm-project#836)
  [Bugfix][V1] Fix deepseek with v1 (vllm-project#958)
  [Perf] Refactor tensor disposal logic to reduce memory usage (vllm-project#966)
…roject#836)

### What this PR does / why we need it?
1. Implement `NPUPiecewiseBackend` to enable aclgraph.
2. Enable aclgraph by default in V1, but raise an error when running deepseek and emit a warning when running models other than qwen.

### How was this patch tested?
CI passes with the new UT.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
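The default-on behavior described above (error for deepseek, warning for models other than qwen) could look roughly like the sketch below. This is only an illustration of the gating logic, not the code merged in this PR: the function name `check_aclgraph_support`, its parameters, the substring matching on the model type, and the choice of `NotImplementedError` are all assumptions made for the example.

```python
import warnings


def check_aclgraph_support(model_type: str, use_v1: bool = True) -> bool:
    """Decide whether aclgraph stays enabled for a given model.

    Hypothetical helper for illustration only; the name, signature, and
    exception type are not taken from vllm-ascend.
    """
    # Assumption: aclgraph is only considered on the V1 engine.
    if not use_v1:
        return False

    name = model_type.lower()

    # Assumption: deepseek models are rejected outright.
    if "deepseek" in name:
        raise NotImplementedError(
            "aclgraph is not supported for deepseek models; "
            "disable graph mode to run this model."
        )

    # Assumption: anything other than qwen runs, but with a warning.
    if "qwen" not in name:
        warnings.warn(
            f"aclgraph has only been verified on qwen models; "
            f"'{model_type}' may not work as expected.",
            stacklevel=2,
        )

    return True
```

With this sketch, a qwen model passes silently, an unverified model passes with a `UserWarning`, and a deepseek model fails fast before any graph capture is attempted.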