[aclgraph] implement NPUPiecewiseBackend to enable aclgraph #836

Merged
merged 3 commits into from
May 29, 2025

MengqingCao
Collaborator

@MengqingCao MengqingCao commented May 13, 2025

What this PR does / why we need it?

  1. Implement NPUPiecewiseBackend to enable aclgraph
  2. Enable aclgraph by default in V1, but raise an error when running DeepSeek models and emit a warning when running any model other than Qwen
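
The policy in item 2 could be sketched roughly as follows. This is only an illustration of the described behavior, not the code from this PR: the function name `check_aclgraph_support`, the `model_type` argument, the substring matching, and the choice of `NotImplementedError` are all assumptions made for the example.

```python
import warnings


def check_aclgraph_support(model_type: str) -> None:
    """Hypothetical gate for aclgraph, following the policy in item 2.

    `model_type` is assumed to be a lowercase-able model identifier such as
    "qwen2" or "deepseek_v2"; the real check in the PR may look different.
    """
    name = model_type.lower()

    if "deepseek" in name:
        # DeepSeek models are rejected outright while aclgraph is enabled.
        raise NotImplementedError(
            f"aclgraph is not supported for {model_type}; "
            "run this model in eager mode instead."
        )

    if "qwen" not in name:
        # Any other non-Qwen model is allowed to proceed, but the user is
        # warned that the combination has not been validated.
        warnings.warn(
            f"aclgraph has not been validated for {model_type}; "
            "behavior may be incorrect.",
            stacklevel=2,
        )


# Qwen passes silently, an unrelated model only triggers a warning.
check_aclgraph_support("qwen2")
check_aclgraph_support("llama")
```

Raising for the known-broken case and warning for the untested one keeps the default (aclgraph on) usable for Qwen while making the risk visible for everything else.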

How was this patch tested?

CI passes with the new unit tests.

@yiz-liu
Contributor

yiz-liu commented May 23, 2025

I believe this is acceptable overall; however, we need to standardize our use of either npugraph or aclgraph throughout the codebase. At present, I use aclgraph consistently and recommend that we continue doing so moving forward.

Additionally, some comments still reference “cuda,” which should be updated to “ascend” or “npu.”

@MengqingCao

@MengqingCao
Collaborator Author

Good catch. The latest commit now uses aclgraph consistently, and the comments have been fixed.

@MengqingCao MengqingCao marked this pull request as ready for review May 23, 2025 08:22
@MengqingCao MengqingCao force-pushed the piecewise branch 2 times, most recently from 23e1738 to c2c5403 May 27, 2025 01:47
@MengqingCao
Collaborator Author

@yiz-liu @wangxiyuan could you take a look at the latest code?

Currently, aclgraph is enabled in V1 by default. It raises an error when running DeepSeek models and emits a warning when running any model other than Qwen.

Collaborator

@wangxiyuan wangxiyuan left a comment

Please rebase after #952 is merged.

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao
Collaborator Author

@wangxiyuan thanks for your review. All the comments have been addressed; could this be merged now?

Signed-off-by: MengqingCao <cmq0113@163.com>
@wangxiyuan wangxiyuan merged commit a93bed4 into vllm-project:main May 29, 2025
22 checks passed
raindaywhu added a commit to raindaywhu/vllm-ascend that referenced this pull request May 30, 2025
… main

* 'main' of https://github.com/raindaywhu/vllm-ascend:
  [aclgraph] implentment NPUPiecewiseBackend to enable aclgraph (vllm-project#836)
  [Bugfix][V1] Fix deepseek with v1 (vllm-project#958)
  [Perf] Refactor tensor disposal logic to reduce memory usage (vllm-project#966)
David9857 pushed a commit to David9857/vllm-ascend that referenced this pull request Jun 3, 2025
…roject#836)

### What this PR does / why we need it?
1. Implement `NPUPiecewiseBackend` to enable aclgraph
2. Enable aclgraph by default in V1, but raise an error when running
DeepSeek models and emit a warning when running any model other than Qwen

### How was this patch tested?
CI passes with the new unit tests.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao MengqingCao deleted the piecewise branch June 4, 2025 12:35