Fix the device error when using ray and add initialize_cache to support vllm main #884
Conversation
zhuo97 commented on May 16, 2025:
- Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES
- Add lazy init for vllm_ascend_C
Force-pushed from f085e48 to f71a8dd
vllm_ascend/ops/rotary_embedding.py (Outdated)

    def custom_rotary_embedding_enabled(query, neox_style, head_size):
        return query.dtype == torch.float16 and neox_style and head_size % 32 == 0 and CUSTOM_OP_ENABLED

    try_register_lib("vllm_ascend.vllm_ascend_C")
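The guard in the diff above decides whether the custom Ascend rotary-embedding kernel may be used. A self-contained sketch of that check (CUSTOM_OP_ENABLED is hard-coded here for illustration; in the real module it reflects whether the vllm_ascend_C extension loaded):

```python
import torch

# Assumption for this sketch: pretend the C extension loaded successfully.
# In vllm_ascend this flag is set during extension registration.
CUSTOM_OP_ENABLED = True


def custom_rotary_embedding_enabled(query: torch.Tensor, neox_style: bool,
                                    head_size: int) -> bool:
    # The custom kernel only handles fp16 queries, neox-style rotary
    # layout, and head sizes that are multiples of 32.
    return (query.dtype == torch.float16 and neox_style
            and head_size % 32 == 0 and CUSTOM_OP_ENABLED)
```

Any one failing condition falls back to the generic implementation, so the predicate is safe to call unconditionally.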
CI failed. I guess the problem is here: for the rotary_embedding test, this function should be called as well.
Sorry, I forgot to modify test_rotary_embedding.py to use try_register_lib("vllm_ascend.vllm_ascend_C") instead of import vllm_ascend.platform.
Force-pushed from d4c5e5f to 7d9155d
The CI failure has been fixed. Please rebase onto main again. Thanks.
tests/ops/test_rotary_embedding.py (Outdated)

    import vllm_ascend.platform  # noqa: F401
    from vllm_ascend.utils import try_register_lib

    try_register_lib("vllm_ascend.vllm_ascend_C")
Please pass lib_info as well, to keep the same message as before: "Failed to register custom ops, all custom ops will be disabled".
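For context, the helper under discussion behaves roughly like the sketch below (a minimal reconstruction, not the exact vllm_ascend.utils implementation): attempt to import the named extension module, and on failure log lib_info as a warning instead of raising.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_register_lib(lib_name: str, lib_info: str = "") -> None:
    """Best-effort import of an optional extension module.

    Sketch of vllm_ascend.utils.try_register_lib; the real helper may
    differ in logging details.
    """
    try:
        importlib.import_module(lib_name)
        logger.info("Successfully registered %s", lib_name)
    except Exception:
        # Swallow the failure: custom ops are optional, so callers
        # continue with the pure-Python fallback path.
        if lib_info:
            logger.warning(lib_info)
```

Passing lib_info keeps the user-facing warning identical to the old behavior when the extension is missing.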
OK, the information is added.
Force-pushed from 3023c18 to 970f649
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from 8130d67 to 278532c
This pull request has conflicts, please resolve those before we can evaluate the pull request.
vllm_ascend/platform.py (Outdated)

    @@ -33,14 +32,6 @@
    from vllm_ascend.utils import ASCEND_QUATIZATION_METHOD, update_aclgraph_sizes

    CUSTOM_OP_ENABLED = False
I think this could be removed?
Yes, this variable has been removed.
    @@ -50,7 +41,6 @@
    VllmConfig = None
    FlexibleArgumentParser = None

    os.environ["RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES"] = "1"
TODO: Add a unit test for the ray backend. We should do this after this PR and #1235.
    import vllm_ascend.platform as pf

    pf.CUSTOM_OP_ENABLED = True
Suggested change:

    - import vllm_ascend.platform as pf
    - pf.CUSTOM_OP_ENABLED = True
    + import vllm_ascend.vllm_ascend_C
vllm_ascend/utils.py (Outdated)

    @@ -67,6 +67,22 @@ def try_register_lib(lib_name: str, lib_info: str = ""):
        pass


    def enable_custom_op():
        CUSTOM_OP_ENABLED = False
Let's add a comment here to remind developers to import vllm_ascend.vllm_ascend_C in examples or UTs of custom ops.
OK, the related comments have been added.
LGTM now, thanks!
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Please rebase onto main; the e2e tests have already been moved to the tests/e2e folder.
    pf.CUSTOM_OP_ENABLED = True  # set True for custom Ops of Multi-Step.
    enable_custom_op()
I think these two lines can be removed.
OK, these two lines have been removed.
    @@ -67,6 +67,26 @@ def try_register_lib(lib_name: str, lib_info: str = ""):
        pass


    def enable_custom_op():
Make CUSTOM_OP_ENABLED a global variable:

    CUSTOM_OP_ENABLED = None

    def enable_custom_op():
        global CUSTOM_OP_ENABLED
        if CUSTOM_OP_ENABLED is not None:
            return CUSTOM_OP_ENABLED
        else:
            xxxx
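Filled in, the suggested caching pattern looks like the sketch below. The import target vllm_ascend.vllm_ascend_C follows the earlier diffs; on a machine without the extension the probe simply caches and returns False, so repeated calls never retry the import.

```python
# Cached tri-state flag: None = not probed yet, True/False = probe result.
CUSTOM_OP_ENABLED = None


def enable_custom_op() -> bool:
    """Lazily load the custom-op C extension; cache and return the result.

    Note for developers: examples or UTs that rely on custom ops should
    call this (or import vllm_ascend.vllm_ascend_C) before using them.
    """
    global CUSTOM_OP_ENABLED
    if CUSTOM_OP_ENABLED is not None:
        return CUSTOM_OP_ENABLED
    try:
        import vllm_ascend.vllm_ascend_C  # noqa: F401
        CUSTOM_OP_ENABLED = True
    except ImportError:
        # Extension not built or not installed: fall back to Python ops.
        CUSTOM_OP_ENABLED = False
    return CUSTOM_OP_ENABLED
```

Caching the result in a module-level global is what lets callers such as custom_rotary_embedding_enabled check the flag cheaply on every invocation.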
Ok, I have changed CUSTOM_OP_ENABLED to a global variable.
Force-pushed from 3589b0e to 17efb65
CI is broken due to the vllm change vllm-project/vllm@1173804#diff-3dd8e96bc7c1aaf28faa13b3b705f4f7bdbf755aec34dcf4d9b67c933ddfb127. Can you fix it in this PR as well?
…or vllm_ascend_C Signed-off-by: zhuo97 <1103045176@qq.com>
…ect#884) 1. Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES 2. Add lazy init for vllm_ascend_C Signed-off-by: zhuo97 <1103045176@qq.com>
…ect#884) 1. Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES 2. Add lazy init for vllm_ascend_C Signed-off-by: zhuo97 <1103045176@qq.com> Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>
…to main

* 'main' of https://github.com/vllm-project/vllm-ascend: (22 commits)
  [Bugfix] Remove cuda related lines and add additional pip mirror (vllm-project#1252)
  [refactor] Refactoring AscendFusedMoE (vllm-project#1229)
  [Doc] Refactor and init user story page (vllm-project#1224)
  [Doctest] add installation doctest (vllm-project#1179)
  [DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (vllm-project#1235)
  Fix the device error when using ray as vllm-acend backend (vllm-project#884)
  [CI] Add unit test framework (vllm-project#1201)
  [Build] Speedup image build (vllm-project#1216)
  [CI] Make e2e test to be preemptible and simple (vllm-project#1217)
  Waiting for BMM NZ support(Improve TPOP 2ms performance) (vllm-project#1131)
  [Doc] fix VLLM_USE_V1 value in graph mode docs (vllm-project#1226)
  vllm-ascend support chunked prefill (vllm-project#1172)
  [CI/UT][Graph] Add ut for torchair graph mode (vllm-project#1103)
  Add ShouJian Zheng (@jianzs) as vLLM Ascend maintainer (vllm-project#1203)
  [CI] Recover ut for ascend scheduler only in ci of v1. (vllm-project#1180)
  Support multistream of MLA vector operations (vllm-project#1135)
  [Doc] Add Referer header for CANN package download url. (vllm-project#1192)
  [fix] fix bug in 1p1d disaggregated_prefill example (vllm-project#1184)
  [CI][Benchmark] Add qwen2.5-7b test (vllm-project#1104)
  [CI][Benchmark] Add new model and v1 test to perf benchmarks (vllm-project#1099)
  ...

Sync with upstream main branch.