Fix the device error when using ray and add initialize_cache to support vllm main #884


Merged
merged 1 commit into vllm-project:main on Jun 16, 2025

Conversation

@zhuo97 (Contributor) commented May 16, 2025

  1. Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES
  2. Add lazy init for vllm_ascend_C (see the condensed sketch below)
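
For context, a condensed before/after sketch of how the two changes fit together (abbreviated, not the verbatim diff). The apparent motivation, judging from the lazy-init change, is that importing vllm_ascend_C registers the custom ops and touches the device runtime eagerly at import time, before ray can set ASCEND_RT_VISIBLE_DEVICES per worker:

# Before (vllm_ascend/platform.py): both happened eagerly at import time.
import os
os.environ["RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES"] = "1"  # removed by this PR
import vllm_ascend.vllm_ascend_C  # noqa: F401  # registered custom ops immediately

# After: registration is deferred into a helper and only runs on first use,
# so ray can still adjust device visibility beforehand.
from vllm_ascend.utils import enable_custom_op
enable_custom_op()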



def custom_rotary_embedding_enabled(query, neox_style, head_size):
    return query.dtype == torch.float16 and neox_style and head_size % 32 == 0 and CUSTOM_OP_ENABLED

try_register_lib("vllm_ascend.vllm_ascend_C")
Collaborator

CI failed. I guess the problem is here: for the rotary_embedding test, this function should be called as well.

Contributor Author

Sorry, I forgot to modify test_rotary_embedding.py to use try_register_lib("vllm_ascend.vllm_ascend_C") instead of import vllm_ascend.platform.

@wangxiyuan (Collaborator)

The CI failure has been fixed. Please rebase onto main again. Thanks.

import vllm_ascend.platform # noqa: F401
from vllm_ascend.utils import try_register_lib

try_register_lib("vllm_ascend.vllm_ascend_C")
Collaborator

Please pass lib_info as well, to keep the warning message the same as before:
Failed to register custom ops, all custom ops will be disabled
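
Presumably the call then becomes something like this (a sketch; the exact lib_info wording in the merged code may differ):

try_register_lib(
    "vllm_ascend.vllm_ascend_C",
    "Failed to register custom ops, all custom ops will be disabled")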

Contributor Author

OK, the information is added.

github-actions bot commented Jun 3, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions bot commented Jun 6, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@@ -33,14 +32,6 @@
from vllm_ascend.utils import ASCEND_QUATIZATION_METHOD, update_aclgraph_sizes

CUSTOM_OP_ENABLED = False
@MengqingCao (Collaborator) commented Jun 16, 2025

I think this could be removed?

Contributor Author

Yes, this variable is removed.

@@ -50,7 +41,6 @@
VllmConfig = None
FlexibleArgumentParser = None

os.environ["RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES"] = "1"
Collaborator

TODO: Add a UT for the ray backend. We should do this after this PR and #1235.
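
A minimal sketch of what such a UT could look like (the model name and tensor_parallel_size here are placeholders, not taken from this PR):

import pytest
from vllm import LLM, SamplingParams

@pytest.mark.parametrize("distributed_executor_backend", ["ray"])
def test_models_distributed_ray(distributed_executor_backend):
    # Placeholder model; any model runnable across two cards would do.
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct",
              tensor_parallel_size=2,
              distributed_executor_backend=distributed_executor_backend)
    outputs = llm.generate(["Hello, my name is"],
                           SamplingParams(max_tokens=16))
    assert len(outputs) == 1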

Comment on lines 13 to 15
import vllm_ascend.platform as pf

pf.CUSTOM_OP_ENABLED = True
Collaborator

Suggested change:
-import vllm_ascend.platform as pf
-pf.CUSTOM_OP_ENABLED = True
+import vllm_ascend.vllm_ascend_C

@@ -67,6 +67,22 @@ def try_register_lib(lib_name: str, lib_info: str = ""):
        pass


def enable_custom_op():
    CUSTOM_OP_ENABLED = False
Collaborator

Let's add a comment here to remind developers to import vllm_ascend.vllm_ascend_C in examples or UTs of custom ops.

Contributor Author

OK, the related comments are added.
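
For reference, a sketch of the kind of reminder that fits here (illustrative wording, not necessarily the exact comment that was merged):

def enable_custom_op():
    # NOTE: this helper is not called automatically on import. Examples
    # and UTs that rely on custom ops must call enable_custom_op() (or
    # `import vllm_ascend.vllm_ascend_C`) explicitly.
    ...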

@MengqingCao (Collaborator)

LGTM now, thanks!

github-actions bot:
This pull request has conflicts, please resolve those before we can evaluate the pull request.

@wangxiyuan (Collaborator) left a comment

Please rebase onto main; the e2e tests have already been moved to the tests/e2e folder.


pf.CUSTOM_OP_ENABLED = True # set True for custom Ops of Multi-Step.
enable_custom_op()
Collaborator

I think these two lines can be removed.

Contributor Author

OK, these two lines are removed.

@@ -67,6 +67,26 @@ def try_register_lib(lib_name: str, lib_info: str = ""):
        pass


def enable_custom_op():
Collaborator

Make CUSTOM_OP_ENABLED a global var:

CUSTOM_OP_ENABLED = None

def enable_custom_op():
    global CUSTOM_OP_ENABLED
    if CUSTOM_OP_ENABLED is not None:
        return CUSTOM_OP_ENABLED
    else:
        xxxx

Contributor Author

OK, I have changed CUSTOM_OP_ENABLED to a global variable.
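
For reference, a self-contained sketch of the resulting lazy-init pattern, combining the reviewer's outline with the warning message discussed above (the logger setup is added here for completeness; the merged implementation may differ in details):

import logging

logger = logging.getLogger(__name__)

CUSTOM_OP_ENABLED = None

def enable_custom_op():
    # Lazily register vllm_ascend_C custom ops on first use, instead of
    # at import time, and cache the result in the global flag.
    global CUSTOM_OP_ENABLED
    if CUSTOM_OP_ENABLED is not None:
        return CUSTOM_OP_ENABLED
    try:
        # Importing the C extension registers the custom torch ops.
        import vllm_ascend.vllm_ascend_C  # noqa: F401
        CUSTOM_OP_ENABLED = True
    except ImportError:
        CUSTOM_OP_ENABLED = False
        logger.warning(
            "Failed to register custom ops, all custom ops will be disabled")
    return CUSTOM_OP_ENABLED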

@zhuo97 force-pushed the main branch 2 times, most recently from 3589b0e to 17efb65, on June 16, 2025 11:16
@wangxiyuan (Collaborator)

CI is broken due to the vllm change: vllm-project/vllm@1173804#diff-3dd8e96bc7c1aaf28faa13b3b705f4f7bdbf755aec34dcf4d9b67c933ddfb127

Can you fix it in this PR as well?
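
For context, this is what the "add initialize_cache" part of the final title refers to: vllm main now invokes a worker-side initialize_cache hook. A sketch of its expected shape on the Ascend worker (the signature follows vllm's worker interface; the body is illustrative):

def initialize_cache(self, num_gpu_blocks: int, num_cpu_blocks: int) -> None:
    # Record the KV-cache block counts decided by the engine so later
    # allocation can use them.
    self.cache_config.num_gpu_blocks = num_gpu_blocks
    self.cache_config.num_cpu_blocks = num_cpu_blocks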

…or vllm_ascend_C

Signed-off-by: zhuo97 <1103045176@qq.com>
@wangxiyuan wangxiyuan merged commit f5404dc into vllm-project:main Jun 16, 2025
20 checks passed
@Yikun changed the title from "Fix the device error when using ray as vllm-acend backend" to "Fix the device error when using ray and add initialize_cache to support vllm main" on Jun 17, 2025
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 17, 2025
…ect#884)

1. Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES
2. Add lazy init for vllm_ascend_C

Signed-off-by: zhuo97 <1103045176@qq.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 17, 2025
…ect#884)

1. Remove RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES
2. Add lazy init for vllm_ascend_C

Signed-off-by: zhuo97 <1103045176@qq.com>
Signed-off-by: wangxiaoxin (A) <wangxiaoxin7@huawei.com>
songshanhu07 added a commit to songshanhu07/vllm-ascend that referenced this pull request Jun 17, 2025
…to main

* 'main' of https://github.com/vllm-project/vllm-ascend: (22 commits)
  [Bugfix] Remove cuda related lines and add additional pip mirror (vllm-project#1252)
  [refactor] Refactoring AscendFusedMoE (vllm-project#1229)
  [Doc] Refactor and init user story page  (vllm-project#1224)
  [Doctest] add installation doctest (vllm-project#1179)
  [DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (vllm-project#1235)
  Fix the device error when using ray as vllm-acend backend (vllm-project#884)
  [CI] Add unit test framework (vllm-project#1201)
  [Build] Speedup image build (vllm-project#1216)
  [CI] Make e2e test to be preemptible and simple (vllm-project#1217)
  Waiting for BMM NZ support(Improve TPOP 2ms performance)  (vllm-project#1131)
  [Doc] fix VLLM_USE_V1 value in graph mode docs (vllm-project#1226)
  vllm-ascend support chunked prefill (vllm-project#1172)
  [CI/UT][Graph] Add ut for torchair graph mode (vllm-project#1103)
  Add ShouJian Zheng (@jianzs) as vLLM Ascend maintainer (vllm-project#1203)
  [CI] Recover ut for ascend scheduler only in ci of v1. (vllm-project#1180)
  Support multistream of MLA vector operations (vllm-project#1135)
  [Doc] Add Referer header for CANN package download url. (vllm-project#1192)
  [fix] fix bug in 1p1d disaggregated_prefill example (vllm-project#1184)
  [CI][Benchmark] Add qwen2.5-7b test (vllm-project#1104)
  [CI][Benchmark] Add new model and v1 test to perf benchmarks (vllm-project#1099)
  ...
Sync with upstream main branch.