
[1/N][UT][v1 MTP] add basic v1 mtp features #890


Merged
merged 1 commit into vllm-project:main from v1_mtp
May 30, 2025

Conversation

XWFAlone
Contributor

@XWFAlone XWFAlone commented May 17, 2025

What this PR does / why we need it?

Add basic v1 MTP features. Please merge it after #874 and #844.

Does this PR introduce any user-facing change?

Now we support basic v1 MTP. Only TP parallelism, eager mode, and k=1 are supported;
we will continue to expand to more scenarios.
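For context, a minimal sketch of running under these constraints; the model name and the speculative_config keys are assumptions for illustration, mirroring vLLM's speculative_config mechanism rather than quoting this PR:

```python
from vllm import LLM, SamplingParams

# Hypothetical sketch: enable MTP-based speculative decoding with the
# constraints described above (TP only, eager mode, k=1).
llm = LLM(
    model="deepseek-ai/DeepSeek-V3",        # an MTP-capable model (example)
    tensor_parallel_size=8,                 # TP-only parallelism
    enforce_eager=True,                     # eager mode, no graph capture
    speculative_config={"num_speculative_tokens": 1},  # k=1 draft token
)
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```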

How was this patch tested?

Tested locally.

@XWFAlone XWFAlone force-pushed the v1_mtp branch 3 times, most recently from 4a85243 to 7d3ca5a Compare May 17, 2025 10:33
@XWFAlone XWFAlone force-pushed the v1_mtp branch 2 times, most recently from 721b02d to edaf563 Compare May 23, 2025 12:34
@@ -0,0 +1,230 @@
import threading
Collaborator

Please add a patch description in the init.
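For illustration, the requested description could be a module-level docstring at the top of the patch file; this is a hypothetical template, not the actual patch notes:

```python
"""Patch description (hypothetical template).

What is patched: the vLLM class or function being monkey-patched here.
Why: the Ascend-specific behavior that requires the patch.
When to remove: the upstream vLLM version that makes this patch obsolete.
"""
```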

@@ -228,7 +232,7 @@ def __init__(self, vllm_config: VllmConfig, device: torch.device):
self.requests: Dict[str, CachedRequestState] = {}
# Persistent batch.
# Remove this after we drop 0.8.5 support
if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
if vllm_version_is("0.8.5") or ("0.8.5.post1"):
Contributor

This if statement will always be True, because the second operand is a bare non-empty string rather than a version check; change this back to the previous version.

Suggested change
if vllm_version_is("0.8.5") or ("0.8.5.post1"):
if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
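For context, a minimal demonstration of why the broken condition can never be False: any non-empty string is truthy in Python.

```python
# The buggy expression evaluates the string literal itself, not a version check:
print(bool("0.8.5.post1"))      # True: non-empty strings are truthy
print(False or "0.8.5.post1")   # "0.8.5.post1", which is truthy inside an if
```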

@XWFAlone XWFAlone force-pushed the v1_mtp branch 2 times, most recently from 27f8f0f to 5771f55 Compare May 27, 2025 02:05
import pytest
from vllm import LLM, SamplingParams

os.environ['VLLM_USE_MODELSCOPE'] = 'True'
Collaborator

If you add this env var, you should run this case in a separate process in CI to avoid affecting other cases in the same process that do not use ModelScope.
You can also clear the environment variable after the script executes. In short, make sure this environment variable is only in effect for this file.
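One way to keep the variable file-local is a pytest fixture with monkeypatch, which restores the previous value at teardown; a minimal sketch (the fixture name is hypothetical):

```python
import pytest


@pytest.fixture(autouse=True)
def modelscope_env(monkeypatch):
    # Applies only to tests in this file; monkeypatch undoes the change
    # automatically after each test, so other test files are unaffected.
    monkeypatch.setenv("VLLM_USE_MODELSCOPE", "True")
```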


from vllm_ascend.attention.attention_v1 import AscendAttentionState
from vllm_ascend.ops.attention import vanilla_chunked_prefill_mla
from vllm_ascend.utils import vllm_version_is
from vllm_ascend.utils import vllm_major_version_is, vllm_version_is
Collaborator

Why add this version-check function? Please explain.

# Convert from (L, N, P) to (N, P, L)
self.W_UK_T = W_UK.permute(1, 2, 0).contiguous()
self.W_UV.data = torch_npu.npu_format_cast(self.W_UV.data, 29)
self.W_UK_T.data = torch_npu.npu_format_cast(self.W_UK_T.data, 29)
Collaborator

Why make this change?

@mengwei805
Collaborator

Please rebase all your commits into one commit.

@XWFAlone XWFAlone force-pushed the v1_mtp branch 3 times, most recently from 2a7968b to fb0db2b Compare May 28, 2025 01:54
@mengwei805 mengwei805 added the long-term-test and ready-for-test labels May 28, 2025
@mengwei805 mengwei805 added the ready label May 28, 2025
@mengwei805 mengwei805 removed the ready label May 28, 2025
@XWFAlone XWFAlone force-pushed the v1_mtp branch 2 times, most recently from 7af2e72 to 56f8efb Compare May 29, 2025 02:08
@wangxiyuan
Collaborator

You can rebase now; the CI error is fixed.

@wangxiyuan wangxiyuan added and removed the ready-for-test label May 29, 2025
@@ -228,7 +232,7 @@ def __init__(self, vllm_config: VllmConfig, device: torch.device):
self.requests: Dict[str, CachedRequestState] = {}
# Persistent batch.
# Remove this after we drop 0.8.5 support
if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
if vllm_version_is("0.8.5") or ("0.8.5.post1"):
Collaborator

This change does not work.

@mengwei805 mengwei805 added and removed the ready-for-test label May 29, 2025
@mengwei805 mengwei805 added and removed the ready-for-test label May 29, 2025
@mengwei805 mengwei805 added and removed the ready-for-test label May 29, 2025
@mengwei805 mengwei805 added and removed the ready-for-test label May 29, 2025
@@ -0,0 +1,92 @@
from __future__ import annotations
Collaborator

Why add this import?

Contributor Author

It avoids circular-import problems caused by type annotations.
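For context, a minimal sketch (module and class names are hypothetical) of how `from __future__ import annotations` plus `TYPE_CHECKING` breaks a cycle that exists only for annotations:

```python
# worker.py (hypothetical); scheduler.py imports worker.py, so a direct
# runtime import here would be circular.
from __future__ import annotations  # annotations are no longer evaluated at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from scheduler import Scheduler  # seen only by type checkers


class Worker:
    def bind(self, scheduler: Scheduler) -> None:  # fine: annotation stays lazy
        self.scheduler = scheduler
```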

Co-authored-by: XWFAlone <xuewenfei2@huawei.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: JC-ut0 <xuyexiong@huawei.com>
Signed-off-by: XWFAlone <xuewenfei2@huawei.com>
@mengwei805 mengwei805 added and removed the ready-for-test label May 29, 2025
@wangxiyuan wangxiyuan merged commit 3442fbd into vllm-project:main May 30, 2025
26 checks passed
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
### What this PR does / why we need it?
Add basic v1 MTP features. Please merge it after
vllm-project#874 and
vllm-project#844.

### Does this PR introduce _any_ user-facing change?
Now we support basic v1 MTP. Only TP parallelism, eager mode, and k=1 are supported;
we will continue to expand to more scenarios.

### How was this patch tested?
Tested locally.

Signed-off-by: XWFAlone <xuewenfei2@huawei.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: JC-ut0 <xuyexiong@huawei.com>
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
David9857 pushed a commit to David9857/vllm-ascend that referenced this pull request Jun 3, 2025
Labels
long-term-test (enable long term test for PR), module:ops, module:tests, ready (read for review), ready-for-test (start test by label for PR)
4 participants