[CI]Add e2e test for 310p #1879
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##            main    #1879   +/-  ##
=======================================
  Coverage   71.49%   71.49%
=======================================
  Files          86       86
  Lines        9131     9131
=======================================
  Hits         6528     6528
  Misses       2603     2603
LGTM if CI passes.
@@ -0,0 +1,51 @@
#
Do not create a new folder. Change it to something like:
tests/e2e/singlecard/test_offline_inference_310p.py
done
# See the License for the specific language governing permissions and
# limitations under the License.
# This file is a part of the vllm-ascend project.
# Adapted from vllm/tests/basic_correctness/test_basic_correctness.py
These 2 lines are unnecessary.
done
#
"""Compare the short outputs of the Pangu (Ascend) model when using greedy sampling.

Run `pytest tests/e2e/test_offline_inference.py`.
?
delete it
@@ -0,0 +1,117 @@
#
Why create a new workflow? You could just add two jobs, e2e-310p and e2e-4-cards-310p, in vllm_ascend_test like the others do.
The trigger conditions of this workflow are different from vllm_ascend_test, including its labels, schedule, and tags.
# TODO(yikun): Remove m.daocloud.io prefix when infra proxy ready
image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-310p-ubuntu22.04-py3.10
Change this according to: #1912
@pytest.mark.parametrize("model", MODELS) | ||
@pytest.mark.parametrize("dtype", ["float16"]) | ||
@pytest.mark.parametrize("max_tokens", [5]) | ||
def test_pangu_model(model: str, dtype: str, max_tokens: int) -> None: |
It's better to also add a torchair test case.
@Angazenn Please give some parameter examples here.
Sure, here is an example for Pangu with torchair:
additional_config = {
    "ascend_scheduler_config": {
        "enabled": True,
    },
    "torchair_graph_config": {
        "enabled": True,
    },
}
with VllmRunner(
        "vllm-ascend/pangu-pro-moe-pruing",
        dtype="half",
        tensor_parallel_size=4,
        distributed_executor_backend="mp",
        enforce_eager=False,
        additional_config=additional_config,
        enable_expert_parallel=True,
        compilation_config={
            "custom_ops": ["+unquantized_fused_moe"]
        },
) as vllm_model:
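The snippet above only shows the runner parameters. For reference, a minimal sketch of how they might be wrapped into a full pytest case; the import path, test name, prompt, and the generate_greedy helper are assumptions drawn from the existing e2e tests, not code from this PR:

import pytest

from tests.conftest import VllmRunner  # assumed location of the test runner helper


@pytest.mark.parametrize("max_tokens", [5])
def test_pangu_model_torchair(max_tokens: int) -> None:
    prompts = ["Hello, my name is"]
    additional_config = {
        "ascend_scheduler_config": {"enabled": True},
        "torchair_graph_config": {"enabled": True},
    }
    # tensor_parallel_size=4 needs four cards, so this case belongs in the
    # multi-card (e2e-4-cards-310p) job rather than the single-card suite.
    with VllmRunner(
            "vllm-ascend/pangu-pro-moe-pruing",
            dtype="half",
            tensor_parallel_size=4,
            distributed_executor_backend="mp",
            enforce_eager=False,
            additional_config=additional_config,
            enable_expert_parallel=True,
            compilation_config={"custom_ops": ["+unquantized_fused_moe"]},
    ) as vllm_model:
        # Greedy decoding keeps the output deterministic for comparison.
        vllm_model.generate_greedy(prompts, max_tokens)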
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
What this PR does / why we need it?
Add e2e test for 310p:
- trigger conditions: tag, labels (ready-for-test, e2e-310p-test), schedule
- image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-310p-ubuntu22.04-py3.10
- runner: linux-aarch64-310p-1, linux-aarch64-310p-4
- models: IntervitensInc/pangu-pro-moe-model, Qwen/Qwen3-0.6B-Base, Qwen/Qwen2.5-7B-Instruct
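For illustration, a minimal sketch of what the single-card offline inference check for the Qwen models above could look like; the import path, test name, prompt, and generate_greedy helper are assumptions based on the existing e2e tests, not the exact contents of the new file:

import pytest

from tests.conftest import VllmRunner  # assumed location of the test runner helper

MODELS = [
    "Qwen/Qwen3-0.6B-Base",
    "Qwen/Qwen2.5-7B-Instruct",
]


@pytest.mark.parametrize("model", MODELS)
@pytest.mark.parametrize("dtype", ["float16"])
@pytest.mark.parametrize("max_tokens", [5])
def test_models_310p(model: str, dtype: str, max_tokens: int) -> None:
    prompts = ["Hello, my name is"]
    # enforce_eager keeps the smoke test quick by skipping graph compilation.
    with VllmRunner(model, dtype=dtype, enforce_eager=True) as vllm_model:
        vllm_model.generate_greedy(prompts, max_tokens)

The Pangu MoE model from the list would go into the 4-card job instead, as in the torchair example earlier in the thread.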
Does this PR introduce any user-facing change?
How was this patch tested?