[CI/Build] Upgrade CANN to 8.2.RC1.alpha003 #1653
base: main
Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #1653      +/-   ##
===========================================
+ Coverage   27.39%   53.87%   +26.47%
===========================================
  Files          56       80       +24
  Lines        6191     9964     +3773
===========================================
+ Hits         1696     5368     +3672
- Misses       4495     4596      +101
```

Flags with carried forward coverage won't be shown.
Signed-off-by: MengqingCao <cmq0113@163.com>
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Considering that the CANN alpha3 release brings many OOM and internal errors, we will not upgrade to it:
[1] https://github.com/vllm-project/vllm-ascend/actions/runs/16133440759/job/45525106786?pr=1653
RuntimeError: replay:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:201 NPU function error: c10_npu::acl::AclmdlRIExecuteAsync(model_ri_, c10_npu::getCurrentNPUStream()), error code is 507000
[2] Unexpected OOM:
https://github.com/vllm-project/vllm-ascend/actions/runs/16133440759/job/45525106780?pr=1653
RuntimeError: NPU out of memory. Tried to allocate 98.00 MiB (NPU 0; 29.50 GiB total capacity; 1010.15 MiB already allocated; 1010.15 MiB current active; 39.48 MiB free; 1.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
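As a side note on the second failure: the OOM message itself suggests an allocator tweak when reserved memory far exceeds allocated memory. A minimal sketch of how that hint would typically be applied, assuming torch_npu honors a `PYTORCH_NPU_ALLOC_CONF` environment variable analogous to PyTorch's `PYTORCH_CUDA_ALLOC_CONF` (the variable name and the 128 MiB value are assumptions, not something verified against this CI run):

```shell
# Assumption: torch_npu reads PYTORCH_NPU_ALLOC_CONF, mirroring the
# caching-allocator knobs of PyTorch's PYTORCH_CUDA_ALLOC_CONF.
# max_split_size_mb caps the block size the allocator may split,
# which can reduce fragmentation when reserved >> allocated memory.
export PYTORCH_NPU_ALLOC_CONF="max_split_size_mb:128"

# The setting must be in the environment before the Python process starts.
echo "$PYTORCH_NPU_ALLOC_CONF"   # prints: max_split_size_mb:128
```

This would only paper over fragmentation, not the regression itself, which is why the comment above blocks the upgrade rather than tuning around it.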
What this PR does / why we need it?
Upgrade CANN to 8.2.RC1.alpha003
Does this PR introduce any user-facing change?
Yes, users need to upgrade CANN to 8.2.RC1.alpha003.
How was this patch tested?
CI passed with the existing tests.