Skip to content

[WIP][perf] Replace _npu_rotary_embedding with npu_mrope #1195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

David9857
Copy link
Contributor

What this PR does / why we need it?

Replace the original interface(_npu_rotary_embedding) with a new interface(npu_mrope) with better performance.

Does this PR introduce any user-facing change?

NA

How was this patch tested?

Signed-off-by: David9857 <985700846@qq.com>
Signed-off-by: David9857 <985700846@qq.com>
@Yikun Yikun added long-term-test enable long term test for PR accuracy-test enable all accuracy test for PR performance-test enable performance test for PR ready-for-test start test by label for PR and removed ready-for-test start test by label for PR labels Jun 15, 2025
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
@Yikun Yikun added the ready-for-test start test by label for PR label Jun 15, 2025
@@ -64,14 +64,14 @@ def rope_forward_oot(
# TODO: Remove the contiguous in the future.
query = query.contiguous().view(query.shape[0], -1)
key = key.contiguous().view(key.shape[0], -1)
torch_npu._npu_rotary_embedding(
query, key = torch_npu.npu_mrope(
Copy link
Collaborator

@Yikun Yikun Jun 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick question: which torch_npu version supports the npu_mrope operator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for graph mode, no release version supports npu_mrope yet

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you combine #1231 together

@David9857 David9857 changed the title [perf] Replace _npu_rotary_embedding with npu_mrope [WIP][perf] Replace _npu_rotary_embedding with npu_mrope Jun 16, 2025
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accuracy-test enable all accuracy test for PR long-term-test enable long term test for PR merge-conflicts module:ops module:tests performance-test enable performance test for PR ready-for-test start test by label for PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants