[WIP][perf] Replace _npu_rotary_embedding with npu_mrope #1195
base: main
Conversation
Signed-off-by: David9857 <985700846@qq.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
@@ -64,14 +64,14 @@ def rope_forward_oot(
     # TODO: Remove the contiguous in the future.
     query = query.contiguous().view(query.shape[0], -1)
     key = key.contiguous().view(key.shape[0], -1)
-    torch_npu._npu_rotary_embedding(
+    query, key = torch_npu.npu_mrope(
Quick question: which torch_npu version supports the npu_mrope operator?
For graph mode, no released torch_npu version supports npu_mrope yet.
Can you combine #1231 into this PR?
This pull request has conflicts; please resolve them before we can evaluate the pull request.
What this PR does / why we need it?
Replace the original interface (_npu_rotary_embedding) with a new interface (npu_mrope) that offers better performance.
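Both operators apply rotary position embedding (RoPE) to the query and key tensors; the fused npu_mrope call returns the rotated tensors directly instead of mutating them in place. For context, the underlying "rotate-half" (GPT-NeoX-style) math that such a kernel computes can be sketched in plain Python. This is an illustrative sketch only, not the torch_npu kernel; the helper name and the scalar loop are my own for clarity.

```python
import math

def rotary_embed_half(vec, cos, sin):
    # "Rotate-half" RoPE for a single head: element i of the first half is
    # paired with element i of the second half and rotated by angle theta_i.
    # vec has length d; cos and sin have length d // 2.
    d2 = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(d2):
        out[i] = vec[i] * cos[i] - vec[i + d2] * sin[i]
        out[i + d2] = vec[i + d2] * cos[i] + vec[i] * sin[i]
    return out

# Angles follow the usual RoPE schedule: theta_i = pos / base^(2i / d).
head_size, pos, base = 4, 0, 10000.0
theta = [pos / base ** (2 * i / head_size) for i in range(head_size // 2)]
cos = [math.cos(t) for t in theta]
sin = [math.sin(t) for t in theta]

q = [1.0, 2.0, 3.0, 4.0]
print(rotary_embed_half(q, cos, sin))  # position 0 leaves the vector unchanged
```

A fused operator performs this rotation for every token position and head of both query and key in a single kernel launch, which is where the performance win over separate per-tensor calls comes from.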
Does this PR introduce any user-facing change?
NA
How was this patch tested?