Adjust KV cache shape for compatibility with updated APIs for graph mode #657

Open
linfeng-yuan wants to merge 1 commit into main from modify_kv_caches_for_graph_mode

linfeng-yuan
Contributor

adapt: modify the KV caches' shape to match the updated kv_rmsnorm_rope_cache and npu_fused_infer_attention_score APIs.

@linfeng-yuan linfeng-yuan force-pushed the modify_kv_caches_for_graph_mode branch from 0d92f3a to ae5c578 Compare April 28, 2025 06:17
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5.5.0 to 5.6.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@linfeng-yuan linfeng-yuan force-pushed the modify_kv_caches_for_graph_mode branch from ae5c578 to 6beab94 Compare April 28, 2025 06:31
@wangxiyuan
Collaborator

need rebase first

@@ -920,7 +920,7 @@ def exec_kv(
 # npu_kv_rmsnorm_rope_cache needs [B, N, S, D]
 kv = kv.view(B, N, S, self.kv_lora_rank + self.qk_rope_head_dim)
 
-k_pe, k_nope = torch.ops.npu_inference.npu_kv_rmsnorm_rope_cache(
+k_pe, k_nope, _, _ = torch.ops.npu_inference.npu_kv_rmsnorm_rope_cache(
Collaborator

Why did this work before? Is the change caused by torchair's update?

Contributor Author

Right. It results from torchair's update in the new version of CANN. I'll submit a new PR containing some new modifications and #683.
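
For readers following the thread, the call-site change in the hunk above can be illustrated with a minimal sketch. It assumes only what the diff shows: the op now yields four values where the old call site unpacked two. The functions `fake_rope_cache_old` and `fake_rope_cache_new` are hypothetical stand-ins, not the real `torch.ops.npu_inference.npu_kv_rmsnorm_rope_cache`, and the contents of the two extra outputs are an assumption.

```python
# Hypothetical stand-ins for an op whose return arity changed between versions.
# These are NOT the real torch.ops.npu_inference.npu_kv_rmsnorm_rope_cache.
def fake_rope_cache_old():
    # Old behaviour assumed from the removed line: two outputs.
    return ("k_pe", "k_nope")


def fake_rope_cache_new():
    # New behaviour assumed from the added line: four outputs.
    # What the two extra outputs hold is unknown here; None is a placeholder.
    return ("k_pe", "k_nope", None, None)


# The old call site works against the old op.
k_pe, k_nope = fake_rope_cache_old()

# The old call site fails against the new op, because Python refuses to
# unpack four values into two names.
try:
    k_pe, k_nope = fake_rope_cache_new()
    old_call_site_error = None
except ValueError as exc:
    old_call_site_error = str(exc)

# The updated call site discards the two extra outputs explicitly.
k_pe, k_nope, _, _ = fake_rope_cache_new()
```

Discarding the extra outputs with `_` keeps the rest of `exec_kv` unchanged, which matches the one-line scope of the diff.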

@linfeng-yuan
Contributor Author

> need rebase first

Sure, I'll submit a new PR containing some new modifications and #683.

@wangxiyuan
Collaborator

need rebase first

github-actions bot commented Jun 3, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

3 participants