Skip to content

Conversation

@charchit7
Copy link
Contributor

Fixes #12538

The original implementation had inconsistent formulas between init and forward:

In init: Dimensions calculated as h_dim = w_dim = 2 * (attention_head_dim // 6)
In forward: Split sizes calculated as attention_head_dim // 3
While mathematically similar, these can produce different results due to integer division, leading to potential tensor dimension mismatches when splitting the rotary embedding buffers.

Solution
Store the computed dimensions (t_dim, h_dim, w_dim) as instance variables in init and reuse them in forward().

Before submitting

Who can review?

@Warvito, @yiyixuxu
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

…s V2 transformers

- Store t_dim, h_dim, w_dim as instance variables in WanRotaryPosEmbed and SkyReelsV2RotaryPosEmbed __init__
- Use stored dimensions in forward() instead of recalculating with different formula
- Fixes inconsistency between init (using // 6) and forward (using // 3)
- Ensures split_sizes matches the dimensions used to create rotary embeddings
Copy link
Collaborator

@yiyixuxu yiyixuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@yiyixuxu yiyixuxu requested a review from dg845 November 5, 2025 19:50
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misallignment in Rotary Frequencies

3 participants