Skip to content

[Transform] Implement multi-headed transforms #383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 99 commits into
base: main
Choose a base branch
from

Conversation

kylesayrs
Copy link
Contributor

@kylesayrs kylesayrs commented Jul 8, 2025

Purpose

  • Support applying transforms to modules involved in multi-headed attention
    • Specifically, the Spinquant R2 transform applied to v_proj and o_proj

Prerequisites

Implementation

In order to better support low-memory representations transform weight as well as size-specific features such as permutations, we led head_dim override the size of the generated transform. Then, at apply-time, we reshape (unflatten) the value to reflect the head_dim, apply the weight, then reshape (flatten) back to the original shape.

Changes

  • Add head_dim field to TransformScheme
    • This field specifies the size of heads in mha. This can be inferred from most model configs
  • Rename get_matrix_size to get_transform_size
    • Support overriding the size of the transform weight with head_dim
  • Rename transform/utils/utils.py to transform/utils/matrix.py

Testing

  • Add tests for head_dim as well as a mock attention operation

kylesayrs added 30 commits May 30, 2025 13:40
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs force-pushed the kylesayrs/transform-attention-head branch from 3cb99d4 to 10fccfe Compare July 9, 2025 15:21
kylesayrs added 4 commits July 9, 2025 13:33
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs force-pushed the kylesayrs/transform-attention-head branch from 10fccfe to 116b9f9 Compare July 9, 2025 18:36
kylesayrs added 5 commits July 9, 2025 15:50
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs marked this pull request as ready for review July 9, 2025 20:27
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Base automatically changed from kylesayrs/transform_apply to main July 9, 2025 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants