[Transform] Spinquant with R1 and R2 #1615

brian-dellabetta · 2025-07-02T19:51:07Z

Prerequisites

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

gemini-code-assist

Summary of Changes

Hello @brian-dellabetta, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the TransformModifier by introducing support for predefined transformation configurations, known as presets. This allows users to easily apply complex transformation schemes like QUIP and SpinQuant, streamlining the process of applying advanced model compression techniques. The changes also include an updated example demonstrating the new functionality and improved validation for the modifier.
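As background for the SpinQuant-style schemes mentioned above, the sketch below illustrates the general idea of folding an orthogonal rotation into a weight matrix: if R is orthonormal, then (xR)(RᵀW) equals xW, so the layer output is unchanged. It is a pure-Python illustration under stated assumptions, not the code in this PR; the helper names (`hadamard`, `matmul`, `transpose`) and the toy values are made up for the example.

```python
import math


def hadamard(n: int) -> list:
    """Sylvester construction of an n x n Hadamard matrix, scaled to be orthonormal."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # H_2k = [[H_k, H_k], [H_k, -H_k]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def matmul(a: list, b: list) -> list:
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a: list) -> list:
    return [list(col) for col in zip(*a)]


# Toy activation (1 x 4) and weight (4 x 2); values are illustrative only.
x = [[0.5, -1.0, 2.0, 0.25]]
w = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5], [-2.0, 1.5]]

r = hadamard(4)
rotated_x = matmul(x, r)             # x R
rotated_w = matmul(transpose(r), w)  # R^T W, the rotation fused into the weight

baseline = matmul(x, w)
rotated = matmul(rotated_x, rotated_w)

# Largest absolute difference between the original and rotated outputs.
max_err = max(abs(p - q) for p, q in zip(baseline[0], rotated[0]))
print(max_err)
```

In double precision the difference is at the level of floating-point rounding. With lower-precision dtypes the error grows, which is consistent with the commit note in this PR that precision makes a large difference.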

Highlights

Enhanced TransformModifier Flexibility: The TransformModifier now accepts either a preset_config string to load predefined transformation schemes (like QUIP or SpinQuant) or a direct config object for custom transformation setups, making it more versatile and user-friendly.
Introduction of Predefined Transformation Presets: New modules have been added under src/llmcompressor/modifiers/transform/presets to define and expose QUIP, QUIP_ONLINE, LLAMA_SPINQUANT, and LLAMA_SPINQUANT_R1R2 configurations. These presets simplify the application of complex transformation strategies based on research papers.
Updated Llama-3 Example: The llama3_example.py script has been revised to showcase the usage of the TransformModifier with a preset_config (specifically LLAMA_SPINQUANT_R1R2) and to use QuantizationModifier instead of GPTQModifier. The example also now uses a smaller Llama model for faster execution and includes a dispatch_for_generation call.
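The "preset or explicit config" behavior described in the highlights can be sketched as follows. This is a minimal standard-library illustration, not the modifier's actual implementation: the class name `TransformArgsSketch`, the placeholder preset contents, and the assumption that exactly one of the two arguments must be given are all hypothetical. Only the preset names come from the summary above.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder registry. The real presets live under
# src/llmcompressor/modifiers/transform/presets and hold full config objects;
# the dict values here are stand-ins for illustration only.
PRESETS = {
    "QUIP": {"name": "QUIP"},
    "QUIP_ONLINE": {"name": "QUIP_ONLINE"},
    "LLAMA_SPINQUANT": {"name": "LLAMA_SPINQUANT"},
    "LLAMA_SPINQUANT_R1R2": {"name": "LLAMA_SPINQUANT_R1R2"},
}


@dataclass
class TransformArgsSketch:
    """Hypothetical stand-in for the modifier's arguments."""

    preset_config: Optional[str] = None
    config: Optional[dict] = None

    def __post_init__(self) -> None:
        # Assumption: exactly one of preset_config / config may be provided.
        if (self.preset_config is None) == (self.config is None):
            raise ValueError("Provide exactly one of preset_config or config")
        if self.preset_config is not None:
            if self.preset_config not in PRESETS:
                raise ValueError(f"Unknown preset: {self.preset_config!r}")
            # Resolve the preset name to its stored configuration.
            self.config = PRESETS[self.preset_config]


args = TransformArgsSketch(preset_config="LLAMA_SPINQUANT_R1R2")
print(args.config)
```

Validating at construction time means an invalid combination fails immediately rather than partway through compression.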

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request significantly enhances the TransformModifier by introducing a robust preset configuration system and improving module targeting. The refactoring to use Pydantic for configuration validation greatly improves maintainability and prevents invalid states. The changes to use regex for module targeting in the presets (spinquant.py and quip.py) are a notable improvement for flexibility and robustness.
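To make the point about regex-based module targeting concrete, here is a small standard-library sketch. The `re:` prefix convention, the `matches` helper, and the module names are assumptions for illustration; they are not copied from spinquant.py or quip.py.

```python
import re


def matches(name: str, target: str) -> bool:
    """Return True if a module name matches a target.

    Assumed convention: a target starting with "re:" is a regular expression
    matched from the start of the name; any other target is an exact name.
    """
    if target.startswith("re:"):
        return re.match(target[3:], name) is not None
    return name == target


# Illustrative Llama-style module names.
module_names = [
    "model.embed_tokens",
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.1.mlp.down_proj",
    "lm_head",
]

targets = ["re:.*down_proj$", "lm_head"]
selected = [n for n in module_names if any(matches(n, t) for t in targets)]
print(selected)
```

A single pattern covers every layer index, so a preset does not need to enumerate modules per layer or know the model depth in advance.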

src/llmcompressor/modifiers/transform/transform.py

brian-dellabetta · 2025-07-15T20:05:27Z

src/llmcompressor/modifiers/transform/spinquant/base.py

+    rotations: List[SpinquantRotation] = Field(
+        default_factory=lambda: ["R1", "R2"], exclude=True
+    )
    transform_type: Literal["hadamard", "random-hadamard", "random-matrix"] = Field(
-        default="hadamard"
+        default="hadamard", exclude=True
    )
-    randomize: bool = Field(default=False)
-    learnable: bool = Field(default=False)
+    randomize: bool = Field(default=False, exclude=True)
+    learnable: bool = Field(default=False, exclude=True)


@kylesayrs Why are we excluding these? Wouldn't we want them to persist in JSON?
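For readers following this thread, the sketch below shows what `exclude=True` does in general. It assumes Pydantic v2 (`model_dump`), and `RotationSettingsSketch` is a hypothetical stand-in rather than the class in spinquant/base.py: an excluded field is still validated and usable on the instance, but it is omitted from serialized output, so a non-default value is not recovered after a dump-and-reload round trip.

```python
from typing import List, Literal

from pydantic import BaseModel, Field


class RotationSettingsSketch(BaseModel):
    """Hypothetical model mirroring the shape of the fields in the diff above."""

    # Excluded from serialization, but still a normal attribute on the instance.
    rotations: List[str] = Field(default_factory=lambda: ["R1", "R2"], exclude=True)
    # Not excluded, so it appears in the dumped output.
    transform_type: Literal["hadamard", "random-hadamard", "random-matrix"] = Field(
        default="hadamard"
    )


settings = RotationSettingsSketch(rotations=["R1"], transform_type="random-hadamard")
dumped = settings.model_dump()
print(dumped)  # only transform_type is present

# Reloading from the dump falls back to the default for the excluded field.
restored = RotationSettingsSketch(**dumped)
print(settings.rotations, restored.rotations)
```

Whether that trade-off is intended here is the question raised above; the sketch only shows the mechanism.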

kylesayrs and others added 8 commits June 23, 2025 19:34

wip

ba617db

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

use random-hadamard, add correctness tests

2f5b1c8

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

add correctness test, note that precision makes a large difference

3aa35e7

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

add on lifecycle methods

b6c088e

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

Merge branch 'main' into kylesayrs/transform-modifier

d1eb2a1

TransformModifier with SpinQuant R1&R2

3207124

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

spinquant and quip_online, running but outputting gibberish

a88ca3c

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

updated example

5bd51df

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

gemini-code-assist bot reviewed Jul 2, 2025

src/llmcompressor/modifiers/transform/transform.py (outdated)

brian-dellabetta and others added 8 commits July 8, 2025 21:29

DummyModel script

3c216dd

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

implement fuse_norm_linears

bbcdc8c

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

Merge branch 'kylesayrs/fuse-helpers' into bdellabe/transform-modifier

bd7f4d5

R1 working

f5c2150

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

add r2, increase precision

dc5c30c

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

spinquant modifier

7172c26

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

remove space

9298e82

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

use iterable

f77226d

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs changed the base branch from kylesayrs/transform-modifier to main July 11, 2025 18:52

kylesayrs added 11 commits July 11, 2025 14:58

add rotation validation

fdb64b5

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

embedding fusion

5daa2d5

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

add missing norm fusion

0e9af7b

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

use norm mappings

fce83be

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

break into separate files

a979f8a

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

small cleanup

4cab29e

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

cleanup

f1cc987

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

more cleanup

a7bb2e2

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

make new weight on cpu

0cf0188

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

standardize, make modifier serializable

53ea307

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

add compress model script

4b4257f

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs added 5 commits July 15, 2025 11:08

use untie_word_embeddings

dc7ac1a

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

style

8542f8d

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

better registery logic

b1e637e

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

remove dummy model test (add later)

b44ac81

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

docstring

7a52b71

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs changed the title from "[Transforms] Modifier Updates" to "[Transform] Spinquant with R1 and R2" Jul 15, 2025

kylesayrs added 2 commits July 15, 2025 15:02

update docstring

f4d7ec6

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

rename example file

f18d0e8

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

brian-dellabetta commented Jul 15, 2025

kylesayrs and others added 4 commits July 16, 2025 11:13

use match_modules_set

cec2914

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

Merge branch 'main' into bdellabe/transform-modifier

f6c797e

unit test fixes

0c5c514

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

style fixes

f2ef7cf

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
