Conversation

Giuseppe5 commented Oct 30, 2025

The goal of this PR is to allow any user to easily change how quantization is applied during the fine-tuning process.
A longer description of the issue can be found in #3521.

By making _prepare_for_qat a staticmethod, it is possible to replace it like so:

def my_quant_func(model, *args, **kwargs):
    # Apply whatever quantization scheme is desired to the model here
    ...
    return model

# Since _prepare_for_qat is a staticmethod, a plain function can replace it
FastLlamaModel._prepare_for_qat = my_quant_func
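
For instance, a fleshed-out `my_quant_func` built on Brevitas might look like the sketch below. This is purely illustrative and not part of this PR: the extra `*args`/`**kwargs` in the signature, the choice of 4-bit weights, and the layer-swapping logic are all assumptions for the example.

```python
import torch.nn as nn
from brevitas.nn import QuantLinear


def my_quant_func(model, *args, **kwargs):
    # Illustrative sketch: swap every nn.Linear in the model for a 4-bit
    # Brevitas QuantLinear, reusing the pretrained parameters.
    for parent in model.modules():
        for name, child in list(parent.named_children()):
            if isinstance(child, nn.Linear) and not isinstance(child, QuantLinear):
                quant_linear = QuantLinear(
                    child.in_features,
                    child.out_features,
                    bias=child.bias is not None,
                    weight_bit_width=4,  # example bit width, not a recommendation
                ).to(device=child.weight.device, dtype=child.weight.dtype)
                # Copy the pretrained weights (and bias, if present) into the new layer
                quant_linear.weight.data.copy_(child.weight.data)
                if child.bias is not None:
                    quant_linear.bias.data.copy_(child.bias.data)
                setattr(parent, name, quant_linear)
    return model
```

Assigning such a function to `FastLlamaModel._prepare_for_qat` as shown above would then route the QAT preparation through Brevitas, modulo the naming caveat discussed next.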

As mentioned in the issue above, even allowing a custom quantization function (and thus a custom quantizer such as Brevitas) does not solve all the issues, since Unsloth makes strong assumptions about the names of the weight and activation quantization functions.

This can be patched without necessarily changing anything within Unsloth (a hypothetical sketch of that kind of patch follows below), but I believe it is worth thinking about a more general implementation of the naming conventions (either in this PR or a follow-up, you tell me).
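
To make that point concrete, here is a purely hypothetical sketch of the kind of external patch meant above: alias each custom weight quantizer under the attribute name the training code expects to find. Both attribute names below are placeholders chosen for the example, not Unsloth's actual internals.

```python
# Hypothetical patch: expose the custom quantizer under the attribute name the
# fine-tuning code looks up, without modifying Unsloth itself.
EXPECTED_NAME = "weight_fake_quant"  # placeholder for the name assumed by the training code
CUSTOM_NAME = "weight_quant"         # placeholder for the attribute on the custom quantized layers


def alias_weight_quantizers(model):
    # Register each custom weight quantizer under the expected attribute name as
    # well, so code relying on the hard-coded name still finds it.
    for module in model.modules():
        quantizer = getattr(module, CUSTOM_NAME, None)
        if quantizer is not None and not hasattr(module, EXPECTED_NAME):
            setattr(module, EXPECTED_NAME, quantizer)
    return model
```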

cc @Datta0
