@dsikka dsikka commented Nov 4, 2025

Summary

  • Add an option to generate MXFP4 scales when calculating qparams, depending on the quantization args
  • Add an option to convert from MXFP4 scales when running QDQ
  • Add preset schemes for MXFP4 and MXFP4A16
  • Update _dequantize to accept qargs
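To illustrate the first bullet, the following is a minimal sketch of MX-style per-group scale generation, not the PR's actual implementation. It assumes the OCP MX convention of one shared power-of-two (E8M0-style) scale per group of 32 elements, chosen from the group's max magnitude; the names `mxfp4_scales`, `FP4_E2M1_MAX`, and `E2M1_EMAX` are hypothetical.

```python
import numpy as np

# Largest magnitude representable in FP4 E2M1 (hypothetical constant name)
FP4_E2M1_MAX = 6.0
# Exponent of the largest FP4 power of two (2**2 = 4)
E2M1_EMAX = 2

def mxfp4_scales(x, group_size: int = 32) -> np.ndarray:
    """Per-group power-of-two shared scales, MX-spec style (sketch).

    Each contiguous group of `group_size` elements gets one scale
    2**(floor(log2(max_abs)) - E2M1_EMAX), so that dividing the group
    by its scale maps the max magnitude into the FP4 E2M1 range.
    """
    groups = np.abs(np.asarray(x, dtype=np.float64)).reshape(-1, group_size)
    # Guard against all-zero groups before taking log2
    max_abs = np.maximum(groups.max(axis=1), np.finfo(np.float64).tiny)
    exp = np.floor(np.log2(max_abs)) - E2M1_EMAX
    # Clip to an 8-bit exponent range (E8M0 assumption)
    return np.exp2(np.clip(exp, -127, 127))
```

For example, a group whose max |x| is 32 gets the scale 2**3 = 8, so the scaled max 32 / 8 = 4 fits under the FP4 limit of 6.0.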

Testing:

  • Need to add e2e tests to verify correct scale generation for a given quant scheme

With this change, we can generate MXFP4 models using RTN.
Sample Model: https://huggingface.co/nm-testing/TinyLlama-1.1B-Chat-v1.0-MXFP4/blob/main/config.json
