@dsikka dsikka commented Nov 4, 2025

Summary

  • Add an option to generate MXFP4 scales when calculating qparams, depending on the quantization args
  • Add an option to convert from MXFP4 scales when running QDQ
  • Add preset schemes for MXFP4 and MXFP4A16
  • Update _dequantize to accept qargs
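To illustrate the first bullet, the following is a minimal sketch of MX-style per-group scale generation, not the PR's actual implementation. It assumes the OCP MX convention of one shared power-of-two (E8M0-style) scale per group of 32 elements, chosen from the group's max magnitude; the names `mxfp4_scales`, `FP4_E2M1_MAX`, and `E2M1_EMAX` are hypothetical.

```python
import numpy as np

# Largest magnitude representable in FP4 E2M1 (hypothetical constant name)
FP4_E2M1_MAX = 6.0
# Exponent of the largest FP4 power of two (2**2 = 4)
E2M1_EMAX = 2

def mxfp4_scales(x, group_size: int = 32) -> np.ndarray:
    """Per-group power-of-two shared scales, MX-spec style (sketch).

    Each contiguous group of `group_size` elements gets one scale
    2**(floor(log2(max_abs)) - E2M1_EMAX), so that dividing the group
    by its scale maps the max magnitude into the FP4 E2M1 range.
    """
    groups = np.abs(np.asarray(x, dtype=np.float64)).reshape(-1, group_size)
    # Guard against all-zero groups before taking log2
    max_abs = np.maximum(groups.max(axis=1), np.finfo(np.float64).tiny)
    exp = np.floor(np.log2(max_abs)) - E2M1_EMAX
    # Clip to an 8-bit exponent range (E8M0 assumption)
    return np.exp2(np.clip(exp, -127, 127))
```

For example, a group whose max |x| is 32 gets the scale 2**3 = 8, so the scaled max 32 / 8 = 4 fits under the FP4 limit of 6.0.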

Testing:

  • Need to add e2e tests to verify correct scale generation for a given quant scheme

With this change, we can generate MXFP4 models using RTN.
Sample Model: https://huggingface.co/nm-testing/TinyLlama-1.1B-Chat-v1.0-MXFP4/blob/main/config.json
