Update MXQuant doc #2309
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
for more information, see https://pre-commit.ci
The sentence "It adapts a granularity falling between per-channel and per-tensor to balance accuracy and memory consumption." in the introduction section does not look right: a block size of 32 is normally smaller than the channel dimension, so the granularity is finer than per-channel rather than between per-channel and per-tensor. @mengniwang95, should we remove this sentence? |
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
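To make the granularity comment above concrete, here is a minimal sketch that counts how many scale factors each scheme needs for one weight matrix. The function name `scale_counts` and the 4096 x 4096 shape are hypothetical and chosen only for illustration; the block size of 32 comes from the review comment.

```python
def scale_counts(out_features, in_features, block_size=32):
    """Return the number of scales for (per-tensor, per-channel, per-block)."""
    per_tensor = 1                       # one scale for the whole matrix
    per_channel = out_features           # one scale per output channel (row)
    # Blocks run along the input dimension; ceil-divide for ragged tails.
    blocks_per_row = -(-in_features // block_size)
    per_block = out_features * blocks_per_row
    return per_tensor, per_channel, per_block


print(scale_counts(4096, 4096))  # (1, 4096, 524288)
```

With a block size of 32 there are more scales than in the per-channel case whenever the input dimension exceeds 32, which is why the block-wise scheme is finer-grained than per-channel rather than sitting between per-channel and per-tensor.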
…essor into kaihui/mx_doc
Also, the formula "The exponent (exp) is equal to torch.floor(torch.log2(amax))" in the introduction section is not right. According to the recipe document, the formula is clamp(floor(log2(amax)) - maxExp, -127, 127), where maxExp is the exponent of the largest power of two representable in the element data type; e.g., for FP8 E4M3 elements maxExp is 8, and for FP4 E2M1 elements maxExp is 2. @mengniwang95, please double-check whether this is the default option used in auto-round. |
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Yes, you are right. |
clamp(floor(log2(amax)) - maxExp, -127, 127) is used in autoround |
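For reference, a minimal pure-Python sketch of the shared-exponent formula discussed above. The helper name `shared_exponent`, the `MAX_EXP` table keys, and the handling of an all-zero block are assumptions made for illustration; this is not the auto-round implementation, only the arithmetic from the formula with the two maxExp values quoted in this thread.

```python
import math

# maxExp values as quoted in the review comment (keys are hypothetical labels).
MAX_EXP = {"fp8_e4m3": 8, "fp4_e2m1": 2}


def shared_exponent(block, elem_dtype):
    """clamp(floor(log2(amax)) - maxExp, -127, 127) for one block of values."""
    amax = max(abs(x) for x in block)
    if amax == 0:
        # Assumption: log2(0) is undefined, so fall back to the lower clamp bound.
        return -127
    exp = math.floor(math.log2(amax)) - MAX_EXP[elem_dtype]
    return max(-127, min(127, exp))


block = [0.5, -3.0, 1.25]                  # amax = 3.0, floor(log2(3.0)) = 1
print(shared_exponent(block, "fp4_e2m1"))  # -1
print(shared_exponent(block, "fp8_e4m3"))  # -7
```

Subtracting maxExp shifts the block scale so that the largest element lands near the top of the element format's range, which the uncorrected floor(log2(amax)) does not do.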
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
@Kaihui-intel, please help update the formula as well.
|
User description
Type of Change
documentation
Description
Transfer the MXQuant doc to the AutoRound quantization API.
Expected Behavior & Potential Risk
the expected behavior that triggered by this PR
How has this PR been tested?
how to reproduce the test (including hardware information)
Dependency Change?
any library dependency introduced or removed
PR Type
Documentation, Enhancement
Description
Updated documentation for AutoRound Quantization API
Added example using Hugging Face models
Included code snippet for model quantization and inference
File Walkthrough
PT_MXQuant.md
Updated to AutoRound Quantization API
docs/source/3x/PT_MXQuant.md