Official code for the ACM MM 2025 paper MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization (arXiv)
🔥🔥🔥 Welcome any PR or development for MQuant!
2025.08.07: 🔥🔥🔥 MQuant for Qwen2-VL has been released. Looking forward to your feedback!
2025.08.05: 🔥🔥🔥 MQuant for Intern-VL2 and MiniCPM-V has been released.
2025.08.04: 🔥🔥🔥 MQuant for Qwen-VL has been released.
2025.07.06: 🌟🌟🌟 MQuant has been accepted by ACM MM 2025. 🎉 Cheers!
- release the quantization code for Qwen2-VL
- release the quantization code for Intern-VL2, MiniCPM-V
- release the quantization code for Qwen-VL
- release the core code after the paper is accepted
- update acknowledgement
- release the paper link
- MQuant is the first full static quantization solution for multimodal large language models, applicable to 5 mainstream MLLMs.
- MQuant proposes Modality-Specific Static Quantization (MSQ) to significantly reduce Time-to-First-Token (TTFT) and Rotation Magnitude Suppression (RMS) to mitigate weight outliers.
- MQuant achieves near-floating-point accuracy (<1% degradation) while reducing inference latency by up to 30% on 5 mainstream MLLMs (Qwen-VL/Intern-VL/Qwen2-VL/GLM-4V/MiniCPM-V) under the W4A8 setting.
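The core idea behind MSQ above is that visual and text tokens have different activation ranges, so each modality gets its own pre-computed (static) scale instead of one shared dynamic scale. Below is a minimal NumPy sketch of that idea; the function names and calibration data are hypothetical illustrations, not the repo's actual API.

```python
import numpy as np

def static_scale(calib_acts, n_bits=8):
    """Pre-compute a static symmetric scale from calibration activations."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.abs(calib_acts).max() / qmax

def quantize(x, scale, n_bits=8):
    """Symmetric static quantization: no per-token scale search at inference."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# Hypothetical calibration sets: visual tokens have a much wider range.
text_calib = rng.normal(0, 1.0, (512, 64)).astype(np.float32)
vis_calib = rng.normal(0, 4.0, (512, 64)).astype(np.float32)

# Modality-specific static scales, computed once offline.
text_scale = static_scale(text_calib)
vis_scale = static_scale(vis_calib)

# At inference, each token reuses the static scale of its own modality.
x_text = rng.normal(0, 1.0, 64).astype(np.float32)
x_text_hat = dequantize(quantize(x_text, text_scale), text_scale)
```

A shared scale would be dominated by visual outliers, coarsening the grid for text tokens; keeping per-modality scales preserves their resolution while staying fully static.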
Any questions or suggestions are welcome! Jiangyong Yu jiangyongyufocus@gmail.com, Sifan Zhou sifanjay@gmail.com, Dawei Yang dawei.yang@houmo.ai.
Our implementation is based on QuaRot, GPTQ, and VLMEvalKit. Thanks for the great open-source work!
If you think our paper or code is helpful, please consider citing our work.
@inproceedings{yu2025mquant,
title={MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization},
author={JiangYong Yu and Sifan Zhou and Dawei Yang and Shuo Wang and Shuoyu Li and Xing Hu and Chen Xu and Zukang Xu and Changyong Shu and Zhihang Yuan},
booktitle={Proceedings of the 33rd ACM International Conference on Multimedia (MM '25)},
year={2025}
}
MQuant is released under the MIT license (see LICENSE).