AQLM: Extreme Compression of Large Language Models via Additive Quantization #5984
joseph777111 started this conversation in Ideas · Replies: 1 comment
- New boss in town: AQLM + PV tuning. See https://huggingface.co/ISTA-DASLab/Mistral-7B-v0.1-AQLM-PV-2Bit-1x16-hf (2.51 GB for a 7B model). Immensely useful for edge deployment.
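For context, a minimal sketch of how one might try the 2-bit AQLM+PV checkpoint linked above through Hugging Face transformers' AQLM integration. This assumes `pip install transformers accelerate aqlm[gpu]` and a CUDA GPU; exact package and version requirements may differ, so treat it as a starting point rather than a verified recipe.

```python
# Sketch: load the AQLM+PV 2-bit Mistral-7B checkpoint and run a quick generation.
# Assumes: transformers with AQLM support, accelerate, and aqlm[gpu] are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ISTA-DASLab/Mistral-7B-v0.1-AQLM-PV-2Bit-1x16-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place layers on the available GPU(s)
)

# Smoke test: short prompt, short completion.
inputs = tokenizer("Additive quantization compresses LLMs by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```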
- Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization".
  Paper: https://arxiv.org/pdf/2401.06118.pdf
  Code: https://github.com/Vahe1994/AQLM
  Thoughts?