Replies: 2 comments 9 replies
-
See: https://github.com/vllm-project/vllm/blob/main/examples/lora_with_quantization_inference.py
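For reference, here is a minimal offline-inference sketch of the pattern that example demonstrates: a quantized base model with a LoRA adapter attached per request. The model name and adapter path below are placeholders, not values taken from the linked example.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Quantized base model (AWQ here) with LoRA support enabled.
# The model name and adapter path are placeholders.
llm = LLM(
    model="TheBloke/Llama-2-7B-AWQ",
    quantization="awq",
    enable_lora=True,
    max_lora_rank=64,
)

sampling = SamplingParams(temperature=0.0, max_tokens=64)

# The adapter is attached per request via LoRARequest(name, id, path).
outputs = llm.generate(
    ["Explain LoRA fine-tuning in one sentence."],
    sampling,
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```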
-
@tarukumar Feel free to open an issue at https://github.com/bd-iaas-us/vllm/issues and assign it to me. In the issue, please let us know why the existing Quantization + LoRA solution in vLLM does not suffice, as well as some models that need this feature. Thanks!
-
What I have observed is that when I try to deploy the model using qlora_adapter_name_or_path for a QLoRA adapter, deployment fails with the error raised at https://github.com/vllm-project/vllm/blob/main/vllm/engine/arg_utils.py#L899-L911. So, to deploy a QLoRA adapter, should I use the --lora-modules or the adapter-cache parameter? What is the best approach here?
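For context: one approach that sidesteps qlora_adapter_name_or_path is to load the base model with bitsandbytes quantization and attach the QLoRA adapter as a regular LoRA adapter per request, which is the offline equivalent of serving with --enable-lora and --lora-modules name=path. A minimal sketch, assuming the adapter is a standard PEFT-style LoRA checkpoint and that your vLLM version supports LoRA together with bitsandbytes; the base model name and adapter path are placeholders:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load the base model 4-bit-quantized with bitsandbytes (QLoRA-style),
# then pass the adapter at request time instead of via
# qlora_adapter_name_or_path. Model name and adapter path are placeholders.
llm = LLM(
    model="huggyllama/llama-7b",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enable_lora=True,
)

sampling = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(
    ["Summarize QLoRA in one sentence."],
    sampling,
    lora_request=LoRARequest("qlora_adapter", 1, "/path/to/qlora_adapter"),
)
print(outputs[0].outputs[0].text)
```

The same configuration should map onto the OpenAI-compatible server as --quantization bitsandbytes --enable-lora --lora-modules qlora_adapter=/path/to/qlora_adapter, again assuming LoRA with bitsandbytes is supported in your vLLM version.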