Hi @ChenMnZ,
I am seeking clarification and guidance on the process of quantizing the LLaVA 1.6 model using the EfficientQAT repository. Specifically, I would like to confirm the steps involved and understand which components are fine-tuned at each stage of the process.
Queries
- Applying Block-AP on the LLM: Is the initial step to apply Block-AP quantization to the LLM? If so, are there any specific datasets, considerations, or configurations required during this step? (Sketch 1 after this list shows my current understanding.)
- Freezing the LLM and Vision Transformer (ViT), and Training the Projector: After obtaining the Block-AP-quantized LLM, the next step appears to be freezing both the LLM and the ViT while training only the projector. Could you point me to where the projector training should be performed? Any relevant scripts or functions would be helpful for implementing this step effectively. (Sketch 2 below reflects how I would set this up.)
- End-to-End Fine-Tuning of the LLM and Projector: During the end-to-end fine-tuning stage, do we:
a. fine-tune only the quantization scales of the LLM?
b. fine-tune the weights of the projector?
Are there any additional parameters or components involved in this stage that I might be missing? (Sketch 3 below captures my current assumption.)
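To keep the first question concrete, here is a minimal sketch of how I currently picture the Block-AP stage: each transformer block is trained, with both its weights and its quantization parameters learnable, to reproduce the output of its full-precision counterpart on a small calibration batch. Everything below (the `FakeQuantLinear` module, `block_ap` function, bit-width, group size) is a placeholder of mine, not the repository's actual API, so please correct anything that does not match.

```python
# Sketch 1 -- my mental model of Block-AP (placeholder code, not EfficientQAT's API).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantLinear(nn.Module):
    """Linear layer with per-group weight fake-quantization and learnable scales."""

    def __init__(self, linear: nn.Linear, n_bits: int = 2, group_size: int = 64):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach().clone())
        self.bias = nn.Parameter(linear.bias.detach().clone()) if linear.bias is not None else None
        self.n_bits, self.group_size = n_bits, group_size
        w = self.weight.detach().view(-1, group_size)
        # One learnable scale per group, initialized from the group's weight range.
        self.scale = nn.Parameter(w.abs().amax(dim=1, keepdim=True) / (2 ** (n_bits - 1) - 1) + 1e-8)

    def forward(self, x):
        qmax = 2 ** (self.n_bits - 1) - 1
        w = self.weight.view(-1, self.group_size) / self.scale
        # Straight-through estimator for rounding; clamp keeps the integer range.
        w = torch.clamp(w + (torch.round(w) - w).detach(), -qmax - 1, qmax)
        w_q = (w * self.scale).view_as(self.weight)
        return F.linear(x, w_q, self.bias)


def quantize_linears(module: nn.Module):
    """Recursively swap nn.Linear layers for fake-quantized ones."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, FakeQuantLinear(child))
        else:
            quantize_linears(child)


def block_ap(block: nn.Module, calib_inputs: torch.Tensor, steps: int = 100, lr: float = 1e-4):
    """Train one quantized block (weights + scales) to match its full-precision output."""
    fp_block = copy.deepcopy(block).eval()
    with torch.no_grad():
        target = fp_block(calib_inputs)
    quantize_linears(block)
    opt = torch.optim.AdamW(block.parameters(), lr=lr)  # all parameters, incl. scales
    for _ in range(steps):
        loss = F.mse_loss(block(calib_inputs), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return block


# Toy usage: a stand-in "transformer block" and random calibration activations.
toy_block = nn.Sequential(nn.Linear(256, 512), nn.GELU(), nn.Linear(512, 256))
block_ap(toy_block, torch.randn(8, 16, 256))
```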
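For the second point, this is how I would freeze the quantized LLM and the ViT and leave only the projector trainable. The attribute path `model.get_model().mm_projector` is what the upstream LLaVA code uses; I am assuming the integration in this repository keeps the same name.

```python
# Sketch 2 -- freeze everything, then unfreeze only the multimodal projector.
# `get_model().mm_projector` follows upstream LLaVA naming; this is an assumption.
import torch


def make_projector_only_trainable(model):
    for param in model.parameters():
        param.requires_grad = False
    for param in model.get_model().mm_projector.parameters():
        param.requires_grad = True
    trainable = [p for p in model.parameters() if p.requires_grad]
    print(f"trainable tensors: {len(trainable)}")
    return trainable


# optimizer = torch.optim.AdamW(make_projector_only_trainable(model), lr=1e-3)
```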
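And for the end-to-end stage, my assumption is that the trainable set is just the LLM's quantization step sizes (scales) plus the projector weights, which I would select roughly as below. The substring filters (`"scale"`, `"mm_projector"`) are guesses at how these parameters are named in the codebase, so please correct me if it is done differently.

```python
# Sketch 3 -- end-to-end stage: train only quantization scales + projector weights.
# The name filters are assumptions about how the parameters are registered.
import torch


def select_e2e_trainable(model):
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = ("scale" in name) or ("mm_projector" in name)
        if param.requires_grad:
            trainable.append(param)
    return trainable


# optimizer = torch.optim.AdamW(select_e2e_trainable(model), lr=2e-5)
```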
Could you please clarify the above queries and confirm whether the outlined process matches the intended approach for quantizing LLaVA 1.6? I would also appreciate pointers to specific code references or best practices for implementing the training and fine-tuning stages.
Looking forward to your insights! Thank you in advance for your support.