Hi @ChenMnZ,
I am seeking clarification and guidance on the process of quantizing the LLaVA 1.6 model using the EfficientQAT repository. Specifically, I would like to confirm the steps involved and understand which components are fine-tuned at each stage of the process.
Queries
- Applying Block-AP on the LLM: Is the initial step to apply Block-AP quantization to the LLM? If so, are there any specific datasets, considerations, or configurations required during this step? (Sketch 1 after this list shows my current understanding.)
- Freezing the LLM and Vision Transformer (ViT), and Training the Projector: After obtaining the Block-AP-quantized LLM, the next step appears to be freezing both the LLM and the ViT while training only the projector. Could you point me to where the projector training should be performed? Any relevant scripts or functions would be helpful for implementing this step effectively. (Sketch 2 below reflects how I would set this up.)
- End-to-End Fine-Tuning of the LLM and Projector: During the end-to-end fine-tuning stage, do we:
a. fine-tune only the quantization scales of the LLM?
b. fine-tune the weights of the projector?
Are there any additional parameters or components involved in this stage that I might be missing? (Sketch 3 below captures my current assumption.)
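To keep the first question concrete, here is a minimal sketch of how I currently picture the Block-AP stage: each transformer block is trained, with both its weights and its quantization parameters learnable, to reproduce the output of its full-precision counterpart on a small calibration batch. Everything below (the `FakeQuantLinear` module, `block_ap` function, bit-width, group size) is a placeholder of mine, not the repository's actual API, so please correct anything that does not match.

```python
# Sketch 1 -- my mental model of Block-AP (placeholder code, not EfficientQAT's API).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantLinear(nn.Module):
    """Linear layer with per-group weight fake-quantization and learnable scales."""

    def __init__(self, linear: nn.Linear, n_bits: int = 2, group_size: int = 64):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach().clone())
        self.bias = nn.Parameter(linear.bias.detach().clone()) if linear.bias is not None else None
        self.n_bits, self.group_size = n_bits, group_size
        w = self.weight.detach().view(-1, group_size)
        # One learnable scale per group, initialized from the group's weight range.
        self.scale = nn.Parameter(w.abs().amax(dim=1, keepdim=True) / (2 ** (n_bits - 1) - 1) + 1e-8)

    def forward(self, x):
        qmax = 2 ** (self.n_bits - 1) - 1
        w = self.weight.view(-1, self.group_size) / self.scale
        # Straight-through estimator for rounding; clamp keeps the integer range.
        w = torch.clamp(w + (torch.round(w) - w).detach(), -qmax - 1, qmax)
        w_q = (w * self.scale).view_as(self.weight)
        return F.linear(x, w_q, self.bias)


def quantize_linears(module: nn.Module):
    """Recursively swap nn.Linear layers for fake-quantized ones."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, FakeQuantLinear(child))
        else:
            quantize_linears(child)


def block_ap(block: nn.Module, calib_inputs: torch.Tensor, steps: int = 100, lr: float = 1e-4):
    """Train one quantized block (weights + scales) to match its full-precision output."""
    fp_block = copy.deepcopy(block).eval()
    with torch.no_grad():
        target = fp_block(calib_inputs)
    quantize_linears(block)
    opt = torch.optim.AdamW(block.parameters(), lr=lr)  # all parameters, incl. scales
    for _ in range(steps):
        loss = F.mse_loss(block(calib_inputs), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return block


# Toy usage: a stand-in "transformer block" and random calibration activations.
toy_block = nn.Sequential(nn.Linear(256, 512), nn.GELU(), nn.Linear(512, 256))
block_ap(toy_block, torch.randn(8, 16, 256))
```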
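For the second point, this is how I would freeze the quantized LLM and the ViT and leave only the projector trainable. The attribute path `model.get_model().mm_projector` is what the upstream LLaVA code uses; I am assuming the integration in this repository keeps the same name.

```python
# Sketch 2 -- freeze everything, then unfreeze only the multimodal projector.
# `get_model().mm_projector` follows upstream LLaVA naming; this is an assumption.
import torch


def make_projector_only_trainable(model):
    for param in model.parameters():
        param.requires_grad = False
    for param in model.get_model().mm_projector.parameters():
        param.requires_grad = True
    trainable = [p for p in model.parameters() if p.requires_grad]
    print(f"trainable tensors: {len(trainable)}")
    return trainable


# optimizer = torch.optim.AdamW(make_projector_only_trainable(model), lr=1e-3)
```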
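And for the end-to-end stage, my assumption is that the trainable set is just the LLM's quantization step sizes (scales) plus the projector weights, which I would select roughly as below. The substring filters (`"scale"`, `"mm_projector"`) are guesses at how these parameters are named in the codebase, so please correct me if it is done differently.

```python
# Sketch 3 -- end-to-end stage: train only quantization scales + projector weights.
# The name filters are assumptions about how the parameters are registered.
import torch


def select_e2e_trainable(model):
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = ("scale" in name) or ("mm_projector" in name)
        if param.requires_grad:
            trainable.append(param)
    return trainable


# optimizer = torch.optim.AdamW(select_e2e_trainable(model), lr=2e-5)
```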
Could you please clarify the above queries and confirm whether the outlined process matches the intended approach for quantizing LLaVA 1.6? I would also appreciate pointers to specific code references or best practices for implementing the training and fine-tuning stages.
Looking forward to your insights! Thank you in advance for your support.