Question about the pre-alignment training strategy #43

@TitleZ99

Description

Hello. This is very good work and a comprehensive analysis of how to design a hybrid visual encoder; it gave me a lot of inspiration. After reading the paper, I have some questions about the pre-alignment training:

1. What is the difference between the pre-alignment training in stage one and the pre-training in stage two? Both stages unfreeze the visual encoders and the projectors, and both are supervised with the same autoregressive next-token-prediction loss.
2. For pre-alignment training, why introduce an additional, smaller LLM (Vicuna-7B in practice) for alignment instead of using the model's original LLM, given that both are 7B models? The purpose of this step is not clear to me.

Looking forward to your reply!
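To make the question concrete, here is a minimal sketch of the two stages as I understand them. The component names, the identical trainable set, and the assumption that the helper LLM is frozen are my own reading of the paper, not the authors' confirmed configuration:

```python
def stage_config(stage: str) -> dict:
    """Sketch of which components train in each stage (my assumptions).

    Stage 1 (pre-alignment): each vision encoder and its projector are
    trained against a smaller helper LLM (Vicuna-7B), assumed frozen.
    Stage 2 (pre-training): the encoders and projectors are unfrozen
    again, now attached to the model's own LLM.
    """
    if stage == "pre_alignment":
        return {
            "llm": "Vicuna-7B",            # smaller helper LLM (assumed frozen)
            "trainable": {"vision_encoder", "projector"},
            "loss": "next_token_prediction",
        }
    if stage == "pre_training":
        return {
            "llm": "original-7B-LLM",      # the model's own LLM
            "trainable": {"vision_encoder", "projector"},
            "loss": "next_token_prediction",
        }
    raise ValueError(f"unknown stage: {stage}")


s1 = stage_config("pre_alignment")
s2 = stage_config("pre_training")
# In this sketch the trainable set and the loss are identical across
# stages; the only visible difference is which LLM supervises the
# encoders -- which is exactly what my two questions are about.
print(s1["trainable"] == s2["trainable"])   # True
print(s1["llm"], "->", s2["llm"])
```

If this sketch is wrong about what is frozen in either stage, that correction would already answer most of my question.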
