Inquiry: Fusion Encoder in Downstream Task #8

Open
JungMinKyun opened this issue Apr 11, 2025 · 0 comments

First of all, thank you very much for your research; I'm really interested in this work.

I have a question regarding the "Fusion Encoder" component of the downstream task,
specifically in the context of the linear probe evaluation.

My understanding is that, for the linear probe, you freeze the pre-trained encoder,
attach a classifier head on top, and then perform classification.
In the code, I see that you load pre-trained weights for the Image and Audio encoders, but for the Fusion Encoder you initialize it randomly, freeze it, and then train only the classifier.

I wonder whether it would make sense to load pre-trained weights for the Fusion Encoder as well, or at least to make it trainable when it is randomly initialized.
(If I am misunderstanding something here, please tell me.)
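
To make the question concrete, here is a minimal PyTorch sketch of the setup as I understand it, with the two alternatives I am asking about. Everything in it is a hypothetical stand-in rather than code from this repository: `TinyEncoder`, `FusionEncoder`, `build_probe`, `fusion_state`, `train_fusion`, and all dimensions are my own assumptions for illustration.

```python
import torch
import torch.nn as nn


# Hypothetical stand-ins; the repository's real encoders are different.
class TinyEncoder(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return self.proj(x)


class FusionEncoder(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.mix = nn.Linear(2 * dim, dim)

    def forward(self, img_feat, aud_feat):
        # Concatenate the two modality features and mix them.
        return self.mix(torch.cat([img_feat, aud_feat], dim=-1))


def set_trainable(module, trainable):
    for p in module.parameters():
        p.requires_grad = trainable


def build_probe(fusion_state=None, train_fusion=False, dim=16, num_classes=4):
    modules = {
        "image": TinyEncoder(32, dim),
        "audio": TinyEncoder(24, dim),
        "fusion": FusionEncoder(dim),
        "head": nn.Linear(dim, num_classes),
    }
    # Image and audio encoders: pre-trained weights would be loaded here,
    # and both stay frozen for the linear probe.
    set_trainable(modules["image"], False)
    set_trainable(modules["audio"], False)

    # Alternative A: also load pre-trained fusion weights, if available.
    if fusion_state is not None:
        modules["fusion"].load_state_dict(fusion_state)

    # Alternative B: train_fusion=True keeps a randomly initialized fusion
    # encoder learnable. The default (False) is the setup I see in the code:
    # random initialization, frozen.
    set_trainable(modules["fusion"], train_fusion)

    # Only parameters that still require gradients go to the optimizer.
    trainable = [p for m in modules.values()
                 for p in m.parameters() if p.requires_grad]
    return modules, trainable


def forward(modules, image, audio):
    fused = modules["fusion"](modules["image"](image), modules["audio"](audio))
    return modules["head"](fused)
```

With the defaults, only the classifier head receives gradients, so its input comes from a fusion layer that was never trained. That is the case I am unsure about.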

Thank you for your time. I would greatly appreciate any clarification or correction.
