Hi, thank you for your great work and detailed explanation!
I’m currently working on a project based on your work and have a few questions.
I noticed a similar discussion in issue #49, but I’d like to follow up with a more specific question.
I’m curious about your decision to apply LoRA only to attn1 in each transformer block and to the convolutional layers (roughly the selection sketched below), rather than to all attention layers as is typically done in a PEFT configuration.
Was this choice based on qualitative observations, or are there other considerations we might be overlooking?
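To make the question concrete, here is a rough sketch of the kind of restricted target selection I mean, using PEFT's `LoraConfig` with a regex over diffusers module names. The model id, rank, and exact regex are illustrative assumptions on my part, not your actual configuration:

```python
# Minimal sketch (my own illustration, not this repository's code) of
# restricting LoRA to the attn1 projections and ResNet conv layers of a
# diffusers UNet via PEFT.
from diffusers import UNet2DConditionModel
from peft import LoraConfig, inject_adapter_in_model

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"  # assumed base model
)

# Full-match regex over module names: only the self-attention (attn1)
# q/k/v/out projections and the resnet conv1/conv2 layers get LoRA adapters;
# cross-attention (attn2) is deliberately left untouched.
lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=r".*attn1\.(to_q|to_k|to_v|to_out\.0)|.*resnets\.\d+\.(conv1|conv2)",
    lora_dropout=0.0,
)

unet = inject_adapter_in_model(lora_config, unet)

# Sanity check: only the injected LoRA parameters should be trainable.
trainable = sum(p.numel() for p in unet.parameters() if p.requires_grad)
total = sum(p.numel() for p in unet.parameters())
print(f"trainable: {trainable:,} / {total:,}")
```

In a plain PEFT setup one would usually target to_q/to_k/to_v/to_out across both attn1 and attn2 instead, which is the contrast I'm asking about.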
Thanks again!