-
Good idea! Btw, shouldn't we implement a LoRA extractor in
-
Small update on this: I've been able to convert the diff between the 2 models into a LoRA adapter: https://huggingface.co/ngxson/LoRA-Qwen2.5-Coder-7B-Instruct I haven't tested with infill, will try in a few days. But in the meantime, we also need #11131 to be merged, so LoRA for token embeddings will be supported.
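For context, the core of such a conversion is a truncated SVD of the weight delta between the fine-tuned and base checkpoints. Below is a minimal sketch of the technique, not mergekit's actual code; the `extract_lora` helper, the `rank` value, and the toy tensors are all illustrative:

```python
import torch

def extract_lora(w_base: torch.Tensor, w_ft: torch.Tensor, rank: int = 32):
    """Approximate (w_ft - w_base) with a rank-`rank` LoRA pair (B @ A)."""
    delta = (w_ft - w_base).float()
    # Truncated SVD of the weight delta.
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]
    # Fold the singular values into both factors symmetrically.
    sqrt_s = torch.diag(s.sqrt())
    lora_b = u @ sqrt_s   # shape: (out_features, rank)
    lora_a = sqrt_s @ vh  # shape: (rank, in_features)
    return lora_a, lora_b

# Toy check: the low-rank product should approximate the delta.
base = torch.randn(512, 512)
ft = base + torch.randn(512, 128) @ torch.randn(128, 512) * 0.01
a, b = extract_lora(base, ft, rank=128)
print(torch.dist(ft - base, b @ a))  # small residual
```

A real extractor applies this per weight matrix, and the choice of rank and which modules to include matters a lot for adapter quality.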
-
The author of the Qwen model confirmed that infill capability is only possible with Qwen-Coder (non-Instruct): https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/discussions/2#6731a45e0e39be0605a0df20
This limits that model to `/infill` only, so it cannot be used with `/chat/completions`.
However, we know that the Instruct version is indeed fine-tuned from the non-Instruct one; see the technical report: https://arxiv.org/pdf/2409.12186
To make the model usable with both chat and infill, one solution is to extract the difference between the 2 models into a LoRA adapter. This can be done via something like `mergekit-extract-lora`; we can then set the LoRA scale at runtime (i.e. 0.0 for infill, 1.0 for chat).
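As a rough sketch of what the runtime side could look like against llama-server, assuming the adapter is loaded with `--lora` and that the server build supports a per-request `lora` field (the exact payload shape may differ between versions):

```python
import requests

SERVER = "http://localhost:8080"  # llama-server started with: --lora extracted-adapter.gguf

def infill(prefix: str, suffix: str) -> str:
    # Disable the instruct adapter so the model behaves like the base coder model.
    r = requests.post(f"{SERVER}/infill", json={
        "input_prefix": prefix,
        "input_suffix": suffix,
        "lora": [{"id": 0, "scale": 0.0}],
    })
    return r.json()["content"]

def chat(prompt: str) -> str:
    # Enable the instruct adapter at full strength for chat.
    r = requests.post(f"{SERVER}/v1/chat/completions", json={
        "messages": [{"role": "user", "content": prompt}],
        "lora": [{"id": 0, "scale": 1.0}],
    })
    return r.json()["choices"][0]["message"]["content"]
```

This way a single loaded model serves both roles, with the adapter toggled per request instead of swapping checkpoints.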