Why was the LoRA option removed from exllamav2 #1267
psych0v0yager started this conversation in Feature requests
Replies: 1 comment
-
I don't remember why this was changed, so this might have been a mistake. We are happy to support these changes!
-
I previously added the ability to swap LoRAs in exllamav2, but it is not present in the current version. What was the reason for this change, and what improvements could I make to add LoRAs back to exllama?
EDIT: I noticed exllama refactored their LoRA system. exllamav2 now uses generator.set_loras(lora) to attach a LoRA, rather than the old generate_simple(prompt_, settings, max_new_tokens, loras = lora_). I can add this feature back if Outlines is willing to support it. One caveat: LoRAs cannot be swapped while a dynamic job is in progress. However, if no jobs are running, it should be a fairly simple swap.
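The swap guard described above could be sketched roughly as follows. This is a hypothetical illustration, not verified exllamav2 API: the jobs attribute and the swap_lora helper are assumptions, and a minimal stub stands in for the real dynamic generator so the logic is self-contained.

```python
# Hypothetical sketch of the "no swap while a dynamic job is active" rule.
# StubDynamicGenerator is a stand-in for exllamav2's dynamic generator;
# only set_loras(lora) comes from the discussion above, the rest is assumed.

class StubDynamicGenerator:
    """Minimal stand-in for a dynamic generator with job tracking."""

    def __init__(self):
        self.jobs = []      # active dynamic jobs (assumed attribute)
        self.loras = None   # currently attached LoRA(s)

    def set_loras(self, lora):
        self.loras = lora


def swap_lora(generator, lora):
    """Attach a new LoRA, refusing while dynamic jobs are running."""
    if generator.jobs:
        raise RuntimeError("cannot swap LoRAs while dynamic jobs are active")
    generator.set_loras(lora)


gen = StubDynamicGenerator()
swap_lora(gen, "my-lora")          # no jobs running: swap succeeds
print(gen.loras)                   # -> my-lora

gen.jobs.append("job-1")
try:
    swap_lora(gen, "other-lora")   # active job: swap is refused
except RuntimeError as err:
    print(err)
```

The point of the guard is that a LoRA swap changes the weights mid-stream, so any in-flight job would see inconsistent state; checking for active jobs first keeps the swap safe and simple.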
EDIT 2: exllamav2 also has a new Q6 cache. I can add support for this as well.
Please let me know if you approve of these changes.