LoRA fine tuning on quantized models is the future. How do I expand finetuning.cpp to more models? #5131
MotorCityCobra started this conversation in General
Does finetuning.cpp (finetuning.exe for me) only work on actual LLaMA models? Today I took a look inside finetuning.cpp and ggml.h and, oh my god, what even is this stuff? Did anyone else here figure it out? How long did it take you? I know PyTorch and LibTorch, but not whatever is happening in here.
What would need to be done to add LoRA fine-tuning for Mistral models, or anything else besides LLaMA models? Mistral 7B specifically.
All the value and freedom will come from fine-tuned models. From what I understand, LoRA is the best way to fine-tune, and LoRA on a quantized model is better than QLoRA on a non-quantized model (my rough mental model of the LoRA update is sketched below).
Quantized models fit on smaller GPUs, so LoRA fine-tuning should be available for any model.
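For reference, this is my understanding of the standard LoRA update; rank r and scaling factor alpha are the usual hyperparameters, and the exact scaling convention can differ between implementations:

$$W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)$$

The base weights W_0 stay frozen (and can stay quantized); only the small A and B matrices are trained, which is why this fits on modest GPUs.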
This logic is a mouthful to string together, but why isn't everyone working on this? We're going to have GPT-4-level models on millions of computers running 50-series cards this year. Models able to learn new things. This is way beyond Gutenberg or the semiconductor.
So, what kind of baby steps should I even take to start learning how to use what's inside ggml.h and finetuning.cpp?
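For anyone else starting from zero, here is my rough mental model of the ggml workflow as a minimal sketch, not a definitive recipe. I'm assuming the ggml API as of around the time of writing (names like ggml_new_graph and ggml_graph_compute_with_ctx may have shifted since): you allocate tensors from a ggml_context, describe the computation as a graph of ops, then execute the graph.

```c
#include "ggml.h"
#include <stdio.h>

int main(void) {
    // every tensor lives inside a context backed by one big pre-allocated buffer
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // a: 4x3, b: 4x2 (ggml_mul_mat contracts over the first dimension)
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);
    struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 2);
    ggml_set_f32(a, 1.0f);  // fill with dummy values
    ggml_set_f32(b, 2.0f);

    // this only records a node in the graph; nothing is computed yet
    struct ggml_tensor * c = ggml_mul_mat(ctx, a, b);  // result is 3x2

    // build the forward graph ending at c, then actually run it
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, c);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);

    printf("c[0,0] = %f\n", ggml_get_f32_1d(c, 0));

    ggml_free(ctx);
    return 0;
}
```

finetuning.cpp is basically this pattern scaled up: the model weights become graph inputs, the loss becomes the graph output, and a backward graph is built on top for the LoRA parameters.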