LoRA fine tuning on quantized models is the future. How do I expand finetuning.cpp to more models? #5131
MotorCityCobra started this conversation in General
Does finetuning.cpp (finetuning.exe for me) only work on actual LLaMA models? Today I took a look inside finetuning.cpp and ggml.h and, oh my god, what even is this stuff? Did anyone else here figure it out? How long did it take you? I know PyTorch and LibTorch, but not whatever is happening in here.
What would need to be done to add LoRA fine-tuning for Mistral models, or anything else besides LLaMA models? Mistral 7B specifically.
All the value and freedom will come from fine-tuned models. From what I understand, LoRA is the best way to fine-tune, and LoRA on a quantized model is better than QLoRA on a non-quantized model (my rough mental model of the LoRA update is sketched below).
Quantized models fit on smaller GPUs, so LoRA fine-tuning should be available for any model.
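For reference, this is my understanding of the standard LoRA update; rank r and scaling factor alpha are the usual hyperparameters, and the exact scaling convention can differ between implementations:

$$W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)$$

The base weights W_0 stay frozen (and can stay quantized); only the small A and B matrices are trained, which is why this fits on modest GPUs.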
This logic is a mouthful to string together, but why isn't everyone working on this? We're going to have GPT-4-level models on millions of computers running 50-series cards this year. Models able to learn new things. This is way beyond Gutenberg or the semiconductor.
So, what kind of baby steps should I even take to start learning how to use what's inside ggml.h and finetuning.cpp?
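For anyone else starting from zero, here is my rough mental model of the ggml workflow as a minimal sketch, not a definitive recipe. I'm assuming the ggml API as of around the time of writing (names like ggml_new_graph and ggml_graph_compute_with_ctx may have shifted since): you allocate tensors from a ggml_context, describe the computation as a graph of ops, then execute the graph.

```c
#include "ggml.h"
#include <stdio.h>

int main(void) {
    // every tensor lives inside a context backed by one big pre-allocated buffer
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // a: 4x3, b: 4x2 (ggml_mul_mat contracts over the first dimension)
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);
    struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 2);
    ggml_set_f32(a, 1.0f);  // fill with dummy values
    ggml_set_f32(b, 2.0f);

    // this only records a node in the graph; nothing is computed yet
    struct ggml_tensor * c = ggml_mul_mat(ctx, a, b);  // result is 3x2

    // build the forward graph ending at c, then actually run it
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, c);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);

    printf("c[0,0] = %f\n", ggml_get_f32_1d(c, 0));

    ggml_free(ctx);
    return 0;
}
```

finetuning.cpp is basically this pattern scaled up: the model weights become graph inputs, the loss becomes the graph output, and a backward graph is built on top for the LoRA parameters.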