Description
Is your feature request related to a problem? Please describe.
The data gathered through the use of rai_benchmark presents a great opportunity to create datasets that can be used for fine-tuning smaller L/VLMs. The ability to create such datasets automatically after a benchmark run would be very helpful.
Describe the solution you'd like
A module could be created that automatically builds a dataset from a Langfuse trace after a run of rai_benchmark (however, the module should not be tightly integrated with rai_benchmark itself; in the future it might be beneficial to extend this functionality to non-benchmark deployments, allowing for online and/or imitation learning).
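As a rough sketch of what such a module could do (the trace structure and field names below are hypothetical, not the actual Langfuse schema), each trace could be flattened into prompt/completion training records:

```python
# Hypothetical sketch: convert benchmark traces into fine-tuning records.
# The trace structure below is illustrative, NOT the real Langfuse schema.

def trace_to_records(trace):
    """Flatten one trace into prompt/completion training records."""
    records = []
    for obs in trace["observations"]:
        if obs.get("type") != "GENERATION":
            continue  # keep only LLM calls; skip tool spans etc.
        records.append({
            "prompt": obs["input"],
            "completion": obs["output"],
            # carry metadata so different fine-tuning pipelines can
            # re-template the prompt later
            "metadata": {
                "model": obs.get("model"),
                "trace_id": trace["id"],
            },
        })
    return records

example_trace = {
    "id": "t-001",
    "observations": [
        {"type": "GENERATION", "input": "Pick up the box.",
         "output": "move_arm(...)", "model": "gpt-4o"},
        {"type": "SPAN", "input": "tool call"},
    ],
}
print(trace_to_records(example_trace))
```

Keeping the conversion as a standalone function over plain trace data (rather than calling rai_benchmark internals) is what would allow reuse in non-benchmark deployments later.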
Describe alternatives you've considered
TBD
Additional context
Initial research points to https://docs.unsloth.ai/ as a possible target framework for fine-tuning. Additional information on dataset requirements can be found there.
Note: different models may require, or benefit from, different prompt structures for fine-tuning. The dataset does not necessarily need to take care of that; it may be better to leave it to a pre-processing pipeline during fine-tuning (once that is developed). However, the dataset should include all relevant metadata, so that the largest possible range of models can be fine-tuned.
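To illustrate the note above (all field names and templates are hypothetical, not any model's real format): the dataset could store raw prompt/completion pairs plus metadata as JSONL, with model-specific prompt templating deferred to a separate pre-processing step at fine-tuning time:

```python
import json

# Hypothetical dataset record: raw fields plus metadata, with no
# model-specific prompt formatting baked in.
record = {
    "prompt": "Pick up the box.",
    "completion": "move_arm(target='box')",
    "metadata": {"model": "gpt-4o", "benchmark": "manipulation",
                 "trace_id": "t-001"},
}

# Model-specific templating lives in a later pre-processing step
# (illustrative templates only, not real chat formats).
TEMPLATES = {
    "llama-style": "<s>[INST] {prompt} [/INST] {completion}</s>",
    "chatml-style": "<|user|>\n{prompt}\n<|assistant|>\n{completion}",
}

def render(record, style):
    """Apply one model-specific template to a raw dataset record."""
    fields = {k: record[k] for k in ("prompt", "completion")}
    return TEMPLATES[style].format(**fields)

line = json.dumps(record)  # one JSONL line as it would be stored
print(render(json.loads(line), "chatml-style"))
```

Because the stored record keeps the raw fields and metadata separate from any template, the same dataset can feed fine-tuning pipelines targeting different prompt formats.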