Synthetic dataset generation - Brief howto built on top of llama.cpp #3568
paschembri started this conversation in Show and tell
Replies: 1 comment 1 reply
-
I suggest you first explore the grammar functionality in llama.cpp and the excellent grammar builder helper app. I suspect you'll find them valuable.
-
I’m currently exploring fine-tuning smaller models for classification / fill-mask.
I wrote a quick intro on how to produce a synthetic dataset in a few lines of Python.
It covers:
If you have performance tips for this kind of task, I’m all ears 👀. Now I have to look into how to implement DistilBERT using ggml 😱
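For readers who want a concrete picture, here is a minimal sketch of this kind of pipeline. It is not the code from the intro mentioned above. The label set, the topics, the prompt wording, and the helper names (`build_label_grammar`, `build_prompt`, `make_dataset`, `fake_complete`) are all hypothetical. The model call is a stand-in function so the example runs without a model; in practice it would be replaced by a call into llama.cpp or one of its bindings.

```python
import json
import random
from typing import Callable, Dict, List

# Hypothetical label set for a small classification dataset.
LABELS = ["positive", "negative", "neutral"]


def build_label_grammar(labels: List[str]) -> str:
    """Build a GBNF grammar string that only accepts one of the labels.

    A grammar like this could be handed to llama.cpp's grammar support so
    that a labelling prompt can only produce a valid class name.
    """
    alternatives = " | ".join(f'"{label}"' for label in labels)
    return f"root ::= {alternatives}"


def build_prompt(label: str, topic: str) -> str:
    """Assumed prompt template; adjust the wording to your task."""
    return f"Write one short {label} customer review about {topic}.\nReview:"


def make_dataset(
    complete: Callable[[str], str],
    labels: List[str],
    topics: List[str],
    n_per_label: int,
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Generate n_per_label examples per label using the given completion function."""
    rng = random.Random(seed)
    rows = []
    for label in labels:
        for _ in range(n_per_label):
            topic = rng.choice(topics)
            text = complete(build_prompt(label, topic)).strip()
            if text:  # drop empty generations
                rows.append({"text": text, "label": label})
    return rows


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs without a model."""
    return " placeholder text for: " + prompt.splitlines()[0]


grammar = build_label_grammar(LABELS)
rows = make_dataset(fake_complete, LABELS, ["a laptop", "a hotel"], n_per_label=2)

print(grammar)
for row in rows:
    print(json.dumps(row))  # one JSON object per line (JSONL)
```

Keeping the completion function as a parameter means the same loop works with any backend, and writing one JSON object per line keeps the output easy to load for fine-tuning.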