This lightweight app allows you to effortlessly convert your training data from .csv to .jsonl for OpenAI model fine-tuning or .json for Llama (Alpaca structure) models.
- Clone repository
git clone https://github.com/IliaShkola/CSV_Jsonl_Converter.git
- Move to the project folder
cd CSV_Jsonl_Converter
- Create new python environment
python -m venv myenv
- Activate the environment
myenv\Scripts\activate
- Upgrade pip
python.exe -m pip install --upgrade pip
- Install the libraries
pip install -r requirements.txt
- Create an executable with PyInstaller
pyinstaller app.spec --noconfirm --clean
The executable file will be stored in the 'dist' project folder.
To convert a CSV file to JSONL or JSON, simply drag and drop your .csv file into the designated area. Then, enter the system prompt in the text box and select the output format based on the model you want to fine-tune. Alpaca models don't require a system prompt.
The .csv file containing training data should include two columns: 'Prompt' and 'Answer'.
Please do not use training data from the TestData directory in a real fine-tuning project!