A simple way to train an LLM using the transformers library and run it for chat conversations.
MAKE SURE THAT YOU HAVE THESE MODULES INSTALLED FIRST!
transformers
torch
time (part of the Python standard library, so it does not need a separate install)
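If you are not sure whether the packages are available, a quick import check like the one below will tell you (only transformers and torch are third-party and need installing, e.g. with pip; time ships with Python):

```python
# Quick check that the required packages are importable.
# time is part of the standard library; only transformers and torch
# have to be installed separately.
import time

import torch
import transformers

print("transformers", transformers.__version__)
print("torch", torch.__version__)
```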
1️⃣ Save your dataset as dataset.txt.
2️⃣ Run train_llm.py and wait until it finishes (a rough sketch of what such a training script might look like is shown after these steps).
3️⃣ Run run_llm.py to load the trained LLM and chat with it (a chat-loop sketch appears further below).
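For reference, the cache file name mentioned below (cached_lm_GPT2Tokenizer_16_dataset.txt) suggests the training script fine-tunes GPT-2 on dataset.txt through TextDataset with a block size of 16. The sketch below shows what a train_llm.py of that kind could roughly look like; the exact model, output folder name, and hyperparameters in the real script may differ:

```python
# A minimal sketch of a train_llm.py-style script, assuming GPT-2 and
# TextDataset (the cache file name cached_lm_GPT2Tokenizer_16_dataset.txt
# points at GPT2Tokenizer with block_size=16). Values below are examples.
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    TextDataset,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# TextDataset reads dataset.txt and writes the cached_lm_* file next to it.
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="dataset.txt",
    block_size=16,
)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="./trained_llm",      # hypothetical output folder name
    overwrite_output_dir=True,
    num_train_epochs=3,              # lower this on a low-end machine
    per_device_train_batch_size=4,
    save_steps=500,                  # lower this to checkpoint more often
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)

trainer.train()
trainer.save_model("./trained_llm")
tokenizer.save_pretrained("./trained_llm")
```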
While training, a cache file such as cached_lm_GPT2Tokenizer_16_dataset.txt (like the one uploaded above) is created; do not edit or move it. You only need to download the train_llm.py and run_llm.py files and make sure the dependencies are installed.
🟢 Make sure your dataset is well organized and consistently formatted, with high-quality data.
🟢 If you are training on a low-end system (as in my case), set the epoch and save-step values low (num_train_epochs and save_steps in the training sketch above); if you have plenty of time, you can experiment with them.
🟢 Make sure your dataset isn't too small (see the sanity-check sketch after these tips).
🟢 If you get any kind of warning while running, don't panic; just wait for the LLM's response.
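As a rough aid for the dataset tips above, a small check like the one below can report how much text dataset.txt actually contains before you spend time training (the size threshold is an arbitrary example, not a requirement of this repo):

```python
# Sanity-check dataset.txt before training: report its size and warn if it
# looks very small. The 50,000-character threshold is only an example.
from pathlib import Path

text = Path("dataset.txt").read_text(encoding="utf-8")
lines = [line for line in text.splitlines() if line.strip()]

print(f"characters: {len(text)}")
print(f"non-empty lines: {len(lines)}")

if len(text) < 50_000:
    print("Warning: this dataset is quite small; the model may just memorize it.")
```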