This is an LLM based on GPT-2, with some training improvements taken from the OpenAI GPT-3 paper.
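For orientation, here is a minimal sketch of what a GPT-2-style configuration with GPT-3-paper optimizer settings might look like. The class name, field names, and values below are illustrative assumptions, not read from this repo's `train.py`; the optimizer constants are the ones reported in the GPT-3 paper (Brown et al., 2020) and are commonly borrowed for GPT-2 reproductions.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # GPT-2 (124M) architecture -- illustrative values, not this repo's actual config
    block_size: int = 1024   # maximum context length
    vocab_size: int = 50257  # GPT-2 BPE vocabulary
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768

# GPT-3-paper training hyperparameters (assumed here, not confirmed from the repo)
ADAM_BETAS = (0.9, 0.95)
WEIGHT_DECAY = 0.1
GRAD_CLIP = 1.0
MAX_LR = 6e-4  # the GPT-3 paper's learning rate for its smallest (125M) model
```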
The training data is FineWeb (https://huggingface.co/datasets/HuggingFaceFW/fineweb), using 10B tokens.
Training ran for 4 epochs over the full 10B tokens, i.e. roughly 40B tokens seen in total.
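A minimal sketch of streaming the 10B-token FineWeb sample with the Hugging Face `datasets` library. The `sample-10BT` config exists on the Hub, but the tokenizer choice (GPT-2 BPE via `tiktoken`) and the per-document <|endoftext|> delimiter are assumptions; the repo may instead pre-tokenize the data into binary shards.

```python
import tiktoken
from datasets import load_dataset

# Stream the 10B-token sample so the full dataset never has to fit on disk
ds = load_dataset("HuggingFaceFW/fineweb", name="sample-10BT",
                  split="train", streaming=True)

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE (50257-token vocabulary)

def tokenize(doc):
    # <|endoftext|> delimits documents so the model learns boundaries (assumption)
    return [enc.eot_token] + enc.encode_ordinary(doc["text"])

first = next(iter(ds))
print(tokenize(first)[:16])  # peek at the first few token ids
```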
Below are the results:
To train the model, run:

torchrun --standalone --nproc-per-node=8 train.py --resume False --last_checkpoint_step 0
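To resume an interrupted run (assuming `train.py` saves step-tagged checkpoints, as its flags suggest), pass the step of the last saved checkpoint, for example:

torchrun --standalone --nproc-per-node=8 train.py --resume True --last_checkpoint_step 5000

The value 5000 is a placeholder; use whatever step your last checkpoint was written at.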
To generate text from the trained model (inference), run:

python inference.py
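For reference, a minimal top-k sampling loop of the kind `inference.py` presumably implements. The `model` interface is an assumption: `model(idx)` is taken to return logits of shape (B, T, vocab_size), which may differ from this repo's actual forward signature.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, temperature=1.0, top_k=50):
    """Autoregressively sample from a GPT-style model.

    idx: (B, T) tensor of token ids; model(idx) is assumed to
    return logits of shape (B, T, vocab_size).
    """
    model.eval()
    for _ in range(max_new_tokens):
        logits = model(idx)[:, -1, :] / temperature   # logits at the last position
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[:, [-1]]] = -float("inf")   # mask everything below the top-k
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, next_id), dim=1)        # append and continue
    return idx
```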
