An LLM optimizer that speeds up matrix-vector multiplication using multithreading and synchronization techniques. The optimizer runs on Linux machines.
Sample timings (see the "examples" folder):
- Running the matrix-vector multiplication using a single thread. Time taken: 4.927 s
- Running the matrix-vector multiplication using 4 threads. Time taken: 2.338 s (about 2.1x faster than a single thread)
- Running the matrix-vector multiplication using 16 threads. Time taken: 1.548 s (about 3.2x faster than a single thread)
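For readers unfamiliar with the technique, here is a minimal sketch of one common way to split a matrix-vector product across threads: each thread computes a contiguous band of output rows, and joining the threads acts as the synchronization point. This is an illustration under stated assumptions, not necessarily how llama2.c in this repository does it. The names `band_t`, `matvec_band`, and `matvec_parallel` are hypothetical, and the matrix is assumed to be stored row-major.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical work description: one contiguous band of output rows. */
typedef struct {
    const float *w;   /* matrix, row-major, rows x cols (assumed layout) */
    const float *x;   /* input vector, length cols */
    float *out;       /* output vector, length rows */
    int cols;
    int row_start;    /* first row of the band (inclusive) */
    int row_end;      /* end of the band (exclusive) */
} band_t;

/* Thread entry point: computes out[i] = dot(w[i, :], x) for its band. */
static void *matvec_band(void *arg) {
    band_t *b = (band_t *)arg;
    for (int i = b->row_start; i < b->row_end; i++) {
        float sum = 0.0f;
        for (int j = 0; j < b->cols; j++) {
            sum += b->w[(size_t)i * b->cols + j] * b->x[j];
        }
        b->out[i] = sum;  /* bands are disjoint, so no lock is needed here */
    }
    return NULL;
}

/* Computes out = w * x with up to n_threads workers.
   Returns 0 on success, -1 on invalid arguments or allocation/thread failure. */
int matvec_parallel(float *out, const float *w, const float *x,
                    int rows, int cols, int n_threads) {
    if (rows <= 0 || cols <= 0 || n_threads <= 0) return -1;
    if (n_threads > rows) n_threads = rows;  /* never more threads than rows */

    pthread_t *tids = malloc((size_t)n_threads * sizeof *tids);
    band_t *bands = malloc((size_t)n_threads * sizeof *bands);
    if (!tids || !bands) { free(tids); free(bands); return -1; }

    /* Spread rows as evenly as possible: the first `extra` bands get one more. */
    int base = rows / n_threads, extra = rows % n_threads;
    int next = 0, started = 0, status = 0;
    for (int t = 0; t < n_threads; t++) {
        int count = base + (t < extra ? 1 : 0);
        bands[t] = (band_t){ w, x, out, cols, next, next + count };
        next += count;
        if (pthread_create(&tids[t], NULL, matvec_band, &bands[t]) != 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* Joining is the synchronization point: out is complete only after this. */
    for (int t = 0; t < started; t++) pthread_join(tids[t], NULL);

    free(tids);
    free(bands);
    return status;
}
```

A caller would invoke something like `matvec_parallel(out, w, x, rows, cols, 4)` and read `out` only after the call returns. Code that uses pthreads is typically built with gcc's `-pthread` option. Creating and joining threads on every call keeps the sketch short; a persistent pool of worker threads would avoid that repeated startup cost.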
- Clone the repository
git clone https://github.com/davinnev/PowerUp-LLM.git
- Navigate to the source folder
cd PowerUp-LLM/src
- Compile the code
gcc llama2.c -o llama2
- Run the optimizer, passing a seed and the number of threads
./llama2 {seed} {number of threads}
For example,
./llama2 42 4