This project is a Rust implementation of Andrej Karpathy's llama2.c.
- At a very early stage.
- Generates at ~180 tokens/s using the stories15M.bin model.
- Currently produces gibberish tokens.
- Fails intermittently when decoding a token that is not valid UTF-8.
- Chat functionality is not yet implemented.
- CLI functionality is not yet implemented.
- Make sure you are in the project's base directory.
- Then run:
cargo run src\main.rs
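
The intermittent failure on non-UTF-8 tokens noted in the status list is typical when raw token bytes are converted with a strict call such as `String::from_utf8(...).unwrap()`. How this project actually decodes tokens is not shown here, so the following is only a minimal sketch of one possible workaround: it uses the standard library's `String::from_utf8_lossy`, which replaces invalid bytes with U+FFFD instead of returning an error. The helper name `decode_token_lossy` is hypothetical and not part of this codebase.

```rust
/// Hypothetical helper (not from this project): decode a token's raw bytes
/// without panicking. Invalid UTF-8 sequences become U+FFFD.
fn decode_token_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // A token made of valid UTF-8 bytes decodes unchanged.
    println!("{}", decode_token_lossy(b"hello"));

    // 0xFF can never appear in valid UTF-8, so it is replaced
    // rather than causing a failure.
    println!("{}", decode_token_lossy(&[0x68, 0x69, 0xFF]));
}
```

Lossy decoding trades correctness for robustness: if a multi-byte character is split across two tokens, each fragment is replaced separately. Buffering bytes until they form a complete character would avoid that, at the cost of a little more state in the generation loop.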