LLM serving from scratch in pure Rust. Made with tonic, axum, candle.rs, and ❤️
Currently in active development, so expect this doc to be updated regularly!
There are two main services: a tonic-based gRPC server that runs inference on the model (Qwen2.5-0.5B-Instruct), and an axum-based web server that serves the very basic `chat.html`.
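To give a feel for the web half, here is a minimal sketch of an axum server that returns a chat page from disk. The route, port, and file path are illustrative assumptions, not the values this repo actually uses:

```rust
// Hypothetical sketch of the web service: serve chat.html over HTTP.
// The port (3000) and the file path are assumptions for illustration.
use axum::{response::Html, routing::get, Router};

#[tokio::main]
async fn main() {
    let app = Router::new().route(
        "/",
        // Read the page at request time so edits show up on refresh.
        get(|| async { Html(tokio::fs::read_to_string("chat.html").await.unwrap()) }),
    );
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```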
Before running these services, though, we need database instances to connect to. Spin them up by running:
```bash
./scripts/postgres.sh && ./scripts/redis.sh
```
Those scripts will spin up some Docker containers for our Postgres and Redis instances, respectively.
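As a sanity check that both containers came up, something like the following can verify connectivity from Rust. This is a minimal sketch assuming the `sqlx` (with a tokio runtime feature) and `redis` crates and default local connection strings; none of these details are taken from the repo, so adjust them to match what the scripts actually configure:

```rust
// Hypothetical connectivity check, assuming default local ports and
// credentials; adjust the connection strings to match the scripts.
use sqlx::postgres::PgPoolOptions;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Postgres: open a small pool and run a trivial query.
    let pool = PgPoolOptions::new()
        .max_connections(2)
        .connect("postgres://postgres:postgres@localhost:5432/postgres")
        .await?;
    sqlx::query("SELECT 1").execute(&pool).await?;

    // Redis: open a blocking connection and ping it.
    let client = redis::Client::open("redis://127.0.0.1:6379/")?;
    let mut conn = client.get_connection()?;
    let pong: String = redis::cmd("PING").query(&mut conn)?;
    assert_eq!(pong, "PONG");

    println!("Postgres and Redis are both reachable");
    Ok(())
}
```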
To get the gRPC server up and running, execute the command below in one terminal (n.b. `--release` is necessary to get usable performance out of our locally running LLM):
```bash
cargo run --bin server --release
```
And in another terminal run:
```bash
cargo run --bin web
```