A very simple server to run guidance programs over http.
Supports health checking and reflection.

## Goals

Run guidance programs over http in a reliable and performant way.
- Run simple programs consisting of `gen` + prompt text
- Streaming
- Logging (no idea why this is not working.)
- Error handling
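The "simple programs" in scope look like guidance's handlebars-style templates: literal prompt text plus a `gen` call. A purely illustrative sketch (this string is not taken from this repository):

```python
# A hypothetical "simple program" in guidance's handlebars-style template
# syntax: literal prompt text plus a single `gen` call. Purely illustrative;
# it only shows the shape of programs the server targets.
program_source = "The best thing about the beach is {{gen 'best' max_tokens=7}}"
```

The server's job is to execute templates like this against the configured model and stream the generated text back.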
## Non-Goals

- Guidance programs with `async` steps
- Support for non-hugging-face models (including openai)
- Support for windows (use wsl/docker/podman)
- CPU support (fixes going this direction are fine, but it should not add complexity)
## Acceptable Contributions

- Improving my awful python
- Improving the Dockerfile
- Adding docker examples
- Bug fixes
- Documentation
- Tests
- Performance improvements (startup speed on larger models is a big one)
- Increasing the number of guidance programs that can be run
## Usage

```shell
podman run -e MODEL_NAME=gpt2 -p 50051:50051 --init --device=nvidia.com/gpu=all ghcr.io/utilityai/guidance-rpc:latest
```
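Because the server supports health checking and reflection, a tool like grpcurl can be pointed at the running container. These commands assume the standard gRPC reflection and health services and a plaintext listener on 50051 (they require the server above to be up):

```shell
# List the services exposed via server reflection.
grpcurl -plaintext localhost:50051 list

# Query the standard gRPC health checking service.
grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check
```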
## Development

Requires poetry to be installed.

```shell
poetry install
poetry run python src/main.py
```
This should work almost 1-1 with docker:

- the `device` flag in `run` may be different
- the suffix `,z` on the `--mount` will not be required
```shell
podman run \
  -p 50051:50051 \
  -e MODEL_NAME=meta-llama/Llama-2-7b-hf \
  -e HF_TOKEN=hf_aaaaaaaaaaaaaaaaaaaaaaaaaa \
  --mount type=bind,src=$XDG_CONFIG_HOME/.cache/huggingface,dst=/root/.cache/huggingface,z \
  --init \
  --device=nvidia.com/gpu=all \
  ghcr.io/utilityai/guidance-rpc:latest
```
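As a docker example, the same run can be expressed as a compose file. This is a sketch derived only from the flags above; the service name and file layout are assumptions, and the `,z` mount suffix is dropped per the docker notes:

```yaml
# Hypothetical docker-compose.yml equivalent of the podman command above.
services:
  guidance-rpc:
    image: ghcr.io/utilityai/guidance-rpc:latest
    init: true
    ports:
      - "50051:50051"
    environment:
      MODEL_NAME: meta-llama/Llama-2-7b-hf
      HF_TOKEN: hf_aaaaaaaaaaaaaaaaaaaaaaaaaa
    volumes:
      # No ,z suffix needed under docker (see the notes above).
      - $XDG_CONFIG_HOME/.cache/huggingface:/root/.cache/huggingface
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```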
### Building locally

Build the image:

```shell
podman build -t guidance-rpc .
```

Then run it:

```shell
podman run \
  -p 50051:50051 \
  -e MODEL_NAME=TheBloke/Llama-2-7b-Chat-GPTQ \
  -e CACHE=False \
  --mount type=bind,src=$HOME/.cache/huggingface,dst=/root/.cache/huggingface,z \
  --init \
  --device=nvidia.com/gpu=all \
  guidance-rpc
```
## Contributing

See Acceptable Contributions and Non-Goals above.
Generate grpc files with

```shell
python -m grpc_tools.protoc -I protos --python_out=src --pyi_out=src --grpc_python_out=src protos/guidance.proto
```
If you update dependencies, run

```shell
poetry update
```

and then

```shell
poetry export -f requirements.txt --output requirements.txt
```