Spin up production-ready API endpoints for open-source Large Language Models in seconds on Runpod.
Every link in this repository opens the Runpod console with a fully configured template selected: just choose a GPU, press Deploy, and start prompting.
- Llama 3.1 8B Instruct - Deploy on Runpod
- Qwen3-30B-A3B-FP8 (SGLang) - Deploy on Runpod
- Qwen3-32B-FP8 (SGLang) - Deploy on Runpod
- Qwen3-235B-A22B-FP8 (SGLang) - Deploy on Runpod
- DeepSeek-R1-Distill-Qwen-32B-FP8-dynamic - Deploy on Runpod
More models will be added soon.
| Resource | Minimum | Recommended |
|---|---|---|
| Pod Disk | 40 GB | 60 GB+ (larger models) |
| VRAM | 24 GB (8B) / 48 GB (70B) | See GPU table |
| Architecture | Ampere+ (A40/A100/H100) | Ada/Hopper for FP8 |
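If you want to double-check a running pod against these numbers, here is a quick sketch you can run from the pod's terminal (it assumes the usual Runpod volume mount at `/workspace`; adjust the path if your template differs):

```bash
# Show the GPU model and total VRAM available to the pod
nvidia-smi --query-gpu=name,memory.total --format=csv

# Show free space on the pod disk (Runpod volumes are commonly mounted at /workspace)
df -h /workspace
```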
We maintain custom Docker images for each inference engine alongside the Runpod templates.

- Images live in the `docker/` folder (`docker/vllm-base`, `docker/llamacpp`, etc.).
- They extend official base images (e.g. `vllm/vllm-openai`) with:
  - faster model-download tooling (`hf_transfer`; see the sketch after this section)
  - hardened startup scripts (health checks, graceful shutdown)
  - extra libraries for specialised models (audio, vision)
- Each sub-folder contains its own README with build instructions:

```bash
# example
cd docker/vllm-base
docker build -t myorg/vllm-base:latest .
```

Using these images keeps pods reproducible and lets us apply optimisations once, then reuse them across all templates.
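The download acceleration mentioned above can also be used by hand inside a pod. A minimal sketch, assuming the `hf_transfer` package is installed (it ships in our images) and that `huggingface-cli` from `huggingface_hub` is available:

```bash
# Enable the Rust-based accelerated downloader in huggingface_hub
export HF_HUB_ENABLE_HF_TRANSFER=1

# Pre-fetch model weights into the local Hugging Face cache
huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct
```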
- Click a One-Click Link above.
- Log in or create a Runpod account.
- Select a GPU that meets the "Minimum" column in the table above.
- (Optional) Add `HUGGING_FACE_HUB_TOKEN` as an environment variable for gated models.
- Press Deploy Pod.
- Wait for the weights to download and the server to start (~1-2 min).
- Your endpoint will be: `https://<POD_ID>-8000.proxy.runpod.net`
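Since the weight download can take a couple of minutes, it can help to poll the endpoint until the server answers before sending real traffic. A minimal sketch, assuming the engine exposes the OpenAI-compatible `/v1/models` route (vLLM and SGLang both do):

```bash
ENDPOINT="https://<POD_ID>-8000.proxy.runpod.net"

# Poll until the OpenAI-compatible server starts answering
until curl -sf "$ENDPOINT/v1/models" > /dev/null; do
  echo "Waiting for the server to come up..."
  sleep 5
done
echo "Server is ready."
```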
```bash
ENDPOINT="https://<POD_ID>-8000.proxy.runpod.net"
curl -X POST "$ENDPOINT/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],
    "max_tokens": 100
  }'
```
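The pods speak the standard OpenAI chat-completions protocol, so token streaming works too. A sketch of the same request with streaming enabled (`"stream": true` is part of the OpenAI-compatible API that vLLM and SGLang implement; `-N` disables curl's output buffering so tokens appear as they arrive):

```bash
curl -N -X POST "$ENDPOINT/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
    "max_tokens": 100,
    "stream": true
  }'
```

Note that the `model` field must match the model the pod is serving; `curl "$ENDPOINT/v1/models"` lists it.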
Built with ❤ to make self-hosting state-of-the-art models effortless.