mohsincsv/my-one-click-llms

one click llm deployments

Spin up production-ready API endpoints for open-source large language models on Runpod in seconds.

Every link in this repository opens the Runpod console with a fully configured template selected: just choose a GPU, press Deploy, and start prompting.

Templates

More models will be added soon.

Hardware Requirements

| Resource     | Minimum                        | Recommended              |
| ------------ | ------------------------------ | ------------------------ |
| Pod Disk     | 40 GB                          | 60 GB+ (larger models)   |
| VRAM         | 24 GB (8B) / 48 GB (70B)       | See GPU table            |
| Architecture | Ampere+ (A40/A100/H100)        | Ada/Hopper for FP8       |

Docker Implementations

Custom Docker images for each inference engine ship alongside the Runpod templates.

  1. Images live in the docker/ folder (docker/vllm-base, docker/llamacpp, etc.).
  2. They extend official base images (e.g. vllm/vllm-openai) with:
    • Faster model-download tooling (hf_transfer)
    • Hardened startup scripts (health checks, graceful shutdown)
    • Extra libraries for specialised models (audio, vision)
  3. Each sub-folder contains its own README with build instructions:
# example
cd docker/vllm-base
docker build -t myorg/vllm-base:latest .

Using these images keeps pods reproducible and lets us apply optimisations once, then reuse them across all templates.
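As a sketch, an image built this way can be launched much like the upstream vllm/vllm-openai container. The snippet below only prints the launch command rather than running it (a real run needs an NVIDIA GPU and the NVIDIA Container Toolkit); the image tag reuses the build example above, the model name is illustrative, and the `--model` flag assumes the image keeps the upstream vLLM entrypoint.

```shell
# Sketch: launching the image built above. This prints the command instead
# of executing it; drop the echo to actually run the container.
IMAGE="myorg/vllm-base:latest"                 # tag from the build example
MODEL="meta-llama/Meta-Llama-3.1-8B-Instruct"  # illustrative model name

# --model assumes the upstream vllm/vllm-openai entrypoint is retained.
echo docker run --rm --gpus all -p 8000:8000 \
  -e HUGGING_FACE_HUB_TOKEN \
  "$IMAGE" --model "$MODEL"
```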

Usage

  1. Click a One-Click Link above.
  2. Log in or create a Runpod account.
  3. Select a GPU that meets the “Minimum” column in the hardware table above.
  4. (Optional) add HUGGING_FACE_HUB_TOKEN for gated models.
  5. Press Deploy Pod.
  6. Wait for the weights to download and the server to start (typically 1–2 minutes).
  7. Your endpoint will be:
https://<POD_ID>-8000.proxy.runpod.net
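The endpoint URL from step 7 can be assembled in the shell. In this sketch, abc123 is a placeholder pod ID, and the commented readiness poll assumes the vLLM-based images expose the usual /health route.

```shell
POD_ID="abc123"   # placeholder; copy the real ID from the Runpod console
ENDPOINT="https://${POD_ID}-8000.proxy.runpod.net"
echo "$ENDPOINT"

# Optionally block until the server is ready (assumes a vLLM-style /health route):
# until curl -sf "$ENDPOINT/health" >/dev/null; do sleep 5; done
```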

Curl Example

ENDPOINT="https://<POD_ID>-8000.proxy.runpod.net"

curl -X POST "$ENDPOINT/v1/chat/completions" \
     -H "Content-Type: application/json" \
     -d '{
           "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
           "messages": [{"role": "user","content": "Hello, who are you?"}],
           "max_tokens": 100
         }'
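The server responds with standard OpenAI-style JSON, where the assistant's reply sits at choices[0].message.content. A minimal sketch of extracting it with python3 (RESPONSE below is a hand-written example of that shape, not real server output):

```shell
# Hand-written minimal example of the chat-completions response shape.
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hi there!"}}]}'

# Pull out the assistant reply with python3 (avoids a jq dependency).
REPLY=$(printf '%s' "$RESPONSE" | python3 -c \
  'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])')
echo "$REPLY"
```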

Built with ❤ to make self-hosting state-of-the-art models effortless.
