sglang
Here are 28 public repositories matching this topic...
Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Updated Aug 6, 2025 - Python
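As a rough illustration of how a quantized checkpoint produced by a toolkit like the one above is typically consumed, here is a minimal sketch using the standard Hugging Face Transformers API. The model id is a placeholder, not a real repository, and the sketch assumes the checkpoint ships a quantization config that Transformers can dispatch on.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub id; substitute any GPTQ/AWQ-quantized checkpoint.
model_id = "your-org/your-model-gptq-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers picks the matching quantization backend from the
# checkpoint's quantization_config.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Quantized models trade a little accuracy for"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```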
High-quality Chinese speech synthesis and voice cloning service built on SparkTTS, OrpheusTTS, and other models.
Updated May 18, 2025 - Python
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Updated Jul 28, 2025 - Go
OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs).
Updated Aug 9, 2025 - Go
kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference.
Updated Aug 8, 2025 - Python
Arks is a cloud-native inference framework running on Kubernetes.
Updated Jul 29, 2025 - Go
A tool for benchmarking LLMs on Modal
Updated Jul 30, 2025 - Python
AI-based search done right
Updated Aug 8, 2025 - TypeScript
A guide to structured generation using constrained decoding
Updated Jun 9, 2024 - Jupyter Notebook
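For context on what constrained decoding looks like in practice (independent of the guide above), here is a minimal sketch using SGLang's frontend language to restrict a generated answer with a regex. The endpoint URL and question are placeholders, and the sketch assumes an SGLang server is already running locally.

```python
import sglang as sgl

@sgl.function
def yes_no(s, question):
    # The regex constraint forces the decoded answer to be "yes" or "no".
    s += "Question: " + question + "\n"
    s += "Answer: " + sgl.gen("answer", max_tokens=8, regex=r"(yes|no)")

# Placeholder endpoint; assumes a local SGLang server is listening here.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = yes_no.run(question="Is the sky blue on a clear day?")
print(state["answer"])
```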
Throughput benchmarks for DeepSeek-V3 and R1 (671B) on 8xH100 GPUs
Updated Mar 13, 2025 - Python
The Private AI Setup Dream Guide for Demos automates the installation of the software needed for a local private AI setup, utilizing AI models (LLMs and diffusion models) for use cases such as general assistance, business ideas, coding, image generation, systems administration, marketing, planning, and more.
Updated Jul 17, 2025 - Shell
Bench360 is a modular benchmarking suite for local LLM deployments. It offers a full-stack, extensible pipeline to evaluate the latency, throughput, quality, and cost of LLM inference on consumer and enterprise GPUs. Bench360 supports flexible backends, tasks, and scenarios, enabling fair and reproducible comparisons for researchers and practitioners.
Updated Aug 3, 2025 - Python