    Repositories list

    • Mooncake

      Public
      Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
      C++
      3613.8k12547 Updated Aug 27, 2025
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      2.7k201 Updated Aug 24, 2025
    • SGLang is a fast serving framework for large language models and vision language models.
      Python
      2.7k000 Updated Aug 12, 2025
    • A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
      Python
      1.1k15k60316 Updated Aug 2, 2025
    • DeepEP: an efficient expert-parallel communication library that supports fault tolerance
      Cuda
      907100 Updated Jul 31, 2025
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      462600 Updated Jul 24, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      9.7k1400 Updated Mar 27, 2025