MLX INFERENCE Logo

MLX INFERENCE

License | Chinese Documentation

Project Introduction

MLX INFERENCE is an OpenAI API-compatible inference service built on MLX-LM and MLX-VLM. It provides the following endpoints:

  • /v1/chat/completions - Chat completions endpoint
  • /v1/responses - Responses endpoint
  • /v1/models - Lists the available models
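As a sketch of how a client might call the service, the snippet below uses the official openai Python package, assuming the server is running locally on port 8002 (see Start Service below); the model id is a placeholder for whichever mlx-community model your server has loaded:

from openai import OpenAI

# Point the client at the local MLX INFERENCE server; an API key is required
# by the client library but is assumed to be unchecked by this server.
client = OpenAI(base_url="http://localhost:8002/v1", api_key="not-needed")

# "mlx-community/Meta-Llama-3-8B-Instruct-4bit" is a placeholder model id;
# substitute the model your server actually serves.
response = client.chat.completions.create(
    model="mlx-community/Meta-Llama-3-8B-Instruct-4bit",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)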

Installation

# Install dependencies
pip install -r requirements.txt
# Copy the example environment file and adjust it as needed
cp .env.example .env

Start Service

Run the following from the project root directory:

uvicorn mlx_Inference:app --workers 1 --port 8002

Parameters:

  • --workers: Number of worker processes
  • --port: Service port number
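To confirm the service is up, here is a minimal sketch that lists the served models via /v1/models, assuming the default port above and the standard OpenAI list-response shape:

import requests

# Query the models endpoint; an OpenAI-compatible server is assumed to return
# {"object": "list", "data": [{"id": ...}, ...]}.
resp = requests.get("http://localhost:8002/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])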

Features

  • Compatible with OpenAI API specifications
  • Backend inference runs on MLX-LM and MLX-VLM, with support for mlx-community models (see the vision sketch after this list)
  • Easy to deploy and use
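Because the backend includes MLX-VLM, image inputs may be accepted through the standard OpenAI vision message format. The following is a hedged sketch, assuming a vision-capable mlx-community model is loaded; the model id and image URL are placeholders:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8002/v1", api_key="not-needed")

# Placeholder vision model id and image URL; whether this server accepts the
# OpenAI vision content format is an assumption based on its MLX-VLM backend.
response = client.chat.completions.create(
    model="mlx-community/Qwen2-VL-2B-Instruct-4bit",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
        ],
    }],
)
print(response.choices[0].message.content)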
