Releases · cubist38/mlx-openai-server
v1.0.11
v1.0.10
v1.0.9
Summary of Changes
- Implemented OpenAI-Standard Streaming with Tool Calls and Thinking Parser: Fully integrated streaming support for tool calls, including a custom thinking parser, in compliance with OpenAI’s latest API standards; a client-side sketch follows this list. Implementation by @tienthanh214 (#25).
- Logging System Migration: Replaced the standard logging module with loguru to improve readability, flexibility, and ease of debugging.
- Demo Video Update: Updated the demo video in README.md to reflect the latest features and behavior of the current codebase.
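As a rough client-side illustration of the new streaming behavior, the sketch below consumes streamed tool calls through the OpenAI Python SDK. The base URL, port, model id, and tool definition are assumptions for the example, not values mandated by this release.

```python
# Hedged sketch: consuming streamed tool calls via the OpenAI Python SDK.
# The base_url/port, model id, and tool schema are illustrative assumptions,
# not values required by mlx-openai-server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="local-model",  # placeholder model id
    messages=[{"role": "user", "content": "What is the weather in Hanoi?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    stream=True,
)

# Tool-call arguments arrive incrementally; accumulate them per call index.
args = {}
for chunk in stream:
    if not chunk.choices:
        continue
    choice = chunk.choices[0]
    for tc in choice.delta.tool_calls or []:
        args[tc.index] = args.get(tc.index, "") + (tc.function.arguments or "")
    if choice.finish_reason:  # "tool_calls" when the model requested a tool
        print(choice.finish_reason, args)
```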
v1.0.8
v1.0.7
Summary of Changes
- Temporarily removed metrics.py pending a better implementation.
- Aligned all Pydantic model fields with the OpenAI schema for consistency.
- Updated finish_reason from "function_call" to "tool_calls" in streaming responses involving tool usage (a minimal schema sketch follows this list).
- Fixed parsing of tool calls for Qwen3 models.
- Resolved issues with VLM models when handling non-streaming text requests.
- Added torchvision to setup.py, required by some VLM models for image processing.
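To make the finish_reason change concrete, here is a minimal, hypothetical Pydantic sketch of an OpenAI-aligned streaming choice; the class and field layout are illustrative, not this project's actual models.

```python
# Hypothetical sketch (not this project's actual models): a Pydantic streaming
# choice aligned with the OpenAI schema, where tool usage now terminates with
# finish_reason "tool_calls" instead of the legacy "function_call".
from typing import Literal, Optional
from pydantic import BaseModel

class StreamChoice(BaseModel):
    index: int
    delta: dict  # partial message content and/or tool-call fragments
    finish_reason: Optional[
        Literal["stop", "length", "tool_calls", "content_filter"]
    ] = None
```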
v1.0.6
Summary
This release tags the latest stable version of the codebase. Key updates included in this release:
- New Feature: Introduced the /v1/models endpoint for monitoring model serving status (a query sketch follows this list).
- Updates: Synced with the latest versions of mlx_vlm and mlx_lm for up-to-date performance and compatibility.
- Bug Fix: Fixed a text extraction issue when processing chunks.
- Enhancement: Refined the resource cleanup logic for improved efficiency and stability.
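The new endpoint can be queried with the standard OpenAI client; the base URL and port in this sketch are assumptions for a locally running server.

```python
# Sketch: listing served models through the OpenAI-compatible /v1/models
# endpoint. The base_url/port are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

for model in client.models.list():
    print(model.id)
```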
v1.0.5
BREAKING CHANGE: Rename package and CLI from mlx-server to mlx-openai-server
Summary
This PR introduces a breaking change by renaming the package and CLI from mlx-server to mlx-openai-server to resolve PyPI naming conflicts and improve compatibility.
Changes
- Renamed the package in setup.py from mlx-server to mlx-openai-server
- Updated all CLI references from mlx-server to mlx-openai-server
- Updated the repository/package name in the README and all usage instructions
- Added a "Breaking Change" notice to the README
Impact
Breaking change: all users must update their scripts, CLI commands, and installation instructions to use mlx-openai-server instead of mlx-server.
Motivation
The original name mlx-server was too similar to existing projects on PyPI, causing upload errors. This change ensures uniqueness and future compatibility.
Migration
- Replace all usage of mlx-server with mlx-openai-server in your scripts and CLI commands.
- Update any installation instructions to use the new package name.
v1.0.4
v1.0.3
Key Enhancements in This Release:
- Function Calling Support for Qwen3: Added support for function calling with the Qwen3 model, the latest addition to Alibaba Model Studio’s Qwen family of large language models.
- Embeddings Endpoint: Introduced the /v1/embeddings endpoint for both LM and VLM models, enabling embedding generation (an example follows this list).
- Improved OpenAI Compatibility: Refactored the schema to enhance compatibility with OpenAI’s API format.
- RAG Demo Notebook: Added simple_rag_demo.ipynb, showcasing an engaging use case for serving local models using mlx-server-OAI-compat.
- Consistent Error Handling: Standardized all error responses across the codebase using create_error_response for a unified API error format.
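A minimal sketch of calling the new endpoint with the OpenAI client, assuming a local server and a placeholder model id:

```python
# Sketch: generating embeddings via the OpenAI-compatible /v1/embeddings
# endpoint. base_url/port and model id are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.embeddings.create(
    model="local-model",  # placeholder model id
    input=["MLX makes on-device inference easy."],
)
print(len(resp.data[0].embedding))  # dimensionality of the returned vector
```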
v1.0.2
Changes in this release:
- Refactored API schemas and response formats for improved consistency and maintainability.
- Updated chat history handling logic for better performance and reliability.
- Exposed the /v1/embeddings endpoint to support MLX-LM models (text-only).
- Added a new notebook, embeddings_examples, demonstrating how to use the embeddings endpoint via the OpenAI-compatible API.