v1.0.3
Key Enhancements in This Release:
- Function Calling Support for Qwen3: Added support for function calling with the Qwen3 model, the latest addition to Alibaba Model Studio’s Qwen family of large language models.
- Embeddings Endpoint: Introduced the v1/embeddings endpoint for both LM and VLM, enabling embedding generation.
- Improved OpenAI Compatibility: Refactored the schema to enhance compatibility with OpenAI’s API format.
- RAG Demo Notebook: Added simple_rag_demo.ipynb, showcasing an engaging use case for serving local models using mlx-server-OAI-compat.
- Consistent Error Handling: Standardized all error responses across the codebase using create_error_response for a unified API error format.