Skip to content

v1.0.3

Compare
Choose a tag to compare
@cubist38 cubist38 released this 04 May 09:49
· 66 commits to main since this release

Key Enhancements in This Release:

  • Function Calling Support for Qwen3: Added support for function calling with the Qwen3 model, the latest addition to Alibaba Model Studio’s Qwen family of large language models.
  • Embeddings Endpoint: Introduced the v1/embeddings endpoint for both LM and VLM, enabling embedding generation.
  • Improved OpenAI Compatibility: Refactored the schema to enhance compatibility with OpenAI’s API format.
  • RAG Demo Notebook: Added simple_rag_demo.ipynb, showcasing an engaging use case for serving local models using mlx-server-OAI-compat.
  • Consistent Error Handling: Standardized all error responses across the codebase using create_error_response for a unified API error format.