Description
Expected Behavior
I would like to contribute a new model integration for vLLM, a fast and memory-efficient inference engine for LLMs.
This new model implementation would enable Spring AI users to connect to a vLLM backend more directly and cleanly, similar to existing integrations with OpenAI and Hugging Face.
Current Behavior
Currently, Spring AI does not offer native support for vLLM.
While it is technically possible to connect to a vLLM backend by customizing the OpenAI model implementation, this approach is not ideal: it requires workarounds and results in code that is harder to maintain and reuse.
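For context, the workaround relies on vLLM exposing an OpenAI-compatible API, so the Spring AI OpenAI starter can be pointed at it via its base URL property. A minimal sketch of such a configuration, assuming a local vLLM server on port 8000 and the standard `spring.ai.openai.*` properties (the model name and API key here are placeholders, not part of the original report):

```yaml
spring:
  ai:
    openai:
      # Point the OpenAI client at the vLLM server's OpenAI-compatible endpoint
      base-url: http://localhost:8000
      # vLLM typically ignores the key unless the server was started with --api-key
      api-key: placeholder
      chat:
        options:
          # Must match the model name the vLLM server was launched with (example value)
          model: meta-llama/Llama-3.1-8B-Instruct
```

This works for basic chat completions, but it leaves vLLM-specific capabilities (e.g. server-side sampling parameters not covered by the OpenAI options) inaccessible through Spring AI's abstractions, which is the gap a dedicated integration would close.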
Context
I'm currently working on a project that uses vLLM for high-performance inference and would benefit from tighter integration with Spring AI’s abstractions and tooling.
Before proceeding with the implementation and contributing a PR, I’d like to check if the Spring AI team is open to a contribution that adds dedicated support for vLLM.
Are there any existing plans or previous discussions around this? If not, I’d be happy to explore it further and share a prototype.