Important
This is an early developer preview of the vLLM Gaudi Plugin and is not yet intended for general use. For a more stable experience, consider using the HabanaAI/vllm-fork or the in-tree Gaudi implementation available in vllm-project/vllm.
The vLLM Gaudi plugin (vllm-gaudi) integrates Intel Gaudi accelerators with vLLM to optimize large language model inference.
This plugin follows the principles set out in two vLLM RFCs, "[RFC]: Hardware pluggable" and "[RFC]: Enhancing vLLM Plugin Architecture", and provides a modular interface for Intel Gaudi hardware.
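To make the plugin mechanism more concrete, here is a minimal sketch that lists the platform plugins visible in the current Python environment. It assumes vLLM discovers hardware plugins through the `vllm.platform_plugins` entry-point group, as described in vLLM's plugin documentation, and that vllm-gaudi registers itself there. The helper name `list_platform_plugins` is hypothetical and is not part of vLLM or vllm-gaudi.

```python
from importlib.metadata import entry_points

# Assumption: entry-point group used by vLLM for out-of-tree platform plugins.
PLATFORM_PLUGIN_GROUP = "vllm.platform_plugins"


def list_platform_plugins(group=PLATFORM_PLUGIN_GROUP):
    """Return sorted (name, target) pairs for entry points in the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Older Pythons return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    # Prints nothing if no platform plugin is installed.
    for name, target in list_platform_plugins():
        print(f"{name} -> {target}")
```

If vllm-gaudi is installed and the assumption above holds, one of the listed entries should point into the plugin package. An empty result only means that no package in the environment registers under that group.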
Learn more: