Dynamic Model Loading with Docker Deployment #8077
ArijitSinghEDA announced in Q&A
If I use the Dockerfile to deploy a vLLM server, does it support only single-model deployment, or can I also load models dynamically?
The Dockerfile's only execution directive is this line:

```dockerfile
ENTRYPOINT ["python", "-m", "vllm.entrypoints.openai.api_server"]
```
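For context, because this ENTRYPOINT uses exec form, anything placed after the image name in `docker run` is appended as CLI arguments to the API server, which is how a model is normally selected at container start. A minimal sketch, assuming the official `vllm/vllm-openai` image; the model name is just an example:

```bash
# Arguments after the image name are appended to the ENTRYPOINT,
# so they become CLI flags for vllm.entrypoints.openai.api_server.
docker run --gpus all -p 8000:8000 \
  vllm/vllm-openai \
  --model mistralai/Mistral-7B-Instruct-v0.2
```

With this entrypoint, the served model appears to be fixed for the lifetime of the container rather than switchable at request time.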
If it does support loading multiple models, how do I use that?
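For illustration, one workaround (an assumption on my part, not something confirmed in this thread) is to run one container per model on separate ports, since each `api_server` instance serves a single base model:

```bash
# Sketch: one container per model, each mapped to its own host port.
# This is an assumed multi-container pattern, not a built-in
# multi-model feature of the api_server. In practice you would
# likely pin each container to a specific GPU instead of --gpus all.
docker run -d --gpus all -p 8000:8000 vllm/vllm-openai \
  --model mistralai/Mistral-7B-Instruct-v0.2
docker run -d --gpus all -p 8001:8000 vllm/vllm-openai \
  --model meta-llama/Meta-Llama-3-8B-Instruct
```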