As mentioned at https://dottxt-ai.github.io/outlines/latest/features/models/, with server-based models Outlines has limited control over text generation, so some output types are not supported.

However, for vLLM it seems like all output types are supported, in direct contrast to the other server-based models. Also, surprisingly, it uses the OpenAI client, while the actual OpenAI server-based model doesn't support anything but JSON Schema. As a result I have a few questions:
1. Can I host any Hugging Face model that can be served by vLLM and get full support for all output types?
2. If so, what is special about the vLLM integration that allows full support for all output types compared to the other server-based models? (A rough sketch of how I understand the connection follows this list.)
3. If I wanted to use https://github.com/EricLBuehler/mistral.rs as the inference server, relying on its OpenAI-compatible endpoint, what changes would need to be made in mistral.rs, Outlines, and anything else so that it supports all output types in the same manner as vLLM? (Are there PRs I could look at as a more hands-on guide?)
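For context on question 2, here is a minimal sketch of how I currently picture the vLLM integration based on the features page: Outlines wraps a standard openai.OpenAI client pointed at the vLLM server. I have not verified this against the source; the loader name from_vllm is my reading of the docs, and the regex example just illustrates one of the output types the matrix marks as vLLM-only among server backends.

```python
# Sketch only, not verified against the Outlines source. Assumes a vLLM
# OpenAI-compatible server running locally (e.g. `vllm serve <model>`).
import openai
import outlines

# The transport is a plain OpenAI client, even though the backend is vLLM.
client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical loader name, taken from my reading of the models docs page.
model = outlines.from_vllm(client)

# Regex-constrained generation: one of the output types the features matrix
# marks as supported for vLLM but not for most server-based models.
answer = model("What's a 3-digit number?", outlines.types.Regex(r"[0-9]{3}"))
print(answer)
```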
Features Matrix: https://dottxt-ai.github.io/outlines/latest/features/models/#features-matrix

RE: #3, I assume (without reading the code) that it's using extra parameters from https://docs.vllm.ai/en/stable/features/structured_outputs.html#online-serving-openai-api - it would be good to know more details and get a confirmation.
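If that guess is right, the mechanism would look roughly like this from the client side. This is a sketch assuming a local vLLM server; the parameter names (guided_choice, guided_regex, guided_json, ...) follow the structured-outputs page linked above and have changed across vLLM versions, and the model name is a placeholder:

```python
# Sketch: calling vLLM's OpenAI-compatible server directly with the
# guided-decoding extras documented in the structured-outputs docs.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder: whatever the server loaded
    messages=[{"role": "user", "content": "Pick one primary color."}],
    # Non-standard fields travel in extra_body; a stock OpenAI endpoint has
    # no equivalent, which would explain why only vLLM gets full output-type
    # coverage in the features matrix.
    extra_body={"guided_choice": ["red", "yellow", "blue"]},
)
print(completion.choices[0].message.content)
```

If so, then for mistral.rs (question #3) the two places to change would presumably be: teach its OpenAI-compatible endpoint the same kind of extra parameters, and teach Outlines to map its output types onto whatever constrained-decoding options mistral.rs exposes.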