Should we remove the `model_name` param when instantiating models #1580

RobinPicard · 2025-05-15T09:51:50Z

RobinPicard
May 15, 2025
Maintainer

Currently, many models accept a model_name argument in their __init__ function. It was added as a convenience so that users specify the model name only once, at instantiation. The model name is then passed to the provider's generate function on each model call, without the user having to include it in the inference parameters.
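As a rough illustration of the mechanism described above, here is a minimal sketch. The class ModelWithDefaultName and the function fake_generate are hypothetical stand-ins and not the actual Outlines implementation; the sketch also assumes the provider's keyword is called model.

```python
from typing import Any, Callable, Optional


class ModelWithDefaultName:
    """Hypothetical wrapper that remembers a model name given at init."""

    def __init__(self, generate: Callable[..., str], model_name: Optional[str] = None):
        self.generate = generate
        self.model_name = model_name

    def __call__(self, prompt: str, **inference_kwargs: Any) -> str:
        # Inject the stored name only when the caller did not pass one.
        if self.model_name is not None:
            inference_kwargs.setdefault("model", self.model_name)
        return self.generate(prompt, **inference_kwargs)


def fake_generate(prompt: str, **kwargs: Any) -> str:
    # Stand-in for a provider call; it echoes the model it was asked for.
    return f"[{kwargs.get('model')}] {prompt}"


model = ModelWithDefaultName(fake_generate, model_name="claude-3-haiku-20240307")
response = model("Hello world")
```

An explicit model= keyword at call time still takes precedence over the stored name in this sketch.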

An issue with this setup is that it's contrary to the general approach of Outlines v1 in which we delegate as much as possible to the provider's client/model and do not try to standardize their behavior.

I'm in favor of removing this feature, as it adds complexity: users have to learn Outlines-specific behavior. We want users to be able to use Outlines in a way that is as similar as possible to how they would use their provider's model.

For instance, with Anthropic (their client's inference argument for the model name is called model):

from anthropic import Anthropic as AnthropicClient
from outlines import from_anthropic

# Current: the model name is given once, at instantiation
client = AnthropicClient()
model = from_anthropic(client, model_name="claude-3-haiku-20240307")
response = model("Hello world")

# Proposed: the model name is passed at each call, using the provider's keyword
client = AnthropicClient()
model = from_anthropic(client)
response = model("Hello world", model="claude-3-haiku-20240307")

cpfiffer · 2025-05-15T16:04:17Z

cpfiffer
May 15, 2025

TLDR: the interface in Outlines v1 is a lot less easy to use than <v1. I'd prefer more convenience functionality, such as leaving in model_name and special-casing it for the user. I would also support a separate interface library that is more opinionated on the user's behalf.

I think we leave it in, though you are right that it does not generally align with the current design. We may also clash with model_name if the underlying provider supports model_name with a different meaning, though I'm not sure this is the case for any of our inference engines.

I usually tilt more towards being opinionated on the interface, such as with automatic JSON schema validation or setting a max_tokens default keyword for transformers.

In my opinion, we've actually removed a lot of convenience from the user. Outlines v1.0 is quite different in terms of simplicity of use, mostly because Outlines pre-v1 was extraordinarily simple to use. Our popularity likely arose largely because Outlines before v1 required very little modification of models, generators, etc. Outlines took a lot of common, repeated, and annoying model-specific stuff off the user's hands.

That said, the prior behavior regarding keyword forwarding was a bad experience for transformers configs, for example, because you had to specify a dictionary. Lots of annoying little things arose partly because Outlines took care of slightly too much, but as with all things there has to be some kind of balance between v1's pure-unix style and <v1's ultra-convenient interface.

However, in the specific case of model_name, removing it requires users to know a lot about each of their model providers, and it prevents easily switching your inference client without also modifying runtime code.

In the old outlines, all you did was construct a single model class, which worked arbitrarily prior to inference time. Now, the name of the keyword argument changes at runtime, which makes user code more fragile and annoying to maintain.

The current outlines supports easy model hot-swapping, whereas under the proposal changing the inference engine requires users to touch possibly many parts of their code. In the proposed interface, a user who toggles between cloud and local compute would have to:

1. Change model_name to model when the inference engine changes
2. Know about the keyword argument that works across different engines

1 is an annoyance, enough that I'd be irritated by it. 2 is not a huge problem, in that users should probably know their engine's keywords, but this information is scattered all over the web and is not as readily accessible and standardized as it was prior to v1.

My suggestions are not well suited to the design of Outlines v1, though. You are right that these convenience features start introducing incongruent behavior at the low level.

I might prefer a second package with a simpler interface that is more opinionated, so perhaps this discussion is better suited elsewhere.

3 replies

rlouf May 16, 2025
Maintainer

instructor has the exact same interface for model integration and yet is very popular, if not more.

It also recently introduced a simplified interface which we could consider adding in outlines:

import instructor

client = instructor.from_provider("openai/gpt-4")  # OpenAI
client = instructor.from_provider("anthropic/claude-3-sonnet")  # Anthropic
client = instructor.from_provider("google/gemini-pro")  # Google
client = instructor.from_provider("mistral/mistral-large")  # Mistral
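For what it's worth, a similar entry point in outlines could be little more than a string split plus a registry lookup. The sketch below is only an assumption about how that might look: PROVIDERS, _make_fake_factory, and this from_provider are hypothetical names, and the factories are stand-ins that record their arguments instead of building real clients.

```python
from typing import Any, Callable, Dict


def _make_fake_factory(provider: str) -> Callable[..., Dict[str, Any]]:
    # Stand-in factory: a real one would create the provider's client
    # and wrap it in a model object.
    def factory(model_name: str, **client_kwargs: Any) -> Dict[str, Any]:
        return {
            "provider": provider,
            "model_name": model_name,
            "client_kwargs": client_kwargs,
        }

    return factory


# Hypothetical registry keyed by the prefix of the "provider/model" string.
PROVIDERS = {name: _make_fake_factory(name) for name in ("openai", "anthropic")}


def from_provider(spec: str, **client_kwargs: Any) -> Dict[str, Any]:
    # Split on the first slash only, so model names may contain slashes.
    provider, sep, model_name = spec.partition("/")
    if not sep or not provider or not model_name:
        raise ValueError(f"Expected 'provider/model', got {spec!r}")
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")
    # Extra keyword arguments are handed to the factory for advanced use.
    return PROVIDERS[provider](model_name, **client_kwargs)


model = from_provider("anthropic/claude-3-sonnet", timeout=30)
```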

Design-wise we should always start with the most flexible, and possibly write thin wrappers down the line.

Unifying the inference kwargs is something we considered keeping at some point, and we should probably revisit it now that we have much cleaner and more modular model integrations. Then we can consider imposing reasonable defaults library-wide, instead of having to special-case the logic for each model integration.

RobinPicard May 16, 2025
Maintainer Author

To make decisions on those topics, I think it's important that we define more precisely our goals and what type of users we want to focus on. Instructor seems to be made entirely for interacting with managed solutions, for instance, which is not the case for Outlines.

For the 2 elements you mentioned, I think both are feasible with our current design:

1. We could have a reserved inference keyword that takes a dataclass instance with a set of standardized keywords we have defined. Then, each model could have an object similar to the ModelTypeAdapter to translate the keywords into what the model expects. In any case, the user would be free not to use it and to rely on kwargs as in the current design.
2. The from_provider function should be quite straightforward to implement, as we could let the user provide kwargs that are passed on to the client for more complex use cases.
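To make the first idea more concrete, here is a minimal sketch of standardized keywords plus a per-model translator. Everything in it is an assumption for illustration: InferenceParams, KeywordAdapter, the field names, and the two mappings are hypothetical (model_id in particular is invented), and none of it is existing Outlines code.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class InferenceParams:
    """Hypothetical standardized keywords; field names are illustrative."""

    model_name: Optional[str] = None
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None


class KeywordAdapter:
    """Translates standardized keywords into one model's keyword names."""

    def __init__(self, mapping: Dict[str, str]):
        self.mapping = mapping

    def translate(
        self, params: Optional[InferenceParams] = None, **kwargs: Any
    ) -> Dict[str, Any]:
        translated: Dict[str, Any] = {}
        if params is not None:
            for name, value in asdict(params).items():
                # Unset fields are skipped so the provider's defaults apply.
                if value is not None:
                    translated[self.mapping[name]] = value
        # Provider-native kwargs pass through untouched and win on conflict.
        translated.update(kwargs)
        return translated


# Illustrative mappings only; real keyword names depend on each engine.
CLOUD_STYLE = KeywordAdapter(
    {"model_name": "model", "max_tokens": "max_tokens", "temperature": "temperature"}
)
LOCAL_STYLE = KeywordAdapter(
    {"model_name": "model_id", "max_tokens": "max_new_tokens", "temperature": "temperature"}
)

params = InferenceParams(model_name="my-model", max_tokens=64)
cloud_kwargs = CLOUD_STYLE.translate(params)
local_kwargs = LOCAL_STYLE.translate(params, top_k=5)
```

Users who prefer the current design would skip the dataclass and pass only provider-native kwargs, which the sketch forwards unchanged.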

Something I would insist on is maintaining flexibility for advanced users. I agree that it's a lot easier to design now that we have a very flexible organization.

cpfiffer May 16, 2025

Agree on all fronts here.

Flexibility and low level first, clean interfaces built on top of that flexibility. If you're some kind of smarty pants and you want all the low level stuff, then we can provide all of that cleanly with what we have.

I also agree that we should focus on our user types and target them specifically. I'll ask on our discord to get more information, but I can also speculate a little on who our users are:

Less technically sophisticated, non-AI people who want to do some work. Data scientists and stuff. To support these users means having the lowest overhead interface possible. <v1 is great for these types.
Researchers. They want control. v1 gives them this in a big, powerful way.
Big AI people at foundation labs. They're half researchers, half product people, but they typically have the resources needed to work with any available interface.
Tinkerers and localllama people. They don't usually want a ton of flexibility, they just want the model to say stuff in a structured way. <v1 is great for them.
Backend or inference people working on hardware or high-throughput technology, such as vLLM. They want the lowest-level tooling possible, basically as close to the metal as possible. v1 is pretty good at this, though tbh they're probably better off going to outlines-core directly.

I think we want all of these user types involved, and I think we're most of the way there as far as flexibility is concerned, though at the expense of some of the interface's simplicity.

The massive value that Instructor offers is just the interface. The reprompting stuff isn't going to be particularly useful going forward, especially as more and more providers offer constrained decoding. It is comically easy to use Instructor, and that's why it's popular (as Remi noted).

I'd rather not cede more ground to Instructor on the interface front, especially since they have the correct idea that most inference should not be local. Most structured generation should happen either in a separate process on the same machine, or on a fully remote server like vLLM or Anthropic or whoever.

We're addressing a lot of this currently, but I suspect it's possible to provide some simple convenience tools in a separate module that mildly mimic the old interface.

RobinPicard · 2025-06-24T13:01:19Z

RobinPicard
Jun 24, 2025
Maintainer Author

Closing the discussion as we decided to keep it.

0 replies
