- `model` (required): The OCI artifact identifier for the model. This is what Compose pulls and runs via the model runner (see the example after this list).
- `context_size`: Defines the maximum token context size for the model.

  > [!NOTE]
  > Each model has its own maximum context size. When increasing the context length,
  > consider your hardware constraints. In general, try to keep context size
  > as small as feasible for your specific needs.

- `runtime_flags`: A list of raw command-line flags passed to the inference engine when the model is started.
  For example, if you use llama.cpp, you can pass any of [the available parameters](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md).
- Platform-specific options may also be available via extension attributes `x-*`.
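For instance, a top-level `models` definition combining these attributes might look like the following sketch. The model name `ai/smollm2`, the context size, and the runtime flag are illustrative placeholders, assuming a llama.cpp backend:

```yaml
models:
  llm:
    # OCI artifact identifier that Compose pulls and runs via the model runner
    model: ai/smollm2
    # keep the context window as small as your use case allows
    context_size: 4096
    # raw flags forwarded to the inference engine at startup
    runtime_flags:
      - "--verbose"
```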
> [!IMPORTANT]
> This approach is deprecated. Use the [`models` top-level element](#basic-model-definition) instead.

You can also use the `provider` service type, which allows you to declare platform capabilities required by your application.
For AI models, you can use the `model` type to declare model dependencies.
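A minimal sketch of this deprecated form follows. The service names and model identifier are illustrative, and it assumes the `options.model` key carries the same OCI artifact identifier used in the top-level `models` element:

```yaml
services:
  chat:
    image: my-chat-app
    depends_on:
      - ai_runner

  # provider service: declares a required model capability
  # instead of running a container image
  ai_runner:
    provider:
      type: model
      options:
        model: ai/smollm2
```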