Hello everyone,
I’m deploying the Mistral Small 3.1 (2503) model with llama.cpp, and I noticed that the default chat template doesn’t include tool annotations. As a result, llama-server cannot properly use the function-calling feature, unlike Ollama.
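For reference, the sketch below is how I check whether tool calls come back from llama-server's OpenAI-compatible endpoint. It is only a minimal sketch: the `localhost:8080` address, the model name, and the `get_weather` tool are placeholders I made up for illustration, not part of my actual setup.

```python
import json
import requests

# Minimal tool-calling probe against llama-server's OpenAI-compatible API.
# Placeholders: server address, model name, and the get_weather tool schema.
payload = {
    "model": "Mistral-Small-3.1-24B-Instruct-2503",
    "messages": [
        {"role": "user", "content": "What's the weather in Taipei right now?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json=payload,
    timeout=120,
)
message = resp.json()["choices"][0]["message"]

# With a working template the model should answer with a tool call
# instead of plain text.
print(json.dumps(message.get("tool_calls"), indent=2))
```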
To address this, I modified the vLLM example and tested it with the following setup:
- Container image: ghcr.io/ggml-org/llama.cpp:server-cuda-b5391
- Model files from unsloth/Mistral-Small-3.1-24B-Instruct-2503-GGUF:
  - Mistral-Small-3.1-24B-Instruct-2503-Q4_K_M.gguf
  - mmproj-F16.gguf
Below is the final version of my modified chat template, along with the execution parameters. With these changes, the server correctly handles chat completions, function calls, and interleaved images in the chat history.
https://gist.github.com/Phate334/dd633561879f41a8c4affc4031df1c7f
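To sanity-check the multimodal side, I send interleaved text and image parts using the OpenAI-style content format. Again, this is only a sketch: the server address, model name, and image paths are placeholders.

```python
import base64
import requests

def image_part(path: str) -> dict:
    """Encode a local image as an OpenAI-style image_url content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}

# Two images interleaved with text in a single user turn.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Here is the first photo."},
            image_part("first.jpg"),
            {"type": "text", "text": "And here is the second one. What changed between them?"},
            image_part("second.jpg"),
        ],
    }
]

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={"model": "Mistral-Small-3.1-24B-Instruct-2503", "messages": messages},
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```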
I’d appreciate any feedback on whether this approach is correct or if there are any missing steps.