An Ollama server I use for local experiments and small proofs of concept (POCs). It builds automatically, downloads a Mistral model on first start, and exposes an HTTP API for interaction.
Requirements:

- Docker
- Make
- Clone the project repository.
- Start the server with the `make` command:

```sh
make start
```
- Once the server is running, you can interact with the API using the following command, which sends a "Hello, World!" request:

```sh
make test
```
- If you visit http://localhost:11434/, you will see the message "Ollama is running" (you can also run this check with the curl command shown after this list).
- The `make start` command builds the image, and the Mistral model is downloaded during the first startup.
- The initial run takes longer because the Mistral model is approximately 4 GiB.
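A quick way to run the health check mentioned above from the terminal, using plain `curl` against the root endpoint:

```sh
# Health check: prints "Ollama is running" when the server is up.
curl http://localhost:11434/
```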
Simply add or update the `ollama/entrypoint.sh` file to pull all the models you need.
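As a rough sketch, such an entrypoint can look like the following. This is an assumed structure, not the repository's actual file, and any model beyond `mistral` is only an example:

```sh
#!/bin/sh
# Hypothetical ollama/entrypoint.sh: start the server in the background,
# pull the required models, then keep the server process in the foreground.
ollama serve &
sleep 5               # crude wait for the API to come up

ollama pull mistral   # the model this project uses
# ollama pull llama3  # example: add further `ollama pull` lines as needed

wait
```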
Request payload format without streaming:

```json
{
  "model": "mistral",
  "messages": [{"role": "user", "content": "Hey, hello"}],
  "stream": false
}
```
Example `curl` request:

```sh
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"mistral","messages":[{"role":"user","content":"Hey, hello"}],"stream":false}'
```
Response format (stream off):

```json
{
  "model": "mistral",
  "created_at": "2025-03-22T11:18:44.390931329Z",
  "message": {
    "role": "assistant",
    "content": "Hi there! How can I help you today?"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 6333487103,
  "load_duration": 4574464627,
  "prompt_eval_count": 13,
  "prompt_eval_duration": 652533937,
  "eval_count": 11,
  "eval_duration": 1096839380
}
```
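If you only need the assistant's reply, you can pipe the non-streaming response through `jq` (assuming `jq` is installed). Note that with `"stream": true` the endpoint instead returns newline-delimited JSON chunks, so this filter applies only to the `"stream": false` case:

```sh
# Extract just the assistant's message text from the response above.
curl -s -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"mistral","messages":[{"role":"user","content":"Hey, hello"}],"stream":false}' \
  | jq -r '.message.content'
```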