Describe the bug
I am encountering a persistent llama.cpp server error (exit code: -4) when attempting to load GGUF models with text-generation-webui running in a Docker container. My setup is a roughly 10-year-old Dell workstation with a CPU that lacks AVX2 instruction support. Even the atinoda/text-generation-webui:default-nvidia-noavx2 Docker image results in this error, indicating that the llama.cpp binary within it may still rely on CPU instructions not supported by my older processor.
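To confirm which SIMD features the host CPU actually exposes, a minimal diagnostic sketch like the one below can be run on the host or inside the container. It just reads the Linux /proc/cpuinfo flags line; it is not part of the webui, and the feature list is simply the set that llama.cpp builds commonly assume.

```python
# Diagnostic sketch (Linux-only): report which CPU features llama.cpp builds
# commonly depend on, based on the flags line in /proc/cpuinfo.
from pathlib import Path

def cpu_flags() -> set[str]:
    for line in Path("/proc/cpuinfo").read_text().splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
for feature in ("sse3", "ssse3", "avx", "avx2", "f16c", "fma"):
    print(f"{feature}: {'present' if feature in flags else 'MISSING'}")
```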
Is there an existing issue for this?
- I have searched the existing issues
Reproduction
1. Pulled and ran atinoda/text-generation-webui:default-cpu or atinoda/text-generation-webui:default-nvidia-noavx2 via Docker.
2. Ensured user_data/models is mounted to store the GGUF model.
3. Accessed the web UI and attempted to load tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf (a standalone launch sketch follows these steps).
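To isolate the failure from the web UI, a minimal sketch like the following can launch llama-server directly with the same flags that appear in the Logs section below and report whether the process was killed by a signal. It assumes llama-server is on PATH inside the container and that the model path matches the mounted volume; adjust as needed.

```python
# Hypothetical standalone reproduction: run llama-server with the flags the
# webui logs, then decode a negative return code (POSIX: killed by that signal).
import signal
import subprocess

cmd = [
    "llama-server",  # assumed to be on PATH inside the container
    "--model", "user_data/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",
    "--ctx-size", "2048",
    "--gpu-layers", "0",
    "--batch-size", "256",
    "--port", "33391",
    "--no-webui",
    "--threads", "1",
]

# On a working setup the server keeps running (stop it with Ctrl-C);
# on this machine it exits almost immediately.
proc = subprocess.run(cmd, capture_output=True, text=True)
print(proc.stdout)
print(proc.stderr)
if proc.returncode < 0:
    print(f"Killed by signal: {signal.Signals(-proc.returncode).name}")
else:
    print(f"Exited with code: {proc.returncode}")
```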
Screenshot
No response
Logs
When attempting to load a GGUF model, the llama-server process terminates unexpectedly. The relevant log snippets from the Docker container output are:
07:24:12-283821 INFO Loading "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"
07:24:12-301826 INFO llama-server command-line flags:
--model user_data/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf --ctx-size 2048 --gpu-layers 0 --batch-size 256 --port 33391 --no-webui --threads 1
07:24:12-303710 INFO Using gpu_layers=0 | ctx_size=2048 | cache_type=fp16
07:24:13-311473 ERROR Error loading the model with llama.cpp: Server process
terminated unexpectedly with exit code: -4
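For context, assuming the loader is reporting the raw subprocess return code, a negative value means the server process was killed by that signal, and signal 4 is SIGILL (illegal instruction), which fits a binary executing CPU instructions this processor does not support:

```python
# Decode the "-4" above: a child killed by signal N is reported as return
# code -N, and signal 4 is SIGILL (illegal instruction).
import signal

print(signal.Signals(4).name)  # -> SIGILL
```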
System Info
Host OS: Proxmox (running an Ubuntu 22.04 LTS container)
Docker Management: Portainer
WebUI Images Tried: atinoda/text-generation-webui:default-cpu and atinoda/text-generation-webui:default-nvidia-noavx2
CPU: Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz (a Sandy Bridge-era Xeon that supports AVX but lacks AVX2, F16C, and FMA.)
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Barts XT [Radeon HD 6870]
RAM: Hundreds of GBs
Model: tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
Notes: gpu-layers is set to 0 in the llama-server flags, so GPU offloading is ruled out and the crash happens during CPU-only model loading. The problem occurs even with a model as small as tinyllama-1.1b.