Description
I tried to build a llamafile for the Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf model, both by letting the script download the model file itself and by pre-downloading the GGUF file into ./model/ before running the build script. Either way I keep getting the "llama_load_model_from_file: failed to load model" error, which may be related to the "warning: not a pkzip archive" message printed for the model file. The model is a fresh download made with Firefox, but the same thing happens if I download the GGUF file with wget.
Note that this is a freshly set up Ubuntu MATE 24.04.1 system on an Intel i5-8440 (with iGPU) and 32 GB DDR4 RAM (single channel).
chugnomug@mtkailash:~/Work/llamafile_chat$ ./build_file.sh
Please enter the model URL: https://huggingface.co/bartowski/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf
Model already exists.
Building Docker image...
[+] Building 33.6s (15/15) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.52kB 0.0s
=> resolve image config for docker-image://docker.io/docker/dockerfile:1 1.7s
=> CACHED docker-image://docker.io/docker/dockerfile:1@sha256:865e5dd094beca432e8c0a1d5e1c465db5f998dca4e439981029b3b81fb39ed5 0.0s
=> [internal] load metadata for docker.io/library/debian:bullseye-slim 1.6s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> CACHED [downloader 1/5] FROM docker.io/library/debian:bullseye-slim@sha256:610b4c7ad241e66f6e2f9791e3abdf0cc107a69238ab21bf9b4695d51fd6366a 0.0s
=> [final 2/5] RUN addgroup --gid 1000 user 0.4s
=> CACHED [downloader 2/5] WORKDIR /download 0.0s
=> [downloader 3/5] RUN apt-get update && apt-get install -y curl 8.9s
=> [final 3/5] RUN adduser --uid 1000 --gid 1000 --disabled-password --gecos "" user 0.3s
=> [final 4/5] WORKDIR /usr/src/app 0.1s
=> [downloader 4/5] RUN curl -L -o ./llamafile https://github.com/Mozilla-Ocho/llamafile/releases/download/0.8.16/llamafile-0.8.16 18.2s
=> [downloader 5/5] RUN chmod +x ./llamafile 0.9s
=> [final 5/5] COPY --from=downloader /download/llamafile ./llamafile 0.6s
=> exporting to image 0.7s
=> => exporting layers 0.7s
=> => writing image sha256:19790387eb6eab39840265d1d9489a50d0916a404c1c3039697fe4b470299704 0.0s
=> => naming to docker.io/library/llamafile_image 0.0s
Running Docker container...
Model filename: Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf
Current directory: /home/chugnomug/Work/llamafile_chat
Running Docker container with the following command:
docker run -p 8080:8080 -v "/home/chugnomug/Work/llamafile_chat/model:/usr/src/app/model" llamafile_image --server --host 0.0.0.0 -m "/usr/src/app/model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf"
note: if you have an AMD or NVIDIA GPU then you need to pass -ngl 9999 to enable GPU offloading
{"build":1500,"commit":"a30b324","function":"server_cli","level":"INFO","line":2898,"msg":"build info","tid":"11808320","timestamp":1731317430}
{"function":"server_cli","level":"INFO","line":2905,"msg":"system info","n_threads":6,"n_threads_batch":6,"system_info":"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | ","tid":"11808320","timestamp":1731317430,"total_threads":6}
/usr/src/app/model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf: warning: not a pkzip archive
{"function":"load_model","level":"ERR","line":463,"model":"/usr/src/app/model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf","msg":"unable to load model","tid":"11808320","timestamp":1731317430}
llama_model_load: error loading model: failed to open /usr/src/app/model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf: Invalid argument
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/usr/src/app/model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf'
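For anyone hitting the same error: a quick way to rule out a corrupted or mis-downloaded file is to inspect its first four bytes, since every valid GGUF file begins with the ASCII magic `GGUF`. This is only a diagnostic sketch, shown against a stand-in file so it is self-contained; in practice you would run the `head` check against the real path, ./model/Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf:

```shell
# Stand-in file for demonstration; substitute the real model path.
printf 'GGUF' > /tmp/demo.gguf

# Every valid GGUF file starts with this 4-byte magic.
head -c 4 /tmp/demo.gguf; echo

# A truncated download, or an HTML error page saved under a .gguf name
# (a common result when a Hugging Face link redirects), would print
# something else here, e.g. "<!DO".
```

It is also worth comparing the file size (`ls -l`) against the size shown on the model's Hugging Face page, since a partially completed download produces the same "failed to load model" error.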