Bug: Performance drop with 14292913 #461 #490

@nux

What happened?

Performance dropped with commit 1429291 (#461): pp512 falls from about 76 t/s to about 27 t/s, and tg128 from about 10 t/s to about 4.8 t/s.

To identify the commit that introduced the drop, I ran:

for i in $(cut -d " " -f1 commits.txt); do git checkout $i; ./cmd-build.sh; ./start-bench.sh >> results.txt; done
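For anyone who wants to repeat the sweep, here is a slightly more defensive sketch of the same loop. It assumes, as above, that the first field of each line in commits.txt is a commit hash. The function name `bench_commits`, the `BUILD_CMD` and `BENCH_CMD` variables, and the `=== commit` marker line are my own illustration and not part of the repository; the defaults point at the two scripts used in this issue.

```shell
# Sketch: check out, build, and benchmark every commit listed in a file.
# bench_commits, BUILD_CMD and BENCH_CMD are hypothetical names for this example.
bench_commits() {
    local commits_file="$1" results_file="$2"
    local build_cmd="${BUILD_CMD:-./cmd-build.sh}"
    local bench_cmd="${BENCH_CMD:-./start-bench.sh}"
    local commit
    # Read the list on fd 3 so the build or benchmark cannot consume it via stdin.
    while read -r commit _ <&3 || [ -n "$commit" ]; do
        [ -n "$commit" ] || continue
        git checkout --quiet "$commit" || { echo "checkout failed: $commit" >&2; return 1; }
        # Stop on a failed build so a stale binary is never benchmarked.
        $build_cmd || { echo "build failed: $commit" >&2; return 1; }
        echo "=== commit $commit ===" >> "$results_file"
        $bench_cmd >> "$results_file"
    done 3< "$commits_file"
}
```

Calling `bench_commits commits.txt results.txt` should produce the same kind of results.txt as the one-liner, with an extra marker line before each run. The repository is left on the last commit in the list (detached HEAD), just as with the original loop.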

start-bench.sh is:
./build/bin/llama-bench -m /mnt/nvme/models/ubergarm/DeepSeek-V3-0324-GGUF/DeepSeek-V3-0324-IQ4_K_R4/DeepSeek-V3-0324-IQ4_K_R4-00001-of-00010.gguf -p 512 -t 32 -mla 2 -fa 1 -fmoe 1 -ngl 99 --override-tensor "exps=CPU" -amb 512

Relevant results.txt:

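| model | size | params | backend | ngl | fa | mla | amb | fmoe | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | pp512 | 26.74 ± 0.05 |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | tg128 | 4.80 ± 0.00 |

build: 0976467 (3715)

| model | size | params | backend | ngl | fa | mla | amb | fmoe | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | pp512 | 26.75 ± 0.04 |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | tg128 | 4.81 ± 0.00 |

build: 1429291 (3714)

| model | size | params | backend | ngl | fa | mla | amb | fmoe | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | pp512 | 76.24 ± 1.44 |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | tg128 | 10.08 ± 0.06 |

build: 24c010b (3713)

| model | size | params | backend | ngl | fa | mla | amb | fmoe | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | pp512 | 77.25 ± 0.70 |
| deepseek2 671B IQ4_K_R4 - 4.5 bpw | 386.18 GiB | 672.05 B | CUDA | 99 | 1 | 2 | 512 | 1 | tg128 | 10.07 ± 0.06 |

build: c7ecd4e (3712)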
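Since every table in results.txt is followed by a `build:` line, the file can be condensed to one line per build and test, which makes the jump between 24c010b and 1429291 easier to spot. The sketch below assumes results.txt holds the pipe-delimited markdown tables that llama-bench writes to the terminal, with the test name and t/s as the last two columns; `summarize_results` is a hypothetical helper name, not something shipped with the project.

```shell
# Sketch: print "<build hash> <test> <t/s>" (tab-separated) for each benchmark row.
# Assumes pipe-delimited tables, each followed by a "build: <hash> (<number>)" line.
summarize_results() {
    awk -F'|' '
        # Data rows only: skip the header row (contains "t/s") and the separator row.
        NF > 3 && $0 !~ /t\/s/ && $0 !~ /---/ {
            test = $(NF-2); tps = $(NF-1)
            gsub(/^ +| +$/, "", test); gsub(/^ +| +$/, "", tps)
            rows[n++] = test "\t" tps
        }
        # The build line closes a table: emit the buffered rows with its hash.
        /^build:/ {
            split($0, parts, " ")
            for (i = 0; i < n; i++) print parts[2] "\t" rows[i]
            n = 0
        }
    ' "$1"
}
```

Running `summarize_results results.txt` prints the build hash, the test name, and the t/s value on each line, in the order the runs appear in the file.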

Building like this:
cmake -B build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build build --config Release -j --clean-first

Running on 2x 9115, 768 GB RAM, and a 3090 GPU.

Name and Version

version: 3710 (9fb82af)
built with cc (Debian 12.2.0-14+deb12u1) 12.2.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output
