Illegal Instruction using Q4_0_4_4 on Rockchip rk3399 SOC #9853

EugeneParmentier · 2024-10-11T20:59:11Z

EugeneParmentier
Oct 11, 2024

Introduction

I'm experimenting with llama.cpp on Linux, running on the Pinephone Pro, a device which uses a Rockchip rk3399 SOC.
My goal is to run very small models (range 360M - 2B) as fast as possible on the device.

Q4 quantization offers the best performance so far, but I want to try the NEON-optimized version.

However, when running llama.cpp on the device with a Q4_0_4_4 model, the program crashes, reporting an Illegal Instruction just after loading the model (presumably when starting inference).

Troubleshooting

All cores of the SOC support ARMv8-A and NEON SIMD instructions (datasheet).
I've compiled llama.cpp on-device with gcc, using make GGML_NO_LLAMAFILE=1. I am aware that the docs say to set the -march option explicitly during the build, but I'm unsure what I'm supposed to set for my CPU, and whether that's really relevant since the build is done with -march=native.
When starting, llama-cli reports NEON = 1 and LLAMAFILE = 0.
cat /proc/cpuinfo reports: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
When I run llama.cpp under valgrind, the program works without crashing. Hypothesis: either valgrind (emulation) supports more instructions than the CPU, or my kernel is missing some features and disallows instructions supported by the CPU.
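
The CPU-flag side of this can be checked by looking for one specific feature word in that Features line, rather than only for asimd (NEON). Below is a minimal sketch, assuming the Linux arm64 convention that the dot-product extension shows up as the flag asimddp. The helper name has_cpu_flag is made up for illustration and is not part of llama.cpp.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in
 * `features` (the text after "Features :" in /proc/cpuinfo), 0 otherwise.
 * Hypothetical helper for illustration, not a llama.cpp function. */
int has_cpu_flag(const char *features, const char *flag) {
    size_t flag_len = strlen(flag);
    const char *p = features;

    if (flag_len == 0) {
        return 0;
    }
    while ((p = strstr(p, flag)) != NULL) {
        int starts_word = (p == features) || (p[-1] == ' ') || (p[-1] == '\t');
        char next = p[flag_len];
        int ends_word = (next == '\0') || (next == ' ') ||
                        (next == '\t') || (next == '\n');
        if (starts_word && ends_word) {
            return 1;
        }
        p += 1; /* partial match (e.g. "asimd" inside "asimddp"): keep scanning */
    }
    return 0;
}
```

With the flags listed above, has_cpu_flag(features, "asimd") returns 1 while has_cpu_flag(features, "asimddp") returns 0. The whole-word match matters here because asimd is a prefix of asimddp.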

Conclusion

What am I missing to be able to run a Q4_0_4_4 model on this SOC / system? Is it even possible?

Answered by Djip007

Oct 19, 2024

Arm introduced the SDOT (Signed Dot Product) and UDOT (Unsigned Dot Product) instructions in
the 2017 extensions to the Arm Architecture, known as Armv8.4-A. (https://documentation-service.arm.com/static/61b85a52a2506b46a5c323aa)

These instructions (used in ggml-aarch64.c) are not available on all NEON-capable Arm CPUs; the RK3399 is an older ARMv8-A design with no SDOT support.
So, if I am not mistaken, you can't use the "accelerated" Q4_0_4_4 model with it.

// This check alone is not enough to use the sdot instruction.
if (ggml_cpu_has_neon()) {
}

// we need something like:
if (ggml_cpu_has_arm() >= ARM_V84_A) {
}
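
A runtime check along those lines could look like the sketch below. It assumes Linux on AArch64, where getauxval(AT_HWCAP) exposes a HWCAP_ASIMDDP bit for the dot-product extension. The function names hwcap_has_dotprod and cpu_has_dotprod are hypothetical and are not existing ggml APIs; the fallback bit value is an assumption mirroring the Linux arm64 headers so that the sketch also compiles on other platforms.

```c
#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_ASIMDDP */
#endif

/* Fallback so the sketch compiles off-target; assumed to match the
 * Linux arm64 hwcap bit for the dot-product extension. */
#ifndef HWCAP_ASIMDDP
#define HWCAP_ASIMDDP (1UL << 20)
#endif

/* Pure helper: does this AT_HWCAP word advertise SDOT/UDOT support? */
int hwcap_has_dotprod(unsigned long hwcap) {
    return (hwcap & HWCAP_ASIMDDP) != 0;
}

/* Hypothetical runtime check that could sit next to a NEON check. */
int cpu_has_dotprod(void) {
#if defined(__aarch64__) && defined(__linux__)
    return hwcap_has_dotprod(getauxval(AT_HWCAP));
#else
    return 0; /* unknown platform: take the conservative answer */
#endif
}
```

A kernel that uses sdot would then be guarded by both the NEON check and cpu_has_dotprod(), falling back to the generic path when either is false.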


ggerganov · 2024-10-12T05:20:04Z

ggerganov
Oct 12, 2024
Maintainer

Does it work with this patch:

diff --git a/ggml/src/ggml-aarch64.c b/ggml/src/ggml-aarch64.c
index b27f4114..7c6817b3 100644
--- a/ggml/src/ggml-aarch64.c
+++ b/ggml/src/ggml-aarch64.c
@@ -617,7 +617,7 @@ void ggml_gemv_q4_0_4x4_q8_0(int n, float * restrict s, size_t bs, const void *
     UNUSED(ncols_interleaved);
     UNUSED(blocklen);
 
-#if ! ((defined(_MSC_VER)) && ! defined(__clang__)) && defined(__aarch64__) && defined(__ARM_NEON)
+#if 0
     if (ggml_cpu_has_neon()) {
         const void * b_ptr = vx;
         const void * a_ptr = vy;

This disables the actual ARM_NEON code and falls back to the non-SIMD execution path. I just want to confirm the location of the illegal instruction.

0 replies

EugeneParmentier · 2024-10-12T17:10:54Z

EugeneParmentier
Oct 12, 2024
Author

Note : I was (and still am) running my tests in interactive mode by launching :
./llama-cli -m model-Q4_0_4_4.gguf -p "You are an helpful assistant" -c 1024 -cnv

After applying the patch, the program no longer crashes after loading the model, and it displays the interactive prompt token ('> ').
However, immediately after I enter a sentence (and press enter), the program crashes and I get an Illegal Instruction.

I suppose the code you had me patch was only called during model initialization, and now the program crashes because it hit another code path with NEON instructions?

5 replies

ggerganov Oct 12, 2024
Maintainer

Hm, that's strange. I'm not sure how to debug this further. There seems to be something wrong during the compilation, although a native build is supposed to work, as you pointed out. Let us know if you find some more information.

EugeneParmentier Oct 12, 2024
Author

Alright!
Thanks for the help! I'm going to try to dig deeper.
Maybe I can find a way to retrieve the illegal instruction the CPU tried to execute and identify it.
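
One possible way to do that from inside the process is a SIGILL handler that prints the raw instruction word at the faulting address, which can then be looked up or disassembled. The following is only a sketch, assuming a POSIX system where si_addr points at the faulting instruction for SIGILL and a fixed 4-byte instruction width as on AArch64. The names u32_to_hex and install_sigill_reporter are invented for this example. Running the program under gdb and disassembling at the faulting program counter would be an alternative.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Writes `value` as 8 lowercase hex digits plus a terminating NUL into `out`
 * (at least 9 bytes). Used instead of printf inside the signal handler. */
void u32_to_hex(uint32_t value, char out[9]) {
    static const char digits[] = "0123456789abcdef";
    for (int i = 7; i >= 0; --i) {
        out[i] = digits[value & 0xFu];
        value >>= 4;
    }
    out[8] = '\0';
}

/* Small wrapper around write(); the result is ignored on purpose. */
static void emit(const char *text, size_t len) {
    ssize_t written = write(STDERR_FILENO, text, len);
    (void)written;
}

/* SIGILL handler: reads the 4 bytes at the faulting address and prints them. */
static void on_sigill(int sig, siginfo_t *info, void *ctx) {
    static const char msg[] = "SIGILL, instruction word 0x";
    uint32_t word;
    char hex[9];
    (void)sig;
    (void)ctx;
    memcpy(&word, info->si_addr, sizeof word);
    u32_to_hex(word, hex);
    emit(msg, sizeof msg - 1);
    emit(hex, 8);
    emit("\n", 1);
    _exit(1);
}

/* Returns 0 on success, -1 on failure (same convention as sigaction). */
int install_sigill_reporter(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}
```

Calling install_sigill_reporter() early in main would make the crash print the offending 32-bit word before exiting, so it can be matched against the instruction encodings in the Arm architecture reference.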

Djip007 Oct 19, 2024


Answer selected by EugeneParmentier

ggerganov Oct 20, 2024
Maintainer

Good catch. Would you like to submit a patch to fix this?

Djip007 Oct 25, 2024

I am not sure how to do it, and for now I am working on adding FP8.

We may have to find a "define" set by the compiler for SDOT, as is done for AVX512_BF / FP16 ...

https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/exploring-the-arm-dot-product-instructions
It looks like it is optional (so it may be present) on the A55/A75...
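
For the compiler side, the ACLE feature macro __ARM_FEATURE_DOTPROD is defined when the build target allows the dot-product instructions. The sketch below shows how such a define could gate an SDOT-based path with a scalar fallback. It is an illustration only, not the ggml code: the function name dot_i8x16 is hypothetical, and a compile-time macro only reflects the flags the binary was built with, not the CPU it later runs on.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Dot product of two 16-element int8 vectors (hypothetical example function). */
int32_t dot_i8x16(const int8_t a[16], const int8_t b[16]) {
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
    /* Only compiled when the target is allowed to use SDOT. */
    int32x4_t acc = vdupq_n_s32(0);
    acc = vdotq_s32(acc, vld1q_s8(a), vld1q_s8(b));
    return vaddvq_s32(acc);
#else
    /* Portable fallback, e.g. for Armv8.0-A cores without SDOT. */
    int32_t sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
#endif
}
```

With -march=native on a CPU without the extension, the macro should be undefined and only the fallback is compiled; a binary built for a newer target would still need a runtime check before taking the SDOT path.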
