GGUF

I was trying to convert a GGUF model, but there is a major limitation that the script does not handle.
RKLLM is only capable of converting GGUF files whose output weights are quantized as q4_0 or fp16. What matters is not the overall quant named in the filename but the quant of the output weight tensor.

For example, the Qwen 2.5 GGUF files from q2 to q6 all have a q6 output weight. Only the q8 file has a q8 output weight, and the fp16 file has an fp16 one.

In fact, there is no way to convert Qwen2.5 from GGUF other than with the fp16 file. In this case I would use the HF model instead.
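To make the check concrete, here is a minimal sketch of reading the per-tensor quant types straight from a GGUF header using only the standard library. It assumes the GGUF v2/v3 little-endian layout; the function names are hypothetical, the supported set (`Q4_0`, `F16`) is taken from this issue, and the `output.weight` / `token_embd.weight` names assume the usual llama.cpp tensor naming, including the fallback for models with tied embeddings.

```python
import io
import struct

# Subset of the ggml tensor type ids used in GGUF files.
GGML_TYPES = {0: "F32", 1: "F16", 2: "Q4_0", 3: "Q4_1", 6: "Q5_0", 7: "Q5_1",
              8: "Q8_0", 10: "Q2_K", 11: "Q3_K", 12: "Q4_K", 13: "Q5_K", 14: "Q6_K"}

# Assumption from this issue: RKLLM only accepts these output-weight types.
SUPPORTED_OUTPUT_TYPES = {"Q4_0", "F16"}

# struct formats for the scalar metadata value types (ids 0-7 and 10-12).
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i", 6: "<f",
            7: "<?", 10: "<Q", 11: "<q", 12: "<d"}


def _read(f, fmt):
    """Read one little-endian scalar from the stream."""
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    return f.read(_read(f, "<Q")).decode("utf-8")


def _skip_value(f, vtype):
    """Skip one metadata value; only the tensor infos are needed here."""
    if vtype in _SCALARS:
        f.read(struct.calcsize(_SCALARS[vtype]))
    elif vtype == 8:  # string
        f.read(_read(f, "<Q"))
    elif vtype == 9:  # array: item type, count, then the items
        item_type = _read(f, "<I")
        for _ in range(_read(f, "<Q")):
            _skip_value(f, item_type)
    else:
        raise ValueError(f"unknown metadata value type {vtype}")


def read_tensor_types(f):
    """Return {tensor_name: type_name} from an open binary GGUF stream."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    if _read(f, "<I") < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    tensor_count = _read(f, "<Q")
    kv_count = _read(f, "<Q")
    for _ in range(kv_count):
        _read_string(f)                 # metadata key
        _skip_value(f, _read(f, "<I"))  # metadata value
    types = {}
    for _ in range(tensor_count):
        name = _read_string(f)
        n_dims = _read(f, "<I")
        f.read(8 * n_dims)              # dimensions (uint64 each)
        type_id = _read(f, "<I")
        f.read(8)                       # data offset (uint64)
        types[name] = GGML_TYPES.get(type_id, f"UNKNOWN({type_id})")
    return types


def output_weight_type(types):
    # Assumption: with tied embeddings there is no output.weight tensor
    # and token_embd.weight is used as the output projection instead.
    return types.get("output.weight", types.get("token_embd.weight"))


def rkllm_can_convert(types):
    """True if the output weight type is one RKLLM is reported to accept."""
    return output_weight_type(types) in SUPPORTED_OUTPUT_TYPES
```

Usage would be something like `with open(path, "rb") as f: print(rkllm_can_convert(read_tensor_types(f)))`. Only the header is read, so the check is cheap even for large files.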

The second issue is that the GGUF logic downloads every file in the repo, while the GGUF loading function expects one specific filename. It should probably pick the q4_0 file, but in a real scenario we should first check whether the output weight is actually q4_0. Does the HF library have an option to read GGUF metadata? The detailed per-tensor information is stored there.
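As a rough sketch of downloading only the one file the loader needs: the snippet below selects a single `.gguf` filename from a repo listing and fetches just that file. `pick_gguf_file` and `download_single_gguf` are hypothetical helper names, and matching the quant by filename is only a first guess, since the output weight type still has to be checked after download. `list_repo_files` and `hf_hub_download` are the `huggingface_hub` calls assumed here.

```python
from fnmatch import fnmatchcase


def pick_gguf_file(repo_files, quant="q4_0"):
    """Pick the single .gguf file whose name mentions the wanted quant."""
    pattern = f"*{quant.lower()}*.gguf"
    matches = sorted(f for f in repo_files if fnmatchcase(f.lower(), pattern))
    if len(matches) != 1:
        # Zero matches, or several (for example split files): let the caller decide.
        raise ValueError(f"expected exactly one {quant} GGUF file, found {matches}")
    return matches[0]


def download_single_gguf(repo_id, quant="q4_0"):
    """Download only the selected GGUF file and return its local path."""
    # Imported lazily so the selection logic has no third-party dependency.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_gguf_file(list_repo_files(repo_id), quant)
    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Raising on anything other than exactly one match is deliberate: it avoids silently picking the wrong file when a repo ships several variants of the same quant.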

