Replies: 3 comments · 4 replies
-
I've also encountered this, with a llama-2-70b-chat q3_k_m ggml, after converting the same way you did. I'll try a 70B GGUF from TheBloke (presumably converted directly from the PyTorch .bins) and report back if those are fine.
-
I don't think so; this looks like correct usage (for reference, I'm the one responsible for that conversion script). It copies the tensors over unchanged and takes the metadata and vocabulary from the original model; the tool makes no changes to the tensors, and the metadata/vocab handling just reuses existing code.

edit: Thanks @person4268 for testing. Based on what they saw, I'd recommend running llama.cpp directly against the model file that's giving you issues and seeing whether you can still reproduce the problem.
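For anyone following that suggestion, a minimal direct invocation of llama.cpp's `main` binary against the converted file might look like the sketch below. The model path and prompt are placeholders; adjust them to your own build and files.

```
# Run the converted GGUF directly with llama.cpp (paths are hypothetical)
./main -m ./models/converted-model.gguf \
  -p "Write a short paragraph about mountains." \
  -n 256
```

If the missing-word behaviour shows up here too, the problem is in the model file or llama.cpp itself rather than in the downstream frontend.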
-
I'm seeing occasional missing words using llava with llama.cpp, at git commit 254cfefa680a5eaebda36ed8664659f42108c6ff, with mostly default settings (e.g. batch=512). I do single-shot generation, so I run llava once for each prompt+image combination, generating 100-1000 words.
(I've also seen a missing word with --temp 0.5 --top-p 0.99 --min-p 0.01.) I'm using a converted model, llava-f16-13b_q8_0.gguf (converted to q8_0), from:
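For context, a single-shot run like the one described above would look roughly like this. The binary name and flags are assumptions based on llama.cpp's llava example at that time; the model, projector, and image paths are placeholders.

```
# One-shot llava generation for a single prompt+image pair (hypothetical paths)
./llava -m ./models/llava-f16-13b_q8_0.gguf \
  --mmproj ./models/mmproj-model-f16.gguf \
  --image ./input.jpg \
  -p "Describe this image in detail." \
  --temp 0.5 --top-p 0.99 --min-p 0.01
```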
-
I'm using a WIP branch of oobabooga/text-generation-webui, so there could of course be something else there that needs updating.
I converted airoboros-33b-gpt4-2.0.ggmlv3.q5_K_M.bin from TheBloke to GGUF using the Python script and the metadata from the original model on Hugging Face.
Example generated text by GGUF model:
It looks as if the model has forgotten the words "contrary" and "suffering", which would fit where words appear to be missing in the first and third sentences. This happens very regularly with the converted model, while the original GGML file works fine.
Log from converter script:
Anything I'm likely doing wrong during conversion? I'm only passing the --input, --output, and --model-metadata-dir parameters. The metadata dir has tokenizer_config.json, tokenizer.model, and config.json from https://huggingface.co/jondurbin/airoboros-33b-gpt4-2.0/tree/main
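As a quick sanity check after conversion, you can at least verify that the output file really is GGUF: the format starts with the 4-byte magic b"GGUF" followed by a little-endian uint32 format version. This small helper is my own sketch, not part of the conversion script:

```python
import struct

def check_gguf_header(path):
    """Return (magic_ok, version) for a file that should be GGUF.

    GGUF files begin with the 4-byte magic b"GGUF" followed by a
    little-endian uint32 format version number.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        version = struct.unpack("<I", f.read(4))[0]
    return magic == b"GGUF", version
```

This won't catch subtle tensor or vocab problems like the missing-words issue, but it rules out a truncated or mis-written output file.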