how to reduce hallucinations for a specific 4bit / 8bit model? #3209
-
Not really. A lot of it depends on the model and how it was trained; some models will be better than others. You can look at HuggingFace's Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard. Models that have high average scores, or maybe high truthfulness scores, are likely to hallucinate less. Even really huge models with massive amounts of tuning and lots of help, like ChatGPT, still hallucinate a decent amount.

Quantizing less just reduces the quality loss from the original, full-sized version; it doesn't directly relate to hallucinating. Other stuff like sampling parameters can also have an effect. For example, setting the temperature lower may reduce hallucinations. So can the prompt you use: saying things like "Don't make it up if you don't know", telling it to think through its response step by step, etc.

Which approach works, which prompt to use, etc. can depend a lot on which model you're using. There isn't really a one-size-fits-all approach. The best thing to do is experiment and see what works, but you shouldn't expect to be able to eliminate hallucinations no matter what you do.
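For example, with the llama-cpp-python bindings it could look something like this. This is only a minimal sketch, not anything specific to your setup: the model path is a hypothetical placeholder and the parameter values are just starting points to experiment with.

```python
# Minimal sketch with llama-cpp-python (pip install llama-cpp-python).
# Model path and sampling values are placeholders; tune them per model.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-7b-model.Q6_K.gguf",  # hypothetical path
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
)

messages = [
    {
        "role": "system",
        "content": (
            "Answer only with facts you are confident about. "
            "If you don't know something, say you don't know instead of making it up."
        ),
    },
    {"role": "user", "content": "Write a short biography of Ada Lovelace."},
]

out = llm.create_chat_completion(
    messages=messages,
    temperature=0.2,  # lower temperature tends to cut down on invented details
    top_p=0.9,
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```

Nothing like this guarantees zero hallucination; it just biases the sampling and the prompt toward more conservative answers.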
-
"Don't make it up if you don't know" seems to help a lot! thx. vram 8gb rtx 4060, 16gb ram. ryzen 7640HS i think it's a laptop i'm working on portability. using ubuntu 22.04 "headless" mode to get more vram out of it. i'm looking at P40s for desktop now but saw some compat issues with latest llama.cpp in github issue section. hope it's supported long term. |
Beta Was this translation helpful? Give feedback.
-
Hello. Do you have any additional information on effective prompts for reducing hallucination in the instructions, or do you know of any papers related to this topic? I'm curious whether you still think prompting in the instructions is an effective way to reduce hallucination.
-
how to reduce hallucinations for a specific 4bit / 8bit model?
What parameters should I use? 6_0 quantized 7B models from TheBloke seem OK, but sometimes (maybe 1 out of 10 times) they generate hallucinated stuff, especially with biographical content.
Can anyone give me ideas on how to reduce hallucination? I'm not sure how high the parameter count / bit width would have to be to get 100% zero hallucination. (Is that even possible?)