Replies: 2 comments
-
Hello @tommycwh! I'm here to help you solve bugs, answer your questions, and assist you in becoming a contributor. While you wait for a human maintainer, feel free to share your issue, and I'll do my best to assist you!

The issue you're experiencing, degraded generation output when using HuggingFaceLLM, has come up before. I found a similar unsolved discussion, "HuggingfaceLLM Meta Llama3.1 8B generation doesn't stop", which might provide additional insights or shared experiences [2].

To continue talking to Dosu, mention @dosu.
-
I just found that I made a very simple mistake here. I am sorry about that. I forgot to pass the tokenizer when I wrap the HF model as a `HuggingFaceLLM`. With this fixed, the code below generates reasonable text.
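A minimal sketch of the fixed wrapping, assuming a pre-loaded `transformers` model; the model id is a placeholder, substitute whatever checkpoint you are testing:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llama_index.llms.huggingface import HuggingFaceLLM

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # placeholder checkpoint

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Passing the tokenizer is the fix: without it, HuggingFaceLLM falls back to
# a default tokenizer that does not match the model's vocabulary, which is
# what produced the repeated/meaningless text described below.
llm = HuggingFaceLLM(
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
)

print(llm.complete("Compose a short travel blog post about Hawaii."))
```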
-
I have been using HuggingFace, and now I am trying out LlamaIndex to see whether I can use it to load HF models. However, when I load a HF model as a `llama_index.llms.huggingface.HuggingFaceLLM` model, it does not generate reasonable text, e.g., it produces repeated words or meaningless output. So, I want to ask what I am missing here that makes my HF models not work in LlamaIndex.

Let me explain my test case. To begin with, I am using a prompt from the `MiniMtBenchSingleGradingDataset` dataset, following the steps in the LlamaIndex example "Benchmarking LLM Evaluators On A Mini MT-Bench (Single Grading)". I mainly use the first prompt from the dataset:
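For context, a sketch of how that example notebook fetches the dataset, assuming the `download_llama_dataset` helper; the download directory is arbitrary:

```python
from llama_index.core.llama_dataset import download_llama_dataset

# Fetch the benchmark dataset used in the example notebook.
evaluator_dataset, _ = download_llama_dataset(
    "MiniMtBenchSingleGradingDataset", "./mini_mt_bench_data"
)

# Inspect the first example; its query is the prompt used in the tests below.
print(evaluator_dataset.examples[0].query)
```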
If I only use the HF packages, the models I tested generate reasonable text. For example:
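Something along these lines, using only `transformers`; the prompt and model id shown here are placeholders standing in for the ones from the dataset:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Compose an engaging travel blog post about a recent trip to Hawaii."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decoding with the matching tokenizer yields coherent text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```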
However, when I load the HF model as a `HuggingFaceLLM`, following the steps in the same example, it does not give reasonable results (see the sketch below). In some of my other trials, it does generate a sentence, but then just repeats it, probably until `max_new_tokens` is reached.
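A sketch of the wrapping that showed the problem; in hindsight (see the reply above), the missing `tokenizer` argument is the culprit. The model id is again a placeholder:

```python
from transformers import AutoModelForCausalLM
from llama_index.llms.huggingface import HuggingFaceLLM

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Only the model is passed, no tokenizer. HuggingFaceLLM then silently uses
# its default tokenizer, whose vocabulary does not match this model, and
# generation degrades into repeated words or meaningless text.
llm = HuggingFaceLLM(model=model, max_new_tokens=256)

print(llm.complete("Compose an engaging travel blog post about Hawaii."))
```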
Does anyone have any idea why this happens, or has anyone had a similar experience? Any comments on this will be very helpful. Thank you!