Very Short Responses Running External Model Weights With a Llamafile #645
Unanswered · michaelgetachew-abebe asked this question in Q&A
Replies: 2 comments
-
Could you copy and paste the command you're running and the output?
-
Yes, hi. I'm just stuck. I've been devoting all my time to studying this very closely. Thanks for the help.
@echo off
call .\llamafile.exe -m model-f16.gguf -t 52 -c 2048 -b 1024 ^
  [--language-ru_RU] ^
  -ngl 9999 [--gpu-AUTO] ^
  -o log.txt ^
  [--save\all\logits-main.log] [--log-test] [--log-enable] [--log-append] ^
  [--interactive-first]
pause
-
Hello everyone,
I am a bit new to Llamafile. I was trying to run Llama 2 using the 5-bit quantized GGUF weights. However, I am getting very short responses. Is there a way to adjust the response length in Llamafile, or to make it produce better, longer responses?
Thanks in advance
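For what it's worth, response length in llamafile (as in upstream llama.cpp) is usually capped by the `-n` / `--n-predict` flag, which limits how many tokens are generated per response. A minimal sketch of such an invocation follows; the model filename is a placeholder, not your actual weights file, and the exact flag set should be checked against `llamafile --help` for your version:

```shell
# Sketch: raise the response-length cap with -n (--n-predict).
# -c sets the context window in tokens; -n caps generated tokens
# (-1 means generate until the model emits an end-of-sequence token).
# "llama-2-7b.Q5_K_M.gguf" is a placeholder; substitute your own weights.
./llamafile -m llama-2-7b.Q5_K_M.gguf -c 2048 -n 512 \
  -p "Explain GGUF quantization in a few paragraphs."
```

If responses still cut off early, it is also worth checking that the context size (`-c`) is large enough for the prompt plus the desired output, since generation stops when the context fills up.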