Support for Llama-2-7B-32K-Instruct? #2720
quarterturn started this conversation in General
Replies: 2 comments, 1 reply
-
Doesn't look like it needs anything special. You might need to set rope scaling.
-
Works well with llama-cpp-python and llama.cpp. I was able to use this on a single 3090.
I was unable to get the model to work properly at anything other than f16, though. Even q8_0 resulted in broken replies.
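The exact command the reply used was not captured. As a rough sketch (not the author's actual invocation), a llama.cpp run with linear RoPE scaling for the 32K context might look like the following; the model filename and GPU layer count are placeholders, and `--rope-freq-scale 0.125` assumes stretching Llama-2's 4096-token base context to 32768 (4096 / 32768 = 0.125):

```shell
# Hypothetical llama.cpp invocation; filename and -ngl value are placeholders.
# Linear RoPE scaling factor: 4096 (base context) / 32768 (target) = 0.125.
# f16 weights per the reply's observation that quantized variants broke.
./main \
  -m ./models/llama-2-7b-32k-instruct.f16.gguf \
  -c 32768 \
  --rope-freq-scale 0.125 \
  -ngl 35 \
  -p "[INST] Summarize the following document. [/INST]"
```

llama-cpp-python exposes the same knobs as constructor arguments (`n_ctx`, `rope_freq_scale`) if you prefer the Python bindings.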
-
https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct
"Model Description
Llama-2-7B-32K-Instruct is an open-source, long-context chat model finetuned from Llama-2-7B-32K, over high-quality instruction and chat data."