Support for Llama-2-7B-32K-Instruct? #2720
quarterturn started this conversation in General
Replies: 2 comments, 1 reply
-
Doesn't look like it needs anything special. You might need to set rope scaling.
-
Works well with llama-cpp-python and llama.cpp. I was able to use this on a single 3090.
I was unable to get the model to work properly at anything other than f16, though. Even q8_0 resulted in broken replies.
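The exact command the reply used was not captured. As a rough sketch (not the author's actual invocation), a llama.cpp run with linear RoPE scaling for the 32K context might look like the following; the model filename and GPU layer count are placeholders, and `--rope-freq-scale 0.125` assumes stretching Llama-2's 4096-token base context to 32768 (4096 / 32768 = 0.125):

```shell
# Hypothetical llama.cpp invocation; filename and -ngl value are placeholders.
# Linear RoPE scaling factor: 4096 (base context) / 32768 (target) = 0.125.
# f16 weights per the reply's observation that quantized variants broke.
./main \
  -m ./models/llama-2-7b-32k-instruct.f16.gguf \
  -c 32768 \
  --rope-freq-scale 0.125 \
  -ngl 35 \
  -p "[INST] Summarize the following document. [/INST]"
```

llama-cpp-python exposes the same knobs as constructor arguments (`n_ctx`, `rope_freq_scale`) if you prefer the Python bindings.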
-
https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct
"Model Description
Llama-2-7B-32K-Instruct is an open-source, long-context chat model finetuned from Llama-2-7B-32K, over high-quality instruction and chat data."