Where can I download the LLaMA model weights? #4576
-
I am trying to get LLaMA running and I am stuck at this step: https://github.com/ggerganov/llama.cpp#prepare-data--run. I'm not sure exactly what this command is. What is the difference between running llama.cpp with the BPE tokenizer model weights and the LLaMA model weights? Do I run both commands? I have searched around the web but I can't seem to find the actual model weights. I'm also not sure whether I should just move all the files into the models folder once I download the weights, and whether that would let the program work once I run the rest of the commands in the "Prepare data & run" section.
Replies: 3 comments · 2 replies
-
I cloned the llama.cpp source with git, built it with make, and downloaded GGUF files of the models. When I use the exact prompt syntax the model was trained with, it works. A good source for GGUF files: https://huggingface.co/TheBloke. If you use a graphics card, you may have to enable something to make it work. The line "65B 30B 13B 7B vocab.json" is not a command you have to execute. I keep my models in two folders and use them this way (CPU only):

```sh
./main -t 6 -m ~/Downloads/models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf -c 8192 --temp 0.7 --repeat_penalty 1.1 --log-disable -n -1 -p "<s>[INST] Write a short text about UPX. [/INST]"
```

Keep an eye on usable RAM and RAM consumption, and adjust for your needs.
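If you do want to use a graphics card, the gist is to rebuild with GPU support and offload layers to it. A minimal sketch, assuming an NVIDIA GPU and a llama.cpp build from around the time of this discussion (the make flag is era-specific and the layer count is a placeholder to tune):

```sh
# Rebuild with cuBLAS support (flag name as used by llama.cpp builds of this era)
make clean && make LLAMA_CUBLAS=1

# -ngl / --n-gpu-layers offloads that many layers to the GPU;
# 35 is a placeholder -- raise or lower it to fit your VRAM
./main -t 6 -m ~/Downloads/models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
  -c 8192 --temp 0.7 --repeat_penalty 1.1 -n -1 -ngl 35 \
  -p "<s>[INST] Write a short text about UPX. [/INST]"
```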
-
@jzry These original instructions are for the first release of LLaMA, which was distributed under strict research conditions only; Meta has to approve your request if you plan to obtain the original weights. The models are in PyTorch format (not Hugging Face's). As for the line you mentioned: the two of them belong together as one.
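For reference, the conversion flow those instructions describe looks roughly like this. A sketch based on the llama.cpp README of that era (script names and flags have changed since, so treat them as illustrative):

```sh
# 1. Convert the original PyTorch checkpoint to an f16 GGUF file
python3 convert.py models/7B/

# 2. Quantize the f16 file down to 4 bits to reduce RAM usage
./quantize models/7B/ggml-model-f16.gguf models/7B/ggml-model-Q4_K_M.gguf Q4_K_M
```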
-
This worked for me:

```python
from huggingface_hub import hf_hub_download

REPO_ID = "TheBloke/LLaMA-7b-GGUF"
FILENAME = "llama-7b.Q3_K_S.gguf"

# Downloads the file into the local Hugging Face cache and returns its path
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
print(model_path)
```
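hf_hub_download returns the local path of the downloaded file inside the Hugging Face cache, so you can point llama.cpp straight at it. A minimal sketch, with a placeholder path:

```sh
# Pass the path printed by the snippet above to llama.cpp (placeholder path shown)
./main -m /path/to/llama-7b.Q3_K_S.gguf -p "Building a website can be done in 10 simple steps:" -n 128
```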