Convert GGUF model to Hugging Face model #3770

Answered by Vaibhavs10
fakerybakery asked this question in Q&A

As of the latest Transformers release, you can not only load a GGUF checkpoint and run inference on it, but also convert a Mistral or Llama checkpoint over to the Transformers format!

Linking the relevant docs here: Transformers docs

Transformers supports conversion from all the major GGUF quantisation formats; the quantized weights are dequantized to full precision when the checkpoint is loaded:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"

# Passing gguf_file tells Transformers to load and dequantize the GGUF checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
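
Once loaded, the checkpoint behaves like any other Transformers model, so a quick generation call is an easy way to verify the conversion (a minimal sketch; the prompt and generation settings here are illustrative):

inputs = tokenizer("What is GGUF?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # greedy decoding by default
print(tokenizer.decode(outputs[0], skip_special_tokens=True))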

To persist the converted checkpoint locally, you can save both the tokenizer and the model (a minimal sketch; save_directory is a hypothetical local path of your choice):

save_directory = "tinyllama-1.1b-chat-hf"  # hypothetical output directory
tokenizer.save_pretrained(save_directory)
model.save_pretrained(save_directory)
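
The saved directory is then a regular Transformers checkpoint, so it can be reloaded without the gguf_file argument, or published to the Hub (a minimal sketch; the repo name is hypothetical, and push_to_hub assumes you are authenticated, e.g. via huggingface-cli login):

model = AutoModelForCausalLM.from_pretrained(save_directory)
tokenizer = AutoTokenizer.from_pretrained(save_directory)

# Optionally share the converted model (hypothetical repo name)
model.push_to_hub("your-username/tinyllama-1.1b-chat-hf")
tokenizer.push_to_hub("your-username/tinyllama-1.1b-chat-hf")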

Answer selected by fakerybakery