Convert GGUF model to Hugging Face model #3770
-
Hi,
-
I tried. I converted it to a dequantized model, which produced many .bin files with odd names. I tried loading the result, but the model seems to perform very poorly.
-
I don't think there's really a good way to do that currently. Also, if the model is quantized then you're going to lose a lot of quality if you ever quantize it again. You'd also basically have to undo the stuff
-
As of today you should be able to not just load and run inference, but also convert a Mistral or Llama checkpoint over to the Transformers format with the latest Transformers release! Linking the relevant docs here: Transformers docs

Transformers supports conversion from all the major quantisation formats:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
```

To further persist the model checkpoint, you can:

```python
tokenizer.save_pretrained('directory')
model.save_pretrained('directory')
```

We're hoping to extend this functionality based on which architectures the community requests! Please log your requests over on this Transformers issue.
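For a quick sanity check of the converted checkpoint, a minimal sketch of reloading the saved directory and generating a few tokens; the `'directory'` path and the prompt are placeholders carried over from the snippet above, not part of the original answer:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Reload the checkpoint written by save_pretrained() above.
# 'directory' is the same placeholder path used in that snippet.
tokenizer = AutoTokenizer.from_pretrained('directory')
model = AutoModelForCausalLM.from_pretrained('directory', torch_dtype=torch.float16)

# Generate a short completion to confirm the converted weights behave sensibly.
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that loading a GGUF file this way dequantizes the weights, so the saved checkpoint is a full-precision Transformers model rather than a quantized one.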