exllamav2 and IBM Granite models #460
Closed
RaRasputinRGLM
started this conversation in
General
Replies: 1 comment 6 replies
-
I'm unsure what the issue is. I added support for Granite two weeks ago. It indeed doesn't have an |
-
I wanted to work with these models because the sizes are appealing for local LLMs, and they all use the StarCoder tokenizer. However, model.embed_tokens is not found in any of the safetensors files. I was thinking I could hack a solution together by extracting the embedding tensor from StarCoder, but I suspect there is a faster route that I'm missing due to inexperience.
I don't think this is really an exllamav2 issue, but it seems worth discussing. Here is a similar issue with IBM Granite over at llama.cpp that gives some insight into how they structured their models: ggml-org/llama.cpp#7116
In short: what is the right route with exllamav2 when embed_tokens is missing but you know which tokenizer the model uses?
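Before copying tensors from another checkpoint, it may be worth confirming what the shards actually contain: some models tie the input embeddings to the output projection and store only one of the two tensors under a different name (e.g. lm_head.weight). The sketch below, a rough stdlib-only illustration and not exllamav2's loading code, reads the JSON header of each safetensors shard and lists any tensor names that look embedding-related; the glob pattern and name filters are assumptions you would adjust for your files.

```python
# Sketch: inspect safetensors shard headers to find where embedding
# weights actually live. Stdlib only; file names are assumptions.
import glob
import json
import struct

def list_tensor_names(path):
    # A safetensors file begins with an 8-byte little-endian header
    # length, followed by that many bytes of JSON mapping tensor names
    # to dtype/shape/offset metadata.
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    return [name for name in header if name != "__metadata__"]

if __name__ == "__main__":
    for shard in sorted(glob.glob("model*.safetensors")):
        names = list_tensor_names(shard)
        # With tied embeddings you may find only lm_head.weight (or a
        # similarly named tensor) and no model.embed_tokens.weight.
        hits = [n for n in names if "embed" in n or "lm_head" in n]
        print(shard, hits)
```

If the checkpoint turns out to store only the output projection, a loader can reuse that tensor for the input embeddings rather than importing anything from a different model's checkpoint.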