Replies: 4 comments
-
BTLM-3B-8K Highlights:
-
@TheBloke did you take a look 😄?
-
It's a new model architecture, so there would need to be a GGML implementation first, at https://github.com/ggerganov/ggml - you could raise it there. Even once a GGML implementation is added, llama.cpp is unlikely to support it for now, as it currently only supports Llama models; it may add support for other architectures in future, but not yet. Adding a GGML implementation is not something I can do myself, but if one is released for it, I am happy to release quantisations of it. It will likely be easier to get GPTQ support, but again someone would have to add that, e.g. to AutoGPTQ.
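The architecture gate described above is visible in a model's `config.json`: conversion tools check the declared model type before attempting anything. A minimal sketch of that check (the config values below are illustrative, not copied from the actual BTLM repo):

```python
import json

# Illustrative config.json snippet for a non-Llama model (values are assumptions)
config = json.loads('{"model_type": "btlm", "architectures": ["BTLMLMHeadModel"]}')

# llama.cpp's converter at this time only handled the Llama architecture,
# so any other model_type needs its own GGML implementation first
if config["model_type"] != "llama":
    print(f"unsupported architecture: {config['model_type']}")
```

This is why a GGML implementation has to exist before any quantised files can be produced: the converter has nothing to map an unknown architecture onto.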
-
Looks like the arch is based on Cerebras-GPT, so it should work?
Anyway, I don't see any info about anyone giving it a try. I've always used GGML files converted by other people, so I haven't tried it myself.
Unlike previous models based on Cerebras-GPT, this one looks much more capable for its class, even challenging 7B models.
It uses SlimPajama, a ~600-billion-token cleanup of the RedPajama dataset.
https://www.cerebras.net/blog/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/
Previous Cerebras-GPT Discussion