Replies: 4 comments
llama.cpp does support it, but the main code does not yet work with ALiBi.
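For context, ALiBi (Attention with Linear Biases) replaces positional embeddings with a static, head-specific linear penalty added to the attention scores. Here is a minimal NumPy sketch of how the bias tensor is built; the slope schedule follows the geometric sequence from the ALiBi paper and assumes the head count is a power of two. This is an illustration, not llama.cpp's implementation.

```python
import numpy as np

def alibi_bias(num_heads: int, seq_len: int) -> np.ndarray:
    """Per-head linear attention bias, shape (num_heads, seq_len, seq_len)."""
    # Head slopes form a geometric sequence: 2^(-8/h), 2^(-16/h), ...
    # (assumes num_heads is a power of two, as in the ALiBi paper)
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    pos = np.arange(seq_len)
    # distance from each query position i back to key position j
    dist = pos[:, None] - pos[None, :]
    # older keys get a larger (more negative) penalty, scaled per head
    return -slopes[:, None, None] * dist[None, :, :]
```

The bias is added to `q @ k.T / sqrt(d)` before the causal mask and softmax, so no learned position embedding is needed.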
There is a $2,000 bounty for CPU inference support for Refact LLM: smallcloudai/refact#77
Tracking issue: #3061
Reddit announcement: https://www.reddit.com/r/LocalLLaMA/comments/169yonh/we_trained_a_new_16b_parameters_code_model_that/
Blog: https://refact.ai/blog/2023/introducing-refact-code-llm/
Code: https://github.com/smallcloudai/refact/
Model: https://huggingface.co/smallcloudai/Refact-1_6B-fim
Do I understand correctly that this model cannot yet be used in llama.cpp, since there is no support for Multi-Query Attention yet?
Is this the only blocker?
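For reference, Multi-Query Attention (MQA) keeps many query heads but shares a single key/value head across all of them, which shrinks the KV cache by a factor of the head count. A minimal NumPy sketch of the idea (an illustration under simplified shapes, not llama.cpp's or Refact's actual code):

```python
import numpy as np

def multi_query_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """q: (num_heads, seq, d); k, v: (seq, d) -- one shared K/V head.

    Every query head attends over the same single key/value projection,
    so the KV cache holds one head's worth of K/V instead of num_heads.
    """
    num_heads, seq, d = q.shape
    scores = q @ k.T / np.sqrt(d)                       # (num_heads, seq, seq)
    # causal mask: a position cannot attend to future positions
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v                                  # (num_heads, seq, d)
```

In standard multi-head attention `k` and `v` would each carry a per-head leading dimension; MQA drops it, which is the layout difference an inference engine has to support.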