I agree. From a quick skim of the PR code, I don't see anything that would make the port complicated. I could do it if no one else gets to it first. Especially due to this part in that PR:
Hey, friend,
Out of curiosity, do you have any plans to add dots.llm1 support? The model seems interesting enough. I tried it out on mainline, but the speeds were atrocious for its size, making it unusable, at least for me. That’s why I jumped over to your fork (thanks to ubergarm) for both the insane MoE speedups and for being the godfather of, arguably, the absolute SOTA quants in my eyes.
Here's the pull request from mainline for dots:
ggml-org/llama.cpp@9ae4143
Regardless of whether it’s on your roadmap or not, I just wanted to say thank you, ikawrakow, for all that you have done and continue to do. You are one of a kind.