Mixtral 8x7b #4539
Replies: 2 comments
-
Sure, it works well with recent llama.cpp. I don't have a good GPU, so I run it on CPU only. The amount of RAM required depends on the quantisation method; I use a 6-bit quant at the moment.
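For anyone wondering how the quant level maps to memory, here is a rough back-of-the-envelope sketch (my own estimate, not something measured in this thread). It assumes Mixtral 8x7B has roughly 46.7B total parameters and uses approximate effective bits-per-weight figures for the k-quants:

```python
# Rough file-size estimate for Mixtral 8x7B GGUF quants.
# Assumptions: ~46.7e9 total parameters; effective bits-per-weight
# values are approximate, and actual files vary slightly.
TOTAL_PARAMS = 46.7e9

def approx_size_gb(bits_per_weight: float) -> float:
    """Approximate model file size in GB; runtime RAM adds
    context/KV-cache overhead on top of this."""
    return TOTAL_PARAMS * bits_per_weight / 8 / 1e9

print(f"Q6_K  : ~{approx_size_gb(6.6):.0f} GB")   # roughly 39 GB
print(f"Q4_K_M: ~{approx_size_gb(4.8):.0f} GB")   # roughly 28 GB
```

So dropping from a 6-bit to a 4-bit quant saves on the order of 10 GB of RAM, at some cost in output quality.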
-
I tested the Q4_K_M variant of hf/TheBloke/Mixtral_Instruct on an Intel i5 limited to 3 cores, CPU only as well. It needs about 30 GB of RAM and generates around 3 tokens per second. It is of course not at the level of GPT-4, but it is still incredibly smart; the smartest LLM I have seen so far after GPT-4. In my case, however, it was not really usable for everyday use cases due to extremely long prompt evaluation times. As far as I know there is a fix for that in the latest llama.cpp, but I haven't tested it yet.
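If anyone wants to reproduce a setup like this, here is a minimal sketch using the llama-cpp-python bindings (the model path is hypothetical; any Mixtral Instruct Q4_K_M GGUF should work):

```python
from llama_cpp import Llama

# Load a Q4_K_M Mixtral GGUF, CPU only, pinned to 3 threads
# to match the 3-core setup described above.
llm = Llama(
    model_path="./mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=2048,    # context window
    n_threads=3,   # number of CPU threads
)

# Mixtral Instruct expects the [INST] ... [/INST] prompt format.
out = llm("[INST] Explain mixture-of-experts in one paragraph. [/INST]",
          max_tokens=128)
print(out["choices"][0]["text"])
```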
-
This is not a small model, but it has been shown to perform at the level of GPT-4 and is open source. I'm super curious whether anyone has gotten it working on their machine.