How can I start a server that uses one GGUF for the model and a separate GGUF for the embeddings target? #7721

MrDowntempo · 2024-06-03T21:04:02Z

Running a server from a GGUF has been pretty simple to set up. I know that embeddings are supported, but the --embeddings flag doesn't seem to accept a separate GGUF for embeddings specifically. If I want to use a llama3 model as the main model, but nomic-embed for the embeddings target, how is that accomplished?

Answered by ggerganov

ggerganov (Maintainer) · 2024-06-04T05:58:56Z

Currently this is not supported. You can try starting a second instance of the server.

1 reply

MrDowntempo Jun 4, 2024
Author

Thank you, that's what I expected but I couldn't find it explicitly stated.
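
For readers who want to try the two-instance workaround, here is a minimal sketch of what it might look like from the client side. It is not from the thread and makes several assumptions: the binary name (`./llama-server`), the model file names, and the ports are all hypothetical, and the `/completion` and `/embedding` endpoints and their payload fields are assumed to match the server build you are running. Check them against your version's server README before relying on this.

```python
import json
import urllib.request

# Assumed layout: two independently started server processes, one per GGUF.
CHAT_BASE = "http://127.0.0.1:8080"   # instance serving the llama3 GGUF (assumed port)
EMBED_BASE = "http://127.0.0.1:8081"  # instance serving the nomic-embed GGUF (assumed port)


def server_command(binary, model_path, port, embeddings=False):
    """Build the argv list for one server instance (binary name and paths are assumptions)."""
    cmd = [binary, "-m", model_path, "--port", str(port)]
    if embeddings:
        # Flag name taken from the question; verify it against your build's --help output.
        cmd.append("--embeddings")
    return cmd


def post_json(url, payload, timeout=60):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def complete(prompt, n_predict=64):
    """Send a text completion request to the main-model instance."""
    return post_json(CHAT_BASE + "/completion", {"prompt": prompt, "n_predict": n_predict})


def embed(text):
    """Send an embedding request to the embedding-model instance."""
    return post_json(EMBED_BASE + "/embedding", {"content": text})


if __name__ == "__main__":
    # Print the two commands to start by hand (or pass each list to subprocess.Popen).
    print(" ".join(server_command("./llama-server", "llama3.gguf", 8080)))
    print(" ".join(server_command("./llama-server", "nomic-embed.gguf", 8081, embeddings=True)))
```

The point of the design is only that each GGUF gets its own process and port, and the client decides which instance to call: `complete()` goes to the main model and `embed()` goes to the embedding model.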
