Support for intfloat/e5-mistral-7b-instruct #4863
nathanpbell started this conversation in Ideas
In trying to wrap my head around this, I think I've found that llama.cpp would need to support two new features to get this embedding model to work optimally:

- We need a way to probe the values at the last layer, before the LM head (or ideally skip the LM head altogether). Does the current embedding endpoint do exactly that? I couldn't fully follow where it grabs its values from.
- We need a way to pass an attention mask in along with the batch of inputs, or to compute one.

Before I explore this further, am I on the right path here: (a) llama.cpp doesn't currently have these features, (b) they are needed, and (c) they are in theory sufficient (or close to it) to get e5-mistral working as intended?
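For context on why the attention mask matters: e5-mistral's model card describes pooling the final-layer hidden states by taking each sequence's last *real* (non-padded) token, which requires knowing where padding starts. Here is a minimal numpy sketch of that last-token pooling step (shapes and names are illustrative, not llama.cpp API):

```python
import numpy as np

def last_token_pool(hidden_states, attention_mask):
    """Pool a batch by taking each sequence's last non-padded token.

    hidden_states: (batch, seq_len, dim) final-layer outputs, before the LM head
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    # Index of the last real token in each sequence
    last_idx = attention_mask.sum(axis=1) - 1  # shape (batch,)
    # Gather one hidden vector per sequence via advanced indexing
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

# Toy batch: 2 sequences, max length 4, hidden dim 3
h = np.arange(24, dtype=np.float32).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]])  # second sequence has 2 padding tokens

emb = last_token_pool(h, mask)  # shape (2, 3)
```

Without the mask, a batched implementation would pool the hidden state at the padded end of shorter sequences, which is why feature (2) above is needed for correct batched embeddings.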