Replies: 1 comment
-
The default head size is `n_embd / n_head`.
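As a rough, self-contained illustration of the default-plus-override pattern (this is not the actual llama.cpp code; `read_gguf_u32` and `hparams_sketch` are made-up names for the example, and the metadata keys shown are along the lines of the GGUF `*.attention.key_length` / `*.attention.value_length` keys):

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>

// Made-up stand-in for reading an optional u32 value from GGUF metadata;
// llama.cpp does this through its model loader, not a free function like this.
static std::optional<uint32_t> read_gguf_u32(const char * /*key*/) {
    return std::nullopt; // pretend the key is absent, so the default survives
}

// Illustrative subset of hyper-parameters; not the real llama.cpp struct.
struct hparams_sketch {
    uint32_t n_embd        = 4096; // hidden size
    uint32_t n_head        = 32;   // number of attention heads
    uint32_t n_embd_head_k = 0;    // per-head size of K (and Q)
    uint32_t n_embd_head_v = 0;    // per-head size of V
};

int main() {
    hparams_sketch hp;

    // Default: head size = hidden size / number of heads (here 4096 / 32 = 128).
    hp.n_embd_head_k = hp.n_embd / hp.n_head;
    hp.n_embd_head_v = hp.n_embd / hp.n_head;

    // Optional override from GGUF metadata, e.g. "llama.attention.key_length".
    if (auto v = read_gguf_u32("llama.attention.key_length")) {
        hp.n_embd_head_k = *v;
    }
    if (auto v = read_gguf_u32("llama.attention.value_length")) {
        hp.n_embd_head_v = *v;
    }

    std::printf("n_embd_head_k = %u, n_embd_head_v = %u\n",
                hp.n_embd_head_k, hp.n_embd_head_v);
    return 0;
}
```

If the metadata keys are absent, the derived default is what ends up in `hparams`.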
-
Usually, the attention head size is `head_dim = hidden_dim // num_attention_heads` in many model architectures, including Llama. Some models use more flexible `head_dim` sizes. For Llama models, here is one pending PR for HF.

Looking at `src/llama.cpp`, I feel like the information is handled around here, but I'm not sure. Could anybody help me understand how the information is loaded into `hparams` and how it can be used in `build_*()`? (A rough sketch of my current mental model is below.) Thank you!
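For reference, this is roughly how I picture a `build_*()`-style function consuming the per-head sizes from `hparams` when shaping the attention tensors. It is a toy, self-contained sketch in plain C++ with no ggml; `hparams_sketch` and `build_attention_shapes` are invented names, not the actual llama.cpp API:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Illustrative subset of hyper-parameters; not the real llama.cpp struct.
struct hparams_sketch {
    uint32_t n_embd        = 4096;
    uint32_t n_head        = 32;
    uint32_t n_head_kv     = 32;   // can be smaller than n_head for GQA
    uint32_t n_embd_head_k = 128;  // per-head size of K (and Q)
    uint32_t n_embd_head_v = 128;  // per-head size of V
};

// Made-up stand-in for a build_*() graph function: it only prints the shapes
// the attention tensors would get, to show where the head size is consumed.
static void build_attention_shapes(const hparams_sketch & hp, uint32_t n_tokens) {
    const uint32_t n_embd_head = hp.n_embd_head_v;
    assert(n_embd_head == hp.n_embd_head_k); // assume K and V per-head sizes match

    // Q is reshaped to [n_embd_head, n_head, n_tokens],
    // K and V to [n_embd_head, n_head_kv, n_tokens].
    std::printf("Q  : [%u, %u, %u]\n", n_embd_head, hp.n_head,    n_tokens);
    std::printf("K/V: [%u, %u, %u]\n", n_embd_head, hp.n_head_kv, n_tokens);

    // The attention output projection maps n_head * n_embd_head back to n_embd.
    std::printf("attn_out: %u -> %u\n", hp.n_head * n_embd_head, hp.n_embd);
}

int main() {
    build_attention_shapes(hparams_sketch{}, /*n_tokens=*/8);
    return 0;
}
```

(I understand that with grouped-query attention `n_head_kv` can be smaller than `n_head`, which is why K/V are shaped separately from Q in the sketch.)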