I think that all the tensors in the llama-2 model files distributed by Meta are BF16. When converting or quantizing the model to GGUF, some of these tensors are always exported as FP32, regardless of the …
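As a side note on why exporting BF16 tensors as FP32 costs nothing in accuracy (a minimal pure-Python sketch, not llama.cpp's actual conversion code): a BF16 value is just an FP32 value with the low 16 mantissa bits truncated, so widening BF16 to FP32 is exact — only the file size grows.

```python
import struct

def bf16_to_f32(bits16: int) -> float:
    # BF16 shares FP32's sign bit and 8-bit exponent; widening just
    # appends 16 zero mantissa bits, so no rounding ever occurs.
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]

print(bf16_to_f32(0x3F80))  # 1.0
print(bf16_to_f32(0xC0A0))  # -5.0
```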
As I understand it, models like `meta-llama/Llama-2-13b-chat-hf` contain both fp16 and fp32 tensors, so I am wondering: with `--outtype fp16`, do all the fp32 tensors in the model get converted to fp16, while the tensors that are already fp16 are left unchanged? And with `--outtype fp32`, do all the fp16 tensors get converted to fp32, while the fp32 tensors stay as they are? Thanks!
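For intuition about why the direction of the cast matters (a minimal NumPy sketch, not the converter's actual logic): narrowing fp32 → fp16 is lossy — values are rounded, and magnitudes beyond fp16's maximum (~65504) overflow to infinity — whereas widening fp16 → fp32 is exact, since every fp16 value is representable in fp32.

```python
import numpy as np

# fp32 -> fp16 narrowing: rounds, and large values overflow to inf
w32 = np.array([0.1, 1e5], dtype=np.float32)
w16 = w32.astype(np.float16)
print(w16)  # 0.1 is rounded to the nearest fp16; 1e5 becomes inf

# fp16 -> fp32 widening: exact, so narrowing again round-trips cleanly
assert np.float16(w16[0].astype(np.float32)) == w16[0]
```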