bf16 --> bf16 conversion still has f32 tensors? #9590

Answered by CISC
arch-btw asked this question in Q&A
So my question is: if I want it to be completely lossless, would it make more sense to convert it to bf16 at this step? My reasoning would be that it would stay in the same format.

For archival purposes, for sure.

Or alternatively, would it actually make more sense to convert to f32 for it to be completely lossless?

Depends. A minor issue with storing it in BF16 is that this format is not yet directly supported by most of the hardware acceleration in llama.cpp. If you want to do inference from the original unquantized model, F32 is probably the way to go right now (but again, this may change in the future).
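The reason F32 is completely lossless for a BF16 model: a bfloat16 value is just the top 16 bits of an IEEE-754 float32, so widening to F32 appends zero bits and loses nothing. A small sketch (pure standard library; the bit patterns chosen are arbitrary examples) demonstrating the exact round trip:

```python
import struct

def bf16_to_f32(bits: int) -> float:
    # bf16 is the high 16 bits of a float32, so widening just
    # appends 16 zero bits -- exact, no rounding involved.
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

def f32_to_bf16_truncate(value: float) -> int:
    # Truncate the low 16 mantissa bits. Real converters typically
    # round-to-nearest-even, but truncation is enough to show the
    # round trip is exact when the value started life as bf16.
    return struct.unpack(">I", struct.pack(">f", value))[0] >> 16

# Every bf16 bit pattern survives bf16 -> f32 -> bf16 unchanged.
for bits in (0x3F80, 0x4049, 0xC2F7, 0x0001):  # 1.0, ~3.14, -123.5, smallest subnormal
    assert f32_to_bf16_truncate(bf16_to_f32(bits)) == bits
print("bf16 -> f32 is lossless")
```

Going the other way (F32 source converted to BF16) is where precision can be dropped, since the low 16 mantissa bits are discarded.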

The part that confuses me is that when I convert it to bf16 and then run ./llama-q…

Replies: 2 comments

Answer selected by arch-btw