Hi @davidmezzetti

We're currently evaluating additional quantization schemes, and MXFP4 and NVFP4 are both candidates. To be honest, I've mostly been thinking about this in terms of LLMs, but your use case is quite interesting too. Do you have any background you can share on how well either format might perform for approximate nearest neighbor (ANN) search?
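One rough way to get a feel for the ANN question is to simulate the format in NumPy and compare top-k results before and after quantization. The sketch below is an assumption-laden approximation of MXFP4 (blocks of 32 values sharing one power-of-two scale, FP4 E2M1 elements), not bitsandbytes code. The function name `fake_quant_mxfp4`, the random test data, and the simple nearest-value rounding are all illustrative choices.

```python
import numpy as np

# Magnitudes representable by an FP4 E2M1 element (sign is handled separately).
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quant_mxfp4(x, block=32):
    """Quantize then dequantize x, MXFP4-style (hypothetical helper).

    Assumes x.size is a multiple of `block`. Rounding is plain
    nearest-grid-value, which is a simplification of real hardware rounding.
    """
    x = np.asarray(x, dtype=np.float64)
    flat = x.reshape(-1, block)
    amax = np.abs(flat).max(axis=1, keepdims=True)
    safe = np.where(amax > 0, amax, 1.0)  # avoid log2(0) for all-zero blocks
    # Shared power-of-two scale: the largest element maps into [4, 8),
    # then gets clamped to the E2M1 maximum of 6.
    scale = 2.0 ** (np.floor(np.log2(safe)) - 2)
    scaled = np.clip(np.abs(flat) / scale, 0.0, 6.0)
    idx = np.abs(scaled[..., None] - E2M1_GRID).argmin(axis=-1)
    return (np.sign(flat) * E2M1_GRID[idx] * scale).reshape(x.shape)

# Toy ANN check: how much of the exact top-10 survives quantization?
rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 128))
db /= np.linalg.norm(db, axis=1, keepdims=True)
query = db[0] + 0.1 * rng.standard_normal(128)

exact = np.argsort(-(db @ query))[:10]
approx = np.argsort(-(fake_quant_mxfp4(db) @ query))[:10]
overlap = len(set(exact.tolist()) & set(approx.tolist())) / 10
print(f"top-10 overlap: {overlap:.2f}")
```

Random Gaussian vectors are probably kinder than real embeddings, so this would only be a starting point. Swapping in actual embedding data and a proper recall@k measurement would say more.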

Also as an FYI regarding hardware support:

  • bitsandbytes officially supports Intel XPU and Gaudi2/Gaudi3.
  • AMD GPU support is in an experimental phase as well. The main limitation is that on the datacenter GPUs the minimum blocksize for 4-bit is 128 instead of 64. It's not included in …
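To illustrate what the blocksize difference means in practice, here is a minimal sketch of blockwise absmax quantization at blocksize 64 versus 128. It uses a uniform 4-bit grid and assumes one float32 scale per block. It is not the bitsandbytes kernel or its NF4/FP4 codebooks, and `blockwise_absmax_int4` and `bits_per_value` are hypothetical helper names.

```python
import numpy as np

def blockwise_absmax_int4(x, blocksize):
    """Quantize to 15 uniform levels per block, then dequantize.

    Assumes x.size is a multiple of `blocksize`.
    """
    blocks = np.asarray(x, dtype=np.float32).reshape(-1, blocksize)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)
    absmax = np.where(absmax > 0, absmax, 1.0)  # guard all-zero blocks
    codes = np.clip(np.round(blocks / absmax * 7), -7, 7)
    return (codes / 7 * absmax).reshape(-1)

def bits_per_value(blocksize, scale_bits=32):
    """Storage cost: 4 bits per value plus the per-block scale, amortized."""
    return 4 + scale_bits / blocksize

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)

for bs in (64, 128):
    err = np.abs(blockwise_absmax_int4(w, bs) - w).mean()
    print(f"blocksize={bs}: {bits_per_value(bs)} bits/value, mean abs error {err:.4f}")
```

Under these assumptions a larger block amortizes the scale over more values (4.25 versus 4.5 bits per value), but a single outlier then sets the scale for twice as many neighbors.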

Answer selected by davidmezzetti