I'm assuming the inference is running in FP32 or BF16, but is there a version of the model quantized to FP16, INT8, or even INT4?
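
For reference, here's roughly what I mean, e.g. loading the weights in 4-bit with the Hugging Face transformers + bitsandbytes stack. This is just a minimal sketch of the kind of setup I'm asking about; the model ID is a placeholder, not a claim that such a checkpoint exists:

```python
# Minimal sketch: load a causal LM with 4-bit (INT4-style) quantized weights.
# Assumes transformers, bitsandbytes, and accelerate are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Weights are stored in 4 bits (NF4); matmuls are computed in BF16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder model ID
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# For plain FP16 (no quantization), you'd instead drop the quantization
# config and pass torch_dtype=torch.float16 to from_pretrained.
```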