Notable Changes
Qwen3 was producing null (NaN) values on CPU / MPS when serving batched requests in FP16 precision. Two issues were involved: the attention-mask fill value was the FP32 minimum, which overflows when downcast to FP16 (it is now set to the FP16 minimum instead), and a `to_dtype` call was missing on the `attention_bias` when working with batches.
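The downcast problem can be reproduced in a few lines. This is a minimal sketch using numpy for illustration only; the actual fix lives in the text-embeddings-inference model code, not in Python.

```python
import numpy as np

# Masked attention positions were filled with the FP32 minimum. Once the
# tensor is downcast to FP16, that value overflows to -inf, because the
# FP32 minimum (~ -3.4e38) is far outside the FP16 range (~ -65504).
fp32_min_as_fp16 = np.float16(np.finfo(np.float32).min)
print(fp32_min_as_fp16)  # -inf

# exp(-inf) is exactly 0, so a softmax over a row whose entries are all
# -inf divides 0 by 0 and produces NaN: the "null values" in the output.
bad_row = np.full(4, fp32_min_as_fp16, dtype=np.float16)
bad_softmax = np.exp(bad_row) / np.exp(bad_row).sum()
print(bad_softmax)  # [nan nan nan nan]

# Using the FP16 minimum as the fill value keeps it finite after the
# cast, so the softmax stays well defined.
fp16_min = np.float16(np.finfo(np.float16).min)
good_row = np.array([0.0, fp16_min, fp16_min], dtype=np.float16)
good_softmax = np.exp(good_row) / np.exp(good_row).sum()
print(good_softmax)
```

The unmasked position keeps all the probability mass, while the FP16-minimum fill behaves like negative infinity without ever becoming one.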
What's Changed
- Fix Qwen3 Embedding Float16 DType by @tpendragon in #663
- Fix `fmt` by re-running `pre-commit` by @alvarobartt in #671
- Update `version` to 1.7.4 by @alvarobartt in #677
Full Changelog: v1.7.3...v1.7.4