v1.5.1
What's new in 1.5.1 (2025-04-30)
These are the changes in inference v1.5.1.
New features
- FEAT: Wan 2.1 text2video by @qinxuye in #3297
- FEAT: [UI] highlight the input box content. by @yiboyasss in #3306
- FEAT: [UI] display the model_ability parameter. by @yiboyasss in #3308
- FEAT: add ggufv2 support for vLLM by @harryzwh in #3259
- FEAT: ovis2 by @Minamiyama in #3170
- FEAT: support Qwen3 and Qwen3MOE by @Jun-Howie in #3347
- FEAT: Add support for Qwen3 GPTQ quantization format by @Jun-Howie in #3363
Enhancements
- ENH: support setting sse ping attempts by @llyycchhee in #3313
- ENH: Support GLM4-0414 MLX and GGUF by @Jun-Howie in #3325
- ENH: optimize qwen3, support chat_template_kwargs for all engines by @qinxuye in #3354
- REF: Drop internal compression logic for transformers quantization, using bnb config instead by @ChengjieLi28 in #3324
- REF: Unify audio model abilities by @llyycchhee in #3351
Bug fixes
- BUG: fix sglang chat by @qinxuye in #3326
- BUG: Show engine options on UI even if the specific engine is not installed by @ChengjieLi28 in #3331
- BUG: fix failure of clearing resources when loading model failed by @qinxuye in #3361
Documentation
New Contributors
- @llyycchhee made their first contribution in #3313
- @harryzwh made their first contribution in #3259
- @qiulang made their first contribution in #3342
Full Changelog: v1.5.0...v1.5.1