
v0.7.3.post1

@wangxiyuan wangxiyuan released this 29 May 09:50
· 2 commits to v0.7.3-dev since this release
c69ceac

This is the first post release of 0.7.3. Please follow the official documentation to get started. This release includes the following changes:

Highlights

  • Qwen3 and Qwen3MOE are now supported. The performance and accuracy of Qwen3 have been well tested, so you can try it now. MindIE Turbo is recommended to improve the performance of Qwen3. #903 #915
  • Added a new performance guide. The guide helps users improve vllm-ascend performance at the system level. It covers OS configuration, library optimization, deployment guidance, and more. #878 Doc Link

Bug Fix

  • Qwen2.5-VL works for RLHF scenarios now. #928
  • Users can now launch a model directly from online weights, e.g. from Hugging Face or ModelScope. #858 #918
  • The meaningless log message UserWorkspaceSize0 has been removed. #911
  • The log level for Failed to import vllm_ascend_C has been changed from error to warning. #956
  • DeepSeek MLA now works with chunked prefill in the V1 Engine. Please note that the V1 Engine in 0.7.3 is experimental and intended for testing only. #849 #936
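
As an illustration of the online-weights fix above, here is a minimal sketch of how a launch could be prepared. It assumes the standard `vllm serve` CLI and vLLM's `VLLM_USE_MODELSCOPE` environment variable. The helper `build_serve_command` and the model ID `Qwen/Qwen3-8B` are hypothetical examples, not part of this release. The sketch only builds the command and environment; it does not start a server.

```python
import os
import shlex


def build_serve_command(model_id, use_modelscope=False, base_env=None):
    """Return (argv, env) for serving a model whose weights are fetched by ID.

    Hypothetical helper for illustration only. When use_modelscope is True,
    VLLM_USE_MODELSCOPE is set so that weights are pulled from ModelScope;
    otherwise the variable is removed and the default hub (Hugging Face) is used.
    """
    env = dict(os.environ if base_env is None else base_env)
    if use_modelscope:
        env["VLLM_USE_MODELSCOPE"] = "True"
    else:
        env.pop("VLLM_USE_MODELSCOPE", None)
    argv = ["vllm", "serve", model_id]
    return argv, env


# Example model ID (an assumption; substitute the model you actually want).
argv, env = build_serve_command("Qwen/Qwen3-8B", use_modelscope=True, base_env={})
print(shlex.join(argv))

# To actually launch on a machine with vllm and vllm-ascend installed:
#   import subprocess
#   subprocess.run(argv, env={**os.environ, **env}, check=True)
```

Keeping the hub selection in the environment rather than in the command line means the same command works for both sources.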

Docs

  • The benchmark doc has been updated for Qwen2.5 and Qwen2.5-VL. #792
  • Added a note clarifying that only "modelscope<1.23.0" works with 0.7.3. #954
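
The "modelscope<1.23.0" constraint can be checked before launching. The following is a rough sketch using only the Python standard library. The function names are hypothetical, and the parser compares only the leading numeric components of the version, ignoring pre-release or local suffixes, so it is a simplification rather than a full PEP 440 comparison.

```python
import re
from importlib import metadata

# Exclusive upper bound taken from the note above: modelscope<1.23.0
MAX_EXCLUSIVE = (1, 23, 0)


def release_tuple(version):
    """Parse the leading numeric part of a version string into a 3-tuple.

    '1.22.3' -> (1, 22, 3); '1.9' -> (1, 9, 0). Suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def modelscope_is_compatible(version):
    """Return True if the given modelscope version is below 1.23.0."""
    return release_tuple(version) < MAX_EXCLUSIVE


def installed_modelscope_version():
    """Return the installed modelscope version string, or None if not installed."""
    try:
        return metadata.version("modelscope")
    except metadata.PackageNotFoundError:
        return None


found = installed_modelscope_version()
if found is None:
    print("modelscope is not installed")
elif modelscope_is_compatible(found):
    print(f"modelscope {found} satisfies <1.23.0")
else:
    print(f"modelscope {found} is too new; install a version below 1.23.0")
```

If the check fails, reinstalling with `pip install "modelscope<1.23.0"` pins a compatible version.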