
Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch #1456

Open: gyou2021 wants to merge 19 commits into base aice/v1.21.0

gyou2021 commented Jun 19, 2025

  1. Added support for Qwen3 Embedding & Reranker on Gaudi.
  2. Enabled HPU lazy mode for the Roberta model on Gaudi.

Code modifications for HPU:

  1. Cherry-picked vllm-project commit 3952731 (the GPU-side change).
  2. Fixed a bug in hpu_model_runner.py so that the call works correctly.
  3. Updated pooler.py on the aice/v1.21.0 branch of vllm-fork to match the main branch of vllm-project, and fixed the score bug.
  4. Removed an assert in roberta.py to enable hpu_graph mode.
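
For readers unfamiliar with what the pooler produces for the embedding model: the Qwen3 Embedding model card describes last-token pooling followed by L2 normalization. The following is a minimal pure-Python sketch of that idea only. The function names and toy values are hypothetical, and this is not the code in pooler.py.

```python
import math
from typing import List, Sequence


def last_token_pool(hidden_states: Sequence[Sequence[float]], seq_len: int) -> List[float]:
    """Return the hidden state of the last real token (assumes right padding)."""
    if not 0 < seq_len <= len(hidden_states):
        raise ValueError("seq_len must be in [1, len(hidden_states)]")
    return list(hidden_states[seq_len - 1])


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, computed as the dot product of normalized vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


# Toy sequence of three positions; the third row stands in for padding.
hidden = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
embedding = l2_normalize(last_token_pool(hidden, seq_len=2))
```

With normalized embeddings, query/document similarity reduces to a dot product, which is why normalization is usually applied at pooling time.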

Example:
cd examples/offline_inference
PT_HPU_LAZY_MODE=1 VLLM_SKIP_WARMUP=true python qwen3_reranker.py

Supported on aice/v1.21.0.
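
As background on what the reranker example scores: per the Qwen3 Reranker model card, relevance is the probability of the "yes" token versus the "no" token at the final position. The sketch below shows that two-way softmax in plain Python. The function names and logit values are made up for illustration, and it does not reproduce the pooler.py implementation.

```python
import math
from typing import List, Sequence, Tuple


def rerank_score(no_logit: float, yes_logit: float) -> float:
    """Probability of "yes" under a two-way softmax over the "no"/"yes" logits."""
    # softmax([no, yes])[1] equals sigmoid(yes - no); branch for numerical stability.
    diff = yes_logit - no_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rank_documents(logit_pairs: Sequence[Tuple[float, float]]) -> Tuple[List[int], List[float]]:
    """Return document indices sorted by descending score, plus the scores."""
    scores = [rerank_score(no, yes) for no, yes in logit_pairs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical (no_logit, yes_logit) pairs for three candidate documents.
order, scores = rank_documents([(2.0, -1.0), (0.0, 3.0), (1.0, 1.0)])
```

Because only the difference between the two logits matters, a reranker of this kind can be served as a single-output classifier, which is the shape a score-style API expects.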

czhu15 commented Jun 20, 2025

"Currently, not supported on aice/v1.21.0 yet."
Is this feature/PR ready for review/merge?

czhu15 commented Jun 20, 2025

It would be good to provide a link to the original PR and highlight the changes made for HPU.
This will make future debugging easier.
Thanks!

gyou2021 changed the title from "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" to "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch [WIP]" on Jun 20, 2025
gyou2021 changed the title from "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch [WIP]" to "[WIP] Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" on Jun 20, 2025
@ranzhejiang

I notice that you use _hpu_merge_multimodal_embeddings, but #1436 has removed this function and uses a different approach. Can you use the same approach?

gyou2021 added 2 commits June 20, 2025 06:24
Signed-off-by: gyou2021 <ganmei.you@intel.com>
Signed-off-by: gyou2021 <ganmei.you@intel.com>
gyou2021 changed the title from "[WIP] Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" to "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" on Jun 20, 2025
gyou2021 changed the title from "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" to "[WIP] Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" on Jun 20, 2025
Signed-off-by: gyou2021 <ganmei.you@intel.com>
gyou2021 changed the title from "[WIP] Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" to "Support Qwen3 Embedding & Reranker on Gaudi for aice/v1.21.0 branch" on Jun 23, 2025
@gyou2021
Author

"Currently, not supported on aice/v1.21.0 yet." Is this feature/PR ready for review/merge?

Yes.

@gyou2021
Author

gyou2021 commented Jun 23, 2025

I notice that you use _hpu_merge_multimodal_embeddings, but #1436 has removed this function and uses a different approach. Can you use the same approach?

The code has been rebased to the latest aice/v1.21.0, including changes made by #1436.

czhu15 commented Jun 23, 2025

@gyou2021
It would be good to provide a link to the original PR and highlight the changes made for HPU.
This will make future debugging easier.
Can you add a description to the commit message?

gyou2021 added 5 commits June 23, 2025 03:54
Signed-off-by: gyou2021 <ganmei.you@intel.com>
Signed-off-by: gyou2021 <ganmei.you@intel.com>
Signed-off-by: gyou2021 <ganmei.you@intel.com>
Signed-off-by: gyou2021 <ganmei.you@intel.com>
@gyou2021
Author

@gyou2021 It would be good to provide a link to the original PR and highlight the changes made for HPU. This will make future debugging easier. Can you add a description to the commit message?

The original PR on vllm-project:
vllm-project@3952731

Sure. I added the original PR and highlighted the changes on HPU in the PR description above.

4 participants