Fix gemma3 workload execution failure #1162

Open
wants to merge 2 commits into
base: habana_main

@shepark shepark commented Apr 24, 2025

Fixes gemma3 workload execution failures.
Tested with the gemma3 4b it model (google/gemma-3-4b-it).
To reproduce, you may need to use a local cache for the model and set no_proxy for the client connection.

  • With 1.21.0
    Server command:
VLLM_PROMPT_BS_BUCKET_MIN=1 VLLM_PROMPT_BS_BUCKET_STEP=1 VLLM_PROMPT_BS_BUCKET_MAX=1 \
VLLM_PROMPT_SEQ_BUCKET_MIN=384 VLLM_PROMPT_SEQ_BUCKET_MAX=384 \
VLLM_DECODE_BS_BUCKET_MIN=1 VLLM_DECODE_BS_BUCKET_MAX=1 \
VLLM_DECODE_BLOCK_BUCKET_MIN=512 VLLM_DECODE_BLOCK_BUCKET_MAX=512 \
python -m vllm.entrypoints.openai.api_server \
--model google/gemma-3-4b-it --max-num-batched-tokens 8192 --max-model-len 8192 --port 8000

Client request for text and image input:

curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-4b-it",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe this image in one sentence."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
            }
          }
        ]
      }
    ]
  }' | jq

Client request for text-only input:

curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "google/gemma-3-4b-it",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe the shape of an apple."
          }
        ]
      }
    ]
  }' | jq
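
The two curl requests above can also be issued from Python. This is a minimal sketch using only the standard library, not part of the PR. The helper names `build_chat_payload` and `post_chat` are hypothetical, and passing an empty `ProxyHandler({})` is one way to bypass proxy settings for a local server, assuming that is what the no_proxy note in the description is meant to achieve.

```python
import json
import urllib.request
from typing import Optional

MODEL = "google/gemma-3-4b-it"
URL = "http://localhost:8000/v1/chat/completions"


def build_chat_payload(text: str, image_url: Optional[str] = None) -> dict:
    """Build the same JSON body as the curl examples (hypothetical helper)."""
    content = [{"type": "text", "text": text}]
    if image_url is not None:
        # The image part is optional; omit it for a text-only request.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": MODEL, "messages": [{"role": "user", "content": content}]}


def post_chat(payload: dict, url: str = URL) -> dict:
    """POST the payload, ignoring any proxy configured in the environment."""
    # An empty ProxyHandler disables proxy lookup for this opener only.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener.open(request) as response:
        return json.load(response)


# Usage, with the server from the description running on port 8000:
#   reply = post_chat(build_chat_payload("Describe the shape of an apple."))
#   print(json.dumps(reply, indent=2))
```

Keeping the payload builder separate from the network call makes it possible to check the request body without a running server.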

@shepark shepark force-pushed the dev/shepark/fix_gemma3_failure branch from d6b0576 to 6f93ec6 Compare April 25, 2025 05:03
@shepark shepark marked this pull request as ready for review April 25, 2025 05:24
@michalkuligowski

/run-gaudi-tests

@shepark shepark force-pushed the dev/shepark/fix_gemma3_failure branch from 92f4fd0 to fedaedc Compare April 25, 2025 14:35
@michalkuligowski

/run-gaudi-tests

@michalkuligowski

/run-gaudi-tests

@shepark shepark force-pushed the dev/shepark/fix_gemma3_failure branch 3 times, most recently from 1f59955 to e92d432 Compare May 11, 2025 23:17
Comment on lines 643 to 644
input_ids = input_ids.flatten()
positions = positions.flatten()

Please add an HPU-specific method for this, and at the call site check "current_platform.is_hpu()".

@shepark (Author)

Thank you for the review. Updated.
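
The pattern requested in this thread might look roughly like the sketch below. It is illustrative only and is not the code that was pushed: `StubPlatform` and `StubTensor` are stand-ins for `vllm.platforms.current_platform` and a 2-D tensor so the example runs without vLLM, and the names `flatten_inputs_for_hpu` and `prepare_inputs` are hypothetical.

```python
from typing import List, Tuple


class StubPlatform:
    """Stand-in for vllm.platforms.current_platform (assumption for this sketch)."""

    def __init__(self, hpu: bool) -> None:
        self._hpu = hpu

    def is_hpu(self) -> bool:
        return self._hpu


class StubTensor:
    """Tiny stand-in for a 2-D tensor that exposes only flatten()."""

    def __init__(self, rows: List[List[int]]) -> None:
        self.rows = rows

    def flatten(self) -> List[int]:
        return [value for row in self.rows for value in row]


def flatten_inputs_for_hpu(input_ids, positions) -> Tuple[list, list]:
    """Hypothetical HPU-specific helper wrapping the two flatten() calls."""
    return input_ids.flatten(), positions.flatten()


def prepare_inputs(platform, input_ids, positions):
    """Flatten only on HPU; other platforms get their inputs back unchanged."""
    if platform.is_hpu():
        return flatten_inputs_for_hpu(input_ids, positions)
    return input_ids, positions
```

Isolating the flattening in one helper keeps the platform check at a single call site, so non-HPU code paths are not affected.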

@shepark shepark force-pushed the dev/shepark/fix_gemma3_failure branch 2 times, most recently from 5d2cd62 to b179744 Compare June 13, 2025 16:34
@shepark shepark force-pushed the dev/shepark/fix_gemma3_failure branch from b179744 to 0bea4f2 Compare June 13, 2025 16:58
@@ -31,6 +31,8 @@
from vllm.multimodal.profiling import BaseDummyInputsBuilder
from vllm.sequence import IntermediateTensors

from vllm.platforms import current_platform

@michalkuligowski michalkuligowski Jun 16, 2025

The pre-commit suite fails because of the trailing whitespace at the end of this line.
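
A quick local check for this kind of failure can be sketched as follows. This is an assumption-laden illustration, not the repository's actual pre-commit configuration; the function name `find_trailing_whitespace` is hypothetical.

```python
from typing import List


def find_trailing_whitespace(source: str) -> List[int]:
    """Return the 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


# Usage: pass the text of the edited file and inspect the reported lines.
#   with open("path/to/file.py", encoding="utf-8") as handle:
#       print(find_trailing_whitespace(handle.read()))
```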
