[CB][do not merge] Support batch size 1 for decode, simplify warmup #312
base: main
Conversation
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
👋 Hi! Thank you for contributing to vLLM support on Spyre. [...] Now you are good to go 🚀
bot:test
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
bot test failed in warmup decode.
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
bot:test
Note: CPU failures are expected for BS 1 (didn't adapt the warmup as in #287). Spyre card: reverting the warmup changes results in a runtime error.
Looks like batch size 1 for decode is not supported by the compiler yet... The priority of this is low, as the performance advantage is marginal and the use case is limited.
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Update: I tried Josh's suggestion, so far without success.
[CB][do not merge] Support batch size 1 for decode, simplify warmup
Now that we have switched to torch 2.7.1 in #307, dynamic dimensions of size 1 are supported by torch. Hence, batch size 1 for decode should produce the same graph as batch sizes >= 2. This PR relaxes the constraint and adapts the warmup accordingly. To be tested on the card. If it works, this makes #287 redundant.
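
A minimal sketch of the torch behavior the description relies on (not the vllm-spyre code itself; the module, shapes, and names are made up for illustration): a dimension exported as dynamic with `min=1` can take the value 1 at runtime, so a decode graph traced once should serve batch size 1 and batch sizes >= 2 alike.

```python
import torch
from torch.export import Dim, export


class ToyDecode(torch.nn.Module):
    """Stand-in for a decode step; any shape-preserving op works here."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2.0


model = ToyDecode()
example = torch.randn(4, 8)  # trace with batch size 4

# Mark the batch dimension dynamic, explicitly allowing it to be 1.
batch = Dim("batch", min=1)
program = export(model, (example,), dynamic_shapes={"x": {0: batch}})

# With torch >= 2.7.1 this should run on both batch sizes without retracing.
print(program.module()(torch.randn(1, 8)).shape)  # torch.Size([1, 8])
print(program.module()(torch.randn(4, 8)).shape)  # torch.Size([4, 8])
```

On older torch versions, size-1 dimensions tended to get specialized into a separate graph, which is presumably why batch size 1 previously needed its own handling in the warmup (#287).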