[CB][do not merge] enable warmup for batch size 1 #287
base: main
Conversation
Signed-off-by: Joe Runde <joe@joerun.de>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
👋 Hi! Thank you for contributing to vLLM support on Spyre. Now you are good to go 🚀
bot:test
Naive question: can we add a failing test for what this PR fixes? Edit: I guess we had the assert earlier which prevented batch size 1 from running 🤷
Ah, interesting. I swear I had tested manually on a dev pod and saw batch size 1 working, but I don't know how that was happening when the test clearly failed 🤦. Anyway, nice fix!
```diff
@@ -317,6 +318,18 @@ def _warmup_spyre_dynamic_size(self, special_token_ids):
     prompt_len = 42
     num_decode_tokens = 2

+    # Fix for batch size 1: set input batch to fit 2 requests for warmup
+    if model_runner.vllm_config.scheduler_config.max_num_seqs == 1:
+        model_runner.input_batch = InputBatch(
```
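The hunk above is truncated; a minimal sketch of the intent, with assumed import path and constructor arguments (the real `InputBatch` signature varies across vLLM versions, and `model_runner` comes from the enclosing worker context), could look like:

```python
# Sketch only, not the exact PR code: when the user configures
# max_num_seqs == 1, the warmup still decodes two sequences, so build a
# temporary InputBatch with room for 2 requests. The import path and the
# keyword arguments below are assumptions.
from vllm.v1.worker.gpu_input_batch import InputBatch  # assumed path

if model_runner.vllm_config.scheduler_config.max_num_seqs == 1:
    model_runner.input_batch = InputBatch(
        max_num_reqs=2,  # warmup needs capacity for 2 requests
        max_model_len=model_runner.vllm_config.model_config.max_model_len,
        device=model_runner.device,
        pin_memory=model_runner.pin_memory,
        vocab_size=model_runner.vllm_config.model_config.get_vocab_size(),
    )
```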
Alternatively, could the `InputBatch` construct itself with `self.max_num_reqs = min(max_num_reqs, 2)`, since we know that it'll always need at least 2? Then we avoid reconstructing it in the worker here. That way we have a much smaller diff to back out once we can lift this bs>=2 restriction.
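A minimal sketch of that alternative, clamping inside the constructor (written with `max` rather than `min`, since `min(1, 2)` yields 1, which is exactly what the follow-up below points out):

```python
class InputBatch:
    """Sketch only; the real InputBatch derives many more attributes."""

    def __init__(self, max_num_reqs: int) -> None:
        # Warmup always needs room for at least two requests, so clamp
        # the configured capacity up to 2 inside __init__ itself.
        self.max_num_reqs = max(max_num_reqs, 2)
```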
Not sure if I follow here. It has to be >= 2 for the warmup; with `min(1, 2)` we would still fail?
Would that work if you directly set `model_runner.input_batch.max_num_reqs = 2`, instead of instantiating a new `InputBatch`?
No, because the `InputBatch` initialization consumes `max_num_reqs`; just setting `model_runner.input_batch.max_num_reqs` afterwards wouldn't change what was already constructed from it.
`max_num_reqs` occurs 17 times in the `__init__` of `InputBatch`. It is not a single attribute but is used to construct several attributes, so re-initializing is simpler...
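A tiny self-contained illustration of that point, with hypothetical attribute names standing in for the real ones:

```python
import numpy as np

class InputBatch:
    def __init__(self, max_num_reqs: int) -> None:
        self.max_num_reqs = max_num_reqs
        # max_num_reqs is baked into many derived buffers at init time.
        self.req_ids: list = [None] * max_num_reqs
        self.token_counts = np.zeros(max_num_reqs, dtype=np.int64)

batch = InputBatch(max_num_reqs=1)
batch.max_num_reqs = 2           # patches only the scalar attribute...
assert len(batch.req_ids) == 1   # ...the derived buffers stay size 1
```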
I tried to test this on the card, but it failed. It's not clear to me why, though. I will have to compare graphs tomorrow.
bot:test
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
bot:test
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
@yannicks1 in order to support this, we will need to add the following, which will enable symbolic shapes for size-1 dimensions (if marked dynamic):
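The snippet itself is not preserved above; one hedged sketch of such a change, assuming PyTorch's experimental backed-size-oblivious config is what was meant:

```python
# Assumption, not the exact snippet from this comment: backed-size-
# oblivious mode stops the compiler from specializing on 0/1-sized
# dimensions, so a size-1 dim that is marked dynamic stays symbolic.
import torch
import torch.fx.experimental._config as fx_config

fx_config.backed_size_oblivious = True

x = torch.randn(1, 8)              # batch dimension of size 1
torch._dynamo.mark_dynamic(x, 0)   # keep dim 0 dynamic despite size 1
```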
Thanks for the hint @JRosenkranz! I think, though, that your suggested changes should be tried with #312, which actually uses torch 2.7.1 and applies a real batch size of 1. This PR (#287) does padding under the hood; it is merely supposed to support batch size 1 in the meantime.
[CB] enable warmup for batch size 1
In #285 we wanted to allow batch size 1 for continuous batching. However, the warmup did not support batch size 1. With this small fix it works, on CPU at least. It does not currently compile on the card; we would need to compare the graphs next...
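For context, a minimal way to exercise the batch-size-1 path (the model name is a placeholder; `max_num_seqs` is the standard vLLM engine argument):

```python
from vllm import LLM, SamplingParams

# max_num_seqs=1 is exactly the configuration whose warmup this PR fixes.
llm = LLM(model="some/model", max_num_seqs=1)  # placeholder model name
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=8))
```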