[CI] Tests for graph comparison between vllm and AFTU #286

wallashss · 2025-07-08T01:48:01Z

Description

This PR adds tests that compare the graphs generated by vLLM with those generated by AFTU (aiu-fms-testing-utils). The tests run inference with vLLM and then execute inference.py from AFTU. For both executions, the tests dump the compiled graphs to files by setting the environment variable DEE_DUMP_GRAPHS=name. They then compare the corresponding graphs generated by the two methods.
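As a rough illustration of the comparison step, here is a minimal sketch. It assumes the two runs dump their graphs into separate directories and that "corresponding" graphs share a file name; the function name compare_graph_dirs and the pairing rule are hypothetical and not taken from this PR.

```python
from pathlib import Path


def compare_graph_dirs(dir_a: str, dir_b: str) -> list[str]:
    """Return the names of graph files that are missing on one side or differ.

    Assumption (not from the PR): graphs are paired by identical file name
    and compared as plain text.
    """
    files_a = {p.name: p for p in Path(dir_a).iterdir() if p.is_file()}
    files_b = {p.name: p for p in Path(dir_b).iterdir() if p.is_file()}

    mismatches = []
    for name in sorted(files_a.keys() | files_b.keys()):
        if name not in files_a or name not in files_b:
            # The graph was dumped by only one of the two runs.
            mismatches.append(name)
        elif files_a[name].read_text() != files_b[name].read_text():
            # Both runs dumped the graph, but the contents differ.
            mismatches.append(name)
    return mismatches
```

An empty return value would mean every dumped graph has an identical counterpart on the other side.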

The main limitation of this implementation is that AFTU is not a proper package, so I had to add some workarounds:

I added an environment variable, VLLM_SPYRE_TEST_AFTU_SCRIPTS_DIR. If it is set, it points to the scripts directory (the one containing inference.py), and the tests copy the script from there and execute it.
If the environment variable is not set, we try to locate the directory from the installed package by importing the module and using the aiu_fms_testing_utils.__file__ variable.
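The lookup order described above could look roughly like the sketch below. The function name resolve_scripts_dir is hypothetical, and the sketch uses importlib.util.find_spec instead of importing the module and reading __file__, so it is a variation on the approach rather than the code in this PR. It only returns the package directory; where the scripts live relative to it is not stated here and is left out.

```python
import importlib.util
import os
from typing import Optional

# Environment variable named in the PR description.
ENV_VAR = "VLLM_SPYRE_TEST_AFTU_SCRIPTS_DIR"


def resolve_scripts_dir(module_name: str = "aiu_fms_testing_utils") -> Optional[str]:
    """Hypothetical helper: prefer the env var, else fall back to the install."""
    override = os.environ.get(ENV_VAR)
    if override:
        # An explicit override wins over anything found in the environment.
        return override

    # Locate the installed package without importing it.
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        # Package not installed, or it has no file on disk to anchor on.
        return None
    return os.path.dirname(spec.origin)
```

Returning None rather than raising leaves it to the caller to decide whether a missing AFTU install should skip or fail the test.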

I added a marker, aftu, to make it easy to select the tests in this PR and to filter them out of other test suites.

Signed-off-by: Wallas Santos <wallashss@ibm.com>

github-actions · 2025-07-08T01:48:08Z

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

prashantgupta24 · 2025-07-08T16:23:36Z

pyproject.toml

@@ -150,6 +151,7 @@ dev = [
    "pytest-timeout==2.3.1",
    "requests==2.32.3",
    "sentence-transformers==3.4.1",
+    "aiu_fms_testing_utils@git+https://github.com/foundation-model-stack/aiu-fms-testing-utils.git#1a77f630104a5661fff554164c1e536ea08393e3"


wondering if we need to update the uv.lock file as well 🤔

rafvasq · 2025-07-09T16:05:09Z

tests/aftu/graph_compare_utils.py

+    # Get G1 graphs, it assumes the input_dir has the folder export_dtcompiler
+    # where are the files


nit: something like

Suggested change

-    # Get G1 graphs, it assumes the input_dir has the folder export_dtcompiler
-    # where are the files
+    # Get G1 graphs.
+    # Assumes the 'input_dir' contains 'export_dtcompiler' with the files.

rafvasq · 2025-07-09T16:05:13Z

tests/aftu/graph_compare_utils.py

+    # Silly regex to find all s#.
+    # We are only considered those that surrounds by space (whole word)
+    # or started with space and terminated with comma
+    # examples:
+    # ' s1 '
+    # ' s1,'
+    # ' s1 s2 '


just a nit: comment could be a bit more succinct, something like:

Suggested change

-    # Silly regex to find all s#.
-    # We are only considered those that surrounds by space (whole word)
-    # or started with space and terminated with comma
-    # examples:
-    # ' s1 '
-    # ' s1,'
-    # ' s1 s2 '
+    # Regex to find all 's#' patterns surrounded by spaces,
+    # or starting with a space and ending with a comma.
+    # Examples: ' s1 ', ' s1,', ' s1 s2 '
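For reference, one way to write a pattern matching the behavior this comment describes is sketched below. The actual regex in graph_compare_utils.py is not quoted in this thread, so the pattern and the names SYMBOL_RE and find_symbols are assumptions for illustration only.

```python
import re

# Hypothetical pattern, not the one from the PR: 's' followed by digits,
# preceded by a space and followed by a space or a comma. Lookarounds are
# used so the delimiters are not consumed and adjacent symbols such as
# ' s1 s2 ' are both found.
SYMBOL_RE = re.compile(r"(?<= )s\d+(?=[ ,])")


def find_symbols(line: str) -> list[str]:
    """Return every whole-word 's#' symbol found in the line, in order."""
    return SYMBOL_RE.findall(line)
```

With this pattern, a token such as 'ops1 ' is not matched because the 's' is not preceded by a space.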

joerunde · 2025-07-14T16:19:06Z

tests/aftu/test_compare_graphs.py

+
+@pytest.mark.aftu
+@pytest.mark.parametrize("model", get_spyre_model_list())
+@pytest.mark.parametrize("backend", ["sendnn"])


If this test can only run on the sendnn backend then we don't need to parameterize it, we can just set the backend as sendnn in the test.

Also, we should mark all tests running with the sendnn backend with pytest.mark.spyre for consistency with the rest of the test suite

joerunde · 2025-07-14T16:26:45Z

tests/aftu/graph_compare_utils.py

+
+
+def get_aftu_script_dir() -> str:
+    # TODO: since AFTU is not a lib yet, this function does the best


What does "AFTU is not a lib yet" mean here? Is the problem that inference.py is not shipped with the AFTU wheel?

joerunde · 2025-07-14T16:31:36Z

pyproject.toml

@@ -131,6 +131,7 @@ markers = [
    "multi: Tests that require >1 cards",
    "utils: Tests for utility functions",
    "worker: Tests for worker logic",
+    "aftu: Tests to compare graphs from aiu-fms-testing-utils",


I'd like to avoid creating more custom markers unless it's completely necessary. (Unrelated but it looks like utils and worker are unused and we should delete them as well)

These tests seem to be important to catch problems early so I do want them running with our default set of markers if possible

wallashss added 3 commits July 7, 2025 15:49

feat: added aftu dep

3662001

Signed-off-by: Wallas Santos <wallashss@ibm.com>

Merge branch 'main' of github.com:vllm-project/vllm-spyre into wallas…

d1a8cf7

…-compare-graph Signed-off-by: Wallas Santos <wallashss@ibm.com>

feat: tests for graph comparison between vllm and AFTU

81f19c1

Signed-off-by: Wallas Santos <wallashss@ibm.com>

wallashss requested review from rafvasq, prashantgupta24, sducouedic and joerunde as code owners July 8, 2025 01:48

wallashss marked this pull request as draft July 8, 2025 01:48

wallashss added 5 commits July 7, 2025 22:52

fix: missing default parameter

3efec86

Signed-off-by: Wallas Santos <wallashss@ibm.com>

feat: reindex of symbols for cb

4f15d47

Signed-off-by: Wallas Santos <wallashss@ibm.com>

docs: rewrite comment

d9df875

Signed-off-by: Wallas Santos <wallashss@ibm.com>

ci: setup tests markers

1287b6a

Signed-off-by: Wallas Santos <wallashss@ibm.com>

docs: nit

dbb25aa

Signed-off-by: Wallas Santos <wallashss@ibm.com>

wallashss marked this pull request as ready for review July 8, 2025 13:31

wallashss requested a review from ckadner as a code owner July 8, 2025 13:31

prashantgupta24 reviewed Jul 8, 2025

feat: update uv.lock

9d1895e

Signed-off-by: Wallas Santos <wallashss@ibm.com>

rafvasq reviewed Jul 9, 2025

joerunde reviewed Jul 14, 2025
