
[WIP]🚨 set dtype=float16 for CPU as well #266


Draft · prashantgupta24 wants to merge 5 commits into main from float16
Conversation

prashantgupta24 (Collaborator) commented on Jun 27, 2025

Description

Use float16 for CPU to try to speed up the tests.
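
For reference, a minimal sketch (not taken from this PR's test code) of what the change does on the Hugging Face comparison path: loading the model in float16 instead of the default float32 on CPU. The model name and prompt below are placeholders.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the tests use their own model list

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Default load: weights come in as float32 on CPU.
hf_model_fp32 = AutoModelForCausalLM.from_pretrained(model_name)

# The change this PR makes: request half precision at load time, which halves
# weight memory and can shorten CPU test runtime.
# Note: some CPU kernels have limited float16 support, which is part of why
# bfloat16 is also under discussion (see the review comment below).
hf_model_fp16 = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16)

inputs = tokenizer("Hello, world", return_tensors="pt")
with torch.no_grad():
    out = hf_model_fp16.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))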

Related Issues


👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure your code passes all the linting checks, otherwise your PR can't be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@prashantgupta24 prashantgupta24 changed the title from "🚨 set dtype=float16 for CPU as well" to "[WIP]🚨 set dtype=float16 for CPU as well" on Jun 27, 2025
@prashantgupta24 prashantgupta24 marked this pull request as draft June 27, 2025 19:04
@prashantgupta24 prashantgupta24 force-pushed the float16 branch 2 times, most recently from bc83e4b to 3190d8d on June 27, 2025 20:34
tjohnson31415 and others added 4 commits July 1, 2025 09:47
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
@@ -228,7 +229,8 @@ def generate_hf_output(
     if not isinstance(max_new_tokens, list):
         max_new_tokens = [max_new_tokens] * len(prompts)
 
-    hf_model = AutoModelForCausalLM.from_pretrained(model)
+    hf_model = AutoModelForCausalLM.from_pretrained(model,
+                                                    torch_dtype=torch.float16)
prashantgupta24 (Collaborator, Author) commented:


not sure if this should be float16 or bfloat16
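
For context on that open question: bfloat16 keeps float32's exponent range (so it is less prone to overflow in logits and softmax), while float16 carries more mantissa bits; a common tiebreaker is the dtype the checkpoint was saved in. A minimal sketch of that check, using a hypothetical helper and a placeholder model name, neither of which is part of this PR or vllm-spyre:

import torch
from transformers import AutoConfig

def pick_cpu_dtype(model_name: str) -> torch.dtype:
    # Hypothetical helper for illustration only.
    cfg = AutoConfig.from_pretrained(model_name)
    saved = getattr(cfg, "torch_dtype", None)
    if isinstance(saved, str):
        saved = getattr(torch, saved, None)
    # Prefer the dtype the checkpoint was saved in, if it is a half type.
    if saved in (torch.float16, torch.bfloat16):
        return saved
    # Otherwise default to bfloat16, which preserves float32's dynamic range.
    return torch.bfloat16

print(pick_cpu_dtype("gpt2"))  # placeholder model; prints torch.bfloat16 here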
