[V1] Defragmentation support #1568
/run-gaudi-tests
2a8407a to a3b8961
/run-gaudi-tests
/run-gaudi-tests
Pull Request Overview
This PR adds support for memory defragmentation in the HPU model runner, introduces step-based profiling in the HPU worker, and ensures runtime configuration is finalized.
- Import and invoke `finalize_config` to apply updated runtime settings.
- Add `setup_step_profiler` and debug logging to track per-step profiling in `HPUWorker`.
- Integrate `OnlineDefragmenter` into `HPUModelRunner` to resolve, track, and defragment block IDs.
- Update the HPU extension dependency to the defragmentation-enabled branch.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
vllm/worker/hpu_model_runner.py | Import and call finalize_config after setting VLLM config. |
vllm/v1/worker/hpu_worker.py | Add setup_step_profiler, per-step profiling, and debug logging. |
vllm/v1/worker/hpu_model_runner.py | Initialize and use OnlineDefragmenter throughout model runner. |
requirements/hpu.txt | Point vllm-hpu-extension to dev/madamczyk/v1_defrag branch. |
Comments suppressed due to low confidence (5)
vllm/v1/worker/hpu_worker.py:36
- [nitpick] The helper `setup_step_profiler` lacks a docstring; add a brief description of its purpose and the meaning of its parameters.
def setup_step_profiler(steps):
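As an illustration of the docstring this comment asks for, here is a minimal sketch. Only the name `setup_step_profiler` and its single `steps` parameter come from the PR; the `(first_step, last_step)` shape of `steps`, the `StepWindow` helper, and the return value are assumptions made for the example and may not match the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StepWindow:
    """Inclusive range of worker steps to profile (hypothetical helper)."""
    start: int
    end: int

    def is_active(self, step: int) -> bool:
        """Return True if `step` falls inside the profiled range."""
        return self.start <= step <= self.end


def setup_step_profiler(
        steps: Optional[Tuple[int, int]]) -> Optional[StepWindow]:
    """Describe which worker steps should be profiled.

    Args:
        steps: ``(first_step, last_step)``, both inclusive, or ``None`` to
            disable step profiling. This shape is an assumption; the PR
            only shows that the function takes one ``steps`` argument.

    Returns:
        A ``StepWindow`` for the requested range, or ``None`` when
        profiling is disabled.

    Raises:
        ValueError: If the range is negative or reversed.
    """
    if steps is None:
        return None
    start, end = steps
    if start < 0 or end < start:
        raise ValueError(f"invalid step range: {steps!r}")
    return StepWindow(start, end)
```

A caller could then check `window.is_active(step)` once per step to decide whether to record a trace.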
vllm/v1/worker/hpu_worker.py:104
- [nitpick] The attribute name `step_debug` may not clearly convey its purpose; consider renaming it to `step_logger` or similar for clarity.
self.step_debug = init_debug_logger('steps')
vllm/v1/worker/hpu_worker.py:101
- [nitpick] The variable `step` is quite generic; renaming to `current_step` or `step_counter` could improve readability.
self.step = 0
vllm/v1/worker/hpu_model_runner.py:619
- The integration of `OnlineDefragmenter` is significant but currently untested; consider adding unit tests for its `resolve`, `update_state`, and `defragment` flows.
self.defragmenter = OnlineDefragmenter()
vllm/worker/hpu_model_runner.py:990
- [nitpick] The indentation of `finalize_config()` is inconsistent with the surrounding block; align it with the `environment.set_vllm_config` call for clarity.
finalize_config()
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
ab51d58 to dde41a0
/run-gaudi-tests
/run-gaudi-tests
/run-gaudi-tests
/skip-gaudi-tests
CI tests already passed before sha update: ef7cbbc
extension PR: HabanaAI/vllm-hpu-extension#275