[V1] Defragmentation support #1568

Merged: 6 commits, Jul 30, 2025

@madamczyk-intel commented Jul 10, 2025

@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel force-pushed the dev/madamczyk/v1_defrag branch 2 times, most recently from 2a8407a to a3b8961, on July 17, 2025 12:11
@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel requested a review from Copilot on July 17, 2025 12:41
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for memory defragmentation in the HPU model runner, introduces step-based profiling in the HPU worker, and ensures runtime configuration is finalized.

  • Import and call finalize_config to apply updated runtime settings.
  • Add setup_step_profiler and debug logging to track per-step profiling in HPUWorker.
  • Integrate OnlineDefragmenter into HPUModelRunner to resolve, track, and defragment block IDs.
  • Update HPU extension dependency to the defragmentation-enabled branch.
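
To make the "resolve, track, and defragment block IDs" idea concrete, here is a minimal toy sketch. It is not the OnlineDefragmenter from vllm-hpu-extension: the class name `ToyDefragmenter`, its method signatures, and the compaction policy (swap the highest used physical block into the lowest free slot) are all assumptions made for illustration. The method names only mirror the resolve, update_state, and defragment flows mentioned in this review.

```python
# Toy illustration only. ToyDefragmenter is hypothetical and does not
# reproduce the OnlineDefragmenter API or its actual algorithm.
class ToyDefragmenter:
    def __init__(self):
        self.fwd = {}        # logical block id -> physical block id (identity if absent)
        self.bwd = {}        # physical block id -> logical block id (identity if absent)
        self.in_use = set()  # logical block ids currently allocated

    def resolve(self, logical_id):
        """Translate a scheduler-visible block id into its physical slot."""
        return self.fwd.get(logical_id, logical_id)

    def update_state(self, allocated=(), freed=()):
        """Track which logical blocks are live after a scheduling step."""
        self.in_use.difference_update(freed)
        self.in_use.update(allocated)

    def _swap(self, phys_a, phys_b):
        # Swapping keeps the logical<->physical mapping a permutation,
        # so a later allocation can never collide with a moved block.
        log_a = self.bwd.get(phys_a, phys_a)
        log_b = self.bwd.get(phys_b, phys_b)
        self.fwd[log_a], self.fwd[log_b] = phys_b, phys_a
        self.bwd[phys_a], self.bwd[phys_b] = log_b, log_a

    def defragment(self):
        """Compact live blocks toward low physical ids.

        Returns the (src, dst) physical copies a caller would have to
        perform on the cache tensors.
        """
        used_phys = sorted((self.resolve(b) for b in self.in_use), reverse=True)
        used_set = set(used_phys)
        moves = []
        free_slot = 0
        for src in used_phys:
            while free_slot in used_set:
                free_slot += 1
            if free_slot >= src:
                break  # everything remaining is already compact
            self._swap(src, free_slot)
            used_set.discard(src)
            used_set.add(free_slot)
            moves.append((src, free_slot))
        return moves


d = ToyDefragmenter()
d.update_state(allocated=[0, 5, 9])
moves = d.defragment()
print(moves)  # [(9, 1), (5, 2)]
```

After the call, `d.resolve(9)` returns 1 and `d.resolve(5)` returns 2, so callers keep using the scheduler's block ids while the physical layout stays dense. A unit test for the real defragmenter could follow the same shape: allocate, free, defragment, then check the resolved ids.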

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vllm/worker/hpu_model_runner.py | Import and call finalize_config after setting VLLM config. |
| vllm/v1/worker/hpu_worker.py | Add setup_step_profiler, per-step profiling, and debug logging. |
| vllm/v1/worker/hpu_model_runner.py | Initialize and use OnlineDefragmenter throughout model runner. |
| requirements/hpu.txt | Point vllm-hpu-extension to dev/madamczyk/v1_defrag branch. |

Comments suppressed due to low confidence (5)

vllm/v1/worker/hpu_worker.py:36

  • [nitpick] The helper setup_step_profiler lacks a docstring; add a brief description of its purpose and the meaning of its parameters.
def setup_step_profiler(steps):

vllm/v1/worker/hpu_worker.py:104

  • [nitpick] The attribute name step_debug may not clearly convey its purpose; consider renaming it to step_logger or similar for clarity.
        self.step_debug = init_debug_logger('steps')

vllm/v1/worker/hpu_worker.py:101

  • [nitpick] The variable step is quite generic; renaming to current_step or step_counter could improve readability.
        self.step = 0

vllm/v1/worker/hpu_model_runner.py:619

  • The integration of OnlineDefragmenter is significant but currently untested; consider adding unit tests for its resolve, update_state, and defragment flows.
        self.defragmenter = OnlineDefragmenter()

vllm/worker/hpu_model_runner.py:990

  • [nitpick] The indentation of finalize_config() is inconsistent with the surrounding block; align it with the environment.set_vllm_config call for clarity.
        finalize_config()

Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
@madamczyk-intel force-pushed the dev/madamczyk/v1_defrag branch from ab51d58 to dde41a0 on July 29, 2025 10:11
@madamczyk-intel
Author

/run-gaudi-tests

Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel
Author

/skip-gaudi-tests

@madamczyk-intel
Author

CI tests already passed before sha update: ef7cbbc

@madamczyk-intel merged commit 046343b into habana_main on Jul 30, 2025
6 checks passed
@madamczyk-intel deleted the dev/madamczyk/v1_defrag branch on July 30, 2025 07:32