Support different tp_size - Part 1 p/d dynamic connection #2

Open
wants to merge 5 commits into
base: mooncake_transfer_engine

yuan-luo
Collaborator

@yuan-luo yuan-luo commented Apr 7, 2025

Motivation

Support different tensor parallel (TP) sizes on the prefill and decode sides. This PR is Part 1: it adds connection establishment between prefill and decode, and registration of both with the bootstrap server.

Modifications

  1. The bootstrap server supports a REGISTER HTTP route that serves both prefill and decode.
  2. KVManager adds a connection pool cache.
  3. Refactor decode's connection establishment with the bootstrap server.
  4. Refactor prefill's connection establishment with the bootstrap server.
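
As a rough illustration of items 1 and 2, the sketch below models registration and connection caching in plain Python. It is not the code in this PR: `Endpoint`, `BootstrapRegistry`, `ConnectionPool`, and the `connect` callback are hypothetical names invented for the example, and the real bootstrap server exposes registration over HTTP rather than as an in-process table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical record a worker would send when it registers."""
    role: str      # "prefill" or "decode"
    tp_rank: int   # rank within that side's tensor parallel group
    host: str
    port: int


class BootstrapRegistry:
    """In-memory stand-in for the table behind a REGISTER route (assumption)."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, int], Endpoint] = {}

    def register(self, ep: Endpoint) -> None:
        if ep.role not in ("prefill", "decode"):
            raise ValueError(f"unknown role: {ep.role}")
        # Re-registering the same (role, tp_rank) overwrites the old entry.
        self._table[(ep.role, ep.tp_rank)] = ep

    def lookup(self, role: str, tp_rank: int) -> Optional[Endpoint]:
        return self._table.get((role, tp_rank))

    def peers(self, role: str) -> List[Endpoint]:
        """All registered endpoints for one side, ordered by tp_rank."""
        found = [ep for (r, _), ep in self._table.items() if r == role]
        return sorted(found, key=lambda ep: ep.tp_rank)


class ConnectionPool:
    """Caches one connection per (host, port) so repeated lookups reuse it."""

    def __init__(self, connect: Callable[[Endpoint], object]) -> None:
        self._connect = connect  # hypothetical factory that opens a connection
        self._cache: Dict[Tuple[str, int], object] = {}

    def get(self, ep: Endpoint) -> object:
        key = (ep.host, ep.port)
        conn = self._cache.get(key)
        if conn is None:
            conn = self._connect(ep)
            self._cache[key] = conn
        return conn
```

Because each side registers its own ranks independently, the table can hold, say, two prefill ranks and four decode ranks at once, which is the situation the differing TP sizes create. The cache key here is `(host, port)`; the PR may key its pool differently.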

@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 5 times, most recently from e62a485 to 58b4d73 Compare April 8, 2025 14:16
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 3 times, most recently from fc3c6ff to 544534c Compare April 8, 2025 15:21
@yuan-luo yuan-luo changed the title WIP: Support different tp size WIP: Support different tp size - Part 1 Apr 8, 2025
@yuan-luo yuan-luo self-assigned this Apr 8, 2025
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 2 times, most recently from 36f32ff to ebcdc7b Compare April 9, 2025 07:38
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 2 times, most recently from 4ae00f0 to b032106 Compare April 9, 2025 08:27
@stmatengss stmatengss requested a review from Copilot April 9, 2025 13:42
@Copilot Copilot AI left a comment

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 2 times, most recently from 69fa651 to 326e583 Compare April 10, 2025 01:23
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch 3 times, most recently from 35abefb to ccaac4f Compare April 10, 2025 08:35
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch from ccaac4f to 0664822 Compare April 10, 2025 10:13
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch from 7d7e521 to b5fafe7 Compare April 12, 2025 08:13
@yuan-luo yuan-luo force-pushed the sgl_different_tp_size branch from b5fafe7 to 0f33b49 Compare April 12, 2025 08:22
@yuan-luo yuan-luo changed the title WIP: Support different tp size - Part 1 Support different tp_size - Part 1 p/d dynamic connection Apr 12, 2025
4 participants