adil-a (Collaborator) commented Nov 6, 2025

Context parallelism (CP) splits the sequence inside the fwd/bwd context manager, but we calculate TPS outside the CP context, where the token count still reflects the full, unsplit sequence. As a result, we need to account for the CP group size when reporting per-GPU TPS.
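
To illustrate the correction, here is a minimal sketch, not the code in this PR. It assumes tokens are counted before the CP split, so each GPU in a CP group processes only 1/`cp_size` of the counted tokens. The function name `per_gpu_tps` and its parameters are hypothetical.

```python
def per_gpu_tps(tokens_counted: int, step_time_s: float, cp_size: int = 1) -> float:
    """Return tokens per second per GPU, corrected for context parallelism.

    Assumption (hypothetical setup): `tokens_counted` is measured outside the
    CP context, i.e. on the full sequence before it is split across the
    `cp_size` ranks of the CP group.
    """
    if cp_size < 1:
        raise ValueError("cp_size must be >= 1")
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    # Each rank handles only 1/cp_size of the counted tokens, so divide
    # the naive throughput by the CP group size.
    return tokens_counted / (step_time_s * cp_size)


# Same step, with and without context parallelism.
naive = per_gpu_tps(8192, 2.0, cp_size=1)
corrected = per_gpu_tps(8192, 2.0, cp_size=4)
```

With `cp_size=1` the function reduces to the uncorrected calculation, so runs that do not use CP are unaffected.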

Signed-off-by: adil-a <adil.asif2000@hotmail.com>
copy-pr-bot (bot) commented Nov 6, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

adil-a linked an issue on Nov 6, 2025 that may be closed by this pull request
adil-a (Collaborator, Author) commented Nov 6, 2025

/ok to test 10396d6

Development

Successfully merging this pull request may close these issues.

TPS incorrectly reported with CP > 1

2 participants