Hi,
I'm running a workload that involves both CPU and TPU processing, where there's a true dependency from CPU to TPU—the CPU must complete its task before the TPU begins, as the TPU relies on data prepared by the CPU.
However, the actual trace shows the opposite, with the CPU appearing to finish after the TPU has started. I’d like to ask whether the profiling traces from the CPU and TPU are accurately time-aligned. Could this be a trace misalignment issue?
The example program is below, and the TPU kernel has a true dependency on the CPU kernel's result.
The figure below is the trace for the example program generated by `jax.profiler.trace()`. In the trace, we can see the TPU kernel starts before the CPU kernel finishes. Moreover, the block the red arrow points to, I think, is the CPU calling the TPU kernel, and it is also after the corresponding TPU kernel runs.
Thus, we suspect there is a time shift between the CPU and TPU traces.
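To put a number on the apparent shift, a minimal sketch like the one below could be used. It assumes the trace has been exported as Chrome trace event JSON, where complete events (`"ph": "X"`) carry a start `ts` and a `dur`, both in microseconds. The event names `cpu_prepare` and `tpu_kernel`, the timestamps, and the helper functions are all hypothetical placeholders, not values from the actual trace.

```python
def complete_events(trace):
    """Return the complete ('ph' == 'X') events of a Chrome-trace-format dict."""
    return [e for e in trace.get("traceEvents", []) if e.get("ph") == "X"]


def dependency_violation_us(producer, consumer):
    """Microseconds by which the consumer starts before the producer ends.

    A positive value means the trace shows the consumer starting too early;
    zero or a negative value means the recorded ordering is consistent.
    """
    producer_end = producer["ts"] + producer["dur"]
    return producer_end - consumer["ts"]


# Hypothetical events: names and timestamps are made up for illustration.
trace = {
    "traceEvents": [
        {"name": "cpu_prepare", "ph": "X", "pid": 1, "tid": 1,
         "ts": 1000.0, "dur": 500.0},
        {"name": "tpu_kernel", "ph": "X", "pid": 2, "tid": 1,
         "ts": 1300.0, "dur": 400.0},
    ]
}

events = {e["name"]: e for e in complete_events(trace)}
shift = dependency_violation_us(events["cpu_prepare"], events["tpu_kernel"])
print(f"TPU kernel starts {shift} us before the CPU step ends")
```

If the dependency really holds, a positive result is a lower bound on the offset between the two clocks. It cannot distinguish a clock offset from an event that was simply labeled or attributed differently than expected.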
Thanks,
Jingtian