[SYCL][UR] Fix the accuracy of command submission timestamp (#18735)
The current "submission time" calculation is inaccurate. Although
zeDeviceGetGlobalTimestamps returns a synchronized pair of host and
device timestamps, we use only the device timestamp from that call and
record the host time with std::chrono "close" to the call. This
estimate becomes inaccurate pretty quickly. This PR fixes the problem
by relying on the known fact that the L0 runtime implementation uses
CLOCK_MONOTONIC_RAW on Linux and QueryPerformanceCounter on Windows.
With this fix, the first call uses both the device and host timestamps
from zeDeviceGetGlobalTimestamps. Subsequent calls (when the device
timestamp is not requested) return only the corresponding host
timestamp, without making the high-latency
`zeDeviceGetGlobalTimestamps` call.
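As a rough illustration of the idea (not the code in this PR), the
sketch below takes one paired host/device sample as an anchor and then
reads only the host clock afterwards. The names TimestampAnchor,
hostNowNs and hostToDeviceNs are hypothetical, the Linux-only
CLOCK_MONOTONIC_RAW path is shown, and the device timer resolution is
assumed to be known to the caller.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical anchor: one synchronized (host, device) pair, e.g. filled
// once from a single zeDeviceGetGlobalTimestamps call.
struct TimestampAnchor {
  uint64_t HostNs;      // host timestamp at the anchor, in nanoseconds
  uint64_t DeviceTicks; // device timestamp at the anchor, in device ticks
  double NsPerTick;     // assumed device timer resolution (ns per tick)
};

// Reads the host clock that, per the description above, the L0 runtime
// uses on Linux. No device call is made here.
inline uint64_t hostNowNs() {
  timespec Ts{};
  clock_gettime(CLOCK_MONOTONIC_RAW, &Ts);
  return static_cast<uint64_t>(Ts.tv_sec) * 1000000000ull +
         static_cast<uint64_t>(Ts.tv_nsec);
}

// Maps a host timestamp onto the device timeline (in ns) by adding the
// host time elapsed since the anchor to the anchor's device time.
inline uint64_t hostToDeviceNs(const TimestampAnchor &A, uint64_t HostNs) {
  const uint64_t AnchorDeviceNs =
      static_cast<uint64_t>(A.DeviceTicks * A.NsPerTick);
  // Signed delta so that a host time slightly before the anchor is
  // handled without unsigned wrap-around.
  const int64_t Delta =
      static_cast<int64_t>(HostNs) - static_cast<int64_t>(A.HostNs);
  return static_cast<uint64_t>(static_cast<int64_t>(AnchorDeviceNs) + Delta);
}
```

With such an anchor, a submission time could be estimated as
hostToDeviceNs(Anchor, hostNowNs()), which costs one clock_gettime call
instead of a round trip to the device.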
Even though this approach improves accuracy and the submit time no
longer becomes "invalid" (submit time > start_time) as quickly, it
still doesn't guarantee that this cannot happen. An additional fix for
that will be made in #18717, where the test is also updated to check a
larger number of iterations.
Also apply the same fix as 138bef7 to the CUDA adapter (i.e. record
the host time after cuEventSynchronize for a more precise
measurement).