Performance Benchmarking: Unstable Timing for JIT-compiled Matrix Multiplication #29017
Unanswered
juliusuberall asked this question in Q&A
Replies: 1 comment
-
Never mind my comment about asynchronous dispatch: I somehow didn't see your blocking in my initial read of the code! I'll take a look later today to see if I have more relevant suggestions.
-
While microbenchmarking in JAX, we ran into unstable wall-clock time measurements.
We created a minimal case to investigate this behaviour further; the code is below. We have a function that performs a matrix multiplication; we trace and JIT-compile it, then do 100K independent runs for benchmarking.
We tested this on both CPU and GPU using Google Colab. In both cases the measured time fluctuates heavily from run to run, which makes reliable benchmarking difficult. Find attached the time plots for GPU and CPU.
Is this expected behaviour? Are we missing something essential? Has anyone encountered this before while benchmarking with JAX?
Thankful for any ideas or suggestions!
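For reference, the setup described above can be sketched roughly as follows. This is not the original snippet: the matrix size, dtype, run count, and the names `matmul` and `time_matmul` are assumptions chosen for illustration. The warm-up call keeps tracing and compilation out of the measurements, and `block_until_ready()` makes each timing cover the actual computation rather than only the asynchronous dispatch.

```python
import time

import jax
import jax.numpy as jnp
import numpy as np


@jax.jit
def matmul(a, b):
    # The function under test: a plain matrix multiplication.
    return a @ b


def time_matmul(n=512, runs=1000, seed=0):
    """Return per-call wall-clock times in seconds for a jitted n x n matmul.

    n, runs and seed are illustrative defaults, not values from the thread.
    """
    key_a, key_b = jax.random.split(jax.random.PRNGKey(seed))
    a = jax.random.normal(key_a, (n, n), dtype=jnp.float32)
    b = jax.random.normal(key_b, (n, n), dtype=jnp.float32)

    # Warm-up call: triggers tracing and compilation so they are not timed.
    matmul(a, b).block_until_ready()

    times = np.empty(runs)
    for i in range(runs):
        start = time.perf_counter()
        # JAX dispatches asynchronously; block so the timer covers the compute.
        matmul(a, b).block_until_ready()
        times[i] = time.perf_counter() - start
    return times


if __name__ == "__main__":
    t = time_matmul()
    print(f"backend: {jax.default_backend()}")
    print(
        f"median {np.median(t) * 1e6:.1f} us, "
        f"min {t.min() * 1e6:.1f} us, max {t.max() * 1e6:.1f} us"
    )
```

Printing the median alongside the minimum and maximum gives a quick sense of how widely individual runs spread on a given backend.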