When profiling CUDA/CUTLASS kernels, the profiler can show line-by-line results against the user's source code, in addition to PTX and SASS. Triton supports this too, likely because its compiler tracks source locations. I believe CuTeDSL could offer the same feature, since it also tracks source locations. However, I'm unsure how to enable it: the default ncu output only shows SASS. Do you know whether source-level profiling is possible with CuTeDSL and, if so, how to enable it?