How to Analyze Operator Latency Breakdown in llama.cpp? #10839
I am currently looking to analyze the latency of operators in a model (e.g., a breakdown of operator time proportions). I know that |
Replies: 1 comment
Take a look at #9659
Currently it supports only the CPU backend but will give you the info you're looking for.
I'm planning to update it to the latest master and add support for the OpenCL backend (and hopefully others later).
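
Once you have per-operator timings from whatever profiler you end up using, the breakdown itself is just a sum per operator divided by the total. Here is a minimal Python sketch of that step. Note the assumptions: the `(op_name, elapsed_us)` record format, the `op_time_breakdown` helper, and the sample numbers are all made up for illustration and are not the output format of #9659.

```python
from collections import defaultdict


def op_time_breakdown(records):
    """Aggregate (op_name, elapsed_us) pairs into per-operator totals and shares.

    Returns a list of (op_name, total_us, share) tuples sorted by total time,
    largest first. The record format is an assumption, not a llama.cpp format.
    """
    totals = defaultdict(float)
    for op_name, elapsed_us in records:
        totals[op_name] += elapsed_us

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # nothing measured, so no proportions to report

    rows = [(op, t, t / grand_total) for op, t in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical timings in microseconds, for illustration only.
sample = [
    ("MUL_MAT", 600.0),
    ("SOFT_MAX", 100.0),
    ("MUL_MAT", 200.0),
    ("ADD", 100.0),
]

for op, total_us, share in op_time_breakdown(sample):
    print(f"{op:10s} {total_us:10.1f} us {share:6.1%}")
```

Sorting by total time puts the most expensive operators at the top, which is usually what you want when deciding where to optimize.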