RPC GPU Offload Tracing #9518

Allan-Luu · 2024-09-17T04:25:56Z

Hello,

I have GPUs on a cluster that receive offloaded layers, through RPC servers, from a llama.cpp instance running on a different host. I'm trying to identify which process is offloading the LLM inference onto the GPUs and to trace the GPU events triggered by llama.cpp.

Any advice on how to trace the GPU events triggered by RPC servers?

Answered by rgerganov

rgerganov · 2024-09-17T07:11:02Z

Collaborator

You can trace the calls to rpc-server by setting GGML_DEBUG to 1 here.
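
For readers unfamiliar with this kind of switch, here is a minimal sketch of how a compile-time debug flag typically gates trace output. It assumes GGML_DEBUG is a preprocessor define that is changed in the source, after which rpc-server has to be rebuilt. The names `TRACE_PRINT` and `handle_graph_compute` are hypothetical and only illustrate the pattern; they are not llama.cpp's actual code.

```cpp
#include <cstdio>

// Assumption: the flag is a preprocessor define (0 = silent, 1 = trace calls).
#ifndef GGML_DEBUG
#define GGML_DEBUG 1
#endif

// Hypothetical trace macro: prints to stderr when the flag is on,
// and compiles to nothing when it is off.
#if (GGML_DEBUG >= 1)
#define TRACE_PRINT(...) std::fprintf(stderr, __VA_ARGS__)
#else
#define TRACE_PRINT(...) ((void)0)
#endif

// Hypothetical RPC handler; the name and signature are illustrative only.
static int handle_graph_compute(int n_nodes) {
    // Each handler logs its own name and arguments when tracing is enabled.
    TRACE_PRINT("[%s] n_nodes: %d\n", __func__, n_nodes);
    return n_nodes; // pretend every node was computed
}
```

In this sketch, building with `-DGGML_DEBUG=0` removes the trace call entirely, so there is no runtime cost when tracing is off.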

1 reply

Allan-Luu Sep 17, 2024
Author

Is there an option to also trace calls to the local GPUs on the host machine?
