-
@rootfs and @sunya-ch, we can decouple metric collection from estimation within the Kepler exporter. This would mean dedicating one container to collecting all metrics and another to receiving those metrics and performing power estimation with power models. To illustrate this architecture, see the diagram below: ![]()
-
As a supplementary comment on the gRPC proto: we can think of it as an alternative way to expose the metrics collected by the Kepler metric collector, in addition to the Prometheus format on the /metrics endpoint. A query to the Prometheus metric server can also filter the response down to a specific set of metrics; however, as @marceloamaral mentioned above, connecting directly to the Kepler collector reduces the query load on the Prometheus metric server.
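To show what "ask the collector directly for only what you need" could look like, here is a small Python sketch. It is an in-process stand-in for a gRPC service method, not a real proto or Kepler API: the `MetricCollector` class, the `get_metrics` method, and the metric names are all hypothetical.

```python
from typing import Dict, Iterable


class MetricCollector:
    """Hypothetical collector holding the latest value of each metric."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}

    def record(self, name: str, value: float) -> None:
        # Overwrite the stored value with the most recent reading.
        self._latest[name] = value

    def get_metrics(self, names: Iterable[str]) -> Dict[str, float]:
        """Return only the requested metrics; unknown names are skipped."""
        return {n: self._latest[n] for n in names if n in self._latest}


collector = MetricCollector()
collector.record("cpu_cycles", 1000.0)
collector.record("cache_misses", 10.0)
collector.record("page_faults", 3.0)

# The estimator asks for just the features its model uses (assumed names).
features = collector.get_metrics(["cpu_cycles", "cache_misses", "not_collected"])
```

With a request shaped like this, the estimator never has to pull the full metric set through the Prometheus server just to read a few features.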
-
@marceloamaral @sunya-ch, building on the gRPC idea: the Kepler eBPF collector could use a telemetry library, and the Kepler power estimator would be the metrics collector and export the metrics to Prometheus. wdyt?
-
Does this mean we need to rewrite or move some features from Kepler to the Kepler model server?
-
Currently Kepler exports both eBPF metrics and energy estimates; the energy estimation is based on an ML model that uses the eBPF metrics as features.
This monolithic, synchronous architecture (i.e. all metrics are produced synchronously) had a number of advantages for deployment and upgrades in the early days. Now I think we are seeing scenarios that favor a decoupled, asynchronous architecture that produces eBPF metrics and energy metrics asynchronously. This architecture consists of two exporters: one for eBPF metrics and one for energy estimation. The eBPF metrics exporter just exports eBPF metrics, while the energy estimation exporter reads the eBPF metrics, applies ML models, and then exports the energy estimates.
The async architecture has several advantages. In particular, the energy estimation exporter can be used to predict energy from offline eBPF metrics, whereas the sync exporter has to export both kinds of metrics synchronously.
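To make the split concrete, here is a minimal Python sketch of the two roles. Everything in it is hypothetical: the `Sample` fields, the `LinearPowerModel` coefficients, and the function names are illustrative placeholders, not Kepler's actual metrics, model, or API. The only point is that the estimator depends on a stream of samples, so the same code can consume live collector output or a recorded offline trace.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Sample:
    """One eBPF-style measurement for a container (hypothetical schema)."""
    container: str
    cpu_cycles: int
    cache_misses: int


class LinearPowerModel:
    """Stand-in for a trained ML model: a fixed linear combination of features."""

    def __init__(self, per_cycle: float, per_miss: float) -> None:
        self.per_cycle = per_cycle
        self.per_miss = per_miss

    def predict(self, sample: Sample) -> float:
        # Energy in arbitrary units; the coefficients are made up.
        return sample.cpu_cycles * self.per_cycle + sample.cache_misses * self.per_miss


def collect() -> Iterator[Sample]:
    """Collector role: only emits raw metrics, knows nothing about energy."""
    yield Sample("web", cpu_cycles=1000, cache_misses=10)
    yield Sample("db", cpu_cycles=4000, cache_misses=50)


def estimate(samples: Iterable[Sample], model: LinearPowerModel) -> Dict[str, float]:
    """Estimator role: turns any stream of samples into energy per container."""
    energy: Dict[str, float] = {}
    for s in samples:
        energy[s.container] = energy.get(s.container, 0.0) + model.predict(s)
    return energy


model = LinearPowerModel(per_cycle=0.5, per_miss=2.0)

# Live path: the estimator consumes the collector's output.
live = estimate(collect(), model)

# Offline path: the same estimator replays previously recorded samples.
recorded: List[Sample] = [Sample("web", 2000, 0), Sample("web", 1000, 10)]
offline = estimate(recorded, model)
```

In a real deployment the two roles would run as separate processes or containers and the iterable would be replaced by a network stream or a scrape, but the dependency direction would stay the same: the collector never needs to know about the model.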
Comments are welcome.
@sustainable-computing-io/maintainer