Take into account CPU spikes #456

@dudicoco

Is your feature request related to a problem? Please describe.

Prometheus is sample-based, which means that even with a 15-second scrape interval we can miss many short CPU spikes of 100% or more.
So a Prometheus query may show 95th-percentile utilization at 50% of the current CPU requests value, while in practice lowering the requests could cause CPU throttling and/or increased latency.
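
To make the concern concrete, here is a minimal sketch with made-up numbers. It assumes a hypothetical container with 1 core requested, a baseline of 0.4 cores, and a one-second burst at 2 cores inside a single 15-second scrape window, and it assumes utilization is derived as an average over that window. All names and values are illustrative and not taken from any real workload.

```python
# Illustrative only: every value below is an assumption, not measured data.
SLICE_S = 0.1        # resolution of the simulated "true" usage (100 ms)
WINDOW_S = 15        # scrape interval in seconds
REQUEST_CORES = 1.0  # hypothetical CPU request


def simulate_window(baseline_cores, spike_cores, spike_slices):
    """Return per-slice CPU usage (in cores) for one scrape window."""
    n = round(WINDOW_S / SLICE_S)
    usage = [baseline_cores] * n
    for i in range(spike_slices):
        usage[i] = spike_cores  # short burst at the start of the window
    return usage


def window_average(usage):
    """What an average over the whole scrape window would report."""
    return sum(usage) / len(usage)


def slices_over(usage, limit_cores):
    """Number of slices in which usage exceeded the given limit."""
    return sum(1 for u in usage if u > limit_cores)


usage = simulate_window(baseline_cores=0.4, spike_cores=2.0, spike_slices=10)
avg = window_average(usage)
peak = max(usage)
over = slices_over(usage, REQUEST_CORES)

print(f"window average: {avg / REQUEST_CORES:.0%} of request")
print(f"true peak:      {peak / REQUEST_CORES:.0%} of request")
print(f"slices above request: {over} of {len(usage)}")
```

In this sketch the window average comes out at roughly half of the request, even though the container ran at twice the request for a full second. Sizing requests from the average alone would hide exactly the bursts that are most likely to be throttled.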

Describe the solution you'd like
I believe this needs a profiling tool, such as an eBPF exporter, that could expose a metric capturing CPU spikes.
Perhaps something like https://github.com/cloudflare/ebpf_exporter
