[awsemfexporter] Fix grouping for container insights metrics for mixed metric types #320
Conversation
Do you mean that the exporter splits the metrics into separate log events within the same CloudWatch Log Group? Or do they actually go to separate CloudWatch log groups?
if metadata.receiver == containerInsightsReceiver {
    // For container insights, put all metrics in the same group regardless of type (i.e. gauge/counter)
    metadata.groupedMetricMetadata.metricDataType = pmetric.MetricTypeEmpty
}
I am concerned about mixing other metric types into the same log event, e.g. histograms/summary with gauge/counters.
This will only group them into the same log event if they also have the same dimensions, right? If so, then I think it'd be ok.
We don't currently emit non-counter metrics for container insights. But if one of the container insights Prometheus scrapers (neuron/dcgm/nvme/etc.) starts ingesting non-counter metrics, we want all of that to be in the same log. Otherwise customers will be charged extra.
Histogram/summary metrics have extra dimensions like "quantile" which we wouldn't want applied to the counter/gauge metrics since that doesn't make any sense. So I think they will have to be in a separate log event.
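To make the grouping behavior concrete, here is a minimal, self-contained Go sketch (not the exporter's actual code) of a grouping key built from the metric data type plus the label set. It assumes the simplified premise that metrics sharing a key land in the same EMF log event: clearing the type merges gauges and counters that have identical dimensions, while a summary's extra quantile label still produces a separate key.

// Illustrative sketch only: a simplified stand-in for the exporter's grouping key.
package main

import (
    "fmt"
    "sort"
    "strings"

    "go.opentelemetry.io/collector/pdata/pmetric"
)

// groupKey is a hypothetical key; the real exporter hashes the grouped-metric
// metadata (which includes metricDataType) together with the label set.
func groupKey(dataType pmetric.MetricType, labels map[string]string) string {
    keys := make([]string, 0, len(labels))
    for k := range labels {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    parts := []string{fmt.Sprintf("type=%d", dataType)}
    for _, k := range keys {
        parts = append(parts, k+"="+labels[k])
    }
    return strings.Join(parts, ",")
}

func main() {
    labels := map[string]string{"ClusterName": "demo", "InstanceId": "i-123"}

    // Before the fix: a gauge and a counter with identical labels get different keys.
    fmt.Println(groupKey(pmetric.MetricTypeGauge, labels) != groupKey(pmetric.MetricTypeSum, labels)) // true

    // After the fix: both carry MetricTypeEmpty, so the keys match (one group).
    fmt.Println(groupKey(pmetric.MetricTypeEmpty, labels) == groupKey(pmetric.MetricTypeEmpty, labels)) // true

    // A summary datapoint with an extra "quantile" label still gets its own key.
    withQuantile := map[string]string{"ClusterName": "demo", "InstanceId": "i-123", "quantile": "0.99"}
    fmt.Println(groupKey(pmetric.MetricTypeEmpty, labels) == groupKey(pmetric.MetricTypeEmpty, withQuantile)) // false
}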
if metadata.receiver == containerInsightsReceiver {
    // For container insights, put all metrics in the same group regardless of type (i.e. gauge/counter)
    metadata.groupedMetricMetadata.metricDataType = pmetric.MetricTypeEmpty
Need to test if this affects any other part of container insights. I see that the only other reference to the data type is in the translateGroupedMetricToCWMetric func:
if isPrometheusMetric {
    fields[fieldPrometheusMetricType] = fieldPrometheusTypes[groupedMetric.metadata.metricDataType]
}
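As a rough illustration of what that lookup does once the data type is cleared, here is a small sketch. The map contents and the fieldPrometheusMetricType value below are placeholders, not the exporter's real definitions; the point is that indexing a map keyed by pmetric.MetricType with MetricTypeEmpty falls through to the zero value, so the Prometheus type field would come out empty.

package main

import (
    "fmt"

    "go.opentelemetry.io/collector/pdata/pmetric"
)

func main() {
    // Illustrative stand-in for fieldPrometheusTypes; the real map lives in the
    // awsemfexporter and may contain different entries.
    fieldPrometheusTypes := map[pmetric.MetricType]string{
        pmetric.MetricTypeGauge: "gauge",
        pmetric.MetricTypeSum:   "counter",
    }

    fields := map[string]interface{}{}
    const fieldPrometheusMetricType = "prom_metric_type" // hypothetical field name for this sketch

    // With the fix, container insights grouped metrics carry MetricTypeEmpty,
    // so the lookup returns the map's zero value: an empty string.
    fields[fieldPrometheusMetricType] = fieldPrometheusTypes[pmetric.MetricTypeEmpty]
    fmt.Printf("%q\n", fields[fieldPrometheusMetricType]) // ""
}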
idx := 0

// Sort metadata list to prevent race condition
var metadataList []cWMetricMetadata
This was a flaky test. The GitHub runner runs the unit tests with -race, so I had to fix it by sorting the metadata list.
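A minimal sketch of that kind of fix, assuming the test collects metadata from a map (so iteration order is non-deterministic) into a slice and then sorts it before asserting; the struct and field names are placeholders, not the exporter's cWMetricMetadata:

package main

import (
    "fmt"
    "sort"
)

// metricMetadata is a placeholder for the exporter's cWMetricMetadata.
type metricMetadata struct {
    logGroup  string
    logStream string
}

func main() {
    // In the real test this list is collected from a map, so its order varies
    // between runs; here it is simply constructed out of order.
    metadataList := []metricMetadata{
        {logGroup: "/aws/containerinsights/demo/performance", logStream: "node-b"},
        {logGroup: "/aws/containerinsights/demo/performance", logStream: "node-a"},
    }

    // Sorting gives the test a stable order to assert against, removing the
    // run-to-run flakiness.
    sort.Slice(metadataList, func(i, j int) bool {
        if metadataList[i].logGroup != metadataList[j].logGroup {
            return metadataList[i].logGroup < metadataList[j].logGroup
        }
        return metadataList[i].logStream < metadataList[j].logStream
    })
    fmt.Println(metadataList)
}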
Co-authored-by: Jeffrey Chien <chienjef@amazon.com>
Description
Currently, if container insights metrics of different metric types exist (counter vs. gauge vs. histogram), the EMF exporter splits them into different EMF logs.
This is a problem for EBS NVMe metrics, where the node_diskio_ebs_volume_queue_length metric is a gauge while the others are counters. This causes extra cost.
Testing
Tested this on an existing cluster and verified that the EBS NVMe metrics are in one EMF log:
Before (screenshot): the metrics were split across two EMF logs (EMF Log 1 and EMF Log 2).
After (screenshot): all of the metrics appear in a single EMF log.
Sample Log: (screenshot)