Releases: aws/amazon-cloudwatch-agent

v1.300055.2

23 May 18:45

Bug Fixes

🐞 [ContainerInsights] Fix endpoint slice to capture service name instead of slice name 🐞 by @sky333999 in amazon-contributing/opentelemetry-collector-contrib#314
This change fixes a bug present from v1.300054.0 through v1.300055.1 in which the agent reported the Service on Container Insights metrics as the name of the endpoint slice rather than the name of the service itself.

v1.300054.1

15 May 14:57

Enhancements

💡 [Core] Go1.24 bump by @Paramadon in #1623 💡
The same artifacts as v1.300054.0 but built with Go 1.24.

v1.300055.1

23 May 18:37
b64795b

Enhancements

💡 [Core] Go1.24 bump by @Paramadon in #1623 💡
The same artifacts as v1.300055.0 but built with Go 1.24.

v1.300055.0

15 May 15:47
604e315

Bug Fixes

🐞 [Prometheus/AMP] Add deltatocumulativeprocessor for AMP destinations 🐞 by @dricross in #1624
When CWAgent scraped Prometheus metrics and published them to Amazon Managed Prometheus (AMP), it previously dropped any delta (non-cumulative) metrics, leading to a potential loss in observability. This fix inserts a delta-to-cumulative processor into the pipeline so that these metrics are no longer dropped.
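
As a rough illustration of what a delta-to-cumulative conversion does, the sketch below keeps a running total per series and emits that total for each incoming delta. It is not the agent's processor; the `DeltaToCumulative` class and the series key are hypothetical names chosen for this example.

```python
from collections import defaultdict


class DeltaToCumulative:
    """Hypothetical helper: turns per-interval deltas into running totals."""

    def __init__(self):
        # One running total per series (metric name plus label set).
        self._totals = defaultdict(float)

    def convert(self, series_key, delta):
        # Add the delta to the stored total and return the new cumulative value.
        self._totals[series_key] += delta
        return self._totals[series_key]


converter = DeltaToCumulative()
series = ("http_requests", (("path", "/"),))  # assumed example series
cumulative = [converter.convert(series, d) for d in [3, 0, 5]]
print(cumulative)  # [3.0, 3.0, 8.0]
```

A destination that only accepts cumulative samples can ingest the converted values, which is why dropping the deltas is no longer necessary.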

🐞 [Application Signals] Fix service ip to workload mapping in an edge case 🐞 by @pxaws in #1644
v1.300053.0 introduced an edge case in which a K8s Service backed by a single pod that is terminated can trigger a race condition, leaving the agent unable to map the pod's IP to the corresponding application/workload and thus breaking the Application Signals topology and experience. This fix resolves the race condition by accounting for delayed termination of pods when populating the PodIP <-> Service cache.

Enhancements

💡 [Explore Related] Support entity for OTLP custom metrics / Add service entity support for EMF exporter 💡 by @musa-asad in #1529
CWAgent now supports attaching Entity information (namespace, node name & application/workload) for OTLP custom metrics sent to it. This expands the coverage of Exploring Related telemetry to use cases utilizing the agent's standalone OTLP plugin for metrics (even when Application Signals is not being used).

💡 [Explore Related] Implement Kubernetes Metadata Extension 💡 by @musa-asad in #1605
In order to facilitate attaching entity for use cases outside of Application Signals, CWAgent needs the ability to map a Source IP back to the corresponding application/workload. This change introduces a new re-usable extension that extracts the logic to map a Source IP to a workload into a common place that can then be used for Application Signals, Explore Related & other use cases in the future.

💡 [Explore Related] Add platform type for Container Insights in EKS 💡 by @zhihonl in #1638
Previously, container logs published to Container Insights log groups could not be distinguished between EKS and native K8s, so Container Insights metrics were not properly correlated with Container Insights application logs. With this enhancement, the agent sets the Platform Type (EKS vs native K8s) on the log events, enabling correct correlation between metrics and logs for customers using Explore Related.

💡 [Core] Reduce noisy warning log in CI podresourcesstore / Reduce noisy logs with prometheus receiver 💡 by @okankoAMZ in amazon-contributing/opentelemetry-collector-contrib#300 and by @movence in amazon-contributing/opentelemetry-collector-contrib#297
These enhancements reduce the verbosity of noisy logs printed by the agent in scenarios when Enhanced Container Insights is enabled and also on non-GPU Kubernetes clusters. These logs were previously polluting the agent logs with noise, leading to confusion when triaging genuine issues.


Full Changelog: v1.300054.0...v1.300055.0


v1.300054.0

21 Apr 20:46
9a655fc

Bug Fixes

🐞 [OTLP] Drop metric if it has nil value in cloudwatch exporter 🐞 by @musa-asad in #1608
OTLP metrics that were sent to CWAgent without a value would cause it to panic and crash. This fix improves resiliency by filtering out such metrics before publishing to CloudWatch.
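
The filtering idea can be sketched as follows. This is only an illustration of dropping value-less data points before export, not the exporter's code; the `DataPoint` type and `drop_empty` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataPoint:
    name: str
    value: Optional[float]  # None models a data point sent without a value


def drop_empty(points: List[DataPoint]) -> List[DataPoint]:
    # Keep only data points that carry a value, so later stages never see None.
    return [p for p in points if p.value is not None]


incoming = [DataPoint("latency", 12.5), DataPoint("errors", None), DataPoint("hits", 0.0)]
kept = drop_empty(incoming)
print([p.name for p in kept])  # ['latency', 'hits']
```

Note that a value of `0.0` is kept: only a missing value is filtered, not a zero.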

🐞 [Container Insights] Handle bearer token rotation 🐞 by @rvasahu-amazon & @movence in amazon-contributing/opentelemetry-collector-contrib#285 & amazon-contributing/opentelemetry-collector-contrib#292
As part of Container Insights, CWAgent scrapes Control Plane metrics and uses a Bearer Token to authenticate itself. These tokens expire after 90 days, after which the agent previously would not pick up the new token unless restarted.
With this change, we pick up the renewed tokens without requiring any manual intervention.

New Features

🚀 [Logs] Configure Backpressure mode to release file descriptors 🚀 by @movence in #1577
Backpressure is a scenario where the agent is unable to publish logs at the same rate at which logs are being read (or unable to publish entirely due to CloudWatch being unavailable for example). A common symptom in such scenarios is that the agent does not release the file descriptors, thus consuming disk space even though a file is "deleted", causing the host to run out of disk eventually.
CWAgent now lets you configure how it behaves under backpressure while publishing logs to CloudWatch. Setting backpressure_mode to fd_release instructs the agent to release file descriptors when it faces backpressure.
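
A minimal sketch of how the setting might appear in the agent configuration, built here as a Python dictionary and rendered to JSON. Only `backpressure_mode` and `fd_release` come from the note above; placing the key on a `collect_list` entry is an assumption to verify against the agent configuration reference, and the file path, log group and log stream names are placeholders.

```python
import json

# Assumed layout: backpressure_mode set per collected file entry.
config = {
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/app/app.log",  # placeholder path
                        "log_group_name": "example-app",      # placeholder name
                        "log_stream_name": "{instance_id}",
                        "backpressure_mode": "fd_release",
                    }
                ]
            }
        }
    }
}

rendered = json.dumps(config, indent=2)
print(rendered)
```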

🚀 [Logs] Ability to trim timestamp in log messages 🚀 by @zhihonl & @nathalapooja in #1582
CWAgent can now trim timestamps before publishing log events to CloudWatch Logs. The timestamp in a log message is often redundant because it is already present as a field on the CloudWatch Log event. This new opt-in feature, trim_timestamp, can help reduce the volume of data ingested into CloudWatch Logs.
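
To make the saving concrete, the sketch below strips a leading timestamp from a log line and reports how many characters were removed. It does not reproduce the agent's matching logic; the regular expression assumes an ISO 8601 UTC prefix and the function name is hypothetical.

```python
import re

# Assumed timestamp layout for this example: 2025-04-21T20:46:00Z at line start.
TIMESTAMP_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\s*")


def strip_leading_timestamp(line: str) -> str:
    # Remove the timestamp prefix if present; leave other lines untouched.
    return TIMESTAMP_PREFIX.sub("", line, count=1)


original = "2025-04-21T20:46:00Z INFO started worker"
trimmed = strip_leading_timestamp(original)
saved = len(original) - len(trimmed)
print(trimmed)  # INFO started worker
print(saved)    # 21
```

Lines that do not start with a matching timestamp pass through unchanged, so the trimming never removes message content.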

🚀 [Container Insights] Add EFA device level metrics 🚀 by @Aakash-Dantre in #1594
As part of EFA metrics for Enhanced Container Insights, we now publish more granular metrics at an EFA device level, improving the observability of EFA devices by linking them to their corresponding ENI IDs in AWS. This makes it easier to correlate EFA metrics with network interface resources.

Enhancements

💡 [Container Insights] Use endpoint slices instead of endpoints 💡 by @nathalapooja in #1593
CWAgent now uses the more efficient Endpoint Slices API instead of the legacy Endpoints API when fetching service metadata for Container Insights. This improves the agent's performance and, in particular, reduces its load on the K8s API Server in large clusters running thousands of nodes.

💡 [Disk] Pick up new volumes every 5 mins 💡 by @lisguo in #1578
CWAgent previously discovered volumes only at startup and did not refresh them afterwards. As a result, metrics for volumes created after the agent started were missing the VolumeId dimension.
CWAgent now sets RefreshVolumesInterval to a default of 5 minutes if VolumeId is configured as an append_dimension, ensuring the VolumeId dimension is present as expected.

Full Changelog: v1.300053.0...v1.300054.0


v1.300053.0

12 Mar 20:55
3d83201

Bug Fixes

🐞 [Logs] Only add to cache if log stream is created 🐞 by @jefchien in #1566
This change fixes a bug in v1.300052.0 that occurs when agents installed in a fleet/cluster write to log streams in the same log group. A race condition could arise in which one agent creates the log group and the others falsely treat the presence of the log group as success and cache that result, which prevents them from trying to create the log stream and from publishing logs successfully.

🐞 [Core] Take entity data into account for payload size 🐞 by @varunch77 in #1483
This change fixes a bug present from v1.300049.1 through v1.300052.0 in which, in some edge cases, the agent could miscalculate the overall size of a request and fail to batch the metrics effectively.

Enhancements

💡 [Application Signals] Add endpoint slice watcher 💡 by @pxaws in #1528
The startup logic in Application Signals previously made expensive list API calls to the /pods endpoint on the K8s API Server, overloading the control plane in large clusters. With this optimization, the agent switches to a less expensive list/watch mechanism for /endpointslices, retrieving far less data, limited to what is actually required for the Application Signals experience.

💡 [Logs] Reduce PutRetentionPolicy calls by checking existing policy 💡 by @movence in #1545
Previously, the agent made a PutRetentionPolicy API call for every log group each time it restarted, causing throttling issues. With this enhancement, the agent first checks whether the retention policy is already set by calling DescribeLogGroups and only sets it if required. It now also does this asynchronously, with jitter and exponential backoff.
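
The two ideas in this change, checking before writing and retrying with jittered exponential backoff, can be sketched as below. This is a generic illustration, not the agent's code: the function names, the base delay and the cap are all assumptions, and no AWS API is called.

```python
import random
from typing import List, Optional


def needs_retention_update(existing_days: Optional[int], desired_days: int) -> bool:
    # Only a missing or different policy warrants a write call.
    return existing_days != desired_days


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    # "Full jitter": pick a random delay in [0, min(cap, base * 2**attempt)].
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


print(needs_retention_update(30, 30))    # False: policy already set, skip the call
print(needs_retention_update(None, 30))  # True: no policy yet, set it
delays = backoff_delays(5, base=1.0, cap=8.0, rng=random.Random(42))
print([round(d, 2) for d in delays])
```

Randomizing each delay spreads retries from many agents over time, which helps avoid the synchronized bursts that lead to throttling.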

💡 [Explore Related] Scrape auto scaling group attributes from resource metrics 💡 by @zhihonl in #1521
Previously, if IMDS tags were disabled, the agent would not set the ASG on the entity even though it can be retrieved through ec2tagger using the Describe API. We now retrieve the ASG from ec2tagger for the entity even when IMDS tags are not enabled.

💡 [Explore Related] Add entity processor to the otlp metrics and logs pipeline 💡 by @duhminick in #1504
This change extends support for setting entities to the OTLP flows in both the metrics and logs sections when running on EKS, providing an even richer Explore Related experience.

💡 [Core] Changing log level of cert watcher from error to debug 💡 by @mitali-salvi in #1531
This change reduces the verbosity of a non-impacting warning to reduce clutter.

💡 [Core] Fix EKS cluster detection 💡 by @musa-asad in #1532
The agent previously relied on the existence of the aws-auth config map to know whether it was running on EKS. With the launch of EKS access entries, aws-auth is considered deprecated, so the agent now uses a better mechanism: it inspects the issuer field in the service account to determine whether it is running on EKS.

Full Changelog: v1.300052.0...v1.300053.0

v1.300052.1

27 Feb 03:09
6c0624d

Bug Fixes

🐞 Only add to cache if log stream is created 🐞 by @jefchien in #1566
This change fixes a bug in v1.300052.0 that occurs when agents installed in a fleet/cluster write to log streams in the same log group. A race condition could arise in which one agent creates the log group and the others falsely treat the presence of the log group as success and cache that result, which prevents them from trying to create the log stream and from publishing logs successfully.

Full Changelog: v1.300052.0...v1.300052.1

v1.300052.0

05 Feb 16:57
af960d7

Enhancements

  • [Logs] Improve performance with multi-threaded log pusher implementation by @jefchien in #1499
  • [Related Telemetry] Add case insensitive metadata support for EC2 tags by @chadpatel in #1478
  • [Related Telemetry] Reduce number of entities in explore experience by retrieving instance tags simultaneously by @nathalapooja in #1474
  • [Related Telemetry] Add fallback to use application signals for entity population when IMDS tags are not enabled by @duhminick in #1486
  • [ApplicationSignals] Add support for .NET runtime metrics exporting by @bjrara in #1471

Full Changelog: v1.300051.0...v1.300052.0

v1.300051.0

16 Jan 22:42
2c8e72f

Bug Fixes:

  • Fix excessive IMDS-related error logging
  • Fixed a concurrency issue in entity attribute handling that was causing agent crashes due to simultaneous map writes.

Enhancements:

  • [AppSignal] Support RemoteEnvironment dimension for non-Kubernetes platforms.
  • [Logs] Support exporting large exponential histograms as EMF logs
  • [Logs] Reduce EMF exporter verbose logging
  • [Traces] Generate URL section in X-Ray segment when net.peer.name attribute is available

Full Changelog: v1.300050.0...v1.300051.0

v1.300050.0

07 Jan 05:54
4ec6d35

New Features

  • [Prometheus] Introduce OTel Prometheus Receiver for publishing to AMP
  • [Prometheus] Support Target Allocator with Prometheus Receivers
  • [ContainerInsights] Introduce Kueue metrics for Container Insights

Full Changelog: v1.300049.1...v1.300050.0