cpu_monitor should issue warning/error about CPU temperature for NVIDIA Tegra platforms #11334

@nishikawa-masaki

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I've agreed with the maintainers that I can plan this task.

Description

Currently, the "cpu_monitor" node reports the temperature of CPU cores (on Intel/AMD) or CPU thermal zones (on ARM, Tegra, Raspberry Pi), but it does not issue a warning or error even when the temperature is high.
This is because the "cpu_monitor" node raised false alarms on Intel/AMD platforms in the past, and the temperature warning/error was disabled to avoid the problem.
On Intel/AMD platforms, the more reliable "thermal throttling" diagnostics are used instead.

In the case of the NVIDIA Tegra family (e.g., Jetson AGX Orin), there is no direct method to detect thermal throttling.
Therefore, on these platforms "cpu_monitor" should issue a warning or error when the CPU temperature is high.
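
For reference, here is a minimal sketch of how a thermal zone temperature can be read on Linux. It assumes the zones are exposed through the standard sysfs files (/sys/class/thermal/thermal_zoneN/temp, in millidegrees Celsius). Whether "cpu_monitor" reads exactly these files is an assumption, and the function names parse_millidegrees and read_thermal_zone_celsius are hypothetical.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert the raw text of a thermal zone "temp" file
// (millidegrees Celsius, e.g. "45500\n") to degrees Celsius.
// Returns false if the text does not start with a number.
inline bool parse_millidegrees(const std::string & raw, float & out_celsius)
{
  try {
    const long milli = std::stol(raw);
    out_celsius = static_cast<float>(milli) / 1000.0f;
    return true;
  } catch (const std::exception &) {
    // std::stol throws std::invalid_argument or std::out_of_range on bad input.
    return false;
  }
}

// Hypothetical helper: read one thermal zone file, for example
// "/sys/class/thermal/thermal_zone0/temp" (assumed path layout).
// Returns false if the file cannot be opened or parsed.
inline bool read_thermal_zone_celsius(const std::string & temp_path, float & out_celsius)
{
  std::ifstream ifs(temp_path);
  if (!ifs) {
    return false;
  }
  std::string raw;
  std::getline(ifs, raw);
  return parse_millidegrees(raw, out_celsius);
}
```

Keeping parsing separate from file access makes the conversion testable on a machine that has no thermal zones.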

Purpose

Enable warnings and errors about high CPU temperature only on NVIDIA Tegra platforms.

Possible approaches

In CMakeLists.txt, define the macro "ENABLE_TEMPERATURE_DIAGNOSTICS" if the platform belongs to the NVIDIA Tegra family.
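
A rough sketch of how the macro could gate the temperature check on the C++ side. This is not the actual "cpu_monitor" code. It assumes CMake passes ENABLE_TEMPERATURE_DIAGNOSTICS as a compile definition (for example via target_compile_definitions). The names TempLevel, TempThresholds, and classify_temperature are hypothetical, and any threshold values used with it are placeholders rather than recommended settings.

```cpp
#include <cassert>

// True only when the build system defined the macro (expected on Tegra only).
#ifdef ENABLE_TEMPERATURE_DIAGNOSTICS
constexpr bool kTemperatureDiagnosticsEnabled = true;
#else
constexpr bool kTemperatureDiagnosticsEnabled = false;
#endif

// Hypothetical severity levels, mirroring the OK/WARN/ERROR levels of a diagnostic status.
enum class TempLevel { Ok, Warn, Error };

// Hypothetical thresholds in degrees Celsius; real values would come from node parameters.
struct TempThresholds
{
  float warn_c;
  float error_c;
};

// Classify a temperature reading. When diagnostics are disabled, the reading
// is still reported elsewhere but never escalates beyond Ok.
inline TempLevel classify_temperature(
  float temp_c, const TempThresholds & th, bool enabled = kTemperatureDiagnosticsEnabled)
{
  assert(th.warn_c <= th.error_c);  // the warning threshold must not exceed the error threshold
  if (!enabled) {
    return TempLevel::Ok;
  }
  if (temp_c >= th.error_c) {
    return TempLevel::Error;
  }
  if (temp_c >= th.warn_c) {
    return TempLevel::Warn;
  }
  return TempLevel::Ok;
}
```

The explicit enabled argument exists only so that both branches can be exercised in a single build. In the node itself the default, driven by the macro, would be used.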

Definition of done

  • When the CPU temperature rises above the warning/error threshold, a warning/error is reported on the "/diagnostics" topic.
  • With appropriate warning/error thresholds, there should be no false alarms.

Metadata

Labels

component:system: System design and integration. (auto-assigned)
