|
| 1 | +:_mod-docs-content-type: ASSEMBLY |
| 2 | +[id="common-monitoring-configuration-scenarios"] |
| 3 | += Common monitoring configuration scenarios |
| 4 | +include::_attributes/common-attributes.adoc[] |
| 5 | +:context: common-monitoring-configuration-scenarios |
| 6 | + |
| 7 | +toc::[] |
| 8 | + |
| 9 | +After {product-title} is installed, core platform monitoring components immediately begin collecting metrics, which you can query and view. |
| 10 | +The default in-cluster monitoring stack includes the core platform Prometheus instance that collects metrics from your cluster and the core Alertmanager instance that routes alerts, among other components. |
| 11 | +As a cluster administrator, you can further configure these monitoring components to suit the needs of different users in various scenarios, depending on who uses the monitoring stack and for what purposes. |
| 12 | + |
| 13 | +In addition to core platform monitoring, you can optionally xref:../monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[enable monitoring for user-defined projects] for user workload monitoring. |
| 14 | +Users can then monitor their own services and workloads without the need for an additional monitoring solution. |
| 15 | + |
| 16 | +[id="configuring-core-platform-monitoring-postinstallation-steps_{context}"] |
| 17 | +== Configuring core platform monitoring: Postinstallation steps |
| 18 | + |
| 19 | +After {product-title} is installed, cluster administrators typically configure core platform monitoring to suit their needs. |
| 20 | +These activities include setting up storage and configuring options for Prometheus, Alertmanager, and other monitoring components. |
| 21 | + |
| 22 | +[NOTE] |
| 23 | +==== |
| 24 | +By default, in a newly installed {product-title} system, users can query and view collected metrics. |
| 25 | +You must configure an alert receiver only if you want users to receive alert notifications. |
| 26 | +Any other configuration options listed here are optional. |
| 27 | +==== |
| 28 | + |
| 29 | +* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#creating-cluster-monitoring-configmap_configuring-the-monitoring-stack[Create the `cluster-monitoring-config` `ConfigMap` object] if it does not exist. |
| 30 | +* xref:../../observability/monitoring/managing-alerts.adoc#sending-notifications-to-external-systems_managing-alerts[Configure alert receivers] so that Alertmanager can send alerts to an external notification system such as email, Slack, or PagerDuty. |
| 31 | +* For shorter-term data retention, xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#configuring-persistent-storage_configuring-the-monitoring-stack[configure persistent storage] for Prometheus and Alertmanager to store metrics and alert data. |
| 32 | +Specify the metrics data retention parameters for Prometheus and Thanos Ruler. |
| 33 | ++ |
| 34 | +[NOTE] |
| 35 | +==== |
| 36 | +By default, in a newly installed {product-title} system, the monitoring `ClusterOperator` resource reports a `PrometheusDataPersistenceNotConfigured` status message to remind you that storage is not configured. |
| 37 | +==== |
| 38 | ++ |
| 39 | +* For longer-term data retention, xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#configuring_remote_write_storage_configuring-the-monitoring-stack[configure the remote write feature] to enable Prometheus to send ingested metrics to remote systems for storage. |
| 40 | ++ |
| 41 | +[IMPORTANT] |
| 42 | +==== |
| 43 | +Be sure to xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#adding-cluster-id-labels-to-metrics_configuring-the-monitoring-stack[add cluster ID labels to metrics] for use with your remote write storage configuration. |
| 44 | +==== |
| 45 | ++ |
| 46 | +* xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#granting-users-permission-to-monitor-user-defined-projects_enabling-monitoring-for-user-defined-projects[Assign monitoring cluster roles] to any non-administrator users who need to access certain monitoring features. |
| 47 | +* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#assigning-tolerations-to-monitoring-components_configuring-the-monitoring-stack[Assign tolerations] to monitoring stack components so that administrators can move them to tainted nodes. |
| 48 | +* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#setting-the-body-size-limit-for-metrics-scraping_configuring-the-monitoring-stack[Set the body size limit] for metrics collection to help avoid situations in which Prometheus consumes excessive amounts of memory when scraped targets return a response that contains a large amount of data. |
| 49 | +* xref:../../observability/monitoring/managing-alerts.adoc#managing-core-platform-alerting-rules_managing-alerts[Modify or create alerting rules] for your cluster. |
| 50 | +These rules specify the conditions that trigger alerts, such as high CPU usage, high memory usage, or high network latency. |
| 51 | +* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#managing-cpu-and-memory-resources-for-monitoring-components[Specify resource limits and requests for monitoring components] to ensure that the containers that run monitoring components have enough CPU and memory resources. |
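| | + |
| | +Several of the preceding options are set under the `config.yaml` key of the `cluster-monitoring-config` config map in the `openshift-monitoring` namespace. |
| | +The following fragment is a minimal sketch only: the retention period, storage size, and body size limit are hypothetical values chosen for illustration, and `<storage_class>` is a placeholder. See the linked procedures for the supported fields and recommended values. |
| | + |
| | +[source,yaml] |
| | +---- |
| | +apiVersion: v1 |
| | +kind: ConfigMap |
| | +metadata: |
| | +  name: cluster-monitoring-config |
| | +  namespace: openshift-monitoring |
| | +data: |
| | +  config.yaml: \| |
| | +    prometheusK8s: |
| | +      retention: 24h # assumed value: how long Prometheus keeps metrics data |
| | +      enforcedBodySizeLimit: 40MB # assumed value: limit on scraped response bodies |
| | +      volumeClaimTemplate: # persistent storage for Prometheus |
| | +        spec: |
| | +          storageClassName: <storage_class> # placeholder |
| | +          resources: |
| | +            requests: |
| | +              storage: 40Gi # assumed value |
| | +---- |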
| 52 | + |
| 53 | +With the monitoring stack configured to suit your needs, Prometheus collects metrics from the specified services and stores these metrics according to your settings. |
| 54 | +You can go to the *Observe* pages in the {product-title} web console to view and query collected metrics, manage alerts, identify performance bottlenecks, and scale resources as needed: |
| 55 | + |
| 56 | +* xref:../../observability/monitoring/reviewing-monitoring-dashboards.adoc#reviewing-monitoring-dashboards[View dashboards] to visualize collected metrics, troubleshoot alerts, and monitor other information about your cluster. |
| 57 | +* xref:../../observability/monitoring/managing-metrics.adoc#about-querying-metrics_managing-metrics[Query collected metrics] by creating PromQL queries or using predefined queries. |