Merge pull request #60288 from bburt-rh/RHDEVDOCS-5096-example-monitoring-post-install-setup-scenario

bburt-rh · web-flow · commit d8563bbaf07f · 2024-07-19T09:59:26.000-04:00
OBSDOCS-236 - example-core-monitoring-post-install-setup-scenario
diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml
@@ -2710,6 +2710,8 @@ Topics:
   Topics:
   - Name: Monitoring overview
     File: monitoring-overview
+  - Name: Common monitoring configuration scenarios
+    File: common-monitoring-configuration-scenarios
   - Name: Configuring the monitoring stack
     File: configuring-the-monitoring-stack
   - Name: Enabling monitoring for user-defined projects
diff --git a/observability/monitoring/common-monitoring-configuration-scenarios.adoc b/observability/monitoring/common-monitoring-configuration-scenarios.adoc
@@ -0,0 +1,57 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="common-monitoring-configuration-scenarios"]
+= Common monitoring configuration scenarios
+include::_attributes/common-attributes.adoc[]
+:context: common-monitoring-configuration-scenarios
+
+toc::[]
+
+After {product-title} is installed, core platform monitoring components immediately begin collecting metrics, which you can query and view.
+The default in-cluster monitoring stack includes the core platform Prometheus instance that collects metrics from your cluster and the core Alertmanager instance that routes alerts, among other components.
+As a cluster administrator, you can further configure these monitoring components to suit the needs of different users in various scenarios, depending on who uses the monitoring stack and for what purposes.
+
+In addition to core platform monitoring, you can optionally xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[enable monitoring for user-defined projects] for user workload monitoring.
+Users can then monitor their own services and workloads without the need for an additional monitoring solution.
+
+[id="configuring-core-platform-monitoring-postinstallation-steps_{context}"]
+== Configuring core platform monitoring: Postinstallation steps
+
+After {product-title} is installed, cluster administrators typically configure core platform monitoring to suit their needs.
+These activities include setting up storage and configuring options for Prometheus, Alertmanager, and other monitoring components.
+
+[NOTE]
+====
+By default, in a newly installed {product-title} system, users can query and view collected metrics.
+If you want users to receive alert notifications, you must configure an alert receiver.
+All other configuration options listed here are optional.
+====
+
+* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#creating-cluster-monitoring-configmap_configuring-the-monitoring-stack[Create the `cluster-monitoring-config` `ConfigMap` object] if it does not exist.
+* xref:../../observability/monitoring/managing-alerts.adoc#sending-notifications-to-external-systems_managing-alerts[Configure alert receivers] so that Alertmanager can send alerts to an external notification system such as email, Slack, or PagerDuty.
+* For shorter-term data retention, xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#configuring-persistent-storage_configuring-the-monitoring-stack[configure persistent storage] for Prometheus and Alertmanager to store metrics and alert data.
+Specify the metrics data retention parameters for Prometheus and Thanos Ruler.
++
+[NOTE]
+====
+By default, in a newly installed {product-title} system, the monitoring `ClusterOperator` resource reports a `PrometheusDataPersistenceNotConfigured` status message to remind you that storage is not configured.
+====
++
+* For longer-term data retention, xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#configuring_remote_write_storage_configuring-the-monitoring-stack[configure the remote write feature] to enable Prometheus to send ingested metrics to remote systems for storage.
++
+[IMPORTANT]
+====
+Be sure to xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#adding-cluster-id-labels-to-metrics_configuring-the-monitoring-stack[add cluster ID labels to metrics] for use with your remote write storage configuration.
+====
++
+* xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#granting-users-permission-to-monitor-user-defined-projects_enabling-monitoring-for-user-defined-projects[Assign monitoring cluster roles] to any non-administrator users who need to access certain monitoring features.
+* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#assigning-tolerations-to-monitoring-components_configuring-the-monitoring-stack[Assign tolerations] to monitoring stack components so that administrators can move them to tainted nodes.
+* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#setting-the-body-size-limit-for-metrics-scraping_configuring-the-monitoring-stack[Set the body size limit] for metrics collection to help prevent Prometheus from consuming excessive amounts of memory when scraped targets return a response that contains a large amount of data.
+* xref:../../observability/monitoring/managing-alerts.adoc#managing-core-platform-alerting-rules_managing-alerts[Modify or create alerting rules] for your cluster.
+These rules specify the conditions that trigger alerts, such as high CPU or memory usage or high network latency.
+* xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#managing-cpu-and-memory-resources-for-monitoring-components_configuring-the-monitoring-stack[Specify resource limits and requests for monitoring components] to ensure that the containers that run monitoring components have enough CPU and memory resources.
+
+With the monitoring stack configured to suit your needs, Prometheus collects metrics from the specified services and stores these metrics according to your settings.
+You can go to the *Observe* pages in the {product-title} web console to view and query collected metrics, manage alerts, identify performance bottlenecks, and scale resources as needed:
+
+* xref:../../observability/monitoring/reviewing-monitoring-dashboards.adoc#reviewing-monitoring-dashboards[View dashboards] to visualize collected metrics, troubleshoot alerts, and monitor other information about your cluster.
+* xref:../../observability/monitoring/managing-metrics.adoc#about-querying-metrics_managing-metrics[Query collected metrics] by creating PromQL queries or using predefined queries.
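
Illustrative sketch (not part of this diff): several of the postinstallation steps listed in the new assembly, namely persistent storage, retention, remote write with a cluster ID label, and the scrape body size limit, are all set in the `cluster-monitoring-config` `ConfigMap` object in the `openshift-monitoring` namespace. The fragment below is a minimal example of how those settings might be combined. The storage class placeholder, storage sizes, retention period, body size limit, and remote write URL are assumptions chosen for illustration and must be adapted to the target cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      # Shorter-term retention on persistent storage (values are examples).
      retention: 24h
      volumeClaimTemplate:
        spec:
          storageClassName: <storage_class>   # placeholder, cluster specific
          resources:
            requests:
              storage: 40Gi
      # Limit on the size of a scrape response body (example value).
      enforcedBodySizeLimit: 40MB
      # Longer-term retention: forward metrics to a remote system.
      remoteWrite:
      - url: "https://remote-write-endpoint.example.com"   # hypothetical URL
        writeRelabelConfigs:
        # Copy the temporary cluster ID label onto each forwarded metric.
        - sourceLabels:
          - __tmp_openshift_cluster_id__
          targetLabel: cluster_id
          action: replace
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: <storage_class>   # placeholder, cluster specific
          resources:
            requests:
              storage: 10Gi
```

Alert receivers, alerting rules, and monitoring cluster roles are configured through other resources, so they are not shown here; the linked procedures in the assembly cover them.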