Commit 9a02e92

Merge pull request #90113 from gabriel-rh/OBSDOCS-1751
OBSDOCS-1751 incident detection ui plugin
2 parents 630446a + b9244bd commit 9a02e92

13 files changed, +154 -4 lines changed

_topic_maps/_topic_map.yml

Lines changed: 2 additions & 0 deletions
@@ -2919,6 +2919,8 @@ Topics:
 Topics:
 - Name: Observability UI plugins overview
   File: observability-ui-plugins-overview
+- Name: Monitoring UI plugin
+  File: monitoring-ui-plugin
 - Name: Logging UI plugin
   File: logging-ui-plugin
 - Name: Distributed tracing UI plugin

Binary files (names not shown): 16.8 KB, 33.1 KB, 7.14 KB, 9.97 KB

images/coo-incidents-timeline.png

17.5 KB

modules/coo-distributed-tracing-ui-plugin-install.adoc

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 // * observability/cluster_observability_operator/ui_plugins/distributed-tracing-ui-plugin.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="coo-distributed-tracing-ui-plugin-install-_{context}"]
+[id="coo-distributed-tracing-ui-plugin-install_{context}"]
 = Installing the {coo-full} distributed tracing UI plugin
 
 

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+// Module included in the following assemblies:
+
+// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc
+
+:_mod-docs-content-type: CONCEPT
+[id="coo-incident-detection-overview_{context}"]
+= {coo-full} incident detection overview
+
+Clusters can generate significant volumes of monitoring data, making it hard for you to distinguish critical signals from noise.
+Single incidents can trigger a cascade of alerts, and this results in extended time to detect and resolve issues.
+
+The {coo-full} incident detection feature groups related alerts into *incidents*. These incidents are then visualized as timelines that are color-coded by severity.
+Alerts are mapped to specific components, grouped by severity, helping you to identify root causes by focusing on high impact components first.
+You can then drill down from the incident timelines to individual alerts to determine how to fix the underlying issue.
+
+{coo-full} incident detection transforms the alert storm into clear steps for faster understanding and resolution of the incidents that occur on your clusters.
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+// Module included in the following assemblies:
+
+// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="coo-incident-detection-using_{context}"]
+= Using {coo-full} incident detection
+
+.Prerequisites
+
+* You have access to the cluster as a user with the `cluster-admin` cluster role.
+* You have logged in to the {product-title} web console.
+* You have installed the {coo-full}.
+* You have installed the {coo-full} monitoring UI plugin with incident detection enabled.
+
+
+.Procedure
+
+. In the Administrator perspective of the web console, click on *Observe* -> *Incidents*.
+
+. The Incidents Timeline UI shows the grouping of alerts into *incidents*. The color coding of the lines in the graph corresponds to the severity of the incident. By default, a seven day timeline is presented.
++
+image::coo-incidents-timeline-weekly.png[Weekly incidents timeline]
++
+[NOTE]
+====
+It will take at least 10 minutes to process the correlations and to see the timeline, after you enable incident detection.
+
+The analysis and grouping into incidents is performed only for alerts that are firing after you have enabled this feature. Alerts that have been resolved before feature enablement are not included.
+====
+
+. Zoom in to a 1-day view by clicking on the drop-down to specify the duration.
++
+image::coo-incidents-timeline-daily.png[Daily incidents timeline]
+
+. By clicking on an incident, you can see the timeline of alerts that are part of that incident, in the Alerts Timeline UI.
++
+image::coo-incident-alerts-timeline.png[Incidents alerts timeline]
+
+. In the list of alerts that follows, alerts are mapped to specific components, which are grouped by severity.
++
+image::coo-incident-alerts-components.png[Incidents alerts components]
+
+. Click to expand a compute component in the list. The underlying alerts related to that component are displayed.
++
+image::coo-incident-alerts-components-expanded.png[Incidents expanded components]
+
+. Click the link for a firing alert, to see detailed information about that alert.
+
+
+
+[NOTE]
+====
+**Known issues**
+
+* Depending on the order of the timeline bars, the tooltip might overlap and hide the underlying bar. You can still click the bar and select the incident or alert.
+
+* The Silence Alert button in the **Incidents** -> **Component** section does not pre-populate the fields and is not usable. As a workaround, you can use the same menu and the Silence Alert button in the **Alerting** section instead.
+====
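
For context on the last prerequisite in the procedure (the monitoring UI plugin with incident detection enabled), the following is a minimal sketch of a `UIPlugin` custom resource that might satisfy it. It is not part of this commit. The API version, the resource name `monitoring`, and the `spec.monitoring.incidents.enabled` field are assumptions based on the Cluster Observability Operator UI plugin API as commonly documented; verify them against the Operator version installed on your cluster before applying.

```yaml
# Sketch only: field names are assumptions, not taken from this commit.
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  # Assumed: the monitoring plugin expects this exact resource name.
  name: monitoring
spec:
  type: Monitoring
  monitoring:
    incidents:
      # Assumed toggle that turns on incident detection.
      enabled: true
```

If the manifest is saved as `monitoring-ui-plugin.yaml` (a hypothetical file name), it could be applied with `oc apply -f monitoring-ui-plugin.yaml`. As the note in the procedure states, expect at least 10 minutes before the timeline appears.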
