
Commit 1d6c532

Merge pull request #92920 from openshift-cherrypick-robot/cherry-pick-91565-to-enterprise-4.19
[enterprise-4.19] OBSDOCS-1795: Improve the contents of 'Configuring alert routing for default platform alerts'
2 parents 6f27aac + 562b29f commit 1d6c532

File tree

1 file changed: +94 -56 lines changed


modules/monitoring-configuring-alert-routing-default-platform-alerts.adoc

Lines changed: 94 additions & 56 deletions
@@ -6,7 +6,7 @@
 [id="configuring-alert-routing-default-platform-alerts_{context}"]
 = Configuring alert routing for default platform alerts
 
-You can configure Alertmanager to send notifications. Customize where and how Alertmanager sends notifications about default platform alerts by editing the default configuration in the `alertmanager-main` secret in the `openshift-monitoring` namespace.
+You can configure Alertmanager to send notifications so that you receive important alerts coming from your cluster. Customize where and how Alertmanager sends notifications about default platform alerts by editing the default configuration in the `alertmanager-main` secret in the `openshift-monitoring` namespace.
 
 [NOTE]
 ====
@@ -16,28 +16,24 @@ All features of a supported version of upstream Alertmanager are also supported
 .Prerequisites
 
 * You have access to the cluster as a user with the `cluster-admin` cluster role.
+* You have installed the {oc-first}.
 
 .Procedure
 
-. Open the Alertmanager YAML configuration file:
-
-** To open the Alertmanager configuration from the CLI:
-
-.. Print the currently active Alertmanager configuration from the `alertmanager-main` secret into `alertmanager.yaml` file:
+. Extract the currently active Alertmanager configuration from the `alertmanager-main` secret and save it as a local `alertmanager.yaml` file:
 +
 [source,terminal]
 ----
 $ oc -n openshift-monitoring get secret alertmanager-main --template='{{ index .data "alertmanager.yaml" }}' | base64 --decode > alertmanager.yaml
 ----
 
-.. Open the `alertmanager.yaml` file.
+. Open the `alertmanager.yaml` file.
 
-** To open the Alertmanager configuration from the {product-title} web console:
+. Edit the Alertmanager configuration:
 
-.. Go to the *Administration* -> *Cluster Settings* -> *Configuration* -> *Alertmanager* -> *YAML* page of the web console.
-
-. Edit the Alertmanager configuration by updating parameters in the YAML:
+.. Optional: Change the default Alertmanager configuration:
 +
+.Example of the default Alertmanager secret YAML
 [source,yaml]
 ----
 global:
@@ -54,54 +50,88 @@ route:
     - "alertname=Watchdog"
     repeat_interval: 2m
     receiver: watchdog
-  - matchers:
-    - "service=<your_service>" # <5>
-    routes:
-    - matchers:
-      - <your_matching_rules> # <6>
-      receiver: <receiver> # <7>
 receivers:
 - name: default
 - name: watchdog
-- name: <receiver>
-  <receiver_configuration> # <8>
 ----
 <1> If you configured an HTTP cluster-wide proxy, set the `proxy_from_environment` parameter to `true` to enable proxying for all alert receivers.
 <2> Specify how long Alertmanager waits while collecting initial alerts for a group of alerts before sending a notification.
 <3> Specify how much time must elapse before Alertmanager sends a notification about new alerts added to a group of alerts for which an initial notification was already sent.
 <4> Specify the minimum amount of time that must pass before an alert notification is repeated.
 If you want a notification to repeat at each group interval, set the `repeat_interval` value to less than the `group_interval` value.
 The repeated notification can still be delayed, for example, when certain Alertmanager pods are restarted or rescheduled.
-<5> Specify the name of the service that fires the alerts.
-<6> Specify labels to match your alerts.
-<7> Specify the name of the receiver to use for the alerts.
-<8> Specify the receiver configuration.
+
+.. Add your alert receiver configuration:
 +
-[IMPORTANT]
-====
-* Use the `matchers` key name to indicate the matchers that an alert has to fulfill to match the node.
-Do not use the `match` or `match_re` key names, which are both deprecated and planned for removal in a future release.
+[source,yaml]
+----
+# ...
+receivers:
+- name: default
+- name: watchdog
+- name: <receiver> # <1>
+  <receiver_configuration> # <2>
+# ...
+----
+<1> The name of the receiver.
+<2> The receiver configuration. The supported receivers are PagerDuty, webhook, email, Slack, and Microsoft Teams.
++
+.Example of configuring PagerDuty as an alert receiver
+[source,yaml]
+----
+# ...
+receivers:
+- name: default
+- name: watchdog
+- name: team-frontend-page
+  pagerduty_configs:
+  - routing_key: ABCD01234EFGHIJ56789
+    http_config: # <1>
+      proxy_from_environment: true
+      authorization:
+        credentials: xxxxxxxxxx
+# ...
+----
+<1> Optional: Add custom HTTP configuration for a specific receiver. That receiver does not inherit the global HTTP configuration settings.
 
-* If you define inhibition rules, use the following key names:
+.. Add the routing configuration:
 +
---
-** `target_matchers`: to indicate the target matchers
-** `source_matchers`: to indicate the source matchers
---
+[source,yaml]
+----
+# ...
+route:
+  group_wait: 30s
+  group_interval: 5m
+  repeat_interval: 12h
+  receiver: default
+  routes:
+  - matchers:
+    - "alertname=Watchdog"
+    repeat_interval: 2m
+    receiver: watchdog
+  - matchers: # <1>
+    - "<your_matching_rules>" # <2>
+    receiver: <receiver> # <3>
+# ...
+----
+<1> Use the `matchers` key name to specify the matching rules that an alert has to fulfill to match the node.
+If you define inhibition rules, use the `target_matchers` key name for target matchers and the `source_matchers` key name for source matchers.
+<2> Specify labels to match your alerts.
+<3> Specify the name of the receiver to use for the alerts.
 +
-Do not use the `target_match`, `target_match_re`, `source_match`, or `source_match_re` key names, which are deprecated and planned for removal in a future release.
+[WARNING]
+====
+Do not use the `match`, `match_re`, `target_match`, `target_match_re`, `source_match`, and `source_match_re` key names, which are deprecated and planned for removal in a future release.
 ====
 +
-.Example of Alertmanager configuration with PagerDuty as an alert receiver
+--
+.Example of alert routing
 [source,yaml]
 ----
-global:
-  resolve_timeout: 5m
-  http_config:
-    proxy_from_environment: true
+# ...
 route:
-  group_wait: 30s
-  group_interval: 5m
+  group_wait: 30s
+  group_interval: 5m
   repeat_interval: 12h
   receiver: default
   routes:
@@ -111,31 +141,39 @@ route:
     receiver: watchdog
   - matchers: # <1>
     - "service=example-app"
-    routes:
+    routes: # <2>
     - matchers:
       - "severity=critical"
       receiver: team-frontend-page
-receivers:
-- name: default
-- name: watchdog
-- name: team-frontend-page
-  pagerduty_configs:
-  - service_key: "<your_key>"
-    http_config: # <2>
-      proxy_from_environment: true
-      authorization:
-        credentials: xxxxxxxxxx
+# ...
 ----
-<1> Alerts of `critical` severity that are fired by the `example-app` service are sent through the `team-frontend-page` receiver. Typically, these types of alerts would be paged to an individual or a critical response team.
-<2> Custom HTTP configuration for a specific receiver. If you configure the custom HTTP configuration for a specific alert receiver, that receiver does not inherit the global HTTP config settings.
+<1> This example matches alerts from the `example-app` service.
+<2> You can create routes within other routes for more complex alert routing.
+--
++
+The previous example routes alerts of `critical` severity that are fired by the `example-app` service to the `team-frontend-page` receiver. Typically, these types of alerts are paged to an individual or a critical response team.
 
 . Apply the new configuration in the file:
-
-** To apply the changes from the CLI, run the following command:
 +
 [source,terminal]
 ----
 $ oc -n openshift-monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml --dry-run=client -o=yaml | oc -n openshift-monitoring replace secret --filename=-
 ----
 
-** To apply the changes from the {product-title} web console, click *Save*.
+. Verify your routing configuration by visualizing the routing tree:
++
+[source,terminal]
+----
+$ oc exec alertmanager-main-0 -n openshift-monitoring -- amtool config routes show --alertmanager.url http://localhost:9093
+----
++
+.Example output
+[source,terminal]
+----
+Routing tree:
+.
+└── default-route  receiver: default
+    ├── {alertname="Watchdog"}  receiver: watchdog
+    └── {service="example-app"}  receiver: default
+        └── {severity="critical"}  receiver: team-frontend-page
+----
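
The new callout above lists PagerDuty, webhook, email, Slack, and Microsoft Teams as the supported receiver types, but the commit only shows a PagerDuty example. As an illustration that is not part of this commit, a minimal Slack receiver sketch for the same `alertmanager.yaml` could look like the following; the receiver name, channel, and webhook URL are placeholder values.

[source,yaml]
----
# ...
receivers:
- name: default
- name: watchdog
- name: team-frontend-slack  # hypothetical receiver name
  slack_configs:
  - channel: "#alerts-frontend"  # placeholder Slack channel
    api_url: https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXX/XXXXXXXXXXXX  # placeholder incoming webhook URL
    send_resolved: true  # also send a notification when the alert resolves
# ...
----

A route that should use this receiver would then reference `team-frontend-slack` in its `receiver` field, just as the routing example references `team-frontend-page`.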

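In addition to the `amtool config routes show` verification step that this commit adds, amtool can report which receiver a given label set would be routed to. The following is a sketch that reuses the example labels from the routing example above, assuming your amtool version supports loading the configuration through `--alertmanager.url` for the `routes test` subcommand, as it does for `routes show`:

[source,terminal]
----
$ oc exec alertmanager-main-0 -n openshift-monitoring -- amtool config routes test --alertmanager.url http://localhost:9093 service=example-app severity=critical
----

With the routing configuration shown above, this command should print `team-frontend-page`, confirming that critical alerts from the `example-app` service reach the paging receiver.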