Commit 813ab96

Merge pull request #79038 from laubai/osdocs-11160-maxsurge-maxunavailable
OSDOCS#11160: New max-unavailable and max-surge parameters
2 parents 48e8a97 + 487c1eb commit 813ab96

8 files changed

+379
-28
lines changed

modules/rosa-create-objects.adoc

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -744,6 +744,21 @@ a|--kubelet-configs <kubeletconfig_name>
744744
|--min-replicas
745745
|Specifies the minimum number of compute nodes when enabling autoscaling.
746746

747+
//OSDOCS-11160: HCP only, but need to wait on separate HCP publishing
748+
//ifdef::openshift-rosa-hcp[]
749+
|--max-surge
750+
a| For {hcp-title-first} clusters, the `max-surge` parameter defines the number of new nodes that can be provisioned in excess of the desired number of replicas for the machine pool, as configured using the `--replicas` parameter, or as determined by the autoscaler when autoscaling is enabled. This can be an absolute number (for example, `2`) or a percentage of the machine pool size (for example, `20%`), but must use the same unit as the `max-unavailable` parameter.
751+
752+
The default value is `1`, meaning that the maximum number of nodes in the machine pool during an upgrade is 1 plus the desired number of replicas for the machine pool. In this situation, one excess node can be provisioned before existing nodes need to be made unavailable. The number of nodes that can be provisioned simultaneously during an upgrade is `max-surge` plus `max-unavailable`.
753+
754+
|--max-unavailable
755+
a|For {hcp-title-first} clusters, the `max-unavailable` parameter defines the number of nodes that can be made unavailable in a machine pool during an upgrade, before new nodes are provisioned. This can be an absolute number (for example, `2`) or a percentage of the current replica count in the machine pool (for example, `20%`), but must use the same unit as the `max-surge` parameter.
756+
757+
The default value is `0`, meaning that no outdated nodes are removed before new nodes are provisioned. The valid range for this value is from `0` to the current machine pool size, or from `0%` to `100%`. The total number of nodes that can be upgraded simultaneously during an upgrade is `max-surge` plus `max-unavailable`.
758+
759+
//endif::openshift-rosa-hcp[]
760+
// end OSDOCS-11160: HCP only, when separate docs are available
761+
747762
|--name
748763
|Required: The name (string) for the machine pool.
749764

@@ -797,6 +812,16 @@ Add a machine pool that is named `mp-1` with 3 replicas of `m5.xlarge` to a clus
797812
$ rosa create machinepool --cluster=mycluster --replicas=3 --instance-type=m5.xlarge --name=mp-1
798813
----
799814

815+
Add a machine pool (`mp-1`) to a {hcp-title-first} cluster, configuring 6 replicas and the following upgrade behavior:
816+
817+
* Allow up to 2 excess nodes to be provisioned during an upgrade.
818+
* Ensure that no more than 3 nodes are unavailable during an upgrade.
819+
820+
[source,terminal]
821+
----
822+
$ rosa create machinepool --cluster=mycluster --replicas=6 --name=mp-1 --max-surge=2 --max-unavailable=3
823+
----
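The node counts implied by the example above can be checked with simple arithmetic. The following is a minimal sketch in POSIX shell, not part of the ROSA CLI: the function name `upgrade_bounds` is hypothetical, and it assumes that both parameters are given as absolute numbers rather than percentages.

```shell
# Hypothetical helper (not part of the rosa CLI): prints the node-count
# bounds implied by --replicas, --max-surge, and --max-unavailable.
# Assumes absolute numbers, not percentages.
upgrade_bounds() {
  replicas=$1
  max_surge=$2
  max_unavailable=$3
  # Most nodes that can exist at once: desired replicas plus surge.
  echo "peak_nodes=$((replicas + max_surge))"
  # Fewest nodes still available: desired replicas minus unavailable.
  echo "min_available=$((replicas - max_unavailable))"
  # Nodes that can be upgraded at the same time: surge plus unavailable.
  echo "parallel_upgrades=$((max_surge + max_unavailable))"
}

# Values from the example: 6 replicas, max-surge 2, max-unavailable 3.
upgrade_bounds 6 2 3
```

For the `mp-1` example, this reports a peak of 8 nodes, at least 3 nodes still available, and up to 5 nodes upgraded at the same time.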
824+
800825
Add a machine pool with labels to a cluster.
801826

802827
[source,terminal]

modules/rosa-edit-objects.adoc

Lines changed: 31 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -234,7 +234,7 @@ Allows edits to the machine pool in a cluster.
234234
.Syntax
235235
[source,terminal]
236236
----
237-
$ rosa edit machinepool --cluster=<cluster_name> | <cluster_id> <machinepool_ID> [arguments]
237+
$ rosa edit machinepool --cluster=<cluster_name_or_id> <machinepool_ID> [arguments]
238238
----
239239

240240
.Arguments
@@ -263,6 +263,20 @@ a|--kubelet-configs <kubeletconfig_name>
263263
|--min-replicas
264264
|Specifies the minimum number of compute nodes when enabling autoscaling.
265265

266+
//OSDOCS-11160: HCP only, but need to wait on separate HCP publishing
267+
//ifdef::openshift-rosa-hcp[]
268+
|--max-surge
269+
a| For {hcp-title-first} clusters, the `max-surge` parameter defines the number of new nodes that can be provisioned in excess of the desired number of replicas for the machine pool, as configured using the `--replicas` parameter, or as determined by the autoscaler when autoscaling is enabled. This can be an absolute number (for example, `2`) or a percentage of the machine pool size (for example, `20%`), but must use the same unit as the `max-unavailable` parameter.
270+
271+
The default value is `1`, meaning that the maximum number of nodes in the machine pool during an upgrade is 1 plus the desired number of replicas for the machine pool. In this situation, one excess node can be provisioned before existing nodes need to be made unavailable. The number of nodes that can be provisioned simultaneously during an upgrade is `max-surge` plus `max-unavailable`.
272+
273+
|--max-unavailable
274+
a|For {hcp-title-first} clusters, the `max-unavailable` parameter defines the number of nodes that can be made unavailable in a machine pool during an upgrade, before new nodes are provisioned. This can be an absolute number (for example, `2`) or a percentage of the current replica count in the machine pool (for example, `20%`), but must use the same unit as the `max-surge` parameter.
275+
276+
The default value is `0`, meaning that no outdated nodes are removed before new nodes are provisioned. The valid range for this value is from `0` to the current machine pool size, or from `0%` to `100%`. The total number of nodes that can be upgraded simultaneously during an upgrade is `max-surge` plus `max-unavailable`.
277+
//endif::openshift-rosa-hcp[]
278+
// end OSDOCS-11160: HCP only, when separate docs are available
279+
266280
|--node-drain-grace-period
267281
|Specifies the node drain grace period when upgrading or replacing the machine pool. (This is for {hcp-title} clusters only.)
268282

@@ -297,33 +311,43 @@ Set 4 replicas on a machine pool named `mp1` on a cluster named `mycluster`.
297311

298312
[source,terminal]
299313
----
300-
$ rosa edit machinepool --cluster=mycluster --replicas=4 --name=mp1
314+
$ rosa edit machinepool --cluster=mycluster --replicas=4 mp1
301315
----
302316

303317
Enable autoscaling on a machine pool named `mp1` on a cluster named `mycluster`.
304318

305319
[source,terminal]
306320
----
307-
$ rosa edit machinepool --cluster=mycluster --enable-autoscaling --min-replicas=3 --max-replicas=5 --name=mp1
321+
$ rosa edit machinepool --cluster=mycluster --enable-autoscaling --min-replicas=3 --max-replicas=5 mp1
308322
----
309323

310324
Disable autoscaling on a machine pool named `mp1` on a cluster named `mycluster`.
311325

312326
[source,terminal]
313327
----
314-
$ rosa edit machinepool --cluster=mycluster --enable-autoscaling=false --replicas=3 --name=mp1
328+
$ rosa edit machinepool --cluster=mycluster --enable-autoscaling=false --replicas=3 mp1
315329
----
316330

317331
Modify the autoscaling range on a machine pool named `mp1` on a cluster named `mycluster`.
318332

319333
[source,terminal]
320334
----
321-
$ rosa edit machinepool --max-replicas=9 --cluster=mycluster --name=mp1
335+
$ rosa edit machinepool --max-replicas=9 --cluster=mycluster mp1
336+
----
337+
338+
On {hcp-title-first} clusters, edit the `mp1` machine pool to add the following behavior during upgrades:
339+
340+
* Allow up to 2 excess nodes to be provisioned during an upgrade.
341+
* Ensure that no more than 3 nodes are unavailable during an upgrade.
342+
343+
[source,terminal]
344+
----
345+
$ rosa edit machinepool --cluster=mycluster mp1 --max-surge=2 --max-unavailable=3
322346
----
323347

324-
Associate a `KubeletConfig` object with an existing machine pool on a {hcp-title-first} cluster.
348+
Associate a `KubeletConfig` object with an existing `high-pid-pool` machine pool on a {hcp-title} cluster.
325349

326350
[source,terminal]
327351
----
328-
$ rosa edit machinepool -c mycluster --kubelet-configs=set-high-pids --name high-pid-pool
352+
$ rosa edit machinepool -c mycluster --kubelet-configs=set-high-pids high-pid-pool
329353
----

modules/rosa-hcp-upgrade-options.adoc

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
:_mod-docs-content-type: CONCEPT
2+
[id="rosa-upgrade-options_{context}"]
3+
= Upgrade options for {hcp-title} clusters
4+
5+
In OpenShift, upgrading means provisioning a new component with updated software and using it to replace an existing component that has outdated software.
6+
7+
You can control the impact of upgrades on your workload by controlling which parts of the cluster are upgraded, for example:
8+
9+
Upgrade only the hosted control plane:: This does not impact your worker nodes.
10+
11+
Upgrade nodes in a single machine pool:: This initiates a rolling replacement of nodes in the specified machine pool, and temporarily impacts the worker nodes on that machine pool. This does not impact nodes on other machine pools in the cluster.
12+
13+
Upgrade nodes in multiple machine pools simultaneously:: This initiates a rolling replacement of nodes in the specified machine pools, and temporarily impacts the worker nodes on those machine pools. You can run this type of upgrade as a single command, or as multiple commands.
14+
15+
Upgrade the whole cluster in sequence:: This initiates an upgrade of the hosted control plane, followed by a rolling replacement of nodes in the specified machine pools. These upgrades occur in sequence because the hosted control plane and the machine pools cannot be upgraded at the same time. When an upgrade of the hosted control plane is in progress, nodes in the machine pools cannot be upgraded. When an upgrade is in progress for nodes in the machine pools, the hosted control plane cannot be upgraded.
16+
+
17+
[IMPORTANT]
18+
====
19+
To maintain compatibility between nodes in the cluster, nodes in machine pools cannot use a newer version than the hosted control plane. This means that the hosted control plane should always be upgraded to a given version before any machine pools are upgraded to the same version.
20+
====
21+
22+
The time required to upgrade the hosted control plane varies depending on your workload configuration.
23+
24+
The time required to upgrade a machine pool varies according to the number of worker nodes in the machine pool (`--replicas` or `--max-replicas`).
25+
26+
You can further control the time required for an upgrade, and the impact of an upgrade on your workload, by editing the `--max-surge` and `--max-unavailable` values for each machine pool. These options control the number of nodes that can be upgraded simultaneously, and whether an upgrade provisions excess nodes, makes some existing nodes unavailable, or both. For example:
27+
28+
* **To prioritize high workload availability**, you can provision excess nodes instead of making existing nodes unavailable by setting a higher value for `--max-surge` and setting `--max-unavailable` to `0`.
29+
* **To prioritize lower infrastructure costs**, you can make some existing nodes unavailable and avoid provisioning excess nodes by setting a higher value for `--max-unavailable` and setting `--max-surge` to `0`.
30+
* **To prioritize upgrade speed by upgrading multiple nodes simultaneously**, you can provision excess nodes and allow some existing nodes to be made unavailable by configuring moderate values for both `--max-surge` and `--max-unavailable`.
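The three strategies above could map to `rosa edit machinepool` commands along the following lines. This is a sketch only: the helper name `print_upgrade_tuning`, the cluster and machine pool names, and the specific numeric values are illustrative assumptions, and the function prints each command instead of running it.

```shell
# Hypothetical helper: prints (does not run) a `rosa edit machinepool`
# command for one of the three strategies described above.
# The cluster name, pool name, and numeric values are illustrative.
print_upgrade_tuning() {
  strategy=$1
  cluster=$2
  pool=$3
  case "$strategy" in
    availability) surge=2; unavailable=0 ;;  # provision excess nodes only
    cost)         surge=0; unavailable=2 ;;  # no excess nodes
    speed)        surge=1; unavailable=1 ;;  # moderate values for both
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
  echo "rosa edit machinepool --cluster=$cluster $pool --max-surge=$surge --max-unavailable=$unavailable"
}

print_upgrade_tuning availability mycluster mp1
print_upgrade_tuning cost mycluster mp1
print_upgrade_tuning speed mycluster mp1
```

Printing the command first makes it easy to review the values before applying them to a real machine pool.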
31+
32+
For more information about these parameters and their usage, see the _ROSA CLI reference_ for `rosa edit machinepool`.
33+
34+
//Additional resources included in assembly.
Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * upgrading/rosa-hcp-upgrading.adoc
4+
5+
// NOTE: This module is included several times in the same upgrade assembly.
6+
7+
:_mod-docs-content-type: PROCEDURE
8+
[id="rosa-hcp-upgrading-cli-control-plane_{context}"]
9+
// HCP-ONLY: Conditions for upgrading the hosted control plane WITHOUT upgrading any machine pools
10+
ifeval::["{context}" != "rosa-hcp-upgrading-whole-cluster"]
11+
= Upgrading the hosted control plane with the ROSA CLI
12+
13+
You can manually upgrade the hosted control plane of a {hcp-title} cluster by using the ROSA CLI. This method schedules the control plane for an upgrade if a more recent version is available, either immediately, or at a specified future time.
14+
15+
[NOTE]
16+
====
17+
Your control plane only supports machine pools within two minor Y-stream versions. For example, a {hcp-title} cluster with a control plane using version 4.15.z supports machine pools with version 4.13.z and 4.14.z, but the control plane does not support machine pools using version 4.12.z.
18+
====
19+
20+
endif::[]
21+
//END HCP-ONLY conditions
22+
23+
// WHOLE CLUSTER: Condition for upgrading hosted control plane as part of upgrading the whole cluster in sequence
24+
ifeval::["{context}" == "rosa-hcp-upgrading-whole-cluster"]
25+
= Upgrading the hosted control plane
26+
27+
When you need to upgrade the whole cluster, upgrade the hosted control plane first.
28+
endif::[]
29+
30+
31+
.Prerequisites
32+
* You have installed and configured the latest version of the ROSA CLI.
33+
* No machine pool upgrades are in progress or scheduled to take place at the same time as the hosted control plane upgrade.
34+
35+
//END WHOLE CLUSTER conditions
36+
37+
.Procedure
38+
39+
. Verify the current version of your cluster by running the following command:
40+
+
41+
[source,terminal]
42+
----
43+
$ rosa describe cluster --cluster=<cluster_name_or_id> <1>
44+
----
45+
<1> Replace `<cluster_name_or_id>` with the cluster name or the cluster ID.
46+
47+
. List the versions that you can upgrade your control plane to by running the following command:
48+
+
49+
[source,terminal]
50+
----
51+
$ rosa list upgrade --cluster=<cluster_name_or_id>
52+
----
53+
+
54+
The command returns a list of available updates, including the recommended version.
55+
+
56+
.Example output
57+
+
58+
[source,terminal]
59+
----
60+
VERSION NOTES
61+
4.14.8 recommended
62+
4.14.7
63+
4.14.6
64+
----
65+
66+
. Upgrade the cluster's hosted control plane by running the following command:
67+
+
68+
[source,terminal]
69+
----
70+
$ rosa upgrade cluster -c <cluster_name_or_id> --control-plane [--schedule-date=<yyyy-mm-dd> --schedule-time=<HH:mm>] --version <version_number>
71+
----
72+
73+
** To schedule an immediate upgrade to the specified version, run the following command:
74+
+
75+
[source,terminal]
76+
----
77+
$ rosa upgrade cluster -c <cluster_name_or_id> --control-plane --version <version_number>
78+
----
79+
+
80+
Your hosted control plane is scheduled for an immediate upgrade.
81+
82+
** To schedule an upgrade to the specified version at a future date, run the following command:
83+
+
84+
[source,terminal]
85+
----
86+
$ rosa upgrade cluster -c <cluster_name_or_id> --control-plane --schedule-date=<yyyy-mm-dd> --schedule-time=<HH:mm> --version=<version_number>
87+
----
88+
+
89+
Your hosted control plane is scheduled for an upgrade at the specified time in Coordinated Universal Time (UTC).
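Because the schedule flags expect a `yyyy-mm-dd` date and an `HH:mm` time in UTC, it can help to check their shape before submitting the command. The following sketch uses a hypothetical helper, `print_scheduled_upgrade`, with an illustrative cluster name, version, and date. It only checks the format of the values (not whether the date is a real calendar date) and prints the command instead of running it.

```shell
# Hypothetical helper: prints (does not run) a scheduled control plane
# upgrade command after checking the shape of the date and time values.
print_scheduled_upgrade() {
  cluster=$1
  version=$2
  day=$3
  time=$4
  case "$day" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) echo "date must be yyyy-mm-dd" >&2; return 1 ;;
  esac
  case "$time" in
    [0-9][0-9]:[0-9][0-9]) ;;
    *) echo "time must be HH:mm (UTC)" >&2; return 1 ;;
  esac
  echo "rosa upgrade cluster -c $cluster --control-plane --schedule-date=$day --schedule-time=$time --version=$version"
}

# Illustrative values only.
print_scheduled_upgrade mycluster 4.14.8 2030-01-15 02:30
```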
90+
91+
ifeval::["{context}" != "rosa-hcp-upgrading-whole-cluster"]
92+
.Troubleshooting
93+
* Sometimes a scheduled upgrade does not initiate. See link:https://access.redhat.com/solutions/6648291[Upgrade maintenance canceled] for more information.
94+
endif::[]
Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * upgrading/rosa-hcp-upgrading.adoc
4+
5+
// NOTE: This module is included several times in the same upgrade assembly.
6+
7+
:_mod-docs-content-type: PROCEDURE
8+
[id="rosa-hcp-upgrading-cli-machinepool_{context}"]
9+
// POOL-ONLY: Conditions for upgrading machine pools WITHOUT upgrading hosted control planes
10+
ifeval::["{context}" != "rosa-hcp-upgrading-whole-cluster"]
11+
= Upgrading machine pools with the ROSA CLI
12+
13+
You can manually upgrade one or more machine pools in a {hcp-title} cluster by using the ROSA CLI. This method schedules the specified machine pools for an upgrade if a more recent version is available, either immediately, or at a specified future time.
14+
15+
[NOTE]
16+
====
17+
Your control plane only supports machine pools within two minor Y-stream versions. For example, a {hcp-title} cluster with a control plane using version 4.15.z supports machine pools with version 4.13.z and 4.14.z, but the control plane does not support machine pools using version 4.12.z.
18+
====
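The version rule in the note above can be expressed as a small check. This is a sketch under stated assumptions: the function name `pool_version_supported` is hypothetical, both versions are assumed to share the same major version and to look like `4.15.z`, and only the minor (Y-stream) number is compared, so patch (z) versions are ignored.

```shell
# Hypothetical helper: succeeds when a machine pool version is within the
# range the control plane supports, per the note above (same minor version
# or up to two minor versions behind, never a newer minor version).
# Assumes the same major version; patch (z) versions are not compared.
pool_version_supported() {
  cp_minor=$(echo "$1" | cut -d. -f2)
  pool_minor=$(echo "$2" | cut -d. -f2)
  skew=$((cp_minor - pool_minor))
  [ "$skew" -ge 0 ] && [ "$skew" -le 2 ]
}

# Control plane at 4.15.3, checked against several pool versions.
for pool in 4.15.2 4.14.8 4.13.9 4.12.30; do
  if pool_version_supported 4.15.3 "$pool"; then
    echo "$pool: supported"
  else
    echo "$pool: not supported"
  fi
done
```

With a 4.15.z control plane, the loop reports the 4.15, 4.14, and 4.13 pools as supported and the 4.12 pool as not supported, which matches the example in the note.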
19+
20+
.Prerequisites
21+
* You have installed and configured the latest version of the ROSA CLI.
22+
* No upgrades for the hosted control plane are in progress on the cluster, or scheduled to occur at the same time as the machine pool upgrade.
23+
endif::[]
24+
//END POOL-ONLY condition
25+
26+
// WHOLE CLUSTER: Conditions for upgrading machine pools as part of upgrading the whole cluster in sequence
27+
ifeval::["{context}" == "rosa-hcp-upgrading-whole-cluster"]
28+
= Upgrading machine pools
29+
30+
When your hosted control plane upgrade is complete, you can upgrade one or more machine pools simultaneously.
31+
endif::[]
32+
//END WHOLE CLUSTER condition
33+
34+
.Procedure
35+
. Verify the current version of your cluster by running the following command:
36+
+
37+
[source,terminal]
38+
----
39+
$ rosa describe cluster --cluster=<cluster_name_or_id> <1>
40+
----
41+
<1> Replace `<cluster_name_or_id>` with the cluster name or the cluster ID.
42+
+
43+
ifeval::["{context}" != "rosa-hcp-upgrading-whole-cluster"]
44+
.Example output
45+
[source,terminal]
46+
----
47+
OpenShift Version: 4.14.0
48+
----
49+
endif::[]
50+
ifeval::["{context}" == "rosa-hcp-upgrading-whole-cluster"]
51+
.Example output
52+
[source,terminal]
53+
----
54+
OpenShift Version: 4.14.8
55+
----
56+
//WHOLE CLUSTER: updating the version here to show after hcp upgrade in whole cluster section
57+
endif::[]
58+
59+
. List the versions that you can upgrade your machine pools to by running the following command:
60+
+
61+
[source,terminal]
62+
----
63+
$ rosa list upgrade --cluster <cluster_name> --machinepool <machinepool_name>
64+
----
65+
+
66+
The command returns a list of available updates, including the recommended version.
67+
+
68+
.Example output
69+
+
70+
[source,terminal]
71+
----
72+
VERSION NOTES
73+
4.14.5 recommended
74+
4.14.4
75+
4.14.3
76+
----
77+
+
78+
[IMPORTANT]
79+
====
80+
Do not upgrade your machine pool to a version higher than your control plane. If you want to move to a higher version, upgrade the control plane to that version first.
81+
====
82+
//Is it even possible to do this? Will a higher version display? Can you specify a higher version even if it doesn't display?
83+
84+
. Verify the upgrade behavior of the machine pools you intend to upgrade by running the following command:
85+
+
86+
[source,terminal]
87+
----
88+
$ rosa describe machinepool --cluster=<cluster_name_or_id> <machine_pool_name>
89+
----
90+
+
91+
.Example output
92+
[source,terminal]
93+
----
94+
Replicas: 5
95+
Node drain grace period: 30 minutes
96+
97+
Management upgrade:
98+
- Type: Replace
99+
- Max surge: 20%
100+
- Max unavailable: 20%
101+
----
102+
+
103+
In the example, these settings allow the machine pool to provision one excess node (`max-surge` of 20% of `replicas`) and to have up to one node unavailable (`max-unavailable` of 20% of `replicas`) during an upgrade. This machine pool can therefore upgrade two nodes at a time, by provisioning one new node in excess of the replica count, and by making one node unavailable and replacing it. Node upgrades may be delayed by up to 30 minutes (`node-drain-grace-period` of 30 minutes) if necessary to protect workloads that have a pod disruption budget.
104+
105+
. Upgrade one or more of your machine pools by running the following command:
106+
+
107+
[source,terminal]
108+
----
109+
$ rosa upgrade machinepool -c <cluster_name> <first_machine_pool_id> <second_machine_pool_id> [--schedule-date=<yyyy-mm-dd> --schedule-time=<HH:mm>] --version <version_number>
110+
----
111+
+
112+
Multiple machine pools can be upgraded simultaneously. You can schedule machine pool upgrades individually or schedule multiple upgrades in a single command.
113+
114+
** To schedule the immediate upgrade of a specific machine pool on your cluster, run the following command:
115+
+
116+
[source,terminal]
117+
----
118+
$ rosa upgrade machinepool -c <cluster_name> <your_machine_pool_id> --version <version_number>
119+
----
120+
+
121+
Your machine pool is scheduled for immediate upgrade, which initiates a rolling replacement of all nodes in the specified machine pool.
122+
123+
** To schedule an upgrade of multiple machine pools to start at a future date, run the following command:
124+
+
125+
[source,terminal]
126+
----
127+
$ rosa upgrade machinepool -c <cluster_name> <first_machine_pool_id> <second_machine_pool_id> --schedule-date=<yyyy-mm-dd> --schedule-time=<HH:mm> --version <version_number>
128+
----
129+
+
130+
Your machine pools are scheduled to begin an upgrade at the specified time and date in Coordinated Universal Time (UTC). This will initiate a rolling replacement of all nodes in the specified machine pools, beginning at the specified time.
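The percentage values shown in the `rosa describe machinepool` example output earlier in this procedure (`Max surge: 20%`, `Max unavailable: 20%`, 5 replicas) can be converted to node counts as follows. This is a sketch, not ROSA behavior confirmed by the text above: the helper name `percent_to_nodes` is hypothetical, and the rounding for inexact percentages (surge rounds up, unavailable rounds down) is an assumption borrowed from the Kubernetes rolling-update convention. The example in this procedure divides exactly, so it does not depend on that assumption.

```shell
# Hypothetical helper: converts a percentage setting into a node count.
# ASSUMPTION: surge rounds up and unavailable rounds down, following the
# Kubernetes rolling-update convention; the procedure only shows an exact case.
percent_to_nodes() {
  replicas=$1
  percent=${2%\%}   # strip a trailing percent sign, for example 20% -> 20
  mode=$3           # "surge" or "unavailable"
  if [ "$mode" = "surge" ]; then
    echo $(( (replicas * percent + 99) / 100 ))
  else
    echo $(( replicas * percent / 100 ))
  fi
}

# Values from the example output: 5 replicas, 20% surge, 20% unavailable.
surge=$(percent_to_nodes 5 20% surge)
unavailable=$(percent_to_nodes 5 20% unavailable)
echo "surge=$surge unavailable=$unavailable parallel=$((surge + unavailable))"
```

For the example machine pool, this gives one excess node, one unavailable node, and two nodes upgraded at a time.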
131+
132+
.Troubleshooting
133+
* Sometimes a scheduled upgrade does not initiate. See link:https://access.redhat.com/solutions/6648291[Upgrade maintenance canceled] for more information.
