Merge pull request #78825 from fmcdonal/OSDOCS-9533.1

kcarmichael08 · web-flow · commit 342612c77263 · 2024-08-26T14:52:05.000-04:00
OSDOCS#9533.1: Updates for preflight cluster upgrades for ROSA Classic (peer review completed through PR: #76632)
diff --git a/modules/rosa-deleting-cluster-upgrade-cli.adoc b/modules/rosa-deleting-cluster-upgrade-cli.adoc
@@ -0,0 +1,42 @@
+// Module included in the following assemblies:
+//
+// * upgrading/rosa_upgrading/rosa-upgrading-sts.adoc
+:_mod-docs-content-type: PROCEDURE
+[id="rosa-deleting-cluster-upgrade-cli_{context}"]
+= Deleting a ROSA cluster upgrade with the ROSA CLI
+
+You can use either the ROSA CLI (`rosa`) or the {cluster-manager} console to delete a scheduled upgrade. This procedure uses the ROSA CLI.
+
+.Procedure
+
+. Verify that the cluster update has not started by running the following command:
++
+[source,terminal]
+----
+$ rosa list upgrade --cluster=<cluster_name|cluster_id>
+----
++
+.Example output
+[source,terminal]
+----
+VERSION  NOTES
+4.15.14  recommended - scheduled for 2024-06-02 15:00 UTC
+4.15.13
+----
+
+. Delete a scheduled update by running the following command:
++
+[source,terminal]
+----
+$ rosa delete upgrade --cluster=<cluster_name|cluster_id>
+----
++
+. Confirm the deletion by entering `Yes` at the confirmation prompt.
++
+.Example output
+[source,terminal]
+----
+I: Successfully canceled scheduled upgrade on cluster 'my-cluster'
+----
+
+You will receive an email notification confirming that the scheduled upgrade has been canceled.
diff --git a/modules/rosa-deleting-cluster-upgrade-ocm.adoc b/modules/rosa-deleting-cluster-upgrade-ocm.adoc
@@ -0,0 +1,18 @@
+// Module included in the following assemblies:
+//
+// * upgrading/rosa_upgrading/rosa-upgrading-sts.adoc
+:_mod-docs-content-type: PROCEDURE
+[id="rosa-deleting-cluster-upgrade-ocm_{context}"]
+= Deleting a ROSA cluster upgrade with the {cluster-manager} console
+
+You can use the {cluster-manager} console to delete a scheduled upgrade.
+
+.Procedure
+
+. Log in to {cluster-manager-url}.
+. Select the cluster with the scheduled upgrade.
+. Click the *Settings* tab.
+. In the *Update status* pane, click *Cancel this update*.
+. Review the update details in the *Cancel update* dialog and click *Cancel this update*.
+
+You will receive an email notification confirming that the scheduled upgrade has been canceled.
diff --git a/modules/rosa-how-upgrades-work.adoc b/modules/rosa-how-upgrades-work.adoc
@@ -0,0 +1,58 @@
+
+// Module included in the following assemblies:
+//
+// upgrading/rosa-upgrading-sts.adoc
+
+
+:_mod-docs-content-type: CONCEPT
+[id="rosa-how-upgrades-work_{context}"]
+= How ROSA (classic architecture) cluster upgrades work
+
+Upgrades are manually initiated (one-time) or automatically scheduled (recurring). Red{nbsp}Hat Site Reliability Engineers (SREs) monitor upgrade progress and either proactively notify you to take corrective actions or remedy issues encountered.
+
+The Cluster Version Operator (CVO) is the primary component that orchestrates and facilitates the OpenShift Container Platform update process.
+
+The Managed Upgrade Operator (MUO) handles the scheduling, monitoring, and notifications of ROSA (classic architecture) cluster upgrades. The MUO orchestrates the automated in-place cluster upgrades by ensuring the operating conditions are met before and after the upgrade of a managed cluster.
+
+[id="rosa-upgrade-scheduled-time_{context}"]
+== Cluster upgrade scheduled time
+
+You can schedule cluster upgrades by setting the scheduled time. This is when the preparation for the cluster upgrade begins with pre-upgrade health checks and additional compute capacity creation. The actual cluster upgrade starts within one hour from the scheduled time. You receive an email notification when the cluster upgrade starts.
+
+The Pre-Health Check (PHC) provides extra protection to help ensure that the scheduled update proceeds as expected.
+The PHC always runs at least once, during the _Upgrading_ phase, and also runs in advance if the upgrade is scheduled for more than 2 hours from the current time:
+
+* If an upgrade is scheduled for more than 2 hours from the current time, the PHC runs and you are notified if there are any failures. This PHC is in the _New_ phase of the upgrade.
+* If the upgrade is immediate or scheduled within 2 hours, the PHC runs just before the upgrade begins. This PHC is in the _Upgrading_ phase of the upgrade.
+
+You can observe the status of the cluster upgrade by running the `rosa describe upgrade --cluster=<cluster_name|cluster_id>` command in the ROSA CLI (`rosa`).
+
+[id="rosa-cluster-upgrade-overview_{context}"]
+== ROSA (classic architecture) upgrade overview
+
+The following are the high-level steps that occur during the ROSA (classic architecture) cluster update:
+
+. Scheduling the upgrade in advance triggers the `PreHealthCheck` and notifies users of any failures that they can then address before the upgrade begins.
+. Before the cluster upgrade begins, the MUO performs a cluster health check. If the MUO identifies an issue that requires corrective action, you are notified. Some examples of the cluster health checks that the MUO performs include:
+** Identifying any Pod Disruption Budgets (PDBs) that can potentially block or delay the update of the nodes.
+** Ensuring cluster Operators are available and healthy.
+** Ensuring cluster critical alerts are not firing.
+. A temporary compute node is created in the cluster to allow for the scheduling of drained pods during the update.
++
+[NOTE]
+====
+The temporary compute node creation does not always happen. For example, if there is no `worker` machine pool, the temporary compute node will not be created. This might happen when a cluster admin deletes the existing `worker` machine pool and creates another `worker` machine pool with a different name or instance type.
+====
+. The cluster version is set to the target version.
++
+[NOTE]
+====
+In certain situations, an upgrade path can become unavailable after the cluster update is requested but before it is completed. In such cases, the upgrade is automatically canceled and a notification is sent. You must pick another target version and request the upgrade again.
+====
++
+. During the upgrade, the control plane components are updated to the new version.
+. Next, individual cluster Operators perform update tasks on their domain of the cluster.
+. Finally, the Machine Config Operator (MCO) updates the system configuration and operating system of every node. During this step, each node is rebooted after successfully draining the workloads running on the node.
+.. During the update of each node, workloads are drained, honoring the PDBs. Workloads with PDBs that do not allow disruptions essentially block the draining of the node, increasing the elapsed time for the cluster update.
+.. During the update of every node in the cluster, the cluster update waits up to the time specified by the _node drain grace period_ to allow workloads to drain safely. When the node drain grace period is reached, the node is forcibly drained to allow the cluster upgrade to progress. You can configure the node drain grace period only before initiating the upgrade; you cannot change it after the cluster upgrade begins.
+.. When cluster nodes are updated, the MCO selects one node at a time per machine config pool according to their age, starting with the oldest.
diff --git a/modules/rosa-upgrading-cli-tutorial.adoc b/modules/rosa-upgrading-cli-tutorial.adoc
@@ -15,13 +15,12 @@ endif::[]
 [id="rosa-upgrading-cli_{context}"]
 = Upgrading with the ROSA CLI
 
-You can upgrade a {product-title} (ROSA) cluster manually by using the ROSA CLI (`rosa`).
-
-This method schedules the cluster for an immediate upgrade, if a more recent version is available.
+You can use the ROSA CLI (`rosa`) to upgrade a {product-title} (ROSA) cluster either immediately (starting within one hour) or at a scheduled future time.
 
 .Prerequisites
 
 * You have installed and configured the latest ROSA CLI on your installation host.
+* Your {product-title} cluster is in a `Ready` state.
 
 .Procedure
 
@@ -40,14 +39,14 @@ $ rosa describe cluster --cluster=<cluster_name|cluster_id> <1>
 $ rosa list upgrade --cluster=<cluster_name|cluster_id>
 ----
 +
-The command returns a list of versions to which the cluster can be upgraded, including a recommended version.
+The command returns a list of versions to which the cluster can be upgraded, including a recommended version. The recommendation is based on the conditional update risks. Each known risk might apply to all clusters or only clusters matching certain conditions. Refer to the OpenShift release notes to evaluate, validate and determine the appropriate version to upgrade to.
 
-. To upgrade a cluster to the latest available version, enter the following command:
+. To upgrade the cluster to a specified version immediately within the next hour, enter the following command:
 +
 ifndef::rosa-hcp[]
 [source,terminal]
 ----
-$ rosa upgrade cluster --cluster=<cluster_name|cluster_id>
+$ rosa upgrade cluster --cluster=<cluster_name|cluster_id> --version <version-id>
 ----
 endif::rosa-hcp[]
 ifdef::rosa-hcp[]
@@ -58,9 +57,51 @@ $ rosa upgrade cluster --cluster=<cluster_name|cluster_id> --control-plane
 ----
 endif::rosa-hcp[]
 +
-The cluster is scheduled for an immediate upgrade. This action can take an hour or longer, depending on your workload configuration, such as pod disruption budgets.
+[NOTE]
+====
+If you are upgrading an AWS Security Token Service (STS) cluster, this command starts an interactive IAM roles and policies upgrade process that verifies that the account and Operator role policies for the chosen cluster are compatible with the target version of the upgrade. If the policies are not compatible with the chosen upgrade version, the CLI automatically upgrades them in `auto` mode.
+====
++
+The cluster is scheduled for an immediate upgrade as denoted by the _Scheduled Time_. The upgrade will begin within one hour from the scheduled time.
++
+. Alternatively, to upgrade the cluster at a future time in UTC, enter the following command:
++
+[source,terminal]
+----
+$ rosa upgrade cluster --cluster=<cluster_name|cluster_id>   \
+          --version <version-id>   \
+          --schedule-date yyyy-mm-dd \
+          --schedule-time HH:mm
+----
++
+. To customize the grace period for draining each node during the cluster upgrade, enter the following command:
++
+[source,terminal]
+----
+$ rosa upgrade cluster --cluster=<cluster_name|cluster_id>   \
+          --version <version-id>   \
+          --node-drain-grace-period '15 minutes'
+----
++
+
+.Verification
+
+. View the status of the upgrade, including whether it is scheduled or started and its scheduled time, by entering the following command:
++
+[source,terminal]
+----
+$ rosa list upgrade --cluster=<cluster_name|cluster_id>
+----
 +
-You will receive an email when the upgrade is complete. You can also check the status by running the `rosa describe cluster` command again from the ROSA CLI or view the status in {cluster-manager} console.
+.Example output
+[source,terminal]
+----
+VERSION  NOTES
+4.15.14  recommended - scheduled for 2024-06-02 15:00 UTC
+4.15.13
+----
+
+You will receive email notifications confirming the scheduling, beginning, and completion of the cluster upgrade.
 
 .Troubleshooting
 * Sometimes a scheduled upgrade does not trigger. See link:https://access.redhat.com/solutions/6648291[Upgrade maintenance cancelled] for more information.
diff --git a/modules/rosa-upgrading-manual-ocm.adoc b/modules/rosa-upgrading-manual-ocm.adoc
@@ -9,22 +9,27 @@ endif::[]
 
 :_mod-docs-content-type: PROCEDURE
 [id="rosa-upgrade-ocm_{context}"]
-= Scheduling individual upgrades through the {cluster-manager} console
+= Upgrading with the {cluster-manager} console
 
-You can schedule upgrades for a {product-title} cluster manually one time by using {cluster-manager} console.
+You can schedule upgrades for a ROSA (classic architecture) cluster either one time or on a recurring schedule by using the {cluster-manager} console.
 
 .Procedure
 
 . Log in to {cluster-manager-url}.
 . Select a cluster to upgrade.
 . Click the *Settings* tab.
-. In the *Update strategy* pane, select *Individual Updates*.
-. Select the version you want to upgrade your cluster to. Recommended cluster upgrades appear in the UI.
-. If you select an update version that requires approval, provide an administrator’s acknowledgment and click *Approve and continue*.
+. In the *Update strategy* pane, select which type of update you want:
+** For individual updates, you can request the upgrade either immediately (to start within an hour) or at a future time.
+** For recurring updates, choose a recurring date and time to start the upgrade automatically to the latest x.y.Z (z-stream) version available.
 +
-. In the *Node draining* pane, select a grace period interval from the list. The grace period enables the nodes to gracefully drain before forcing the pod eviction. The default is *1 hour*.
+[IMPORTANT]
+====
+Recurring updates are applicable only for z-stream updates. Minor version or y-stream updates need to be done manually. You will be notified when a new y-stream update is available.
+====
 +
-[NOTE]
+. Optional: In the *Node draining* pane, select a grace period interval from the list. The grace period enables the nodes to gracefully drain before forcing the pod eviction. The default is *1 hour*.
++
+[IMPORTANT]
 ====
 You cannot change the node drain grace period after you start the upgrade process.
 ====
@@ -37,16 +42,21 @@ You cannot change the node drain grace period after you start the upgrade proces
 The *Update* button is enabled only when an upgrade is available.
 ====
 +
-. In the *Select version* dialog, choose a target upgrade version and click *Next*.
+. The *Update cluster* dialog opens. Recommended cluster upgrades appear in the *Select version* pane. Select the version you want to upgrade your cluster to, and click *Next*.
+. Optional: For ROSA clusters that use AWS Security Token Service (STS), the account-level and cluster-specific Operator roles might need to be updated, depending on the selected target version.
+.. In the ROSA CLI, run the `rosa list account-roles` command to list and verify that the account roles are compatible with the target minor version chosen for the upgrade. If the roles are not compatible, run the `rosa upgrade account-roles` command to upgrade the account roles to the latest OpenShift version.
+.. In the ROSA CLI, run the `rosa list operator-roles` command to list and verify that the Operator roles associated with the cluster are compatible with the target minor version chosen for the upgrade. If the roles are not compatible, run the `rosa upgrade operator-roles` command to upgrade the cluster's Operator roles to the latest OpenShift version.
+.. If you select an update version that requires approval, provide an administrator's acknowledgment by typing *Acknowledge* into the field provided, and click *Next*.
 . In the *Schedule update* dialog, schedule your cluster upgrade.
 +
 * To upgrade within an hour, select *Update now* and click *Next*.
 * To upgrade at a later time, select *Schedule a different time* and set a time and date for your upgrade. Click *Next* to proceed to the confirmation dialog.
 +
 . After reviewing the version and schedule summary, select *Confirm update*.
-+
-The cluster is scheduled for an upgrade to the target version. This action can take an hour or longer, depending on the selected upgrade schedule and your workload configuration, such as pod disruption budgets.
-+
+. Click *Close* to exit the *Update cluster* dialog.
+
+The cluster is scheduled for an upgrade to the target version. This action can take up to an hour, depending on the selected upgrade schedule and your workload configuration, such as pod disruption budgets.
+
 The status is displayed in the *Update status* pane.
 
 .Troubleshooting
diff --git a/upgrading/rosa-upgrading-sts.adoc b/upgrading/rosa-upgrading-sts.adoc
@@ -1,31 +1,41 @@
 :_mod-docs-content-type: ASSEMBLY
 [id="rosa-upgrading-sts"]
-= Upgrading ROSA Classic clusters
+= Upgrading ROSA (classic architecture) clusters
 include::_attributes/attributes-openshift-dedicated.adoc[]
 :context: rosa-upgrading-sts
 
 toc::[]
 
+Use one of the following methods to upgrade ROSA (classic architecture) clusters:
+
+* Manually through the ROSA CLI (`rosa`) - Start a one-time immediate upgrade or schedule a one-time upgrade for a future date or time.
+* Manually through the {cluster-manager} UI - Start a one-time immediate upgrade, schedule a one-time upgrade for a future date and time, or schedule an upgrade window for automatic recurring upgrades whenever a new z-stream version is available.
+
 [id="rosa-lifecycle-policy_{context}"]
 == Life cycle policies and planning
 
 To plan an upgrade, review the xref:../rosa_architecture/rosa_policy_service_definition/rosa-life-cycle.adoc#rosa-life-cycle[{product-title} update life cycle]. The life cycle page includes release definitions, support and upgrade requirements, installation policy information and life cycle dates.
 
-Upgrades are manually initiated or automatically scheduled. Red{nbsp}Hat Site Reliability Engineers (SREs) monitor upgrade progress and remedy any issues encountered.
+You can use update channels to decide which Red Hat OpenShift Container Platform minor version to update your clusters to. {product-title} supports updates only through the `stable` channel. To learn more about OpenShift update channels and releases, see link:https://docs.openshift.com/container-platform/latest/updating/understanding_updates/understanding-update-channels-release.html[Understanding update channels and releases].
 
 [id="rosa-sts-upgrading-a-cluster-with-sts"]
 == Upgrading a ROSA Classic cluster
 
-There are two methods to upgrade Classic {product-title} (ROSA) clusters:
-
-* Individual upgrades through the ROSA CLI (`rosa`)
-* Individual upgrades through the {cluster-manager} console
+You must upgrade ROSA (classic architecture) clusters using either the ROSA CLI (`rosa`) or the {cluster-manager} console.
 
 [NOTE]
 ====
-When you follow a scheduled upgrade policy, a delay of an hour or more before the upgrade process begins might occur, even if the upgrade is configured to begin immediately. Additionally, the duration of the upgrade might vary based on your workload configuration.
+The actual start time of the cluster upgrade will be within one hour of the upgrade schedule time. Additionally, the duration of the upgrade might vary based on your workload configuration.
 ====
 
+When a ROSA (classic architecture) cluster that uses AWS Security Token Service (STS) is upgraded, the ROSA CLI verifies that the account and Operator role policies for the chosen cluster are compatible with the target version of the upgrade. If the policies are compatible, the CLI automatically upgrades the cluster. If the policies are not compatible with the chosen upgrade version, the CLI automatically upgrades the IAM policies before upgrading the cluster. When scheduling the upgrade, you give administrative acknowledgment to confirm that you have reviewed the changes involved with the upgrade, if required.
+
+include::modules/rosa-how-upgrades-work.adoc[leveloffset=+2]
+
 include::modules/rosa-upgrading-cli-tutorial.adoc[leveloffset=+2]
 
+include::modules/rosa-deleting-cluster-upgrade-cli.adoc[leveloffset=+2]
+
 include::modules/rosa-upgrading-manual-ocm.adoc[leveloffset=+2]
+
+include::modules/rosa-deleting-cluster-upgrade-ocm.adoc[leveloffset=+2]