// Module included in the following assemblies:
//
// * scalability_and_performance/ztp_far_edge/ztp-configuring-managed-clusters-policies.adoc

:_mod-docs-content-type: PROCEDURE
[id="ztp-coordinating-reboots-for-config-changes_{context}"]
= Coordinating reboots for configuration changes

You can use {cgu-operator-full} (TALM) to coordinate reboots across a fleet of spoke clusters when configuration changes require a reboot, such as deferred tuning changes. {cgu-operator} reboots all nodes in the targeted `MachineConfigPool` on the selected clusters when the reboot policy is applied.

Instead of rebooting nodes after each individual change, you can apply all configuration updates through policies and then trigger a single, coordinated reboot.
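
For example, a deferred tuning change can be expressed as a `Tuned` object that is annotated so that the Node Tuning Operator applies it only after the next reboot. The following is a minimal sketch; the profile name, sysctl value, and role label are illustrative only and must be adapted to your environment:

[source,yaml]
----
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: performance-patch # hypothetical profile name
  namespace: openshift-cluster-node-tuning-operator
  annotations:
    tuned.openshift.io/deferred: "update" # defer applying the change until the next reboot
spec:
  profile:
  - name: performance-patch
    data: |
      [main]
      summary=Example deferred sysctl change
      include=openshift-node-performance-profile
      [sysctl]
      kernel.shmmni=8192
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: "master" # must match the MachineConfigPool targeted by the reboot policy
    priority: 19
    profile: performance-patch
----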

.Prerequisites

* You have installed the {oc-first}.
* You have logged in to the hub cluster as a user with `cluster-admin` privileges.
* You have deployed and configured {cgu-operator}.

.Procedure

. Generate the configuration policies by creating a `PolicyGenerator` custom resource (CR). You can use one of the following sample manifests:

* `out/argocd/example/acmpolicygenerator/acm-example-sno-reboot`
* `out/argocd/example/acmpolicygenerator/acm-example-multinode-reboot`

. Update the `policyDefaults.placement.labelSelector` field in the `PolicyGenerator` CR to target the clusters that you want to reboot. Modify other fields as necessary for your use case.
+
If you are coordinating a reboot to apply a deferred tuning change, ensure that the `MachineConfigPool` in the reboot policy matches the value specified in the `spec.recommend` field in the `Tuned` object.
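+
A minimal sketch of the placement selector might look like the following. The resource name, namespace, and the `du-zone` label are hypothetical; match the label to labels that exist on your target `ManagedCluster` objects:
+
[source,yaml]
----
apiVersion: policy.open-cluster-management.io/v1
kind: PolicyGenerator
metadata:
  name: example-sno-reboot # illustrative name
policyDefaults:
  namespace: ztp-group # illustrative namespace
  placement:
    labelSelector:
      matchLabels:
        du-zone: "zone-1" # hypothetical label on the clusters to reboot
# ...
----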

. Apply the `PolicyGenerator` CR to generate and apply the configuration policies. For detailed steps, see "Customizing a managed cluster with PolicyGenerator CRs".

. After ArgoCD completes syncing the policies, create and apply the `ClusterGroupUpgrade` (CGU) CR.
+
.Example CGU custom resource configuration
[source,yaml]
----
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: reboot
  namespace: default
spec:
  clusterLabelSelectors:
  - matchLabels: <1>
# ...
  enable: true
  managedPolicies: <2>
  - example-reboot
  remediationStrategy:
    timeout: 300 <3>
    maxConcurrency: 10
# ...
----
<1> Configure the labels that match the clusters you want to reboot.
<2> Add all required configuration policies before the reboot policy. {cgu-operator} applies the configuration changes as specified in the policies, in the order that they are listed.
<3> Specify the timeout in seconds for the entire upgrade across all selected clusters. Set this field by considering the worst-case scenario.

. After you apply the CGU custom resource, {cgu-operator} rolls out the configuration policies in order. When all policies are compliant, it applies the reboot policy and triggers a reboot of all nodes in the specified `MachineConfigPool`.

.Verification

. Monitor the rollout status of the CGU custom resource on the hub cluster. Verify the successful rollout of the reboot by running the following command:
+
[source,terminal]
----
$ oc get cgu -A
----
+
.Example output
[source,terminal]
----
NAMESPACE   NAME     AGE   STATE       DETAILS
default     reboot   1d    Completed   All clusters are compliant with all the managed policies
----

. Confirm that the reboot was successful on a specific node by checking the status of the `MachineConfigPool` (MCP) for the node. Run the following command:
+
[source,terminal]
----
$ oc get mcp master
----
+
.Example output
[source,terminal]
----
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-be5785c3b98eb7a1ec902fef2b81e865   True      False      False      3              3                   3                     0                      72d
----