Commit 8048224

[OSDOCS-13172]: DR docs for HCP on agent
1 parent 1426ebe commit 8048224

5 files changed

+491
-16
lines changed

hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc

Lines changed: 19 additions & 2 deletions
@@ -76,9 +76,26 @@ If the data plane workload is not important, you can skip this procedure. To bac
7676

7777
* Restoring a hosted cluster by using {oadp-short}
7878

79-
include::modules/hcp-dr-oadp-backup-cp-workload.adoc[leveloffset=+1]
79+
[id="backing-up-cp-oadp_{context}"]
80+
== Backing up the control plane workload
8081

81-
include::modules/hcp-dr-oadp-restore.adoc[leveloffset=+1]
82+
You can back up the control plane workload by creating the `Backup` custom resource (CR). The steps vary depending on whether your platform is {aws-short} or bare metal.
83+
84+
include::modules/hcp-dr-oadp-backup-cp-workload-aws.adoc[leveloffset=+2]
85+
include::modules/hcp-dr-oadp-backup-cp-workload-bm.adoc[leveloffset=+2]
86+
87+
[id="hcp-restoring-oadp_{context}"]
88+
== Restoring a hosted cluster by using {oadp-short}
89+
90+
You can restore a hosted cluster into the same management cluster or into a new management cluster.
91+
92+
include::modules/hcp-dr-oadp-restore.adoc[leveloffset=+2]
93+
include::modules/hcp-dr-oadp-restore-new-mgmt.adoc[leveloffset=+2]
94+
95+
[role="_additional-resources"]
96+
.Additional resources
97+
* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console]
98+
* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster]
8299

83100
include::modules/hcp-dr-oadp-observe.adoc[leveloffset=+1]
84101

modules/hcp-dr-oadp-backup-cp-workload.adoc renamed to modules/hcp-dr-oadp-backup-cp-workload-aws.adoc

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@
33
// * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
44

55
:_mod-docs-content-type: REFERENCE
6-
[id="hcp-dr-oadp-backup-cp-workload_{context}"]
7-
= Backing up the control plane workload
6+
[id="hcp-dr-oadp-backup-cp-workload-aws_{context}"]
7+
= Backing up the control plane workload on {aws-short}
88

99
You can back up the control plane workload by creating the `Backup` custom resource (CR).
1010

modules/hcp-dr-oadp-backup-cp-workload-bm.adoc

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="hcp-dr-oadp-backup-cp-workload-bm_{context}"]
7+
= Backing up the control plane workload on a bare-metal platform
8+
9+
You can back up the control plane workload by creating the `Backup` custom resource (CR).
10+
11+
To monitor and observe the backup process, see "Observing the backup and restore process".
12+
13+
.Procedure
14+
15+
. Pause the reconciliation of the `HostedCluster` resource by running the following command:
16+
+
17+
[source,terminal]
18+
----
19+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
20+
patch hostedcluster -n <hosted_cluster_namespace> <hosted_cluster_name> \
21+
--type json -p '[{"op": "add", "path": "/spec/pausedUntil", "value": "true"}]'
22+
----
23+
24+
. Get the infrastructure ID of your hosted cluster by running the following command:
25+
+
26+
[source,terminal]
27+
----
28+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
29+
get hostedcluster -n <hosted_cluster_namespace> \
30+
<hosted_cluster_name> -o=jsonpath="{.spec.infraID}"
31+
----
32+
33+
. Note the infrastructure ID to use in the next step.
34+
35+
. Pause the reconciliation of the `cluster.cluster.x-k8s.io` resource by running the following command:
36+
+
37+
[source,terminal]
38+
----
39+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
40+
annotate cluster -n <hosted_control_plane_namespace> \
41+
<hosted_cluster_infra_id> cluster.x-k8s.io/paused=true
42+
----
43+
44+
. Pause the reconciliation of the `NodePool` resource by running the following command:
45+
+
46+
[source,terminal]
47+
----
48+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
49+
patch nodepool -n <hosted_cluster_namespace> <node_pool_name> \
50+
--type json -p '[{"op": "add", "path": "/spec/pausedUntil", "value": "true"}]'
51+
----
52+
53+
. Pause the reconciliation of the `AgentCluster` resource by running the following command:
54+
+
55+
[source,terminal]
56+
----
57+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
58+
annotate agentcluster -n <hosted_control_plane_namespace> \
59+
cluster.x-k8s.io/paused=true --all
60+
----
61+
62+
. Pause the reconciliation of the `AgentMachine` resource by running the following command:
63+
+
64+
[source,terminal]
65+
----
66+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
67+
annotate agentmachine -n <hosted_control_plane_namespace> \
68+
cluster.x-k8s.io/paused=true --all
69+
----
70+
71+
. If you are backing up and restoring to the same management cluster, annotate the `HostedCluster` resource to prevent the deletion of the hosted control plane namespace by running the following command:
72+
+
73+
[source,terminal]
74+
----
75+
$ oc --kubeconfig <management_cluster_kubeconfig_file> \
76+
annotate hostedcluster -n <hosted_cluster_namespace> <hosted_cluster_name> \
77+
hypershift.openshift.io/skip-delete-hosted-controlplane-namespace=true
78+
----
79+
80+
. Create a YAML file that defines the `Backup` CR:
81+
+
82+
.Example `backup-control-plane.yaml` file
83+
[%collapsible]
84+
====
85+
[source,yaml]
86+
----
87+
apiVersion: velero.io/v1
88+
kind: Backup
89+
metadata:
90+
  name: <backup_resource_name> # <1>
91+
  namespace: openshift-adp
92+
  labels:
93+
    velero.io/storage-location: default
94+
spec:
95+
  hooks: {}
96+
  includedNamespaces: # <2>
97+
  - <hosted_cluster_namespace> # <3>
98+
  - <hosted_control_plane_namespace> # <4>
99+
  - <agent_namespace> # <5>
100+
  includedResources:
101+
  - sa
102+
  - role
103+
  - rolebinding
104+
  - pod
105+
  - pvc
106+
  - pv
107+
  - bmh
108+
  - configmap
109+
  - infraenv
110+
  - priorityclasses
111+
  - pdb
112+
  - agents
113+
  - hostedcluster
114+
  - nodepool
115+
  - secrets
116+
  - services
117+
  - deployments
118+
  - hostedcontrolplane
119+
  - cluster
120+
  - agentcluster
121+
  - agentmachinetemplate
122+
  - agentmachine
123+
  - machinedeployment
124+
  - machineset
125+
  - machine
126+
  excludedResources: []
127+
  storageLocation: default
128+
  ttl: 2h0m0s
129+
  snapshotMoveData: true # <6>
130+
  datamover: "velero" # <6>
131+
  defaultVolumesToFsBackup: true # <7>
132+
----
133+
====
134+
<1> Replace `<backup_resource_name>` with the name of your `Backup` resource.
135+
<2> Selects the namespaces from which to back up objects. You must include your hosted cluster namespace and the hosted control plane namespace.
136+
<3> Replace `<hosted_cluster_namespace>` with the name of the hosted cluster namespace, for example, `clusters`.
137+
<4> Replace `<hosted_control_plane_namespace>` with the name of the hosted control plane namespace, for example, `clusters-hosted`.
138+
<5> Replace `<agent_namespace>` with the namespace where your `Agent`, `BMH`, and `InfraEnv` CRs are located, for example, `agents`.
139+
<6> Enables CSI volume snapshots and automatically uploads the control plane workload to cloud storage.
140+
<7> Sets `fs-backup` as the default backup method for persistent volumes (PVs). This setting is useful when you use a combination of Container Storage Interface (CSI) volume snapshots and the `fs-backup` method.
141+
+
142+
[NOTE]
143+
====
144+
If you want to use CSI volume snapshots, you must add the `backup.velero.io/backup-volumes-excludes=<pv_name>` annotation to your PVs.
145+
====
146+
147+
. Apply the `Backup` CR by running the following command:
148+
+
149+
[source,terminal]
150+
----
151+
$ oc apply -f backup-control-plane.yaml
152+
----
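
The pause steps in this procedure could be collected into one shell function. The following is a minimal sketch and not part of the documented procedure: the function name `pause_hcp_reconciliation` and its positional parameters are hypothetical, and it assumes that `oc` is on the `PATH` and that you pass the same values that you would substitute for the placeholders in the preceding commands.

```shell
#!/bin/sh
# Sketch only: wraps the documented pause commands in one function.
# The function name and argument order are illustrative assumptions.
#
# Arguments:
#   $1  management cluster kubeconfig file
#   $2  hosted cluster namespace
#   $3  hosted cluster name
#   $4  hosted control plane namespace
#   $5  node pool name
pause_hcp_reconciliation() {
  kubeconfig="$1"
  hc_ns="$2"
  hc_name="$3"
  hcp_ns="$4"
  np_name="$5"
  pause_patch='[{"op": "add", "path": "/spec/pausedUntil", "value": "true"}]'

  # Pause the HostedCluster resource.
  oc --kubeconfig "$kubeconfig" patch hostedcluster -n "$hc_ns" "$hc_name" \
    --type json -p "$pause_patch" || return 1

  # Read the infrastructure ID; it names the cluster.cluster.x-k8s.io resource.
  infra_id="$(oc --kubeconfig "$kubeconfig" get hostedcluster -n "$hc_ns" \
    "$hc_name" -o=jsonpath="{.spec.infraID}")" || return 1
  if [ -z "$infra_id" ]; then
    echo "infrastructure ID is empty; stopping" >&2
    return 1
  fi

  # Pause the cluster.cluster.x-k8s.io resource.
  oc --kubeconfig "$kubeconfig" annotate cluster -n "$hcp_ns" \
    "$infra_id" cluster.x-k8s.io/paused=true || return 1

  # Pause the NodePool resource.
  oc --kubeconfig "$kubeconfig" patch nodepool -n "$hc_ns" "$np_name" \
    --type json -p "$pause_patch" || return 1

  # Pause all AgentCluster and AgentMachine resources in the namespace.
  oc --kubeconfig "$kubeconfig" annotate agentcluster -n "$hcp_ns" \
    cluster.x-k8s.io/paused=true --all || return 1
  oc --kubeconfig "$kubeconfig" annotate agentmachine -n "$hcp_ns" \
    cluster.x-k8s.io/paused=true --all || return 1
}
```

For example, `pause_hcp_reconciliation ./mgmt-kubeconfig clusters hosted clusters-hosted my-nodepool`, where every argument value is a placeholder. The function stops at the first failing command so that a partially paused state is visible instead of silently ignored. It does not add the `skip-delete-hosted-controlplane-namespace` annotation, because that step applies only when you restore to the same management cluster.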
153+
154+
.Verification
155+
156+
* Verify that the value of `status.phase` is `Completed` by running the following command:
157+
+
158+
[source,terminal]
159+
----
160+
$ oc get backups.velero.io <backup_resource_name> -n openshift-adp \
161+
-o jsonpath='{.status.phase}'
162+
----
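
Because a backup can take some time to finish, the verification command might need to be repeated. The following is a minimal polling sketch that reuses the same `oc get` command. The function name `wait_for_backup`, the attempt count, and the interval are hypothetical choices and not part of the documented procedure. The failure phases listed in the `case` statement are Velero `Backup` phase names; treat the list as an assumption to confirm against your Velero version.

```shell
#!/bin/sh
# Sketch only: poll the Backup CR until it reaches a terminal phase.
#
# Arguments:
#   $1  Backup resource name
#   $2  maximum number of attempts (default: 60)
#   $3  seconds to wait between attempts (default: 10)
# Return codes: 0 = Completed, 1 = failed phase, 2 = timed out
wait_for_backup() {
  backup_name="$1"
  max_attempts="${2:-60}"
  interval="${3:-10}"
  attempt=0

  while [ "$attempt" -lt "$max_attempts" ]; do
    # Same query as the verification step; an oc error is treated as "no phase yet".
    phase="$(oc get backups.velero.io "$backup_name" -n openshift-adp \
      -o jsonpath='{.status.phase}')" || phase=""

    case "$phase" in
      Completed)
        echo "Backup $backup_name completed"
        return 0
        ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "Backup $backup_name ended in phase $phase" >&2
        return 1
        ;;
    esac

    attempt=$((attempt + 1))
    sleep "$interval"
  done

  echo "Timed out waiting for backup $backup_name" >&2
  return 2
}
```

For example, `wait_for_backup <backup_resource_name> 60 10` checks every 10 seconds for up to 60 attempts. Distinct return codes let a calling script tell a failed backup from a timeout.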
163+
164+
.Next steps
165+
166+
* Restore a hosted cluster by using OADP.
