From e35d9367cc39c2ca68333d4ef85719f11585e944 Mon Sep 17 00:00:00 2001
From: Laura Hinson
Date: Wed, 18 Jun 2025 10:36:24 -0400
Subject: [PATCH] Add automated backup/restore with OADP docs

---
 _topic_maps/_topic_map.yml                       |   2 +
 .../hcp-disaster-recovery-oadp-auto.adoc         |  58 +++++++
 .../hcp-disaster-recovery-oadp.adoc              |   2 +-
 .../hcp-dr-oadp-backup-cp-workload-auto.adoc     | 102 ++++++++++++
 modules/hcp-dr-oadp-dpa.adoc                     | 151 ++++++++++++++++++
 modules/hcp-dr-oadp-observe-velero.adoc          |   1 +
 modules/hcp-dr-oadp-observe.adoc                 |   1 +
 modules/hcp-dr-oadp-restore-auto.adoc            |  86 ++++++++++
 modules/hcp-dr-prep-oadp-auto.adoc               |  11 ++
 9 files changed, 413 insertions(+), 1 deletion(-)
 create mode 100644 hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp-auto.adoc
 create mode 100644 modules/hcp-dr-oadp-backup-cp-workload-auto.adoc
 create mode 100644 modules/hcp-dr-oadp-dpa.adoc
 create mode 100644 modules/hcp-dr-oadp-restore-auto.adoc
 create mode 100644 modules/hcp-dr-prep-oadp-auto.adoc

diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml
index 4c1558016579..a2fc2ed181a7 100644
--- a/_topic_maps/_topic_map.yml
+++ b/_topic_maps/_topic_map.yml
@@ -2570,6 +2570,8 @@ Topics:
     File: hcp-disaster-recovery-aws
   - Name: Disaster recovery for a hosted cluster by using OADP
     File: hcp-disaster-recovery-oadp
+  - Name: Automated disaster recovery for a hosted cluster by using OADP
+    File: hcp-disaster-recovery-oadp-auto
   - Name: Authentication and authorization for hosted control planes
     File: hcp-authentication-authorization
   - Name: Handling machine configuration for hosted control planes
diff --git a/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp-auto.adoc b/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp-auto.adoc
new file mode 100644
index 000000000000..09b2b64cfe0c
--- /dev/null
+++ b/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp-auto.adoc
@@ -0,0 +1,58 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="hcp-disaster-recovery-oadp-auto"]
+= Automated disaster recovery for a hosted cluster by using {oadp-short}
+include::_attributes/common-attributes.adoc[]
+:context: hcp-disaster-recovery-oadp-auto
+
+toc::[]
+
+In hosted clusters on bare-metal or {aws-first} platforms, you can automate some backup and restore steps by using the {oadp-first} Operator.
+
+The process involves the following steps:
+
+. Configuring {oadp-short}
+. Defining a Data Protection Application (DPA)
+. Backing up the data plane workload
+. Backing up the control plane workload
+. Restoring a hosted cluster by using {oadp-short}
+
+[id="hcp-auto-dr-prereqs_{context}"]
+== Prerequisites
+
+You must meet the following prerequisites on the management cluster:
+
+* You xref:../../backup_and_restore/application_backup_and_restore/installing/oadp-installing-operator.adoc#oadp-installing-operator[installed the {oadp-short} Operator].
+* You created a storage class.
+* You have access to the cluster with `cluster-admin` privileges.
+* You have access to the {oadp-short} subscription through a catalog source.
+* You have access to a cloud storage provider that is compatible with {oadp-short}, such as S3, {azure-full}, {gcp-full}, or MinIO.
+* In a disconnected environment, you have access to a self-hosted storage provider that is compatible with {oadp-short}, for example link:https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/[{odf-full}] or link:https://min.io/[MinIO].
+* Your {hcp} pods are up and running.
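+As a quick check before you continue, you can confirm that the {oadp-short} Operator pods and your {hcp} pods are running. The following commands assume the default `openshift-adp` namespace for {oadp-short} and an example hosted control plane namespace of `clusters-hosted`; adjust the namespace names for your environment:
+
+[source,terminal]
+----
+$ oc get pods -n openshift-adp
+$ oc get pods -n clusters-hosted
+----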
+
+include::modules/hcp-dr-prep-oadp-auto.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* xref:../../backup_and_restore/application_backup_and_restore/installing/installing-oadp-mcg.adoc#installing-oadp-mcg[Configuring the {oadp-full} with Multicloud Object Gateway]
+* xref:../../backup_and_restore/application_backup_and_restore/installing/installing-oadp-aws.adoc#installing-oadp-aws[Configuring the {oadp-full} with AWS S3 compatible storage]
+
+include::modules/hcp-dr-oadp-dpa.adoc[leveloffset=+1]
+
+[id="backing-up-data-plane-oadp-auto_{context}"]
+== Backing up the data plane workload
+
+To back up the data plane workload by using the {oadp-short} Operator, see "Backing up applications". If the data plane workload is not important, you can skip this procedure.
+
+[role="_additional-resources"]
+.Additional resources
+
+* xref:../../backup_and_restore/application_backup_and_restore/backing_up_and_restoring/backing-up-applications.adoc#backing-up-applications[Backing up applications]
+
+include::modules/hcp-dr-oadp-backup-cp-workload-auto.adoc[leveloffset=+1]
+
+include::modules/hcp-dr-oadp-restore-auto.adoc[leveloffset=+1]
+
+include::modules/hcp-dr-oadp-observe.adoc[leveloffset=+1]
+
+include::modules/hcp-dr-oadp-observe-velero.adoc[leveloffset=+1]
\ No newline at end of file
diff --git a/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc b/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc
index fcd71c7d4f7d..b10e92777508 100644
--- a/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc
+++ b/hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc
@@ -82,4 +82,4 @@ include::modules/hcp-dr-oadp-restore.adoc[leveloffset=+1]
 
 include::modules/hcp-dr-oadp-observe.adoc[leveloffset=+1]
 
-include::modules/hcp-dr-oadp-observe-velero.adoc[leveloffset=+1]
+include::modules/hcp-dr-oadp-observe-velero.adoc[leveloffset=+1]
\ No newline at end of file
diff --git a/modules/hcp-dr-oadp-backup-cp-workload-auto.adoc b/modules/hcp-dr-oadp-backup-cp-workload-auto.adoc
new file mode 100644
index 000000000000..6c73bf0a3a3c
--- /dev/null
+++ b/modules/hcp-dr-oadp-backup-cp-workload-auto.adoc
@@ -0,0 +1,102 @@
+// Module included in the following assemblies:
+//
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
+
+:_mod-docs-content-type: REFERENCE
+[id="hcp-dr-oadp-backup-cp-workload-auto_{context}"]
+= Backing up the control plane workload
+
+You can back up the control plane workload by creating the `Backup` custom resource (CR).
+
+To monitor and observe the backup process, see "Observing the backup and restore process".
+
+.Procedure
+
+. Create a YAML file that defines the `Backup` CR:
++
+.Example `backup-control-plane.yaml` file
+[%collapsible]
+====
+[source,yaml]
+----
+apiVersion: velero.io/v1
+kind: Backup
+metadata:
+  name: <backup_resource_name> <1>
+  namespace: openshift-adp
+  labels:
+    velero.io/storage-location: default
+spec:
+  hooks: {}
+  includedNamespaces: <2>
+  - <hosted_cluster_namespace> <3>
+  - <hosted_control_plane_namespace> <4>
+  includedResources:
+  - sa
+  - role
+  - rolebinding
+  - pod
+  - pvc
+  - pv
+  - bmh
+  - configmap
+  - infraenv <5>
+  - priorityclasses
+  - pdb
+  - agents
+  - hostedcluster
+  - nodepool
+  - secrets
+  - services
+  - deployments
+  - hostedcontrolplane
+  - cluster
+  - agentcluster
+  - agentmachinetemplate
+  - agentmachine
+  - machinedeployment
+  - machineset
+  - machine
+  - route
+  - clusterdeployment
+  excludedResources: []
+  storageLocation: default
+  ttl: 2h0m0s
+  snapshotMoveData: true <6>
+  datamover: "velero" <6>
+  defaultVolumesToFsBackup: true <7>
+----
+====
+<1> Replace `<backup_resource_name>` with a name for your `Backup` resource.
+<2> Specifies the namespaces to back up objects from. You must include both your hosted cluster namespace and the hosted control plane namespace.
+<3> Replace `<hosted_cluster_namespace>` with the name of the hosted cluster namespace, for example, `clusters`.
+<4> Replace `<hosted_control_plane_namespace>` with the name of the hosted control plane namespace, for example, `clusters-hosted`.
+<5> You must create the `infraenv` resource in a separate namespace. Do not delete the `infraenv` resource during the backup process.
+<6> Enables the CSI volume snapshots and automatically uploads the control plane workload to the cloud storage.
+<7> Sets the `fs-backup` method as the default backup method for persistent volumes (PVs). This setting is useful when you use a combination of Container Storage Interface (CSI) volume snapshots and the `fs-backup` method.
++
+[NOTE]
+====
+If you want to use CSI volume snapshots, you must add the `backup.velero.io/backup-volumes-excludes=<pv_name>` annotation to your pods.
+====
+
+. Apply the `Backup` CR by running the following command:
++
+[source,terminal]
+----
+$ oc apply -f backup-control-plane.yaml
+----
+
+.Verification
+
+* Verify that the value of the `status.phase` field is `Completed` by running the following command:
++
+[source,terminal]
+----
+$ oc get backups.velero.io <backup_resource_name> -n openshift-adp \
+  -o jsonpath='{.status.phase}'
+----
+
+.Next steps
+
+* Restore the hosted cluster by using {oadp-short}.
\ No newline at end of file
diff --git a/modules/hcp-dr-oadp-dpa.adoc b/modules/hcp-dr-oadp-dpa.adoc
new file mode 100644
index 000000000000..bda0432ea350
--- /dev/null
+++ b/modules/hcp-dr-oadp-dpa.adoc
@@ -0,0 +1,151 @@
+// Module included in the following assemblies:
+//
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
+
+:_mod-docs-content-type: REFERENCE
+[id="hcp-dr-oadp-dpa_{context}"]
+= Automating the backup and restore process by using a DPA
+
+You can automate parts of the backup and restore process by using a Data Protection Application (DPA). When you use a DPA, the steps to pause and restart the reconciliation of resources are automated. The DPA defines information such as backup locations and Velero pod configurations.
+
+You can create a DPA by defining a `DataProtectionApplication` object.
+
+.Procedure
+
+* If you use a bare-metal platform, you can create a DPA by completing the following steps:
+
+. Create a manifest file similar to the following example:
++
+.Example `dpa.yaml` file
+[%collapsible]
+====
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionApplication
+metadata:
+  name: dpa-sample
+  namespace: openshift-adp
+spec:
+  backupLocations:
+  - name: default
+    velero:
+      provider: aws # <1>
+      default: true
+      objectStorage:
+        bucket: <bucket_name> # <2>
+        prefix: <prefix> # <3>
+      config:
+        region: minio # <4>
+        profile: "default"
+        s3ForcePathStyle: "true"
+        s3Url: "<s3_endpoint_url>" # <5>
+        insecureSkipTLSVerify: "true"
+      credential:
+        key: cloud
+        name: cloud-credentials
+        default: true
+  snapshotLocations:
+  - velero:
+      provider: aws # <1>
+      config:
+        region: minio # <4>
+        profile: "default"
+      credential:
+        key: cloud
+        name: cloud-credentials
+  configuration:
+    nodeAgent:
+      enable: true
+      uploaderType: kopia
+    velero:
+      defaultPlugins:
+      - openshift
+      - aws
+      - csi
+      - hypershift
+      resourceTimeout: 2h
+----
+====
+<1> Specify the provider for Velero. If you are using bare metal and MinIO, you can use `aws` as the provider.
+<2> Specify the bucket name; for example, `oadp-backup`.
+<3> Specify the bucket prefix; for example, `hcp`.
+<4> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
+<5> Specify the URL of the S3 endpoint.
+
+. Create the DPA object by running the following command:
++
+[source,terminal]
+----
+$ oc create -f dpa.yaml
+----
++
+After you create the `DataProtectionApplication` object, new `velero` deployment and `node-agent` pods are created in the `openshift-adp` namespace.
+
+* If you use {aws-first}, you can create a DPA by completing the following steps:
+
+. Create a manifest file similar to the following example:
++
+.Example `dpa.yaml` file
+[%collapsible]
+====
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionApplication
+metadata:
+  name: dpa-sample
+  namespace: openshift-adp
+spec:
+  backupLocations:
+  - name: default
+    velero:
+      provider: aws
+      default: true
+      objectStorage:
+        bucket: <bucket_name> # <1>
+        prefix: <prefix> # <2>
+      config:
+        region: minio # <3>
+        profile: "backupStorage"
+      credential:
+        key: cloud
+        name: cloud-credentials
+  snapshotLocations:
+  - velero:
+      provider: aws
+      config:
+        region: minio # <3>
+        profile: "volumeSnapshot"
+      credential:
+        key: cloud
+        name: cloud-credentials
+  configuration:
+    nodeAgent:
+      enable: true
+      uploaderType: kopia
+    velero:
+      defaultPlugins:
+      - openshift
+      - aws
+      - csi
+      - hypershift
+      resourceTimeout: 2h
+----
+====
+<1> Specify the bucket name; for example, `oadp-backup`.
+<2> Specify the bucket prefix; for example, `hcp`.
+<3> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
+
+. Create the DPA resource by running the following command:
++
+[source,terminal]
+----
+$ oc create -f dpa.yaml
+----
++
+After you create the `DataProtectionApplication` object, new `velero` deployment and `node-agent` pods are created in the `openshift-adp` namespace.
+
+.Next steps
+
+* Back up the data plane workload.
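+
+As an optional check that is not part of the documented procedure, you can confirm that the DPA reconciled and that Velero can reach your object storage. The resource name `dpa-sample` matches the earlier examples; adjust it if you used a different name:
+
+[source,terminal]
+----
+$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status.conditions[?(@.type=="Reconciled")].status}'
+$ oc get backupstoragelocations.velero.io -n openshift-adp
+----
+
+If the backup storage location reports the `Available` phase, Velero can write to your bucket.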
\ No newline at end of file
diff --git a/modules/hcp-dr-oadp-observe-velero.adoc b/modules/hcp-dr-oadp-observe-velero.adoc
index 05bcc8eac8a0..3286c14519f8 100644
--- a/modules/hcp-dr-oadp-observe-velero.adoc
+++ b/modules/hcp-dr-oadp-observe-velero.adoc
@@ -1,6 +1,7 @@
 // Module included in the following assemblies:
 //
 // * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
 
 :_mod-docs-content-type: PROCEDURE
 [id="hcp-dr-oadp-observe-velero_{context}"]
diff --git a/modules/hcp-dr-oadp-observe.adoc b/modules/hcp-dr-oadp-observe.adoc
index cffe047f0b99..e8b7d272c3e4 100644
--- a/modules/hcp-dr-oadp-observe.adoc
+++ b/modules/hcp-dr-oadp-observe.adoc
@@ -1,6 +1,7 @@
 // Module included in the following assemblies:
 //
 // * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
 
 :_mod-docs-content-type: PROCEDURE
 [id="hcp-dr-oadp-observe_{context}"]
diff --git a/modules/hcp-dr-oadp-restore-auto.adoc b/modules/hcp-dr-oadp-restore-auto.adoc
new file mode 100644
index 000000000000..ff0cbb3fed47
--- /dev/null
+++ b/modules/hcp-dr-oadp-restore-auto.adoc
@@ -0,0 +1,86 @@
+// Module included in the following assemblies:
+//
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="hcp-dr-oadp-restore-auto_{context}"]
+= Restoring a hosted cluster by using {oadp-short}
+
+You can restore the hosted cluster by creating the `Restore` custom resource (CR).
+
+* If you are using an in-place update, the `InfraEnv` resource does not need spare nodes. Instead, you need to re-provision the worker nodes from the new management cluster.
+* If you are using a replace update, you need some spare nodes for the `InfraEnv` resource to deploy the worker nodes.
+
+[IMPORTANT]
+====
+After you back up your hosted cluster, you must destroy it to initiate the restoring process. To initiate node provisioning, you must back up workloads in the data plane before deleting the hosted cluster.
+====
+
+.Prerequisites
+
+* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console] ({rh-rhacm} documentation) to delete your hosted cluster.
+* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster] ({rh-rhacm} documentation).
+
+To monitor and observe the restore process, see "Observing the backup and restore process".
+
+.Procedure
+
+. Verify that no pods or persistent volume claims (PVCs) are present in the hosted control plane namespace by running the following command:
++
+[source,terminal]
+----
+$ oc get pods,pvc -n <hosted_control_plane_namespace>
+----
++
+.Expected output
+[source,terminal]
+----
+No resources found
+----
+
+. Create a YAML file that defines the `Restore` CR:
++
+.Example `restore-hosted-cluster.yaml` file
+[source,yaml]
+----
+apiVersion: velero.io/v1
+kind: Restore
+metadata:
+  name: <restore_resource_name> <1>
+  namespace: openshift-adp
+spec:
+  backupName: <backup_resource_name> <2>
+  restorePVs: true <3>
+  existingResourcePolicy: update <4>
+  excludedResources:
+  - nodes
+  - events
+  - events.events.k8s.io
+  - backups.velero.io
+  - restores.velero.io
+  - resticrepositories.velero.io
+----
+<1> Replace `<restore_resource_name>` with a name for your `Restore` resource.
+<2> Replace `<backup_resource_name>` with the name of the `Backup` resource that you created.
+<3> Initiates the recovery of persistent volumes (PVs) and their pods.
+<4> Ensures that the existing objects are overwritten with the backed-up content.
++
+[IMPORTANT]
+====
+You must create the `InfraEnv` resource in a separate namespace. Do not delete the `InfraEnv` resource during the restore process. The `InfraEnv` resource is mandatory for the new nodes to be reprovisioned.
+====
+
+. Apply the `Restore` CR by running the following command:
++
+[source,terminal]
+----
+$ oc apply -f restore-hosted-cluster.yaml
+----
+
+. Verify that the value of the `status.phase` field is `Completed` by running the following command:
++
+[source,terminal]
+----
+$ oc get restores.velero.io <restore_resource_name> -n openshift-adp \
+  -o jsonpath='{.status.phase}'
+----
\ No newline at end of file
diff --git a/modules/hcp-dr-prep-oadp-auto.adoc b/modules/hcp-dr-prep-oadp-auto.adoc
new file mode 100644
index 000000000000..7ee496cffbdc
--- /dev/null
+++ b/modules/hcp-dr-prep-oadp-auto.adoc
@@ -0,0 +1,11 @@
+// Module included in the following assemblies:
+//
+// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="hcp-dr-prep-oadp-auto_{context}"]
+= Configuring {oadp-short}
+
+If your hosted cluster is on {aws-short}, follow the steps in "Configuring the {oadp-full} with Multicloud Object Gateway" to configure {oadp-short}.
+
+If your hosted cluster is on a bare-metal platform, follow the steps in "Configuring the {oadp-full} with AWS S3 compatible storage" to configure {oadp-short}.
\ No newline at end of file
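+
+Whichever storage provider you choose, {oadp-short} reads the object storage credentials from a `Secret` in the `openshift-adp` namespace. As a sketch, assuming a credentials file named `credentials-velero` in the current directory and the `cloud-credentials` secret name that the DPA examples in this patch reference, you can create the secret as follows:
+
+[source,terminal]
+----
+$ cat credentials-velero
+[default]
+aws_access_key_id=<access_key>
+aws_secret_access_key=<secret_access_key>
+
+$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
+----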