
Commit 6db038c

[OSDOCS-12288]: Adding live migration docs

1 parent 8991b26

File tree: 5 files changed (+139, -61 lines)

hosted_control_planes/hcp-deploy/hcp-deploy-virt.adoc

Lines changed: 22 additions & 5 deletions
@@ -15,18 +15,35 @@ With {hcp} and {VirtProductName}, you can create {product-title} clusters with w
 The {hcp} feature is enabled by default.
 
-You can use the hosted control plane command line interface, hcp, to create an {product-title} hosted cluster. The hosted cluster is automatically imported as a managed cluster. If you want to disable this automatic import feature, see _Disabling the automatic import of hosted clusters into multicluster engine operator_.
+You can use the hosted control plane command line interface, `hcp`, to create an {product-title} hosted cluster. The hosted cluster is automatically imported as a managed cluster. If you want to disable this automatic import feature, see "Disabling the automatic import of hosted clusters into multicluster engine operator".
+
+[role="_additional-resources"]
+.Additional resources
+* xref:../../hosted_control_planes/hcp-prepare/hcp-enable-disable.adoc[Enabling or disabling the {hcp} feature]
+* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html/clusters/cluster_mce_overview#ansible-config-hosted-cluster[Configuring Ansible Automation Platform jobs to run on hosted clusters]
+* xref:../../hosted_control_planes/hcp-import.adoc#hcp-import-disable_hcp-import[Disabling the automatic import of hosted clusters into {mce-short}]
 
 include::modules/hcp-virt-reqs.adoc[leveloffset=+1]
 
 [role="_additional-resources"]
 .Additional resources
 
 * xref:../../scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.adoc#recommended-etcd-practices[Recommended etcd practices]
-* xref:../../storage/persistent_storage/persistent_storage_local/persistent-storage-using-lvms.adoc[Persistent storage using LVM Storage]
-* To disable the {hcp} feature or, if you already disabled it and want to manually enable it, see xref:../../hosted_control_planes/hcp-prepare/hcp-enable-disable.adoc[Enabling or disabling the {hcp} feature].
-* To manage hosted clusters by running Ansible Automation Platform jobs, see link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html/clusters/cluster_mce_overview#ansible-config-hosted-cluster[Configuring Ansible Automation Platform jobs to run on hosted clusters].
-* If you want to disable the automatic import feature, see _Disabling the automatic import of hosted clusters into {mce-short}_.
+* xref:../../storage/persistent_storage/persistent_storage_local/persistent-storage-using-lvms.adoc#persistent-storage-using-lvms[Persistent storage using Logical Volume Manager Storage]
+
+include::modules/hcp-virt-prereqs.adoc[leveloffset=+2]
+
+[role="_additional-resources"]
+.Additional resources
+* xref:../../virt/install/installing-virt.adoc#installing-virt-web[Installing OpenShift Virtualization using the web console]
+* xref:../../post_installation_configuration/post-install-storage-configuration.adoc#post-install-storage-configuration[Postinstallation storage configuration]
+* link:https://console.redhat.com/openshift/install/platform-agnostic/user-provisioned[Install OpenShift on any x86_64 platform with user-provisioned infrastructure]
+* xref:../../hosted_control_planes/hcp-deploy/hcp-deploy-virt.adoc#hcp-metallb_hcp-deploy-virt[Optional: Configuring MetalLB]
+* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html/clusters/cluster_mce_overview#advanced-config-engine[Advanced configuration]
+
+include::modules/hcp-virt-firewall-port.adoc[leveloffset=+2]
+
+include::modules/hcp-virt-live-migration.adoc[leveloffset=+1]
 
 [id="hcp-virt-create-hc"]
 == Creating a hosted cluster with the KubeVirt platform

modules/hcp-virt-firewall-port.adoc

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-deploy-virt.adoc

:_mod-docs-content-type: CONCEPT
[id="hcp-virt-firewall-port_{context}"]
= Firewall and port requirements

Ensure that you meet the firewall and port requirements so that the management cluster, the control plane, and the hosted clusters can communicate with each other. An example firewall rule follows this list:

* The `kube-apiserver` service runs on port 6443 by default and requires ingress access for communication between the control plane components.

** If you use the `NodePort` publishing strategy, ensure that the node port that is assigned to the `kube-apiserver` service is exposed.
** If you use MetalLB load balancing, allow ingress access to the IP range that is used for load balancer IP addresses.

* If you use the `NodePort` publishing strategy, use a firewall rule for the `ignition-server` and `Oauth-server` settings.

* The `konnectivity` agent, which establishes a reverse tunnel to allow bi-directional communication on the hosted cluster, requires egress access to the cluster API server address on port 6443. With that egress access, the agent can reach the `kube-apiserver` service.

** If the cluster API server address is an internal IP address, allow access from the workload subnets to the IP address on port 6443.
** If the address is an external IP address, allow egress on port 6443 to that external IP address from the nodes.

* If you change the default port of 6443, adjust the rules to reflect that change.
* Ensure that you open any ports that are required by the workloads that run in the clusters.
* Use firewall rules, security groups, or other access controls to restrict access to only required sources. Avoid exposing ports publicly unless necessary.
* For production deployments, use a load balancer to simplify access through a single IP address.
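For example, if the management cluster nodes run firewalld, rules similar to the following open the default API server port. This is a minimal sketch under that assumption: the `<hosted_control_plane_namespace>` value is a placeholder, and the exact rules depend on your publishing strategy and network design.

[source,terminal]
----
$ oc get service kube-apiserver -n <hosted_control_plane_namespace> -o jsonpath='{.spec.ports[0].nodePort}' <1>

$ sudo firewall-cmd --permanent --add-port=6443/tcp <2>
$ sudo firewall-cmd --reload
----
<1> If you use the `NodePort` publishing strategy, check which node port is assigned to the `kube-apiserver` service.
<2> Open the default API server port. Replace `6443` with the assigned node port, or with your custom port if you changed the default.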

modules/hcp-virt-live-migration.adoc

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-deploy-virt.adoc

:_mod-docs-content-type: CONCEPT
[id="hcp-virt-live-migration_{context}"]
= Live migration for compute nodes

While the management cluster for hosted cluster virtual machines (VMs) is undergoing updates or maintenance, the hosted cluster VMs can be automatically live migrated to prevent disrupting hosted cluster workloads. As a result, the management cluster can be updated without affecting the availability and operation of the KubeVirt platform hosted clusters.

[IMPORTANT]
====
The live migration of KubeVirt VMs is enabled by default provided that the VMs use `ReadWriteMany` (RWX) storage for both the root volume and the storage classes that are mapped to the `kubevirt-csi` CSI provider.
====

You can verify that the VMs in a node pool are capable of live migration by checking the `KubeVirtNodesLiveMigratable` condition in the `status` section of a `NodePool` object.
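For example, you can read the condition directly with a JSONPath query. In this sketch, `<nodepool_name>` is a placeholder and the command assumes that the node pool lives in the default `clusters` namespace; the `message` field explains why VMs are not migratable, as shown in the examples that follow.

[source,terminal]
----
$ oc get nodepool <nodepool_name> -n clusters \
  -o jsonpath='{.status.conditions[?(@.type=="KubeVirtNodesLiveMigratable")].message}'
----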
In the following example, the VMs cannot be live migrated because RWX storage is not used.

.Example configuration where VMs cannot be live migrated
[source,yaml]
----
- lastTransitionTime: "2024-10-08T15:38:19Z"
  message: |
    3 of 3 machines are not live migratable
    Machine user-np-ngst4-gw2hz: DisksNotLiveMigratable: user-np-ngst4-gw2hz is not a live migratable machine: cannot migrate VMI: PVC user-np-ngst4-gw2hz-rhcos is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
    Machine user-np-ngst4-npq7x: DisksNotLiveMigratable: user-np-ngst4-npq7x is not a live migratable machine: cannot migrate VMI: PVC user-np-ngst4-npq7x-rhcos is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
    Machine user-np-ngst4-q5nkb: DisksNotLiveMigratable: user-np-ngst4-q5nkb is not a live migratable machine: cannot migrate VMI: PVC user-np-ngst4-q5nkb-rhcos is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
  observedGeneration: 1
  reason: DisksNotLiveMigratable
  status: "False"
  type: KubeVirtNodesLiveMigratable
----

In the next example, the VMs meet the requirements to be live migrated.

.Example configuration where VMs can be live migrated
[source,yaml]
----
- lastTransitionTime: "2024-10-08T15:38:19Z"
  message: "All is well"
  observedGeneration: 1
  reason: AsExpected
  status: "True"
  type: KubeVirtNodesLiveMigratable
----

While live migration can protect VMs from disruption in normal circumstances, events such as infrastructure node failure can result in a hard restart of any VMs that are hosted on the failed node. For live migration to be successful, the source node that a VM is hosted on must be working correctly.

When the VMs in a node pool cannot be live migrated, workload disruption might occur on the hosted cluster during maintenance on the management cluster. By default, the {hcp} controllers try to drain the workloads that are hosted on KubeVirt VMs that cannot be live migrated before the VMs are stopped. Draining the hosted cluster nodes before stopping the VMs allows pod disruption budgets to protect workload availability within the hosted cluster.
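To benefit from that drain behavior, critical workloads in the hosted cluster need pod disruption budgets. The following is a minimal sketch rather than a recommended configuration: the `my-app` name, namespace, label, and `minAvailable` value are placeholders, and `<hosted_cluster_kubeconfig>` refers to the kubeconfig of the hosted cluster.

[source,terminal]
----
$ oc --kubeconfig <hosted_cluster_kubeconfig> apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
  namespace: my-app
spec:
  minAvailable: 1  # keep at least one replica running while a node is drained
  selector:
    matchLabels:
      app: my-app
EOF
----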

modules/hcp-virt-prereqs.adoc

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-deploy-virt.adoc

:_mod-docs-content-type: CONCEPT
[id="hcp-virt-prereqs_{context}"]
= Prerequisites

You must meet the following prerequisites to create an {product-title} cluster on {VirtProductName}. Example verification commands follow this list:

* You have administrator access to an {product-title} cluster, version 4.14 or later, specified in the `KUBECONFIG` environment variable.
* The {product-title} management cluster has wildcard DNS routes enabled, as shown in the following command:
+
[source,terminal]
----
$ oc patch ingresscontroller -n openshift-ingress-operator default --type=json -p '[{ "op": "add", "path": "/spec/routeAdmission", "value": {wildcardPolicy: "WildcardsAllowed"}}]'
----
* The {product-title} management cluster has {VirtProductName}, version 4.14 or later, installed on it. For more information, see "Installing OpenShift Virtualization using the web console".
* The {product-title} management cluster is on-premise bare metal.
* The {product-title} management cluster is configured with OVNKubernetes as the default pod network CNI.
* The {product-title} management cluster has a default storage class. For more information, see "Postinstallation storage configuration". The following example shows how to set a default storage class:
+
[source,terminal]
----
$ oc patch storageclass ocs-storagecluster-ceph-rbd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
----

* You have a valid pull secret file for the `quay.io/openshift-release-dev` repository. For more information, see "Install OpenShift on any x86_64 platform with user-provisioned infrastructure".
* You have installed the hosted control plane command line interface.
* You have configured a load balancer. For more information, see "Optional: Configuring MetalLB".
* For optimal network performance, you are using a network maximum transmission unit (MTU) of 9000 or greater on the {product-title} cluster that hosts the KubeVirt virtual machines. If you use a lower MTU setting, network latency and the throughput of the hosted pods are affected. Enable multiqueue on node pools only when the MTU is 9000 or greater.

* The {mce-short} has at least one managed {product-title} cluster. The `local-cluster` is automatically imported. For more information about the `local-cluster`, see "Advanced configuration" in the {mce-short} documentation. You can check the status of your hub cluster by running the following command:
+
[source,terminal]
----
$ oc get managedclusters local-cluster
----

* On the {product-title} cluster that hosts the {VirtProductName} virtual machines, you are using a `ReadWriteMany` (RWX) storage class so that live migration can be enabled.
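Before you create the hosted cluster, you can spot-check several of these prerequisites from the management cluster. This sketch covers only the wildcard route policy, the default storage class, and the `local-cluster` import status; it does not replace the full list above.

[source,terminal]
----
$ oc get ingresscontroller default -n openshift-ingress-operator \
  -o jsonpath='{.spec.routeAdmission.wildcardPolicy}' <1>

$ oc get storageclass <2>

$ oc get managedclusters local-cluster <3>
----
<1> The output must be `WildcardsAllowed`.
<2> Confirm that exactly one storage class is marked `(default)` in the output.
<3> Confirm that the `local-cluster` reports `JOINED` and `AVAILABLE` as `True`.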

modules/hcp-virt-reqs.adoc

Lines changed: 1 addition & 56 deletions
@@ -12,59 +12,4 @@ As you prepare to deploy {hcp} on {VirtProductName}, consider the following info
 * Each hosted cluster must have a cluster-wide unique name. A hosted cluster name cannot be the same as any existing managed cluster in order for {mce-short} to manage it.
 * Do not use `clusters` as a hosted cluster name.
 * A hosted cluster cannot be created in the namespace of a {mce-short} managed cluster.
-* When you configure storage for {hcp}, consider the recommended etcd practices. To ensure that you meet the latency requirements, dedicate a fast storage device to all hosted control plane etcd instances that run on each control-plane node. You can use LVM storage to configure a local storage class for hosted etcd pods. For more information, see _Recommended etcd practices_ and _Persistent storage using logical volume manager storage_.
-
-[id="hcp-virt-prereqs_{context}"]
-== Prerequisites
-
-You must meet the following prerequisites to create an {product-title} cluster on {VirtProductName}:
-
-* You need administrator access to an {product-title} cluster, version 4.14 or later, specified by the `KUBECONFIG` environment variable.
-* The {product-title} hosting cluster must have wildcard DNS routes enabled, as shown in the following DNS:
-+
-[source,terminal]
-----
-$ oc patch ingresscontroller -n openshift-ingress-operator default --type=json -p '[{ "op": "add", "path": "/spec/routeAdmission", "value": {wildcardPolicy: "WildcardsAllowed"}}]'
-----
-* The {product-title} hosting cluster must have {VirtProductName}, version 4.14 or later, installed on it. For more information, see xref:../../virt/install/installing-virt.adoc#installing-virt-web[Installing OpenShift Virtualization using the web console].
-* The {product-title} hosting cluster must be configured with OVNKubernetes as the default pod network CNI.
-* The {product-title} hosting cluster must have a default storage class. For more information, see xref:../../post_installation_configuration/post-install-storage-configuration.adoc#post-install-storage-configuration[Postinstallation storage configuration]. The following example shows how to set a default storage class:
-+
-[source,terminal]
-----
-$ oc patch storageclass ocs-storagecluster-ceph-rbd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
-----
-
-* You need a valid pull secret file for the `quay.io/openshift-release-dev` repository. For more information, see link:https://console.redhat.com/openshift/install/platform-agnostic/user-provisioned[Install OpenShift on any x86_64 platform with user-provisioned infrastructure].
-* You need to install the hosted control plane command line interface.
-* Before you can provision your cluster, you need to configure a load balancer. For more information, see _Optional: Configuring MetalLB_.
-* For optimal network performance, use a network maximum transmission unit (MTU) of 9000 or greater on the {product-title} cluster that hosts the KubeVirt virtual machines. If you use a lower MTU setting, network latency and the throughput of the hosted pods are affected. Enable multiqueue on node pools only when the MTU is 9000 or greater.
-
-* The {mce-short} must have at least one managed {product-title} cluster. The `local-cluster` is automatically imported. For more information about the `local-cluster`, see link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html/clusters/cluster_mce_overview#advanced-config-engine[Advanced configuration] in the {mce-short} documentation. You can check the status of your hub cluster by running the following command:
-+
-[source,terminal]
-----
-$ oc get managedclusters local-cluster
-----
-
-[id="hcp-virt-firewall-port_{context}"]
-== Firewall and port requirements
-
-Ensure that you meet the firewall and port requirements so that ports can communicate between the management cluster, the control plane, and hosted clusters:
-
-* The `kube-apiserver` service runs on port 6443 by default and requires ingress access for communication between the control plane components.
-
-** If you use the `NodePort` publishing strategy, ensure that the node port that is assigned to the `kube-apiserver` service is exposed.
-** If you use MetalLB load balancing, allow ingress access to the IP range that is used for load balancer IP addresses.
-
-* If you use the `NodePort` publishing strategy, use a firewall rule for the `ignition-server` and `Oauth-server` settings.
-
-* The `konnectivity` agent, which establishes a reverse tunnel to allow bi-directional communication on the hosted cluster, requires egress access to the cluster API server address on port 6443. With that egress access, the agent can reach the `kube-apiserver` service.
-
-** If the cluster API server address is an internal IP address, allow access from the workload subnets to the IP address on port 6443.
-** If the address is an external IP address, allow egress on port 6443 to that external IP address from the nodes.
-
-* If you change the default port of 6443, adjust the rules to reflect that change.
-* Ensure that you open any ports that are required by the workloads that run in the clusters.
-* Use firewall rules, security groups, or other access controls to restrict access to only required sources. Avoid exposing ports publicly unless necessary.
-* For production deployments, use a load balancer to simplify access through a single IP address.
+* When you configure storage for {hcp}, consider the recommended etcd practices. To ensure that you meet the latency requirements, dedicate a fast storage device to all hosted control plane etcd instances that run on each control-plane node. You can use LVM storage to configure a local storage class for hosted etcd pods. For more information, see "Recommended etcd practices" and "Persistent storage using Logical Volume Manager storage".
