Commit 55731e7

Merge pull request #83599 from xenolinux/hcp-virt-nvidia-gpus

OSDOCS#12121: HCP KubeVirt Nvidia GPU support

2 parents 445d7da + 07636ad
File tree: 3 files changed, +148 -0 lines changed
hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

Lines changed: 4 additions & 0 deletions
@@ -50,3 +50,7 @@ include::modules/hcp-virt-image-caching.adoc[leveloffset=+2]
 * xref:../../virt/virtual_machines/creating_vms_custom/virt-creating-vms-by-cloning-pvcs.adoc#smart-cloning_virt-creating-vms-by-cloning-pvcs[Cloning a data volume using smart-cloning]

 include::modules/hcp-virt-etcd-storage.adoc[leveloffset=+2]
+
+include::modules/hcp-virt-attach-nvidia-gpus.adoc[leveloffset=+1]
+
+include::modules/hcp-virt-attach-nvidia-gpus-np-api.adoc[leveloffset=+1]
modules/hcp-virt-attach-nvidia-gpus-np-api.adoc

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

:_mod-docs-content-type: PROCEDURE
[id="hcp-virt-attach-nvidia-gpus-np-api_{context}"]
= Attaching NVIDIA GPU devices by using the NodePool resource

You can attach one or more NVIDIA graphics processing unit (GPU) devices to node pools by configuring the `nodepool.spec.platform.kubevirt.hostDevices` field in the `NodePool` resource.

:FeatureName: Attaching NVIDIA GPU devices to node pools
include::snippets/technology-preview.adoc[]

.Procedure

* Attach one or more GPU devices to node pools:

** To attach a single GPU device, configure the `NodePool` resource by using the following example configuration:
+
[source,yaml]
----
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  name: <hosted_cluster_name> <1>
  namespace: <hosted_cluster_namespace> <2>
spec:
  arch: amd64
  clusterName: <hosted_cluster_name>
  management:
    autoRepair: false
    upgradeType: Replace
  nodeDrainTimeout: 0s
  nodeVolumeDetachTimeout: 0s
  platform:
    kubevirt:
      attachDefaultNetwork: true
      compute:
        cores: <cpu> <3>
        memory: <memory> <4>
      hostDevices: <5>
      - count: <count> <6>
        deviceName: <gpu_device_name> <7>
      networkInterfaceMultiqueue: Enable
      rootVolume:
        persistent:
          size: 32Gi
        type: Persistent
    type: KubeVirt
  replicas: <worker_node_count> <8>
----
<1> Specify the name of your hosted cluster, for instance, `example`.
<2> Specify the name of the hosted cluster namespace, for example, `clusters`.
<3> Specify a value for CPU, for example, `2`.
<4> Specify a value for memory, for example, `16Gi`.
<5> The `hostDevices` field defines a list of different types of GPU devices that you can attach to node pools.
<6> Specify the number of GPU devices that you want to attach to each virtual machine (VM) in the node pool. For example, if you attach 2 GPU devices to 3 node pool replicas, each of the 3 VMs in the node pool has 2 GPU devices attached. The default count is `1`.
<7> Specify the GPU device name, for example, `nvidia-a100`.
<8> Specify the worker count, for example, `3`.

** To attach multiple GPU devices, configure the `NodePool` resource by using the following example configuration:
+
[source,yaml]
----
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  name: <hosted_cluster_name>
  namespace: <hosted_cluster_namespace>
spec:
  arch: amd64
  clusterName: <hosted_cluster_name>
  management:
    autoRepair: false
    upgradeType: Replace
  nodeDrainTimeout: 0s
  nodeVolumeDetachTimeout: 0s
  platform:
    kubevirt:
      attachDefaultNetwork: true
      compute:
        cores: <cpu>
        memory: <memory>
      hostDevices:
      - count: <count>
        deviceName: <gpu_device_name>
      - count: <count>
        deviceName: <gpu_device_name>
      - count: <count>
        deviceName: <gpu_device_name>
      - count: <count>
        deviceName: <gpu_device_name>
      networkInterfaceMultiqueue: Enable
      rootVolume:
        persistent:
          size: 32Gi
        type: Persistent
    type: KubeVirt
  replicas: <worker_node_count>
----
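+
After you save either configuration, you might apply it and watch the node pool reconcile, as in the following sketch. The `nodepool-gpu.yaml` file name and the `clusters` namespace are assumptions, not part of the documented procedure:
+
[source,terminal]
----
# Apply the NodePool manifest; the file name is an assumed example.
$ oc apply -f nodepool-gpu.yaml

# Watch the node pool until the current node count reaches the desired
# replica count. The "clusters" namespace is an assumed example.
$ oc get nodepool <hosted_cluster_name> -n clusters -w
----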
modules/hcp-virt-attach-nvidia-gpus.adoc

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

:_mod-docs-content-type: PROCEDURE
[id="hcp-virt-attach-nvidia-gpus_{context}"]
= Attaching NVIDIA GPU devices by using the hcp CLI

You can attach one or more NVIDIA graphics processing unit (GPU) devices to node pools by using the `hcp` command-line interface (CLI) in a hosted cluster on {VirtProductName}.

:FeatureName: Attaching NVIDIA GPU devices to node pools
include::snippets/technology-preview.adoc[]

.Prerequisites

* You have exposed the NVIDIA GPU device as a resource on the node where the GPU device resides. For more information, see link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/openshift-virtualization.html[NVIDIA GPU Operator with {VirtProductName}].

* You have exposed the NVIDIA GPU device as an link:https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#extended-resources[extended resource] on the node so that you can assign it to node pools.
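+
To verify this prerequisite, you might inspect the allocatable resources on the node, as in the following sketch. The `<node_name>` placeholder and the `nvidia.com/gpu` resource name are assumptions; the actual resource name depends on how the device is exposed:
+
[source,terminal]
----
# List the node's allocatable resources; an exposed GPU appears as an
# extended resource. The "nvidia.com/gpu" name is an assumption based
# on common NVIDIA GPU Operator defaults; your resource name can differ.
$ oc get node <node_name> -o jsonpath='{.status.allocatable}'
----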
.Procedure

* Attach the GPU device to node pools during cluster creation by running the following command:
+
[source,terminal]
----
$ hcp create cluster kubevirt \
  --name <hosted_cluster_name> \// <1>
  --node-pool-replicas <worker_node_count> \// <2>
  --pull-secret <path_to_pull_secret> \// <3>
  --memory <memory> \// <4>
  --cores <cpu> \// <5>
  --host-device-name="<gpu_device_name>,count:<value>" <6>
----
<1> Specify the name of your hosted cluster, for instance, `example`.
<2> Specify the worker count, for example, `3`.
<3> Specify the path to your pull secret, for example, `/user/name/pullsecret`.
<4> Specify a value for memory, for example, `16Gi`.
<5> Specify a value for CPU, for example, `2`.
<6> Specify the GPU device name and the count, for example, `--host-device-name="nvidia-a100,count:2"`. The `--host-device-name` argument takes the name of the GPU device from the infrastructure node and an optional count that represents the number of GPU devices that you want to attach to each virtual machine (VM) in the node pool. The default count is `1`. For example, if you attach 2 GPU devices to 3 node pool replicas, each of the 3 VMs in the node pool has 2 GPU devices attached.
+
[TIP]
====
You can use the `--host-device-name` argument multiple times to attach multiple devices of different types.
====
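+
For example, repeating the argument to attach two device types might look like the following sketch. The `nvidia-a100` and `nvidia-t4` device names are illustrative assumptions:
+
[source,terminal]
----
# Both device names below are assumed examples; substitute the device
# names that your infrastructure nodes expose.
$ hcp create cluster kubevirt \
  --name <hosted_cluster_name> \
  --node-pool-replicas <worker_node_count> \
  --pull-secret <path_to_pull_secret> \
  --memory <memory> \
  --cores <cpu> \
  --host-device-name="nvidia-a100,count:2" \
  --host-device-name="nvidia-t4,count:1"
----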
