Commit fdc7c6e

Merge pull request #76483 from SNiemann15/cpu_manager_ibmz

[MULTIARCH-4234] Add further steps to cpu manager setup

2 parents fc5ce86 + ab1adc4

1 file changed: 68 additions, 15 deletions

modules/setting-up-cpu-manager.adoc
@@ -7,23 +7,25 @@
 [id="setting_up_cpu_manager_{context}"]
 = Setting up CPU Manager

+To configure CPU manager, create a `KubeletConfig` custom resource (CR) and apply it to the desired set of nodes.
+
 .Procedure

-. Optional: Label a node:
+. Label a node by running the following command:
 +
 [source,terminal]
 ----
 # oc label node perf-node.example.com cpumanager=true
 ----

-. Edit the `MachineConfigPool` of the nodes where CPU Manager should be enabled. In this example, all workers have CPU Manager enabled:
+. To enable CPU Manager for all compute nodes, edit the `MachineConfigPool` CR by running the following command:
 +
 [source,terminal]
 ----
 # oc edit machineconfigpool worker
 ----

-. Add a label to the worker machine config pool:
+. Add the `custom-kubelet: cpumanager-enabled` label to the `metadata.labels` section.
 +
 [source,yaml]
 ----
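
For reference, the labeled `MachineConfigPool` could begin like the following sketch; only the `custom-kubelet: cpumanager-enabled` label comes from this procedure, and the other metadata values are illustrative:

[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker                            # pool edited above with oc edit machineconfigpool worker
  labels:
    custom-kubelet: cpumanager-enabled    # label added in this step
----
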
@@ -55,7 +57,7 @@ spec:
 * `static`. This policy allows containers in guaranteed pods with integer CPU requests. It also limits access to exclusive CPUs on the node. If `static`, you must use a lowercase `s`.
 <2> Optional. Specify the CPU Manager reconcile frequency. The default is `5s`.

-. Create the dynamic kubelet config:
+. Create the dynamic kubelet config by running the following command:
 +
 [source,terminal]
 ----
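
A minimal sketch of the `KubeletConfig` CR that callouts `<1>` and `<2>` describe, assuming the CR name `cpumanager-enabled` and using only the label and fields named elsewhere in this procedure:

[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled                # illustrative name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled  # matches the MachineConfigPool label added earlier
  kubeletConfig:
    cpuManagerPolicy: static              # <1> must use a lowercase "s"
    cpuManagerReconcilePeriod: 5s         # <2> optional; the default is 5s
----

Applying a CR like this, for example with `oc create -f`, is what the step above calls creating the dynamic kubelet config.
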
@@ -64,7 +66,7 @@ spec:
 +
 This adds the CPU Manager feature to the kubelet config and, if needed, the Machine Config Operator (MCO) reboots the node. To enable CPU Manager, a reboot is not needed.

-. Check for the merged kubelet config:
+. Check for the merged kubelet config by running the following command:
 +
 [source,terminal]
 ----
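
One plausible form of this check is to query the generated machine config and inspect its owner references; the machine config name here is purely illustrative:

[source,terminal]
----
# oc get machineconfig 99-worker-generated-kubelet -o json | grep ownerReference -A7   # machine config name is illustrative
----
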
@@ -84,7 +86,7 @@ This adds the CPU Manager feature to the kubelet config and, if needed, the Mach
 ]
 ----

-. Check the worker for the updated `kubelet.conf`:
+. Check the compute node for the updated `kubelet.conf` file by running the following command:
 +
 [source,terminal]
 ----
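
A sketch of one way to inspect the file on the node, reusing the example node labeled in step 1; the `grep` pattern and the `/host` path are assumptions:

[source,terminal]
----
# oc debug node/perf-node.example.com
sh-4.2# grep cpuManager /host/etc/kubernetes/kubelet.conf   # pattern is an assumption
----
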
@@ -101,6 +103,13 @@ cpuManagerReconcilePeriod: 5s <2>
 <1> `cpuManagerPolicy` is defined when you create the `KubeletConfig` CR.
 <2> `cpuManagerReconcilePeriod` is defined when you create the `KubeletConfig` CR.

+. Create a project by running the following command:
++
+[source,terminal]
+----
+$ oc new-project <project_name>
+----
+
 . Create a pod that requests a core or multiple cores. Both limits and requests must have their CPU value set to a whole integer. That is the number of cores that will be dedicated to this pod:
 +
 [source,terminal]
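
A minimal sketch of such a pod, assuming the name `cpumanager` to match `cpumanager-pod.yaml` in the next hunk; the image is illustrative, and requests equal limits so the pod lands in the `Guaranteed` QoS tier:

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: cpumanager            # illustrative; matches cpumanager-pod.yaml below
spec:
  containers:
  - name: cpumanager
    image: registry.example.com/ubi:latest   # illustrative image
    resources:
      requests:
        cpu: 1                # whole integer, equal to the limit
        memory: "1G"
      limits:
        cpu: 1
        memory: "1G"
  nodeSelector:
    cpumanager: "true"        # schedules onto the node labeled in step 1
----
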
@@ -145,7 +154,9 @@ spec:
 # oc create -f cpumanager-pod.yaml
 ----

-. Verify that the pod is scheduled to the node that you labeled:
+.Verification
+
+. Verify that the pod is scheduled to the node that you labeled by running the following command:
 +
 [source,terminal]
 ----
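
A sketch of the verification command, assuming the pod name `cpumanager`:

[source,terminal]
----
# oc describe pod cpumanager   # pod name is an assumption
----

The output should report `QoS Class: Guaranteed` and `Node-Selectors: cpumanager=true`.
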
@@ -172,34 +183,73 @@ QoS Class: Guaranteed
 Node-Selectors: cpumanager=true
 ----

-. Verify that the `cgroups` are set up correctly. Get the process ID (PID) of the `pause` process:
+. Verify that a CPU has been exclusively assigned to the pod by running the following command:
 +
 [source,terminal]
 ----
+# oc describe node --selector='cpumanager=true' | grep -i cpumanager- -B2
+----
++
+.Example output
+[source,terminal]
+----
+NAMESPACE  NAME              CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
+cpuman     cpumanager-mlrrz  1 (28%)       1 (28%)     1G (13%)         1G (13%)       27m
+----
+
+. Verify that the `cgroups` are set up correctly. Get the process ID (PID) of the `pause` process by running the following commands:
++
+[source,terminal]
+----
+# oc debug node/perf-node.example.com
+----
++
+[source,terminal]
+----
+sh-4.2# systemctl status | grep -B5 pause
+----
++
+[NOTE]
+====
+If the output returns multiple pause process entries, you must identify the correct pause process.
+====
++
+.Example output
+[source,terminal]
+----
 # ├─init.scope
 │ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 17
 └─kubepods.slice
   ├─kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice
   │ ├─crio-b5437308f1a574c542bdf08563b865c0345c8f8c0b0a655612c.scope
   │ └─32706 /pause
 ----
+
+. Verify that pods of quality of service (QoS) tier `Guaranteed` are placed within the `kubepods.slice` subdirectory by running the following commands:
 +
-Pods of quality of service (QoS) tier `Guaranteed` are placed within the `kubepods.slice`. Pods of other QoS tiers end up in child `cgroups` of `kubepods`:
+[source,terminal]
+----
+# cd /sys/fs/cgroup/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
+----
 +
 [source,terminal]
 ----
-# cd /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
-# for i in `ls cpuset.cpus tasks` ; do echo -n "$i "; cat $i ; done
+# for i in `ls cpuset.cpus cgroup.procs` ; do echo -n "$i "; cat $i ; done
 ----
 +
+[NOTE]
+====
+Pods of other QoS tiers end up in child `cgroups` of the parent `kubepods`.
+====
++
 .Example output
 [source,terminal]
 ----
 cpuset.cpus 1
-tasks 32706
+cgroup.procs 32706
 ----

-. Check the allowed CPU list for the task:
+. Check the allowed CPU list for the task by running the following command:
 +
 [source,terminal]
 ----
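
A sketch of this check, reusing the `pause` PID `32706` from the earlier output:

[source,terminal]
----
# grep ^Cpus_allowed_list /proc/32706/status   # PID taken from the systemctl output above
----
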
@@ -212,12 +262,15 @@ tasks 32706
 Cpus_allowed_list: 1
 ----

-. Verify that another pod (in this case, the pod in the `burstable` QoS tier) on the system cannot run on the core allocated for the `Guaranteed` pod:
+. Verify that another pod on the system cannot run on the core allocated for the `Guaranteed` pod. For example, to verify the pod in the `besteffort` QoS tier, run the following commands:
++
+[source,terminal]
+----
+# cat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus
+----
 +
 [source,terminal]
 ----
-# cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus
-0
 # oc describe node perf-node.example.com
 ----
 +
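
On a two-CPU node where CPU 1 is dedicated to the `Guaranteed` pod, the `besteffort` pod's `cpuset.cpus` would be expected to contain only the remaining CPU, for example:

.Example output
[source,terminal]
----
0
----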
