
Commit e81808a

TELCODOCS-1785 Improving NROP documentation
Trying to break up the wall of text with level 3 headings
1 parent df70980 commit e81808a

13 files changed: +359 -114 lines

modules/cnf-about-collecting-nro-data.adoc

Lines changed: 1 addition & 1 deletion
@@ -20,5 +20,5 @@ You can use the `oc adm must-gather` CLI command to collect information about yo
+
[source,terminal,subs="attributes+"]
----
- $ oc adm must-gather --image=registry.redhat.io/numaresources-must-gather/numaresources-must-gather-rhel9:{product-version}
+ $ oc adm must-gather --image=registry.redhat.io/numaresources-must-gather/numaresources-must-gather-rhel9:v{product-version}
----
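
For reference (an illustrative usage note, not part of this commit), the collected data can be written to a chosen local directory with the standard `--dest-dir` flag of `oc adm must-gather`; the `must-gather-nrop` directory name below is only a placeholder:

[source,terminal,subs="attributes+"]
----
$ oc adm must-gather --image=registry.redhat.io/numaresources-must-gather/numaresources-must-gather-rhel9:v{product-version} --dest-dir=must-gather-nrop
----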

modules/cnf-about-numa-aware-scheduling.adoc

Lines changed: 30 additions & 3 deletions
@@ -6,13 +6,40 @@
[id="cnf-about-numa-aware-scheduling_{context}"]
= About NUMA-aware scheduling

- Non-Uniform Memory Access (NUMA) is a compute platform architecture that allows different CPUs to access different regions of memory at different speeds. NUMA resource topology refers to the locations of CPUs, memory, and PCI devices relative to each other in the compute node. Co-located resources are said to be in the same _NUMA zone_. For high-performance applications, the cluster needs to process pod workloads in a single NUMA zone.
+ [discrete]
+ [id="introduction-to-numa_{context}"]
+ == Introduction to NUMA

- NUMA architecture allows a CPU with multiple memory controllers to use any available memory across CPU complexes, regardless of where the memory is located. This allows for increased flexibility at the expense of performance. A CPU processing a workload using memory that is outside its NUMA zone is slower than a workload processed in a single NUMA zone. Also, for I/O-constrained workloads, the network interface on a distant NUMA zone slows down how quickly information can reach the application. High-performance workloads, such as telecommunications workloads, cannot operate to specification under these conditions. NUMA-aware scheduling aligns the requested cluster compute resources (CPUs, memory, devices) in the same NUMA zone to process latency-sensitive or high-performance workloads efficiently. NUMA-aware scheduling also improves pod density per compute node for greater resource efficiency.
+ Non-Uniform Memory Access (NUMA) is a compute platform architecture that allows different CPUs to access different regions of memory at different speeds. NUMA resource topology refers to the locations of CPUs, memory, and PCI devices relative to each other in the compute node. Colocated resources are said to be in the same _NUMA zone_. For high-performance applications, the cluster needs to process pod workloads in a single NUMA zone.

+ [discrete]
+ [id="performance-considerations_{context}"]
+ == Performance considerations

+ NUMA architecture allows a CPU with multiple memory controllers to use any available memory across CPU complexes, regardless of where the memory is located. This allows for increased flexibility at the expense of performance. A CPU processing a workload using memory that is outside its NUMA zone is slower than a workload processed in a single NUMA zone. Also, for I/O-constrained workloads, the network interface on a distant NUMA zone slows down how quickly information can reach the application. High-performance workloads, such as telecommunications workloads, cannot operate to specification under these conditions.

+ [discrete]
+ [id="numa-aware-scheduling_{context}"]
+ == NUMA-aware scheduling

+ NUMA-aware scheduling aligns the requested cluster compute resources (CPUs, memory, devices) in the same NUMA zone to process latency-sensitive or high-performance workloads efficiently. NUMA-aware scheduling also improves pod density per compute node for greater resource efficiency.

+ [discrete]
+ [id="integration-with-node-tuning-operator_{context}"]
+ == Integration with Node Tuning Operator

By integrating the Node Tuning Operator's performance profile with NUMA-aware scheduling, you can further configure CPU affinity to optimize performance for latency-sensitive workloads.

- The default {product-title} pod scheduler scheduling logic considers the available resources of the entire compute node, not individual NUMA zones. If the most restrictive resource alignment is requested in the kubelet topology manager, error conditions can occur when admitting the pod to a node. Conversely, if the most restrictive resource alignment is not requested, the pod can be admitted to the node without proper resource alignment, leading to worse or unpredictable performance. For example, runaway pod creation with `Topology Affinity Error` statuses can occur when the pod scheduler makes suboptimal scheduling decisions for guaranteed pod workloads by not knowing if the pod's requested resources are available. Scheduling mismatch decisions can cause indefinite pod startup delays. Also, depending on the cluster state and resource allocation, poor pod scheduling decisions can cause extra load on the cluster because of failed startup attempts.
+ [discrete]
+ [id="default-scheduling-logic_{context}"]
+ == Default scheduling logic

+ The default {product-title} pod scheduler scheduling logic considers the available resources of the entire compute node, not individual NUMA zones. If the most restrictive resource alignment is requested in the kubelet topology manager, error conditions can occur when admitting the pod to a node. Conversely, if the most restrictive resource alignment is not requested, the pod can be admitted to the node without proper resource alignment, leading to worse or unpredictable performance. For example, runaway pod creation with `Topology Affinity Error` statuses can occur when the pod scheduler makes suboptimal scheduling decisions for guaranteed pod workloads without knowing if the pod's requested resources are available. Scheduling mismatch decisions can cause indefinite pod startup delays. Also, depending on the cluster state and resource allocation, poor pod scheduling decisions can cause extra load on the cluster because of failed startup attempts.

+ [discrete]
+ [id="numa-aware-pod-scheduling-diagram_{context}"]
+ == NUMA-aware pod scheduling diagram

The NUMA Resources Operator deploys a custom NUMA resources secondary scheduler and other resources to mitigate against the shortcomings of the default {product-title} pod scheduler. The following diagram provides a high-level overview of NUMA-aware pod scheduling.

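As background for the "Default scheduling logic" section above (an illustrative sketch, not part of this commit): a pod counts as a guaranteed workload only when every container's resource requests equal its limits, as in the following hypothetical spec. The pod name and image are placeholders.

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: numa-guaranteed-pod   # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/sample/app:latest   # placeholder image
    resources:
      requests:
        cpu: "4"
        memory: 4Gi
      limits:
        cpu: "4"       # requests equal limits, so the pod gets the Guaranteed QoS class
        memory: 4Gi
----

With a `single-numa-node` topology manager policy, it is pods like this one whose requested CPUs, memory, and devices must all fit in one NUMA zone for admission to succeed.
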
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

:_module-type: PROCEDURE
[id="cnf-configuring-kubelet-config-nro_{context}"]
= Creating a KubeletConfig CRD

The recommended way to configure a single NUMA node policy is to apply a performance profile. Another way is by creating and applying a `KubeletConfig` custom resource (CR), as shown in the following procedure.

.Procedure

. Create the `KubeletConfig` custom resource (CR) that configures the pod admittance policy for the machine profile:

.. Save the following YAML in the `nro-kubeletconfig.yaml` file:
+
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: worker-tuning
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: "" <1>
  kubeletConfig:
    cpuManagerPolicy: "static" <2>
    cpuManagerReconcilePeriod: "5s"
    reservedSystemCPUs: "0,1" <3>
    memoryManagerPolicy: "Static" <4>
    evictionHard:
      memory.available: "100Mi"
    kubeReserved:
      memory: "512Mi"
    reservedMemory:
      - numaNode: 0
        limits:
          memory: "1124Mi"
    systemReserved:
      memory: "512Mi"
    topologyManagerPolicy: "single-numa-node" <5>
----
<1> Adjust this label to match the `machineConfigPoolSelector` in the `NUMAResourcesOperator` CR.
<2> For `cpuManagerPolicy`, `static` must use a lowercase `s`.
<3> Adjust this based on the CPUs on your nodes.
<4> For `memoryManagerPolicy`, `Static` must use an uppercase `S`.
<5> `topologyManagerPolicy` must be set to `single-numa-node`.

.. Create the `KubeletConfig` CR by running the following command:
+
[source,terminal]
----
$ oc create -f nro-kubeletconfig.yaml
----
+
[NOTE]
====
Applying a performance profile or a `KubeletConfig` CR automatically triggers rebooting of the nodes. If no reboot is triggered, you can troubleshoot the issue by looking at the labels in the `KubeletConfig` that address the node group.
====

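As an illustrative follow-up (not part of this commit), assuming the `worker-tuning` name and the `worker` pool from the example above, you can inspect the applied `KubeletConfig` and the labels on the targeted machine config pool with standard commands such as:

[source,terminal]
----
$ oc describe kubeletconfig worker-tuning
$ oc get machineconfigpool worker --show-labels
----
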
modules/cnf-configuring-node-groups-for-the-numaresourcesoperator.adoc

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ spec:
----
<1> Valid values are `Periodic`, `Events`, `PeriodicAndEvents`. Use `Periodic` to poll the kubelet at intervals that you define in `infoRefreshPeriod`. Use `Events` to poll the kubelet at every pod lifecycle event. Use `PeriodicAndEvents` to enable both methods.
<2> Define the polling interval for `Periodic` or `PeriodicAndEvents` refresh modes. The field is ignored if the refresh mode is `Events`.
- <3> Valid values are `Enabled` or `Disabled`. Setting to `Enabled` is a requirement for the `cacheResyncPeriod` specification in the `NUMAResourcesScheduler`.
+ <3> Valid values are `Enabled`, `Disabled`, and `EnabledExclusiveResources`. Setting to `Enabled` is a requirement for the `cacheResyncPeriod` specification in the `NUMAResourcesScheduler`.

.Verification

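For context (an illustrative sketch, not part of this commit), the callouts above describe fields of the node group `config` stanza in the `NUMAResourcesOperator` CR. Assuming the `worker` machine config pool and the `nodetopology.openshift.io` API group used elsewhere in these modules, the stanza might look like the following; verify the exact API version on your cluster, for example with `oc api-resources`:

[source,yaml]
----
apiVersion: nodetopology.openshift.io/v1   # assumed version; confirm with oc api-resources
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  - machineConfigPoolSelector:
      matchLabels:
        pools.operator.machineconfiguration.openshift.io/worker: ""
    config:
      infoRefreshMode: Periodic     # or Events, or PeriodicAndEvents
      infoRefreshPeriod: 10s        # ignored when infoRefreshMode is Events
      podsFingerprinting: Enabled   # must be Enabled to use cacheResyncPeriod in the NUMAResourcesScheduler CR
----
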
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

:_module-type: PROCEDURE
[id="cnf-configuring-single-numa-policy_{context}"]
= Configuring a single NUMA node policy

The NUMA Resources Operator requires a single NUMA node policy to be configured on the cluster. This can be achieved in two ways: by creating and applying a performance profile, or by configuring a KubeletConfig.

[NOTE]
====
The preferred way to configure a single NUMA node policy is to apply a performance profile. You can use the Performance Profile Creator (PPC) tool to create the performance profile. If a performance profile is created on the cluster, it automatically creates other tuning components like `KubeletConfig` and the `tuned` profile.
====

For more information about creating a performance profile, see "About the Performance Profile Creator" in the "Additional Resources" section.

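As an illustration only (a sketch adapted from the performance profile example that this commit removes from the scheduler deployment module, with CPU ranges that you must adapt to your hardware), a performance profile that enforces a single NUMA node policy might look like this:

[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: perfprof-nrop
spec:
  cpu:
    isolated: "4-51,56-103"           # adjust to the isolated CPUs on your nodes
    reserved: "0,1,2,3,52,53,54,55"   # adjust to the reserved CPUs on your nodes
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  numa:
    topologyPolicy: single-numa-node  # enforces the single NUMA node policy
----
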
modules/cnf-creating-nrop-cr.adoc

Lines changed: 28 additions & 4 deletions
@@ -18,7 +18,7 @@ When you have installed the NUMA Resources Operator, then create the `NUMAResour

. Create the `NUMAResourcesOperator` custom resource:

- .. Save the following YAML in the `nrop.yaml` file:
+ .. Save the following minimal required YAML file example as `nrop.yaml`:
+
[source,yaml]
----
@@ -30,19 +30,26 @@ spec:
  nodeGroups:
  - machineConfigPoolSelector:
      matchLabels:
-         pools.operator.machineconfiguration.openshift.io/worker: ""
+         pools.operator.machineconfiguration.openshift.io/worker: "" <1>
----
+ +
+ <1> This should match the `MachineConfigPool` that you want to configure the NUMA Resources Operator on. For example, you might have created a `MachineConfigPool` named `worker-cnf` that designates a set of nodes expected to run telecommunications workloads.

.. Create the `NUMAResourcesOperator` CR by running the following command:
+
[source,terminal]
----
$ oc create -f nrop.yaml
----
+ +
+ [NOTE]
+ ====
+ Creating the `NUMAResourcesOperator` triggers a reboot on the corresponding machine config pool and therefore the affected node.
+ ====

.Verification

- * Verify that the NUMA Resources Operator deployed successfully by running the following command:
+ . Verify that the NUMA Resources Operator deployed successfully by running the following command:
+
[source,terminal]
----
@@ -53,5 +60,22 @@ $ oc get numaresourcesoperators.nodetopology.openshift.io
[source,terminal]
----
NAME                      AGE
- numaresourcesoperator     10m
+ numaresourcesoperator     27s
+ ----
+
+ . After a few minutes, run the following command to verify that the required resources deployed successfully:
+ +
+ [source,terminal]
+ ----
+ $ oc get all -n openshift-numaresources
+ ----
+ +
+ .Example output
+ [source,terminal]
----
+ NAME                                                 READY   STATUS    RESTARTS   AGE
+ pod/numaresources-controller-manager-7d9d84c58d-qk2mr   1/1     Running   0       12m
+ pod/numaresourcesoperator-worker-7d96r                   2/2     Running   0       97s
+ pod/numaresourcesoperator-worker-crsht                   2/2     Running   0       97s
+ pod/numaresourcesoperator-worker-jp9mw                   2/2     Running   0       97s
+ ----

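As a usage sketch (not part of this commit), because creating the `NUMAResourcesOperator` CR reboots the nodes in the selected machine config pool, you can watch the rollout finish before continuing by checking the pool status with a standard command such as:

[source,terminal]
----
$ oc get machineconfigpools
----
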
modules/cnf-deploying-the-numa-aware-scheduler.adoc

Lines changed: 15 additions & 77 deletions
@@ -8,59 +8,11 @@

After you install the NUMA Resources Operator, do the following to deploy the NUMA-aware secondary pod scheduler:

- * Configure the performance profile.
-
- * Deploy the NUMA-aware secondary scheduler.
-
- .Prerequisites
-
- * Install the OpenShift CLI (`oc`).
-
- * Log in as a user with `cluster-admin` privileges.
-
- * Create the required machine config pool.
-
- * Install the NUMA Resources Operator.
-
.Procedure

- . Create the `PerformanceProfile` custom resource (CR):
-
- .. Save the following YAML in the `nro-perfprof.yaml` file:
- +
- [source,yaml]
- ----
- apiVersion: performance.openshift.io/v2
- kind: PerformanceProfile
- metadata:
-   name: perfprof-nrop
- spec:
-   cpu: <1>
-     isolated: "4-51,56-103"
-     reserved: "0,1,2,3,52,53,54,55"
-   nodeSelector:
-     node-role.kubernetes.io/worker: ""
-   numa:
-     topologyPolicy: single-numa-node
- ----
- <1> The `cpu.isolated` and `cpu.reserved` specifications define ranges for isolated and reserved CPUs. Enter valid values for your CPU configuration. See the _Additional resources_ section for more information about configuring a performance profile.
-
- .. Create the `PerformanceProfile` CR by running the following command:
- +
- [source,terminal]
- ----
- $ oc create -f nro-perfprof.yaml
- ----
- +
- .Example output
- [source,terminal]
- ----
- performanceprofile.performance.openshift.io/perfprof-nrop created
- ----
-
. Create the `NUMAResourcesScheduler` custom resource that deploys the NUMA-aware custom pod scheduler:

- .. Save the following YAML in the `nro-scheduler.yaml` file:
+ .. Save the following minimal required YAML in the `nro-scheduler.yaml` file:
+
[source,yaml,subs="attributes+"]
----
@@ -70,16 +22,7 @@ metadata:
  name: numaresourcesscheduler
spec:
  imageSpec: "registry.redhat.io/openshift4/noderesourcetopology-scheduler-rhel9:v{product-version}"
-   cacheResyncPeriod: "5s" <1>
----
- <1> Enter an interval value in seconds for synchronization of the scheduler cache. A value of `5s` is typical for most implementations.
- +
- [NOTE]
- ====
- * Enable the `cacheResyncPeriod` specification to help the NUMA Resource Operator report more exact resource availability by monitoring pending resources on nodes and synchronizing this information in the scheduler cache at a defined interval. This also helps to minimize `Topology Affinity Error` errors because of sub-optimal scheduling decisions. The lower the interval the greater the network load. The `cacheResyncPeriod` specification is disabled by default.
-
- * Setting a value of `Enabled` for the `podsFingerprinting` specification in the `NUMAResourcesOperator` CR is a requirement for the implementation of the `cacheResyncPeriod` specification.
- ====

.. Create the `NUMAResourcesScheduler` CR by running the following command:
+
@@ -88,16 +31,7 @@ spec:
$ oc create -f nro-scheduler.yaml
----

- .Verification
-
- . Verify that the performance profile was applied by running the following command:
- +
- [source,terminal]
- ----
- $ oc describe performanceprofile <performance-profile-name>
- ----
-
- . Verify that the required resources deployed successfully by running the following command:
+ . After a few seconds, run the following command to confirm the successful deployment of the required resources:
+
[source,terminal]
----
@@ -108,16 +42,20 @@ $ oc get all -n openshift-numaresources
[source,terminal]
----
NAME                                                 READY   STATUS    RESTARTS   AGE
- pod/numaresources-controller-manager-7575848485-bns4s   1/1     Running   0       13m
- pod/numaresourcesoperator-worker-dvj4n                   2/2     Running   0       16m
- pod/numaresourcesoperator-worker-lcg4t                   2/2     Running   0       16m
- pod/secondary-scheduler-56994cf6cf-7qf4q                 1/1     Running   0       16m
+ pod/numaresources-controller-manager-7d9d84c58d-qk2mr   1/1     Running   0       12m
+ pod/numaresourcesoperator-worker-7d96r                   2/2     Running   0       97s
+ pod/numaresourcesoperator-worker-crsht                   2/2     Running   0       97s
+ pod/numaresourcesoperator-worker-jp9mw                   2/2     Running   0       97s
+ pod/secondary-scheduler-847cb74f84-9whlm                 1/1     Running   0       10m
+
NAME                                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
- daemonset.apps/numaresourcesoperator-worker   2         2         2       2            2           node-role.kubernetes.io/worker=   16m
+ daemonset.apps/numaresourcesoperator-worker   3         3         3       3            3           node-role.kubernetes.io/worker=   98s
+
NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
- deployment.apps/numaresources-controller-manager   1/1     1            1           13m
- deployment.apps/secondary-scheduler                1/1     1            1           16m
+ deployment.apps/numaresources-controller-manager   1/1     1            1           12m
+ deployment.apps/secondary-scheduler                1/1     1            1           10m
+
NAME                                                           DESIRED   CURRENT   READY   AGE
- replicaset.apps/numaresources-controller-manager-7575848485   1         1         1       13m
- replicaset.apps/secondary-scheduler-56994cf6cf                1         1         1       16m
+ replicaset.apps/numaresources-controller-manager-7d9d84c58d   1         1         1       12m
+ replicaset.apps/secondary-scheduler-847cb74f84                1         1         1       10m
----
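
As a downstream usage sketch (not part of this commit), workloads opt in to the secondary scheduler by setting `schedulerName` in the pod template. The Deployment name and image below are placeholders, and the scheduler name should be taken from the `NUMAResourcesScheduler` CR status on your cluster; it is commonly `topo-aware-scheduler`:

[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
  name: numa-workload                  # hypothetical name
  namespace: openshift-numaresources
spec:
  replicas: 1
  selector:
    matchLabels:
      app: numa-workload
  template:
    metadata:
      labels:
        app: numa-workload
    spec:
      schedulerName: topo-aware-scheduler   # assumed name; check the NUMAResourcesScheduler CR status
      containers:
      - name: app
        image: registry.example.com/sample/app:latest   # placeholder image
        resources:
          requests:
            cpu: "2"
            memory: 2Gi
          limits:
            cpu: "2"                   # requests equal limits for Guaranteed QoS
            memory: 2Gi
----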

modules/cnf-installing-numa-resources-operator-console.adoc

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ As a cluster administrator, you can install the NUMA Resources Operator using th

.. In the {product-title} web console, click *Operators* -> *OperatorHub*.

- .. Choose *NUMA Resources Operator* from the list of available Operators, and then click *Install*.
+ .. Choose *numaresources-operator* from the list of available Operators, and then click *Install*.

.. In the *Installed Namespaces* field, select the `openshift-numaresources` namespace, and then click *Install*.
