docs/Researcher/scheduling/node-level-scheduler.md

date: 2024-Apr-4
---

The Node Level Scheduler optimizes the performance of your pods and maximizes the utilization of GPUs by making optimal local decisions about GPU allocation to your pods. While the Cluster Scheduler chooses the specific node for a pod, it has no visibility into the internal state of that node's GPUs. The Node Level Scheduler is aware of the local GPU state and makes optimal local decisions, so it can optimize both GPU utilization and the performance of the pods running on the node's GPUs.

Node Level Scheduler applies to all workload types, but it best optimizes the performance of burstable workloads, giving them more GPU memory than requested, up to the specified limit. Be aware that burstable workloads are always susceptible to an OOM Kill signal if the owner of the excess memory requires it back. This means that using the Node Level Scheduler with Inference or Training workloads may cause pod preemption. Interactive workloads that use notebooks behave differently: the OOM Kill signal causes the GPU process to exit but not the notebook, so the Interactive pod keeps running and retries attaching a GPU. This makes Interactive workloads with notebooks a great use case for burstable workloads and the Node Level Scheduler.
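
The request-versus-limit behavior described above mirrors the standard Kubernetes "burstable" pattern for CPU and memory. As a point of comparison only (this is plain Kubernetes YAML, not Run:ai-specific GPU configuration, and all names in it are illustrative), a burstable pod declares a request it is guaranteed and a higher limit it may burst into:

```YAML
apiVersion: v1
kind: Pod
metadata:
  name: burstable-example            # hypothetical name
spec:
  containers:
    - name: notebook
      image: jupyter/base-notebook   # illustrative image only
      resources:
        requests:
          memory: "4Gi"              # guaranteed amount
        limits:
          memory: "8Gi"              # may burst up to this; exceeding it risks an OOM Kill
```

The Node Level Scheduler applies the same idea to GPU memory: a pod may use more than it requested, up to its limit, as long as the owner of the excess memory does not require it back.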

## Interactive Notebooks Use Case

Consider the following example of a node with 2 GPUs and 2 interactive pods that are submitted and want GPU resources.

![multigpu-example](img/multigpu-example.png)

The Scheduler instructs the node to put the two pods on a single GPU, bin packing one GPU and leaving the other free for a workload that might want a full GPU or more than half a GPU. However, that would mean GPU#2 is idle while each of the two notebooks can only use up to half a GPU, even if they temporarily need more.

![multigpu-example2](img/multigpu-example2.png)

However, with Node Level Scheduler enabled, the local decision is to spread those two pods across the two GPUs, maximizing both pods' performance and the GPUs' utilization by letting each pod burst up to the full GPU memory and compute resources.

![multigpu-example3](img/multigpu-example3.png)

The Cluster Scheduler still sees a node with one fully free GPU.

When a 3rd pod is scheduled and requires a full GPU (or more than 0.5 GPU), the Cluster Scheduler sends it to that node, and the Node Level Scheduler moves one of the Interactive workloads to share GPU#1 with the other pod, as the Cluster Scheduler originally planned.

To use the Node Level Scheduler, the Administrator should follow these steps:

1. Enable Node Level Scheduler at the cluster level (per cluster). Edit the `runaiconfig` file and set:

    ```YAML
    spec:
      global:
        core:
          nodeScheduler:
            enabled: true
    ```

    The Administrator can also use this patch command to perform the change:
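
    One possible form of such a patch, shown here only as a sketch (the resource kind `runaiconfig`, resource name `runai`, namespace `runai`, and file name are assumptions, not confirmed by this document), is a merge-patch file applied with `kubectl patch`:

    ```YAML
    # node-scheduler-patch.yaml (hypothetical file name)
    # Apply with, for example:
    #   kubectl patch runaiconfig runai -n runai --type merge --patch-file node-scheduler-patch.yaml
    spec:
      global:
        core:
          nodeScheduler:
            enabled: true
    ```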

2. To enable ‘GPU resource optimization’ for your tenant, go to your tenant’s UI and press *Tools & Settings*, *General*, then open the *Resources* pane and toggle *Resource Optimization* to on.

3. To enable ‘Node Level Scheduler’ on any of the Node Pools where you want to use this feature, go to the tenant’s UI ‘Node Pools’ tab (under ‘Nodes’), and either create a new Node-Pool or edit an existing Node-Pool. In the Node-Pool’s form, under the ‘Resource Utilization Optimization’ tab, change the ‘Number of workloads on each GPU’ to any value other than ‘Not Enforced’ (i.e. 2, 3, 4, 5).

The Node Level Scheduler is now ready to be used on that Node-Pool.