Commit 7ed7540

o-alex, Alexandru Ormenisan, and SirOibaf authored
[HWORKS-1627] Kubernetes Priority Classes & Labels (#421)
Co-authored-by: Alexandru Ormenisan <alex@Alexandrus-MBP.localdomain>
Co-authored-by: Fabio Buso <fabio@hopsworks.ai>
1 parent 9101caa commit 7ed7540

6 files changed

+102
-0
lines changed

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
---
2+
description: Documentation on how to configure Kubernetes scheduling options for Hopsworks workloads.
3+
---
4+
# Scheduler
5+
6+
## Introduction
7+
8+
Hopsworks allows users to configure [Affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/) and [Priority Classes](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) when running workloads on Hopsworks. This includes jobs, Jupyter notebooks, and model deployments.
9+
10+
Hopsworks admins can control which labels and priority classes can be used in the cluster (see the [Cluster configuration](#cluster-configuration) section) and by which project (see the [Default Project configuration](#project-configuration) section).
11+
12+
Within a project, data owners can set defaults for jobs and Jupyter notebooks running within that project (see: [Project defaults](#project-defaults) section).
13+
14+
### Node Labels, Node Affinity and Node Anti-Affinity
15+
16+
Labels in Kubernetes are key-value pairs used to organize and select resources. Hopsworks relies on labels applied to nodes for pod-node affinity to determine where the pod can (or cannot) run.
17+
Some use cases where labels and affinity can be used include:
18+
19+
- Hardware constraints (GPU, SSD)
20+
- Environment separation (prod/dev)
21+
- Co-locating related pods
22+
- Spreading pods for high availability
23+
24+
Hopsworks uses the node affinity `In` operator for the Hopsworks Node Affinity and the `NotIn` operator for the Hopsworks Node Anti Affinity.
25+
26+
For more information on Kubernetes Affinity, you can check the Kubernetes [Affinity documentation](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/) page.
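
As a rough illustration of the two operators, the fragment below sketches the `affinity` stanza of a pod spec. It is only a sketch under stated assumptions: the label keys and values (`hardware=gpu`, `environment=dev`) are hypothetical, and the use of `requiredDuringSchedulingIgnoredDuringExecution` is an assumption, since this page does not state whether Hopsworks generates required or preferred rules.

```yaml
# Fragment of a pod spec (not a complete manifest).
# Label keys and values are hypothetical examples.
affinity:
  nodeAffinity:
    # Assumption: a hard requirement; the rule Hopsworks generates may differ.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            # Affinity: the node must carry hardware=gpu
            - key: hardware
              operator: In
              values: ["gpu"]
            # Anti Affinity: the node must not carry environment=dev
            - key: environment
              operator: NotIn
              values: ["dev"]
```

In Kubernetes, the expressions inside one `matchExpressions` list must all hold for a node to match, so this fragment selects GPU nodes that are not labelled as `dev`.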
27+
28+
### Priority Classes
29+
30+
Priority classes in Kubernetes determine the scheduling and eviction priority of pods.
31+
32+
Pods with higher priority:
33+
34+
- Get scheduled first
35+
- Can preempt (evict) lower priority pods
36+
- Are less likely to be evicted under resource pressure
37+
38+
Common uses:
39+
40+
- Protecting critical workloads
41+
- Ensuring core services stay running
42+
- Managing resource competition
43+
- Guaranteeing QoS for important applications
44+
45+
For more information on Priority Classes, you can check the Kubernetes [Priority Classes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) page.
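
For reference, a priority class is a cluster-scoped Kubernetes object that a cluster administrator creates before it can be selected in Hopsworks. The manifest below is a minimal sketch; the name `hopsworks-high`, the value, and the description are hypothetical and only serve as an illustration.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: hopsworks-high            # hypothetical name
value: 100000                     # higher value means higher priority
globalDefault: false              # do not apply to pods that set no priority class
preemptionPolicy: PreemptLowerPriority   # Kubernetes default; allows preempting lower priority pods
description: "Example priority class for critical workloads."
```

A pod opts in to a priority class by setting `priorityClassName` in its spec to the name of an existing `PriorityClass`.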
46+
47+
## Cluster Configuration
48+
49+
Hopsworks admins can control the affinity labels and priority classes available on the Hopsworks cluster from the `Cluster Settings -> Scheduler` page:
50+
51+
![Cluster Configuration - Node Labels and Priority Classes](../../../assets/images/guides/project/scheduler/admin_cluster_scheduler.png)
52+
53+
A Hopsworks cluster can run within a shared Kubernetes cluster. The first configuration level is to limit the subset of labels and priority classes that can be used within the Hopsworks cluster. This can be done from the `Available in Hopsworks` sub-section.
54+
55+
!!! note "Permissions"

    In order to be able to list all the Kubernetes Node Labels, Hopsworks requires a cluster role with the following rule:

    ```
    - apiGroups: [""]
      resources: ["nodes"]
      verbs: ["get", "list"]
    ```

    In order to be able to list all the Kubernetes Cluster Priority Classes, Hopsworks requires a cluster role with the following rule:

    ```
    - apiGroups: ["scheduling.k8s.io"]
      resources: ["priorityclasses"]
      verbs: ["get", "list"]
    ```
72+
73+
If the roles above are configured properly (default behaviour), admins can only select values from the drop-down menu. If the roles are missing, admins must enter the values as free text and should be careful about typos. Any typo here is propagated to the other configuration levels and to the workloads that use them, leading to errors or misbehaviour when running computation.
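
If the roles are missing, the two rules from the note above could be granted with a manifest along the following lines. This is a sketch, not the manifest shipped with Hopsworks: the role name, the service account name, and the namespace are hypothetical and need to be replaced with the ones used by your installation.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hopsworks-scheduler-reader     # hypothetical name
rules:
  # Allows listing nodes, and therefore their labels
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list"]
  # Allows listing the cluster priority classes
  - apiGroups: ["scheduling.k8s.io"]
    resources: ["priorityclasses"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hopsworks-scheduler-reader     # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: hopsworks-scheduler-reader
subjects:
  - kind: ServiceAccount
    name: hopsworks                    # hypothetical service account
    namespace: hopsworks               # hypothetical namespace
```

A `ClusterRole` is used rather than a namespaced `Role` because nodes and priority classes are cluster-scoped resources.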
74+
75+
## Project Configuration
76+
77+
Hopsworks admins can configure the labels and priority classes that can be used by default within a project. This will be a subset of the ones configured for Hopsworks.
78+
In the figure above, in the `Available in Project` sub-section, Hopsworks admins can configure the labels and priority classes available by default in any Hopsworks project.
79+
80+
Hopsworks admins can also override the default project configuration on a per-project basis. That is, Hopsworks admins can make certain labels and priority classes available only to certain projects. This can be achieved from the `Cluster Settings -> Project -> <ProjectName> -> edit configuration` configuration page:
81+
82+
![Custom Project Configuration - Node Labels and Priority Classes](../../../assets/images/guides/project/scheduler/admin_project_scheduler.png)
83+
84+
## Project defaults
85+
86+
Within a project, different jobs, Jupyter notebooks and model deployments can run with different labels and/or priority classes. `Data Owners` in a project can specify the default values from the project settings:
87+
The default label is used as the default Node Affinity for jobs, notebooks, and model deployments.
88+
89+
![ Project Default - Labels and Priority Classes](../../../assets/images/guides/project/scheduler/project_default.png)
90+
91+
## Configuration of Jobs, Notebooks, and Deployments
92+
93+
In the advanced configuration sections for jobs, notebooks, and model deployments, users can set the affinity, anti affinity, and priority class. The Affinity and Anti Affinity can be selected from the list of allowed labels.
94+
95+
`Affinity` configures on which nodes this pod can run. If a node has any of the labels present in the Affinity option, the pod can be scheduled to run there.
96+
97+
`Anti Affinity` configures on which nodes this pod will not run. If a node has any of the labels present in the Anti Affinity option, the pod will not be scheduled to run there.
98+
99+
`Priority Class` specifies the priority with which a pod will run.
100+
101+
![ Job Configuration - Affinity and Priority Classes](../../../assets/images/guides/project/scheduler/job_configuration.png)
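
To make the three options concrete, the manifest below sketches what a pod with two Affinity labels, one Anti Affinity label, and a priority class might look like in plain Kubernetes terms. It is an illustration only: the pod name, label keys and values, priority class name, and container are hypothetical, and the exact pod spec that Hopsworks generates is not described on this page.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-job                     # hypothetical name
spec:
  priorityClassName: hopsworks-high     # hypothetical; must match an existing PriorityClass
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # assumption: hard requirement
        # nodeSelectorTerms are ORed: a node matching any one term is eligible.
        nodeSelectorTerms:
          # Term 1: Affinity label hardware=gpu, excluding environment=dev
          - matchExpressions:
              - key: hardware
                operator: In
                values: ["gpu"]
              - key: environment
                operator: NotIn
                values: ["dev"]
          # Term 2: Affinity label disk=ssd, excluding environment=dev
          - matchExpressions:
              - key: disk
                operator: In
                values: ["ssd"]
              - key: environment
                operator: NotIn
                values: ["dev"]
  containers:
    - name: main
      image: busybox:1.36
      command: ["sleep", "3600"]
```

Because expressions within a term are ANDed while terms are ORed, the `NotIn` expression is repeated in every term. That is one way to express "any of the Affinity labels, and none of the Anti Affinity labels" with node affinity alone.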

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -154,6 +154,7 @@ nav:
154154
- Run Python Job: user_guides/projects/jobs/python_job.md
155155
- Run Jupyter Notebook Job: user_guides/projects/jobs/notebook_job.md
156156
- Scheduling: user_guides/projects/jobs/schedule_job.md
157+
- Kubernetes Scheduling: user_guides/projects/scheduling/kube_scheduler.md
157158
- Airflow: user_guides/projects/airflow/airflow.md
158159
- OpenSearch:
159160
- Connect: user_guides/projects/opensearch/connect.md
