This repository was archived by the owner on Jan 29, 2025. It is now read-only.
Update files after combining tas-extender and tas-controller
The following files are modified:
- cmd/scheduler-extender/main.go modified to allow tasExtender and
tasController functions to access the same cache.
- deploy/tas-deployment.yaml modified to use only one container.
- pkg/scheduler/scheduler.go fixed for multiple registrations with
ServeMux
- Makefile updated to account for the merge of the two
previous components into one.
- README.md updated to reflect the changes
The following files are removed
- cmd/tas-policy-controller/main.go
- deploy/images/Dockerfile_controller
- pkg/cache/remote.go
- pkg/cache/server.go
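The change to pkg/scheduler/scheduler.go is not shown in this diff. For context, Go's `http.ServeMux` panics when the same pattern is registered twice, which can happen when handlers are attached to a shared mux (such as `http.DefaultServeMux`) more than once. The sketch below shows one common remedy, building a private mux per server. It is not the repository's actual fix: the route paths and the names `newExtenderMux`, `duplicatePanics` and `serve` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newExtenderMux returns a fresh mux on every call, so building it more than
// once never re-registers a pattern on a shared mux. Paths are hypothetical.
func newExtenderMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/scheduler/filter", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "filter")
	})
	mux.HandleFunc("/scheduler/prioritize", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "prioritize")
	})
	return mux
}

// duplicatePanics reports whether registering one pattern twice on the same
// mux panics, the failure mode the commit message appears to refer to.
func duplicatePanics() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	mux := http.NewServeMux()
	noop := func(http.ResponseWriter, *http.Request) {}
	mux.HandleFunc("/scheduler/filter", noop)
	mux.HandleFunc("/scheduler/filter", noop) // second registration panics
	return false
}

// serve sends a GET for path through mux and returns the response body.
func serve(mux *http.ServeMux, path string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println("duplicate registration panics:", duplicatePanics())
	// Two independent muxes can be built and used without conflict.
	fmt.Println(serve(newExtenderMux(), "/scheduler/filter"))
	fmt.Println(serve(newExtenderMux(), "/scheduler/prioritize"))
}
```

A per-server mux also keeps the extender's routes separate from anything else in the same process that registers handlers, which matters more once two former binaries share one.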
README.md (11 additions & 17 deletions)
@@ -7,20 +7,17 @@ For example - a pod that requires certain cache characteristics can be schedule
 **This software is a pre-production alpha version and should not be deployed to production servers.**
 
 
-## Components
-Telemetry Aware Scheduling is made up of two components deployed in a single pod on a Kubernetes Cluster.
+## Introduction
 
-### Telemetry Aware Scheduler Extender
 Telemetry Aware Scheduler Extender is contacted by the generic Kubernetes Scheduler every time it needs to make a scheduling decision.
 The extender checks if there is a telemetry policy associated with the workload.
 If so, it inspects the strategies associated with the policy and returns opinions on pod placement to the generic scheduler.
 The scheduler extender has two strategies it acts on - scheduleonmetric and dontschedule.
 This is implemented and configured as a [Kubernetes Scheduler Extender.](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#cluster-level-extended-resources)
 
-### Telemetry Policy Controller
-The Telemetry Policy Controller consumes TAS Policies - a Custom Resource. The controller parses this policy for deschedule, scheduleonmetric and dontschedule strategies and places them in a cache to make them locally available to all TAS components.
+The Scheduler consumes TAS Policies - a Custom Resource. The extender parses this policy for deschedule, scheduleonmetric and dontschedule strategies and places them in a cache to make them locally available to all TAS components.
 It consumes new Telemetry Policies as they are created, removes them when deleted, and updates them as they are changed.
-The policy controller also monitors the current state of policies to see if they are violated. For example if it notes that a deschedule policy is violated it labels the node as a violator allowing pods relating to that policy to be descheduled.
+The extender also monitors the current state of policies to see if they are violated. For example if it notes that a deschedule policy is violated it labels the node as a violator allowing pods relating to that policy to be descheduled.
 
 ## Usage
 A worked example for TAS is available [here](docs/health-metric-example.md)
@@ -33,7 +30,7 @@ There are three strategies that TAS acts on.
 **2 dontschedule** strategy has multiple rules, each with a metric name and operator and a target. A pod with this policy will never be scheduled on a node breaking any one of these rules.
  - example: **dontschedule** if **gpu_usage** is **GreaterThan 10**
 
-**3 deschedule** is consumed by the Telemetry Policy Controller. If a pod with this policy is running on a node that violates it can be descheduled with the kubernetes descheduler.
+**3 deschedule** is consumed by the extender. If a pod with this policy is running on a node that violates it can be descheduled with the kubernetes descheduler.
  - example: **deschedule** if **network_bandwidth_percent_free** is **LessThan 10**
 
 The policy definition section below describes how to actually create these strategies in a kubernetes cluster.
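To make the rule semantics in this hunk concrete, here is a small Go sketch of how a dontschedule or deschedule strategy might be evaluated against a node's metrics. Only the operator names `GreaterThan` and `LessThan` and the two example metrics come from the README text. The `Rule` type, the `violated` function and the map-based metric lookup are hypothetical and do not reflect the TAS code.

```go
package main

import "fmt"

// Rule is a hypothetical in-memory form of one policy rule.
type Rule struct {
	Metric   string
	Operator string // "GreaterThan" or "LessThan" in this sketch
	Target   int64
}

// broken reports whether a single rule is broken by the observed value.
func (r Rule) broken(value int64) bool {
	switch r.Operator {
	case "GreaterThan":
		return value > r.Target
	case "LessThan":
		return value < r.Target
	default:
		return false // unknown operators are ignored in this sketch
	}
}

// violated applies OR semantics: a single broken rule violates the strategy.
// Metrics missing from the map are skipped, which is an assumption.
func violated(rules []Rule, metrics map[string]int64) bool {
	for _, r := range rules {
		if v, ok := metrics[r.Metric]; ok && r.broken(v) {
			return true
		}
	}
	return false
}

func main() {
	dontschedule := []Rule{{Metric: "gpu_usage", Operator: "GreaterThan", Target: 10}}
	deschedule := []Rule{{Metric: "network_bandwidth_percent_free", Operator: "LessThan", Target: 10}}
	node := map[string]int64{"gpu_usage": 42, "network_bandwidth_percent_free": 55}

	fmt.Println(violated(dontschedule, node)) // true: 42 > 10
	fmt.Println(violated(deschedule, node))   // false: 55 is not < 10
}
```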
@@ -163,28 +160,25 @@ spec:
 There are three strategy types in a policy file and rules associated with each.
 -**scheduleonmetric** has only one rule. It is consumed by the Telemetry Aware Scheduling Extender and prioritizes nodes based on the rule.
 -**dontschedule** strategy has multiple rules, each with a metric name and operator and a target. A pod with this policy will never be scheduled on a node breaking any one of these rules.
--**deschedule** is consumed by the Telemetry Policy Controller. If a pod with this policy is running on a node that violates that pod can be descheduled with the kubernetes descheduler.
+-**deschedule** is consumed by the extender. If a pod with this policy is running on a node that violates that pod can be descheduled with the kubernetes descheduler.
 
 dontschedule and deschedule - which incorporate multiple rules - function with an OR operator. That is if any single rule is broken the strategy is considered violated.
 Telemetry policies are namespaced, meaning that under normal circumstances a workload can only be associated with a pod in the same namespaces.
 
 ### Configuration flags
-The below flags can be passed to the binaries at run time.
+The below flags can be passed to the binary at run time.
 
-#### TAS Policy Controller
+#### TAS Scheduler Extender
 name |type | description| usage | default|
 -----|------|-----|-------|-----|
 |kubeConfig| string |location of kubernetes configuration file | -kubeConfig /root/filename|~/.kube/config
 |syncPeriod|duration string| interval between refresh of telemetry data|-syncPeriod 1m| 1s
 |cachePort | string | port number at which the cache server will listen for requests | --cachePort 9999 | 8111
-
-#### TAS Scheduler Extender
-name |type | description| usage | default|
------|------|-----|-------|-----|
 |syncPeriod|duration string| interval between refresh of telemetry data|-syncPeriod 1m| 1s
 |port| int | port number on which the scheduler extender will listen| -port 32000 | 9001
 |cert| string | location of the cert file for the TLS endpoint | --cert=/root/cert.txt| /etc/kubernetes/pki/ca.crt
 |key| string | location of the key file for the TLS endpoint| --key=/root/key.txt | /etc/kubernetes/pki/ca.key
+|cacert| string | location of the ca certificate for the TLS endpoint| --key=/root/cacert.txt | /etc/kubernetes/pki/ca.crt
 |unsafe| bool | whether or not to listen on a TLS endpoint with the scheduler extender | --unsafe=true| false
 
 ## Linking a workload to a policy
@@ -235,10 +229,10 @@ There are three changes to the demo policy here:
 - Affinity rules which add a requiredDuringSchedulingIgnoredDuringExecution affinity to nodes which are labelled ``<POLICYNAME>=violating`` This is used by the descheduler to identify pods on nodes which break their TAS telemetry policies.
 
 ### Security
-TAS Policy Controller is set up to use in-Cluster config in order to access the Kubernetes API Server. When deployed inside the cluster this along with RBAC controls configured in the installation guide, will give it access to the required resources.
-If outside the cluster TAS Policy Controller will try to use a kubernetes config file in order to get permission to get resources from the API server. This can be passed with the --kubeconfig flag to the controller.
+TAS Scheduler Extender is set up to use in-Cluster config in order to access the Kubernetes API Server. When deployed inside the cluster this along with RBAC controls configured in the installation guide, will give it access to the required resources.
+If outside the cluster TAS will try to use a kubernetes config file in order to get permission to get resources from the API server. This can be passed with the --kubeconfig flag to the binary.
 
-TAS Scheduler Extender contacts api server in the same way as policy controller. An identical flag --kubeConfig can be passed if it's operating outside the cluster.
+When TAS Scheduler Extender contacts api server an identical flag --kubeConfig can be passed if it's operating outside the cluster.
 Additionally TAS Scheduler Extender listens on a TLS endpoint which requires a cert and a key to be supplied.
 These are passed to the executable using command line flags. In the provided deployment these certs are added in a Kubernetes secret which is mounted in the pod and passed as flags to the executable from there.
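As a rough illustration of how a single binary could accept the flags documented in the README, the sketch below declares a subset of them with Go's standard `flag` package. Flag names and defaults are copied from the README table. The `config` struct and `parseFlags` function are hypothetical, and this is not the repository's cmd/scheduler-extender/main.go. The table's usage column for `cacert` reads `--key=/root/cacert.txt`, which looks like a copy from the `key` row, so the sketch assumes the flag is named `cacert`.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"time"
)

// config mirrors part of the README flag table; the struct is hypothetical.
type config struct {
	kubeConfig string
	syncPeriod time.Duration
	port       int
	cert       string
	key        string
	caCert     string
	unsafeMode bool
}

// parseFlags parses args into a config using the defaults from the README.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("tas", flag.ContinueOnError)
	fs.StringVar(&c.kubeConfig, "kubeConfig", "~/.kube/config", "location of kubernetes configuration file")
	fs.DurationVar(&c.syncPeriod, "syncPeriod", time.Second, "interval between refresh of telemetry data")
	fs.IntVar(&c.port, "port", 9001, "port number on which the scheduler extender will listen")
	fs.StringVar(&c.cert, "cert", "/etc/kubernetes/pki/ca.crt", "location of the cert file for the TLS endpoint")
	fs.StringVar(&c.key, "key", "/etc/kubernetes/pki/ca.key", "location of the key file for the TLS endpoint")
	// Assumed flag name; see the note about the README table above.
	fs.StringVar(&c.caCert, "cacert", "/etc/kubernetes/pki/ca.crt", "location of the ca certificate for the TLS endpoint")
	fs.BoolVar(&c.unsafeMode, "unsafe", false, "whether or not to listen on a TLS endpoint")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", c)
}
```

Go's `flag` package accepts both `-name` and `--name`, which is consistent with the README mixing the two forms in its usage column.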