
Commit 8ce04ce

Merge pull request #12980 from mburke5678/scheduler-to-4.0
new assemblies and modules for scheduler
2 parents cb2e864 + 519412d commit 8ce04ce

45 files changed: +3,207 −1 lines changed


_topic_map.yml

Lines changed: 41 additions & 0 deletions
@@ -219,6 +219,47 @@ Name: Nodes
Dir: nodes
Distros: openshift-*
Topics:
- Name: About Pods
  File: nodes-pods-using
- Name: Viewing Pods
  File: nodes-pods-viewing
- Name: Configuring a cluster for Pods
  File: nodes-pods-configuring
- Name: Providing sensitive data to Pods
  File: nodes-pods-secrets
- Name: Using Device Manager to make devices available to nodes
  File: nodes-pods-plugins
- Name: Including pod priority in Pod scheduling decisions
  File: nodes-pods-priority
- Name: Automatically scaling Pods
  File: nodes-pods-autoscaling
- Name: Running background tasks on nodes automatically with daemonsets
  File: nodes-pods-daemonsets
- Name: Disabling features using feature gates
  File: nodes-pods-disabling-features
- Name: Controlling pod placement onto nodes (scheduling)
  Dir: scheduling
  Topics:
  - Name: About pod placement using the scheduler
    File: nodes-scheduler-about
  - Name: Placing pods onto nodes with the default scheduler
    File: nodes-scheduler-default
  - Name: Placing pods relative to other pods using pod affinity/anti-affinity rules
    File: nodes-scheduler-pod-affinity
  - Name: Placing a pod on a specific node by name
    File: nodes-scheduler-node-names
  - Name: Placing a pod in a specific project
    File: nodes-scheduler-node-projects
  - Name: Placing pods onto overcommitted nodes
    File: nodes-scheduler-overcommit
  - Name: Controlling pod placement on nodes using node affinity rules
    File: nodes-scheduler-node-affinity
  - Name: Controlling pod placement using node taints
    File: nodes-scheduler-taints-tolerations
  - Name: Constraining pod placement using node selectors
    File: nodes-scheduler-node-selectors
  - Name: Keeping your cluster balanced using the descheduler
    File: nodes-scheduler-descheduler
- Name: Viewing and listing the nodes in your cluster
  File: nodes-nodes-viewing
- Name: Working with nodes

applications_and_projects/PLACEHOLDER

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
-Please delete this file once you have assemblies here.
+Please leave this file until after the Node PRs merge, as it is needed for the topic_yaml. Subtopics are not allowed, apparently, without at least one topic in the TOC.

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
// Module included in the following assemblies:
//
// * nodes/nodes-scheduler-default.adoc

[id='nodes-scheduler-default-about_{context}']
= Understanding default scheduling in {product-title}

The existing generic scheduler is the default platform-provided scheduler
_engine_ that selects a node to host the pod in a three-step operation:

Filters the Nodes::
The available nodes are filtered based on the constraints or requirements
specified. This is done by running each node through the list of filter
functions called _predicates_.

Prioritizes the Filtered List of Nodes::
This is achieved by passing each node through a series of _priority_ functions
that assign it a score between 0 and 10, with 0 indicating a bad fit and 10
indicating a good fit to host the pod. The scheduler configuration can also take
in a simple _weight_ (a positive numeric value) for each priority function. The
node score provided by each priority function is multiplied by the weight
(the default weight for most priorities is 1) and then combined by adding the scores for each node
provided by all the priorities. This weight attribute can be used by
administrators to give higher importance to some priorities, as the worked
example after this list shows.

Selects the Best Fit Node::
The nodes are sorted based on their scores and the node with the highest score
is selected to host the pod. If multiple nodes have the same high score, then
one of them is selected at random.

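For example (the scores here are hypothetical and chosen only for illustration): if a node
receives a score of 8 from `LeastRequestedPriority` (weight 1) and a score of 5 from the
`Zone` priority (weight 2, as in the default configuration below), its combined score is
(8 x 1) + (5 x 2) = 18, and the node with the highest combined score across all priorities
is chosen to host the pod.
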
[[nodes-scheduler-default-about-understanding]]
== Understanding Scheduler Policy

The selection of the predicates and priorities defines the policy for the scheduler.

The scheduler configuration file is a JSON file that specifies the predicates and priorities the scheduler
will consider.

In the absence of the scheduler policy file, the default configuration file,
*_/etc/origin/master/scheduler.json_*, is applied.

// we are working on how to configure this in 4.0 right now in https://github.com/openshift/api/pull/181

[IMPORTANT]
====
The predicates and priorities defined in
the scheduler configuration file completely override the default scheduler
policy. If any of the default predicates and priorities are required,
you must explicitly specify the functions in the policy configuration.
====

.Default scheduler configuration file
[source,json]
----
{
    "apiVersion": "v1",
    "kind": "Policy",
    "predicates": [
        {
            "name": "NoVolumeZoneConflict"
        },
        {
            "name": "MaxEBSVolumeCount"
        },
        {
            "name": "MaxGCEPDVolumeCount"
        },
        {
            "name": "MaxAzureDiskVolumeCount"
        },
        {
            "name": "MatchInterPodAffinity"
        },
        {
            "name": "NoDiskConflict"
        },
        {
            "name": "GeneralPredicates"
        },
        {
            "name": "PodToleratesNodeTaints"
        },
        {
            "name": "CheckNodeMemoryPressure"
        },
        {
            "name": "CheckNodeDiskPressure"
        },
        {
            "argument": {
                "serviceAffinity": {
                    "labels": [
                        "region"
                    ]
                }
            },
            "name": "Region"
        }
    ],
    "priorities": [
        {
            "name": "SelectorSpreadPriority",
            "weight": 1
        },
        {
            "name": "InterPodAffinityPriority",
            "weight": 1
        },
        {
            "name": "LeastRequestedPriority",
            "weight": 1
        },
        {
            "name": "BalancedResourceAllocation",
            "weight": 1
        },
        {
            "name": "NodePreferAvoidPodsPriority",
            "weight": 10000
        },
        {
            "name": "NodeAffinityPriority",
            "weight": 1
        },
        {
            "name": "TaintTolerationPriority",
            "weight": 1
        },
        {
            "argument": {
                "serviceAntiAffinity": {
                    "label": "zone"
                }
            },
            "name": "Zone",
            "weight": 2
        }
    ]
}
----

[[nodes-scheduler-default-about-use-cases]]
== Scheduler Use Cases

One of the important use cases for scheduling within {product-title} is to
support flexible affinity and anti-affinity policies.
ifdef::openshift-enterprise,openshift-origin[]

[[infrastructure-topological-levels]]
=== Infrastructure Topological Levels

Administrators can define multiple topological levels for their infrastructure
(nodes) by specifying labels on nodes. For example: `region=r1`, `zone=z1`, `rack=s1`.

These label names have no particular meaning and
administrators are free to name their infrastructure levels anything, such as
city/building/room. Also, administrators can define any number of levels
for their infrastructure topology, with three levels usually being adequate
(such as `regions` -> `zones` -> `racks`). Administrators can specify affinity
and anti-affinity rules at each of these levels in any combination.

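The following is a minimal sketch of applying such labels with `oc label`; the node name is a
placeholder, and the label keys and values are only the examples used above:

----
$ oc label node <node_name> region=r1 zone=z1 rack=s1 <1>
$ oc get nodes --show-labels <2>
----
<1> Replace `<node_name>` with the name of one of your nodes.
<2> Lists the nodes along with the labels currently set on them.
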
endif::openshift-enterprise,openshift-origin[]

[[affinity]]
=== Affinity

Administrators can configure the scheduler to specify affinity at
any topological level, or even at multiple levels. Affinity at a particular
level indicates that all pods that belong to the same service are scheduled
onto nodes that belong to the same level. This handles any latency requirements
of applications by allowing administrators to ensure that peer pods do not end
up being too geographically separated. If no node is available within the same
affinity group to host the pod, then the pod is not scheduled.

If you need greater control over where the pods are scheduled, see Using Node Affinity and
Using Pod Affinity and Anti-affinity.

These advanced scheduling features allow administrators
to specify which node a pod can be scheduled on and to force or reject scheduling relative to other pods.

[[anti-affinity]]
=== Anti-Affinity

Administrators can configure the scheduler to specify
anti-affinity at any topological level, or even at multiple levels.
Anti-affinity (or 'spread') at a particular level indicates that all pods that
belong to the same service are spread across nodes that belong to that
level. This ensures that the application is well spread for high availability
purposes. The scheduler tries to balance the service pods across all
applicable nodes as evenly as possible.

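As a quick way to see how the pods of a service are currently spread, you can list them with
their node assignments; the `app=myapp` label selector below is only a hypothetical example, so
substitute a label that actually identifies your service's pods:

----
$ oc get pods -l app=myapp -o wide <1>
----
<1> The `-o wide` output includes the node that each matching pod is running on.
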
If you need greater control over where the pods are scheduled, see Using Node Affinity and
Using Pod Affinity and Anti-affinity.

These advanced scheduling features allow administrators
to specify which node a pod can be scheduled on and to force or reject scheduling relative to other pods.
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
// Module included in the following assemblies:
//
// * nodes/nodes-scheduler-default.adoc

[id='nodes-scheduler-default-modifying_{context}']
= Modifying Scheduler Policies in {product-title}

// this will need complete rewrite for 4.0

You can change the set of predicates and priorities that a scheduler policy uses.

The scheduler policy is defined in a file on the master,
named *_/etc/origin/master/scheduler.json_* by default,
unless overridden by the `kubernetesMasterConfig.schedulerConfigFile`
field in the master configuration file.

.Procedure

To modify the scheduler policy:

. Edit the scheduler configuration file to configure the desired
predicates and priorities.
+
.Sample modified scheduler configuration file
[source,json]
----
{
    "kind": "Policy",
    "apiVersion": "v1",
    "predicates": [
        {
            "name": "PodFitsResources"
        },
        {
            "name": "NoDiskConflict"
        },
        {
            "name": "MatchNodeSelector"
        },
        {
            "name": "HostName"
        },
        {
            "argument": {
                "serviceAffinity": {
                    "labels": [
                        "region"
                    ]
                }
            },
            "name": "Region"
        }
    ],
    "priorities": [
        {
            "name": "LeastRequestedPriority",
            "weight": 1
        },
        {
            "name": "BalancedResourceAllocation",
            "weight": 1
        },
        {
            "name": "ServiceSpreadingPriority",
            "weight": 1
        },
        {
            "argument": {
                "serviceAntiAffinity": {
                    "label": "zone"
                }
            },
            "name": "Zone",
            "weight": 2
        }
    ]
}
----

. Restart the {product-title} master services for the changes to take effect.
+
----
# master-restart api
# master-restart controllers
----
