Commit 37f704f

Merge pull request #13610 from kalexand-rh/autoscaler
draft of autoscaler assembly
2 parents ea1a497 + bf84e1c commit 37f704f

11 files changed

+301
-1
lines changed

_topic_map.yml

Lines changed: 7 additions & 0 deletions
@@ -191,6 +191,13 @@ Topics:
191191
- Name: Pruning objects
192192
  File: pruning-objects
193193
---
194+
Name: Control Plane management
195+
Dir: control-plane-management
196+
Distros: openshift-origin, openshift-enterprise
197+
Topics:
198+
- Name: Applying autoscaling to a cluster
199+
  File: applying-autoscaling
200+
---
194201
Name: Networking
195202
Dir: networking
196203
Distros: openshift-*
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
[id='applying-autoscaling']
2+
= Applying autoscaling to a {product-title} cluster
3+
include::modules/common-attributes.adoc[]
4+
:context: applying-autoscaling
5+
6+
toc::[]
7+
8+
Applying autoscaling to a {product-title} cluster involves deploying a
9+
ClusterAutoscaler and then deploying MachineAutoscalers for each Machine type
10+
in your cluster.
11+
12+
include::modules/cluster-autoscaler-about.adoc[leveloffset=+1]
13+
14+
include::modules/machine-autoscaler-about.adoc[leveloffset=+1]
15+
16+
[id='configuring-clusterautoscaler']
17+
= Configuring the ClusterAutoscaler
18+
19+
First, deploy the ClusterAutoscaler to manage automatic resource scaling in
20+
your {product-title} cluster.
21+
22+
include::modules/cluster-autoscaler-crd.adoc[leveloffset=+2]
23+
24+
:FeatureName: ClusterAutoscaler
25+
include::modules/deploying-resource.adoc[leveloffset=+2]
26+
27+
[id='configuring-machineautoscaler']
28+
= Configuring the MachineAutoscalers
29+
30+
After you deploy the ClusterAutoscaler, you can
31+
deploy MachineAutoscaler resources for each of the machine types in your
32+
cluster to manage deployments of individual machines.
33+
34+
[NOTE]
35+
====
36+
You must configure separate resources for each MachineSet that you want to
37+
autoscale.
38+
====
39+
40+
include::modules/machine-autoscaler-crd.adoc[leveloffset=+2]
41+
42+
:FeatureName: MachineAutoscaler
43+
include::modules/deploying-resource.adoc[leveloffset=+2]
44+
45+
= Additional resources
46+
47+
* For more information about pod priority, see
48+
xref:../nodes/nodes-pods-priority.adoc#nodes-pods-priority[Including pod priority in pod scheduling decisions in {product-title}].
49+

control-plane-management/images

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../images

control-plane-management/modules

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../modules

modules/cluster-autoscaler-about.adoc

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * control-plane-management/applying-autoscaling.adoc
4+
5+
[id='cluster-autoscaler-about-{context}']
6+
= About the ClusterAutoscaler
7+
8+
The ClusterAutoscaler adjusts the size of an {product-title} cluster to meet
9+
its current deployment needs. It uses declarative, Kubernetes-style arguments to
10+
provide infrastructure management that does not rely on objects of a specific
11+
cloud provider.
12+
13+
The ClusterAutoscaler increases the size of the cluster when there are pods
14+
that failed to schedule on any of the current nodes due to insufficient
15+
resources or when another node is necessary to meet deployment needs. The
16+
ClusterAutoscaler does not increase the cluster resources beyond the limits
17+
that you specify.
18+
19+
The ClusterAutoscaler decreases the size of the cluster when some nodes are
20+
consistently not needed for a significant period, such as when a node has low
21+
resource use and all of its important pods can fit on other nodes.
22+
23+
If the following types of pods are present on a node, the ClusterAutoscaler
24+
will not remove the node:
25+
26+
* Pods with restrictive PodDisruptionBudgets (PDBs).
27+
* Kube-system pods that do not run on the node by default.
28+
* Kube-system pods that do not have a PDB or have a PDB that is too restrictive.
29+
* Pods that are not backed by a controller object such as a Deployment,
30+
ReplicaSet, or StatefulSet.
31+
* Pods with local storage.
32+
* Pods that cannot be moved elsewhere because of a lack of resources,
33+
incompatible node selectors or affinity, matching anti-affinity, and so on.
34+
* Pods that have a
35+
`"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"`
36+
annotation.
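
The `safe-to-evict` annotation mentioned in the list above is set in the pod metadata. The following is a minimal sketch, not taken from this commit: the pod name, image, and volume are hypothetical. It assumes the upstream Cluster Autoscaler behavior in which a value of `"true"` allows the node to be removed even though the pod uses local storage, while `"false"` blocks removal.

[source,yaml]
----
# Hypothetical pod; names and image are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: cache-worker
  annotations:
    # "true" marks the pod as safe to evict; "false" would block node removal.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
  containers:
  - name: cache-worker
    image: registry.example.com/cache-worker:latest
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir: {}   # local storage that would otherwise block scale down
----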
37+
38+
If you configure the ClusterAutoscaler, additional usage restrictions apply:
39+
40+
* Do not modify the nodes that are in autoscaled node groups directly. All nodes
41+
within the same node group have the same capacity and labels and run the same
42+
system pods.
43+
* Specify requests for your pods.
44+
* If you need to prevent pods from being deleted too quickly, configure
45+
appropriate PDBs.
46+
* Confirm that your cloud provider quota is large enough to support the
47+
maximum node pools that you configure.
48+
* Do not run additional node group autoscalers, especially the ones offered by
49+
your cloud provider.
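
Two of the restrictions above, specifying pod requests and configuring PDBs, can be illustrated with a short sketch. All names, the image, and the values are hypothetical. The example assumes a cluster that serves `policy/v1`; older clusters might serve the PodDisruptionBudget as `policy/v1beta1` instead.

[source,yaml]
----
# Hypothetical workload; names, image, and values are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: registry.example.com/example-app:latest
        resources:
          requests:        # the autoscaler sizes the cluster from requests
            cpu: 250m
            memory: 256Mi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-app-pdb
spec:
  minAvailable: 2          # keep at least two replicas running during evictions
  selector:
    matchLabels:
      app: example-app
----

With three replicas and `minAvailable: 2`, at most one pod can be evicted at a time, so a node that hosts one replica can still be drained.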
50+
51+
52+
The Horizontal Pod Autoscaler (HPA) and the ClusterAutoscaler modify cluster
53+
resources in different ways. The HPA changes the deployment's or ReplicaSet's
54+
number of replicas based on the current CPU load.
55+
If the load increases, the HPA creates new replicas, regardless of the amount
56+
of resources available to the cluster.
57+
If there are not enough resources, the ClusterAutoscaler adds resources so that
58+
the HPA-created pods can run.
59+
If the load decreases, the HPA stops some replicas. If this action causes some
60+
nodes to be underutilized or completely empty, the ClusterAutoscaler deletes
61+
the unnecessary nodes.
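
As a rough illustration of the HPA side of this interaction, the following sketch uses the standard `autoscaling/v1` API. The target Deployment name and the replica and CPU values are hypothetical and are not part of this commit.

[source,yaml]
----
# Hypothetical HPA; the target name and thresholds are illustrative only.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 75   # add replicas when average CPU exceeds 75%
----

If the replicas that this HPA creates cannot be scheduled, the ClusterAutoscaler can add nodes, up to the limits that you configure.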
62+
63+
64+
The ClusterAutoscaler takes pod priorities into account. The Pod Priority and
65+
Preemption feature enables scheduling pods based on priorities if the cluster
66+
does not have enough resources, but the ClusterAutoscaler ensures that the
67+
cluster has resources to run all pods. To honor the intention of both features,
68+
the ClusterAutoscaler includes a priority cutoff. You can use this cutoff to
69+
schedule "best-effort" pods, which do not cause the ClusterAutoscaler to
70+
increase resources but instead run only when spare resources are available.
71+
72+
Pods with priority lower than the cutoff value do not cause the cluster to scale
73+
up or prevent the cluster from scaling down. No new nodes are added to run the
74+
pods, and nodes running these pods might be deleted to free resources.
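
The cutoff behavior can be sketched with a PriorityClass whose value is below the threshold. The example assumes a cutoff of `-10`, matching the sample `podPriorityThreshold` value used elsewhere in this commit. The class name, pod name, and image are hypothetical, and older clusters might serve PriorityClass as `scheduling.k8s.io/v1beta1`.

[source,yaml]
----
# Hypothetical "best-effort" priority class; names and values are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: best-effort-batch
value: -20                 # lower than the assumed cutoff of -10
globalDefault: false
description: "Pods that run only when spare capacity is available."
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-job
spec:
  priorityClassName: best-effort-batch
  containers:
  - name: batch-job
    image: registry.example.com/batch-job:latest
----

A pending `batch-job` pod does not cause a scale up, and a node that runs only such pods remains a candidate for removal.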
75+
76+
////
77+
Default priority cutoff is 0. It can be changed using `--expendable-pods-priority-cutoff` flag,
78+
but we discourage it.
79+
ClusterAutoscaler also doesn't trigger scale-up if an unschedulable pod is already waiting for a lower
80+
priority pod preemption.
81+
////
82+

modules/cluster-autoscaler-crd.adoc

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * control-plane-management/applying-autoscaling.adoc
4+
5+
[id='cluster-autoscaler-crd-{context}']
6+
= ClusterAutoscaler resource definition
7+
8+
This `ClusterAutoscaler` resource definition shows the parameters and sample
9+
values for the ClusterAutoscaler.
10+
11+
12+
[source,yaml]
13+
----
14+
apiVersion: "autoscaling.openshift.io/v1alpha1"
15+
kind: "ClusterAutoscaler"
16+
metadata:
17+
  name: "default"
18+
spec:
19+
  podPriorityThreshold: -10 <1>
20+
  resourceLimits:
21+
    maxNodesTotal: 24 <2>
22+
    cores:
23+
      min: 8 <3>
24+
      max: 128 <4>
25+
    memory:
26+
      min: 4 <5>
27+
      max: 256 <6>
28+
    gpus:
29+
      - type: nvidia.com/gpu <7>
30+
        min: 0 <8>
31+
        max: 16 <9>
32+
      - type: amd.com/gpu <7>
33+
        min: 0 <8>
34+
        max: 4 <9>
35+
  scaleDown:
36+
    enabled: true <10>
37+
    delayAfterAdd: 10s <11>
38+
    delayAfterDelete: 10s <12>
39+
    delayAfterFailure: 10s <13>
40+
    unneededTime: 10s <14>
41+
----
42+
<1> Specify the priority that a pod must exceed to cause the ClusterAutoscaler
43+
to deploy additional nodes. Enter a 32-bit integer value. The
44+
`podPriorityThreshold` value is compared to the value of the `PriorityClass` that
45+
you assign to each pod.
46+
<2> Specify the maximum number of nodes to deploy.
47+
<3> Specify the minimum number of cores to deploy.
48+
<4> Specify the maximum number of cores to deploy.
49+
<5> Specify the minimum amount of memory, in GiB, per node.
50+
<6> Specify the maximum amount of memory, in GiB, per node.
51+
<7> Specify the type of GPU node to deploy. Only `nvidia.com/gpu` and `amd.com/gpu`
52+
are valid types.
53+
<8> Specify the minimum number of GPU cores to deploy.
54+
<9> Specify the maximum number of GPU cores to deploy.
55+
<10> Specify whether the ClusterAutoscaler can remove unnecessary nodes.
56+
<11> Specify the period, in seconds, to wait after a node is added before scale down resumes.
57+
<12> Specify the period, in seconds, to wait after a node is deleted before scale down resumes.
58+
<13> Specify the period, in seconds, to wait after a failed scale down before
59+
scale down resumes.
60+
<14> Specify the period, in seconds, that a node must be unneeded before it is eligible for deletion.
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * control-plane-management/applying-autoscaling.adoc
4+
5+
[id='cluster-autoscaler-deploying-{context}']
6+
= Deploying the ClusterAutoscaler
7+
8+
To deploy the ClusterAutoscaler, you create an instance of the `ClusterAutoscaler`
9+
resource.
10+
11+
.Procedure
12+
13+
. Create a YAML file named `default.yaml` that contains your customized
14+
`ClusterAutoscaler` resource definition.
15+
16+
. Create the resource in the cluster:
17+
+
18+
[source,bash]
19+
----
20+
$ oc create -f default.yaml
21+
----

modules/deploying-resource.adoc

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
1+
// Be sure to set the :FeatureName: value in each assembly on the line before
2+
// the include statement for this module. For example, to set the FeatureName
3+
// value to "ClusterAutoscaler", add the following line to the assembly:
4+
// :FeatureName: ClusterAutoscaler
5+
// Module included in the following assemblies:
6+
//
7+
// * control-plane-management/applying-autoscaling.adoc
8+
9+
10+
11+
[id='{FeatureName}-deploying-{context}']
12+
= Deploying the {FeatureName}
13+
14+
To deploy the {FeatureName}, you create an instance of the `{FeatureName}`
15+
resource.
16+
17+
.Procedure
18+
19+
. Create a YAML file for the `{FeatureName}` resource that contains the
20+
customized resource definition.
21+
22+
. Create the resource in the cluster:
23+
+
24+
[source,bash]
25+
----
26+
$ oc create -f <filename>.yaml <1>
27+
----
28+
<1> `<filename>` is the name of the resource file that you customized.
29+
30+
// Undefine {FeatureName} attribute, so that any mistakes are easily spotted
31+
:!FeatureName:

modules/machine-api-overview.adoc

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ available by the `ClusterAutoscalerOperator`.
3939
`MachineHealthChecker`:: This resource detects when a machine is unhealthy,
4040
deletes it, and, on supported platforms, makes a new machine.
4141
`ClusterAutoscaler`:: This resource is based on the upstream
42-
link:https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler[Cluster Autoscaler]
42+
link:https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler[ClusterAutoscaler]
4343
project. In the {product-title} implementation, it is integrated with the
4444
Cluster API by extending the `MachineSet` API.
4545
`ClusterAutoscalerOperator`:: Instead of interacting with the `ClusterAutoscaler`

modules/machine-autoscaler-about.adoc

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * control-plane-management/applying-autoscaling.adoc
4+
5+
[id='machine-autoscaler-about-{context}']
6+
= About the MachineAutoscaler
7+
8+
The MachineAutoscaler adjusts the number of Machines in the MachineSets that you
9+
deploy in a {product-title} cluster. You can scale both the default `worker`
10+
MachineSet and any other MachineSets that you create. The MachineAutoscaler
11+
makes more Machines when the cluster runs out of resources to support more
12+
deployments.
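
For orientation, a MachineAutoscaler that targets a single MachineSet might look like the following sketch. It is not taken from this commit, which does not show the `machine-autoscaler-crd.adoc` module. The API versions and field names follow later published versions of the resource and are assumptions here. The resource name, MachineSet name, and replica counts are hypothetical.

[source,yaml]
----
# Sketch only: API versions and field names are assumptions based on later
# published versions of the resource; names and counts are hypothetical.
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: worker-us-east-1a
  namespace: openshift-machine-api
spec:
  minReplicas: 1           # lower bound on Machines in the MachineSet
  maxReplicas: 12          # upper bound on Machines in the MachineSet
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: worker-us-east-1a
----

Each MachineSet that you want to autoscale needs its own resource of this kind, with `scaleTargetRef` pointing at that MachineSet.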
