Commit 9d963e2

Merge pull request #74917 from jeana-redhat/OSDOCS-9800-CAS-expanders
OSDOCS-9800: CAS expanders
2 parents 3a907dc + 98ac288 commit 9d963e2

3 files changed, 142 additions and 24 deletions

machine_management/applying-autoscaling.adoc

Lines changed: 6 additions & 4 deletions
@@ -16,7 +16,7 @@ You can configure the cluster autoscaler only in clusters where the Machine API
 include::modules/cluster-autoscaler-about.adoc[leveloffset=+1]
 
 [id="configuring-clusterautoscaler_{context}"]
-== Configuring the cluster autoscaler
+=== Configuring the cluster autoscaler
 
 First, deploy the cluster autoscaler to manage automatic resource scaling in your {product-title} cluster.
 
@@ -25,7 +25,9 @@ First, deploy the cluster autoscaler to manage automatic resource scaling in you
 Because the cluster autoscaler is scoped to the entire cluster, you can make only one cluster autoscaler for the cluster.
 ====
 
-include::modules/cluster-autoscaler-cr.adoc[leveloffset=+2]
+include::modules/cluster-autoscaler-cr.adoc[leveloffset=+3]
+
+include::modules/cluster-autoscaler-config-priority-expander.adoc[leveloffset=+3]
 
 :FeatureName: cluster autoscaler
 :FeatureResourceName: ClusterAutoscaler
@@ -36,7 +38,7 @@ include::modules/deploying-resource.adoc[leveloffset=+2]
 include::modules/machine-autoscaler-about.adoc[leveloffset=+1]
 
 [id="configuring-machineautoscaler_{context}"]
-== Configuring machine autoscalers
+=== Configuring machine autoscalers
 
 After you deploy the cluster autoscaler, deploy `MachineAutoscaler` resources that reference the compute machine sets that are used to scale the cluster.
 
@@ -50,7 +52,7 @@ You must deploy at least one `MachineAutoscaler` resource after you deploy the `
 You must configure separate resources for each compute machine set. Remember that compute machine sets are different in each region, so consider whether you want to enable machine scaling in multiple regions. The compute machine set that you scale must have at least one machine in it.
 ====
 
-include::modules/machine-autoscaler-cr.adoc[leveloffset=+2]
+include::modules/machine-autoscaler-cr.adoc[leveloffset=+3]
 
 :FeatureName: machine autoscaler
 :FeatureResourceName: MachineAutoscaler

modules/cluster-autoscaler-config-priority-expander.adoc

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+// Module included in the following assemblies:
+//
+// * machine_management/applying-autoscaling.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="cluster-autoscaler-config-priority-expander_{context}"]
+= Configuring a priority expander for the cluster autoscaler
+
+When the cluster autoscaler uses the priority expander, it scales up by using the machine set with the highest user-assigned priority.
+To use this expander, you must create a config map that defines the priority of your machine sets.
+
+For each specified priority level, you must create regular expressions to identify machine sets that you want to use when prioritizing a machine set for selection.
+The regular expressions must match the name of any compute machine set that you want the cluster autoscaler to consider for selection.
+
+.Prerequisites
+
+* You have deployed an {product-title} cluster that uses the Machine API.
+* You have access to the cluster using an account with `cluster-admin` permissions.
+* You have installed the {oc-first}.
+
+.Procedure
+
+. List the compute machine sets on your cluster by running the following command:
++
+[source,terminal]
+----
+$ oc get machinesets.machine.openshift.io
+----
++
+.Example output
+[source,terminal]
+----
+NAME                                        DESIRED   CURRENT   READY   AVAILABLE   AGE
+archive-agl030519-vplxk-worker-us-east-1c   1         1         1       1           25m
+fast-01-agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
+fast-02-agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
+fast-03-agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
+fast-04-agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
+prod-01-agl030519-vplxk-worker-us-east-1a   1         1         1       1           33m
+prod-02-agl030519-vplxk-worker-us-east-1c   1         1         1       1           33m
+----
+
+. Using regular expressions, construct one or more patterns that match the name of any compute machine set that you want to set a priority level for.
++
+For example, use the regular expression pattern `\*fast*` to match any compute machine set that includes the string `fast` in its name.
+
+. Create a `cluster-autoscaler-priority-expander.yml` YAML file that defines a config map similar to the following:
++
+--
+.Example priority expander config map
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: cluster-autoscaler-priority-expander # <1>
+  namespace: openshift-machine-api # <2>
+data:
+  priorities: |- # <3>
+    10:
+      - *fast*
+      - *archive*
+    40:
+      - *prod*
+----
+<1> You must name config map `cluster-autoscaler-priority-expander`.
+<2> You must create the config map in the same namespace as cluster autoscaler pod, which is the `openshift-machine-api` namespace.
+<3> Define the priority of your machine sets.
++
+The `priorities` values must be positive integers.
+The cluster autoscaler uses higher-value priorities before lower-value priorities.
++
+For each priority level, specify the regular expressions that correspond to the machine sets you want to use.
+--
+
+. Create the config map by running the following command:
++
+[source,terminal]
+----
+$ oc create configmap cluster-autoscaler-priority-expander \
+  --from-file=<location_of_config_map_file>/cluster-autoscaler-priority-expander.yml
+----
+
+.Verification
+
+* Review the config map by running the following command:
++
+[source,terminal]
+----
+$ oc get configmaps cluster-autoscaler-priority-expander -o yaml
+----
+
+.Next steps
+
+* To use the priority expander, ensure that the `ClusterAutoscaler` resource definition is configured to use the `expanders: ["Priority"]` parameter.
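
The selection rule that this module describes (higher-value priorities are used before lower-value ones, and a machine set belongs to a priority level when one of that level's regular expressions matches its name) can be sketched in Python. This is an illustration only, not the cluster autoscaler's implementation. The `PRIORITIES` dictionary and the `highest_priority_matches` function are hypothetical names, and the patterns are written as `.*fast.*` rather than `*fast*`, on the assumption that standard regular expression syntax is intended, because Python's `re` module rejects a pattern that begins with `*`.

```python
import re

# Hypothetical parsed form of the config map's `priorities` data:
# priority level -> list of regular expression patterns (assumed syntax).
PRIORITIES = {
    10: [r".*fast.*", r".*archive.*"],
    40: [r".*prod.*"],
}


def highest_priority_matches(machine_sets, priorities):
    """Return (priority, names) for the highest level that matches any name."""
    # Walk the priority levels from the highest value to the lowest.
    for priority in sorted(priorities, reverse=True):
        patterns = [re.compile(p) for p in priorities[priority]]
        matched = [
            name for name in machine_sets
            if any(p.search(name) for p in patterns)
        ]
        if matched:
            return priority, matched
    # No pattern at any level matched a machine set name.
    return None, []


machine_sets = [
    "archive-agl030519-vplxk-worker-us-east-1c",
    "fast-01-agl030519-vplxk-worker-us-east-1a",
    "prod-01-agl030519-vplxk-worker-us-east-1a",
    "prod-02-agl030519-vplxk-worker-us-east-1c",
]
priority, chosen = highest_priority_matches(machine_sets, PRIORITIES)
print(priority, chosen)
```

With the sample names from the example output, the two `prod` machine sets are returned because priority `40` outranks priority `10`.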

modules/cluster-autoscaler-cr.adoc

Lines changed: 41 additions & 20 deletions
@@ -9,38 +9,38 @@
 
 This `ClusterAutoscaler` resource definition shows the parameters and sample values for the cluster autoscaler.
 
-
 [source,yaml]
 ----
 apiVersion: "autoscaling.openshift.io/v1"
 kind: "ClusterAutoscaler"
 metadata:
   name: "default"
 spec:
-  podPriorityThreshold: -10 <1>
+  podPriorityThreshold: -10 # <1>
   resourceLimits:
-    maxNodesTotal: 24 <2>
+    maxNodesTotal: 24 # <2>
     cores:
-      min: 8 <3>
-      max: 128 <4>
+      min: 8 # <3>
+      max: 128 # <4>
     memory:
-      min: 4 <5>
-      max: 256 <6>
+      min: 4 # <5>
+      max: 256 # <6>
     gpus:
-      - type: nvidia.com/gpu <7>
-        min: 0 <8>
-        max: 16 <9>
+      - type: nvidia.com/gpu # <7>
+        min: 0 # <8>
+        max: 16 # <9>
       - type: amd.com/gpu
         min: 0
         max: 4
-  logVerbosity: 4 <10>
-  scaleDown: <11>
-    enabled: true <12>
-    delayAfterAdd: 10m <13>
-    delayAfterDelete: 5m <14>
-    delayAfterFailure: 30s <15>
-    unneededTime: 5m <16>
-    utilizationThreshold: "0.4" <17>
+  logVerbosity: 4 # <10>
+  scaleDown: # <11>
+    enabled: true # <12>
+    delayAfterAdd: 10m # <13>
+    delayAfterDelete: 5m # <14>
+    delayAfterFailure: 30s # <15>
+    unneededTime: 5m # <16>
+    utilizationThreshold: "0.4" # <17>
+  expanders: ["Random"] # <18>
 ----
 <1> Specify the priority that a pod must exceed to cause the cluster autoscaler to deploy additional nodes. Enter a 32-bit integer value. The `podPriorityThreshold` value is compared to the value of the `PriorityClass` that you assign to each pod.
 <2> Specify the maximum number of nodes to deploy. This value is the total number of machines that are deployed in your cluster, not just the ones that the autoscaler controls. Ensure that this value is large enough to account for all of your control plane and compute machines and the total number of replicas that you specify in your `MachineAutoscaler` resources.
@@ -66,8 +66,29 @@ If you do not specify a value, the default value of `1` is used.
 <14> Optional: Specify the period to wait before deleting a node after a node has recently been _deleted_. If you do not specify a value, the default value of `0s` is used.
 <15> Optional: Specify the period to wait before deleting a node after a scale down failure occurred. If you do not specify a value, the default value of `3m` is used.
 <16> Optional: Specify a period of time before an unnecessary node is eligible for deletion. If you do not specify a value, the default value of `10m` is used.
-<17> Optional: Specify the _node utilization level_. Nodes below this utilization level are eligible for deletion. If you do not specify a value, the default value of `10m` is used.. The node utilization level is the sum of the requested resources divided by the allocated resources for the node, and must be a value greater than `"0"` but less than `"1"`. If you do not specify a value, the cluster autoscaler uses a default value of `"0.5"`, which corresponds to 50% utilization. This value must be expressed as a string.
-// Might be able to add a formula to show this visually, but need to look into asciidoc math formatting and what our tooling supports.
+<17> Optional: Specify the _node utilization level_. Nodes below this utilization level are eligible for deletion. If you do not specify a value, the default value of `10m` is used.
++
+The node utilization level is the sum of the requested resources divided by the allocated resources for the node, and must be a value greater than `"0"` but less than `"1"`. If you do not specify a value, the cluster autoscaler uses a default value of `"0.5"`, which corresponds to 50% utilization. You must express this value as a string.
+<18> Optional: Specify any expanders that you want the cluster autoscaler to use.
+The following values are valid:
++
+--
+* `LeastWaste`: Selects the machine set that minimizes the idle CPU after scaling.
+If multiple machine sets would yield the same amount of idle CPU, the selection minimizes unused memory.
+* `Priority`: Selects the machine set with the highest user-assigned priority.
+To use this expander, you must create a config map that defines the priority of your machine sets.
+For more information, see "Configuring a priority expander for the cluster autoscaler."
+* `Random`: (Default) Selects the machine set randomly.
+--
++
+If you do not specify a value, the default value of `Random` is used.
++
+You can specify multiple expanders by using the `[LeastWaste, Priority]` format.
+The cluster autoscaler applies each expander according to the specified order.
++
+In the `[LeastWaste, Priority]` example, the cluster autoscaler first evaluates according to the `LeastWaste` criteria.
+If more than one machine set satisfies the `LeastWaste` criteria equally well, the cluster autoscaler then evaluates according to the `Priority` criteria.
+If more than one machine set satisfies all of the specified expanders equally well, the cluster autoscaler selects one to use at random.
 
 [NOTE]
 ====
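
The ordered evaluation described in callout 18 can be read as a chain of filters: each expander narrows the candidate machine sets to those that satisfy it equally well, and a random choice breaks any tie that remains. The following Python sketch models that reading under stated assumptions. It is not the cluster autoscaler's code, and the `idle_cpu`, `unused_memory`, and `priority` fields, along with all function names, are hypothetical.

```python
import random


def least_waste(candidates):
    """Keep candidates with the least idle CPU, then the least unused memory."""
    best = min((c["idle_cpu"], c["unused_memory"]) for c in candidates)
    return [c for c in candidates if (c["idle_cpu"], c["unused_memory"]) == best]


def by_priority(candidates):
    """Keep candidates that share the highest user-assigned priority."""
    best = max(c["priority"] for c in candidates)
    return [c for c in candidates if c["priority"] == best]


def choose(candidates, expanders, rng=random):
    """Apply each expander in order; break any remaining tie at random.

    Assumes `candidates` is a non-empty list.
    """
    remaining = list(candidates)
    for expander in expanders:
        remaining = expander(remaining)
        if len(remaining) == 1:
            return remaining[0]
    return rng.choice(remaining)


# Hypothetical figures for three machine sets after a simulated scale-up.
candidates = [
    {"name": "fast-01", "idle_cpu": 2, "unused_memory": 4, "priority": 10},
    {"name": "prod-01", "idle_cpu": 2, "unused_memory": 4, "priority": 40},
    {"name": "archive", "idle_cpu": 6, "unused_memory": 1, "priority": 10},
]
selected = choose(candidates, [least_waste, by_priority])
print(selected["name"])
```

In this sample, `fast-01` and `prod-01` tie on the `LeastWaste` criteria, so the `Priority` step decides in favor of `prod-01`. Passing only `least_waste` would leave the tie to the random choice.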
