diff --git a/advanced_cohorts/using-cohorts.adoc b/advanced_cohorts/using-cohorts.adoc index 871a05f098a4..20a04c0ce0c3 100644 --- a/advanced_cohorts/using-cohorts.adoc +++ b/advanced_cohorts/using-cohorts.adoc @@ -14,3 +14,16 @@ Cohorts can also help to simplify resource management and allocation between tea You can also use cohorts to set resource quotas at a group level to define the limits for resources that a group of cluster queues can consume. include::modules/clusterqueue-configuring-cohorts-reference.adoc[leveloffset=+1] + +//// +When a ClusterQueue is part of a cohort, Kueue satisfies the following admission semantics: + +When assigning flavors, Kueue goes through the list of flavors in the relevant resource group of the ClusterQueue (`.spec.resourceGroups[*].flavors`). For each flavor, Kueue attempts to fit a Workload’s pod set according to the quota defined in the ClusterQueue for the flavor and the unused quota in the cohort. If the Workload doesn’t fit, Kueue evaluates the next flavor in the list. + +A Workload’s pod set resource fits in a flavor defined for a ClusterQueue resource if the sum of requests for the resource: +. Is less than or equal to the unused nominalQuota for the flavor in the ClusterQueue; or +. Is less than or equal to the sum of unused nominalQuota for the flavor in the ClusterQueues in the cohort, and +. Is less than or equal to the unused nominalQuota + borrowingLimit for the flavor in the ClusterQueue. In Kueue, when (2) and (3) are satisfied, but not (1), this is called borrowing quota. +A ClusterQueue can only borrow quota for flavors that the ClusterQueue defines. +For each pod set resource in a Workload, a ClusterQueue can only borrow quota for one flavor. 
+//// diff --git a/modules/configuring-quota-limits.adoc b/modules/configuring-quota-limits.adoc new file mode 100644 index 000000000000..98954e182941 --- /dev/null +++ b/modules/configuring-quota-limits.adoc @@ -0,0 +1,50 @@ +// Module included in the following assemblies: +// +// * quotas_workloads/configuring-quotas.adoc + +:_mod-docs-content-type: PROCEDURE +[id="configuring-quota-limits_{context}"] += Configuring quota limits + +When you create a cluster queue that is part of a cohort, you can configure quota limits to define the maximum amount of resources that the cluster queue can borrow, as well as the maximum amount of unused resources that other cluster queues can borrow from this cluster queue. + +You can configure these limits by specifying values for the `borrowingLimit` and `lendingLimit` fields, respectively. + +.Prerequisites + +include::snippets/prereqs-snippet-yaml.adoc[] + +.Procedure + +. Create a `ClusterQueue` object as a YAML file: ++ +.Example of a basic `ClusterQueue` object using quota limits +[source,yaml] +---- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ClusterQueue +metadata: + name: "example-q" +spec: + namespaceSelector: {} + cohort: "example-cohort" + resourceGroups: + - coveredResources: ["cpu", "memory"] + flavors: + - name: "default-flavor" + resources: + - name: "cpu" + nominalQuota: 9 + borrowingLimit: 1 # <1> + lendingLimit: 3 # <2> +# ... +---- +<1> In this example, the borrowing limit is set to `1`, and there is a nominal quota of `9`. Assuming that at least 1 CPU of unused quota is available to borrow in the cohort, the `example-q` cluster queue can admit workloads with resources totaling 10 CPUs. If the borrowing limit is empty or omitted for a cluster queue, the cluster queue can borrow up to the sum of nominal quotas from all the cluster queues in the cohort. 
+<2> In this example, the lending limit is set to `3`, which means that other cluster queues in the cohort can borrow up to 3 CPUs of unused quota from the `example-q` cluster queue. The remaining 6 CPUs of the `nominalQuota` value of `9` CPUs are reserved for `example-q` and cannot be lent. + +. Apply the `ClusterQueue` object by running the following command: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/configuring-resourceflavors.adoc b/modules/configuring-resourceflavors.adoc index a5ef768a81a7..cd803f7aa969 100644 --- a/modules/configuring-resourceflavors.adoc +++ b/modules/configuring-resourceflavors.adoc @@ -6,11 +6,11 @@ [id="configuring-resourceflavors_{context}"] = Configuring a resource flavor -After you have configured a `ClusterQueue` object, you can configure a `ResourceFlavor` object. +Resource flavors, represented as `ResourceFlavor` objects, define how a flavor maps to a group of nodes. Flavors represent different variations of a resource, for example, different GPU models. -Resources in a cluster are typically not homogeneous. If the resources in your cluster are homogeneous, you can use an empty `ResourceFlavor` instead of adding labels to custom resource flavors. +After you have configured a `ClusterQueue` object, you can configure a `ResourceFlavor` object. You can define quotas in a cluster queue for multiple different flavors that provide certain compute resources. -You can use a custom `ResourceFlavor` object to represent different resource variations that are associated with cluster nodes through labels, taints, and tolerations. You can then associate workloads with specific node types to enable fine-grained resource management. +Resources in a cluster are typically not homogeneous. If the resources in your cluster are homogeneous, you can use an empty `ResourceFlavor` instead of adding labels to custom resource flavors. 
.Prerequisites diff --git a/modules/create-kueue-cr.adoc b/modules/create-kueue-cr.adoc index 2612617f268c..bced7326f6bd 100644 --- a/modules/create-kueue-cr.adoc +++ b/modules/create-kueue-cr.adoc @@ -25,19 +25,21 @@ include::snippets/prereqs-snippet-console.adoc[] apiVersion: kueue.openshift.io/v1 kind: Kueue metadata: + name: cluster # <1> labels: - app.kubernetes.io/name: kueue-operator app.kubernetes.io/managed-by: kustomize - name: cluster # <1> + app.kubernetes.io/name: kueue-operator namespace: openshift-kueue-operator spec: - managementState: Managed config: integrations: frameworks: # <2> - - BatchJob + - BatchJob preemption: preemptionPolicy: Classical # <3> + logLevel: Normal + operatorLogLevel: Normal + managementState: Managed # ... ---- <1> The name of the `Kueue` CR must be `cluster`. diff --git a/modules/resource-groups-flavors.adoc b/modules/resource-groups-flavors.adoc new file mode 100644 index 000000000000..38ed4c0d836c --- /dev/null +++ b/modules/resource-groups-flavors.adoc @@ -0,0 +1,43 @@ +// Module included in the following assemblies: +// +// * quotas_workloads/configuring-quotas.adoc + +:_mod-docs-content-type: CONCEPT +[id="resource-groups-flavors_{context}"] += Resource groups and flavors + +You can configure resource groups within cluster queues to define a list of resources and a list of flavors that provide quotas for these resources. +Each resource type and each flavor can belong to only one resource group. 
+ +.Example of a basic `ClusterQueue` object with resource groups defined +[source,yaml] +---- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ClusterQueue +metadata: + name: example-queue +spec: + namespaceSelector: {} + resourceGroups: # <1> + - coveredResources: ["cpu", "memory"] # <2> + flavors: + - name: "resource-flavor-a" # <3> + resources: + - name: "cpu" + nominalQuota: 9 + - name: "memory" + nominalQuota: 36Gi + - coveredResources: ["pods", "foo.com/gpu"] # <4> + flavors: + - name: "resource-flavor-b" # <5> + resources: + - name: "pods" + nominalQuota: 5 + - name: "foo.com/gpu" + nominalQuota: 100 +---- +<1> You can define up to 16 resource groups in a cluster queue. +<2> Defines the resource types governed by the first resource group. This resource group governs CPU and memory resources. +<3> Defines the resource flavor that is applied to the resource types listed in the first resource group. In this example, the `resource-flavor-a` resource flavor is applied to CPU and memory. +<4> Defines the resource types governed by the second resource group. This resource group governs pods and GPU resources. +<5> Defines the resource flavor that is applied to the resource types listed in the second resource group. In this example, the `resource-flavor-b` resource flavor is applied to pods and GPU. diff --git a/quotas_workloads/configuring-quotas.adoc b/quotas_workloads/configuring-quotas.adoc index aeac2a0d2a58..dec03ade35e6 100644 --- a/quotas_workloads/configuring-quotas.adoc +++ b/quotas_workloads/configuring-quotas.adoc @@ -19,12 +19,16 @@ Users can then submit their workloads to the local queue. 
include::modules/configuring-clusterqueues.adoc[leveloffset=+1] +include::modules/configuring-quota-limits.adoc[leveloffset=+1] + [role="_next-steps"] [id="clusterqueues-next-steps_{context}"] .Next steps The cluster queue is not ready for use until a xref:../quotas_workloads/configuring-quotas.adoc#configuring-resourceflavors_configuring-quotas[`ResourceFlavor` object] has also been configured. +include::modules/resource-groups-flavors.adoc[leveloffset=+1] + include::modules/configuring-resourceflavors.adoc[leveloffset=+1] include::modules/configuring-localqueues.adoc[leveloffset=+1] diff --git a/running_jobs/running-kueue-jobs.adoc b/running_jobs/running-kueue-jobs.adoc index 822e1b6633b3..4b6e7cd7bbac 100644 --- a/running_jobs/running-kueue-jobs.adoc +++ b/running_jobs/running-kueue-jobs.adoc @@ -8,5 +8,9 @@ toc::[] You can run Kubernetes jobs with {product-title} enabled to manage resource allocation within defined quota limits. This can help to ensure predictable resource availability, cluster stability, and optimized performance. +You can create jobs to be admitted to {product-title} by using the standard link:https://kubernetes.io/docs/concepts/workloads/controllers/job/[Kubernetes `Job` API]. + +You can create any supported `Job` object, and then add the `kueue.x-k8s.io/queue-name` label to that object to connect it to a `LocalQueue` resource. + include::modules/identifying-local-queues.adoc[leveloffset=+1] include::modules/defining-running-jobs.adoc[leveloffset=+1]
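
The labeling step added to `running_jobs/running-kueue-jobs.adoc` can be illustrated with a manifest along the following lines. This is a minimal sketch and not part of the change itself: the job name `sample-job`, the namespace `team-a`, the local queue name `user-queue`, the container image, and the resource requests are all hypothetical placeholders. Only the `kueue.x-k8s.io/queue-name` label key comes from the text above.

.Sketch of a `Job` object labeled for a local queue (hypothetical names)
[source,yaml]
----
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job                          # hypothetical job name
  namespace: team-a                         # hypothetical namespace that contains the local queue
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # hypothetical LocalQueue name
spec:
  parallelism: 1
  completions: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example.com/sample/worker:latest   # hypothetical image
        command: ["sleep", "30"]
        resources:
          requests:                         # requests are what the queue's quota is checked against
            cpu: "1"
            memory: 200Mi
----

In this sketch, the label value must match the name of a `LocalQueue` object in the same namespace as the job. Declaring explicit resource requests makes it clear how much of the cluster queue quota the job would consume.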