General updates #1438

Merged: 1 commit, Mar 3, 2025
docs/admin/config/advanced-cluster-config.md: 2 changes (1 addition, 1 deletion)
@@ -46,7 +46,7 @@ The following configurations allow you to enable or disable features, control pe
| spec.limitRange.memoryDefaultRequestGpuFactor (string) | Sets a default amount of memory allocated per GPU when the memory is not specified | 100Mi |
| spec.limitRange.memoryDefaultLimitGpuFactor (string) | Sets a default memory limit based on the number of GPUs requested when no memory limit is specified | NO DEFAULT |
| spec.global.core.timeSlicing.mode (string) | Sets the GPU time-slicing mode. Possible values: `timesharing` - all pods on a GPU share the GPU compute time evenly. `strict` - each pod gets an exact time slice according to its memory fraction value. `fair` - each pod gets an exact time slice according to its memory fraction value, and any unused GPU compute time is split evenly between the running pods. | timesharing |
- | runai-scheduler.fullHierarchyFairness (boolean) | Enables fairness between departments, on top of projects fairness | true |
+ | spec.runai-scheduler.fullHierarchyFairness (boolean) | Enables fairness between departments, on top of projects fairness | true |
| spec.pod-grouper.args.gangSchedulingKnative (boolean) | Enables gang scheduling for inference workloads. For backward compatibility with versions earlier than v2.19, change the value to false | true |
| runai-scheduler.args.defaultStalenessGracePeriod | Sets the timeout in seconds before the scheduler evicts a stale pod-group (gang) that went below its min-members in running state: `0s` - immediately (no timeout), `-1` - never | 60s |
| spec.runai-scheduler.args.verbosity (int) | Configures the level of detail in the logs generated by the scheduler service | 4 |
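
For orientation, here is a minimal sketch of how the dotted keys in the table above might nest inside the cluster configuration spec. The surrounding structure and exact field layout are assumptions for illustration only and are not part of this change:

```yaml
# Illustrative sketch only: assumed nesting of the table keys under `spec`.
spec:
  limitRange:
    memoryDefaultRequestGpuFactor: 100Mi   # default memory request per GPU
  global:
    core:
      timeSlicing:
        mode: timesharing                  # or `strict` / `fair`
  runai-scheduler:
    fullHierarchyFairness: true            # department-level fairness on top of project fairness
    args:
      defaultStalenessGracePeriod: 60s     # 0s = evict immediately, -1 = never
      verbosity: 4                         # scheduler log detail level
  pod-grouper:
    args:
      gangSchedulingKnative: true          # set to false for versions earlier than v2.19
```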
docs/admin/config/node-affinity-with-cloud-node-pools.md: 2 changes (1 addition, 1 deletion)
@@ -1,7 +1,7 @@
# Node affinity with cloud node pools

Run:ai allows for [node affinity](../../platform-admin/aiinitiatives/org/projects.md). Node affinity is the ability to assign a Project to run on specific nodes.
- To use the node affinity feature, You will need to label the target nodes with the label `run.ai/node-type`. Most cloud clusters allow configuring node labels for the node pools in the cluster. This guide shows how to apply this configuration to different cloud providers.
+ To use the node affinity feature, you will need to label the target nodes with the label `run.ai/type`. Most cloud clusters allow configuring node labels for the node pools in the cluster. This guide shows how to apply this configuration to different cloud providers.

To make the node affinity work with node pools on various cloud providers, we need to make sure the node pools are configured with the appropriate Kubernetes label (`run.ai/type=<TYPE_VALUE>`).
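
As a minimal sketch (the node name and type value below are placeholders, not part of this change), the label would appear on each node in the pool like this:

```yaml
# Sketch: a node in the pool carrying the Run:ai node-type label.
# `example-node-1` and `<TYPE_VALUE>` are placeholders.
apiVersion: v1
kind: Node
metadata:
  name: example-node-1
  labels:
    run.ai/type: <TYPE_VALUE>   # must match the node type assigned to the Project
```

On clusters where node labels can be set directly, the equivalent imperative command is `kubectl label node <node-name> run.ai/type=<TYPE_VALUE>`; with managed node pools, set the label through the cloud provider's node pool configuration as described below.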
