Commit cc64a08

Merge branch 'v2.18' into RUN-19295-Addition
2 parents 6f9dcd9 + 309a58c commit cc64a08

File tree

3 files changed: +52 additions, −53 deletions

.github/workflows/automated-publish-docs.yaml

Lines changed: 5 additions & 3 deletions

```diff
@@ -19,11 +19,13 @@ jobs:
     steps:
       - name: Checkout code
         uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
 
       - name: Get all v*.* branches
         id: calculate-env
         run: |
-          BRANCHES=$(git branch --list --all | grep -v master | grep 'origin/v*.*' | sed -n -E 's:.*/(v[0-9]+\.[0-9]+).*:\1:p' | sort -Vu)
+          BRANCHES=$(git branch -r | grep -E '^ *origin/v[0-9]{1,2}\.[0-9]{1,2}$' | sort -Vu | sed 's/origin\///g' | sed 's/ //g')
           NEWEST_VERSION=$(printf '%s\n' "${BRANCHES[@]}" | sort -V | tail -n 1)
           CURRENT_BRANCH=${GITHUB_REF#refs/heads/}
           ALIAS=$CURRENT_BRANCH-alias
@@ -48,7 +50,6 @@ jobs:
         uses: actions/checkout@v4
         with:
           ref: ${{ needs.env.outputs.CURRENT_BRANCH }}
-          fetch-depth: 0
 
       - name: setup python
         uses: actions/setup-python@v5
@@ -97,4 +98,5 @@ jobs:
           SLACK_MESSAGE_ON_SUCCESS: "Docs were updated successfully for version ${{ needs.env.outputs.TITLE }}"
           SLACK_MESSAGE_ON_FAILURE: "Docs update FAILED for version ${{ needs.env.outputs.TITLE }}"
           MSG_MINIMAL: true
-          SLACK_FOOTER: ""
+          SLACK_FOOTER: ""
+
```
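The revised branch-selection pipeline above can be exercised offline against sample `git branch -r` output. This is a sketch only; the branch names below are illustrative, not taken from the real repository:

```shell
#!/bin/sh
# Simulated `git branch -r` output (illustrative branch names).
SAMPLE='  origin/master
  origin/v2.9
  origin/v2.17
  origin/v2.18
  origin/feature/foo'

# Same filter as the new workflow step: keep only release branches of the
# form origin/vX.Y, version-sort them, then strip the remote prefix.
BRANCHES=$(printf '%s\n' "$SAMPLE" \
  | grep -E '^ *origin/v[0-9]{1,2}\.[0-9]{1,2}$' \
  | sort -Vu \
  | sed 's/origin\///g' | sed 's/ //g')

# Pick the highest version; -V sorts v2.9 before v2.17 (unlike plain sort).
NEWEST_VERSION=$(printf '%s\n' "$BRANCHES" | sort -V | tail -n 1)
echo "$NEWEST_VERSION"   # v2.18
```

Note that the stricter anchored regex is what excludes `master` and feature branches, so the old `grep -v master` step is no longer needed.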

docs/admin/workloads/policies/README.md

Lines changed: 45 additions & 48 deletions

````diff
@@ -8,16 +8,16 @@ date: 2023-Dec-12
 
 ## Introduction
 
-*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities. Enabling the *New Policy Manager* provides information about resources that are non-compliant to applied policies. Resources that are non-compliant will appear greyed out. To see how a resource is not compliant, press on the clipboard icon in the upper right hand corner of the resource.
+*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities. Enabling the *New Policy Manager* provides information about resources that are non-compliant to applied policies. Resources that are non-compliant will appear greyed out. To see how a resource is not compliant, press on the clipboard icon in the upper right-hand corner of the resource.
 
 !!! Note
-    Policies from Run:ai versions 2.15 or lower will still work after enabling the *New Policy Manager*. However, showing non-compliant policy rules will not be available. For more information about policies for version 2.15 or lower, see [What are Policies](policies.md#what-are-policies).
+    Policies from Run:ai versions 2.17 or lower will still work after enabling the New Policy Manager. For more information about policies for version 2.17 or lower, see [What are Policies](policies.md#what-are-policies).
 
 For example, an administrator can create and apply a policy that will restrict researchers from requesting more than 2 GPUs, or less than 1GB of memory per type of workload.
 
 Another example is an administrator who wants to set different amounts of CPU, GPUs and memory for different kinds of workloads. A training workload can have a default of 1 GB of memory, or an interactive workload can have a default amount of GPUs.
 
-Policies are created for each Run:ai project (Kubernetes namespace). When a policy is created in the `runai` namespace, it will take effect when there is no project-specific policy for the workloads of the same kind.
+Policies are created for each Run:ai project (Kubernetes namespace). When a policy is created in the `runai` namespace, it will take effect when there is no project-specific policy for workloads of the same kind.
 
 In interactive workloads or workspaces, applied policies will only allow researchers access to resources that are permitted in the policy. This can include compute resources as well as node pools and node pool priority.
 
@@ -47,7 +47,7 @@ A policy configured to a specific scope, is applied to all elements in that scop
 
 ### Policy Editor UI
 
-Policies are added to the system using the policy editor and are written in YAML format. YAML™ is a human-friendly, cross language, Unicode based data serialization language designed around the common native data types of dynamic programming languages. It is useful for programming needs ranging from configuration files to internet messaging to object persistence to data auditing and visualization. For more information, see [YAML.org](https://yaml.org/){target=_blank}.
+Policies are added to the system using the policy editor and are written in YAML format. YAML™ is a human-friendly, cross-language, Unicode-based data serialization language designed around the common native data types of dynamic programming languages. It is useful for programming needs ranging from configuration files to internet messaging to object persistence to data auditing and visualization. For more information, see [YAML.org](https://yaml.org/){target=_blank}.
 
 ### Policy API
 
@@ -59,50 +59,47 @@ The following is an example of a workspace policy you can apply in your platform
 
 ```YAML
 defaults:
-  environment:
-    allowPrivilegeEscalation: false
-    createHomeDir: true
-    environmentVariables:
-      - name: MY_ENV
-        value: my_value
-  workspace:
-    allowOverQuota: true
+  createHomeDir: true
+  environmentVariables:
+    instances:
+      - name: MY_ENV
+        value: my_value
+  security:
+    allowPrivilegeEscalation: false
 rules:
-  compute:
-    cpuCoreLimit:
-      min: 0
-      max: 9
-      required: true
-    gpuPortionRequest:
-      min: 0
-      max: 10
+  imagePullPolicy:
+    required: true
+    options:
+      - value: Always
+        displayed: Always
+      - value: Never
+        displayed: Never
+  createHomeDir:
+    canEdit: false
+  security:
+    runAsUid:
+      min: 1
+      max: 32700
+    allowPrivilegeEscalation:
+      canEdit: false
+  compute:
+    cpuCoreLimit:
+      required: true
+      min: 0
+      max: 9
+    gpuPortionRequest:
+      min: 0
+      max: 10
+  storage:
+    nfs:
+      instances:
+        canAdd: false
   s3:
-    url:
-      options:
-        - displayed: "https://www.google.com"
-          value: "https://www.google.com"
-        - displayed: "https://www.yahoo.com"
-          value: "https://www.yahoo.com"
-  environment:
-    imagePullPolicy:
-      options:
-        - displayed: "Always"
-          value: "Always"
-        - displayed: "Never"
-          value: "Never"
-      required: true
-    runAsUid:
-      min: 1
-      max: 32700
-    createHomeDir:
-      canEdit: false
-    allowPrivilegeEscalation:
-      canEdit: false
-  workspace:
-    allowOverQuota:
-      canEdit: false
-imposedAssets:
-  dataSources:
-    nfs:
-      canAdd: false
+    attributes:
+      url:
+        options:
+          - value: https://www.google.com
+            displayed: https://www.google.com
+          - value: https://www.yahoo.com
+            displayed: https://www.yahoo.com
 ```
````
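The `min`/`max`/`required` rules in the policy diff above constrain what researchers may request. As a hypothetical sketch (not Run:ai's actual enforcement code, and `check_rule` is an invented helper name), a rule like `cpuCoreLimit: {min: 0, max: 9, required: true}` could be validated against a requested value like this:

```python
def check_rule(rule: dict, value) -> list:
    """Return a list of violations of a min/max/required policy rule."""
    errors = []
    if value is None:
        # `required: true` means the field must be set on the workload.
        if rule.get("required"):
            errors.append("value is required by policy")
        return errors
    if "min" in rule and value < rule["min"]:
        errors.append(f"{value} is below policy minimum {rule['min']}")
    if "max" in rule and value > rule["max"]:
        errors.append(f"{value} exceeds policy maximum {rule['max']}")
    return errors

# A request of 12 CPU cores violates the max of 9 from the example policy:
print(check_rule({"min": 0, "max": 9, "required": True}, 12))
# ['12 exceeds policy maximum 9']
```

Fields with `canEdit: false` (such as `createHomeDir` above) work differently: the researcher cannot change them at all, so the platform applies the default rather than range-checking a submitted value.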

docs/home/whats-new-2-18.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -22,11 +22,11 @@ date: 2024-June-14
 
 * <!-- RUN-16917/RUN-19363 move to top Expose secrets in workload submission -->Added new *Data sources* of type *Secret* to workload form. *Data sources* of type *Secret* are used to hide 3rd party access credentials when submitting workloads. For more information, see [Submitting Workloads](../admin/workloads/submitting-workloads.md#how-to-submit-a-workload).
 
-* <!-- RUN-16830/RUN-16831 - Graphs & special metrics for inference -->Added new graphs for *Inference* workloads. The new graphs provide more information for *Inference* workloads to help analyze performance of the workloads. New graphs include Latency, Throughput, and number of replicas. For more information, see [Workloads View](../admin/workloads/README.md#workloads-view) (Requires minimum cluster version v2.18).
+* <!-- RUN-16830/RUN-16831 - Graphs & special metrics for inference -->Added new graphs for *Inference* workloads. The new graphs provide more information for *Inference* workloads to help analyze performance of the workloads. New graphs include Latency, Throughput, and number of replicas. For more information, see [Workloads View](../admin/workloads/README.md#workloads-view). (Requires minimum cluster version v2.18).
 
 * <!-- TODO add link to doc when ready - get approval for text RUN-16805/RUN-17416 - Provide latency-based metric for autoscaling for requests -->Added latency metric for autoscaling. This feature allows automatic scale-up/down the number of replicas of a Run:ai inference workload based on the threshold set by the ML Engineer. This ensures that response time is kept under the target SLA. (Requires minimum cluster version v2.18).
 
-* <!-- TODO Add to inference doc models explanation after autoscaling. RUN-16872/RUN-18526 Separating ChatUi from model in favor of coherent autoscaling -->Improved autoscaling for inference models by taking out ChatBot UI from models images. By moving ChatBot UI to predefined *Environments*, autoscaling is more accurate by taking into account all types of requests (API, and ChatBot UI). Adding a ChatBot UI environment preset by Run:ai allows AI practitioners to easily connect them to workloads.
+* <!-- Add to inference doc models explanation after autoscaling. RUN-16872/RUN-18526 Separating ChatUi from model in favor of coherent autoscaling -->Improved autoscaling for inference models by taking out ChatBot UI from models images. By moving ChatBot UI to predefined *Environments*, autoscaling is more accurate by taking into account all types of requests (API, and ChatBot UI). Adding a ChatBot UI environment preset by Run:ai allows AI practitioners to easily connect them to workloads.
 
 * <!-- RUN-16832/ RUN-16833 - Custom value for auto-scale to zero-->Added more precision to trigger auto-scaling to zero. Now users can configure a precise consecutive idle threshold custom setting to trigger Run:ai inference workloads to scale-to-zero. (Requires minimum cluster version v2.18).
 
```
