diff --git a/.github/workflows/automated-publish-docs.yaml b/.github/workflows/automated-publish-docs.yaml index 97c38ef47d..4823a8d9ac 100644 --- a/.github/workflows/automated-publish-docs.yaml +++ b/.github/workflows/automated-publish-docs.yaml @@ -19,11 +19,13 @@ jobs: steps: - name: Checkout code uses: actions/checkout@v4 + with: + fetch-depth: 0 - name: Get all v*.* branches id: calculate-env run: | - BRANCHES=$(git branch --list --all | grep -v master | grep 'origin/v*.*' | sed -n -E 's:.*/(v[0-9]+\.[0-9]+).*:\1:p' | sort -Vu) + BRANCHES=$(git branch -r | grep -E '^ *origin/v[0-9]{1,2}\.[0-9]{1,2}$' | sort -Vu | sed 's/origin\///g' | sed 's/ //g') NEWEST_VERSION=$(printf '%s\n' "${BRANCHES[@]}" | sort -V | tail -n 1) CURRENT_BRANCH=${GITHUB_REF#refs/heads/} ALIAS=$CURRENT_BRANCH-alias @@ -48,7 +50,6 @@ jobs: uses: actions/checkout@v4 with: ref: ${{ needs.env.outputs.CURRENT_BRANCH }} - fetch-depth: 0 - name: setup python uses: actions/setup-python@v5 @@ -97,4 +98,5 @@ jobs: SLACK_MESSAGE_ON_SUCCESS: "Docs were updated successfully for version ${{ needs.env.outputs.TITLE }}" SLACK_MESSAGE_ON_FAILURE: "Docs update FAILED for version ${{ needs.env.outputs.TITLE }}" MSG_MINIMAL: true - SLACK_FOOTER: "" \ No newline at end of file + SLACK_FOOTER: "" + diff --git a/docs/admin/workloads/policies/README.md b/docs/admin/workloads/policies/README.md index f76540e3ad..bdb3a90afb 100644 --- a/docs/admin/workloads/policies/README.md +++ b/docs/admin/workloads/policies/README.md @@ -8,16 +8,16 @@ date: 2023-Dec-12 ## Introduction -*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities. Enabling the *New Policy Manager* provides information about resources that are non-compliant to applied policies. Resources that are non-compliant will appear greyed out. 
To see how a resource is not compliant, press on the clipboard icon in the upper right hand corner of the resource. +*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities. Enabling the *New Policy Manager* provides information about resources that are non-compliant with applied policies. Resources that are non-compliant will appear greyed out. To see how a resource is not compliant, click the clipboard icon in the upper right-hand corner of the resource. !!! Note - Policies from Run:ai versions 2.15 or lower will still work after enabling the *New Policy Manager*. However, showing non-compliant policy rules will not be available. For more information about policies for version 2.15 or lower, see [What are Policies](policies.md#what-are-policies). + Policies from Run:ai versions 2.17 or lower will still work after enabling the *New Policy Manager*. For more information about policies for version 2.17 or lower, see [What are Policies](policies.md#what-are-policies). For example, an administrator can create and apply a policy that will restrict researchers from requesting more than 2 GPUs, or less than 1 GB of memory per type of workload. Another example is an administrator who wants to set different amounts of CPU, GPUs and memory for different kinds of workloads. A training workload can have a default of 1 GB of memory, or an interactive workload can have a default amount of GPUs. -Policies are created for each Run:ai project (Kubernetes namespace). When a policy is created in the `runai` namespace, it will take effect when there is no project-specific policy for the workloads of the same kind. +Policies are created for each Run:ai project (Kubernetes namespace). When a policy is created in the `runai` namespace, it will take effect when there is no project-specific policy for workloads of the same kind.
In interactive workloads or workspaces, applied policies will only allow researchers access to resources that are permitted in the policy. This can include compute resources as well as node pools and node pool priority. @@ -47,7 +47,7 @@ A policy configured to a specific scope, is applied to all elements in that scop ### Policy Editor UI -Policies are added to the system using the policy editor and are written in YAML format. YAML™ is a human-friendly, cross language, Unicode based data serialization language designed around the common native data types of dynamic programming languages. It is useful for programming needs ranging from configuration files to internet messaging to object persistence to data auditing and visualization. For more information, see [YAML.org](https://yaml.org/){target=_blank}. +Policies are added to the system using the policy editor and are written in YAML format. YAML™ is a human-friendly, cross-language, Unicode-based data serialization language designed around the common native data types of dynamic programming languages. It is useful for programming needs ranging from configuration files to internet messaging to object persistence to data auditing and visualization. For more information, see [YAML.org](https://yaml.org/){target=_blank}. 
### Policy API @@ -59,50 +59,47 @@ The following is an example of a workspace policy you can apply in your platform ```YAML defaults: - environment: - allowPrivilegeEscalation: false - createHomeDir: true - environmentVariables: - - name: MY_ENV - value: my_value - workspace: - allowOverQuota: true + createHomeDir: true + environmentVariables: + instances: + - name: MY_ENV + value: my_value + security: + allowPrivilegeEscalation: false rules: - compute: - cpuCoreLimit: - min: 0 - max: 9 - required: true - gpuPortionRequest: - min: 0 - max: 10 + imagePullPolicy: + required: true + options: + - value: Always + displayed: Always + - value: Never + displayed: Never + createHomeDir: + canEdit: false + security: + runAsUid: + min: 1 + max: 32700 + allowPrivilegeEscalation: + canEdit: false + compute: + cpuCoreLimit: + required: true + min: 0 + max: 9 + gpuPortionRequest: + min: 0 + max: 10 + storage: + nfs: + instances: + canAdd: false s3: - url: - options: - - displayed: "https://www.google.com" - value: "https://www.google.com" - - displayed: "https://www.yahoo.com" - value: "https://www.yahoo.com" - environment: - imagePullPolicy: - options: - - displayed: "Always" - value: "Always" - - displayed: "Never" - value: "Never" - required: true - runAsUid: - min: 1 - max: 32700 - createHomeDir: - canEdit: false - allowPrivilegeEscalation: - canEdit: false - workspace: - allowOverQuota: - canEdit: false - imposedAssets: - dataSources: - nfs: - canAdd: false + attributes: + url: + options: + - value: https://www.google.com + displayed: https://www.google.com + - value: https://www.yahoo.com + displayed: https://www.yahoo.com ``` diff --git a/docs/home/whats-new-2-18.md b/docs/home/whats-new-2-18.md index 6a4c840d3c..017808cf18 100644 --- a/docs/home/whats-new-2-18.md +++ b/docs/home/whats-new-2-18.md @@ -22,11 +22,11 @@ date: 2024-June-14 * Added new *Data sources* of type *Secret* to workload form. 
*Data sources* of type *Secret* are used to hide 3rd-party access credentials when submitting workloads. For more information, see [Submitting Workloads](../admin/workloads/submitting-workloads.md#how-to-submit-a-workload). -* Added new graphs for *Inference* workloads. The new graphs provide more information for *Inference* workloads to help analyze performance of the workloads. New graphs include Latency, Throughput, and number of replicas. For more information, see [Workloads View](../admin/workloads/README.md#workloads-view) (Requires minimum cluster version v2.18). +* Added new graphs for *Inference* workloads. The new graphs provide more information for *Inference* workloads to help analyze performance of the workloads. New graphs include Latency, Throughput, and number of replicas. For more information, see [Workloads View](../admin/workloads/README.md#workloads-view). (Requires minimum cluster version v2.18). * Added a latency metric for autoscaling. This feature allows automatically scaling the number of replicas of a Run:ai inference workload up or down based on the threshold set by the ML Engineer. This ensures that response time is kept under the target SLA. (Requires minimum cluster version v2.18). -* Improved autoscaling for inference models by taking out ChatBot UI from models images. By moving ChatBot UI to predefined *Environments*, autoscaling is more accurate by taking into account all types of requests (API, and ChatBot UI). Adding a ChatBot UI environment preset by Run:ai allows AI practitioners to easily connect them to workloads. +* Improved autoscaling for inference models by taking the ChatBot UI out of model images. By moving the ChatBot UI to predefined *Environments*, autoscaling becomes more accurate because it takes all types of requests (API and ChatBot UI) into account. A ChatBot UI environment preset provided by Run:ai allows AI practitioners to easily connect it to workloads. * Added more precision to trigger auto-scaling to zero.
Users can now configure a custom consecutive idle-time threshold that triggers Run:ai inference workloads to scale to zero. (Requires minimum cluster version v2.18). @@ -85,7 +85,7 @@ date: 2024-June-14 #### Single Sign On -* Added support for Single Sign On using OpenShift v4 (OIDC based). When using OpenShift, you must first define OAuthClient which interacts with OpenShift's OAuth server to authenticate users and request access tokens. For more information, see [Single Sign-On](../admin/runai-setup/authentication/sso/). +* Added support for Single Sign-On using OpenShift v4 (OIDC-based). When using OpenShift, you must first define an OAuthClient, which interacts with OpenShift's OAuth server to authenticate users and request access tokens. For more information, see [Single Sign-On](../admin/runai-setup/authentication/sso/). * Added OIDC scopes to authentication requests. OIDC scopes are used to specify what access privileges are being requested for access tokens. The scopes associated with the access tokens determine what resources are available when they are used to access OAuth 2.0 protected endpoints. Protected endpoints may perform different actions and return different information based on the scope values and other parameters used when requesting the presented access token. For more information, see [UI configuration](../admin/runai-setup/authentication/sso/#step-1-ui-configuration). @@ -101,7 +101,7 @@ date: 2024-June-14 #### Policy for distributed and inference workloads in the API -Added a new API for creating distributed training workload policies and inference workload policies. These new policies in the API allow to set defaults, enforce rules and impose setup on distributed training and inference workloads. For distributed policies, worker and master may require different rules due to their different specifications. The new capability is currently available via API only. Documentation on submitting policies to follow shortly.
+* Added a new API for creating distributed training workload policies and inference workload policies. These new policies in the API allow administrators to set defaults, enforce rules, and impose setup on distributed training and inference workloads. For distributed policies, worker and master may require different rules due to their different specifications. The new capability is currently available via API only. Documentation on submitting policies will follow shortly. ## Deprecation Notifications
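As an aside on the workflow hunk at the top of this diff: the new branch-discovery pipeline can be exercised outside CI by substituting a sample `git branch -r` listing. The sketch below uses invented branch names, and quotes `$BRANCHES` as a plain scalar rather than the workflow's `"${BRANCHES[@]}"` array syntax:

```shell
#!/bin/sh
# Stand-in for `git branch -r` output; these branch names are hypothetical.
sample_branches='  origin/HEAD -> origin/master
  origin/master
  origin/v2.16
  origin/v2.18
  origin/v2.17'

# Same pipeline as the workflow: keep only vMAJOR.MINOR branches,
# version-sort and de-duplicate them, then strip the remote prefix and padding.
BRANCHES=$(printf '%s\n' "$sample_branches" \
  | grep -E '^ *origin/v[0-9]{1,2}\.[0-9]{1,2}$' \
  | sort -Vu \
  | sed 's/origin\///g' \
  | sed 's/ //g')

# The newest version is the last entry after a version sort.
NEWEST_VERSION=$(printf '%s\n' "$BRANCHES" | sort -V | tail -n 1)

echo "$NEWEST_VERSION"   # → v2.18
```

`sort -V` matters here: it orders version strings numerically per component (so a hypothetical `v2.9` would sort before `v2.10`), which a plain lexical sort would get wrong.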