Small fixes #924

Merged
merged 3 commits on Aug 7, 2024
2 changes: 1 addition & 1 deletion docs/Researcher/best-practices/researcher-notifications.md
@@ -11,7 +11,7 @@ date: 2024-Jul-4

Managing numerous data science workloads requires monitoring various stages, including submission, scheduling, initialization, execution, and completion. Additionally, handling suspensions and failures is crucial for ensuring timely workload completion. Email Notifications address this need by sending alerts for critical workload life cycle changes. This empowers data scientists to take necessary actions and prevent delays.

- Once the system administrator configures the email notifications, users will receive notifications about their jobs that transition from one status to another. In addition, the user will get warning notifications before workload termination due to project-defined timeouts. Details included in the email are:
+ Once the system administrator [configures the email notifications](../../admin/runai-setup/notifications/notifications.md), users will receive notifications about their jobs that transition from one status to another. In addition, the user will get warning notifications before workload termination due to project-defined timeouts. Details included in the email are:

* Workload type
* Project and cluster information
4 changes: 2 additions & 2 deletions docs/Researcher/cli-reference/runai-submit-dist-TF.md
@@ -75,7 +75,7 @@ runai submit-dist tf --name distributed-job --workers=2 -g 1 \

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).
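As a usage sketch (the image name is a placeholder, not part of this PR; the other flags follow the example at the top of this file):

```sh
# Hypothetical sketch: give the container user a temporary home directory.
# Anything written under it is discarded when the container exits.
runai submit-dist tf --name distributed-job --workers=2 -g 1 \
  -i my-registry/tf-training:latest \
  --create-home-dir
```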

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -335,7 +335,7 @@ runai submit-dist tf --name distributed-job --workers=2 -g 1 \

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).
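A minimal sketch under the same assumptions (placeholder image):

```sh
# Hypothetical sketch: run the workers under the submitting user's Linux
# UID/GID instead of root, so files created on shared storage are owned
# by that user.
runai submit-dist tf --name distributed-job --workers=2 -g 1 \
  -i my-registry/tf-training:latest \
  --run-as-user
```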

### Scheduling

4 changes: 2 additions & 2 deletions docs/Researcher/cli-reference/runai-submit-dist-mpi.md
@@ -78,7 +78,7 @@ You can start an unattended mpi training Job of name dist1, based on Project *te

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -334,7 +334,7 @@ You can start an unattended mpi training Job of name dist1, based on Project *te

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

### Scheduling

4 changes: 2 additions & 2 deletions docs/Researcher/cli-reference/runai-submit-dist-pytorch.md
@@ -82,7 +82,7 @@ runai submit-dist pytorch --name distributed-job --workers=2 -g 1 \

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -342,7 +342,7 @@ runai submit-dist pytorch --name distributed-job --workers=2 -g 1 \

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

### Scheduling

4 changes: 2 additions & 2 deletions docs/Researcher/cli-reference/runai-submit-dist-xgboost.md
@@ -70,7 +70,7 @@ runai submit-dist xgboost --name distributed-job --workers=2 -g 1 \

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -326,7 +326,7 @@ runai submit-dist xgboost --name distributed-job --workers=2 -g 1 \

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

### Scheduling

4 changes: 2 additions & 2 deletions docs/Researcher/cli-reference/runai-submit.md
@@ -144,7 +144,7 @@ runai submit --job-name-prefix -i gcr.io/run-ai-demo/quickstart -g 1

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -400,7 +400,7 @@ runai submit --job-name-prefix -i gcr.io/run-ai-demo/quickstart -g 1

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is *root* (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is *root* (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

### Scheduling

2 changes: 1 addition & 1 deletion docs/Researcher/overview-researcher.md
@@ -3,7 +3,7 @@ title: Researcher Documentation Overview
---
# Overview: Researcher Documentation

- Researchers use Run:ai to submit jobs.
+ _Researchers_, or _AI practitioners_, use Run:ai to submit Workloads.

As part of the Researcher documentation you will find:

@@ -18,7 +18,7 @@ then run `id`, you will see the **root** user.

## Use Run:ai flags to limit root access

- There are two [runai submit](../../../Researcher/cli-reference/runai-submit.md) flags which control user identity at the Researcher level:
+ There are two [runai submit](../../Researcher/cli-reference/runai-submit.md) flags that control user identity at the Researcher level:

* The flag `--run-as-user` starts the container with a specific user. The user is the current Linux user (see below for other behaviors if used in conjunction with Single sign-on).
* The flag `--prevent-privilege-escalation` prevents the container from elevating its own privileges into `root` (e.g. running `sudo` or changing system files). A combined sketch of both flags follows this list.
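A combined sketch (job name and image are placeholders):

```sh
# Hypothetical sketch: start the container under the current Linux user and
# block privilege escalation back to root (e.g. via sudo).
runai submit my-job -i my-registry/my-image:latest -g 1 \
  --run-as-user --prevent-privilege-escalation
```

Running `id` inside the container should then report your UID/GID rather than `root`.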
@@ -50,7 +50,7 @@ then verify that you cannot run `su` to become root within the container.
### Setting a Cluster-Wide Default


- The two flags are voluntary. They are not enforced by the system. It is however possible to enforce them using [Policies](../../workloads/policies/policies.md). Polices allow an Administrator to force compliance on both the User Interface and Command-line interface.
+ The two flags are voluntary. They are not enforced by the system. It is however possible to enforce them using [Policies](../workloads/policies/policies.md). Policies allow an Administrator to force compliance on both the User Interface and Command-line interface.


## Passing user identity
@@ -60,7 +60,7 @@ A best practice is to store the user identifier (UID) and the group identifier (G

To perform this, you must:

- * Set up [single sign-on](../../authentication/authentication-overview.md). Perform the steps for UID/GID integration.
+ * Set up [single sign-on](authentication-overview.md). Perform the steps for UID/GID integration.
* Run: `runai login` and enter your credentials
* Use the flag `--run-as-user` (a combined sketch follows)
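Put together, a hypothetical session looks like this (job name and image are placeholders):

```sh
# Log in so Run:ai can resolve the UID/GID stored in the organization's
# directory (requires the single sign-on setup above).
runai login

# Submit under that identity.
runai submit my-job -i my-registry/my-image:latest -g 1 --run-as-user
```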

10 changes: 5 additions & 5 deletions docs/admin/overview-administrator.md
@@ -6,9 +6,9 @@ The Infrastructure Administrator is an IT person, responsible for the installati
As part of the Infrastructure Administrator documentation you will find:

* Install Run:ai
- * How to set up and modify a GPU cluster with Run:ai.
+ * Set up a Run:ai Cluster.
* Set up Researchers to work with Run:ai.
- * Configure the Run:ai system
- * Setup users by connecting Run:ai to an identity provider.
- * IT maintenance of the Run:ai system
- * Troubleshooting Run:ai and understanding cluster health.
+ * IT Configuration of the Run:ai system
+ * Connect Run:ai to an identity provider.
+ * Maintenance & monitoring of the Run:ai system
+ * Troubleshooting.
9 changes: 6 additions & 3 deletions docs/admin/runai-setup/config/overview.md
@@ -9,9 +9,12 @@ This section provides a list of installation-related articles dealing with a wid
| Article | Purpose |
|---------------------------------------------------------|-----------|
| [Designating Specific Role Nodes](node-roles.md) | Set one or more designated Run:ai system nodes or limit Run:ai monitoring and scheduling to specific nodes in the cluster. |
| [Setup Project-based Researcher Access Control](../../authentication/researcher-authentication.md) | Enable Run:ai access control at the __Project__ level. |
| [Single sign-on](../../authentication/authentication-overview.md) | Integrate with the organization's Identity Provider to provide single sign-on for Run:ai |
| [Review Kubernetes Access provided to Run:ai](access-roles.md) | In Restrictive Kubernetes environments such as when using OpenShift, understand and control what Kubernetes roles are provided to Run:ai |
| [External access to Containers](allow-external-access-to-containers.md) | Understand the available options for Researchers to access containers from the outside |
| [User Identity in Container](non-root-containers.md) | The identity of the user in the container determines its access to cluster resources. The document explains multiple ways to propagate the user identity into the container. |
| [Install the Run:ai Administrator Command-line Interface](cli-admin-install.md) | The Administrator command-line is useful in a variety of flows such as cluster upgrade, node setup etc. |
| [Set Node affinity with cloud node pools](node-affinity-with-cloud-node-pools.md) | Set node affinity when using a cloud provider for your cluster |
| [Local Certificate Authority](org-cert.md) | For self-hosted Run:ai environments, specifically air-gapped installation, set up a local certificate authority to allow customers to safely connect to Run:ai |
| [Backup & Restore](dr.md) | For self-hosted Run:ai environments, set up a scheduled backup of Run:ai data |
| [High Availability](ha.md) | Configure Run:ai such that it will continue to provide service even if parts of the system are down. |
| [Scaling](large-clusters.md) | Scale the Run:ai cluster and the Run:ai control-plane to withstand large transaction loads |
| [Emails and system notification](../notifications/notifications.md) | Configure e-mail notification |
2 changes: 1 addition & 1 deletion docs/home/overview.md
@@ -45,7 +45,7 @@ Run:ai cloud availability is monitored at [status.run.ai](https://status.run.ai)

As an IT Administrator, you can collect Run:ai logs to send to support:

- * Install the [Run:ai Administrator command-line interface](admin/runai-setup/config/cli-admin-install.md).
+ * Install the [Run:ai Administrator command-line interface](../admin/runai-setup/config/cli-admin-install.md).
* Run `runai-adm collect-logs`. The command will generate a compressed file containing all of the existing Run:ai log files.
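For example (a sketch; only the documented command is assumed):

```sh
# Gather the existing Run:ai log files into a compressed archive for support.
runai-adm collect-logs
```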

!!! Note
2 changes: 1 addition & 1 deletion docs/platform-admin/overview.md
@@ -7,7 +7,7 @@ The Platform Administrator is responsible for the day-to-day administration of t
As part of the Platform Administrator documentation you will find:


- * Provide the right access to system users.
+ * Provide the right access level to users.
* Configure Run:ai meta-data such as Projects, Departments, Node pools etc.
* Setup Workload Policies and Assets
* Analyze system performance and perform suggested actions.
4 changes: 2 additions & 2 deletions docs/snippets/common-submit-cli-commands.md
@@ -37,7 +37,7 @@

#### --create-home-dir

- > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../admin/runai-setup/config/non-root-containers.md).
+ > Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../admin/authentication/non-root-containers.md).

#### -e `<stringArray>` | --environment `<stringArray>`

@@ -265,7 +265,7 @@

#### --run-as-user

- > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../admin/runai-setup/config/non-root-containers.md).
+ > Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../admin/authentication/non-root-containers.md).

### Scheduling
