
Commit bbf99d9

Merge pull request #924 from run-ai/small-fixes
Small fixes
2 parents 96093fd + 7603c57 commit bbf99d9

21 files changed, with 44 additions and 39 deletions.

docs/Researcher/best-practices/researcher-notifications.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ date: 2024-Jul-4

 Managing numerous data science workloads requires monitoring various stages, including submission, scheduling, initialization, execution, and completion. Additionally, handling suspensions and failures is crucial for ensuring timely workload completion. Email Notifications address this need by sending alerts for critical workload life cycle changes. This empowers data scientists to take necessary actions and prevent delays.

-Once the system administrator configures the email notifications, users will receive notifications about their jobs that transition from one status to another. In addition, the user will get warning notifications before workload termination due to project-defined timeouts. Details included in the email are:
+Once the system administrator [configures the email notifications](../../admin/runai-setup/notifications/notifications.md), users will receive notifications about their jobs that transition from one status to another. In addition, the user will get warning notifications before workload termination due to project-defined timeouts. Details included in the email are:

 * Workload type
 * Project and cluster information

docs/Researcher/cli-reference/runai-submit-dist-TF.md

Lines changed: 2 additions & 2 deletions
@@ -75,7 +75,7 @@ runai submit-dist tf --name distributed-job --workers=2 -g 1 \

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 #### -e `<stringArray> | --environment `<stringArray>`

@@ -335,7 +335,7 @@ runai submit-dist tf --name distributed-job --workers=2 -g 1 \

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 ### Scheduling

docs/Researcher/cli-reference/runai-submit-dist-mpi.md

Lines changed: 2 additions & 2 deletions
@@ -78,7 +78,7 @@ You can start an unattended mpi training Job of name dist1, based on Project *te

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 #### -e `<stringArray> | --environment `<stringArray>`

@@ -334,7 +334,7 @@ You can start an unattended mpi training Job of name dist1, based on Project *te

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 ### Scheduling

docs/Researcher/cli-reference/runai-submit-dist-pytorch.md

Lines changed: 2 additions & 2 deletions
@@ -82,7 +82,7 @@ runai submit-dist pytorch --name distributed-job --workers=2 -g 1 \

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 #### -e `<stringArray> | --environment `<stringArray>`

@@ -342,7 +342,7 @@ runai submit-dist pytorch --name distributed-job --workers=2 -g 1 \

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 ### Scheduling

docs/Researcher/cli-reference/runai-submit-dist-xgboost.md

Lines changed: 2 additions & 2 deletions
@@ -70,7 +70,7 @@ runai submit-dist xgboost --name distributed-job --workers=2 -g 1 \

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 #### -e `<stringArray> | --environment `<stringArray>`

@@ -326,7 +326,7 @@ runai submit-dist xgboost --name distributed-job --workers=2 -g 1 \

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 ### Scheduling

docs/Researcher/cli-reference/runai-submit.md

Lines changed: 2 additions & 2 deletions
@@ -144,7 +144,7 @@ runai submit --job-name-prefix -i gcr.io/run-ai-demo/quickstart -g 1

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 #### -e `<stringArray>` | --environment `<stringArray>`

@@ -400,7 +400,7 @@ runai submit --job-name-prefix -i gcr.io/run-ai-demo/quickstart -g 1

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is *root* (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is *root* (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../../admin/authentication/non-root-containers.md).

 ### Scheduling

docs/Researcher/overview-researcher.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ title: Researcher Documentation Overview
 ---
 # Overview: Researcher Documentation

-Researchers use Run:ai to submit jobs.
+_Researchers_, or _AI practitioners_, use Run:ai to submit Workloads.

 As part of the Researcher documentation you will find:

docs/admin/runai-setup/config/non-root-containers.md renamed to docs/admin/authentication/non-root-containers.md

Lines changed: 3 additions & 3 deletions
@@ -18,7 +18,7 @@ then run `id`, you will see the **root** user.

 ## Use Run:ai flags to limit root access

-There are two [runai submit](../../../Researcher/cli-reference/runai-submit.md) flags which control user identity at the Researcher level:
+There are two [runai submit](../../Researcher/cli-reference/runai-submit.md) flags that control user identity at the Researcher level:

 * The flag `--run-as-user` starts the container with a specific user. The user is the current Linux user (see below for other behaviors if used in conjunction with Single sign-on).
 * The flag `--prevent-privilege-escalation` prevents the container from elevating its own privileges into `root` (e.g. running `sudo` or changing system files.).
@@ -50,7 +50,7 @@ then verify that you cannot run `su` to become root within the container.
 ### Setting a Cluster-Wide Default


-The two flags are voluntary. They are not enforced by the system. It is however possible to enforce them using [Policies](../../workloads/policies/policies.md). Polices allow an Administrator to force compliance on both the User Interface and Command-line interface.
+The two flags are voluntary. They are not enforced by the system. It is however possible to enforce them using [Policies](../workloads/policies/policies.md). Policies allow an Administrator to force compliance on both the User Interface and Command-line interface.


 ## Passing user identity
@@ -60,7 +60,7 @@ A best practice is to store the user identifier (UID) and the group identifier (

 To perform this, you must:

-* Set up [single sign-on](../../authentication/authentication-overview.md). Perform the steps for UID/GID integration.
+* Set up [single sign-on](authentication-overview.md). Perform the steps for UID/GID integration.
 * Run: `runai login` and enter your credentials
 * Use the flag --run-as-user
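Note on the link changes in this file: because non-root-containers.md moved one directory level up (from docs/admin/runai-setup/config/ to docs/admin/authentication/), each relative link in it loses one `../` segment. The sketch below is only an illustration and is not part of the commit. It assumes that the docs site resolves a relative link against the directory of the Markdown file that contains it, and `resolve_link` is a hypothetical helper name.

```python
import posixpath


def resolve_link(source_file: str, link: str) -> str:
    """Resolve a relative Markdown link against the directory of the
    file that contains it (assumed resolution rule, see note above)."""
    base_dir = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base_dir, link))


# File locations before and after the rename shown in this diff.
old_file = "docs/admin/runai-setup/config/non-root-containers.md"
new_file = "docs/admin/authentication/non-root-containers.md"

# (old link, new link) pairs taken from the three hunks above.
link_pairs = [
    ("../../../Researcher/cli-reference/runai-submit.md",
     "../../Researcher/cli-reference/runai-submit.md"),
    ("../../workloads/policies/policies.md",
     "../workloads/policies/policies.md"),
    ("../../authentication/authentication-overview.md",
     "authentication-overview.md"),
]

for old_link, new_link in link_pairs:
    old_target = resolve_link(old_file, old_link)
    new_target = resolve_link(new_file, new_link)
    # Both forms should point at the same file under docs/.
    print(old_target, new_target, old_target == new_target)
```

Under that assumption, each old link resolved from the old location and each new link resolved from the new location name the same target file, which is the property the three hunks above are meant to preserve.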

docs/admin/overview-administrator.md

Lines changed: 5 additions & 5 deletions
@@ -6,9 +6,9 @@ The Infrastructure Administrator is an IT person, responsible for the installati
 As part of the Infrastructure Administrator documentation you will find:

 * Install Run:ai
-* How to set up and modify a GPU cluster with Run:ai.
+* Set up a Run:ai Cluster.
 * Set up Researchers to work with Run:ai.
-* Configure the Run:ai system
-* Setup users by connecting Run:ai to an identity provider.
-* IT maintenance of the Run:ai system
-* Troubleshooting Run:ai and understanding cluster health.
+* IT Configuration of the Run:ai system
+* Connect Run:ai to an identity provider.
+* Maintenance & monitoring of the Run:ai system
+* Troubleshooting.

docs/admin/runai-setup/config/overview.md

Lines changed: 6 additions & 3 deletions
@@ -9,9 +9,12 @@ This section provides a list of installation-related articles dealing with a wid
 | Article | Purpose |
 |---------------------------------------------------------|-----------|
 | [Designating Specific Role Nodes](node-roles.md) | Set one or more designated Run:ai system nodes or limit Run:ai monitoring and scheduling to specific nodes in the cluster. |
-| [Setup Project-based Researcher Access Control](../../authentication/researcher-authentication.md) | Enable Run:ai access control is at the __Project__ level. |
-| [Single sign-on](../../authentication/authentication-overview.md) | Integrate with the organization's Identity Provider to provide single sign-on for Run:ai |
 | [Review Kubernetes Access provided to Run:ai](access-roles.md) | In Restrictive Kubernetes environments such as when using OpenShift, understand and control what Kubernetes roles are provided to Run:ai |
 | [External access to Containers](allow-external-access-to-containers.md) | Understand the available options for Researchers to access containers from the outside |
-| [User Identity in Container](non-root-containers.md) | The identity of the user in the container determines its access to cluster resources. The document explains multiple way on how to propagate the user identity into the container. |
 | [Install the Run:ai Administrator Command-line Interface](cli-admin-install.md) | The Administrator command-line is useful in a variety of flows such as cluster upgrade, node setup etc. |
+| [Set Node affinity with cloud node pools](node-affinity-with-cloud-node-pools.md) | Set node affinity when using a cloud provider for your cluster |
+| [Local Certificate Authority](org-cert.md) | For self-hosted Run:ai environments, specifically air-gapped installation, setup a local certificate authority to allow customers to safely connect to Run:ai |
+| [Backup & Restore](dr.md) | For self-hosted Run:ai environments, set up a scheduled backup of Run:ai data |
+| [High Availability](ha.md) | Configure Run:ai such that it will continue to provide service even if parts of the system are down. |
+| [Scaling](large-clusters.md) | Scale the Run:ai cluster and the Run:ai control-plane to withstand large transaction loads |
+| [Emails and system notification](../notifications/notifications.md) | Configure e-mail notification |

docs/home/overview.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ Run:ai cloud availability is monitored at [status.run.ai](https://status.run.ai)

 As an IT Administrator, you can collect Run:ai logs to send to support:

-* Install the [Run:ai Administrator command-line interface](admin/runai-setup/config/cli-admin-install.md).
+* Install the [Run:ai Administrator command-line interface](../admin/runai-setup/config/cli-admin-install.md).
 * Run `runai-adm collect-logs`. The command will generate a compressed file containing all of the existing Run:ai log files.

 !!! Note

docs/platform-admin/overview.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ The Platform Administrator is responsible for the day-to-day administration of t
 As part of the Platform Administrator documentation you will find:


-* Provide the right access to system users.
+* Provide the right access level to users.
 * Configure Run:ai meta-data such as Projects, Departments, Node pools etc.
 * Setup Workload Policies and Assets
 * Analyze system performance and perform suggested actions.

docs/snippets/common-submit-cli-commands.md

Lines changed: 2 additions & 2 deletions
@@ -37,7 +37,7 @@

 #### --create-home-dir

-> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../admin/runai-setup/config/non-root-containers.md).
+> Create a temporary home directory for the user in the container. Data saved in this directory will not be saved when the container exits. For more information see [non root containers](../admin/authentication/non-root-containers.md).

 #### -e `<stringArray> | --environment `<stringArray>`

@@ -265,7 +265,7 @@

 #### --run-as-user

-> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../admin/runai-setup/config/non-root-containers.md).
+> Run in the context of the current user running the Run:ai command rather than the root user. While the default container user is _root_ (same as in Docker), this command allows you to submit a Job running under your Linux user. This would manifest itself in access to operating system resources, in the owner of new folders created under shared directories, etc. Alternatively, if your cluster is connected to Run:ai via SAML, you can map the container to use the Linux UID/GID which is stored in the organization's directory. For more information see [non root containers](../admin/authentication/non-root-containers.md).

 ### Scheduling
