1. Replace `<VERSION>` with the Run:ai control plane version.
39
34
2. Domain name described [here](prerequisites.md#domain-name).
40
-
3. `custom-env.yaml` should have been created by the _prepare installation_ script in the previous section.
35
+
3. See the Local Certificate Authority instructions below
36
+
4. `custom-env.yaml` should have been created by the _prepare installation_ script in the previous section.
41
37
42
38
!!! Tip
43
39
Use the `--dry-run` flag to gain an understanding of what is being installed before the actual installation.
44
40
45
-
## (Air-gapped only) Local Certificate Authority
46
-
47
-
Perform the instructions for [local certificate authority](../../config/org-cert.md).
48
41
49
42
50
-
## (Optional) Additional Configurations
51
-
43
+
### Additional configurations (optional)
52
44
There may be cases where you need to set additional properties as follows:
53
45
54
46
| Key | Change | Description |
@@ -63,29 +55,29 @@ There may be cases where you need to set additional properties as follows:
63
55
|`grafana.adminUser`| Grafana username | Override the Run:ai default user name for accessing Grafana |
64
56
|`grafana.adminPassword`| Grafana password | Override the Run:ai default password for accessing Grafana |
65
57
|`thanos.receive.persistence.storageClass` and `postgresql.primary.persistence.storageClass`| Storage class | Configure the installation to use a specific storage class rather than the default one |
66
-
|`global.imagePullSecrets:` <br>  `- name: <secret-name>`| Docker secret | Provide credentials for accessing the organization's docker registry. This is required for air-gapped environments |
67
58
|`<component>` <br>  `resources:` <br>  `limits:` <br>   `cpu: 500m` <br>   `memory: 512Mi` <br>  `requests:` <br>   `cpu: 250m` <br>   `memory: 256Mi`| Pod requests and limits |`<component>` can be any one of the following: `backend`, `frontend`, `assetsService`, `identityManager`, `tenantsManager`, `keycloakx`, `grafana`, `authorization`, `orgUnitService`, `policyService`|
68
59
|<div style="width:200px"></div>|||
69
60
70
-
71
-
72
-
73
61
Use the `--set` syntax in the helm command above.
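As a rough illustration of how keys from the table map to `--set` flags, the sketch below builds the flag string from `KEY=VALUE` pairs. The helper `to_set_flags` and the values `<GRAFANA-USER>` and `<STORAGE-CLASS>` are placeholders invented for this example. They are not part of the Run:ai installation, and the flags still need to be appended to the actual helm command.

```bash
# Hypothetical helper: turn KEY=VALUE pairs into a string of --set flags.
# Assumes values contain no spaces or commas (helm needs commas escaped).
to_set_flags() {
  flags=""
  for pair in "$@"; do
    flags="$flags --set $pair"
  done
  # Drop the leading space left by the first iteration.
  echo "${flags# }"
}

# Keys are taken from the table above; values are placeholders.
to_set_flags \
  "grafana.adminUser=<GRAFANA-USER>" \
  "postgresql.primary.persistence.storageClass=<STORAGE-CLASS>"
```

Each table key becomes one `--set key=value` argument. Nested keys use dots, exactly as written in the Key column.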
74
62
75
-
### Connect to Run:ai User Interface
63
+
## Next Steps
76
64
77
-
Go to: `runai.<company-name>`. Log in using the default credentials: User: `test@run.ai`, Password: `Abcd!234`. Go to the Users area and change the password.
65
+
### Connect to Run:ai User interface
78
66
67
+
Go to: `runai.<domain>`. Log in using the default credentials: User: `test@run.ai`, Password: `Abcd!234`. Go to the Users area and change the password.
79
68
80
-
##(Optional) Enable "Forgot password"
69
+
### Enable Forgot Password (optional)
81
70
82
-
To support the “Forgot password” functionality, follow the steps below.
71
+
To support the *Forgot password* functionality, follow the steps below.
83
72
84
-
* Go to `runai.<company-name>/auth` and Log in.
73
+
* Go to `runai.<domain>/auth` and log in.
85
74
* Under `Realm settings`, select the `Login` tab and enable the `Forgot password` feature.
86
75
* Under the `Email` tab, define an SMTP server, as explained [here](https://www.keycloak.org/docs/latest/server_admin/#_email){target=_blank}
87
76
88
-
## Next Steps
89
77
78
+
### Install Run:ai Cluster
90
79
Continue with installing a [Run:ai Cluster](cluster.md).
title: Self-Hosted Installation over Kubernetes - Preparations
2
+
title: Self-Hosted installation over Kubernetes - preparations
3
3
---
4
+
# Preparing for a Run:ai Kubernetes installation
4
5
5
-
## Prerequisites
6
+
The following section provides IT with the information needed to prepare for a Run:ai installation.
6
7
7
-
See the Prerequisites section [above](prerequisites.md).
8
+
## Prerequisites
8
9
10
+
Follow the prerequisites as explained in [Self-Hosted installation over Kubernetes](prerequisites.md).
9
11
10
-
## Prepare Installation Artifacts
11
-
12
-
### Run:ai Software Files
13
-
14
-
SSH into a node with `kubectl` access to the cluster and `Docker` installed.
15
-
12
+
## Software artifacts
16
13
17
14
=== "Connected"
15
+
You should receive a file: `runai-gcr-secret.yaml` from Run:ai Customer Support. The file provides access to the Run:ai Container registry.
16
+
17
+
SSH into a node with `kubectl` access to the cluster and `Docker` installed.
18
18
Run the following to enable image download from the Run:ai Container Registry on Google cloud:
19
19
20
20
``` bash
21
21
kubectl create namespace runai-backend
22
-
kubectl apply -f runai-gcr-secret.yaml
22
+
kubectl apply -f runai-reg-creds.yaml
23
23
```
24
24
25
-
=== "Airgapped"
25
+
=== "Airgapped"
26
+
You should receive a single file `runai-air-gapped-<version>.tar.gz` from Run:ai customer support.
27
+
28
+
SSH into a node with `kubectl` access to the cluster and `Docker` installed.
29
+
30
+
Run:ai assumes the existence of a Docker registry for images, most likely one installed within the organization. The installation requires the network address and port for the registry (referenced below as `<REGISTRY_URL>`).
31
+
26
32
To extract Run:ai files, replace `<VERSION>` in the command below and run:
27
33
28
34
``` bash
29
-
tar xvf runai-air-gapped-<version>.tar.gz
35
+
tar xvf runai-air-gapped-<VERSION>.tar.gz
30
36
cd deploy
31
37
32
38
kubectl create namespace runai-backend
33
39
```
34
40
35
-
__Upload images__
41
+
**Upload images**
36
42
37
43
Upload images to a local Docker Registry. Set the Docker Registry address in the form of `NAME:PORT` (do not add `https`):
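Before setting the address, it can help to confirm that it really has the `NAME:PORT` shape. The following is a minimal sketch, not part of the Run:ai scripts. The function name `is_valid_registry_address` and the example address are assumptions made for illustration.

```bash
# Hypothetical helper: succeed only if the argument looks like NAME:PORT,
# with no URL scheme (such as https://) and no path.
is_valid_registry_address() {
  case "$1" in
    *://*|*/*) return 1 ;;          # reject schemes and paths
  esac
  name=${1%:*}                      # text before the last colon
  port=${1##*:}                     # text after the last colon
  [ "$name" != "$1" ] || return 1   # no colon at all
  [ -n "$name" ] && [ -n "$port" ] || return 1
  case "$port" in
    *[!0-9]*) return 1 ;;           # port must be digits only
  esac
  return 0
}

# Example value only; replace with your registry's address.
REGISTRY_URL="registry.example.local:5000"
if is_valid_registry_address "$REGISTRY_URL"; then
  echo "OK: $REGISTRY_URL"
else
  echo "Invalid registry address: $REGISTRY_URL (expected NAME:PORT)" >&2
fi
```

The check is purely syntactic. It does not verify that the registry is reachable or that the port is open.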
38
44
@@ -50,25 +56,67 @@ SSH into a node with `kubectl` access to the cluster and `Docker` installed.
50
56
51
57
The script should create a file named `custom-env.yaml` which will be used by the control-plane installation.
52
58
59
+
### Private Docker Registry (optional)
60
+
61
+
To access the organization's Docker registry, you must set the registry's credentials (imagePullSecret).
62
+
63
+
Create the secret named `runai-reg-creds` based on your existing credentials. For more information, see [Allowing pods to reference images from other secured registries](https://docs.openshift.com/container-platform/latest/openshift_images/managing_images/using-image-pull-secrets.html#images-allow-pods-to-reference-images-from-secure-registries_using-image-pull-secrets){target=_blank}.
64
+
65
+
## Configure your environment
66
+
67
+
### Domain Certificate
68
+
69
+
The Run:ai control plane requires a domain name (FQDN). You must supply a domain name as well as a trusted certificate for that domain.
70
+
71
+
* When installing the first Run:ai cluster on the same Kubernetes cluster as the control plane, the Run:ai cluster URL will be the same as the control-plane URL.
72
+
* When installing the Run:ai cluster on a separate Kubernetes cluster, follow the Run:ai [domain name](../../cluster-setup/cluster-prerequisites.md#cluster-url) requirements.
73
+
* If your network is air-gapped, you will need to provide the Run:ai control-plane and cluster with information about the [local certificate authority](../../config/org-cert.md).
74
+
75
+
You must provide the domain's private key and crt as a Kubernetes secret in the `runai-backend` namespace. Run:
In air-gapped environments, you must prepare the public key of your local certificate authority as described [here](../../config/org-cert.md). It will need to be installed in Kubernetes for the installation to succeed.
84
+
85
+
### Mark Run:ai system workers (optional)
55
86
56
-
You can __optionally__ set the Run:ai control plane to run on specific nodes. Kubernetes will attempt to schedule Run:ai pods to these nodes. If lacking resources, the Run:ai nodes will move to another, non-labeled node.
87
+
You can **optionally** set the Run:ai control plane to run on specific nodes. Kubernetes will attempt to schedule Run:ai pods to these nodes. If these nodes lack resources, the Run:ai pods will be scheduled on another, non-labeled node.
Do not select the Kubernetes master as a `runai-system` node. This may cause Kubernetes to stop working (specifically if Kubernetes API Server is configured on 443 instead of the default 6443).
66
97
67
-
## Additional Permissions
98
+
## Additional permissions
99
+
100
+
As part of the installation, you will be required to install the [Run:ai Control Plane](backend.md) and [Cluster](cluster.md) Helm [Charts](https://helm.sh/){target=_blank}. The Helm Charts require Kubernetes administrator permissions. You can review the exact permissions provided by using the `--dry-run` on both helm charts.
101
+
102
+
## Validate Prerequisites
103
+
104
+
Once you believe that the Run:ai prerequisites and preparations are met, we highly recommend installing and running the Run:ai [pre-install diagnostics script](https://github.com/run-ai/preinstall-diagnostics){target=_blank}. The tool:
105
+
106
+
* Tests the below requirements as well as additional failure points related to Kubernetes, NVIDIA, storage, and networking.
107
+
* Looks at additional components installed and analyzes their relevance to a successful Run:ai installation.
108
+
109
+
To use the script, [download](https://github.com/run-ai/preinstall-diagnostics/releases){target=_blank} the latest version of the script and run:
As part of the installation, you will be required to install the [Run:ai Control Plane](backend.md) and [Cluster](cluster.md) Helm [Charts](https://helm.sh/){target=_blank}. The Helm Charts require Kubernetes administrator permissions. You can review the exact permissions provided by using the `--dry-run` on both helm charts.
116
+
If the script fails, or if the script succeeds but the Kubernetes system contains components other than Run:ai, locate the file `runai-preinstall-diagnostics.txt` in the current directory and send it to Run:ai technical support.
70
117
118
+
For more information on the script including additional command-line flags, see [here](https://github.com/run-ai/preinstall-diagnostics){target=_blank}.
71
119
72
-
## Next Steps
120
+
## Next steps
73
121
74
122
Continue with installing the [Run:ai Control Plane](backend.md).
-title: Self-Hosted installation over Kubernetes - Prerequisites
2
-
---
1
+
# Self-Hosted installation over Kubernetes - Prerequisites
3
2
4
3
Before proceeding with this document, please review the [installation types](../../installation-types.md) documentation to understand the difference between _air-gapped_ and _connected_ installations.
5
4
6
-
7
-
## Control-plane and clusters
5
+
## Run:ai Components
8
6
9
7
As part of the installation process you will install:
10
8
11
9
* A control-plane managing cluster
12
-
* One or more Run:ai clusters
10
+
* One or more clusters
13
11
14
12
Both the control plane and clusters require Kubernetes. Typically the control plane and first cluster are installed on the same Kubernetes cluster but this is not a must.
15
13
16
-
## Hardware Requirements
14
+
!!! Important
15
+
In OpenShift environments, adding a cluster connecting to a __remote__ control plane currently requires the assistance of customer support.
17
16
18
-
See Cluster prerequisites [hardware](../../cluster-setup/cluster-prerequisites.md#hardware-requirements) requirements.
17
+
## Installer machine
19
18
20
-
In addition, the control plane installation of Run:ai requires the configuration of Kubernetes Persistent Volumes of a total size of 110GB.
19
+
The machine running the installation script (typically the Kubernetes master) must have:
20
+
21
+
* At least 50GB of free space.
22
+
* Docker installed.
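The two requirements above can be checked quickly from the installer machine. This is a minimal sketch rather than an official Run:ai check. It assumes the installation files will live on the filesystem of the current directory and treats 50GB as 50 GiB. The function name `has_enough_space` is hypothetical.

```bash
REQUIRED_GB=50

# Hypothetical helper: succeed if the given free space (in KiB)
# is at least REQUIRED_GB (treated here as GiB).
has_enough_space() {
  [ "$1" -ge $((REQUIRED_GB * 1024 * 1024)) ]
}

# Available space, in KiB, on the filesystem holding the current directory.
free_kb=$(df -Pk . | awk 'NR==2 {print $4}')

if has_enough_space "$free_kb"; then
  echo "Disk space: OK"
else
  echo "Disk space: less than ${REQUIRED_GB}GB free" >&2
fi

# Docker only needs to be present on PATH for this check.
if command -v docker >/dev/null 2>&1; then
  echo "Docker: found"
else
  echo "Docker: not found on PATH" >&2
fi
```

Run the script from the directory where you plan to extract the installation files, since `df` reports on the filesystem that contains the current directory.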
23
+
24
+
25
+
### Helm
21
26
22
-
## Run:ai Software
27
+
Run:ai requires [Helm](https://helm.sh/){target=_blank} 3.10 or later. To install Helm, see [Installing Helm](https://helm.sh/docs/intro/install/){target=_blank}. If you are installing an air-gapped version of Run:ai, the Run:ai tar file contains the helm binary.
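To confirm that the installed Helm meets the 3.10 minimum, a version check along these lines may be useful. It is a sketch under stated assumptions: the helper `helm_version_ok` is hypothetical, and the parsing expects the `vMAJOR.MINOR.PATCH` form that `helm version --short` prints.

```bash
# Hypothetical helper: succeed if a version string such as "v3.12.1"
# is 3.10 or later.
helm_version_ok() {
  v=${1#v}              # drop a leading "v"
  major=${v%%.*}        # text before the first dot
  rest=${v#*.}
  minor=${rest%%.*}     # text between the first and second dots
  for part in "$major" "$minor"; do
    case "$part" in
      ''|*[!0-9]*) return 1 ;;   # refuse anything that is not numeric
    esac
  done
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

if command -v helm >/dev/null 2>&1; then
  installed=$(helm version --short)
  if helm_version_ok "$installed"; then
    echo "Helm $installed: OK"
  else
    echo "Helm $installed is older than 3.10" >&2
  fi
else
  echo "helm not found on PATH" >&2
fi
```

The comparison is numeric per component, so `3.9` is correctly treated as older than `3.10`.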
23
28
24
-
=== "Connected"
25
-
You should receive a file: `runai-gcr-secret.yaml` from Run:ai Customer Support. The file provides access to the Run:ai Container registry.
29
+
## Cluster hardware requirements
26
30
27
-
=== "Airgapped"
28
-
You should receive a single file `runai-air-gapped-<version>.tar.gz` from Run:ai customer support
31
+
See Cluster prerequisites [hardware](../../cluster-setup/cluster-prerequisites.md#hardware-requirements) requirements.
29
32
30
-
## Run:ai Software Prerequisites
33
+
In addition, the control plane installation of Run:ai requires the configuration of Kubernetes Persistent Volumes of a total size of 110GB.
31
34
32
-
### Operating System
35
+
36
+
## Run:ai software requirements
37
+
38
+
### Operating system
33
39
34
40
See Run:ai Cluster prerequisites [operating system](../../cluster-setup/cluster-prerequisites.md#operating-system) requirements.
35
41
@@ -53,77 +59,31 @@ The Run:ai control-plane requires a __default storage class__ to create persiste
In Air-gapped environments, you must prepare the public key of your local certificate authority as described [here](../../config/org-cert.md). It will need to be installed in Kubernetes for the installation to succeed.
64
+
### Ingress Controller
60
65
66
+
The Run:ai control plane installation assumes an existing installation of NGINX as the ingress controller. You can follow the Run:ai _Cluster_ prerequisites [ingress controller](../../cluster-setup/cluster-prerequisites.md#ingress-controller) installation.
61
67
62
-
### NVIDIA Prerequisites
68
+
### NVIDIA GPU Operator
63
69
64
70
See Run:ai Cluster prerequisites [NVIDIA](../../cluster-setup/cluster-prerequisites.md#nvidia) requirements.
65
71
66
72
The Run:ai control plane, when installed without a Run:ai cluster, does not require the NVIDIA prerequisites.
67
73
68
-
### Prometheus Prerequisites
74
+
### Prometheus
69
75
70
76
See Run:ai Cluster prerequisites [Prometheus](../../cluster-setup/cluster-prerequisites.md#prometheus) requirements.
71
77
72
78
The Run:ai control plane, when installed without a Run:ai cluster, does not require the Prometheus prerequisites.
73
79
74
-
### (Optional) Inference Prerequisites
80
+
81
+
### Inference (optional)
75
82
76
83
See Run:ai Cluster prerequisites [Inference](../../cluster-setup/cluster-prerequisites.md#inference) requirements.
77
84
78
85
The Run:ai control plane, when installed without a Run:ai cluster, does not require the Inference prerequisites.
79
86
80
-
### Helm
81
-
82
-
Run:ai requires [Helm](https://helm.sh/){target=_blank} 3.10 or later. To install Helm, see [https://helm.sh/docs/intro/install/](https://helm.sh/docs/intro/install/){target=_blank}. If you are installing an air-gapped version of Run:ai, the Run:ai tar file contains the helm binary.
83
-
84
-
85
-
## Network Requirements
86
-
87
-
### Ingress Controller
88
-
89
-
The Run:ai control plane installation assumes an existing installation of NGINX as the ingress controller. You can follow the Run:ai _Cluster_ prerequisites [ingress controller](../../cluster-setup/cluster-prerequisites.md#ingress-controller) installation.
90
-
91
-
### Domain name
92
-
93
-
The Run:ai control plane requires a domain name (FQDN). You must supply a domain name as well as a trusted certificate for that domain.
94
-
95
-
* When installing the first Run:ai cluster on the same Kubernetes cluster as the control plane, the Run:ai cluster URL will be the same as the control-plane URL.
96
-
* When installing the Run:ai cluster on a separate Kubernetes cluster, follow the Run:ai [domain name](../../cluster-setup/cluster-prerequisites.md#cluster-url) requirements.
97
-
* If your network is air-gapped, you will need to provide the Run:ai control-plane and cluster with information about the [local certificate authority](../../config/org-cert.md).
98
-
99
-
## Installer Machine
100
-
101
-
The machine running the installation script (typically the Kubernetes master) must have:
102
-
103
-
* At least 50GB of free space.
104
-
* Docker installed.
105
-
106
-
## Other
107
-
108
-
* (Airgapped installation only) __Private Docker Registry__. Run:ai assumes the existence of a Docker registry for images. Most likely installed within the organization. The installation requires the network address and port for the registry (referenced below as `<REGISTRY_URL>`).
109
-
* (Optional) __SAML Integration__ as described under [single sign-on](../../authentication/sso.md).
110
-
111
-
112
-
## Pre-install Script
113
-
114
-
Once you believe that the Run:ai prerequisites are met, we highly recommend installing and running the Run:ai [pre-install diagnostics script](https://github.com/run-ai/preinstall-diagnostics){target=_blank}. The tool:
115
-
116
-
* Tests the below requirements as well as additional failure points related to Kubernetes, NVIDIA, storage, and networking.
117
-
* Looks at additional components installed and analyzes their relevance to a successful Run:ai installation.
118
-
119
-
To use the script [download](https://github.com/run-ai/preinstall-diagnostics/releases){target=_blank} the latest version of the script and run:
If the script fails, or if the script succeeds but the Kubernetes system contains components other than Run:ai, locate the file `runai-preinstall-diagnostics.txt` in the current directory and send it to Run:ai technical support.
127
-
128
-
For more information on the script including additional command-line flags, see [here](https://github.com/run-ai/preinstall-diagnostics){target=_blank}.
129
-
87
+
## Next steps
88
+
Continue to [Preparing for a Run:ai Kubernetes Installation