Skip to content

Revert "deprecate old versions, improve install & upgrade docs" #920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 16 additions & 2 deletions docs/admin/runai-setup/cluster-setup/cluster-delete.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,23 @@
# Deleting a Cluster Installation

To delete a Run:ai Cluster installation run the following commands:
To delete a Run:ai Cluster installation while retaining existing running jobs, run the following commands:

=== "Version 2.9 or later"
```
helm uninstall runai-cluster -n runai
```

The command will **not** delete existing Projects, Departments, or Workloads submitted by users.
=== "Version 2.8"
```
kubectl delete RunaiConfig runai -n runai
helm uninstall runai-cluster -n runai
```

=== "Version 2.7 or earlier"
```
kubectl patch RunaiConfig runai -n runai -p '{"metadata":{"finalizers":[]}}' --type="merge"
kubectl delete RunaiConfig runai -n runai
helm uninstall runai-cluster runai -n runai
```

The commands will **not** delete existing Jobs submitted by users.
6 changes: 2 additions & 4 deletions docs/admin/runai-setup/cluster-setup/cluster-install.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,15 +17,13 @@ Using the cluster wizard:

* Choose a name for your cluster.
* Choose the Run:ai version for the cluster.
* (v2.13 only) Choose a target Kubernetes distribution.

* Choose a target Kubernetes distribution (see [table](cluster-prerequisites.md#kubernetes) for supported distributions).
* (SaaS and remote self-hosted cluster only) Enter a URL for the Kubernetes cluster. The URL need only be accessible within the organization's network. For more informtaion see [here](cluster-prerequisites.md#cluster-url).
* Press `Continue`.

On the next page:

* (SaaS and remote self-hosted cluster only) Install a [trusted certificate](cluster-prerequisites.md#cluster-url) to the domain entered above.

* (SaaS and remote self-hosted cluster only) Install a trusted certificate to the domain entered above.
* Run the [Helm](https://helm.sh/docs/intro/install/) command provided in the wizard.
* In case of a failure, see the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).

Expand Down
3 changes: 3 additions & 0 deletions docs/admin/runai-setup/cluster-setup/cluster-prerequisites.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,10 @@ Following is a Kubernetes support matrix for the latest Run:ai releases:<a name=

| Run:ai version | Supported Kubernetes versions | Supported OpenShift versions |
|----------------|-------------------------------|--------|
| Run:ai 2.9 | 1.21 through 1.26 | 4.8 through 4.11 |
| Run:ai 2.10 | 1.21 through 1.26 (see note below) | 4.8 through 4.11 |
| Run:ai 2.13 | 1.23 through 1.28 (see note below) | 4.10 through 4.13 |
| Run:ai 2.15 | 1.25 through 1.28 | 4.11 through 4.13 |
| Run:ai 2.16 | 1.26 through 1.28 | 4.11 through 4.14 |
| Run:ai 2.17 | 1.27 through 1.29 | 4.12 through 4.15 |
| Run:ai 2.18 | 1.28 through 1.30 | 4.12 through 4.16 |
Expand Down
93 changes: 65 additions & 28 deletions docs/admin/runai-setup/cluster-setup/cluster-upgrade.md
Original file line number Diff line number Diff line change
@@ -1,39 +1,76 @@

# Upgrading a Cluster Installation
Below are instructions on how to upgrade a Run:ai cluster.

## Upgrade Run:ai cluster
Follow the steps bellow, based on the Run:ai Cluster version you would like to upgrade to:
## Find out Run:ai Cluster version

=== "2.15-latest"
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Choose the `Run:ai version` to upgrade to.
* Press `Continue`.
* Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it on in the cluster.
* In the case of a failure, refer to the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).
To find the Run:ai cluster version, run:

=== "2.13"
Run:
```
helm list -n runai -f runai-cluster
```

and record the chart version in the form of `runai-cluster-<version-number>`

## Upgrade Run:ai cluster

### Upgrade from version 2.15+
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Choose the `Run:ai version` to be installed on the Cluster.
* Press `Continue`.
* Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it on in the cluster.
* In the case of a failure, refer to the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).

### Upgrade from version 2.9, 2.10, 2.11 or 2.12
Run:

```
helm get values runai-cluster -n runai > old-values.yaml
```

1. Review the file `old-values.yaml` and see if there are any changes performed during the last installation.
2. Follow the instructions for [Installing Run:ai](cluster-install.md#install-runai) to download a new values file.
3. Merge the changes from Step 1 into the new values file.
4. Run `helm upgrade` as per the instructions in the link above.


!!! Note
To upgrade to a __specific__ version of the Run:ai cluster, add `--version <version-number>` to the `helm upgrade` command. You can find the relevant version with `helm search repo` as described above.

### Upgrade from version 2.7 or 2.8

The process of upgrading from 2.7 or 2.8 requires [uninstalling](./cluster-delete.md) and then [installing](./cluster-install.md) again. No data is lost during the process.

!!! Note
The reason for this process is that Run:ai 2.9 cluster installation no longer installs pre-requisites. As such ownership of dependencies such as Prometheus will be undefined if a `helm upgrade` is run.

The process:

* Delete the Run:ai cluster installation according to these [instructions](cluster-delete.md) (do not delete the Run:ai cluster __object__ from the user interface).
* The following commands should be executed __after__ running the helm uninstall command
```
helm get values runai-cluster -n runai > old-values.yaml
```
kubectl -n runai delete all --all
kubectl -n runai delete cm --all
kubectl -n runai delete secret --all
kubectl -n runai delete roles --all
kubectl -n runai delete rolebindings --all
kubectl -n runai delete ingress --all
kubectl -n runai delete servicemonitors --all
kubectl -n runai delete podmonitors --all
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io -l app=runai
kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io -l app=runai
kubectl delete svc -n kube-system runai-cluster-kube-prometh-kubelet
```
* Install the mandatory Run:ai [prerequisites](cluster-prerequisites.md):
* If you have previously installed the SaaS version of Run:ai version 2.7 or below, you will need to install both [Ingress Controller](cluster-prerequisites.md#ingress-controller) and [Prometheus](cluster-prerequisites.md#prometheus).
* If you have previously installed the SaaS version of Run:ai version 2.8 or any Self-hosted version of Run:ai, you will need to install [Prometheus](cluster-prerequisites.md#prometheus) only.


* Install Run:ai cluster as described [here](cluster-install.md)

## Verify Successful Installation

* Review the file `old-values.yaml` and see if there are any changes performed during the last installation.
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Select `Run:ai version: 2.13`.
* Select the `cluster's Kubernetes distribution` and the `Cluster location`
* If the Cluster locaiton is remote to the control plane - Enter a URL for the Kubernetes cluster.
* Press `Continue`.
* Follow the instructions to download a new values file.
* Merge the changes from Step 1 into the new values file.
* Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it on in the cluster.

## Verify Successful Upgrade
See [Verify your installation](cluster-install.md#verify-your-clusters-health) on how to verify a Run:ai cluster installation


Expand Down
37 changes: 34 additions & 3 deletions docs/admin/runai-setup/self-hosted/k8s/upgrade.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,33 @@ If you are installing an air-gapped version of Run:ai, The Run:ai tar file conta

Before proceeding with the upgrade, it's crucial to apply the specific prerequisites associated with your current version of Run:ai and every version in between up to the version you are upgrading to.

### Upgrade from version 2.9
### Upgrade from version 2.7 or 2.8

Before upgrading the control plane, run:

``` bash
POSTGRES_PV=$(kubectl get pvc pvc-postgresql -n runai-backend -o jsonpath='{.spec.volumeName}')
THANOS_PV=$(kubectl get pvc pvc-thanos-receive -n runai-backend -o jsonpath='{.spec.volumeName}')
kubectl patch pv $POSTGRES_PV $THANOS_PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

kubectl delete secret -n runai-backend runai-backend-postgresql
kubectl delete sts -n runai-backend keycloak runai-backend-postgresql
```

Before version 2.9, the Run:ai installation, by default, included NGINX. It was possible to disable this installation. If NGINX is enabled in your current installation, as per the default, run the following 2 lines:

``` bash
kubectl delete ValidatingWebhookConfiguration runai-backend-nginx-ingress-admission
kubectl delete ingressclass nginx
```
(If Run:ai configuration has previously disabled NGINX installation then these lines should not be run).

Next, install NGINX as described [here](../../cluster-setup/cluster-prerequisites.md#ingress-controller)

Then create a TLS secret and upgrade the control plane as described in the [control plane installation](backend.md). Before upgrading, find customizations and merge them as discussed below.


### Upgrade from version 2.9, 2.10 , or 2.11

Two significant changes to the control-plane installation have happened with version 2.12: _PVC ownership_ and _installation customization_.

Expand All @@ -55,6 +81,7 @@ To remove the ownership in an older installation, run:
kubectl patch pvc -n runai-backend pvc-thanos-receive -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
kubectl patch pvc -n runai-backend pvc-postgresql -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
```

#### Ingress

Delete the ingress object which will be recreated by the control plane upgrade
Expand All @@ -71,7 +98,9 @@ The Run:ai control-plane installation has been rewritten and is no longer using


## Upgrade Control Plane

### Upgrade from version 2.13, or later

=== "Connected"

``` bash
Expand All @@ -84,7 +113,9 @@ The Run:ai control-plane installation has been rewritten and is no longer using
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```
### Upgrade from version 2.9

### Upgrade from version 2.7, 2.8, 2.9, or 2.11

* Create a `tls secret` as described in the [control plane installation](backend.md).
* Upgrade the control plane as described in the [control plane installation](backend.md). During the upgrade, you must tell the installation __not__ to create the two PVCs:

Expand All @@ -111,4 +142,4 @@ The Run:ai control-plane installation has been rewritten and is no longer using

## Upgrade Cluster

To upgrade the cluster follow the instructions [here](../../cluster-setup/cluster-upgrade.md).
To upgrade the cluster follow the instructions [here](../../cluster-setup/cluster-upgrade.md).
22 changes: 18 additions & 4 deletions docs/admin/runai-setup/self-hosted/ocp/upgrade.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,23 @@
title: Upgrade self-hosted OpenShift installation
---
# Upgrade Run:ai


!!! Important
Run:ai data is stored in Kubernetes persistent volumes (PVs). Prior to Run:ai 2.12, PVs are owned by the Run:ai installation. Thus, uninstalling the `runai-backend` helm chart may delete all of your data.

From version 2.12 forward, PVs are owned the customer and are independent of the Run:ai installation. As such, they are subject to storage class [reclaim](https://kubernetes.io/docs/concepts/storage/storage-classes/#reclaim-policy){target=_blank} policy.

## Preparations

### Helm
Run:ai requires [Helm](https://helm.sh/){target=_blank} 3.14 or later.
Before you continue, validate your installed helm client version.
To install or upgrade Helm, see [Installing Helm](https://helm.sh/docs/intro/install/){target=_blank}.
If you are installing an air-gapped version of Run:ai, The Run:ai tar file contains the helm binary.

### Software files

=== "Connected"
Run the helm command below:

Expand All @@ -28,8 +32,11 @@ If you are installing an air-gapped version of Run:ai, The Run:ai tar file conta
* Upload the images as described [here](preparations.md#software-artifacts).

## Before upgrade

Before proceeding with the upgrade, it's crucial to apply the specific prerequisites associated with your current version of Run:ai and every version in between up to the version you are upgrading to.

### Upgrade from version 2.7 or 2.8

Before upgrading the control plane, run:

``` bash
Expand All @@ -42,10 +49,11 @@ kubectl delete sts -n runai-backend keycloak runai-backend-postgresql

Then upgrade the control plane as described [below](#upgrade-control-plane). Before upgrading, find customizations and merge them as discussed below.

### Upgrade from version 2.9
Two significant changes to the control-plane installation have happened with version 2.12: _PVC ownership_ and _installation customization_.
### Upgrade from version 2.9, 2.10 or 2.11

Two significant changes to the control-plane installation have happened with version 2.12: _PVC ownership_ and _installation customization_.
#### PVC ownership

Run:ai will no longer directly create the PVCs that store Run:ai data (metrics and database). Instead, going forward,

* Run:ai requires a Kubernetes storage class to be installed.
Expand All @@ -60,13 +68,17 @@ kubectl patch pvc -n runai-backend pvc-postgresql -p '{"metadata": {"annotation
```

#### Installation customization

The Run:ai control-plane installation has been rewritten and is no longer using a _backend values file_. Instead, to customize the installation use standard `--set` flags. If you have previously customized the installation, you must now extract these customizations and add them as `--set` flag to the helm installation:

* Find previous customizations to the control plane if such exist. Run:ai provides a utility for that here `https://raw.githubusercontent.com/run-ai/docs/v2.13/install/backend/cp-helm-vals-diff.sh`. For information on how to use this utility please contact Run:ai customer support.
* Search for the customizations you found in the [optional configurations](./backend.md#additional-runai-configurations-optional) table and add them in the new format.


## Upgrade Control Plane

### Upgrade from version 2.13, or later

=== "Connected"

``` bash
Expand All @@ -79,7 +91,8 @@ The Run:ai control-plane installation has been rewritten and is no longer using
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```
### Upgrade from version 2.9

### Upgrade from version 2.7, 2.8, 2.9, or 2.11

=== "Connected"

Expand Down Expand Up @@ -111,4 +124,5 @@ The Run:ai control-plane installation has been rewritten and is no longer using
1. The subdomain configured for the OpenShift cluster.

## Upgrade Cluster
To upgrade the cluster follow the instructions [here](../../cluster-setup/cluster-upgrade.md).

To upgrade the cluster follow the instructions [here](../../cluster-setup/cluster-upgrade.md).
Loading