Added overview #1430


Merged
merged 4 commits into from
Feb 26, 2025
8 changes: 4 additions & 4 deletions docs/admin/config/clusters.md
Original file line number Diff line number Diff line change
@@ -156,7 +156,7 @@ Before starting, make sure you have the following:
* Try to identify the problem from the logs. If you cannot resolve the issue, continue to the next step.

5. Contact Run:ai’s support
* If the issue persists, [contact Run:ai’s support](../../home/overview.md#how-to-get-support) for assistance.
* If the issue persists, [contact Run:ai’s support](../../home/documentation-library.md#how-to-get-support) for assistance.

??? "Cluster has service issues"
__Description__: When a cluster's status is _Has service issues_, it means that one or more Run:ai services running in the cluster are not available.
@@ -194,7 +194,7 @@ Before starting, make sure you have the following:
```

4. Contact Run:ai’s Support
* If the issue persists, [contact Run:ai’s support](../../home/overview.md#how-to-get-support) for assistance.
* If the issue persists, [contact Run:ai’s support](../../home/documentation-library.md#how-to-get-support) for assistance.

??? "Cluster is waiting to connect"
__Description__: When the cluster's status is ‘waiting to connect’, it means that no communication from the cluster services reaches the Run:ai Platform. This may be due to networking issues or issues with Run:ai services.
@@ -285,7 +285,7 @@ Before starting, make sure you have the following:
* Try to identify the problem from the logs. If you cannot resolve the issue, continue to the next step

5. Contact Run:ai’s support
* If the issue persists, [contact Run:ai’s support](../../home/overview.md#how-to-get-support) for assistance.
* If the issue persists, [contact Run:ai’s support](../../home/documentation-library.md#how-to-get-support) for assistance.

??? "Cluster is missing prerequisites"
__Description__: When a cluster's status displays Missing prerequisites, it indicates that at least one of the Mandatory Prerequisites has not been fulfilled. In such cases, Run:ai services may not function properly.
@@ -316,5 +316,5 @@ Before starting, make sure you have the following:
* This section provides detailed information about any missing resources or prerequisites. Review this information to identify what is needed

5. Contact Run:ai’s support
* If the issue persists, [contact Run:ai’s support](../../home/overview.md#how-to-get-support) for assistance.
* If the issue persists, [contact Run:ai’s support](../../home/documentation-library.md#how-to-get-support) for assistance.

Original file line number Diff line number Diff line change
@@ -8,7 +8,7 @@ This section is a step-by-step guide for setting up a Run:ai cluster.
* A Run:ai cluster connects to the Run:ai control plane on the cloud. The control plane provides a control point as well as a monitoring and control user interface for Administrators and Researchers.
* A customer may have multiple Run:ai Clusters, all connecting to a single control plane.

For additional details see the [Run:ai system components](../../../home/components.md)
For additional details see the [Run:ai system components](../../../home/overview.md#runai-system-components)

## Documents

4 changes: 2 additions & 2 deletions docs/admin/runai-setup/installation-types.md
@@ -3,8 +3,8 @@

Run:ai consists of two components:

* The Run:ai [Cluster](../../home/components.md#runai-cluster). One or more data-science GPU clusters hosted by the customer (on-prem or cloud).
* The Run:ai [Control plane](../../home/components.md#components). A single entity that monitors clusters, sets priorities, and business policies.
* The Run:ai [Cluster](../../home/overview.md#runai-cluster). One or more data-science GPU clusters hosted by the customer (on-prem or cloud).
* The Run:ai [Control plane](../../home/overview.md#runai-control-plane). A single entity that monitors clusters, sets priorities, and business policies.

There are two main installation options:

2 changes: 1 addition & 1 deletion docs/developer/overview-developer.md
@@ -11,7 +11,7 @@ Developers can access Run:ai through various programmatic interfaces.

## API Architecture

Run:ai is composed of a single, multi-tenant control plane. Each tenant can be connected to one or more GPU clusters. See [Run:ai system components](../home/components.md) for detailed information.
Run:ai is composed of a single, multi-tenant control plane. Each tenant can be connected to one or more GPU clusters. See [Run:ai system components](../home/overview.md#runai-system-components) for detailed information.

The following programming interfaces are available:

42 changes: 0 additions & 42 deletions docs/home/components.md

This file was deleted.

63 changes: 63 additions & 0 deletions docs/home/documentation-library.md
@@ -0,0 +1,63 @@
# Run:ai Documentation Library


Welcome to the Run:ai documentation area. For an introduction to the Run:ai Platform, see [Run:ai platform](https://www.run.ai/platform/){target=_blank} on the run.ai website.

The Run:ai documentation targets four personas:

* __Infrastructure Administrator__ - An IT person, responsible for the installation, setup and IT maintenance of the Run:ai product. Infrastructure Administrator documentation can be found [here](../admin/overview-administrator.md).

* __Platform Administrator__ - Responsible for the day-to-day administration of the product. Platform Administrator documentation can be found [here](../platform-admin/overview.md).


* __Researcher__ - Uses Run:ai to spin up notebooks, submit Workloads, prompt models, etc. Researcher documentation can be found [here](../Researcher/overview-researcher.md).

* __Developer__ - Uses various APIs to automate work with Run:ai. Developer documentation can be found [here](../developer/overview-developer.md).

## How to Get Support

To get support, use the following channels:

* On the Run:ai user interface at `<company-name>.run.ai`, use the 'Contact Support' link on the top right.

* Or submit a ticket by clicking the button below:

[Submit a Ticket](https://runai.secure.force.com/casesupport/CreateCaseForm){target=_blank .md-button .custom-ticket-button}



## Community

Run:ai provides its customers with access to the _Run:ai Customer Community portal_ to submit tickets, track ticket progress and update support cases.

[Customer Community Portal](https://runai-support.force.com/community/s/){target=_blank .md-button .custom-ticket-button}

Reach out to customer support for credentials.


## Run:ai Cloud Status Page

Run:ai cloud availability is monitored at [status.run.ai](https://status.run.ai){target=_blank}.

## Collect Logs to Send to Support

As an IT Administrator, you can collect Run:ai logs to send to support. For more information, see [logs collection](../admin/troubleshooting/logs-collection.md).

## Example Code

Code for the Docker images referred to on this site is available at [https://github.com/run-ai/docs/tree/master/quickstart](https://github.com/run-ai/docs/tree/master/quickstart){target=_blank}.

The following images are used throughout the documentation:

| Image | Description | Source |
|--------|-------------|--------|
| [runai.jfrog.io/demo/quickstart](https://runai.jfrog.io/artifactory/demo/quickstart){target=_blank} | Basic training image. Multi-GPU support | [https://github.com/run-ai/docs/tree/master/quickstart/main](https://github.com/run-ai/docs/tree/master/quickstart/main){target=_blank} |
| [runai.jfrog.io/demo/quickstart-distributed](https://runai.jfrog.io/artifactory/demo/quickstart-distributed){target=_blank} | Distributed training using MPI and Horovod | [https://github.com/run-ai/docs/tree/master/quickstart/distributed](https://github.com/run-ai/docs/tree/master/quickstart/distributed){target=_blank} |
| [zembutsu/docker-sample-nginx](https://hub.docker.com/r/zembutsu/docker-sample-nginx){target=_blank} | Build (interactive) with Connected Ports | [https://github.com/zembutsu/docker-sample-nginx](https://github.com/zembutsu/docker-sample-nginx){target=_blank} |
| [runai.jfrog.io/demo/quickstart-x-forwarding](https://runai.jfrog.io/artifactory/demo/quickstart-x-forwarding){target=_blank} | Use X11 forwarding from Docker image | [https://github.com/run-ai/docs/tree/master/quickstart/x-forwarding](https://github.com/run-ai/docs/tree/master/quickstart/x-forwarding){target=_blank} |
| [runai.jfrog.io/demo/pycharm-demo](https://runai.jfrog.io/artifactory/demo/pycharm-demo){target=_blank} | Image used for tool integration (PyCharm and VSCode) | [https://github.com/run-ai/docs/tree/master/quickstart/python%2Bssh](https://github.com/run-ai/docs/tree/master/quickstart/python%2Bssh){target=_blank} |
| [runai.jfrog.io/demo/example-triton-client](https://runai.jfrog.io/artifactory/demo/example-triton-client){target=_blank} and [runai.jfrog.io/demo/example-triton-server](https://runai.jfrog.io/artifactory/demo/example-triton-server){target=_blank} | Basic Inference | [https://github.com/run-ai/models/tree/main/models/triton](https://github.com/run-ai/models/tree/main/models/triton){target=_blank} |

## Contributing to the Documentation

This documentation is improved by contributions from our customer and partner community. If you see something worth fixing, please comment at the bottom of the page or create a pull request via GitHub. A link to the public GitHub repository is at the top right of this page.
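
When a pull request moves or deletes pages, relative links in other pages can be left pointing at files that no longer exist. The following is a minimal sketch of a pre-submission check, not part of any Run:ai tooling: the function name `broken_relative_links` and the `docs` root passed to it are assumptions made for illustration. It only verifies that the target file of each inline Markdown link exists; it does not validate `#anchor` fragments or external URLs.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](../home/overview.md#anchor).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(docs_root):
    """Return (source file, link target) pairs whose target file does not exist.

    Hypothetical helper for illustration; anchors are ignored.
    """
    root = Path(docs_root)
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs, mail links, and same-page anchors.
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue
            # Drop the fragment and resolve the path relative to the linking file.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).resolve().exists():
                broken.append((md_file.relative_to(root).as_posix(), target))
    return broken
```

Calling `broken_relative_links("docs")` from the repository root returns a list that is empty when every relative link resolves to an existing file.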