diff --git a/docs/Researcher/Walkthroughs/quickstart-inference.md b/docs/Researcher/Walkthroughs/quickstart-inference.md
index cba5d22515..b124b51f88 100644
--- a/docs/Researcher/Walkthroughs/quickstart-inference.md
+++ b/docs/Researcher/Walkthroughs/quickstart-inference.md
@@ -34,10 +34,10 @@ As described, the inference client can be created via CLI. To perform this, you

 ### Login

-=== "CLI V1"
+=== "CLI V2"
     Run `runai login` and enter your credentials.

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Run `runai login` and enter your credentials.

 === "User Interface"
@@ -65,11 +65,10 @@ Under `Environments` Select __NEW ENVIRONMENT__. Then select:

 ### Run an Inference Workload

-
-=== "CLI V1"
+=== "CLI V2"
     Not available right now.

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Not available right now.

 === "User Interface"
@@ -145,22 +144,21 @@ You can use the Run:ai Triton demo client to send requests to the server

 * Copy the inference endpoint URL.

-=== "CLI V1"
+=== "CLI V2"
     Open a terminal and run:

     ``` bash
-    runai config project team-a
-    runai submit inference-client-1 -i runai.jfrog.io/demo/example-triton-client \
+    runai project set team-a
+    runai training submit inference-client-1 -i runai.jfrog.io/demo/example-triton-client \
     -- perf_analyzer -m inception_graphdef -p 3600000 -u <inference-endpoint-url>
     ```

-
-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Open a terminal and run:

     ``` bash
-    runai project set team-a
-    runai training submit inference-client-1 -i runai.jfrog.io/demo/example-triton-client \
+    runai config project team-a
+    runai submit inference-client-1 -i runai.jfrog.io/demo/example-triton-client \
     -- perf_analyzer -m inception_graphdef -p 3600000 -u <inference-endpoint-url>
     ```

@@ -185,10 +183,10 @@ In the user interface, under `inference-server-1`, go to the `Metrics` tab and w

 Run the following:

-=== "CLI V1"
+=== "CLI V2"
     Not available right now

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Not available right now

 === "User Interface"
diff --git a/docs/Researcher/Walkthroughs/quickstart-vscode.md b/docs/Researcher/Walkthroughs/quickstart-vscode.md
index c0cf140015..d30e5ce6a0 100644
--- a/docs/Researcher/Walkthroughs/quickstart-vscode.md
+++ b/docs/Researcher/Walkthroughs/quickstart-vscode.md
@@ -30,10 +30,10 @@ To complete this Quickstart __via the CLI__, you will need to have the Run:ai CL

 ### Login

-=== "CLI V1"
+=== "CLI V2"
     Run `runai login` and enter your credentials.

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Run `runai login` and enter your credentials.

 === "User Interface"
@@ -57,29 +57,28 @@ Under `Environments` Select __NEW ENVIRONMENT__. Then select:

 ### Run Workload

-
-=== "CLI V1"
+=== "CLI V2"
     Open a terminal and run:

     ``` bash
-    runai config project team-a
-    runai submit vs1 --jupyter -g 1
+    runai project set team-a
+    runai workspace submit vs1 --image quay.io/opendatahub-contrib/workbench-images:vscode-datascience-c9s-py311_2023c_latest \
+      --gpu-devices-request 1 --external-url container=8787
     ```

     !!! Note
-        For more information on the workload submit command, see [cli documentation](../cli-reference/runai-submit.md).
+        For more information on the workspace submit command, see [cli documentation](../cli-reference/new-cli/runai_workspace_submit.md).
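+
+    Once the workspace is up, you can optionally verify it from the terminal before opening the UI (a sketch reusing the `runai workspace` commands shown in the tools guides; the exact URL for VS Code is shown in the Run:ai user interface):
+
+    ``` bash
+    runai workspace list
+    runai workspace port-forward vs1 --port 8787:8787
+    ```
+
+    Then browse to `http://localhost:8787`.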
-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Open a terminal and run:

     ``` bash
-    runai project set team-a
-    runai workspace submit vs1 --image quay.io/opendatahub-contrib/workbench-images:vscode-datascience-c9s-py311_2023c_latest \
-      --gpu-devices-request 1 --external-url container=8787
+    runai config project team-a
+    runai submit vs1 --jupyter -g 1
     ```

     !!! Note
-        For more information on the workspace submit command, see [cli documentation](../cli-reference/new-cli/runai_workspace_submit.md).
+        For more information on the workload submit command, see [cli documentation](../cli-reference/runai-submit.md).

 === "User Interface"
     * In the Run:ai UI select __Workloads__
@@ -141,16 +140,16 @@ Via the Run:ai user interface, go to `Workloads`, select the `vs1` Workspace and

 Run the following:

-=== "CLI V1"
-    ``` bash
-    runai delete job vs1
-    ```
-
 === "CLI V2"
     ```
     runai workspace delete vs1
     ```

+=== "CLI V1 (Deprecated)"
+    ``` bash
+    runai delete job vs1
+    ```
+
 === "User Interface"
     Select the Workspace and press __DELETE__.

diff --git a/docs/Researcher/Walkthroughs/walkthrough-build-ports.md b/docs/Researcher/Walkthroughs/walkthrough-build-ports.md
index 95f5a64286..1c73fab717 100644
--- a/docs/Researcher/Walkthroughs/walkthrough-build-ports.md
+++ b/docs/Researcher/Walkthroughs/walkthrough-build-ports.md
@@ -24,11 +24,23 @@

 * At the command-line run:

-``` bash
-runai config project team-a
-runai submit nginx-test -i zembutsu/docker-sample-nginx --interactive
-runai port-forward nginx-test --port 8080:80
-```
+=== "CLI V2"
+    Open a terminal and run:
+
+    ``` bash
+    runai project set team-a
+    runai workspace submit nginx-test -i zembutsu/docker-sample-nginx
+    runai workspace port-forward nginx-test --port 8080:80
+    ```
+
+=== "CLI V1 (Deprecated)"
+    Open a terminal and run:
+
+    ```shell
+    runai config project team-a
+    runai submit nginx-test -i zembutsu/docker-sample-nginx --interactive
+    runai port-forward nginx-test --port 8080:80
+    ```

 * The Job is based on a sample _NGINX_ webserver docker image `zembutsu/docker-sample-nginx`. Once accessed via a browser, the page shows the container name.
 * Note the _interactive_ flag which means the Job will not have a start or end. It is the Researcher's responsibility to close the Job.
@@ -37,13 +49,22 @@ runai port-forward nginx-test --port 8080:80

 The result will be:

-``` bash
-The job 'nginx-test-0' has been submitted successfully
-You can run `runai describe job nginx-test-0 -p team-a` to check the job status
-
-Forwarding from 127.0.0.1:8080 -> 80
-Forwarding from [::1]:8080 -> 80
-```
+=== "CLI V2"
+    ```shell
+    Creating workspace nginx-test...
+    To track the workload's status, run 'runai workspace list'
+
+    port-forward started, opening ports [8080:80]
+    ```
+
+=== "CLI V1 (Deprecated)"
+    ```shell
+    The job 'nginx-test-0' has been submitted successfully
+    You can run `runai describe job nginx-test-0 -p team-a` to check the job status
+
+    Forwarding from 127.0.0.1:8080 -> 80
+    Forwarding from [::1]:8080 -> 80
+    ```

 ### Access the Webserver

diff --git a/docs/Researcher/Walkthroughs/walkthrough-build.md b/docs/Researcher/Walkthroughs/walkthrough-build.md
index 901dd9747f..20ee9680a0 100644
--- a/docs/Researcher/Walkthroughs/walkthrough-build.md
+++ b/docs/Researcher/Walkthroughs/walkthrough-build.md
@@ -27,11 +27,10 @@ To complete this Quickstart __via the CLI__, you will need to have the Run:ai CL

 ## Step by Step Quickstart

 ### Login
-
-=== "CLI V1"
+=== "CLI V2"
     Run `runai login` and enter your credentials.
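+
+    If you are unsure which login option your setup uses, you can list them first (the same `--help` form the training quickstarts in this PR use):
+
+    ``` bash
+    runai login --help
+    ```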
-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Run `runai login` and enter your credentials.

 === "User Interface"
@@ -43,27 +42,26 @@ To complete this Quickstart __via the CLI__, you will need to have the Run:ai CL

 ### Create a Workspace

-
-=== "CLI V1"
+=== "CLI V2"
     Open a terminal and run:

     ``` bash
-    runai config project team-a
-    runai submit build1 -i ubuntu -g 1 --interactive -- sleep infinity
+    runai project set team-a
+    runai workspace submit build1 -i ubuntu -g 1 --command -- sleep infinity
     ```

     !!! Note
-        For more information on the workload submit command, see [cli documentation](../cli-reference/runai-submit.md).
+        For more information on the workspace submit command, see [cli documentation](../cli-reference/new-cli/runai_workspace_submit.md).

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Open a terminal and run:

     ``` bash
-    runai project set team-a
-    runai workspace submit build1 -i ubuntu -g 1 --command -- sleep infinity
+    runai config project team-a
+    runai submit build1 -i ubuntu -g 1 --interactive -- sleep infinity
     ```

     !!! Note
-        For more information on the workspace submit command, see [cli documentation](../cli-reference/new-cli/runai_workspace_submit.md).
+        For more information on the workload submit command, see [cli documentation](../cli-reference/runai-submit.md).

 === "User Interface"
     * In the Run:ai UI select __Workloads__
@@ -115,14 +113,6 @@ To complete this Quickstart __via the CLI__, you will need to have the Run:ai CL

 Follow up on the Workload's progress by running:

-=== "CLI V1"
-    ``` bash
-    runai list jobs
-    ```
-    The result:
-    ![mceclip20.png](img/mceclip20.png)
-
-
 === "CLI V2"
     ``` bash
     runai workspace list
     ```
     The result:
     ```
     Workload   Type       Status    Project   Preemptible   Running/Requested Pods   GPU Allocation
     ───────────────────────────────────────────────────────────────────────────────────────────────
     build1     Workspace  Running   team-a    No            1/1                      1.00
     ```

+=== "CLI V1 (Deprecated)"
+    ``` bash
+    runai list jobs
+    ```
+    The result:
+    ![mceclip20.png](img/mceclip20.png)
+
+
 === "User Interface"
     * Open the Run:ai user interface.
     * Under "Workloads" you can view the new Workspace:
@@ -159,14 +157,14 @@ A full list of Job statuses can be found [here](../../platform-admin/workloads/o

 To get additional status on your Workload run:

-=== "CLI V1"
+=== "CLI V2"
     ``` bash
-    runai describe job build1
+    runai workspace describe build1
     ```

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     ``` bash
-    runai workspace describe build1
+    runai describe job build1
     ```

 === "User Interface"
@@ -175,15 +173,15 @@ To get additional status on your Workload run:

 ### Get a Shell to the container

-=== "CLI V1"
-    Run:
+=== "CLI V2"
     ``` bash
-    runai bash build1
+    runai workspace bash build1
     ```

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
+    Run:
     ``` bash
-    runai workspace bash build1
+    runai bash build1
     ```

 This should provide a direct shell into the container
@@ -195,16 +193,16 @@ This should provide a direct shell into the container

 Run the following:

-=== "CLI V1"
-    ``` bash
-    runai delete job build1
-    ```
-
 === "CLI V2"
     ```
     runai workspace delete build1
     ```

+=== "CLI V1 (Deprecated)"
+    ``` bash
+    runai delete job build1
+    ```
+
 === "User Interface"
     Select the Workspace and press __DELETE__.

diff --git a/docs/Researcher/Walkthroughs/walkthrough-fractions.md b/docs/Researcher/Walkthroughs/walkthrough-fractions.md
index 5cef9ee9a3..2bca4f7a72 100644
--- a/docs/Researcher/Walkthroughs/walkthrough-fractions.md
+++ b/docs/Researcher/Walkthroughs/walkthrough-fractions.md
@@ -33,10 +33,10 @@ To complete this Quickstart, the [Platform Administrator](../../platform-admin/o

 ### Login

-=== "CLI V1"
+=== "CLI V2"
     Run `runai login` and enter your credentials.
-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     Run `runai login` and enter your credentials.

 === "User Interface"
@@ -50,13 +50,6 @@ To complete this Quickstart, the [Platform Administrator](../../platform-admin/o

 Open a terminal and run:

-=== "CLI V1"
-    ``` bash
-    runai config project team-a
-    runai submit frac05 -i runai.jfrog.io/demo/quickstart -g 0.5
-    runai submit frac05-2 -i runai.jfrog.io/demo/quickstart -g 0.5
-    ```
-
 === "CLI V2"
     ``` bash
     runai project set team-a
     runai training submit frac05 -i runai.jfrog.io/demo/quickstart --gpu-portion-request 0.5
     runai training submit frac05-2 -i runai.jfrog.io/demo/quickstart --gpu-portion-request 0.5
     ```

+=== "CLI V1 (Deprecated)"
+    ``` bash
+    runai config project team-a
+    runai submit frac05 -i runai.jfrog.io/demo/quickstart -g 0.5
+    runai submit frac05-2 -i runai.jfrog.io/demo/quickstart -g 0.5
+    ```
+
 === "User Interface"
     * In the Run:ai UI select __Workloads__
     * Select __New Workload__ and then __Training__
@@ -113,32 +113,34 @@ Open a terminal and run:

 Follow up on the Workload's progress by running:

-=== "CLI V1"
+=== "CLI V2"
     ``` bash
-    runai list jobs
+    runai training list
     ```
+
     The result:
     ```
-    Showing jobs for project team-a
-    NAME      STATUS   AGE  NODE                  IMAGE                           TYPE   PROJECT  USER   GPUs Allocated (Requested)  PODs Running (Pending)  SERVICE URL(S)
-    frac05    Running  9s   runai-cluster-worker  runai.jfrog.io/demo/quickstart  Train  team-a   yaron  0.50 (0.50)                 1 (0)
-    frac05-2  Running  8s   runai-cluster-worker  runai.jfrog.io/demo/quickstart  Train  team-a   yaron  0.50 (0.50)                 1 (0)
+    Workload    Type      Status    Project   Preemptible   Running/Requested Pods   GPU Allocation
+    ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+    frac05      Training  Running   team-a    Yes           0/1                      0.00
+    frac05-2    Training  Running   team-a    Yes           0/1                      0.00
     ```

-=== "CLI V2"
+=== "CLI V1 (Deprecated)"
     ``` bash
-    runai training list
+    runai list jobs
     ```

     The result:
     ```
-    Workload    Type      Status    Project   Preemptible   Running/Requested Pods   GPU Allocation
-    ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
-    frac05      Training  Running   team-a    Yes           0/1                      0.00
-    frac05-2    Training  Running   team-a    Yes           0/1                      0.00
+    Showing jobs for project team-a
+    NAME      STATUS   AGE  NODE                  IMAGE                           TYPE   PROJECT  USER   GPUs Allocated (Requested)  PODs Running (Pending)  SERVICE URL(S)
+    frac05    Running  9s   runai-cluster-worker  runai.jfrog.io/demo/quickstart  Train  team-a   yaron  0.50 (0.50)                 1 (0)
+    frac05-2  Running  8s   runai-cluster-worker  runai.jfrog.io/demo/quickstart  Train  team-a   yaron  0.50 (0.50)                 1 (0)
     ```
+
 === "User Interface"
     * Open the Run:ai user interface.
     * Under `Workloads` you can view the two new Training Workloads
@@ -147,16 +149,16 @@ Follow up on the Workload's progress by running:

 To verify that the Workload sees only parts of the GPU memory run:

-=== "CLI V1"
-    ```
-    runai exec frac05 nvidia-smi
-    ```
-
 === "CLI V2"
     ``` bash
     runai training exec frac05 nvidia-smi
     ```

+=== "CLI V1 (Deprecated)"
+    ```
+    runai exec frac05 nvidia-smi
+    ```
+
 The result:
 ![mceclip32.png](img/mceclip32.png)

 Notes:
@@ -170,16 +172,16 @@ Notes:

 Instead of requesting a fraction of the GPU, you can ask for specific GPU memory requirements. For example:
-=== "CLI V1"
-    ``` bash
-    runai submit -i runai.jfrog.io/demo/quickstart --gpu-memory 5G
-    ```
-
 === "CLI V2"
     ```
     runai training submit -i runai.jfrog.io/demo/quickstart --gpu-memory-request 5G
     ```

+=== "CLI V1 (Deprecated)"
+    ``` bash
+    runai submit -i runai.jfrog.io/demo/quickstart --gpu-memory 5G
+    ```
+
 === "User Interface"
     As part of the Workload submission, create a new `Compute Resource` with 1 GPU Device and 5GB of `GPU memory per device`. See picture below:
     ![](img/compute-resource-5gb.png)

diff --git a/docs/Researcher/Walkthroughs/walkthrough-overquota.md b/docs/Researcher/Walkthroughs/walkthrough-overquota.md
index 6ed4fdcf25..beb94e5f9c 100644
--- a/docs/Researcher/Walkthroughs/walkthrough-overquota.md
+++ b/docs/Researcher/Walkthroughs/walkthrough-overquota.md
@@ -30,12 +30,6 @@ Run `runai login` and enter your credentials.

 Open a terminal and run the following command:

-=== "CLI V1"
-    ```
-    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
-    runai submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
-    runai submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    ```
 === "CLI V2"
     ```
     runai training submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
     runai training submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
     runai training submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     ```

+=== "CLI V1 (Deprecated)"
+    ```
+    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
+    runai submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
+    runai submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    ```
+
 System status after run:

 ![overquota1](img/overquota1.png)
@@ -56,16 +57,17 @@ System status after run:

 Run the following command:

-=== "CLI V1"
-
-    ```
-    runai submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    ```
 === "CLI V2"
     ```
     runai training submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     ```

+=== "CLI V1 (Deprecated)"
+
+    ```
+    runai submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    ```
+
 System status after run:

 ![overquota2](img/overquota2.png)
@@ -78,56 +80,62 @@ System status after run:

 Run the following command:

-=== "CLI V1"
-
-    ```
-    runai delete job a2 -p team-a
-    ```
 === "CLI V2"
     ```
     runai training delete a2
     ```
+
+=== "CLI V1 (Deprecated)"
+
+    ```
+    runai delete job a2 -p team-a
+    ```
+
 _a1_ is now going to start running again.

 Run:

-=== "CLI V1"
-
-    ```
-    runai list jobs -A
-    ```
 === "CLI V2"
     ```
     runai training list -A
     ```

+=== "CLI V1 (Deprecated)"
+
+    ```
+    runai list jobs -A
+    ```
+
 You have __two__ Jobs that are running on the first node and __one__ Job that is running alone on the second node.
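+
+For illustration only, the V2 listing might look something like the sample below (column layout borrowed from the `runai training list` output shown in the fractions quickstart; exact values depend on your cluster):
+
+```
+Workload  Type      Status   Project  Preemptible  Running/Requested Pods  GPU Allocation
+──────────────────────────────────────────────────────────────────────────────────────────
+a1        Training  Running  team-a   Yes          1/1                     1.00
+b1        Training  Running  team-b   Yes          1/1                     1.00
+b2        Training  Running  team-b   Yes          1/1                     1.00
+```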
Choose one of the two Jobs from the full node and delete it:

-=== "CLI V1"
-
-    ```
-    runai delete job <job-name> -p <project>
-    ```
 === "CLI V2"
     ```
     runai training delete <workload-name> -p <project>
     ```

+=== "CLI V1 (Deprecated)"
+
+    ```
+    runai delete job <job-name> -p <project>
+    ```
+
 The status now is:

 ![overquota3](img/overquota3.png)

 Now, run a 2 GPU Job:

-=== "CLI V1"
-    ```
-    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
-    ```
 === "CLI V2"
     ```
     runai training submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
     ```

+=== "CLI V1 (Deprecated)"
+
+    ```
+    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 2 -p team-a
+    ```
+
 The status now is:

 ![overquota4](img/overquota4.png)

diff --git a/docs/Researcher/Walkthroughs/walkthrough-queue-fairness.md b/docs/Researcher/Walkthroughs/walkthrough-queue-fairness.md
index 7946bb1314..bba945c77f 100644
--- a/docs/Researcher/Walkthroughs/walkthrough-queue-fairness.md
+++ b/docs/Researcher/Walkthroughs/walkthrough-queue-fairness.md
@@ -29,13 +29,6 @@ Run `runai login` and enter your credentials.

 Run the following commands:

-=== "CLI V1"
-    ```
-    runai submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
-    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
-    runai submit a3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
-    runai submit a4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
-    ```
 === "CLI V2"
     ```
     runai training submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
     runai training submit a2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
     runai training submit a3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
     runai training submit a4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
     ```

+=== "CLI V1 (Deprecated)"
+    ```
+    runai submit a1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
+    runai submit a2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
+    runai submit a3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
+    runai submit a4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-a
+    ```
+
 System status after run:

 ![overquota-fairness11](img/overquota-fairness1.png)
@@ -54,13 +55,6 @@ System status after run:

 Run the following commands:

-=== "CLI V1"
-    ```
-    runai submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    runai submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    runai submit b3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    runai submit b4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
-    ```
 === "CLI V2"
     ```
     runai training submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     runai training submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     runai training submit b3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     runai training submit b4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
     ```

+=== "CLI V1 (Deprecated)"
+    ```
+    runai submit b1 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    runai submit b2 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    runai submit b3 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    runai submit b4 -i runai.jfrog.io/demo/quickstart -g 1 -p team-b
+    ```
+
 System status after run:

 ![overquota-fairness12](img/overquota-fairness2.png)
@@ -81,14 +83,14 @@ System status after run:

 Now let's start deleting Jobs. Alternatively, you can wait for Jobs to complete.

-=== "CLI V1"
-    ```
-    runai delete job b2 -p team-b
-    ```
 === "CLI V2"
     ```
     runai training delete b2 -p team-b
     ```
+=== "CLI V1 (Deprecated)"
+    ```
+    runai delete job b2 -p team-b
+    ```

 !!! Discussion
     As the quotas are equal (1 for each Project), the remaining pending Jobs will get scheduled one by one, alternating between Projects, regardless of the time at which they were submitted.
diff --git a/docs/Researcher/best-practices/convert-to-unattended.md b/docs/Researcher/best-practices/convert-to-unattended.md
index 4b6e6b3e28..d9388171df 100644
--- a/docs/Researcher/best-practices/convert-to-unattended.md
+++ b/docs/Researcher/best-practices/convert-to-unattended.md
@@ -25,7 +25,7 @@ Realizing that Researchers are not always proficient with building docker files,
 You will want to minimize the cycle of code change-and-run. There are a couple of best practices which you can choose from:

 1. Code resides on the network file storage. This way you can change the code and immediately run the Job. The Job picks up the new files from the network.
-2. Use the `runai submit` flag `--git-sync`. The flag allows the Researcher to provide details of a Git repository. The repository will be automatically cloned into a specified directory when the container starts.
+2. Use the `runai training submit` flag `--git-sync`. The flag allows the Researcher to provide details of a Git repository. The repository will be automatically cloned into a specified directory when the container starts.
 3. The code can be embedded within the image. In this case, you will want to create an automatic CI/CD process, which packages the code into a modified image.

 The document below assumes option #1.
@@ -72,28 +72,48 @@ For more information on best practices for saving checkpoints, see [Saving Deep

 ## Running the Job

-Using ``runai submit``, drop the flag ``--interactive``. For submitting a Job using the script created above, please use ``-- [COMMAND]`` flag to specify a command, use the `--` syntax to pass arguments, and pass environment variables using the flag ``--environment``.
-
 Example with Environment variables:

+=== "CLI V2"
+    Use ``runai training submit``. To submit a Job using the script created above, specify the command after the ``--`` separator, pass arguments the same way, and pass environment variables using the ``-e`` (``--environment``) flag.
+
+    ```shell
+    runai training submit -i tensorflow/tensorflow:1.14.0-gpu-py3 \
+        --host-path path=/nfs/john,mount=/mydir -g 1 --working-dir /mydir/ \
+        -e 'EPOCHS=30' -e 'LEARNING_RATE=0.02' \
+        -- ./startup.sh
+    ```
+
+=== "CLI V1 (Deprecated)"
+    Use ``runai submit`` and drop the flag ``--interactive``. To submit a Job using the script created above, specify the command after the ``--`` separator, pass arguments the same way, and pass environment variables using the ``-e`` (``--environment``) flag.
+
+    ```shell
+    runai submit train1 -i tensorflow/tensorflow:1.14.0-gpu-py3 \
+        -v /nfs/john:/mydir -g 1 --working-dir /mydir/ \
+        -e 'EPOCHS=30' -e 'LEARNING_RATE=0.02' \
+        -- ./startup.sh
+    ```

 Example with Command-line arguments:

+=== "CLI V2"
+    ```shell
+    runai training submit -i tensorflow/tensorflow:1.14.0-gpu-py3 \
+        --host-path path=/nfs/john,mount=/mydir -g 1 --working-dir /mydir/ \
+        -- ./startup.sh batch-size=64 number-of-epochs=3
+    ```
+
+    Please refer to [Command-Line Interface, runai training submit](../cli-reference/new-cli/runai_training_submit.md) for a list of all arguments accepted by the Run:ai CLI.
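+
+    The examples above assume a `startup.sh` in the working directory. A rough, hypothetical sketch (not part of the original docs; `train.py` and its flags are placeholders for your own trainer):
+
+    ```shell
+    #!/bin/bash
+    # Read hyperparameters from environment variables, with defaults.
+    EPOCHS="${EPOCHS:-10}"
+    LEARNING_RATE="${LEARNING_RATE:-0.01}"
+    echo "Starting training: epochs=${EPOCHS} learning-rate=${LEARNING_RATE}"
+    # Forward any command-line arguments (e.g. batch-size=64) to the trainer.
+    exec python train.py --epochs "${EPOCHS}" --learning-rate "${LEARNING_RATE}" "$@"
+    ```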
+=== "CLI V1 (Deprecated)"
+    ```shell
+    runai submit train1 -i tensorflow/tensorflow:1.14.0-gpu-py3 \
+        -v /nfs/john:/mydir -g 1 --working-dir /mydir/ \
+        -- ./startup.sh batch-size=64 number-of-epochs=3
+    ```
+
+    Please refer to [Command-Line Interface, runai submit](../cli-reference/runai-submit.md) for a list of all arguments accepted by the Run:ai CLI.

 ### Use CLI Policies

diff --git a/docs/Researcher/best-practices/secrets-as-env-var-in-cli.md b/docs/Researcher/best-practices/secrets-as-env-var-in-cli.md
index c4f3bd3ee7..17c2f4fe4a 100644
--- a/docs/Researcher/best-practices/secrets-as-env-var-in-cli.md
+++ b/docs/Researcher/best-practices/secrets-as-env-var-in-cli.md
@@ -40,13 +40,24 @@ kubectl apply -f <file-name>

 When you submit a new Workload, you will want to connect the secret to the new Workload. To do that, run:

-```
-runai submit -e <ENV-VARIABLE>=SECRET:<secret-name>,<secret-key> ....
-```
-
-For example:
-
-```
-runai submit -i ubuntu -e MYUSERNAME=SECRET:my-secret,username
-```
-
+=== "CLI V2"
+    ```shell
+    runai workspace submit -e <ENV-VARIABLE>=SECRET:<secret-name>,<secret-key> ....
+    ```
+
+    For example:
+
+    ```shell
+    runai workspace submit -i ubuntu -e MYUSERNAME=SECRET:my-secret,username
+    ```
+
+=== "CLI V1 (Deprecated)"
+    ```shell
+    runai submit -e <ENV-VARIABLE>=SECRET:<secret-name>,<secret-key> ....
+    ```
+
+    For example:
+
+    ```shell
+    runai submit -i ubuntu -e MYUSERNAME=SECRET:my-secret,username
+    ```
diff --git a/docs/Researcher/tools/dev-pycharm.md b/docs/Researcher/tools/dev-pycharm.md
index 36fb809f90..30d0146950 100644
--- a/docs/Researcher/tools/dev-pycharm.md
+++ b/docs/Researcher/tools/dev-pycharm.md
@@ -11,41 +11,39 @@ You will need your image to run an SSH server (e.g [OpenSSH](https://www.ssh.co

 Run the following command to connect to the container as if it were running locally:

+```shell
+runai workspace submit build-remote -i runai.jfrog.io/demo/pycharm-demo
 ```
-runai submit build-remote -i runai.jfrog.io/demo/pycharm-demo --interactive \
-    --service-type=portforward --port 2222:22
+
+Track the workload status:
+```shell
+runai workspace list
 ```

-The terminal will show the connection:
-
-``` shell
-The job 'build-remote' has been submitted successfully
-You can run `runai describe job build-remote -p team-a` to check the job status
-INFO[0007] Waiting for job to start
-Waiting for job to start
-Waiting for job to start
-Waiting for job to start
-INFO[0045] Job started
-Open access point(s) to service from localhost:2222
-Forwarding from [::1]:2222 -> 22
+Once the workload is running, you can connect using:
+```shell
+runai workspace port-forward build-remote --port 2222:22
 ```

-* The Job starts an sshd server on port 22.
+* The Workload starts an sshd server on port 22.
 * The connection is redirected to the local machine (127.0.0.1) on port 2222

 !!! Note
-    It is possible to connect to the container using a remote IP address. However, this would be less convinient as you will need to maintain port numbers manually and change them when remote accessing using the development tool. As an example, run:
-
-    ```
-    runai submit build-remote -i runai.jfrog.io/demo/pycharm-demo -g 1 --interactive --service-type=nodeport --port 30022:22
-    ```
-
-    * The Job starts an sshd server on port 22.
-    * The Job redirects the external port 30022 to port 22 and uses a [Node Port](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types){target=_blank} service type.
-    * Run: `runai list worklaods`
-
-    * Next to the Job, under the "Service URL" column you will find the IP address and port. The port is 30222
+    It is possible to connect to the container using a remote IP address. However, this would be less convenient as you will need to maintain port numbers manually and change them when remote accessing using the development tool.
+    As an example, run:
+    ```shell
+    runai workspace submit build-remote -i runai.jfrog.io/demo/pycharm-demo -g 1 --port service-type=NodePort,container=22,external=30022
+    ```
+    * The Workload starts an sshd server on port 22.
+    * The Workload redirects the external port 30022 to port 22 and uses a [Node Port](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types){target=_blank} service type.
+    * Run: `runai workspace describe build-remote`
+    * Under the "Networks" title you will find the IP address and port. The port is 30022
+
+        Networks
+        Name   Tool Type   Connection Type   URL
+        ─────────────────────────────────────────────────────────────────────
+        port               NodePort          172.18.0.5:30022

 ## PyCharm

diff --git a/docs/Researcher/tools/dev-tensorboard.md b/docs/Researcher/tools/dev-tensorboard.md
index 03c876ef5e..ad7cbecb8f 100644
--- a/docs/Researcher/tools/dev-tensorboard.md
+++ b/docs/Researcher/tools/dev-tensorboard.md
@@ -30,11 +30,11 @@ model.fit(x_train, y_train,

 The `logs` directory must be saved on a Network File Server such that it can be accessed by the TensorBoard Job. For example, by running the Job as follows:

 ```
-runai submit train-with-logs -i tensorflow/tensorflow:1.14.0-gpu-py3 \
-    -v /mnt/nfs_share/john:/mydir -g 1 --working-dir /mydir --command -- ./startup.sh
+runai training submit train-with-logs -i tensorflow/tensorflow:1.14.0-gpu-py3 \
+    --host-path path=/mnt/nfs_share/john,mount=/mydir -g 1 --working-dir /mydir --command -- ./startup.sh
 ```

-Note the volume flag (`-v`) and working directory flag (`--working-dir`). The logs directory will be created on `/mnt/nfs_share/john/logs/fit`.
+Note the host path flag (`--host-path`) and working directory flag (`--working-dir`). The logs directory will be created on `/mnt/nfs_share/john/logs/fit`.

 ## Submit a TensorBoard Workload

@@ -57,7 +57,30 @@ There are two ways to submit a TensorBoard Workload: via the Command-line interf

     1. Jupyter
     2. TensorBoard

-=== "CLI V1"
+=== "CLI V2"
+    Run the following:
+
+    ```shell
+    runai workspace submit tb -i tensorflow/tensorflow:latest \
+        --external-url container=8888 --working-dir /mydir \
+        --host-path path=/mnt/nfs_share/john,mount=/mydir --command \
+        -- tensorboard --logdir logs/fit --port 8888 --host 0.0.0.0
+    ```
+
+    The terminal will show the following:
+
+    ``` shell
+    Creating workspace tb...
+    To track the workload's status, run 'runai workspace list'
+    ```
+
+    Once the workload is running, you can connect using:
+    ```
+    runai workspace port-forward tb --port 8888
+    ```
+    Browse to [http://localhost:8888/](http://localhost:8888/){target=_blank} to view TensorBoard.
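+
+    If TensorBoard comes up empty, check that the training Job above actually wrote event files to the shared directory (run this from any machine that mounts the NFS share; the path follows the `--host-path` flags used above):
+
+    ```shell
+    ls /mnt/nfs_share/john/logs/fit
+    ```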
+ +=== "CLI V1 (Deprecated)" Run the following: diff --git a/docs/Researcher/tools/dev-vscode.md b/docs/Researcher/tools/dev-vscode.md index ac0ebffc92..a862086eb3 100644 --- a/docs/Researcher/tools/dev-vscode.md +++ b/docs/Researcher/tools/dev-vscode.md @@ -12,41 +12,38 @@ You will need your image to run an SSH server (e.g [OpenSSH](https://www.ssh.co Run the following command to connect to the container as if it were running locally: +```shell +runai workspace submit build-remote -i runai.jfrog.io/demo/pycharm-demo ``` -runai submit build-remote -i runai.jfrog.io/demo/pycharm-demo --interactive \ - --service-type=portforward --port 2222:22 + +Track the workload status: +```shell +runai workspace list ``` -The terminal will show the connection: - -``` shell -The job 'build-remote' has been submitted successfully -You can run `runai describe job build-remote -p team-a` to check the job status -INFO[0007] Waiting for job to start -Waiting for job to start -Waiting for job to start -Waiting for job to start -INFO[0045] Job started -Open access point(s) to service from localhost:2222 -Forwarding from [::1]:2222 -> 22 +Once the workload is running, you can connect using: +```shell +runai workspace port-forward build-remote --port 2222:22 ``` * The Job starts an sshd server on port 22. * The connection is redirected to the local machine (127.0.0.1) on port 2222 !!! Note - It is possible to connect to the container using a remote IP address. However, this would be less convinient as you will need to maintain port numbers manually and change them when remote accessing using the development tool. As an example, run: + It is possible to connect to the container using a remote IP address. However, this would be less convinient as you will need to maintain port numbers manually and change them when remote accessing using the development tool. As an example, run: + ```shell + runai workspace submit build-remote -i runai.jfrog.io/demo/pycharm-demo -g 1 --port service-type=NodePort,container=22,external=30022 ``` - runai submit build-remote -i runai.jfrog.io/demo/pycharm-demo -g 1 --interactive --service-type=nodeport --port 30022:22 - ``` - - * The Job starts an sshd server on port 22. - * The Job redirects the external port 30022 to port 22 and uses a [Node Port](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types){target=_blank} service type. - * Run: `runai list jobs` - - * Next to the Job, under the "Service URL" column you will find the IP address and port. The port is 30222 - + * The Workload starts an sshd server on port 22. + * The Workload redirects the external port 30022 to port 22 and uses a [Node Port](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types){target=_blank} service type. + * Run: `runai workspace describe build-remote` + * Under the "Networks" title you will find the IP address and port. 
+
+        Networks
+        Name   Tool Type   Connection Type   URL
+        ─────────────────────────────────────────────────────────────────────
+        port               NodePort          172.18.0.5:30022

 ## Visual Studio Code

diff --git a/docs/Researcher/tools/dev-x11forward-pycharm.md b/docs/Researcher/tools/dev-x11forward-pycharm.md
index bcfd56c7c0..7c9756ec05 100644
--- a/docs/Researcher/tools/dev-x11forward-pycharm.md
+++ b/docs/Researcher/tools/dev-x11forward-pycharm.md
@@ -18,23 +18,18 @@ Details on how to create the image are [here](https://github.com/run-ai/docs/tre

 Run the following command to connect to the container as if it were running locally:

+```shell
+runai workspace submit xforward-remote -i runai.jfrog.io/demo/quickstart-x-forwarding
 ```
-runai submit xforward-remote -i runai.jfrog.io/demo/quickstart-x-forwarding --interactive \
-    --service-type=portforward --port 2222:22
+
+Track the workload status:
+```shell
+runai workspace list
 ```

-The terminal will show the connection:
-
-``` shell
-The job 'xforward-remote' has been submitted successfully
-You can run `runai describe job xforward-remote -p team-a` to check the job status
-INFO[0007] Waiting for job to start
-Waiting for job to start
-Waiting for job to start
-Waiting for job to start
-INFO[0045] Job started
-Open access point(s) to service from localhost:2222
-Forwarding from [::1]:2222 -> 22
+Once the workload is running, you can connect using:
+```shell
+runai workspace port-forward xforward-remote --port 2222:22
 ```

 * The Job starts an sshd server on port 22.

diff --git a/docs/Researcher/workloads/training/distributed-training/quickstart-distributed-training.md b/docs/Researcher/workloads/training/distributed-training/quickstart-distributed-training.md
index 4a7627c026..d321247665 100644
--- a/docs/Researcher/workloads/training/distributed-training/quickstart-distributed-training.md
+++ b/docs/Researcher/workloads/training/distributed-training/quickstart-distributed-training.md
@@ -17,20 +17,20 @@ Before you start, make sure:

 === "User Interface"
     Browse to the provided Run:ai user interface and log in with your credentials.

-=== "CLI V1"
-    Log in using the following command. You will be prompted to enter your username and password:
-
-    ``` bash
-    runai login
-    ```
-
 === "CLI V2"
     Run the below --help command to obtain the login options and log in according to your setup:

     ``` bash
     runai login --help
     ```

+=== "CLI V1 (Deprecated)"
+    Log in using the following command. You will be prompted to enter your username and password:
+
+    ``` bash
+    runai login
+    ```
+
 === "API"
     To use the API, you will need to obtain a token. Please follow the [API authentication](../../../../developer/rest-auth.md) article.

@@ -77,18 +77,6 @@ Before you start, make sure:

     After the distributed training workload is created, it is added to the [workloads table](../../../../platform-admin/workloads/overviews/managing-workloads.md).

-
-=== "CLI V1"
-    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:
-
-    ``` bash
-    runai config project "project-name"
-    runai submit-dist pytorch "workload-name" --workers=2 -g 0.1 \
-    -i kubeflow/pytorch-dist-mnist:latest
-    ```
-
-    This would start a distributed training workload based on kubeflow/pytorch-dist-mnist:latest with one master and two workers.
-
 === "CLI V2"
     Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:

     ``` bash
     runai project set "project-name"
     runai distributed submit "workload-name" --framework PyTorch --workers=2 -g 0.1 \
     -i kubeflow/pytorch-dist-mnist:latest
     ```

     This would start a distributed training workload based on kubeflow/pytorch-dist-mnist:latest with one master and two workers.

+=== "CLI V1 (Deprecated)"
+    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:
+
+    ``` bash
+    runai config project "project-name"
+    runai submit-dist pytorch "workload-name" --workers=2 -g 0.1 \
+    -i kubeflow/pytorch-dist-mnist:latest
+    ```
+
+    This would start a distributed training workload based on kubeflow/pytorch-dist-mnist:latest with one master and two workers.
+
 === "API"
     Copy the following command to your terminal. Make sure to update the below parameters according to the comments. For more details, see [Distributed API reference](https://api-docs.run.ai/latest/tag/Distributed):

diff --git a/docs/Researcher/workloads/training/standard-training/quickstart-standard-training.md b/docs/Researcher/workloads/training/standard-training/quickstart-standard-training.md
index 9f6122aead..1421562344 100644
--- a/docs/Researcher/workloads/training/standard-training/quickstart-standard-training.md
+++ b/docs/Researcher/workloads/training/standard-training/quickstart-standard-training.md
@@ -18,13 +18,6 @@ Before you start, make sure:

 === "User Interface"
     Browse to the provided Run:ai user interface and log in with your credentials.

-=== "CLI V1"
-    Log in using the following command. You will be prompted to enter your username and password:
-
-    ``` bash
-    runai login
-    ```
-
 === "CLI V2"
     Run the below --help command to obtain the login options and log in according to your setup:

     ``` bash
     runai login --help
     ```

+=== "CLI V1 (Deprecated)"
+    Log in using the following command. You will be prompted to enter your username and password:
+
+    ``` bash
+    runai login
+    ```
+
 === "API"
     To use the API, you will need to obtain a token. Please follow the [API authentication](../../../../developer/rest-auth.md) article.

@@ -79,23 +79,22 @@ Before you start, make sure:

     After the standard training workload is created, it is added to the [workloads table](../../../../platform-admin/workloads/overviews/managing-workloads.md).

+=== "CLI V2"
+    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:

-=== "CLI V1"
-    Copy the following command to your terminal. Make sure to update the below with the name of your project:
-
     ``` bash
-    runai config project "project-name"
-    runai submit "workload-name" -i runai.jfrog.io/demo/quickstart -g 1
+    runai project set "project-name"
+    runai training submit "workload-name" -i runai.jfrog.io/demo/quickstart -g 1
     ```

     This would start a standard training workload based on a sample docker image, runai.jfrog.io/demo/quickstart, with one GPU allocated.

-=== "CLI V2"
-    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:
+=== "CLI V1 (Deprecated)"
+    Copy the following command to your terminal. Make sure to update the below with the name of your project:

     ``` bash
-    runai project set "project-name"
-    runai training submit "workload-name" -i runai.jfrog.io/demo/quickstart -g 1
+    runai config project "project-name"
+    runai submit "workload-name" -i runai.jfrog.io/demo/quickstart -g 1
     ```

     This would start a standard training workload based on a sample docker image, runai.jfrog.io/demo/quickstart, with one GPU allocated.

diff --git a/docs/Researcher/workloads/workspaces/quickstart-jupyter.md b/docs/Researcher/workloads/workspaces/quickstart-jupyter.md
index 5210f55a89..30d123b8dc 100644
--- a/docs/Researcher/workloads/workspaces/quickstart-jupyter.md
+++ b/docs/Researcher/workloads/workspaces/quickstart-jupyter.md
@@ -20,18 +20,18 @@ Before you start, make sure:

 === "UI"
     Browse to the provided Run:ai user interface and log in with your credentials.

-=== "CLI V1"
-    Log in using the following command. You will be prompted to enter your username and password:
-
+=== "CLI V2"
+    Run the below --help command to obtain the login options and log in according to your setup:
+
     ``` bash
-    runai login
+    runai login --help
     ```

-=== "CLI V2"
-    Run the below --help command to obtain the login options and log in according to your setup:
-
+=== "CLI V1 (Deprecated)"
+    Log in using the following command. You will be prompted to enter your username and password:
+
     ``` bash
-    runai login --help
+    runai login
     ```

 === "API"
     To use the API, you will need to obtain a token. Please follow the [API authentication](../../../developer/rest-auth.md) article.

@@ -96,8 +96,17 @@ Before you start, make sure:

     After the workspace is created, it is added to the [workloads table](../../../platform-admin/workloads/overviews/managing-workloads.md).

+=== "CLI V2"
+    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:

-=== "CLI V1"
+    ``` bash
+    runai project set "project-name"
+    runai workspace submit jupyter-notebook -i jupyter/scipy-notebook -g 1 \
+    --external-url container=8888 --command \
+    -- start-notebook.sh --NotebookApp.base_url=/\${RUNAI_PROJECT}/\${RUNAI_JOB_NAME} --NotebookApp.token=''
+    ```
+
+=== "CLI V1 (Deprecated)"
     Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:

     ``` bash
     runai config project "project-name"
     runai submit "workload-name" --jupyter -g 1
     ```

     This would start a workspace with a pre-configured Jupyter image with one GPU allocated.

-=== "CLI V2"
-    Copy the following command to your terminal. Make sure to update the below with the name of your project and workload:
-
-    ``` bash
-    runai project set "project-name"
-    runai workspace submit "workload-name" --image jupyter/scipy-notebook --gpu-devices-request 1 \
-    --external-url container=8888 --command start-notebook.sh \
-    -- --NotebookApp.base_url=/\${RUNAI_PROJECT}/\${RUNAI_JOB_NAME} --NotebookApp.token=''
-    ```
-
-
 === "API"
     Copy the following command to your terminal. Make sure to update the below parameters according to the comments. For more details, see [Workspaces API reference](https://api-docs.run.ai/latest/tag/Workspaces):

@@ -175,6 +173,9 @@ Before you start, make sure:

     To connect to the Jupyter Notebook, browse directly to `https://<cluster-url>/<project-name>/<workload-name>`

+=== "CLI V1 (Deprecated)"
+    To connect to the Jupyter Notebook, browse directly to `https://<cluster-url>/<project-name>/jup1`.
+ === "API" To connect to the Jupyter Notebook, browse directly to `https:////` diff --git a/quickstart/python+ssh/Dockerfile b/quickstart/python+ssh/Dockerfile index a610d02471..017a673176 100644 --- a/quickstart/python+ssh/Dockerfile +++ b/quickstart/python+ssh/Dockerfile @@ -1,8 +1,8 @@ # adapted from https://docs.docker.com/engine/examples/running_ssh_service/ -FROM python +FROM python:3.13-slim-bullseye -RUN apt-get update && apt-get install -y openssh-server -RUN mkdir /var/run/sshd +RUN apt update && apt install -y openssh-server +RUN mkdir /run/sshd RUN echo 'root:root' | chpasswd RUN sed -i 's/#*PermitRootLogin prohibit-password/PermitRootLogin yes/g' /etc/ssh/sshd_config