Merged
2 changes: 1 addition & 1 deletion .github/workflows/test-e2e.yaml
@@ -41,7 +41,7 @@ jobs:
pip install papermill==2.6.0 jupyter==1.1.1 ipykernel==6.29.5

echo "Install Kubeflow SDK"
- pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python
+ pip install git+https://github.com/kubeflow/sdk.git@main

- name: Setup cluster
run: |
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -135,4 +135,4 @@ On ubuntu the default go package appears to be gccgo-go which has problems. It's

Changes to the Kubeflow Trainer Python SDK can be made in the https://github.com/kubeflow/sdk repo.

- The Trainer SDK can be found at https://github.com/kubeflow/sdk/tree/main/python/kubeflow/trainer.
+ The Trainer SDK can be found at https://github.com/kubeflow/sdk/tree/main/kubeflow/trainer.
4 changes: 1 addition & 3 deletions examples/deepspeed/text-summarization/T5-Fine-Tuning.ipynb
@@ -35,9 +35,7 @@
"id": "4900404c5d532bdf",
"metadata": {},
"outputs": [],
- "source": [
- "# !pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python"
- ]
+ "source": "# !pip install git+https://github.com/kubeflow/sdk.git@main"
},
{
"cell_type": "markdown",
Original file line number Diff line number Diff line change
@@ -82,9 +82,7 @@
"id": "bd62189280760f42",
"metadata": {},
"outputs": [],
- "source": [
- "# !pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python"
- ]
+ "source": "# !pip install git+https://github.com/kubeflow/sdk.git@main"
},
{
"cell_type": "markdown",
4 changes: 1 addition & 3 deletions examples/pytorch/image-classification/mnist.ipynb
@@ -31,9 +31,7 @@
"execution_count": null,
"metadata": {},
"outputs": [],
- "source": [
- "# !pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python"
- ]
+ "source": "# !pip install git+https://github.com/kubeflow/sdk.git@main"
},
{
"cell_type": "markdown",
32 changes: 15 additions & 17 deletions examples/pytorch/question-answering/fine-tune-distilbert.ipynb
@@ -38,9 +38,7 @@
"id": "e5e86ea307b3eec9",
"metadata": {},
"outputs": [],
- "source": [
- "# !pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python"
- ]
+ "source": "# !pip install git+https://github.com/kubeflow/sdk.git@main"
},
{
"cell_type": "markdown",
@@ -116,8 +114,8 @@
"Requirement already satisfied: pycparser in /Users/andrew/git/trainer/.venv/lib/python3.13/site-packages (from cffi>=1.12->cryptography>=2.1.4->azure-storage-blob>=12->cloudpathlib[all]) (2.22)\n",
"Requirement already satisfied: pyasn1<0.7.0,>=0.6.1 in /Users/andrew/git/trainer/.venv/lib/python3.13/site-packages (from pyasn1-modules>=0.2.1->google-auth<3.0dev,>=2.26.1->google-cloud-storage->cloudpathlib[all]) (0.6.1)\n",
"\n",
- "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.1\u001b[0m\n",
- "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
+ "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m25.0.1\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m25.1\u001B[0m\n",
+ "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\n"
]
}
],
@@ -449,21 +447,21 @@
" 0%| | 0/40 [00:00<?, ?it/s][rank0]:[W429 01:22:58.895439547 reducer.cpp:1400] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
"[node-0]: [rank1]:[W429 01:22:58.895689005 reducer.cpp:1400] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
"100%|██████████| 40/40 [02:36<00:00, 4.10s/it]\n",
- " 0%| | 0/10 [00:00<?, ?it/s]\u001b[A\n",
- " 20%|██ | 2/10 [00:00<00:03, 2.54it/s]\u001b[A\n",
- " 30%|███ | 3/10 [00:01<00:03, 1.80it/s]\u001b[A\n",
- " 40%|████ | 4/10 [00:02<00:03, 1.55it/s]\u001b[A\n",
- " 50%|█████ | 5/10 [00:03<00:03, 1.43it/s]\u001b[A\n",
- " 60%|██████ | 6/10 [00:03<00:02, 1.37it/s]\u001b[A\n",
- " 70%|███████ | 7/10 [00:04<00:02, 1.32it/s]\u001b[A\n",
- " 80%|████████ | 8/10 [00:05<00:01, 1.30it/s]\u001b[A\n",
- " 90%|█████████ | 9/10 [00:06<00:00, 1.29it/s]\u001b[A\n",
- " \u001b[A\n",
+ " 0%| | 0/10 [00:00<?, ?it/s]\u001B[A\n",
+ " 20%|██ | 2/10 [00:00<00:03, 2.54it/s]\u001B[A\n",
+ " 30%|███ | 3/10 [00:01<00:03, 1.80it/s]\u001B[A\n",
+ " 40%|████ | 4/10 [00:02<00:03, 1.55it/s]\u001B[A\n",
+ " 50%|█████ | 5/10 [00:03<00:03, 1.43it/s]\u001B[A\n",
+ " 60%|██████ | 6/10 [00:03<00:02, 1.37it/s]\u001B[A\n",
+ " 70%|███████ | 7/10 [00:04<00:02, 1.32it/s]\u001B[A\n",
+ " 80%|████████ | 8/10 [00:05<00:01, 1.30it/s]\u001B[A\n",
+ " 90%|█████████ | 9/10 [00:06<00:00, 1.29it/s]\u001B[A\n",
+ " \u001B[A\n",
"[node-0]: {'eval_loss': 5.543211936950684, 'eval_runtime': 7.9713, 'eval_samples_per_second': 2.509, 'eval_steps_per_second': 1.254, 'epoch': 1.0}\n",
"100%|██████████| 40/40 [02:45<00:00, 4.10s/it]\n",
- "100%|██████████| 10/10 [00:07<00:00, 1.28it/s]\u001b[A\n",
+ "100%|██████████| 10/10 [00:07<00:00, 1.28it/s]\u001B[A\n",
"[node-0]: {'train_runtime': 165.12, 'train_samples_per_second': 0.484, 'train_steps_per_second': 0.242, 'train_loss': 5.764264678955078, 'epoch': 1.0}\n",
- "100%|██████████| 40/40 [02:45<00:00, 4.13s/it]\u001b[A\n"
+ "100%|██████████| 40/40 [02:45<00:00, 4.13s/it]\u001B[A\n"
]
}
],
4 changes: 1 addition & 3 deletions examples/torchtune/llama3_2/alpaca-trainjob-yaml.ipynb
@@ -38,9 +38,7 @@
"id": "288ec515",
"metadata": {},
"outputs": [],
- "source": [
- "!pip install git+https://github.com/kubeflow/sdk.git@main#subdirectory=python"
- ]
+ "source": "!pip install git+https://github.com/kubeflow/sdk.git@main"
},
{
"cell_type": "markdown",