feat(torch): Update `torch` libraries to v2.5.0, bundle `triton`, patch TransformerEngine #85

Eta0 · 2024-10-22T21:57:17Z

PyTorch v2.5.0, `triton`, & Patched TransformerEngine v1.11

This change updates versions, adds some patches, and bundles a source build of triton in the non-nightly ml-containers/torch images (previously, triton was bundled only in the nightly ones).

PyTorch

The following PyTorch components have been updated:

torch v2.4.1 → v2.5.0
torchvision v0.19.1 → v0.20.0
torchaudio v2.4.1 → v2.5.0

In addition, the patch that fixes torchaudio compilation on CUDA 12.5+ (from #82) is now obsolete in the nightly builds of torchaudio, and so will be obsolete in some release after v2.5.0, because its contents were included as a secondary change in pytorch/audio#3843. This PR therefore adds a check to the build process so that the patch is applied only to versions that still need it.
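One way such a check could work is sketched below. This is an illustration, not the check used in this PR: the helper names and the injectable `run` parameter are hypothetical, and it assumes that "no longer needed" can be detected by testing whether the patch reverse-applies cleanly with `git apply --reverse --check`. If upstream had included the change in a different form, the reverse check would fail and the forward `git apply` would then fail loudly instead of skipping silently.

```python
import subprocess


def patch_already_present(repo_dir, patch_path, run=subprocess.run):
    """Return True if the patch reverse-applies cleanly, i.e. the source
    tree already contains the patched content (hypothetical helper)."""
    result = run(
        ["git", "apply", "--reverse", "--check", patch_path],
        cwd=repo_dir,
        capture_output=True,
    )
    return result.returncode == 0


def apply_patch_if_needed(repo_dir, patch_path, run=subprocess.run):
    """Apply the patch only when the tree does not already include it.

    Returns True if the patch was applied, False if it was skipped.
    """
    if patch_already_present(repo_dir, patch_path, run):
        return False
    # Plain `git apply` (no --index) modifies only the working tree.
    run(["git", "apply", patch_path], cwd=repo_dir, check=True)
    return True
```

The `run` parameter exists only so the logic can be exercised without a real repository; in a build script it would be left at its default.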

`triton`

triton is normally listed as a dependency for the x86_64 Linux releases of PyTorch on PyPI, but it is not automatically registered as a dependency in our images given the way that we build PyTorch from source. Because it would normally be seen as a required dependency, and because each PyTorch release expects a specific version of triton to be paired with it (when using features that require triton), this PR adds a source build of triton as a bundled part of all ml-containers/torch images.

The version of triton to build is selected the same way as in the torch-nightly images: a commit hash is read from the .ci/docker/ci_commit_pins/triton.txt file in the cloned pytorch/pytorch repository, and that exact git commit of triton is built. PyPI releases of triton are not used, even though they are generally compatible with stable PyTorch versions built from source, like ours. We already had the code to build triton from a specific commit, and this approach could be more flexible when building, e.g., PyTorch release candidates.
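As a rough illustration of the pin lookup described above, the sketch below reads the commit hash from a pytorch/pytorch checkout. The function name and the hex-format validation are assumptions added for the example; only the file path comes from this PR.

```python
import re
from pathlib import Path

# Path of the pin file, relative to the root of a pytorch/pytorch checkout.
PIN_FILE = Path(".ci/docker/ci_commit_pins/triton.txt")


def read_triton_pin(pytorch_checkout):
    """Return the triton commit hash pinned by the given PyTorch checkout.

    Hypothetical helper: the sanity check below assumes the file holds a
    single abbreviated or full hexadecimal commit hash.
    """
    commit = (Path(pytorch_checkout) / PIN_FILE).read_text().strip()
    if not re.fullmatch(r"[0-9a-fA-F]{7,40}", commit):
        raise ValueError(f"unexpected contents in {PIN_FILE}: {commit!r}")
    return commit
```

The returned hash could then be passed to `git checkout` in a triton clone before building.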

The version of triton to build can still be overridden by setting the BUILD_TRITON_VERSION build argument explicitly, and the triton build can be turned off completely by setting the new, separate BUILD_TRITON build argument to any value other than 1.
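The interaction of the two build arguments can be summarized with a small sketch. The function below is hypothetical and only models the decision logic stated above (BUILD_TRITON defaults to 1, any other value disables the build, and BUILD_TRITON_VERSION takes precedence over the pinned commit); it is not code from the images' build scripts.

```python
from typing import Mapping, Optional


def resolve_triton_ref(build_args: Mapping[str, str],
                       pinned_commit: str) -> Optional[str]:
    """Return the triton ref to build, or None if the build is disabled."""
    # BUILD_TRITON defaults to "1"; any other value turns the build off.
    if build_args.get("BUILD_TRITON", "1") != "1":
        return None
    # An explicit BUILD_TRITON_VERSION overrides the commit pinned by PyTorch.
    override = build_args.get("BUILD_TRITON_VERSION", "")
    return override or pinned_commit
```

For example, `resolve_triton_ref({"BUILD_TRITON": "0"}, pin)` returns `None`, while an empty mapping returns the pinned commit unchanged.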

TransformerEngine Updates

This change additionally updates TransformerEngine by a few commits, as it was previously built from a commit right before the v1.11 release. Now that v1.11 is out, it builds from that tag and applies a patch for a bug in the v1.11 release (NVIDIA/TransformerEngine#1213). The fix, NVIDIA/TransformerEngine#1222, landed on the v1.11 branch (and in future versions) but is not part of the v1.11 git tag.

github-actions · 2024-10-22T22:34:17Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.4.1-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:34:18Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.4.1-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:34:24Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.6.1-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:35:33Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.2.2-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:35:39Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.2.2-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:51:53Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.4.1-ubuntu20.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:52:24Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-base-cuda12.6.1-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:53:08Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.4.1-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:53:30Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.2.2-ubuntu20.04-nccl2.21.5-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T22:54:22Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.2.2-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T23:09:48Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.6.1-ubuntu20.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-22T23:10:07Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch:es-torch-updates-3a941b9-nccl-cuda12.6.1-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T00:49:12Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.4.1-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T00:56:42Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.2.2-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T00:57:14Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.2.2-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T01:04:22Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.4.1-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T01:04:40Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.6.1-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T02:44:30Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-base-24102222-cuda12.6.1-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T02:46:13Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-nccl-24102222-cuda12.4.1-ubuntu20.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T02:58:43Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-nccl-24102222-cuda12.6.1-ubuntu22.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T03:04:06Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-nccl-24102222-cuda12.2.2-ubuntu22.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T03:44:40Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.4.1-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T03:50:14Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.4.1-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T03:57:33Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.6.1-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T04:19:29Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.2.2-ubuntu22.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T04:33:51Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-nccl-24102222-cuda12.6.1-ubuntu20.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T04:47:21Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.4.1-ubuntu20.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T04:56:44Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.6.1-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T05:00:43Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368570
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-base-cuda12.2.2-ubuntu20.04-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T05:15:26Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.4.1-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T05:49:27Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.2.2-ubuntu20.04-nccl2.21.5-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T05:53:21Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.6.1-ubuntu20.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T05:57:37Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.6.1-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T06:01:12Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368567
Image: ghcr.io/coreweave/ml-containers/torch-extras:es-torch-updates-3a941b9-nccl-cuda12.2.2-ubuntu22.04-nccl2.23.4-1-torch2.5.0-vision0.20.0-audio2.5.0

github-actions · 2024-10-23T06:13:08Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.4.1-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T06:55:31Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.4.1-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:03:15Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.6.1-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:06:17Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.2.2-ubuntu20.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:09:17Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.2.2-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:11:21Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-base-24102222-cuda12.6.1-ubuntu22.04-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:51:12Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-nccl-24102222-cuda12.4.1-ubuntu20.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T07:59:46Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-nccl-24102222-cuda12.4.1-ubuntu22.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T08:02:58Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-nccl-24102222-cuda12.6.1-ubuntu22.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T08:08:55Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-nccl-24102222-cuda12.6.1-ubuntu20.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T08:23:55Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-torch-updates-3a941b9-nccl-24102222-cuda12.2.2-ubuntu22.04-nccl2.23.4-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

github-actions · 2024-10-23T19:22:22Z

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/11469368645
Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-torch-updates-3a941b9-nccl-24102222-cuda12.2.2-ubuntu20.04-nccl2.21.5-1-torch2.6.0a0-vision0.20.0a0-audio2.5.0a0

wbrown

👍

Eta0 added 6 commits October 14, 2024 15:31

feat(torch): Update torch libraries to v2.5.0 release candidates

f78a0c3

feat(torch): Update torch to v2.5.0, vision to v0.20.0, audio to v2.5.0

45d1368

build(torch): Build triton for non-nightly images too

705b904

build(torch): Set BUILD_TRITON to 1 by default

e3a8f08

fix(torch): Don't apply torchaudio patch if it is no longer needed

416d624

feat(torch): Update TransformerEngine to v1.11 release, plus a patch

87a8c82

Eta0 added the bug and enhancement labels Oct 22, 2024

Eta0 requested a review from wbrown October 22, 2024 21:57

Eta0 self-assigned this Oct 22, 2024

build(torch): Use git apply without --index

3a941b9

wbrown approved these changes Oct 23, 2024

wbrown merged commit f575d1b into main Oct 23, 2024
102 checks passed

wbrown deleted the es/torch-updates branch October 23, 2024 19:49
