
Commit de57071

float8 readme: add key features section (#2448)

Adds a section summarizing the key features of float8 training.

1 parent 353dd44

File tree

1 file changed: 9 additions, 0 deletions

torchao/float8/README.md

@@ -6,6 +6,15 @@ and up to [**1.25x at 8 GPU / 8B parameter count scale**](#training-benchmarks).
 The codebase strives to stay small, hackable, debuggable with native PyTorch tooling
 and composable with key systems such as autograd, ```torch.compile``` and distributed.
 
+## Key features
+
+* e2e pretraining speedups of up to [**1.5x at 512 GPU / 405B parameter count scale**](https://pytorch.org/blog/training-using-float8-fsdp2/),
+and up to [**1.25x at 8 GPU / 8B parameter count scale**](#training-benchmarks), with performance and accuracy validated on up to [**2k GPUs**](https://pytorch.org/blog/accelerating-large-scale-training-and-convergence-with-pytorch-float8-rowwise-on-crusoe-2k-h200s/), via [torchtitan's float8 integration](https://github.com/pytorch/torchtitan/blob/main/docs/float8.md)
+* seamless composability with [torch.compile](https://docs.pytorch.org/docs/stable/torch.compiler.html)
+* seamless composability with [DTensor](https://docs.pytorch.org/docs/stable/distributed.tensor.html), including [FSDP2 with float8 weight all-gather](https://dev-discuss.pytorch.org/t/enabling-float8-all-gather-in-fsdp2/2359) and [Async TP](https://discuss.pytorch.org/t/distributed-w-torchtitan-introducing-async-tensor-parallelism-in-pytorch/209487)
+* seamless composability with [PyTorch Activation Checkpointing](https://pytorch.org/blog/activation-checkpointing-techniques/)
+* three different scaling recipes to trade off performance vs accuracy: tensorwise (fastest), rowwise, rowwise_with_gw_hp (most accurate)
+
 ℹ️ <em>See the [feature tracker](https://github.com/pytorch/ao/issues/556) for upcoming features.</em>
 
 ℹ️ <em>These APIs are training-only and float8-only, and we plan to [unify them with the rest of torchao](https://github.com/pytorch/ao/issues/894) in the future.</em>
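The scaling recipes mentioned in the added section differ mainly in the granularity at which quantization scales are computed. A minimal sketch of that idea in plain Python (this is illustrative only, not torchao's implementation; the function names here are hypothetical, and only the e4m3 max value of 448.0 comes from the float8 format itself): tensorwise scaling derives a single scale from the absolute max of the whole tensor, while rowwise scaling computes one scale per row, trading a little overhead for better per-row dynamic range.

```python
# Conceptual sketch of tensorwise vs rowwise float8 scale computation.
# NOT torchao's implementation; function names are hypothetical.

E4M3_MAX = 448.0  # max representable magnitude of the float8 e4m3 format

def tensorwise_scale(matrix):
    """One scale for the entire tensor: cheapest, least granular."""
    amax = max(abs(v) for row in matrix for v in row)
    return E4M3_MAX / amax

def rowwise_scales(matrix):
    """One scale per row: an outlier in one row no longer
    shrinks the usable range of every other row."""
    return [E4M3_MAX / max(abs(v) for v in row) for row in matrix]

m = [[0.5, -2.0], [100.0, 4.0]]
print(tensorwise_scale(m))  # single scale, dominated by the 100.0 outlier
print(rowwise_scales(m))    # first row keeps a much larger scale
```

This also hints at the trade-off the bullet describes: tensorwise needs only one amax reduction per tensor (fastest), while rowwise keeps more scales around for better accuracy.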
