
feature matrix

Leo Gao edited this page Feb 7, 2021 · 6 revisions

| | GPT-NeoX (DeepSpeed) | NVIDIA Megatron | DeepSpeed Megatron |
| --- | --- | --- | --- |
| model parallel | ? | ? | ? |
| data parallel | y | ? | ? |
| pipeline parallel | y | ? | ? |
| other optimizations | ZeRO | ? | ? |
| benchmarks | | | |