feature matrix

Leo Gao edited this page Feb 7, 2021 · 6 revisions

| Feature | GPT-NeoX (DeepSpeed) | NVIDIA Megatron | DeepSpeed Megatron |
| --- | --- | --- | --- |
| model parallel | ? | ? | ? |
| data parallel | y | ? | ? |
| pipeline parallel | y | ? | ? |
| other optimizations | ZeRO | ? | ? |
benchmarks