feature matrix

Stella Biderman edited this page Feb 7, 2021 · 6 revisions
| | GPT-NeoX | NVIDIA Megatron | DeepSpeed Megatron |
|---|---|---|---|
| model parallel | n | ? | y |
| data parallel | y | ? | y |
| pipeline parallel | y | ? | y |
| other optimizations | ZeRO | ? | ZeRO |
benchmarks