GPT throughput (v.s. Megatron-LM) #2842
Replies: 3 comments 1 reply
-
Hi @yurishin929 Titans is our temporary model zoo, because some complicated parallel strategies may require users to modify the model. We offer some mainstream models for users in Titans.
-
Hi @binmakeswell,
Thanks in advance!
-
Hi @Agoniii Using Gemini with
-
Hi,
In the README, when using the GPT-3 model, Colossal-AI shows better performance (sec/iter and throughput) than Megatron-LM, and better throughput for GPT-2 and BERT as well.
So my question is: for these results, did you use Titans? Especially the second row of the Colossal-AI GPT-3 table, which shows 4.99 throughput.
I'm using Gemini and Megatron-LM for GPT2-medium, and Gemini has worse throughput than Megatron-LM (but better memory efficiency). Does Titans show better throughput (and lower sec/iter) than Gemini and Megatron-LM? Thank you.
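For anyone comparing the two metrics mentioned above: at a fixed global batch size, throughput (samples/s) is just the inverse of sec/iter scaled by the batch size, so the two columns in the README tables carry the same information. A minimal sketch (the numbers below are illustrative, not taken from the README):

```python
def throughput(global_batch_size: int, sec_per_iter: float) -> float:
    """Samples processed per second.

    Higher is better; at a fixed global batch size it is inversely
    proportional to sec/iter, so a lower sec/iter always means a
    higher throughput and vice versa.
    """
    return global_batch_size / sec_per_iter

# Illustrative numbers only (not benchmark results):
print(throughput(512, 25.6))  # -> 20.0 samples/s
```

This is why a framework cannot show "better sec/iter but worse throughput" at the same batch size; differences between the two columns across frameworks usually come from different global batch sizes.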