Post-train DeepSeek V3/R1 with DPO using just a few GPU nodes? #58

sfc-gh-sbekman · 2025-02-21T00:38:15Z

sfc-gh-sbekman
Feb 21, 2025
Maintainer

Hello AI Community!

We are considering which features to bring to ArcticTraining in the near future that would be most valuable to the AI community. One candidate is the ability to post-train a DeepSeek V3 or DeepSeek R1 model with DPO using just a few GPU nodes.

Our upper-bound estimate for post-training on 500M tokens with DPO is around 2-3 days on 8 H100 nodes (64 H100 GPUs in total).
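
For a rough sense of scale, the estimate can be converted into GPU-hours and implied throughput. The sketch below is only back-of-the-envelope arithmetic on the figures stated in this post (500M tokens, 2-3 days, 64 GPUs); the function name is hypothetical, and it assumes all 64 GPUs are busy for the whole wall-clock time. It is not a measured benchmark.

```python
SECONDS_PER_DAY = 86_400


def implied_throughput(total_tokens, days, num_gpus):
    """Convert a wall-clock estimate into GPU-hours and tokens per second.

    Hypothetical helper: assumes every GPU is in use for the full duration.
    """
    seconds = days * SECONDS_PER_DAY
    cluster_tokens_per_s = total_tokens / seconds
    return {
        "gpu_hours": num_gpus * days * 24,
        "cluster_tokens_per_s": cluster_tokens_per_s,
        "per_gpu_tokens_per_s": cluster_tokens_per_s / num_gpus,
    }


if __name__ == "__main__":
    # Figures taken from the estimate in this post: 500M tokens, 64 H100 GPUs.
    for days in (2, 3):
        stats = implied_throughput(500_000_000, days, 64)
        print(
            f"{days} days: {stats['gpu_hours']} GPU-hours, "
            f"{stats['cluster_tokens_per_s']:.0f} tokens/s across the cluster, "
            f"{stats['per_gpu_tokens_per_s']:.1f} tokens/s per GPU"
        )
```

Under these assumptions, the 2-3 day range corresponds to roughly 3,000-4,600 H100 GPU-hours for the full run.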

We would like your feedback: would you find this feature valuable, and would you use it if we built it?

It would be incredibly helpful if you could answer the poll and tell others about it.

If there are other features you would like us to support, please share them in the comments below.

We are looking forward to hearing from you.

P.S. If you didn't know, ArcticTraining is an open-source, easy-to-use post-training framework for NVIDIA GPUs built on top of DeepSpeed.

Best,
Snowflake AI Research

Do you want us to add the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?

valuable and will use: 79%

valuable but will not use directly: 20%

not valuable: 0%

24 votes

sheshansh-ctx · 2025-03-05T19:51:30Z

sheshansh-ctx
Mar 5, 2025

This would be incredibly helpful.

0 replies

sfc-gh-sbekman · 2025-03-10T22:50:29Z

sfc-gh-sbekman
Mar 10, 2025
Maintainer Author

Thanks a ton to those who voted for this project!

I will start working on it once the Ulysses sequence parallelism integration is finished and this PR is merged: #45

We hope that those who are interested will want to collaborate on this work.

2 replies

Haishen-ll May 13, 2025

Hi, any progress on this project?

sfc-gh-sbekman May 13, 2025
Maintainer Author

I'm still working on the Ulysses sequence parallelism - getting close to wrapping things up.
