How to load unsharded pre-trained weights into hybrid parallel model? #3126
-
Hi @ShinoharaHare , could you please try |
-
I referred to this example to build my own hybrid parallel model (TP + ZeRO + PP) for BLOOM, and it seems to work well so far.
Now I want to initialize the model with pre-trained weights that are unsharded, but I have no idea which slices of the unsharded weights correspond to each sharded weight.
I wonder if there is a simple solution for this, or whether I need to convert the weights myself.
I would appreciate it if someone could help.
Edit:
I think this is similar to issue #2770.
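To make the question about slices concrete, here is a minimal sketch of how an unsharded 2-D weight could map onto tensor-parallel shards. It assumes a plain Megatron-style 1D split with evenly divisible dimensions and PyTorch's `[out_features, in_features]` weight layout; `shard_bounds` and `shard_weight` are hypothetical helpers written for illustration, not part of any library's API. The layout a given model actually uses may differ (for example, a fused QKV projection may need to be split per head), so this should be checked against the model's own sharding spec.

```python
# Hypothetical helpers for illustration only; not a library API.

def shard_bounds(dim_size, tp_size, tp_rank):
    """Return the [start, end) range owned by tp_rank when a dimension is split evenly."""
    if dim_size % tp_size != 0:
        raise ValueError("dimension must be divisible by the tensor-parallel size")
    chunk = dim_size // tp_size
    return tp_rank * chunk, (tp_rank + 1) * chunk


def shard_weight(weight, tp_size, tp_rank, dim):
    """Slice a 2-D weight (list of rows, shape [out, in]) along dim 0 or dim 1."""
    if dim == 0:
        start, end = shard_bounds(len(weight), tp_size, tp_rank)
        return [row[:] for row in weight[start:end]]
    if dim == 1:
        start, end = shard_bounds(len(weight[0]), tp_size, tp_rank)
        return [row[start:end] for row in weight]
    raise ValueError("dim must be 0 or 1")


# Unsharded 4x4 weight; each value encodes (row, col) as 10 * row + col.
full = [[10 * r + c for c in range(4)] for r in range(4)]

# Column-parallel linear (assumed): split the output dimension, i.e. dim 0.
col_shard_rank1 = shard_weight(full, tp_size=2, tp_rank=1, dim=0)

# Row-parallel linear (assumed): split the input dimension, i.e. dim 1.
row_shard_rank0 = shard_weight(full, tp_size=2, tp_rank=0, dim=1)

print(col_shard_rank1)
print(row_shard_rank0)
```

With real checkpoints the same index ranges would be applied to tensors (for instance with `torch.narrow` or ordinary slicing) before copying into each rank's parameters. Under pipeline parallelism, each stage would additionally load only the keys for the layers it owns.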