When I try to load the checkpoint, I get the following error:
Missing key(s) in state_dict: "bert.embeddings.position_ids", "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", "bert.embeddings.token_type_embeddings.weight", "bert.embeddings.LayerNorm.weight", "bert.embeddings.LayerNorm.bias", "bert.encoder.layer.0.attention.self.query.weight", "bert.encoder.layer.0.attention.self.query.bias", "bert.encoder.layer.0.attention.self.key.weight",......
followed by many more layer keys.
It looks like the saved state_dict has keys of the form "module.bert...." rather than the expected "bert...". This seems similar to issue #17, so please kindly help. How can I fix this? Thanks in advance.
P.S. I got the model checkpoints by running DDP_main.py. I saved earlier-stage checkpoints and stopped training because it took too long in eval mode and kept printing warnings like "NAN encountered ... times". Does your training look the same?
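For reference, a common workaround for this kind of "module." prefix mismatch is to rename the keys before loading. The sketch below is only an illustration, not this repository's code: the helper name `strip_module_prefix` is hypothetical, the example values are stand-ins for tensors, and it assumes the checkpoint file holds the state_dict directly rather than nesting it under another key.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper: keys that do not start with `prefix` are kept unchanged.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in values; a real checkpoint would map these keys to tensors.
example = OrderedDict([
    ("module.bert.embeddings.word_embeddings.weight", 1),
    ("module.bert.embeddings.LayerNorm.bias", 2),
])
cleaned = strip_module_prefix(example)
print(list(cleaned))
```

With PyTorch, the cleaned dictionary would then presumably be passed to the model, along the lines of `model.load_state_dict(strip_module_prefix(torch.load(path, map_location="cpu")))`, where `model` and `path` are placeholders for the unwrapped model and the checkpoint file.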