Multi-GPU Testing #11
Closed
CCranney started this conversation in Base Code improvements

AttentionSmithy was created with access to limited computational resources, so we were not able to thoroughly test its ability to build models that are trainable on multiple GPUs, which is an absolute must for most transformer models. As we gain access to more such resources, or to collaborators with expertise in this area, we'd love to nail this down in future development.

Replies: 1 comment
This has been implemented. The main change for end users is that they now need to initialize all AttentionSmithy class instances inside the main PyTorch Lightning model itself (previously they were passed in as args). This lets the trainer assign different devices to different copies of the model during training.
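For illustration, here is a minimal sketch of that pattern, with `nn.MultiheadAttention` standing in for an AttentionSmithy component (the actual AttentionSmithy class names and constructor signatures may differ):

```python
import pytorch_lightning as pl
import torch
from torch import nn


class DemoModel(pl.LightningModule):
    def __init__(self, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        # Build the component *inside* the LightningModule rather than
        # accepting a pre-built instance as a constructor arg. Each
        # training process then constructs its own copy, and Lightning
        # moves that copy to the device assigned to the process.
        self.attention = nn.MultiheadAttention(
            embed_dim, num_heads, batch_first=True
        )
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):
        attended, _ = self.attention(x, x, x)
        return self.norm(x + attended)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# Multi-GPU training then only needs the Trainer flags, e.g.:
# trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp")
# trainer.fit(DemoModel(), train_dataloaders=train_loader)
```

If the submodules were instead built once outside the model and passed in as constructor args, every replica would reference the same pre-built instances, which is what made per-device assignment awkward before this change.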