Multi-GPU Testing #11
Closed
CCranney started this conversation in Base Code improvements

AttentionSmithy was created with access to limited computational resources, so we were not able to thoroughly test its ability to build models that are trainable on multiple GPUs, which is an absolute must for most transformer models. As we gain access to more such resources, or to collaborators with expertise in this area, we'd love to nail this down in future development.

Replies: 1 comment
This has been implemented. The main change for end users is that they now need to initialize all AttentionSmithy class instances inside the main PyTorch Lightning model itself (previously they were passed in as args). This lets the trainer assign different devices to different copies of the model during training.
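For illustration, here is a minimal sketch of that pattern, with `nn.MultiheadAttention` standing in for an AttentionSmithy component (the actual AttentionSmithy class names and constructor signatures may differ):

```python
import pytorch_lightning as pl
import torch
from torch import nn


class DemoModel(pl.LightningModule):
    def __init__(self, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        # Build the component *inside* the LightningModule rather than
        # accepting a pre-built instance as a constructor arg. Each
        # training process then constructs its own copy, and Lightning
        # moves that copy to the device assigned to the process.
        self.attention = nn.MultiheadAttention(
            embed_dim, num_heads, batch_first=True
        )
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):
        attended, _ = self.attention(x, x, x)
        return self.norm(x + attended)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# Multi-GPU training then only needs the Trainer flags, e.g.:
# trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp")
# trainer.fit(DemoModel(), train_dataloaders=train_loader)
```

If the submodules were instead built once outside the model and passed in as constructor args, every replica would reference the same pre-built instances, which is what made per-device assignment awkward before this change.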