(I converted your issue to a discussion, since it seems more like a discussion to me!) I'd recommend With The I think all folks working on large-scale transformers at the moment are using |
-
Hey folks,
I’m trying to write a model-parallel Transformer implementation and have come across the seemingly similar xmap and pjit APIs. xmap is semi-documented in the JAX docs, but pjit seems to have been used more in practice (e.g. T5X).
Which API is the current best practice, and what are the trade-offs of each? Are there minimal examples that would be good to work through? If not, I’m happy to help create one if y’all can point me in the right direction.
Thanks!