Add check for seq_len%tensor_parallel_degree==0 for parallelized Llama (#1312)
Mitigates #1306
Following discussions in #1306, `seq_len % tensor_parallel_degree == 0` appears to be a necessary condition for the TP Llama3 model to work, since it works around [this](pytorch/pytorch#130646) numerical issue in PyTorch DTensors of complex numbers.
This PR makes the requirement explicit.
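
For illustration, a minimal sketch of the kind of divisibility check this PR adds (not the exact code merged here; the function name, signature, and error message below are hypothetical):

```python
def validate_tp_seq_len(seq_len: int, tensor_parallel_degree: int) -> None:
    """Reject configs where seq_len is not divisible by the TP degree.

    Guards against the DTensor complex-number numerical issue
    (pytorch/pytorch#130646) that the TP Llama3 model works around.
    """
    if tensor_parallel_degree > 1 and seq_len % tensor_parallel_degree != 0:
        raise ValueError(
            f"seq_len ({seq_len}) must be divisible by "
            f"tensor_parallel_degree ({tensor_parallel_degree})"
        )


# Example: seq_len 8192 with TP degree 8 passes; seq_len 8190 would raise.
validate_tp_seq_len(8192, 8)
```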
---------
Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com>