Answered by
titu1994
Jun 4, 2023
FP16 stability depends on many factors, including global batch size, gradient norm, and the type of loss. We have some stability fixes in 1.18 and 1.19, but they are not a perfect solution, and stability will still depend on the training recipe.
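To make the gradient-norm point concrete, here is a minimal sketch of how FP16's limited range can break a norm computation and flush small gradients to zero. It is not taken from the library discussed here and is not the fix shipped in 1.18 or 1.19. The helper names `to_fp16`, `grad_norm_fp16`, and `underflow_fraction` are hypothetical, and the FP16 rounding is emulated with Python's standard `struct` module rather than run on real hardware.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range (max 65504);
        # real fp16 arithmetic would produce +/-inf, so emulate that here.
        return math.copysign(math.inf, x)


def grad_norm_fp16(grads):
    """L2 norm where every intermediate result is rounded to fp16."""
    total = 0.0
    for g in grads:
        g16 = to_fp16(g)
        sq = to_fp16(g16 * g16)        # the square can already overflow
        total = to_fp16(total + sq)    # the running sum can overflow too
    return to_fp16(math.sqrt(total))


def underflow_fraction(grads, loss_scale=1.0):
    """Fraction of nonzero gradients that become 0 after scaling and an fp16 cast."""
    nonzero = [g for g in grads if g != 0.0]
    flushed = sum(1 for g in nonzero if to_fp16(g * loss_scale) == 0.0)
    return flushed / len(nonzero)
```

With these helpers, `grad_norm_fp16([300.0] * 1000)` returns `inf` because 300² already exceeds the FP16 maximum of 65504, even though the true norm is only about 9487. In the other direction, `underflow_fraction([1e-8] * 10)` is `1.0`, while the same call with `loss_scale=1024.0` is `0.0`. This is why loss scaling helps with small gradients but does not by itself guarantee stability for every recipe.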
Answer selected by
mehadi92