FP16 stability depends on many factors, including global batch size, gradient norm, and type of loss. We have some stability fixes in 1.18 and 1.19, but they are not a perfect solution, and stability will still depend on the training recipe.
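To illustrate why gradient magnitude and loss scale matter for FP16, here is a minimal sketch using only the Python standard library. It is not the library's implementation or one of the fixes mentioned above; `to_fp16` and `scaled_grad_fp16` are hypothetical helpers, and the loss scale of 1024 is an arbitrary assumption chosen for the demonstration.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the finite fp16 range;
        # we mimic hardware behaviour by returning +/-inf instead.
        return float("inf") if x > 0 else float("-inf")


def scaled_grad_fp16(grad: float, loss_scale: float) -> float:
    """Hypothetical helper: the value of grad * loss_scale as stored in fp16."""
    return to_fp16(grad * loss_scale)


tiny_grad = 1e-8

# A very small gradient underflows to zero in fp16.
print(to_fp16(tiny_grad))

# Scaling the loss (and therefore the gradient) makes it representable.
print(scaled_grad_fp16(tiny_grad, 1024.0))

# The same scale pushes a large gradient past FP16_MAX, producing inf.
print(scaled_grad_fp16(100.0, 1024.0))
```

The same scale that rescues small gradients can overflow large ones, which is one reason stability depends on the gradient norm and the loss, and not only on the framework version.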

Answer selected by mehadi92