Why does the loss value spike? #134

Answered by mrdbourke
Lombiz asked this question in Q&A

Hey Lombiz,

Great question.

It looks like you've run into the exploding gradient problem.

In a nutshell, the gradients used to update the model's weights grow larger and larger as they are propagated back through the layers, so a small error turns into a big error (and a big weight update) very quickly.

This often happens after a neural network finds a good set of patterns (also called weights) but then keeps training.
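To see why it compounds so quickly, here's a toy example (plain Python with made-up numbers, just to illustrate the idea):

```python
# Toy illustration (made-up numbers): if each of 30 layers scales the
# gradient by ~1.5 during backpropagation, a gradient that starts at 1.0
# has exploded by the time it reaches the early layers.
grad = 1.0
for _ in range(30):
    grad *= 1.5  # each layer amplifies the signal a little
print(grad)  # ~191751.1 - a small signal has become enormous
```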

Think of it like a workout: you get to a certain point and feel good afterwards, but if you keep pushing well past that point, you end up worse off.

To fix this problem you can:

  • Stop training earlier (before the gradients explode) - you can implement this using the EarlyStopping callback in TensorFlow
  • Add in other regularization techniques such as learning rate decay (decrease how much the model updates its weights as training goes on) - see the sketch below for both
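Here's a minimal sketch of both fixes in TensorFlow (the tiny model and random data are just placeholders, so swap in your own):

```python
import numpy as np
import tensorflow as tf

# Placeholder data: 1,000 samples, 10 features (stand-in for your dataset)
X = np.random.rand(1000, 10).astype("float32")
y = X.sum(axis=1, keepdims=True)

# Placeholder model (stand-in for your own architecture)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="mse")

callbacks = [
    # Fix 1: stop training when val_loss stops improving, and keep the
    # best weights found so far
    tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                     patience=3,
                                     restore_best_weights=True),
    # Fix 2: a form of learning rate decay - halve the learning rate
    # whenever val_loss plateaus, so weight updates shrink over time
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                         factor=0.5,
                                         patience=2),
]

history = model.fit(X, y,
                    validation_split=0.2,
                    epochs=50,
                    callbacks=callbacks)
```

With these in place, training stops before the loss can spike, and the shrinking learning rate keeps each weight update small even if a gradient does get large.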
