Hey there,
I am training on the q5 problem, but training is not going well. Do you have any idea what might be wrong?
Here is a line from the training log:
122501/5000000 [..............................] - ETA: 34518s - Loss: 0.1562 - Avg_R: -21.0000 - Max_R: -21.0000 - eps: -55.3107 - Grads: 0.0000 - Max_Q: 0.0000 - lr: -0.0048
The gradient is 0 and Max_Q is 0, and the reward has not improved at all.
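One more thing visible in the log line: both eps and lr are negative (-55.3107 and -0.0048). That might mean a linear schedule is being extrapolated past its final step instead of being held at its end value, though this is only a guess from the log and not something confirmed against the code. A minimal sketch of a clamped schedule for comparison; the names `linear_schedule`, `start`, `end`, and `n_steps` are hypothetical and not taken from this repo:

```python
def linear_schedule(t, start, end, n_steps):
    """Linearly anneal from start to end over n_steps, then hold at end.

    Hypothetical helper for illustration only; not the repo's API.
    """
    if t >= n_steps:
        # Past the end of the schedule: hold the final value
        # instead of continuing along the line.
        return end
    return start + (end - start) * t / n_steps


def unclamped_schedule(t, start, end, n_steps):
    """Same line without the clamp; goes past end once t > n_steps."""
    return start + (end - start) * t / n_steps
```

With a decreasing schedule, the unclamped version goes below `end` and eventually below zero once `t` is far enough past `n_steps`, which would match the sign of the values in the log.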