Thank you very much for your great work! I would like to ask for your advice. When I reproduced your training code, the mean reward trended upward, but the loss computed from the probability distribution barely decreased and stayed constant at around 0.693. Is this normal? Looking forward to your response.
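For reference, 0.693 is approximately ln 2, which is the value a binary log-loss takes when the predicted probability is exactly 0.5. The sketch below illustrates this under the assumption that the loss has the form -log(sigmoid(margin)); the repository's actual loss is not shown in this issue, and `pairwise_log_loss` and `margin` are hypothetical names used only for illustration.

```python
import math


def pairwise_log_loss(margin: float) -> float:
    """Return -log(sigmoid(margin)), computed in a numerically stable way.

    Assumption: the training loss has this binary log-sigmoid form.
    `margin` is a hypothetical name for the logit difference fed to the sigmoid.
    """
    # -log(sigmoid(m)) equals log(1 + exp(-m)), also known as softplus(-m).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp(-m).
    return -margin + math.log1p(math.exp(margin))


# With a margin of zero the predicted probability is 0.5 and the loss is ln 2.
print(pairwise_log_loss(0.0), math.log(2))
```

If the loss does have this form, a value pinned near ln 2 would suggest the margin is staying close to zero, so logging the margin (or the two probabilities being compared) alongside the loss may help narrow this down.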