PPO outputs NaN action values #321
Unanswered
giangdao1402 asked this question in Q&A
Replies: 0 comments
Hi @Toni-SM,
Thanks for your good work!
I am currently training a differential-drive mobile robot (2 wheels) using PPO in Isaac Lab.
However, my network outputs NaN action values:
acts : tensor([[nan, nan]], device='cuda:0')
When I use the default training template with the .yaml config, everything works well, but when I implement my own network to clip the actions, the error occurs.
My simple network is shown below:
I have already tried using skrl's default network without modifying the compute function of the Policy class, but it does not work. Can you give me some suggestions for solving this problem?
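For anyone trying to narrow down a similar problem, here is a minimal sketch of two checks that may help. It assumes plain PyTorch rather than skrl's model classes: `BoundedPolicyNet` and `first_non_finite` are hypothetical names made up for illustration, not part of skrl or Isaac Lab. The sketch bounds actions with `tanh` inside the network, and it scans named tensors (observations, parameters, outputs) to find the first one that contains NaN or inf. Note that `tanh` and `torch.clamp` both propagate NaN, so clipping alone cannot hide a NaN that enters through the observations or the weights.

```python
import torch
import torch.nn as nn


class BoundedPolicyNet(nn.Module):
    """Hypothetical stand-in for a custom policy network (not skrl's API)."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ELU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # tanh keeps every finite output in [-1, 1]; a NaN input stays NaN.
        return torch.tanh(self.net(obs))


def first_non_finite(named_tensors):
    """Return the name of the first tensor containing NaN/inf, else None."""
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None


if __name__ == "__main__":
    policy = BoundedPolicyNet(obs_dim=4, act_dim=2)
    obs = torch.zeros(1, 4)
    actions = policy(obs)
    # Check inputs, weights, and outputs in the order data flows through.
    culprit = first_non_finite(
        [("observations", obs)]
        + list(policy.named_parameters())
        + [("actions", actions)]
    )
    print("first non-finite tensor:", culprit)
```

Running a check like this once per step, on the observations first and then on the parameters, can show whether the NaN comes from the environment or appears only after an optimizer update.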