Better rewarding system #2597
Unanswered
Narsimha-swamy
asked this question in
Asking for Help
Replies: 0 comments
Intro
Hi!
I am a graduate student at UMN. I use MuJoCo for my research on imitation learning for manipulation and plan to extend it to reinforcement learning.
My setup
Mujoco==3.2.4
Linux ==20.04
My question
I am looking for a better way to assign rewards in MuJoCo. Currently, I assign rewards in this format.
However, during execution, even if the object only touches the gripper and then slips, the episode is still counted as successful, because the maximum reward for the task was reached at some point. I can think of a quick workaround: the maximum reward must be held for n consecutive time steps before the task is called successful.
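To make the workaround concrete, here is a minimal sketch of what I have in mind. The class name `ConsecutiveSuccess` and the parameters `max_reward`, `n_required`, and `tol` are hypothetical names I made up for illustration; nothing here is a MuJoCo API, it only consumes the per-step reward my environment already computes.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveSuccess:
    """Report success only after the reward stays at its maximum for n steps in a row."""

    max_reward: float   # hypothetical: maximum per-step reward of the task
    n_required: int     # hypothetical: consecutive steps the maximum must be held
    tol: float = 1e-6   # tolerance for floating-point comparison
    streak: int = 0     # current run of max-reward steps

    def update(self, reward: float) -> bool:
        """Feed one step's reward; return True once the streak is long enough."""
        if reward >= self.max_reward - self.tol:
            self.streak += 1
        else:
            self.streak = 0  # a slip breaks the streak
        return self.streak >= self.n_required

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.streak = 0


# Example: the object touches the gripper once, slips, then is held for 3 steps.
tracker = ConsecutiveSuccess(max_reward=1.0, n_required=3)
rewards = [0.2, 1.0, 0.4, 1.0, 1.0, 1.0]
flags = [tracker.update(r) for r in rewards]
```

With this, the single touch at the second step does not count, and success is only flagged on the last step, once the maximum reward has been held for three steps in a row.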
But I wonder whether there is a standard method or procedure that the community follows for assigning sparse or dense rewards, since rewards are essential for RL tasks but much less so in imitation learning (where they are mainly used to evaluate the policy). I would like to know the community's thoughts on this.
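For reference, this is roughly what I mean by sparse versus dense rewards for a pick-and-lift task. It is only a sketch under my own assumptions: the function names, the 0.5/0.5 weights, the `tanh` scale of 10.0, and the `grasped` flag are all hypothetical and not taken from MuJoCo or from any particular benchmark.

```python
import math


def sparse_reward(success: bool) -> float:
    """Sparse: 1 only when the task's success condition holds, else 0."""
    return 1.0 if success else 0.0


def dense_reward(gripper_pos, object_pos, object_height, target_height, grasped):
    """Dense: shaped reward in [0, 1] built from a reaching term and a lifting term."""
    # Reaching term: approaches 1 as the gripper gets close to the object.
    dist = math.dist(gripper_pos, object_pos)
    reach = 1.0 - math.tanh(10.0 * dist)  # scale 10.0 is an arbitrary assumption

    # Lifting term: only counts while the object is grasped (assumed flag).
    lift = 0.0
    if grasped:
        lift = min(max(object_height / target_height, 0.0), 1.0)

    # Equal weighting is an arbitrary choice for illustration.
    return 0.5 * reach + 0.5 * lift


r_far = dense_reward((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 0.0, 0.1, grasped=False)
r_done = dense_reward((0.0, 0.0, 0.1), (0.0, 0.0, 0.1), 0.1, 0.1, grasped=True)
```

Here `r_far` is close to 0 and `r_done` reaches the maximum of 1, which is exactly the value that a momentary touch-and-slip can also hit if the grasp check is too permissive.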
Minimal model and/or code that explain my question
link for xml
link for code