Better rewarding system #2597

Narsimha-swamy · 2025-04-25T06:46:01Z


Intro

Hi!

I am a graduate student at UMN. I use MuJoCo for my research on imitation learning for manipulation and plan to extend it to reinforcement learning.

My setup

MuJoCo==3.2.4
Linux==20.04

My question

I am looking for a better way to assign rewards in MuJoCo. Currently, I assign rewards in this format.

However, during execution, even if the object touches the gripper and then slips, the episode is still counted as successful because the maximum reward for the task was reached at some point. A quick workaround I can think of: require the maximum reward to be held for n consecutive time steps before the task is called successful.
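A minimal sketch of that workaround, in plain Python so it does not depend on how the reward itself is computed. The names `ConsecutiveSuccess`, `episode_succeeded`, `max_reward`, `n_steps`, and `tol` are hypothetical and only for illustration; they are not part of MuJoCo.

```python
class ConsecutiveSuccess:
    """Report success only after the reward stays at its maximum
    for n_steps consecutive time steps (hypothetical helper)."""

    def __init__(self, max_reward: float, n_steps: int, tol: float = 1e-6):
        if n_steps < 1:
            raise ValueError("n_steps must be at least 1")
        self.max_reward = max_reward
        self.n_steps = n_steps
        self.tol = tol      # tolerance for floating-point rewards
        self.streak = 0     # consecutive steps at max reward so far

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.streak = 0

    def update(self, reward: float) -> bool:
        """Feed one per-step reward; return True once the streak is long enough."""
        if reward >= self.max_reward - self.tol:
            self.streak += 1
        else:
            self.streak = 0  # a slip breaks the streak
        return self.streak >= self.n_steps


def episode_succeeded(rewards, max_reward: float, n_steps: int) -> bool:
    """Check a whole sequence of per-step rewards after the fact."""
    tracker = ConsecutiveSuccess(max_reward, n_steps)
    return any(tracker.update(r) for r in rewards)
```

With this, a touch followed by a slip such as `[0, 1, 0, 0]` is not counted as success for `n_steps=3`, while a held grasp such as `[0, 1, 1, 1]` is. The value of `n_steps` would have to be tuned against the simulation time step.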

But I wonder whether there is a standard method or procedure that the community follows for assigning sparse/dense rewards, since rewards are a core requirement for RL tasks but matter much less in imitation learning (other than for evaluating the policy). I would like to know the community's thoughts on this.

Minimal model and/or code that explain my question

link for xml
link for code

Confirmations

I searched the latest documentation thoroughly before posting.
I searched previous Issues and Discussions, I am certain this has not been raised before.

Replies: 0 comments