Skip to content

[Bug Report] Step reward retains stale values when weight is dynamically set back to zero #2391

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
4 tasks done
bikcrum opened this issue Apr 28, 2025 · 2 comments · May be fixed by #2392
Open
4 tasks done

[Bug Report] Step reward retains stale values when weight is dynamically set back to zero #2391

bikcrum opened this issue Apr 28, 2025 · 2 comments · May be fixed by #2392
Labels
bug Something isn't working

Comments

@bikcrum
Copy link

bikcrum commented Apr 28, 2025

Describe the bug

In the reward computation logic (compute function), when a reward term has a weight of zero, the code skips computation and does not update self._step_reward for that term.
This is not an issue if a term's weight remains zero from initialization or is changed from zero to non-zero after the reward manager is created.
However, if a term's weight is first changed to non-zero and later back to zero, the self._step_reward would retain stale (nonzero) values from earlier updates, since the loop skips updating it when the weight is zero.

Explicitly setting self._step_reward to zero when the weight is zero ensures correctness and prevents stale data,

Steps to reproduce

  1. Create or initialize a reward manager with a reward term having zero weight.
  2. Change the weight of the term dynamically to a non-zero value during runtime.
  3. Compute the reward and observe that self._step_reward is correctly updated with nonzero values.
  4. Change the term’s weight back to zero dynamically.
  5. Compute the reward again.
  6. Observe that self._step_reward still retains the previous nonzero value because the update is skipped when the weight is zero.

Example (pseudo-code):

# Assume reward manager initialized normally
reward_manager._term_cfgs[idx].weight = 2.0  # Set to non-zero
reward_manager.compute(dt=0.02)

print(reward_manager._step_reward[:, idx])  # --> Should show nonzero value

reward_manager._term_cfgs[idx].weight = 0.0  # Set back to zero
reward_manager.compute(dt=0.02)

print(reward_manager._step_reward[:, idx])  
# Expected: 0.0
# Observed: still showing stale nonzero value

System Info

Describe the characteristic of your environment:

  • Commit: 2e6946a
  • Isaac Lab Version: 2.1.0
  • Isaac Sim Version: 4.5
  • OS: Ubuntu 22.04
  • GPU: RTX 3060
  • CUDA: 12.4
  • GPU Driver: 550.120

Additional context

This bug was detected during live visualization of rewards in the IsaacSim GUI, where a curriculum reward setting was used.
The ManagerLiveVisualizer relies on _step_reward to report reward terms visually, and in this case, it incorrectly reported stale reward values when the weight was dynamically changed back to zero.
This issue affects any tool that uses _step_reward for visualization or logging.

Note that the total reward (_reward_buf) computation remains correct, since skipping a term with zero weight or explicitly adding zero yields the same result.

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have checked that the issue is not in running Isaac Sim itself and is related to the repo

Acceptance Criteria

Add the criteria for which this task is considered done. If not known at issue creation time, you can add this once the issue is assigned.

  • self._step_reward is explicitly set to zero for zero-weight terms
  • Per-term step reward accurately reflects zero contribution when weight is zero
@bikcrum bikcrum linked a pull request Apr 28, 2025 that will close this issue
10 tasks
@RandomOakForest
Copy link
Collaborator

Thank you for posting this. Did you mean Isaac Lab 2.1.0 and Isaac Sim 4.5? Thank you for adding the PR.

@RandomOakForest RandomOakForest added the bug Something isn't working label Apr 28, 2025
@bikcrum
Copy link
Author

bikcrum commented Apr 28, 2025

Thank you for posting this. Did you mean Isaac Lab 2.1.0 and Isaac Sim 4.5? Thank you for adding the PR.

That is correct. Thanks for pointing that out. I have updated them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants