[Bug Report] Step reward retains stale values when weight is dynamically set back to zero #2391

### Describe the bug

In the reward computation logic (the `compute` function), a reward term whose weight is zero is skipped entirely, so `self._step_reward` is not updated for that term.  
This is harmless if the term's weight stays zero from initialization, or if it is changed from zero to non-zero after the reward manager is created.  
However, if the weight is first changed to a non-zero value and later set back to zero, `self._step_reward` retains the stale (non-zero) values from the earlier updates, because the loop skips the update whenever the weight is zero.

Explicitly setting the term's entry in `self._step_reward` to zero when its weight is zero ensures correctness and prevents stale data.
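
The behavior and the proposed fix can be illustrated without Isaac Lab. The following is a minimal toy sketch, not the actual implementation: `ToyTermCfg`, `ToyRewardManager`, the `zero_skipped` flag, and `run_scenario` are hypothetical names, and plain Python lists stand in for the tensors. Only the skip-on-zero-weight loop structure is taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyTermCfg:
    """Hypothetical stand-in for a reward term config: a function and a weight."""

    func: Callable[[int], float]  # maps an environment index to a raw term value
    weight: float


class ToyRewardManager:
    """Toy model of the per-term loop; not the Isaac Lab implementation."""

    def __init__(self, term_cfgs: List[ToyTermCfg], num_envs: int, zero_skipped: bool):
        self._term_cfgs = term_cfgs
        self._num_envs = num_envs
        self._zero_skipped = zero_skipped  # True enables the proposed fix
        self._reward_buf = [0.0] * num_envs
        # _step_reward[env][term] mirrors a (num_envs, num_terms) layout.
        self._step_reward = [[0.0] * len(term_cfgs) for _ in range(num_envs)]

    def compute(self, dt: float) -> List[float]:
        self._reward_buf = [0.0] * self._num_envs
        for term_idx, cfg in enumerate(self._term_cfgs):
            if cfg.weight == 0.0:
                if self._zero_skipped:
                    # Proposed fix: clear the column instead of leaving old values.
                    for env in range(self._num_envs):
                        self._step_reward[env][term_idx] = 0.0
                continue
            for env in range(self._num_envs):
                value = cfg.func(env) * cfg.weight * dt
                self._reward_buf[env] += value
                self._step_reward[env][term_idx] = value / dt
        return self._reward_buf


def run_scenario(zero_skipped: bool) -> Tuple[List[float], List[float]]:
    """Weight goes 0 -> 2 -> 0; returns the term's column and the total reward."""
    term = ToyTermCfg(func=lambda env: 1.0 + env, weight=0.0)
    manager = ToyRewardManager([term], num_envs=2, zero_skipped=zero_skipped)
    manager._term_cfgs[0].weight = 2.0
    manager.compute(dt=0.02)
    manager._term_cfgs[0].weight = 0.0
    total = manager.compute(dt=0.02)
    column = [row[0] for row in manager._step_reward]
    return column, total


print(run_scenario(zero_skipped=False))  # ([2.0, 4.0], [0.0, 0.0]) -> stale column
print(run_scenario(zero_skipped=True))   # ([0.0, 0.0], [0.0, 0.0]) -> cleared column
```

In both runs the total reward is zero after the weight returns to zero; only the per-term column differs.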


### Steps to reproduce

1. Create or initialize a reward manager with a reward term that has zero weight.
2. Change the term's weight to a non-zero value at runtime.
3. Compute the reward and observe that `self._step_reward` is correctly updated with non-zero values.
4. Change the term's weight back to zero at runtime.
5. Compute the reward again.
6. Observe that `self._step_reward` still holds the previous non-zero value, because the update is skipped when the weight is zero.

Example (pseudo-code):

```python
# Assume `reward_manager` is initialized normally and `idx` is the index
# of a reward term whose weight starts at zero.
reward_manager._term_cfgs[idx].weight = 2.0  # Set to non-zero
reward_manager.compute(dt=0.02)

print(reward_manager._step_reward[:, idx])  # --> Should show non-zero values

reward_manager._term_cfgs[idx].weight = 0.0  # Set back to zero
reward_manager.compute(dt=0.02)

print(reward_manager._step_reward[:, idx])
# Expected: 0.0
# Observed: still shows the stale non-zero values
```

### System Info

Describe the characteristics of your environment:


- Commit: 2e6946afb9b26f6949d4b1fd0a00e9f4ef733fcc
- Isaac Lab Version: 2.1.0
- Isaac Sim Version: 4.5
- OS: Ubuntu 22.04
- GPU: RTX 3060
- CUDA: 12.4
- GPU Driver: 550.120

### Additional context

This bug was detected during live visualization of rewards in the Isaac Sim GUI, in a setup that used a reward curriculum setting.  
The `ManagerLiveVisualizer` relies on `_step_reward` to report reward terms visually, and here it reported stale reward values after the weight was dynamically changed back to zero.  
This issue affects any tool that uses `_step_reward` for visualization or logging.  

Note that the total reward (`_reward_buf`) computation remains correct, since skipping a term with zero weight or explicitly adding zero yields the same result.
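
Until the manager itself clears these entries, a consumer such as a logger could mask the per-term values by the current weights before reporting them. The following is a minimal sketch of that idea under stated assumptions: `mask_step_reward` is a hypothetical helper, plain lists stand in for the `(num_envs, num_terms)` tensor, and the weights are assumed to be read from the term configurations in term order.

```python
from typing import List, Sequence


def mask_step_reward(
    step_reward: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[List[float]]:
    """Return a copy of step_reward with columns of zero-weight terms forced to 0.0."""
    return [
        [value if weight != 0.0 else 0.0 for value, weight in zip(row, weights)]
        for row in step_reward
    ]


# Two environments, two terms. The second term's weight was set back to zero,
# so its column still holds values from an earlier step.
stale_step_reward = [[0.5, 2.0], [0.25, 4.0]]
current_weights = [1.0, 0.0]

print(mask_step_reward(stale_step_reward, current_weights))
# [[0.5, 0.0], [0.25, 0.0]]
```

This only hides the symptom on the consumer side; the stored values remain stale.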


### Checklist

- [x] I have checked that there is no similar issue in the repo (**required**)
- [x] I have checked that the issue is not in running Isaac Sim itself and is related to the repo

### Acceptance Criteria

Add the criteria for which this task is considered **done**. If not known at issue creation time, you can add this once the issue is assigned.

- [x] `self._step_reward` is explicitly set to zero for zero-weight terms
- [x] Per-term step reward accurately reflects zero contribution when weight is zero
