Document Atari wrapper reset behavior #2170
Merged
+35
−2
This PR addresses issue #666 by documenting the confusing behavior of `env.reset()` in Atari environments when using the default `terminal_on_life_loss=True` setting.
closes #666
Problem
When using `make_atari_env()` with default settings, the `EpisodicLifeEnv` wrapper causes `env.reset()` to perform a no-op step instead of truly resetting the environment when the episode ends due to a life loss (not game over). This leads to unexpected behavior where consecutive calls to `reset()` return the same observation.
Solution
This PR adds comprehensive documentation to explain:
Root cause: The `EpisodicLifeEnv` wrapper's `reset()` method only performs a true reset when `self.was_real_done` is True (game over); otherwise it executes a no-op step.
Workaround: Users can avoid this behavior by setting `terminal_on_life_loss=False`.
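A minimal sketch of the workaround, assuming the project is Stable-Baselines3, where `make_atari_env()` forwards a `wrapper_kwargs` dictionary to the Atari wrapper. The environment id and the helper name are illustrative and not taken from this PR.

```python
# Assumption: Stable-Baselines3, where make_atari_env() passes wrapper_kwargs
# through to the Atari wrapper. The env id below is only an example.
WRAPPER_KWARGS = {"terminal_on_life_loss": False}


def make_env_without_life_loss_episodes(env_id="BreakoutNoFrameskip-v4", n_envs=1):
    """Hypothetical helper: build an Atari env whose episodes end only on game over."""
    # Imported lazily so this module loads even where the library is absent.
    from stable_baselines3.common.env_util import make_atari_env

    return make_atari_env(env_id, n_envs=n_envs, wrapper_kwargs=WRAPPER_KWARGS)
```

The trade-off is that a lost life is no longer reported as the end of an episode, so `reset()` is only ever needed after game over.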
Changes Made
`EpisodicLifeEnv` class and `reset()` method docstrings
`make_atari_env()` docstring with a note about this behavior
This is purely a documentation change with no code behavior modifications, ensuring existing functionality remains intact while helping users understand and work around this potentially confusing behavior.
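The behavior these docstrings describe can be illustrated with a small simulation. This is a sketch under stated assumptions, not the library's implementation: `ToyLivesEnv` and `EpisodicLifeSketch` are hypothetical names, the toy environment uses the remaining life count as its observation, and `step()` returns a simplified `(obs, reward, done)` tuple.

```python
class ToyLivesEnv:
    """Hypothetical stand-in for an Atari game; the observation is the life count."""

    def __init__(self, start_lives=3):
        self.start_lives = start_lives
        self.lives = start_lives

    def reset(self):
        self.lives = self.start_lives
        return self.lives

    def step(self, action):
        # Action 1 costs a life; action 0 is a no-op that changes nothing.
        if action == 1:
            self.lives -= 1
        done = self.lives == 0  # real game over
        return self.lives, 0.0, done


class EpisodicLifeSketch:
    """Simplified model of the reset logic described in the root cause above."""

    def __init__(self, env):
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.was_real_done = done
        if 0 < self.env.lives < self.lives:
            done = True  # life lost: report the episode as finished
        self.lives = self.env.lives
        return obs, reward, done

    def reset(self):
        if self.was_real_done:
            obs = self.env.reset()  # true reset, only after game over
        else:
            obs, _, done = self.env.step(0)  # no-op step after a life loss
            if done:
                obs = self.env.reset()
        self.lives = self.env.lives
        return obs


def demo():
    env = EpisodicLifeSketch(ToyLivesEnv())
    initial = env.reset()                            # true reset
    env.step(1)                                      # lose one life
    after_life_loss = [env.reset(), env.reset()]     # two no-op "resets"
    env.step(1)
    env.step(1)                                      # last life lost: game over
    after_game_over = env.reset()                    # true reset again
    return initial, after_life_loss, after_game_over


initial, after_life_loss, after_game_over = demo()
```

In this toy run, both `reset()` calls after the life loss return the same two-lives observation rather than the initial three-lives one, and only the reset after game over restores the starting state.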