Upgrade gymnasium to 1.0.0 #502
Thanks for starting this @sdpkjc
ale-py should be updated to v0.10.1, and the autoreset mode of the vector environment should be updated.
pyproject.toml
-stable-baselines3 = "2.0.0"
-gymnasium = ">=0.28.1"
+stable-baselines3 = ">=2.4.0"
+gymnasium = ">=1.0.0"
Can this be specified as v1.1.0? If sb3 is the limitation, then I think I can see about updating it.
Yes, sb3 depends on <=1.0.0.
If I remember correctly, SB3 is used for the replay buffer and the Atari wrappers. IMO, those features can probably be shifted in
# Only for gymnasium v1.0.0
class SameModelSyncVectorEnv(gym.vector.SyncVectorEnv):
This should probably be called SameStepModeSyncVectorEnv
or we just shift to gymnasium v1.1.0
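For readers unfamiliar with the two autoreset modes named in this thread, the following is a toy, library-free sketch of how they differ. `CountdownEnv` and both rollout functions are hypothetical illustrations and not gymnasium's implementation; the `final_obs` key is assumed from the name reported elsewhere in this conversation.

```python
class CountdownEnv:
    """Toy environment (hypothetical): the observation is a step counter
    and the episode terminates once it reaches `horizon`."""

    def __init__(self, horizon=2):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        return self.t, 1.0, terminated


def rollout_next_step(env, n_steps):
    """Next-step autoreset: the reset happens on the step AFTER the episode ends."""
    env.reset()
    needs_reset = False
    trace = []
    for _ in range(n_steps):
        if needs_reset:
            # The action is ignored; this step only returns the reset observation.
            obs, reward, terminated, info = env.reset(), 0.0, False, {}
            needs_reset = False
        else:
            obs, reward, terminated = env.step(None)
            info = {}
            needs_reset = terminated
        trace.append((obs, reward, terminated, info))
    return trace


def rollout_same_step(env, n_steps):
    """Same-step autoreset: reset immediately, keep the last observation in info."""
    env.reset()
    trace = []
    for _ in range(n_steps):
        obs, reward, terminated = env.step(None)
        info = {}
        if terminated:
            info["final_obs"] = obs  # observation that ended the episode
            obs = env.reset()        # returned observation already belongs to the next episode
        trace.append((obs, reward, terminated, info))
    return trace
```

In the next-step trace, the terminal observation is returned directly and one extra "reset" step follows it; in the same-step trace, the terminal observation is only available through the info dictionary.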
@@ -12,13 +12,13 @@ license="MIT"
 readme = "README.md"

 [tool.poetry.dependencies]
-python = ">=3.8,<3.11"
+python = ">=3.9,<3.11"
What is limiting us from increasing this?
 tensorboard = "^2.10.0"
 wandb = "^0.13.11"
 gym = "0.23.1"
 torch = ">=1.12.1"
-stable-baselines3 = "2.0.0"
-gymnasium = ">=0.28.1"
+stable-baselines3 = "^2.4.0"
I suspect we will have a new release of sb3 with support for gymnasium v1.1.0 as no changes seem to be required on their end (DLR-RM/stable-baselines3#2095)
Hey, with the updates in gymnasium 1.1, would it not be easier to simply use the 'Same-Step Mode', or am I missing something? Does it have to do with the support for the other wrappers that are only supported in the 'Next step' mode?
@MarcusBinderDTU it is more about minimising implementation changes.
Thanks for the fast reply! I agree, but I don't understand why not going directly to gymnasium 1.1 and then using Would that not be the easiest way of doing it?
Thanks for your suggestion. However, since the currently released version of sb3 depends on gymnasium < 1.1, we can't upgrade to 1.1 directly. Once #505 is merged, we'll remove the sb3 dependency and then update to gymnasium 1.1, which will allow us to use
Ahh, now I see! Thanks for clarifying it, that makes sense :)
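The version window discussed in this exchange (gymnasium at least 1.0.0 but below 1.1 while sb3 remains a dependency) could be checked at script start-up. The sketch below is a stdlib-only approximation with hypothetical helper names; it compares numeric release parts only and ignores pre-release tags, so a dedicated version-parsing library would be more robust in practice.

```python
def parse_release(version):
    """Turn '1.0.0' into (1, 0, 0). Hypothetical helper: anything after the
    numeric release part (such as 'a1' or 'rc2') is ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def in_window(version, lower="1.0.0", upper="1.1.0"):
    """True if lower <= version < upper. The default bounds are assumptions
    taken from the discussion above, not from any package metadata."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)
```

The version string could come from `importlib.metadata.version("gymnasium")`, for example to print a warning when the installed release falls outside the window.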
@sdpkjc
For this line:
real_next_obs[idx] = infos["final_observation"][idx]
I think "final_observation" should be "final_obs".
Tried running and "final_observation" gives a key error but not "final_obs".
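One way to make this line tolerant of either key name is a small helper like the sketch below. `real_next_observations` is a hypothetical name, and the assumption that unfinished environments hold `None` in the final-observation container should be verified against the installed gymnasium version.

```python
def real_next_observations(next_obs, truncations, infos):
    """Return a copy of next_obs in which truncated envs get their final
    observation back (hypothetical helper, not part of gymnasium or CleanRL)."""
    # Assumption: newer gymnasium uses "final_obs"; older releases used "final_observation".
    key = "final_obs" if "final_obs" in infos else "final_observation"
    real_next_obs = next_obs.copy()
    if key not in infos:
        return real_next_obs  # no episode ended on this step
    for idx, trunc in enumerate(truncations):
        # Assumption: entries for envs that did not finish are None.
        if trunc and infos[key][idx] is not None:
            real_next_obs[idx] = infos[key][idx]
    return real_next_obs
```

Because only `.copy()`, indexing, and item assignment are used, the helper accepts plain lists as well as array-like batches.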
@sdpkjc I noticed that the scripts in
if "final_info" in infos:
    for info in infos["final_info"]:
        if "episode" not in info:
So we might need to update all the files in this directory too!
Description
Upgrade gymnasium to 1.0.0
gymnasium classic control
c51.py
c51_jax.py
dqn.py
dqn_jax.py
ppo.py
pqn.py
gymnasium mujoco
ddpg_continuous_action.py
ddpg_continuous_action_jax.py
td3_continuous_action.py
td3_continuous_action_jax.py
sac_continuous_action.py
ppo_continuous_action.py
rpo_continuous_action.py
gymnasium atari (EpisodicLifeEnv conflicts with gymnasium v1.0.0's RecordEpisodeStatistics and will be fixed later.)
c51_atari.py
c51_atari_jax.py
dqn_atari.py
dqn_atari_jax.py
qdagger_dqn_atari_impalacnn.py
qdagger_dqn_atari_jax_impalacnn.py
sac_atari.py
ppo_atari.py
ppo_atari_lstm.py
ppo_atari_multigpu.py
envpool
ppo_rnd_envpool.py
pqn_atari_envpool_lstm.py
pqn_atari_envpool.py
ppo_atari_envpool.py
ppo_atari_envpool_xla_jax.py
ppo_atari_envpool_xla_jax_scan.py
other
ppg_procgen.py
ppo_pettingzoo_ma_atari.py
ppo_procgen.py
ppo_trxl.py
ppo_continuous_action_isaacgym.py
Types of changes
Checklist:
pre-commit run --all-files passes (required).
mkdocs serve.
If you need to run benchmark experiments for performance-impacting changes:
--capture_video.
python -m openrlbenchmark.rlops.
python -m openrlbenchmark.rlops utility to the documentation.
python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.