Unable to reproduce results on Humanoid-v2 in new SAC #16

Open
zwfightzw opened this issue Aug 30, 2019 · 6 comments
Labels
help wanted Extra attention is needed

Comments

@zwfightzw

I am unable to reproduce the result reported in the paper 'Soft Actor-Critic Algorithms and Applications' on the OpenAI Gym environment Humanoid-v2. My result is 6000 after 10 million steps, while the paper reports 8000.

Do you know what might be causing this issue? Thank you!

pranz24 assigned pranz24 and unassigned pranz24 Aug 30, 2019
pranz24 added the question (Further information is requested) label Aug 30, 2019
@pranz24
Owner

pranz24 commented Aug 30, 2019

Hmm...
I don't know why this would happen, although I have never tested on Humanoid for 10 million steps.
The result of 8000 on Humanoid is for a learned temperature (alpha).
For a fixed alpha, I think a result of 6000 is about right.
Did you change the argument automatic_entropy_tuning to True (it is False by default)? With --automatic_entropy_tuning = False, 6000 is the expected result.
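
For readers unfamiliar with the distinction, here is a minimal sketch of what a learned temperature means, following the objective described in the SAC paper rather than this repository's code. Assumptions: the target entropy is set to the negative action dimension (the paper's heuristic), Humanoid-v2 is taken to have 17 action dimensions, the update is plain gradient descent (implementations typically use Adam), and the log-probability batches are made-up numbers for illustration only.

```python
import math


def temperature_step(log_alpha, log_probs, target_entropy, lr=0.0003):
    """One gradient-descent step on J = mean(-log_alpha * (log_pi + target_entropy)).

    The gradient with respect to log_alpha is -mean(log_pi + target_entropy).
    """
    grad = -sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    return log_alpha - lr * grad


action_dim = 17                        # assumed action dimension of Humanoid-v2
target_entropy = -float(action_dim)    # heuristic: -dim(A)
log_alpha = 0.0                        # i.e. alpha starts at 1.0

# Hypothetical batch where the policy is too deterministic:
# entropy estimate is -mean(log_pi) = -20, below the target of -17.
low_entropy = [20.0, 22.0, 18.0]
raised = temperature_step(log_alpha, low_entropy, target_entropy)

# Hypothetical batch where the policy is more random than required:
# entropy estimate is -10, above the target of -17.
high_entropy = [10.0, 12.0, 8.0]
lowered = temperature_step(log_alpha, high_entropy, target_entropy)

print(math.exp(raised) > 1.0, math.exp(lowered) < 1.0)
```

With a fixed alpha (here 0.2 by default) this update is simply skipped, so the entropy bonus keeps the same weight for the whole run.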

@pranz24
Owner

pranz24 commented Sep 22, 2019

I just ran Humanoid for 10 million steps, and unfortunately cannot reproduce the problems you're observing.
Here are the results I see across 2 seeds:
[Image: Screenshot from 2019-09-22 12-07-24]

Maybe there's something different in the arguments or environment you used?
Also, which MuJoCo version are you using?
P.S. Running the env for 10 million steps twice costs a lot 😝, but I will run it again if you can be more specific 😬.
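
One way to collect the Python package versions asked about here is sketched below. It is only a suggestion: it uses the standard-library `importlib.metadata` (Python 3.8+), the distribution names `gym` and `mujoco-py` are assumed to match how the packages were installed, and it does not report the MuJoCo physics engine version itself, which has to be checked separately.

```python
from importlib import metadata


def report_versions(packages=("gym", "mujoco-py")):
    """Return {distribution name: installed version, or 'not installed'}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Print one line per package so the output can be pasted into an issue.
for name, version in report_versions().items():
    print(f"{name}: {version}")
```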

@zwfightzw
Author

Thank you very much!!!
The parameter settings for the experiment follow the original code: Namespace(alpha=0.2, automatic_entropy_tuning=True, batch_size=256, env_name='Humanoid-v2', eval=True, gamma=0.99, hidden_size=256, lr=0.0003, num_steps=10000001, policy='Gaussian', replay_size=1000000, seed=0, start_steps=10000, target_update_interval=1, tau=0.005, updates_per_step=1).
The Gym version is 0.14.0 and mujoco_py is 1.50.1.68. The MuJoCo physics engine version is 150.

@qyz55

qyz55 commented Dec 16, 2019

> Thank you very much!!!
> The parameter settings for the experiment follow the original code: Namespace(alpha=0.2, automatic_entropy_tuning=True, batch_size=256, env_name='Humanoid-v2', eval=True, gamma=0.99, hidden_size=256, lr=0.0003, num_steps=10000001, policy='Gaussian', replay_size=1000000, seed=0, start_steps=10000, target_update_interval=1, tau=0.005, updates_per_step=1).
> The Gym version is 0.14.0 and mujoco_py is 1.50.1.68. The MuJoCo physics engine version is 150.

I ran with almost the same parameters (except for "--automatic_entropy_tuning = True") for 10 million steps and got the following result:
[Image]
I ran the experiment only once, but I couldn't reproduce the score of 8000 either. Would @pranz24 mind sharing the parameters? My Gym version is 0.10.9 and mujoco_py is 1.50.1.68. The MuJoCo physics engine version is 200.
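
To make the two reported setups easier to compare, here is a small sketch that diffs them. It only encodes values stated in this thread; the helper name `diff_setups` and the field names `gym`, `mujoco_py`, and `mujoco` are made up for illustration, and any hyperparameters that were not posted are left out.

```python
from argparse import Namespace


def diff_setups(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    a, b = vars(a), vars(b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Values as reported by each commenter in this thread.
zwfightzw = Namespace(automatic_entropy_tuning=True, gym="0.14.0",
                      mujoco_py="1.50.1.68", mujoco="150")
qyz55 = Namespace(automatic_entropy_tuning=True, gym="0.10.9",
                  mujoco_py="1.50.1.68", mujoco="200")

print(diff_setups(zwfightzw, qyz55))
# → {'gym': ('0.14.0', '0.10.9'), 'mujoco': ('150', '200')}
```

On the reported values, the two runs differ in Gym version and MuJoCo engine version, and both fall short of 8000.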

@pranz24
Owner

pranz24 commented Dec 17, 2019

> For --automatic_entropy_tuning = False 6000 is the expected result.

For a fixed temperature, the results should be around 6000.
You can also check this in the paper.

pranz24 removed the question (Further information is requested) label Jun 27, 2020
@xfdywy

xfdywy commented Oct 17, 2020

Hi, I have the same problem. I ran the code with automatic_entropy_tuning = True, but the result is still around 6000. Would you mind sharing the run config for your curve? Thank you very much.

pranz24 added the help wanted (Extra attention is needed) label Dec 24, 2021

4 participants