Unable to reproduce results on Humanoid-v2 in new SAC #16

zwfightzw · 2019-08-30T02:40:46Z

I am unable to reproduce the result reported in the paper 'Soft Actor-Critic Algorithms and Applications' on the OpenAI Gym environment Humanoid-v2. After 10 million steps my result is 6000, while the paper reports 8000.

Do you know what might be causing this issue? Thank you!

pranz24 · 2019-08-30T04:24:51Z

Hmm.....
I don't know why this would happen, although I have never tested on Humanoid for 10 million steps.
The result of 8000 on Humanoid is for a learned temperature (alpha).
For a fixed alpha, I think a result of 6000 is all right.
I don't know whether you changed the argument automatic_entropy_tuning to True (it is False by default). For --automatic_entropy_tuning = False, 6000 is the expected result.
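
For readers unfamiliar with the difference between a fixed and a learned alpha, here is a minimal sketch of the temperature update that automatic entropy tuning usually refers to. It assumes the objective from the SAC paper, parametrized through log alpha, with the target entropy set to minus the action dimension. This is not the repository's code: the function name `temperature_step`, the use of plain gradient descent instead of an optimizer such as Adam, the sample log-probabilities, and the action dimension of 17 for Humanoid-v2 are all illustrative assumptions.

```python
import math


def temperature_step(log_alpha, log_probs, target_entropy, lr=3e-4):
    """One plain gradient-descent step on the SAC temperature loss.

    Loss as a function of log_alpha:
        -log_alpha * mean(log_pi + target_entropy)
    where log_pi are log-probabilities of actions sampled from the policy.
    """
    mean_term = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -mean_term  # derivative of the loss with respect to log_alpha
    return log_alpha - lr * grad


# Hypothetical values for illustration only.
action_dim = 17                        # assumption: Humanoid-v2 action dimension
target_entropy = -float(action_dim)    # heuristic target: -dim(action space)
log_alpha = math.log(0.2)              # start from the fixed-alpha default of 0.2
sample_log_probs = [-10.0, -12.5, -9.0]  # made-up batch of log pi(a|s)

new_log_alpha = temperature_step(log_alpha, sample_log_probs, target_entropy)
print("alpha before:", math.exp(log_alpha), "alpha after:", math.exp(new_log_alpha))
```

With a fixed alpha this step is simply skipped and alpha stays at its initial value. With the update enabled, alpha grows when the policy entropy falls below the target and shrinks when it is above it.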

pranz24 · 2019-09-22T06:52:04Z

I just ran Humanoid for 10 million steps, and unfortunately cannot reproduce the problems you're observing.
Here are the results I see across 2 seeds:

Maybe there's something different with the arguments or environment you used?
Also, which MuJoCo version are you using?
P.S. running the env for 10 million steps twice costs a lot 😝. But I will run again if you can be more specific 😬.

zwfightzw · 2019-09-24T00:42:02Z

Thank you very much!!!
The parameter settings for the experiment follow the original code: Namespace(alpha=0.2, automatic_entropy_tuning=True, batch_size=256, env_name='Humanoid-v2', eval=True, gamma=0.99, hidden_size=256, lr=0.0003, num_steps=10000001, policy='Gaussian', replay_size=1000000, seed=0, start_steps=10000, target_update_interval=1, tau=0.005, updates_per_step=1).
The Gym version is '0.14.0' and the mujoco_py version is '1.50.1.68'. The MuJoCo physics engine version is 150.
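
When comparing runs, it can help to turn a pasted `Namespace(...)` line like the one above into a dictionary instead of reading it by eye. A minimal sketch using only the standard library follows; the helper name `parse_namespace` is hypothetical, and it assumes every value in the line is a plain Python literal (numbers, strings, booleans).

```python
import ast


def parse_namespace(text):
    """Parse the repr of an argparse Namespace into a plain dict.

    Assumes all values are Python literals; a trailing period from the
    surrounding sentence is stripped before parsing.
    """
    expr = ast.parse(text.strip().rstrip("."), mode="eval").body
    if not isinstance(expr, ast.Call):
        raise ValueError("expected a Namespace(...) expression")
    # Each keyword argument becomes one dictionary entry.
    return {kw.arg: ast.literal_eval(kw.value) for kw in expr.keywords}


# Shortened copy of the settings reported in this thread.
reported = (
    "Namespace(alpha=0.2, automatic_entropy_tuning=True, batch_size=256, "
    "env_name='Humanoid-v2', num_steps=10000001, seed=0)."
)
config = parse_namespace(reported)
for name, value in sorted(config.items()):
    print(f"{name} = {value!r}")
```

The values are evaluated with `ast.literal_eval`, so nothing in the pasted text is executed.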

qyz55 · 2019-12-16T07:46:03Z

> Thank you very much!!!
> The parameter settings for the experiment follow the original code: Namespace(alpha=0.2, automatic_entropy_tuning=True, batch_size=256, env_name='Humanoid-v2', eval=True, gamma=0.99, hidden_size=256, lr=0.0003, num_steps=10000001, policy='Gaussian', replay_size=1000000, seed=0, start_steps=10000, target_update_interval=1, tau=0.005, updates_per_step=1).
> The Gym version is '0.14.0' and the mujoco_py version is '1.50.1.68'. The MuJoCo physics engine version is 150.

I used almost the same parameters, except with "--automatic_entropy_tuning = True", for 10 million steps, and I got the following result:

I ran the experiment only once, but I couldn't reproduce the score of 8000 either. Would @pranz24 mind sharing the parameters? My Gym version is 0.10.9 and my mujoco_py version is 1.50.1.68. The MuJoCo physics engine version is 200.
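
Since two setups have now been reported in this thread, a small diff makes the differences explicit. The sketch below uses only the version numbers stated above; the helper name `diff_setups` and the dictionary keys are hypothetical, and it says nothing about whether these differences explain the gap in scores.

```python
def diff_setups(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Versions as reported by the two users in this thread.
zwfightzw_setup = {"gym": "0.14.0", "mujoco_py": "1.50.1.68", "mujoco": "150"}
qyz55_setup = {"gym": "0.10.9", "mujoco_py": "1.50.1.68", "mujoco": "200"}

for key, (left, right) in diff_setups(zwfightzw_setup, qyz55_setup).items():
    print(f"{key}: {left} vs {right}")
```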

pranz24 · 2019-12-17T05:48:33Z

For --automatic_entropy_tuning = False 6000 is the expected result.

For a fixed temperature, the results should be around 6000.
You can also check this in the paper.

xfdywy · 2020-10-17T03:28:38Z

Hi, I have the same problem. I ran the code with automatic_entropy_tuning = True, but the result is still around 6000. Would you mind sharing the run configuration for your curve? Thank you very much.

pranz24 assigned pranz24 and unassigned pranz24 on Aug 30, 2019

pranz24 added the "question" label (Further information is requested) on Aug 30, 2019

pranz24 removed the "question" label (Further information is requested) on Jun 27, 2020

pranz24 added the "help wanted" label (Extra attention is needed) on Dec 24, 2021
