Skip to content

Stable Transformer on Pong #16

@furmans

Description

@furmans

Hello,

I am currently unable to recreate the results of the stable transformer on the Pong environment. I believe from the paper the last 100 episode returns should be ~17.62 for this model and environment.

I am running the train program with arguments as specified in README for Best Performing Stable Transformer on Pong.

In train.py line 731 I changed ctx = mp.get_context("fork") to ctx = mp.get_context("spawn")

The final results I obtained one one run:

[INFO:17181 train:962 2020-12-01 19:35:33,350] Steps 10001513 @ 668.5 SPS. Loss -15.672254. Return per episode: -12.7. Stats:
{'baseline_loss': 11.395485877990723,
 'entropy_loss': -18.699639002482098,
 'episode_returns': [-20.0, -18.0, -19.0],
 'last_100_episode_returns': -19.530000686645508,
 'learning_rate': 8.657589688233862e-05,
 'len_max_traj': 239,
 'max_return_achieved': '-14.0 at step 5366379',
 'mean_episode_return': -12.666666666666666,
 'num_unpadded_steps': 3346,
 'pg_loss': -8.368099212646484,
 'total_loss': -15.672253926595053}
[INFO:17181 train:969 2020-12-01 19:35:33,350] Learning finished after 10001513 steps.

Results from another run:

[INFO:15271 train:962 2020-12-04 19:47:48,776] Steps 10001156 @ 661.4 SPS. Loss -9.595014. Return per episode: -19.7. Stats:
{'baseline_loss': 14.119840621948242,
 'entropy_loss': -18.633128484090168,
 'episode_returns': [-21.0, -19.0, -20.0, -19.0],
 'last_100_episode_returns': -19.540000915527344,
 'learning_rate': 9.02709105067138e-05,
 'len_max_traj': 239,
 'max_return_achieved': '-14.0 at step 7824133',
 'mean_episode_return': -19.666666666666668,
 'num_unpadded_steps': 3309,
 'pg_loss': -5.081725597381592,
 'total_loss': -9.595013936360678}
[INFO:15271 train:969 2020-12-04 19:47:48,776] Learning finished after 10001156 steps.

I am on Ubuntu 18.04.4, using Cuda 10.2, cudnn 7, torch 1.6.0.

Thanks in advance for any help.

Best,
Sean

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions