Stable Transformer on Pong

Hello,

I am currently unable to reproduce the results of the stable transformer on the Pong environment. From the paper, I believe the last-100-episode return should be ~17.62 for this model and environment.

I am running the training program with the arguments specified in the README for "Best Performing Stable Transformer on Pong".

In train.py, line 731, I changed `ctx = mp.get_context("fork")` to `ctx = mp.get_context("spawn")`.
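For reference, the effect of that change can be sketched roughly as below. This is not the actual code from train.py, just a minimal illustration; `make_context` is a hypothetical helper name I made up here. With "spawn", each worker starts in a fresh Python interpreter instead of being a copy of the parent process, which is the main behavioral difference from "fork".

```python
import multiprocessing as mp


def make_context(method="spawn"):
    """Return a multiprocessing context for the requested start method.

    Hypothetical helper for illustration only; train.py calls
    mp.get_context(...) directly.
    """
    available = mp.get_all_start_methods()
    if method not in available:
        raise ValueError(
            f"start method {method!r} not available; choose from {available}"
        )
    return mp.get_context(method)


if __name__ == "__main__":
    # Equivalent to the edited line: ctx = mp.get_context("spawn")
    ctx = make_context("spawn")
    print(ctx.get_start_method())
```

Processes and queues created from `ctx` (for example `ctx.Process(...)`) then use the selected start method, without changing the global default.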

The final results I obtained on one run:
```
[INFO:17181 train:962 2020-12-01 19:35:33,350] Steps 10001513 @ 668.5 SPS. Loss -15.672254. Return per episode: -12.7. Stats:
{'baseline_loss': 11.395485877990723,
 'entropy_loss': -18.699639002482098,
 'episode_returns': [-20.0, -18.0, -19.0],
 'last_100_episode_returns': -19.530000686645508,
 'learning_rate': 8.657589688233862e-05,
 'len_max_traj': 239,
 'max_return_achieved': '-14.0 at step 5366379',
 'mean_episode_return': -12.666666666666666,
 'num_unpadded_steps': 3346,
 'pg_loss': -8.368099212646484,
 'total_loss': -15.672253926595053}
[INFO:17181 train:969 2020-12-01 19:35:33,350] Learning finished after 10001513 steps.
```

Results from another run:
```
[INFO:15271 train:962 2020-12-04 19:47:48,776] Steps 10001156 @ 661.4 SPS. Loss -9.595014. Return per episode: -19.7. Stats:
{'baseline_loss': 14.119840621948242,
 'entropy_loss': -18.633128484090168,
 'episode_returns': [-21.0, -19.0, -20.0, -19.0],
 'last_100_episode_returns': -19.540000915527344,
 'learning_rate': 9.02709105067138e-05,
 'len_max_traj': 239,
 'max_return_achieved': '-14.0 at step 7824133',
 'mean_episode_return': -19.666666666666668,
 'num_unpadded_steps': 3309,
 'pg_loss': -5.081725597381592,
 'total_loss': -9.595013936360678}
[INFO:15271 train:969 2020-12-04 19:47:48,776] Learning finished after 10001156 steps.
```

I am on Ubuntu 18.04.4, using CUDA 10.2, cuDNN 7, and PyTorch 1.6.0.

Thanks in advance for any help. 

Best,
Sean




