
Commit 6f75366

Update README.md

1 parent 76d036e

1 file changed: +15 −3 lines

README.md

Lines changed: 15 additions & 3 deletions
@@ -1,10 +1,10 @@
 # Adaptive Transformers in RL
 
-In this experiment we replicate several results from [Stabilizing Transformers for RL](https://arxiv.org/abs/1910.06764) on both [Pong](https://gym.openai.com/envs/Pong-v0/) and [rooms_select_nonmatching_object](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/dmlab30#select-non-matching-object) from DMLab30.
+Official implementation of [Adaptive Transformers in RL](http://arxiv.org/abs/2004.03761)
 
-We also extend the Stable Transformer architecture with [Adaptive Attention Span](https://arxiv.org/abs/1905.07799) on a partially observable (POMDP) setting of Reinforcement Learning. To our knowledge this is one of the first attempts to stabilize and explore Adaptive Attention Span in an RL domain.
+In this work we replicate several results from [Stabilizing Transformers for RL](https://arxiv.org/abs/1910.06764) on both [Pong](https://gym.openai.com/envs/Pong-v0/) and [rooms_select_nonmatching_object](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/dmlab30#select-non-matching-object) from DMLab30.
 
-The arxiv preprint for this work can be found here [Adaptive Transformers in RL](http://arxiv.org/abs/2004.03761)
+We also extend the Stable Transformer architecture with [Adaptive Attention Span](https://arxiv.org/abs/1905.07799) on a partially observable (POMDP) setting of Reinforcement Learning. To our knowledge this is one of the first attempts to stabilize and explore Adaptive Attention Span in an RL domain.
 
 ### Steps to replicate what we did on your own machine
 1. Downloading DMLab:
@@ -48,3 +48,15 @@ python train.py --total_steps 20000000 \
 --num_actors 32 --num_learner_threads 1 --sleep_length 20 \
 --level_name rooms_select_nonmatching_object --mem_len 200
 ```
+
+### Reference
+If you find this repository useful, do cite it with,
+```
+@article{kumar2020adaptive,
+title={Adaptive Transformers in RL},
+author={Shakti Kumar and Jerrod Parker and Panteha Naderian},
+year={2020},
+eprint={2004.03761},
+archivePrefix={arXiv},
+primaryClass={cs.LG}
+}
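
The updated intro above mentions extending the Stable Transformer with Adaptive Attention Span. As background, here is a minimal PyTorch sketch of the soft span mask from Sukhbaatar et al. (2019); it is not code from this repository, and the class name, shapes, and default values are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AdaptiveSpanMask(nn.Module):
    """Soft mask over attention weights with a learnable span,
    in the spirit of Sukhbaatar et al. (2019). Names and defaults
    are illustrative, not taken from this repository."""

    def __init__(self, max_span: int = 200, ramp: int = 32, init_ratio: float = 0.5):
        super().__init__()
        self.max_span = max_span
        self.ramp = ramp
        # Single learnable span ratio in [0, 1]; a multi-head layer
        # would typically learn one ratio per attention head.
        self.span_ratio = nn.Parameter(torch.tensor(init_ratio))

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (..., seq_len) post-softmax attention weights over past
        # positions, with the most recent timestep at the last index.
        seq_len = attn.size(-1)
        # Distance of every attended position from the current query.
        distance = torch.arange(
            seq_len - 1, -1, -1, device=attn.device, dtype=attn.dtype
        )
        span = self.span_ratio.clamp(0.0, 1.0) * self.max_span
        # Linear ramp: 1 inside the span, falling to 0 beyond span + ramp.
        mask = ((span + self.ramp - distance) / self.ramp).clamp(0.0, 1.0)
        masked = attn * mask
        # Renormalise so the masked weights still sum to one.
        return masked / (masked.sum(dim=-1, keepdim=True) + 1e-8)


# Example: mask uniform attention over a 200-step memory.
mask_layer = AdaptiveSpanMask(max_span=200)
weights = torch.full((4, 200), 1.0 / 200)  # batch of 4 uniform distributions
out = mask_layer(weights)
print(out.shape)  # torch.Size([4, 200])
```

In the original paper, an L1 penalty on the learned span ratios is added to the loss, so each head shrinks its span whenever the task allows; that is what makes the effective attention span adaptive rather than fixed at the full memory length (e.g. the `--mem_len 200` used in the training command above).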
