Commit 8838d37

albertbou92 and vmoens authored
[Algorithm] Update A2C examples (#1521)
Co-authored-by: vmoens <vincentmoens@gmail.com>
1 parent 95b1bfe commit 8838d37

File tree

13 files changed: +903 −646 lines changed


.github/unittest/linux_examples/scripts/run_test.sh

Lines changed: 12 additions & 17 deletions
```diff
@@ -78,15 +78,19 @@ python .github/unittest/helpers/coverage_run_parallel.py examples/ddpg/ddpg.py \
   logger.backend=
 # record_video=True \
 # record_frames=4 \
-python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c.py \
-  env.num_envs=1 \
-  collector.total_frames=48 \
-  collector.frames_per_batch=16 \
-  collector.collector_device=cuda:0 \
+python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c_mujoco.py \
+  env.env_name=HalfCheetah-v4 \
+  collector.total_frames=40 \
+  collector.frames_per_batch=20 \
+  loss.mini_batch_size=10 \
   logger.backend= \
-  logger.log_interval=4 \
-  optim.lr_scheduler=False \
-  optim.device=cuda:0
+  logger.test_interval=40
+python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c_atari.py \
+  collector.total_frames=80 \
+  collector.frames_per_batch=20 \
+  loss.mini_batch_size=20 \
+  logger.backend= \
+  logger.test_interval=40
 python .github/unittest/helpers/coverage_run_parallel.py examples/dqn/dqn.py \
   total_frames=48 \
   init_random_frames=10 \
@@ -190,15 +194,6 @@ python .github/unittest/helpers/coverage_run_parallel.py examples/ddpg/ddpg.py \
   logger.backend=
 # record_video=True \
 # record_frames=4 \
-python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c.py \
-  env.num_envs=1 \
-  collector.total_frames=48 \
-  collector.frames_per_batch=16 \
-  collector.collector_device=cuda:0 \
-  logger.backend= \
-  logger.log_interval=4 \
-  optim.lr_scheduler=False \
-  optim.device=cuda:0
 python .github/unittest/helpers/coverage_run_parallel.py examples/dqn/dqn.py \
   total_frames=48 \
   init_random_frames=10 \
```
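The updated test invocations pass Hydra-style dotted overrides on the command line (e.g. `collector.total_frames=40`, or `logger.backend=` to clear a value). As a rough, library-free sketch of how such `key.path=value` arguments resolve into a nested configuration (the examples themselves use Hydra; `parse_overrides` below is a hypothetical helper, not part of the repository):

```python
def parse_overrides(args):
    """Parse Hydra-style dotted overrides (e.g. 'collector.total_frames=40')
    into a nested dict. Numeric and boolean values are coerced; an empty
    right-hand side (e.g. 'logger.backend=') becomes None."""
    def coerce(text):
        if text == "":
            return None
        for cast in (int, float):
            try:
                return cast(text)
            except ValueError:
                pass
        if text.lower() in ("true", "false"):
            return text.lower() == "true"
        return text

    config = {}
    for arg in args:
        dotted_key, _, raw_value = arg.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})  # walk/create nested sections
        node[leaf] = coerce(raw_value)
    return config

overrides = [
    "collector.total_frames=40",
    "collector.frames_per_batch=20",
    "loss.mini_batch_size=10",
    "logger.backend=",
    "logger.test_interval=40",
]
cfg = parse_overrides(overrides)
```

Hydra additionally validates overrides against the YAML defaults; this sketch only illustrates the dotted-key nesting semantics.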

examples/a2c/README.md

Lines changed: 29 additions & 0 deletions
````diff
@@ -0,0 +1,29 @@
+## Reproducing Advantage Actor Critic (A2C) Algorithm Results
+
+This repository contains scripts that enable training agents with the Advantage Actor Critic (A2C) algorithm on MuJoCo and Atari environments. We follow the original paper [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783) by Mnih et al. (2016) to implement the A2C algorithm, but fix the number of steps during the collection phase.
+
+
+## Examples Structure
+
+Please note that the examples are kept independent of one another for the sake of simplicity. Each example contains the following files:
+
+1. **Main Script:** The definition of the algorithm components and the training loop can be found in the main script (e.g. a2c_atari.py).
+
+2. **Utils File:** A utility file contains various helper functions, generally to create the environment and the models (e.g. utils_atari.py).
+
+3. **Configuration File:** This file includes the default hyperparameters specified in the original paper. Users can modify these hyperparameters to customize their experiments (e.g. config_atari.yaml).
+
+
+## Running the Examples
+
+You can execute the A2C algorithm on Atari environments by running the following command:
+
+```bash
+python a2c_atari.py
+```
+
+You can execute the A2C algorithm on MuJoCo environments by running the following command:
+
+```bash
+python a2c_mujoco.py
+```
````
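The new README notes that the implementation follows Mnih et al. (2016) but collects a fixed number of steps per batch. As a minimal illustration of that collection scheme, here is a library-free sketch of the n-step return and advantage computation A2C performs over a fixed-length rollout; `nstep_advantages` and the numbers are hypothetical, not taken from the repository:

```python
def nstep_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute n-step returns and advantages over a fixed-length rollout,
    bootstrapping from the critic's value estimate of the state reached
    after the final collected step."""
    returns = []
    ret = bootstrap_value
    for r in reversed(rewards):       # accumulate discounted returns backwards
        ret = r + gamma * ret
        returns.append(ret)
    returns.reverse()
    # Advantage A(s_t, a_t) = n-step return - critic estimate V(s_t)
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages

rewards = [1.0, 0.0, 1.0]   # hypothetical 3-step rollout
values = [0.5, 0.4, 0.6]    # hypothetical critic estimates V(s_t)
returns, advs = nstep_advantages(rewards, values, bootstrap_value=0.0, gamma=0.5)
```

Fixing the rollout length (rather than running each actor until an update point, as in the asynchronous original) is what lets the batched collectors in these examples produce constant-size batches such as `collector.frames_per_batch=20`.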

examples/a2c/a2c.py

Lines changed: 0 additions & 146 deletions
This file was deleted.

0 commit comments
