Commit 8838d37

albertbou92 and vmoens authored
[Algorithm] Update A2C examples (#1521)
Co-authored-by: vmoens <vincentmoens@gmail.com>
1 parent 95b1bfe commit 8838d37

File tree

13 files changed: +903 −646 lines changed


.github/unittest/linux_examples/scripts/run_test.sh

Lines changed: 12 additions & 17 deletions
```diff
@@ -78,15 +78,19 @@ python .github/unittest/helpers/coverage_run_parallel.py examples/ddpg/ddpg.py \
   logger.backend=
 # record_video=True \
 # record_frames=4 \
-python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c.py \
-  env.num_envs=1 \
-  collector.total_frames=48 \
-  collector.frames_per_batch=16 \
-  collector.collector_device=cuda:0 \
+python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c_mujoco.py \
+  env.env_name=HalfCheetah-v4 \
+  collector.total_frames=40 \
+  collector.frames_per_batch=20 \
+  loss.mini_batch_size=10 \
   logger.backend= \
-  logger.log_interval=4 \
-  optim.lr_scheduler=False \
-  optim.device=cuda:0
+  logger.test_interval=40
+python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c_atari.py \
+  collector.total_frames=80 \
+  collector.frames_per_batch=20 \
+  loss.mini_batch_size=20 \
+  logger.backend= \
+  logger.test_interval=40
 python .github/unittest/helpers/coverage_run_parallel.py examples/dqn/dqn.py \
   total_frames=48 \
   init_random_frames=10 \
@@ -190,15 +194,6 @@ python .github/unittest/helpers/coverage_run_parallel.py examples/ddpg/ddpg.py \
   logger.backend=
 # record_video=True \
 # record_frames=4 \
-python .github/unittest/helpers/coverage_run_parallel.py examples/a2c/a2c.py \
-  env.num_envs=1 \
-  collector.total_frames=48 \
-  collector.frames_per_batch=16 \
-  collector.collector_device=cuda:0 \
-  logger.backend= \
-  logger.log_interval=4 \
-  optim.lr_scheduler=False \
-  optim.device=cuda:0
 python .github/unittest/helpers/coverage_run_parallel.py examples/dqn/dqn.py \
   total_frames=48 \
   init_random_frames=10 \
```
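The updated test invocations pass Hydra-style dotted overrides on the command line (e.g. `collector.total_frames=40`, or `logger.backend=` to clear a value). As a rough, library-free sketch of how such `key.path=value` arguments resolve into a nested configuration (the examples themselves use Hydra; `parse_overrides` below is a hypothetical helper, not part of the repository):

```python
def parse_overrides(args):
    """Parse Hydra-style dotted overrides (e.g. 'collector.total_frames=40')
    into a nested dict. Numeric and boolean values are coerced; an empty
    right-hand side (e.g. 'logger.backend=') becomes None."""
    def coerce(text):
        if text == "":
            return None
        for cast in (int, float):
            try:
                return cast(text)
            except ValueError:
                pass
        if text.lower() in ("true", "false"):
            return text.lower() == "true"
        return text

    config = {}
    for arg in args:
        dotted_key, _, raw_value = arg.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})  # walk/create nested sections
        node[leaf] = coerce(raw_value)
    return config

overrides = [
    "collector.total_frames=40",
    "collector.frames_per_batch=20",
    "loss.mini_batch_size=10",
    "logger.backend=",
    "logger.test_interval=40",
]
cfg = parse_overrides(overrides)
```

Hydra additionally validates overrides against the YAML defaults; this sketch only illustrates the dotted-key nesting semantics.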

examples/a2c/README.md

Lines changed: 29 additions & 0 deletions
````diff
@@ -0,0 +1,29 @@
+## Reproducing Advantage Actor Critic (A2C) Algorithm Results
+
+This repository contains scripts that enable training agents with the Advantage Actor Critic (A2C) algorithm on MuJoCo and Atari environments. We follow the original paper [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783) by Mnih et al. (2016) to implement the A2C algorithm, but fix the number of steps during the collection phase.
+
+
+## Examples Structure
+
+Please note that the examples are kept independent of one another for the sake of simplicity. Each example contains the following files:
+
+1. **Main Script:** The definition of the algorithm components and the training loop can be found in the main script (e.g. a2c_atari.py).
+
+2. **Utils File:** A utility file contains various helper functions, generally to create the environment and the models (e.g. utils_atari.py).
+
+3. **Configuration File:** This file includes the default hyperparameters specified in the original paper. Users can modify these hyperparameters to customize their experiments (e.g. config_atari.yaml).
+
+
+## Running the Examples
+
+You can execute the A2C algorithm on Atari environments by running the following command:
+
+```bash
+python a2c_atari.py
+```
+
+You can execute the A2C algorithm on MuJoCo environments by running the following command:
+
+```bash
+python a2c_mujoco.py
+```
````
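The new README notes that the implementation follows Mnih et al. (2016) but collects a fixed number of steps per batch. As a minimal illustration of that collection scheme, here is a library-free sketch of the n-step return and advantage computation A2C performs over a fixed-length rollout; `nstep_advantages` and the numbers are hypothetical, not taken from the repository:

```python
def nstep_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute n-step returns and advantages over a fixed-length rollout,
    bootstrapping from the critic's value estimate of the state reached
    after the final collected step."""
    returns = []
    ret = bootstrap_value
    for r in reversed(rewards):       # accumulate discounted returns backwards
        ret = r + gamma * ret
        returns.append(ret)
    returns.reverse()
    # Advantage A(s_t, a_t) = n-step return - critic estimate V(s_t)
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages

rewards = [1.0, 0.0, 1.0]   # hypothetical 3-step rollout
values = [0.5, 0.4, 0.6]    # hypothetical critic estimates V(s_t)
returns, advs = nstep_advantages(rewards, values, bootstrap_value=0.0, gamma=0.5)
```

Fixing the rollout length (rather than running each actor until an update point, as in the asynchronous original) is what lets the batched collectors in these examples produce constant-size batches such as `collector.frames_per_batch=20`.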

examples/a2c/a2c.py

Lines changed: 0 additions & 146 deletions
This file was deleted.

0 commit comments
