Train and evaluate reinforcement learning agents in OpenAI Gym environments using Deep Q-learning (DQN).
The following environments are implemented for training and evaluation:
- CartPole
- MountainCar
- Acrobot
- Pong
- Breakout
More environments can be added, and hyperparameters can be tuned in config.py.
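As an illustration of how per-environment hyperparameters might be organized, here is a minimal sketch. The layout of this repository's config.py is not shown here, so every name below (`DQNConfig`, `CONFIGS`, `get_config`) and every field name and value is hypothetical and may not match the actual file.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DQNConfig:
    """Hypothetical hyperparameter set for one environment."""
    learning_rate: float = 1e-3
    gamma: float = 0.99          # discount factor
    batch_size: int = 64
    replay_size: int = 50_000    # replay buffer capacity
    eps_start: float = 1.0       # initial exploration rate
    eps_end: float = 0.05        # final exploration rate
    target_update: int = 1_000   # steps between target-network syncs


# Illustrative values only; these are not the repository's tuned settings.
CONFIGS = {
    "CartPole-v1": DQNConfig(),
    "Acrobot-v1": DQNConfig(learning_rate=5e-4),
}


def get_config(env_id: str, **overrides) -> DQNConfig:
    """Return the config for env_id (defaults if unknown), with overrides applied."""
    base = CONFIGS.get(env_id, DQNConfig())
    return replace(base, **overrides)
```

With a layout like this, adding an environment amounts to adding one entry to the dictionary, and a single value can be changed for one run with a call such as `get_config("CartPole-v1", batch_size=32)`.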
The following trained models are included in this repository for evaluation:
- CartPole-v1
- Acrobot-v1
- Pong-v0
- Pong-v5
Each trained model can be found in the /models directory.
Demonstration videos of trained models can be found in the OpenAI Gym YouTube playlist.
The following setup was used for training and evaluating the OpenAI Gym environments with a GPU:
- Anaconda 2.11
- CUDA 11.3
- Python 3.10
- PyTorch 1.11
- OpenAI Gym 0.24
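To compare a local installation against the setup listed above, a small check along the following lines may help. This is a sketch and not part of the repository: the helper names `installed_versions` and `cuda_available` are hypothetical, and the distribution names `torch` and `gym` are assumed to be the ones pulled in by requirements.txt.

```python
from importlib import metadata


def installed_versions(packages=("torch", "gym")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_available():
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())
```

Calling `installed_versions()` returns a dictionary such as `{"torch": ..., "gym": ...}`, with `None` for any package that is not installed. `cuda_available()` returns `False` when PyTorch is missing or no GPU is visible, in which case training would fall back to the CPU if the scripts support it.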
pip install -r requirements.txt
python src/train.py --env CartPole-v0 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env CartPole-v0 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env CartPole-v1 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env CartPole-v1 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env Acrobot-v1 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env Acrobot-v1 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env MountainCar-v0 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env MountainCar-v0 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env Pong-v0 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env Pong-v0 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env Pong-v4 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env Pong-v4 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env ALE/Pong-v5 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env ALE/Pong-v5 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env Breakout-v0 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env Breakout-v0 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env Breakout-v4 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env Breakout-v4 --num_eval_episodes <int> --is_render --is_record
python src/train.py --env ALE/Breakout-v5 --eval_freq <int> --num_eval_episodes <int>
python src/evaluate.py --env ALE/Breakout-v5 --num_eval_episodes <int> --is_render --is_record
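For readers wondering how the flags in the commands above fit together, the following sketch shows one way they could be parsed with `argparse`. It is not the repository's actual code: the function name `build_parser`, the default values, and the help texts are assumptions, and only the flag names are taken from the commands above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags used in the commands above (sketch)."""
    parser = argparse.ArgumentParser(description="DQN training and evaluation options")
    parser.add_argument("--env", required=True,
                        help="Gym environment id, e.g. CartPole-v1 or ALE/Pong-v5")
    # Defaults below are placeholders chosen for illustration.
    parser.add_argument("--eval_freq", type=int, default=1000,
                        help="how often to run evaluation during training")
    parser.add_argument("--num_eval_episodes", type=int, default=10,
                        help="number of episodes per evaluation")
    # Boolean switches: present means True, absent means False.
    parser.add_argument("--is_render", action="store_true",
                        help="render the environment while evaluating")
    parser.add_argument("--is_record", action="store_true",
                        help="record a video of the evaluation episodes")
    return parser
```

For example, `build_parser().parse_args(["--env", "ALE/Pong-v5", "--num_eval_episodes", "5", "--is_render"])` yields a namespace with `env="ALE/Pong-v5"`, `num_eval_episodes=5`, `is_render=True`, and `is_record=False`. The `<int>` placeholders in the commands correspond to the integer-typed options.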