This project contains implementations of various deep reinforcement learning algorithms. Five algorithms have been implemented so far:
- REINFORCE[^1]
- Deep Q-Network (DQN)[^2]
- Double Duelling Deep Q-Network (3DQN)[^3][^4]
- Proximal Policy Optimization (PPO)[^5]
- Twin Delayed Deep Deterministic Policy Gradient (TD3)[^6]
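As a point of reference for the names above, here is a minimal PyTorch-style sketch of the two ideas combined in 3DQN: a dueling head plus double Q-learning targets. It is illustrative only; the class and function names are hypothetical, not this repository's code.

```python
import torch
from torch import nn


class DuelingQNet(nn.Module):
    """Dueling head (Wang et al. 2016): Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.value = nn.Linear(64, 1)
        self.advantage = nn.Linear(64, n_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.body(obs)
        adv = self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return self.value(h) + adv - adv.mean(dim=1, keepdim=True)


def double_q_target(online, target, rewards, next_obs, dones, gamma=0.99):
    """Double Q-learning target (Van Hasselt et al. 2016)."""
    with torch.no_grad():
        # The online network selects the next action...
        best = online(next_obs).argmax(dim=1, keepdim=True)
        # ...and the target network evaluates it, which reduces overestimation.
        next_q = target(next_obs).gather(1, best).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q
```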
Here are some Gym environments that have been solved (or nearly solved) using the implemented algorithms.
![]() | ![]()
---|---
Acrobot-v1 with 3DQN | CartPole-v1 with REINFORCE

![]() | ![]()
---|---
LunarLanderContinuous-v2 with TD3 | BipedalWalker-v3 with TD3

![]() | ![]()
---|---
ALE/Pong-v5 with 3DQN | ALE/Breakout-v5 with 3DQN

![]() | ![]()
---|---
ALE/BeamRider-v5 with PPO | ALE/SpaceInvaders-v5 with PPO
Special thanks to fg91 and vwxyzjn, whose work helped me achieve the results presented for the Atari environments.
This project uses Python 3.10.12. Use the package manager pip to install the dependencies:
```bash
pip install -r requirements.txt
```
The executable files are `train.py`, `test.py`, and `mp4_to_gif.py`.

`train.py` trains a DRL agent using a specified algorithm on a specified Gym environment. The trained neural networks used by the agent are saved in the `files/` directory. An image displaying the sum of rewards for each episode, as well as the average sum of rewards over the last 100 episodes, is also saved in the `images/` directory. The parameters of `train.py` are:

- `-a`/`--algorithm`: the algorithm to use. Accepted strings are `3DQN`, `PPO`, and `TD3`.
- `-m`/`--module`: the Gym environment to train the algorithm on.
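For example, with the parameters above, training a TD3 agent on one of the environments shown earlier would look like this:

```bash
python train.py -a TD3 -m LunarLanderContinuous-v2
```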
`test.py` displays an episode of a specified environment using an agent previously trained with a specified algorithm. `-a` and `-m` serve the same purposes as for `train.py`. If the `--save` parameter is used, the episode is not displayed but is saved in the `rgb_array/` directory instead. `mp4_to_gif.py` can then be used to convert the video into a GIF file and save it in the `images/` directory. The `-a` and `-m` parameters are used by `mp4_to_gif.py` to name the GIF file.
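A typical record-and-convert sequence, using the flags described above, would be:

```bash
# Display one episode with the trained agent
python test.py -a TD3 -m LunarLanderContinuous-v2

# Record the episode instead of displaying it, then convert it to a GIF
python test.py -a TD3 -m LunarLanderContinuous-v2 --save
python mp4_to_gif.py -a TD3 -m LunarLanderContinuous-v2
```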
> [!WARNING]
> PPO only works with discrete action spaces for now.
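The limitation comes from the action-sampling head: discrete PPO samples actions from a `Categorical` distribution over the policy's logits, while continuous control would require a different head (e.g. a Gaussian). A minimal PyTorch-style sketch of the discrete case, with a hypothetical network and CartPole-like shapes:

```python
import torch
from torch import nn
from torch.distributions import Categorical

# Hypothetical discrete policy head: one logit per action.
policy_net = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
states = torch.randn(8, 4)            # batch of CartPole-like observations

logits = policy_net(states)           # shape: (batch, n_actions)
dist = Categorical(logits=logits)     # only defined for discrete action spaces
actions = dist.sample()
log_probs = dist.log_prob(actions)    # feeds the PPO probability ratio
```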
[^1]: Sutton, Richard S., et al. "Policy gradient methods for reinforcement learning with function approximation." Advances in Neural Information Processing Systems 12 (1999).
[^2]: Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
[^3]: Van Hasselt, Hado, Arthur Guez, and David Silver. "Deep reinforcement learning with double Q-learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 30. No. 1. 2016.
[^4]: Wang, Ziyu, et al. "Dueling network architectures for deep reinforcement learning." International Conference on Machine Learning. PMLR, 2016.
[^5]: Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017).
[^6]: Fujimoto, Scott, Herke van Hoof, and David Meger. "Addressing function approximation error in actor-critic methods." International Conference on Machine Learning. PMLR, 2018.