This repository contains an implementation of the DDPG (Deep Deterministic Policy Gradient) algorithm applied to the "Pendulum-v1" environment from the Gymnasium library. DDPG is a model-free, off-policy actor-critic algorithm, particularly well suited to reinforcement learning tasks with continuous action spaces.
Ensure that you have the following dependencies installed:
- Python 3.x
- `tensorflow`
- `numpy`
- `matplotlib`
- `gymnasium`
- `imageio`
- Any additional libraries listed in `requirements.txt`

You can install the dependencies using pip:

```bash
pip install -r requirements.txt
```
- Clone the repository:
```bash
git clone https://github.com/eg424/DDPG-Pendulum.git
cd DDPG-Pendulum
```
- Install the dependencies listed in `requirements.txt`:

```bash
pip install -r requirements.txt
```
- Run the code to train the agent:
```bash
python3 main.py
```
The script trains a DDPG agent to balance the pendulum in the Pendulum-v1 environment and displays a performance comparison between training with and without a target network.
- Test the trained agent:
Use `test.py` to evaluate the trained models:

```bash
python test.py
```
**DDPG Algorithm:**
- Actor Network: Learns the policy (what action to take given a state).
- Critic Network: Evaluates the action taken by the actor based on the state.
- Replay Buffer: Stores past experiences (state, action, reward, next state) for training.
- Ornstein-Uhlenbeck Noise: Temporally correlated noise added to actions for exploration (see the sketch below).
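For intuition, here is a minimal sketch of an Ornstein-Uhlenbeck noise process along the lines of what `utils/noise.py` implements; the class and parameter names here are illustrative, not the repo's actual API:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated noise for exploration."""

    def __init__(self, mean, std_dev, theta=0.15, dt=1e-2):
        self.mean = mean        # long-run mean the process reverts to
        self.std_dev = std_dev  # scale of the random fluctuations
        self.theta = theta      # mean-reversion rate
        self.dt = dt            # time-step size
        self.reset()

    def __call__(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        x = (self.x_prev
             + self.theta * (self.mean - self.x_prev) * self.dt
             + self.std_dev * np.sqrt(self.dt) * np.random.normal(size=self.mean.shape))
        self.x_prev = x
        return x

    def reset(self):
        self.x_prev = np.zeros_like(self.mean)

# Example: 1-D noise for Pendulum-v1's single action dimension.
noise = OUNoise(mean=np.zeros(1), std_dev=0.2 * np.ones(1))
```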
**Files:**
- `main.py`: Trains the DDPG agent and saves the trained models.
- `test.py`: Evaluates the trained agent and generates a performance GIF.
- `utils/ddpg_agent.py`: Implements the `DDPGAgent` class, encapsulating the training logic.
- `utils/actor_critic.py`: Contains the actor and critic network definitions (sketched below, after this list).
- `utils/noise.py`: Implements the Ornstein-Uhlenbeck noise process.
- `utils/replay_buffer.py`: Manages the replay buffer for experience storage and sampling.
- `utils/target_update.py`: Implements the target network soft-update logic.
- `utils/helper_functions.py`: Helper functions for saving models, plotting rewards, and generating GIFs.
- `ddpg_pendulum.py`: A self-contained script demonstrating the full implementation.
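As an illustration of what the network definitions in `utils/actor_critic.py` might look like, here is a minimal Keras sketch (layer sizes and function names are assumptions; the state dimension of 3 and action bound of 2.0 are specific to Pendulum-v1):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_actor(state_dim=3, action_bound=2.0):
    # Maps a state to a deterministic action, scaled to the env's action range.
    inputs = layers.Input(shape=(state_dim,))
    x = layers.Dense(256, activation="relu")(inputs)
    x = layers.Dense(256, activation="relu")(x)
    raw = layers.Dense(1, activation="tanh")(x)  # in [-1, 1]
    outputs = raw * action_bound                 # Pendulum-v1 actions lie in [-2, 2]
    return tf.keras.Model(inputs, outputs)

def build_critic(state_dim=3, action_dim=1):
    # Maps a (state, action) pair to an estimated Q-value.
    state_in = layers.Input(shape=(state_dim,))
    action_in = layers.Input(shape=(action_dim,))
    x = layers.Concatenate()([state_in, action_in])
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dense(256, activation="relu")(x)
    q_value = layers.Dense(1)(x)
    return tf.keras.Model([state_in, action_in], q_value)
```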
**Hyperparameters:**
- `gamma`: Discount factor for future rewards.
- `tau`: Soft-update coefficient for the target networks.
- `critic_lr`: Learning rate for the critic network.
- `actor_lr`: Learning rate for the actor network.
- `total_episodes`: Number of episodes for training.
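The actual values are set in `main.py`; the ones below are typical for Pendulum-v1 and shown purely for illustration:

```python
# Illustrative values only -- see main.py for the settings actually used.
gamma = 0.99          # discount factor for future rewards
tau = 0.005           # soft-update coefficient for the target networks
critic_lr = 0.002     # learning rate of the critic optimizer
actor_lr = 0.001      # learning rate of the actor optimizer
total_episodes = 100  # number of training episodes
```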
**Training:**
- Initialization: The environment is created, and the actor, critic, and target networks are initialized.
- Action Selection: The agent selects actions using the actor network, adding noise for exploration.
- Experience Replay: The agent stores its experiences in the buffer and learns from them in batches.
- Target Network Update: The target networks (actor′ and critic′) are updated after each episode using a soft update (see the sketch after this list).
- Evaluation: The agent's performance is tracked, and rewards are averaged over recent episodes for smoother plots.
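The core of the learning and target-update steps might look like the following sketch. It reuses the `build_actor`/`build_critic` helpers sketched above; the batch tensors (with `rewards` shaped `(batch_size, 1)`) would come from the replay buffer, and none of the names below are the repo's actual API:

```python
import tensorflow as tf

gamma, tau = 0.99, 0.005  # illustrative values

# Online and target networks (build_actor/build_critic as sketched above).
actor, critic = build_actor(), build_critic()
target_actor, target_critic = build_actor(), build_critic()
target_actor.set_weights(actor.get_weights())
target_critic.set_weights(critic.get_weights())
actor_optimizer = tf.keras.optimizers.Adam(0.001)
critic_optimizer = tf.keras.optimizers.Adam(0.002)

@tf.function
def update(states, actions, rewards, next_states):
    # Critic: regress Q(s, a) toward the Bellman target computed
    # with the *target* networks.
    with tf.GradientTape() as tape:
        target_actions = target_actor(next_states, training=True)
        y = rewards + gamma * target_critic([next_states, target_actions], training=True)
        critic_loss = tf.reduce_mean(tf.square(y - critic([states, actions], training=True)))
    grads = tape.gradient(critic_loss, critic.trainable_variables)
    critic_optimizer.apply_gradients(zip(grads, critic.trainable_variables))

    # Actor: ascend the critic's estimate of Q(s, actor(s)).
    with tf.GradientTape() as tape:
        actor_loss = -tf.reduce_mean(critic([states, actor(states, training=True)], training=True))
    grads = tape.gradient(actor_loss, actor.trainable_variables)
    actor_optimizer.apply_gradients(zip(grads, actor.trainable_variables))

def soft_update(target_variables, variables):
    # target_w <- tau * w + (1 - tau) * target_w
    for tw, w in zip(target_variables, variables):
        tw.assign(tau * w + (1 - tau) * tw)

# After each learning step (or episode):
# soft_update(target_actor.variables, actor.variables)
# soft_update(target_critic.variables, critic.variables)
```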
**Testing:**
- Load Models: The trained actor model is loaded from the `saved_models/` directory.
- Generate GIF: The trained agent is evaluated in the environment; a sequence of frames is recorded and saved as a GIF in the `gifs/` directory (see the sketch below).
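A minimal sketch of this evaluation step, assuming the actor was saved in Keras format under `saved_models/` (the exact path and filename are assumptions):

```python
import os

import gymnasium as gym
import imageio
import tensorflow as tf

actor = tf.keras.models.load_model("saved_models/actor")  # path is illustrative

env = gym.make("Pendulum-v1", render_mode="rgb_array")
state, _ = env.reset()
frames, done = [], False
while not done:
    frames.append(env.render())
    # Greedy action from the trained actor (no exploration noise at test time).
    action = actor(state[None, :]).numpy()[0]
    state, _, terminated, truncated, _ = env.step(action)
    done = terminated or truncated

os.makedirs("gifs", exist_ok=True)
imageio.mimsave("gifs/pendulum.gif", frames)
```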
**Learning Curves:** The agent's average episodic reward over training episodes, comparing performance with and without target networks.
**GIF of Agent Performance:** A GIF showcasing the trained agent balancing the pendulum.
- Erik Garcia Oyono (www.linkedin.com/in/erik-garcia-oyono)
- Deep Deterministic Policy Gradient: How can AI be used in healthcare? (https://medium.com/@eg424/deep-deterministic-policy-gradient-how-can-ai-be-used-in-healthcare-13ed7ca64ce3)