Deep RL library with concise implementations of popular algorithms, built on Flux.jl and designed to fit into the POMDPs.jl interface.
It supports CPU and GPU computation and implements deep reinforcement learning, imitation learning, batch RL, adversarial RL, and continual learning algorithms. See the documentation for more details. Implemented algorithms include:
- Deep Q-Learning (DQN)
- Prioritized Experience Replay
- Soft Q-Learning
- REINFORCE
- Proximal Policy Optimization (PPO)
- Lagrange-Constrained PPO
- Advantage Actor Critic (A2C)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed DDPG (TD3)
- Soft Actor Critic (SAC)
- Behavioral Cloning
- Generative Adversarial Imitation Learning (GAIL) w/ On-Policy and Off-Policy Versions
- Adversarial Value-moment Imitation Learning (AdVIL)
- Adversarial Reward-moment Imitation Learning (AdRIL)
- Soft Q Imitation Learning (SQIL)
- Adversarial Soft Advantage Fitting (ASAF)
- Inverse Q-Learning (IQLearn)
Example usage of the REINFORCE algorithm with a simple Flux network on the CartPole problem is shown below:
```julia
using POMDPs, Crux, Flux, POMDPGym

# Problem setup
mdp = GymPOMDP(:CartPole)
as = actions(mdp)
S = state_space(mdp)

# Flux network: map states to actions
A() = DiscreteNetwork(Chain(Dense(dim(S)..., 64, relu), Dense(64, length(as))), as)

# Set up the REINFORCE solver
solver_reinforce = REINFORCE(S=S, π=A())

# Solve the `mdp` to get the `policy`
policy_reinforce = solve(solver_reinforce, mdp)
```
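The other solvers follow the same construct-and-solve pattern. As a rough sketch (check the documentation for the exact keyword arguments each solver constructor accepts; `N`, the number of training steps, is an assumption here), the same network constructor can be reused to train a DQN agent on the same problem:

```julia
# Set up a DQN solver on the same problem (the keyword N is an assumption)
solver_dqn = DQN(π=A(), S=S, N=10000)
policy_dqn = solve(solver_dqn, mdp)
```

Because the solvers plug into the POMDPs.jl interface, the returned policies can then be queried through the standard POMDPs.jl `action` function, e.g. `action(policy_dqn, s)` for a state `s`.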
To install the package, run:

```julia
] add Crux
```
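Equivalently, outside the REPL's Pkg mode, the functional Pkg API can be used:

```julia
using Pkg
Pkg.add("Crux")
```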
See the installation documentation for details on installing POMDPGym, which provides additional environments.