Skip to content

valterschutz/SpaceCadetPinball

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Q-learning SpaceCadetPinball

Training and evaluation of a DQN agent on the game SpaceCadetPinball.

Instructions

Build the game first using cmake . and then make.


Train the agent using train_agent.py:

usage: train_agent.py [-h] [--gamma GAMMA] [--tau TAU] [--lr LR] [--eps_min EPS_MIN] [--eps_max EPS_MAX] [--eps_eval EPS_EVAL]                                              
                      [--eps_decay_per_episode EPS_DECAY_PER_EPISODE] [--buffer_size BUFFER_SIZE] [--batch_size BATCH_SIZE]                                                 
                      [--test_every_n_episodes TEST_EVERY_N_EPISODES] [--use_target_model USE_TARGET_MODEL] [--buffer_start BUFFER_START] [--n_frames N_FRAMES]             
                      mode name                                                                                                                                             
                                                                                                                                                                            
Train a RL agent to play pinball                                                                                                                                            
                                                                                                                                                                            
positional arguments:                                                                                                                                                       
  mode                  Whether to 'load' an old model or to create a 'new' model                                                                                           
  name                  Name of model                                                                                                                                       
                                                                                                                                                                            
options:                                                                                                                                                                    
  -h, --help            show this help message and exit                                                                                                                     
  --gamma GAMMA         Discount factor                                                                                                                                     
  --tau TAU             Target model update rate                                                                                                                            
  --lr LR               Learning rate                                                                                                                                       
  --eps_min EPS_MIN     Minimum allowed epsilon                                                                                                                             
  --eps_max EPS_MAX     Maximum allowed epsilon                                                                                                                             
  --eps_eval EPS_EVAL   Epsilon to use during evaluation of policy                                                                                                          
  --eps_decay_per_episode EPS_DECAY_PER_EPISODE                                                                                                                             
                        How much to decay epsilon by each episode                                                                                                           
  --buffer_size BUFFER_SIZE                                                                                                                                                 
                        Size of replay buffer                                                                                                                               
  --batch_size BATCH_SIZE                                                                                                                                                   
                        Batch size to use during training on replay buffer                                                                                                  
  --test_every_n_episodes TEST_EVERY_N_EPISODES                                                                                                                             
                        How many episodes to wait before evaluating the model again                                                                                         
  --use_target_model USE_TARGET_MODEL                                                                                                                                       
                        Whether to use a target model                                                                                                                       
  --buffer_start BUFFER_START                                                                                                                                               
                        How much to fill the replay buffer (in terms of batch size) before starting training                                                                
  --n_frames N_FRAMES   How many frames to wait between each action

Data is gathered during training and can be visualized with plotter.py:

usage: plotter.py [-h] name

Plot data gathered from a RL agent playing pinball

positional arguments:
  name        Name of model

options:
  -h, --help  show this help message and exit

Evaluate the agent and see how it plays using eval_agent.py:

usage: eval_agent.py [-h] [--episodes EPISODES] [--eps EPS] [--delay DELAY] [--n_frames N_FRAMES] name

Evaluate a RL agent to play pinball

positional arguments:
  name                 Name of model

options:
  -h, --help           show this help message and exit
  --episodes EPISODES  How many episodes to play
  --eps EPS            Which epsilon to use for evaluation
  --delay DELAY        How many seconds to wait between each step in the simulation
  --n_frames N_FRAMES  How many frames to wait between each action

Links

About

Training an agent to play SpaceCadetPinball using reinforcement learning

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 91.2%
  • C 7.4%
  • Other 1.4%