Q-learning SpaceCadetPinball

Training and evaluation of a DQN agent on the game SpaceCadetPinball.

Instructions

First, build the game by running cmake . followed by make.

Train the agent using train_agent.py:

usage: train_agent.py [-h] [--gamma GAMMA] [--tau TAU] [--lr LR] [--eps_min EPS_MIN] [--eps_max EPS_MAX] [--eps_eval EPS_EVAL]
                      [--eps_decay_per_episode EPS_DECAY_PER_EPISODE] [--buffer_size BUFFER_SIZE] [--batch_size BATCH_SIZE]
                      [--test_every_n_episodes TEST_EVERY_N_EPISODES] [--use_target_model USE_TARGET_MODEL] [--buffer_start BUFFER_START] [--n_frames N_FRAMES]
                      mode name

Train a RL agent to play pinball

positional arguments:
  mode                  Whether to 'load' an old model or to create a 'new' model
  name                  Name of model

options:
  -h, --help            show this help message and exit
  --gamma GAMMA         Discount factor
  --tau TAU             Target model update rate
  --lr LR               Learning rate
  --eps_min EPS_MIN     Minimum allowed epsilon
  --eps_max EPS_MAX     Maximum allowed epsilon
  --eps_eval EPS_EVAL   Epsilon to use during evaluation of policy
  --eps_decay_per_episode EPS_DECAY_PER_EPISODE
                        How much to decay epsilon by each episode
  --buffer_size BUFFER_SIZE
                        Size of replay buffer
  --batch_size BATCH_SIZE
                        Batch size to use during training on replay buffer
  --test_every_n_episodes TEST_EVERY_N_EPISODES
                        How many episodes to wait before evaluating the model again
  --use_target_model USE_TARGET_MODEL
                        Whether to use a target model
  --buffer_start BUFFER_START
                        How much to fill the replay buffer (in terms of batch size) before starting training
  --n_frames N_FRAMES   How many frames to wait between each action
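
The training options above interact in ways the help text only hints at. The following is a minimal sketch of one common reading of them, not the code in train_agent.py: it assumes epsilon decays linearly per episode between eps_max and eps_min, that tau is a soft (Polyak) update rate for the target model, and that buffer_start counts batches. The function names and default values are hypothetical and chosen only for illustration.

```python
def epsilon_for_episode(episode, eps_max=1.0, eps_min=0.05,
                        eps_decay_per_episode=0.01):
    """Assumed linear schedule: start at eps_max, subtract a fixed
    amount each episode, and never go below eps_min."""
    return max(eps_min, eps_max - eps_decay_per_episode * episode)


def soft_update(target_weights, online_weights, tau):
    """Assumed soft target update:
    target <- tau * online + (1 - tau) * target, element by element."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]


def ready_to_train(buffer_len, batch_size, buffer_start):
    """Assumed reading of --buffer_start: training begins once the
    replay buffer holds buffer_start batches' worth of transitions."""
    return buffer_len >= buffer_start * batch_size
```

Under these assumptions, a small tau makes the target model trail the online model slowly, and a larger buffer_start delays the first gradient step until more transitions have been collected.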

Data is gathered during training and can be visualized with plotter.py:

usage: plotter.py [-h] name

Plot data gathered from a RL agent playing pinball

positional arguments:
  name        Name of model

options:
  -h, --help  show this help message and exit

Evaluate the agent and see how it plays using eval_agent.py:

usage: eval_agent.py [-h] [--episodes EPISODES] [--eps EPS] [--delay DELAY] [--n_frames N_FRAMES] name

Evaluate a RL agent to play pinball

positional arguments:
  name                 Name of model

options:
  -h, --help           show this help message and exit
  --episodes EPISODES  How many episodes to play
  --eps EPS            Which epsilon to use for evaluation
  --delay DELAY        How many seconds to wait between each step in the simulation
  --n_frames N_FRAMES  How many frames to wait between each action
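
To illustrate what --eps and --n_frames could mean during evaluation, here is a hedged sketch rather than the logic of eval_agent.py. It assumes epsilon-greedy action selection over a list of Q-values and assumes that the chosen action is held for n_frames consecutive frames. The step_fn callback, which returns a (reward, done) pair for one frame, is a hypothetical stand-in for the game interface.

```python
import random


def epsilon_greedy(q_values, eps, rng=random):
    """With probability eps pick a random action, otherwise the
    action with the highest Q-value."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def repeat_action(step_fn, action, n_frames):
    """Hold one action for n_frames frames and sum the rewards.
    Stops early if step_fn reports that the episode is done."""
    total = 0.0
    for _ in range(n_frames):
        reward, done = step_fn(action)
        total += reward
        if done:
            return total, True
    return total, False
```

With eps set to 0 the selection is fully greedy, which shows the learned policy without exploration noise.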

