Tangled Program Graphs (TPG)

This code reproduces results from the paper:

Stephen Kelly, Tatiana Voegerl, Wolfgang Banzhaf, and Cedric Gondro. Evolving Hierarchical Memory-Prediction Machines in Multi-Task Reinforcement Learning. Genetic Programming and Evolvable Machines, 2021.

For the latest versions of this project, please visit Creative Algorithms Lab's GitLab page.

Quick Start

This code is designed to run on Linux. On Windows, you can use Windows Subsystem for Linux (WSL). You can work with WSL in Visual Studio Code by following this tutorial.

1. Install required software

From the tpg directory run:

sudo xargs --arg-file requirements.txt apt install

Note that MuJoCo must be downloaded and unpacked separately.
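Before installing, it can help to see which command the line above would run. The following is a minimal sketch, not part of the project: `preview_install` is a hypothetical helper name, and it relies only on GNU `xargs` and `echo`.

```shell
# Print the install command that xargs would build, without running it.
# Usage: preview_install [requirements-file]   (defaults to requirements.txt)
# NOTE: preview_install is a hypothetical helper, not a tpg script.
preview_install() {
    req_file="${1:-requirements.txt}"
    if [ ! -r "$req_file" ]; then
        echo "cannot read $req_file" >&2
        return 1
    fi
    # Prefixing the command with echo makes xargs print it instead of executing it.
    xargs --arg-file "$req_file" echo sudo apt install
}
```

Running `preview_install` from the tpg directory prints the `apt install` command line with every package name from requirements.txt appended, so you can review the list before granting sudo.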

2. Set environment variables

To make the tpg scripts easy to run, add their folders to the $PATH environment variable. The same lines point $MUJOCO at your MuJoCo installation and add its lib folder to $LD_LIBRARY_PATH. Add the following to ~/.profile:

export TPG=<YOUR_PATH_HERE>/tpg
export PATH=$PATH:$TPG/scripts/plot
export PATH=$PATH:$TPG/scripts/run
export MUJOCO=<YOUR_PATH_TO_MUJOCO>/mujoco-3.2.2
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$MUJOCO/lib/

Then run:

source ~/.profile
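To confirm that the variables took effect, a small check like the one below can be pasted into the shell. This is a sketch under assumptions: `check_tpg_env` is a hypothetical helper, and it only tests the variables and folders named in the lines above.

```shell
# Report whether the variables and folders from this step are in place.
# Returns 0 if everything looks right, 1 otherwise.
# NOTE: check_tpg_env is a hypothetical helper, not a tpg script.
check_tpg_env() {
    status=0
    if [ -z "${TPG:-}" ]; then
        echo "TPG is not set" >&2
        return 1
    fi
    for dir in "$TPG/scripts/plot" "$TPG/scripts/run"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
        # Wrap PATH in colons so the folder is matched as a whole entry.
        case ":$PATH:" in
            *":$dir:"*) ;;
            *) echo "not on PATH: $dir" >&2; status=1 ;;
        esac
    done
    # MuJoCo libraries are expected under $MUJOCO/lib.
    if [ -z "${MUJOCO:-}" ] || [ ! -d "$MUJOCO/lib" ]; then
        echo "MUJOCO is not set or $MUJOCO/lib does not exist" >&2
        status=1
    fi
    return "$status"
}
```

Calling `check_tpg_env && echo "environment OK"` prints a message only when every check passes. Otherwise the problems are listed on standard error.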

3. Compile

From the tpg directory run:

scons --opt

4. Run an experiment

The folder tpg/experiment_directories/classic_control contains scripts to evolve policies for classic control tasks. Parameters are set in parameters.txt. The default settings will evolve a policy for the CartPole task.

To run an experiment using 4 parallel MPI processes, make tpg/experiment_directories/classic_control your working directory and run:

tpg-run-mpi.sh -n 4

Note that, at present, the number of MPI processes passed with -n must be greater than the number of active tasks.
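The process-count constraint can be checked before launching. The wrapper below is a sketch only: `enough_procs` and `run_checked` are hypothetical names, and the number of active tasks is passed in by hand because this example does not assume anything about the format of parameters.txt.

```shell
# Succeeds only if the process count is strictly greater than the task count.
# NOTE: enough_procs and run_checked are hypothetical helpers, not tpg scripts.
enough_procs() {
    [ "$1" -gt "$2" ]
}

# Launch the experiment only when the constraint holds.
# Usage: run_checked <n_procs> <n_active_tasks>
# n_active_tasks is supplied by the caller; this sketch does not read parameters.txt.
run_checked() {
    if enough_procs "$1" "$2"; then
        tpg-run-mpi.sh -n "$1"
    else
        echo "need more than $2 processes for $2 active task(s), got $1" >&2
        return 1
    fi
}
```

For example, `run_checked 4 1` starts the run with 4 processes, while `run_checked 1 1` stops with an error message instead of launching.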

5. Plot results

Generate classic_control_p0.pdf with various statistics:

tpg-plot-stats.sh

The first page will be a training curve looking something like the plot below. A fitness of 500 indicates the agent balances the pole for 500 timesteps, thus solving the task.

6. Visualize the best policy's behaviour

Display an OpenGL animation of the single best policy interacting with the environment:

tpg-run-mpi.sh -m 1

7. Cleanup

Delete all checkpoints and output files:

tpg-cleanup.sh
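If you want to keep the plots from a run, one option is to copy them elsewhere before cleaning up. The snippet below is a sketch under assumptions: `archive_plots` is a hypothetical helper, and it assumes the reports you care about are the PDF files in the experiment directory.

```shell
# Copy PDF reports from a run directory into an archive folder.
# Usage: archive_plots <run_dir> <archive_dir>
# Returns 0 if at least one PDF was copied, 1 otherwise.
# NOTE: archive_plots is a hypothetical helper, not a tpg script.
archive_plots() {
    run_dir="$1"
    archive_dir="$2"
    mkdir -p "$archive_dir" || return 1
    found=1
    for pdf in "$run_dir"/*.pdf; do
        # When nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$pdf" ] || continue
        cp "$pdf" "$archive_dir/" || return 1
        found=0
    done
    return "$found"
}
```

From the experiment directory, `archive_plots . "$HOME/tpg_results/run1" && tpg-cleanup.sh` runs the cleanup only after the copy has succeeded. The destination path here is just an example.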
