Code for the paper "Viability of Future Actions: Robust Safety in Reinforcement Learning via Entropy Regularization" (ECML 2025)

Instructions for Reproducing Results

First, install the required packages.

pip install -r requirements.txt
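
To keep dependencies isolated, you can first create a virtual environment; a minimal sketch, assuming Python 3 with the built-in venv module:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt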

Gridworld Experiments

We train and evaluate the gridworld agent in both the constrained (fenced) and unconstrained cliff environments by running the following experiments.

python -m src.gridworld.evaluation constrained_gridworld_value configured default  # 1
python -m src.gridworld.evaluation constrained_gridworld_entropy configured default  # 2
python -m src.gridworld.evaluation unconstrained_gridworld_value configured default  # 3
python -m src.gridworld.evaluation delta_as_function_of_failure_penalty configured default  # 4
python -m src.gridworld.evaluation safety_as_function_of_failure_alpha configured default  # 5
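
Each command writes its outputs to a timestamped run directory under results/gridworld/{experiment_name} (these paths reappear in the Figures section below). A minimal sketch for locating the most recent run of experiment 1, assuming the directory names sort lexicographically by timestamp:

ls -1 results/gridworld/constrained_gridworld_value | sort | tail -n 1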

Pendulum Experiments

To train a constraint-penalized SAC Pendulum model, run the following command.

python -m src.pendulum.training --alpha={alpha} --seed={seed}

where {alpha} is the temperature parameter and {seed} is the random seed. Extract the best-performing model and store it at checkpoints_pendulum/PenalizedPendulumEnvironment__{seed}__{alpha}__{timestamp}/best-model.pth.

In our experiments, we used alpha={0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0} and seeds ranging from 1 to 25. We evaluate the trained models by running the following experiments.
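
The full sweep can be scripted directly from the values above; a minimal shell sketch (it only launches the training runs, since extracting each best-model.pth depends on where your training logs end up):

for alpha in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0; do
    for seed in $(seq 1 25); do
        python -m src.pendulum.training --alpha=$alpha --seed=$seed
    done
done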

python -m src.pendulum.evaluation evaluate_mode_policies configured default  # 6
python -m src.pendulum.evaluation evaluate_disturbed_mode_policies configured default  # 7

The results are automatically stored in results/pendulum.

Hopper Experiments

To train a constraint-penalized SAC Hopper model, run the following command.

python -m src.hopper.training --alpha={alpha} --seed={seed}

where {alpha} is the temperature parameter and {seed} is the random seed. Extract the best-performing model and store it at checkpoints_hopper/PenalizedHopperEnvironment__{seed}__{alpha}__{timestamp}/best-model.pth.

In our experiments, we used alpha={0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0} and seeds ranging from 1 to 25. We evaluate the trained models by running the following experiments.
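
Before evaluating, it can help to verify that a checkpoint exists for every (seed, alpha) pair; a hedged sketch, assuming the checkpoint layout described above:

for d in checkpoints_hopper/PenalizedHopperEnvironment__*; do
    [ -f "$d/best-model.pth" ] || echo "missing: $d/best-model.pth"
done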

python -m src.hopper.evaluation evaluate_mode_policies configured default  # 8
python -m src.hopper.evaluation evaluate_disturbed_mode_policies configured default  # 9

The results are automatically stored in results/hopper.

Figures

Figure 1

Path. results/gridworld/constrained_gridworld_value/{timestamp}/plots/plot_values_constrained.pdf

Figure 2

Path. results/gridworld/unconstrained_gridworld_value/{timestamp}/plots/plot_value_grid_unconstrained.pdf

Figure 3

python -m src.pendulum.analyze_disturbed_mode_success_rates configured small  # 10
python -m src.hopper.analyze_disturbed_mode_success_rates configured small  # 11
python -m src.figures.figure_3 --run_pendulum_success_rate={timestamp result 10} --run_hopper_success_rate={timestamp result 11}
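
Here, {timestamp result N} stands for the timestamped run directory created by the command marked # N above; the same convention applies to Figures 5 and 6. A hypothetical invocation (both timestamps are invented for illustration):

python -m src.figures.figure_3 --run_pendulum_success_rate=2025-08-01_12-00-00 --run_hopper_success_rate=2025-08-01_13-30-00  # timestamps hypothetical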

Path. results/figure_3/{timestamp}/plots/plot_figure_3.pdf

Figure 4

Path. results/gridworld/constrained_gridworld_entropy/{timestamp}/plots/plot_entropy_constrained/alpha_4_small.pdf

Figure 5

python -m src.figures.figure_5 --run_name_delta={timestamp result 4} --run_name_safety={timestamp result 5}

Path. results/figure_5/{timestamp}/plots/plot_figure_5.pdf

Figure 6

python -m src.pendulum.evaluation analyze_mode_environment_returns --run_name={timestamp result 6} --height=2 --width=2.75  # 12
python -m src.hopper.evaluation analyze_mode_environment_returns --run_name={timestamp result 8} --height=2 --width=2.75  # 13
python -m src.figures.figure_6 --run_pendulum_environment_return={timestamp result 12} --run_hopper_environment_return={timestamp result 13}

Path. results/figure_6/{timestamp}/plots/plot_figure_6.pdf

Figure 7

python -m src.pendulum.evaluation analyze_mode_environment_returns configured full  # 14

Path. results/pendulum/analyze_mode_environment_returns/{timestamp}/plots/heatmap_disturbed_success_rate.pdf

Figure 8

python -m src.hopper.evaluation analyze_mode_environment_returns configured full  # 15

Path. results/hopper/analyze_mode_environment_returns/{timestamp}/plots/heatmap_disturbed_success_rate.pdf
