!MOST OF THE CODE WAS NOT PRODUCED BY US!
As specified in the project proposal, we based the implementation of the Hierarchical Dreamer (HDreamer) on the publicly available DreamerV3 implementation from the original repository. The repository's author is Naoki Morihira, who re-implemented DreamerV3 (link to the paper) in PyTorch.
HDreamer is a hierarchical extension of the DreamerV3 architecture for model-based reinforcement learning. Inspired by HQ-VAE, HDreamer introduces multiple levels of discrete latent variables to disentangle low-level sensory information from high-level semantic structure. This aims to mitigate codebook collapse, a failure mode in VQ-based models where the discrete latent space is underutilized due to entangled representations.
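The hierarchical sequence model itself lives in `HierarchicalRSSM` (see the file list below). As a rough, hypothetical illustration of the core idea, the sketch below stacks two straight-through categorical latents, with the fine level conditioned on the coarse sample so the two levels specialize rather than compete for the same information. All names and sizes here are made up; this is not the actual HDreamer code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelDiscreteLatent(nn.Module):
    """Sketch: sample a coarse (semantic) level first, then condition the
    fine (sensory) level on it, so the two codebooks specialize."""

    def __init__(self, feat_dim=256, groups=8, classes=8):
        super().__init__()
        self.groups, self.classes = groups, classes
        self.coarse_logits = nn.Linear(feat_dim, groups * classes)
        # the fine level sees both the input features and the coarse sample
        self.fine_logits = nn.Linear(feat_dim + groups * classes, groups * classes)

    def _sample(self, logits):
        # straight-through one-hot sample, in the spirit of DreamerV3's
        # categorical latents
        probs = F.softmax(logits.view(-1, self.groups, self.classes), dim=-1)
        idx = torch.multinomial(probs.view(-1, self.classes), 1).squeeze(-1)
        sample = F.one_hot(idx, self.classes).float()
        sample = sample.view(-1, self.groups, self.classes)
        sample = sample + probs - probs.detach()  # straight-through gradient
        return sample.flatten(1)

    def forward(self, feat):
        z_high = self._sample(self.coarse_logits(feat))  # semantic level
        z_low = self._sample(self.fine_logits(torch.cat([feat, z_high], -1)))
        return z_high, z_low

latent = TwoLevelDiscreteLatent()
z_high, z_low = latent(torch.randn(4, 256))  # batch of 4 feature vectors
```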
The following files/classes/functions were created or modified by us:

- `HierarchicalRSSM` in `networks.py`. The main hierarchical component (the sequence model) of HDreamer. We also modified the `MultiEncoder` and `MLP` forward passes to make HDreamer work.
- `./envs/pendulum.py`. The inverted double pendulum adaptation to the Dreamer's code (MuJoCo InvertedDoublePendulum-v5).
- `./envs/minigrid.py`. The MiniGrid adaptation to the Dreamer's code (MiniGrid-Unlock-v0).
- HDreamer's architecture specifications in `configs.yaml`.
- `./test_masked_eval.py`. Ablation study testing the reward return difference.
- `./eval*`. Files for evaluating HDreamer experiments/training.
- `./togif.py`. Showcases HDreamer performance via a GIF.
- Code in `./dreamer.py` that enables launching multiple Dreamer experiments.
- The `p*.sh`, `minigrid*.sh` and `global_config.py` auxiliary files.
Only code produced by us is properly documented and commented. More code was created (such as a "Gumbel-Softmax differentiable multi-hot-encoded distribution" or a "global hidden state aligner") but was eventually discarded due to time constraints.
- `dreamer.py`: the main Dreamer file containing the Dreamer model and the code for launching training experiments
- `models.py`: contains the world model (sequence model, reward and continue predictors) and the imagination behaviour (actor and critic)
- `networks.py`: all the NN architectures (sequence model, hierarchical sequence model, encoder/decoder, MLPs/CNNs)
- `./logdir/`: please create this folder yourself, download the zip at this link containing the final checkpoints, and unzip it here
- `configs.yaml`: the individual Dreamer architectures reside here (ours are `minigrid*` and `pendulum_small*`)
- `./envs/`: contains all the available environments (ours are `minigrid.py` and `pendulum.py`)
- `eval.py`: used for evaluating the trained Dreamers and plotting graphs
- `eval_training.ipynb`: graphs for training
- `eval_dreamer.ipynb`: graphs for evaluation and testing the statistical difference
- `tools.py`: utility code; most importantly it contains the `tools.simulate` function, which is used to run the Dreamers for N episodes (see the sketch after this list)
- `test_masked_eval.py` and `decoder_eval.py`: used for the ablation studies (the decoder for testing the qualitative difference, the eval for the quantitative reward return difference)
- `global_config.py`: contains the global flags for tracking codebook usage and the ablation masks
- `reconstruct_minigrid_final.py`: a function that converts a MiniGrid observation to an image
- `togif.py`: renders the Dreamer's evaluation performance into a GIF
- `./presentation/`: contains figures for the report and the presentation
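To give an idea of what such an evaluation rollout involves, here is a purely illustrative sketch; the real logic, signature, and bookkeeping live in `tools.simulate` in `tools.py`, and the `agent.act` interface below is made up:

```python
import numpy as np

def rollout_episodes(agent, env, episodes):
    """Illustrative stand-in for an evaluation loop like tools.simulate:
    run the agent for a fixed number of episodes and average the returns."""
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        state, done, total = None, False, 0.0
        while not done:
            # hypothetical interface: act from the observation and the
            # agent's recurrent state
            action, state = agent.act(obs, state)
            obs, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            total += float(reward)
        returns.append(total)
    return float(np.mean(returns))
```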
Get dependencies with Python 3.11 (make a virtual environment):

`pip install -r requirements.txt`
We did not want to force a particular PyTorch installation, so install the PyTorch version you need. We used:

`pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126`
If you install a non-CUDA-capable build, you need to go to `configs.yaml` and change `device` to `"cpu"` (in the default settings).
To launch 8 training experiments, run:

`python dreamer.py --config pendulum[hnumber]_small`
where `[hnumber]` corresponds to the HDreamer level you want (1 is the vanilla Dreamer; 2 and 3 are possible). For MiniGrid:

`python dreamer.py --config minigrid[hnumber]`
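For example, to train the two-level HDreamer on MiniGrid-Unlock, the call would be:

`python dreamer.py --config minigrid2`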
The checkpoints with metrics will appear in `./logdir`.
To run the evaluation on our pretrained models, first download the checkpoints here (university account needed). Place the folders into `./logdir/*` and then you can launch:

`python eval.py`
If you want to evaluate a different environment, go to `eval.py` and change the `env` variable at the bottom to either `idp` or `minigrid`. The results will appear in `./data/eval_[env].csv`; you can use the Jupyter notebooks to analyze them.
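If you prefer not to use the notebooks, a minimal way to inspect such a results file could look like the sketch below (we only assume the CSV exists; check its actual header before relying on any column names):

```python
import pandas as pd

# load the evaluation results for the inverted double pendulum runs
df = pd.read_csv("./data/eval_idp.csv")
print(df.head())      # inspect the actual columns first
print(df.describe())  # summary statistics over the logged episodes
```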
To run the ablations on the same models, first download the checkpoints here (university account needed). Place the folders into `./models/*` and then you can launch either the qualitative (decoder reconstruction) ablation using:

`python decoder_eval.py --config [config_name] --checkpoint_dir [path_to_the_checkpoint_folder]`
or the quantitative ablation (reward return difference) using:

`python test_masked_eval.py --config [config_name] --checkpoint_dir [path_to_the_checkpoint_folder]`
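For example, assuming the checkpoint archive unpacks into a folder named after its config (the folder name here is made up; use whatever the archive actually contains), a call might look like:

`python decoder_eval.py --config pendulum2_small --checkpoint_dir ./models/pendulum2_small`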