Hierarchical extension of Dreamer-V3 pytorch implementation. Decouples low level and high level features in the latent space.

viktorvesely/hdreamer

Hierarchical dreamer

!MOST OF THE CODE WAS NOT PRODUCED BY US!

As specified in the project proposal, we based the implementation of the Hierarchical Dreamer (HDreamer) on a publicly available DreamerV3 implementation (the original repository). That repository's author is Naoki Morihira, who re-created DreamerV3 (link to the paper) in PyTorch.

HDreamer is a hierarchical extension of the DreamerV3 architecture for model-based reinforcement learning. Inspired by HQ-VAE, HDreamer introduces multiple levels of discrete latent variables to disentangle low-level sensory information from high-level semantic structure. This aims to mitigate codebook collapse, a failure mode in VQ-based models where the discrete latent space is underutilized due to entangled representations.
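To make "underutilized" concrete, here is a minimal sketch of two common codebook-usage measures: the fraction of codes that are ever selected, and the perplexity of the code distribution. It is an illustration only, not the tracking code behind global_config.py; the function name and the toy inputs are assumptions.

```python
import math
from collections import Counter


def codebook_usage(indices, codebook_size):
    """Return (fraction of codes used, perplexity) for a list of code indices.

    Hypothetical helper for illustration; not part of this repository.
    """
    counts = Counter(indices)
    total = len(indices)
    used_fraction = len(counts) / codebook_size
    # Shannon entropy (in nats) of the empirical code distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Perplexity is the effective number of codes in use.
    perplexity = math.exp(entropy)
    return used_fraction, perplexity


# Toy data: a codebook of 4 entries, used evenly vs. collapsed onto one code.
healthy = codebook_usage([0, 1, 2, 3] * 25, codebook_size=4)
collapsed = codebook_usage([0] * 100, codebook_size=4)
```

A perplexity close to the codebook size means the codes are used evenly; a value near 1 means almost everything is mapped to a single code, which is the collapse described above.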

Code in this repository made by us

The following files/classes/functions were created or modified by us:

  1. HierarchicalRSSM in networks.py. The main hierarchical component (sequence model) of HDreamer. We also modified the MultiEncoder and MLP forward passes to make HDreamer work.
  2. ./envs/pendulum.py. The inverted double pendulum adaptation to the dreamer's code. (Mujoco InvertedDoublePendulum-v5)
  3. ./envs/minigrid.py. The MiniGrid adaptation to the dreamer's code. (MiniGrid-Unlock-v0)
  4. The HDreamer architecture specifications in configs.yaml
  5. ./test_masked_eval.py. Ablation study, testing the reward return difference
  6. ./eval* Files for evaluating HDreamer experiments/training
  7. ./togif.py Showcases HDreamer performance via GIF
  8. Code in ./dreamer.py that enables launching multiple dreamer experiments
  9. p*.sh, minigrid*.sh and global_config.py auxiliary files.

Only code produced by us is properly documented and commented. More code was created (such as "Gumbel-Softmax differentiable multi-hot-encoded distribution" or "Global hidden state aligner") but was eventually discarded due to time constraints.

Overview of the files

  • dreamer.py main dreamer file containing the dreamer model and the code for launching training experiments
  • models.py contains the world model (sequence model, reward and continue predictors) and the imagined behaviour (actor and critic)
  • networks.py all the NN architectures (sequence model, hierarchical sequence model, encoder/decoder, MLPs/CNNs)
  • ./logdir/ create this folder yourself, download the zip at this link containing the final checkpoints, and unzip it here
  • configs.yaml the individual dreamer architectures reside here (ours are minigrid* and pendulum_small*)
  • ./envs/ contains all the available environments (ours are minigrid.py and pendulum.py)
  • eval.py used for evaluating the trained dreamers and plotting graphs
  • eval_training.ipynb graphs for training
  • eval_dreamer.ipynb graphs for evaluation and tests of statistical difference
  • tools.py utility code; most importantly it contains the tools.simulate function, which is used to run the dreamers for N episodes
  • test_masked_eval.py and decoder_eval.py used for the ablation studies (decoder_eval.py tests the qualitative difference, test_masked_eval.py the quantitative reward return difference)
  • global_config.py contains global flags for tracking the codebook usage and ablation masks
  • reconstruct_minigrid_final.py function that converts a MiniGrid observation to an image
  • togif.py renders the dreamer's evaluation performance into a GIF
  • ./presentation/ contains figures for the report and the presentation

Installation

Install the dependencies with Python 3.11 (preferably in a virtual environment):

pip install -r requirements.txt

We did not want to force a particular PyTorch installation, so install the PyTorch version you need. We used:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

If you install a PyTorch build without CUDA support, go to configs.yaml and change device to "cpu" (in the default settings).
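If you are unsure which value to put in configs.yaml, a short check like the following tells you whether a CUDA device can be used. This is a sketch only: resolve_device is a hypothetical helper that is not part of the repository, and the repository itself reads the device from configs.yaml rather than detecting it.

```python
def resolve_device(requested: str = "cuda") -> str:
    """Return `requested` if it can be used, otherwise fall back to "cpu".

    Hypothetical helper for illustration; not part of this repository.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all, so only the CPU string makes sense.
        return "cpu"
    if requested.startswith("cuda") and not torch.cuda.is_available():
        return "cpu"
    return requested


# The printed value is the one to set as `device` in configs.yaml.
print(resolve_device("cuda"))
```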

Training

To launch 8 training experiments run:

python dreamer.py --config pendulum[hnumber]_small

where [hnumber] is the HDreamer level you want (1 is vanilla; 2 and 3 are also available). For MiniGrid, run:

python dreamer.py --config minigrid[hnumber]

The checkpoint and the logged metrics will appear in ./logdir.
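To queue every level for both environments, the command lines can be generated from the naming pattern above. This is a sketch that assumes the config names follow pendulum[hnumber]_small and minigrid[hnumber] exactly; check configs.yaml for the names that actually exist. The training_commands helper is hypothetical.

```python
def training_commands(levels=(1, 2, 3)):
    """Build one `python dreamer.py --config ...` command per level and environment.

    Hypothetical helper; the config names are assumed from the pattern in this README.
    """
    commands = []
    for h in levels:
        commands.append(["python", "dreamer.py", "--config", f"pendulum{h}_small"])
        commands.append(["python", "dreamer.py", "--config", f"minigrid{h}"])
    return commands


# Print the commands instead of running them, so they can be inspected first.
for cmd in training_commands():
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to actually start the run.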

Evaluation

To run the evaluation on our pretrained models, first download the checkpoints here (university account needed).

Place the folders in ./logdir/ and then launch:

python eval.py

If you want to evaluate a different environment, go to eval.py and change the env variable at the bottom to either idp or minigrid. The results will appear in ./data/eval_[env].csv; you can use the Jupyter notebooks to analyze them.
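For a quick look at the results without opening the notebooks, the CSV can be summarized with the standard library. The sketch below runs on an in-memory sample because the real column layout is not documented here: the column names config and return, and the sample values, are hypothetical and must be adapted to the actual file.

```python
import csv
import io
from statistics import mean

# Hypothetical sample; the real columns of ./data/eval_[env].csv may differ.
SAMPLE = """config,return
minigrid1,0.50
minigrid1,0.70
minigrid2,0.90
"""


def mean_return_by_config(handle, group_col="config", value_col="return"):
    """Average `value_col` per distinct value of `group_col` in a CSV file object."""
    grouped = {}
    for row in csv.DictReader(handle):
        grouped.setdefault(row[group_col], []).append(float(row[value_col]))
    return {name: mean(values) for name, values in grouped.items()}


summary = mean_return_by_config(io.StringIO(SAMPLE))
print(summary)
```

To use it on a real file, replace io.StringIO(SAMPLE) with an opened file handle, for example open(path, newline="").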

Ablation

To run the ablations on the same models, first download the checkpoints here (university account needed).

Place the folders in ./models/ and then launch either the qualitative (decoder reconstruction) ablation using:

python decoder_eval.py --config [config_name] --checkpoint_dir [path_to_the_checkpoint_folder]

or the quantitative ablation (reward difference) using:

python test_masked_eval.py --config [config_name] --checkpoint_dir [path_to_the_checkpoint_folder]
