PlaySlot: Controllable Object-Centric Video Prediction

Official implementation of: PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning by Angel Villar-Corrales and Sven Behnke. ICML. 2025.

[Paper] [Project Page] [BibTeX]

[Figure: main overview figure, followed by example animations with columns Target, Preds., Segm., Obj.1, Obj.2]

Installation and Dataset Preparation

Clone the repository and install all required packages included in our conda environment, as well as other external dependencies such as the multi-object-fetch environment and MetaWorld.

git clone git@github.com:angelvillar96/PlaySlot.git
cd PlaySlot
./create_conda_env.sh
source ~/.bashrc
conda activate PlaySlot

Download and extract the pretrained models, including checkpoints for the SAVi decomposition, predictor modules and behavior modules:

chmod +x download_pretrained.sh
./download_pretrained.sh

Download the datasets:

ButtonPress & BlockPush: You can automatically download and place the ButtonPress and BlockPush datasets by running the following commands:

chmod +x download_datasets.sh
./download_datasets.sh

Sketchy: To download the Sketchy robot dataset, please refer to the original source.

Training

See docs/TRAIN.md for detailed instructions on training your own PlaySlot model. It covers all training stages: training SAVi, jointly training cOCVP and InvDyn, and learning behaviors from unlabelled expert demonstrations.

Evaluation and Figure Generation

We provide bash scripts for running the evaluation and generating figures with our pretrained checkpoints.
Run a script with:

./scripts/SCRIPT_NAME

Example:

./scripts/05_eval_PlaySlot_BlockPush.sh 
./scripts/06_generate_figs_pred_BlockPush.sh
./scripts/06_generate_action_figs_BlockPush.sh
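
If you want to run every script for one dataset in sequence, a small helper along the following lines may be convenient. This is only a sketch and not part of the repository: the function names `select_scripts` and `run_scripts` are hypothetical, and it assumes that script file names end in `_<Dataset>.sh`, a convention inferred from the three example scripts above.

```python
from pathlib import Path
import subprocess


def select_scripts(script_names, dataset):
    """Return the script names ending in `_<dataset>.sh`, sorted by name.

    Assumption: scripts follow the `NN_..._<Dataset>.sh` naming seen in the
    examples, so sorting by name also sorts by the numeric prefix.
    """
    suffix = f"_{dataset}.sh"
    return sorted(name for name in script_names if name.endswith(suffix))


def run_scripts(scripts_dir, dataset):
    """Run every matching script in order, stopping at the first failure."""
    names = [path.name for path in Path(scripts_dir).glob("*.sh")]
    for name in select_scripts(names, dataset):
        # check=True raises CalledProcessError if a script exits with a non-zero status
        subprocess.run(["bash", str(Path(scripts_dir) / name)], check=True)
```

Calling `run_scripts("scripts", "BlockPush")` from the repository root would then execute the matching scripts one after another.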

Below we describe the individual evaluation and figure generation scripts in more detail.

Evaluate SAVi for Image Decomposition

You can quantitatively and qualitatively evaluate a SAVi video decomposition model using the src/03_evaluate_savi.py and src/06_generate_figs_savi.py scripts, respectively.

The first script evaluates the model on the test set, and the second generates figures of the results.

Example:

python src/03_evaluate_savi.py \
  -d experiments/BlockPush/ \
  --savi_ckpt SAVi_BlockPush.pth \
  --results_name quant_eval_savi

python src/06_generate_figs_savi.py \
  -d experiments/BlockPush/ \
  --savi_ckpt SAVi_BlockPush.pth \
  --num_seqs 10 \
  --num_frames 8

Show SAVi Figures

Generating figures with SAVi should produce figures as follows:

Evaluate PlaySlot for Video Prediction

You can evaluate PlaySlot for video prediction using the src/05_evaluate_PlaySlot.py script. This script takes pretrained SAVi and PlaySlot checkpoints and evaluates the visual quality of the predicted frames.

Example:

python src/05_evaluate_PlaySlot.py \
  -d experiments/BlockPush/ \
  --name_pred_exp PlaySlot \
  --savi_ckpt SAVi_BlockPush.pth \
  --pred_ckpt PlaySlot_BlockPush.pth \
  --results_name quant_eval_playslot \
  --post_only \
  --num_seed 6 \
  --num_preds 15 \
  --set_expert_policy
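
To repeat this evaluation for several datasets, the command can be assembled programmatically. The sketch below only rebuilds the command line shown above; it is not part of the repository. The helper names `build_eval_command` and `format_command` are hypothetical, and the pattern `experiments/<dataset>/`, `SAVi_<dataset>.pth`, `PlaySlot_<dataset>.pth` is an assumption generalized from the BlockPush example, so check the actual directory and checkpoint names before using it for another dataset.

```python
import shlex


def build_eval_command(dataset, num_seed=6, num_preds=15):
    """Assemble the evaluation command from the example for a given dataset.

    Assumption: experiment directories and checkpoints follow the naming
    of the BlockPush example; only the BlockPush names are confirmed.
    """
    return [
        "python", "src/05_evaluate_PlaySlot.py",
        "-d", f"experiments/{dataset}/",
        "--name_pred_exp", "PlaySlot",
        "--savi_ckpt", f"SAVi_{dataset}.pth",
        "--pred_ckpt", f"PlaySlot_{dataset}.pth",
        "--results_name", "quant_eval_playslot",
        "--post_only",
        "--num_seed", str(num_seed),
        "--num_preds", str(num_preds),
        "--set_expert_policy",
    ]


def format_command(args):
    """Render the argument list as a single shell-safe string for inspection."""
    return shlex.join(args)
```

Printing `format_command(build_eval_command("BlockPush"))` lets you inspect the command before passing the list to `subprocess.run`.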

Generate Figures and Animations

We provide two scripts to generate video prediction, object prediction, and segmentation figures and animations.

src/06_generate_figs_pred.py generates images and animations of frames, objects and slot masks predicted by PlaySlot conditioned on latent actions inferred by the Inverse Dynamics model.

Example:

python src/06_generate_figs_pred.py \
  -d experiments/BlockPush/ \
  --name_pred_exp PlaySlot \
  --savi_ckpt SAVi_BlockPush.pth \
  --pred_ckpt PlaySlot_BlockPush.pth \
  --num_seqs 10 \
  --num_seed 1 \
  --num_preds 15 \
  --set_expert_policy

Show Example Outputs of `src/06_generate_figs_pred.py`

Generating figures with PlaySlot should produce animations as follows:

src/06_generate_action_figs.py generates images and animations of frames predicted by PlaySlot when the prediction process is repeatedly conditioned on a single learned action prototype.

Example:

python src/06_generate_action_figs.py \
  -d experiments/BlockPush/ \
  --name_pred_exp PlaySlot \
  --savi_ckpt SAVi_BlockPush.pth \
  --pred_ckpt PlaySlot_BlockPush.pth \
  --num_seqs 10 \
  --num_seed 1 \
  --num_preds 15 \
  --set_expert_policy
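
Conditioning repeatedly on a single action prototype can be pictured with a toy rollout. The sketch below is purely illustrative and is not PlaySlot's implementation: `toy_predictor` is a hypothetical stand-in that shifts a 2D position, whereas the real predictor operates on object slots. It only shows the control flow, in which the same action is fed at every step and each prediction becomes the next input.

```python
def rollout_with_prototype(initial_state, prototype, predictor, num_preds):
    """Autoregressively predict `num_preds` states using the same action each step."""
    states = []
    state = initial_state
    for _ in range(num_preds):
        # The previous prediction is fed back in, always paired with the same prototype
        state = predictor(state, prototype)
        states.append(state)
    return states


def toy_predictor(state, action):
    """Hypothetical stand-in for a learned predictor: move a 2D point by the action."""
    return (state[0] + action[0], state[1] + action[1])
```

With a prototype such as `(1, 0)`, every predicted state moves one unit further along the first axis, which is the kind of consistent per-prototype effect these animations are meant to visualize.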

Show Example Outputs of `src/06_generate_action_figs.py`

Generating figures with this script should produce animations as follows:

Evaluate Behaviors Learned by PlaySlot

You can quantitatively and qualitatively evaluate PlaySlot's behaviors using the src/11_evaluate_behavior_on_simulation.py script. This script evaluates the model in a simulator and generates figures of the results.

Example:

python src/11_evaluate_behavior_on_simulation.py \
  -d experiments/zz_Clean/Final_00/ \
  --savi_ckpt checkpoint_epoch_440.pth \
  --name_pred_exp NewSlotLatent \
  --pred_ckpt checkpoint_epoch_800.pth \
  --name_beh_exp zzBehn \
  --beh_ckpt Policy_checkpoint_last_saved.pth \
  --seed 1000 \
  --num_sims 10

Show Example Outputs of `src/11_evaluate_behavior_on_simulation.py`

Generating figures of PlaySlot's learned behaviors should produce animations as follows:

Acknowledgement

Our work is inspired by and uses resources from the following repositories:

Contact and Citation

This repository is maintained by Angel Villar-Corrales.

Please consider citing our paper if you find our work or our repository helpful.

@inproceedings{villar_PlaySlot_2025,
  title={PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning},
  author={Villar-Corrales, Angel and Behnke, Sven},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}

In case of any questions or problems regarding the project or repository, do not hesitate to contact the authors at villar@ais.uni-bonn.de.
