
AutoNNU-Net

Integration of Automated Machine Learning (AutoML) methods into nnU-Net. Free software: BSD license.



Repo Structure

The repository is organized into the following directories:

  • autonnunet: The AutoNNU-Net Python package, including
    • analysis: Plotting and DeepCAVE utilities
    • datasets: MSD dataset handling
    • evaluation: Prediction tools for the MSD test set
    • experiment_planning: Extensions to the nnU-Net experiment planning tools for AutoNNU-Net
    • hnas: Hierarchical NAS search space and its integration into AutoNNU-Net
    • inference: Prediction within AutoNNU-Net
    • utils: Collection of various utilities, e.g., paths
  • data: Everything related to the (MSD) datasets
  • output: Everything generated locally by AutoNNU-Net, e.g. optimization results and MSD submissions
  • results_zipped: Compressed outputs; these are stored in the repository
  • runscripts: The actual scripts to execute experiments etc.
  • submodules: Git submodules, e.g. hypersweeper, nnU-Net etc.
  • tests: Unit tests for AutoNNU-Net
  • paper: Plots and tables generated by the plotting scripts

Installation

Important: This code was only tested on Rocky Linux 9.5 with CUDA 12.4; other operating systems, GPUs, and CUDA versions may not be supported. CUDA drivers are highly recommended when installing AutoNNU-Net, as the installation of PyTorch may otherwise fail. On HPC systems, for example, this means that you have to load the CUDA module before installing the package.
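On an Lmod-based HPC system, this typically looks as follows (the exact module name is cluster-specific and purely illustrative):

# Load the CUDA toolkit before installing; the module name varies by cluster
module load CUDA/12.4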

Important: Due to compatibility issues with numpy, DeepCAVE is not listed as a requirement of AutoNNU-Net. However, you need DeepCAVE to create the plots and tables. We therefore recommend installing DeepCAVE manually after running the experiments.

  1. Clone the repository and its submodules
git clone --recursive https://github.com/automl/AutoNNUnet.git autonnunet
cd autonnunet
  2. Create and activate an Anaconda/Miniconda environment with Python 3.10
conda create -n autonnunet python=3.10
conda activate autonnunet
  3. Install AutoNNU-Net
make install

Important: The automated installation is convenient if you want to install all submodules automatically. However, it is also quite sensitive to system-specific Python and package versions. Therefore, if the installation using make fails, we recommend installing the submodules manually:

# submodules
cd submodules/batchgenerators && git checkout master && git pull && pip install . && cd ../../
cd submodules/hypersweeper && git checkout dev && git pull && pip install . && cd ../../
cd submodules/MedSAM && git checkout MedSAM2 && git pull && pip install . && cd ../../
cd submodules/neps && git checkout master && git pull && pip install . && cd ../../
cd submodules/nnUNet && git checkout dev && git pull && pip install . && cd ../../

# AutoNNUNet
pip install -e ".[dev]"
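As a quick sanity check that the editable install succeeded (an illustrative one-liner, not part of the official workflow):

# Verify that the package can be imported
python -c "import autonnunet; print('AutoNNU-Net import OK')"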

Reproduction of Experiments

Cluster Setup

For our experiments, we used submitit-slurm to run code on a SLURM cluster. You can define your custom SLURM cluster configuration in runscripts/configs/cluster.

We ran all experiments using the gpu cluster configuration. If you want to run your experiments locally, please add cluster=local to every command that uses Hydra.
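For example, to run a single training fold locally instead of submitting it to SLURM (an illustrative invocation; the concrete commands are listed in the sections below):

python runscripts/train.py -m "dataset=Dataset001_BrainTumour" "fold=0" "cluster=local"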

Download Datasets

To download a specific dataset, run

python autonnunet/datasets/msd_dataset.py --dataset_name=<dataset>

For example, to download D01 (BrainTumour), run:

python autonnunet/datasets/msd_dataset.py --dataset_name=Dataset001_BrainTumour

To download all datasets, run

./runscripts/download_msd.sh

Convert and Pre-process Datasets for nnU-Net

Important: This has to be executed in the same cluster/compute environment that will later be used for training, e.g. by appending cluster=gpu, so that the correct nnU-Net configurations are generated.

python runscripts/convert_and_preprocess_nnunet.py -m "dataset=glob(*)"
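To preprocess a single dataset instead, replace the glob with a concrete dataset name, e.g.:

python runscripts/convert_and_preprocess_nnunet.py -m "dataset=Dataset001_BrainTumour" "cluster=gpu"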

Baseline Training

nnU-Net Conv

python runscripts/train.py -m "dataset=glob(*)" "fold=range(5)"

nnU-Net ResM

python runscripts/train.py -m "dataset=glob(*)" "fold=range(5)" "hp_config.encoder_type=ResidualEncoderM"

nnU-Net ResL

python runscripts/train.py -m "dataset=glob(*)" "fold=range(5)" "hp_config.encoder_type=ResidualEncoderL"

MedSAM2

Important: Before you can run the MedSAM2 fine-tuning for a dataset, you first need to train at least one of the nnU-Net models on that dataset, since nnU-Net training creates the dataset splits.

Convert and Pre-process Datasets for MedSAM2

Important: The pre-processing for MedSAM2 must be executed locally, i.e., it cannot be submitted to a SLURM cluster, due to compatibility issues between pickle and multiprocessing.

python runscripts/convert_and_preprocess_medsam2.py -m "dataset=glob(*)" "cluster=local"

Fine-tune MedSAM2

  1. Download model checkpoint
cd submodules/MedSAM && mkdir checkpoints && cd checkpoints
wget https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_tiny.pt
cd ../../../
  2. Fine-tune MedSAM2
python runscripts/finetune_medsam2.py -m "dataset=glob(*)" "fold=range(5)"

Compute Hyperband budgets

python runscripts/determine_hyperband_budgets.py --b_min=10 --b_max=1000 --eta=3
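For intuition, the following is a minimal sketch of how the standard Hyperband schedule derives its budget levels from b_min, b_max, and eta; the actual script may compute or format its output differently.

import math

def hyperband_budgets(b_min: float, b_max: float, eta: int) -> dict:
    # Number of brackets: s_max = floor(log_eta(b_max / b_min))
    s_max = math.floor(math.log(b_max / b_min, eta))
    # Bracket s starts at b_max * eta^(-s) and multiplies the budget by eta per rung
    return {s: [b_max * eta ** (i - s) for i in range(s + 1)] for s in range(s_max, -1, -1)}

for s, budgets in hyperband_budgets(b_min=10, b_max=1000, eta=3).items():
    print(f"bracket {s}:", [round(b, 1) for b in budgets])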

HPO

python runscripts/train.py --config-name=tune_hpo -m "dataset=Dataset001_BrainTumour"

HPO + NAS

python runscripts/train.py --config-name=tune_hpo_nas -m "dataset=Dataset001_BrainTumour"
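The tuning commands above are shown for D01; tuning other datasets presumably works via the same Hydra glob pattern used for training (an untested assumption), e.g.:

python runscripts/train.py --config-name=tune_hpo_nas -m "dataset=glob(*)"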

Extract & Train Incumbent

Incumbent configurations are stored in runscripts/configs/incumbent; our incumbent configurations are already included in this directory. If you want to re-create them after running the experiments, you need to run:

python runscripts/extract_incumbents.py --approach=hpo
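If the --approach flag mirrors the tuning config names, the HPO+NAS incumbents can presumably be extracted analogously (an assumption on the flag's values):

python runscripts/extract_incumbents.py --approach=hpo_nas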

Using these configs, you can then run the training of the incumbent configurations with the following command:

python runscripts/train.py -m "dataset=<dataset_name>" "+incumbent=Dataset001_BrainTumour_<approach>" "fold=range(5)" "pipeline.remove_validation_files=False"

Please note that you could also use the models saved during the optimization. In our experiments, however, we did not store model checkpoints in the respective run directories to reduce storage consumption.

To run nnU-Net with the incumbent configuration for the HPO approach on D01, run

python runscripts/train.py -m "dataset=Dataset001_BrainTumour" "+incumbent=Dataset001_BrainTumour_hpo" "fold=range(5)"

Cross Evaluation

For the cross-evaluation of incumbent configurations, we select the 9 out of 10 datasets on which HPO+NAS achieved an improvement. To train all datasets with the incumbent configuration of another dataset, run

./runscripts/train_cross_eval.sh <dataset_name>
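For example, to run the cross-evaluation with the incumbent configuration found for D01:

./runscripts/train_cross_eval.sh Dataset001_BrainTumour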

Inference and MSD Submission

python runscripts/run_inference.py --approach=<approach>

Or directly submit it to SLURM:

sbatch runscripts/run_inference.sh <approach>

This creates the MSD submission in output/msd_submissions.
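For example, using the HPO incumbents (approach names follow the conventions above):

python runscripts/run_inference.py --approach=hpo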

Plots and Tables

To generate all plots and tables in the paper and store them in output/paper, run

python runscripts/plot.py


Common Issues

TorchInductor fails when loading JSON, found extra data

Sometimes during optimization, jobs fail while loading cached TorchInductor files. To fix this, clear the caches:

rm -rf ~/.cache/torch
rm -rf ~/.cache/triton/
rm -rf ~/.nv/ComputeCache

Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage) project template.
