(For changes, visit CHANGELOG.md.)
Paper: arXiv
The project was built with Python 3.10.13, so please ensure that you have this version (or a newer one) installed. This can easily be done using `pyenv`; however, we strongly recommend using `uv`, as it gives the highest degree of reproducibility for this project and also automatically handles the Python version. All dependencies and project settings can be found in `pyproject.toml`.

The software was created and tested on a Linux system with an NVIDIA GPU.
- Create and activate the virtual environment from the `pyproject.toml` with `uv`.

  `uv` makes the instantiation easy: just run `uv sync` and it will install the required Python version as well as all dependencies (with exact resolution using `uv.lock`).

  ALTERNATIVELY, make sure you have Python 3.10.13 or newer installed, then run

  ```shell
  python -m venv .venv
  source .venv/bin/activate
  pip install -e .
  ```

  On subsequent uses, you only need to activate the virtual environment. For non-bash-based workflows, have a look at the instructions for `venv` on how to activate the environment.
- Set environment variables:
  - `DATASET_LOCATION`: location of the images and the dataframe containing the labels
  - `EXPERIMENT_LOCATION`: location for the log files during experiments

  Alternatively, if using `uv`, you can set the variables in the `.env` file.
- Download the accompanying `data_and_code` from figshare. You should receive a `GleasonXAI_data.zip`.
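For a plain shell workflow, the two environment variables above can be exported before running any script. The paths below are placeholders for illustration, not values required by the project:

```shell
# Placeholder paths — substitute your own locations.
export DATASET_LOCATION="/data/gleason"
export EXPERIMENT_LOCATION="/data/gleason/experiments"
```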
ALL RUNNABLE SCRIPTS CAN BE FOUND IN `./scripts`.

Run them with python (`python scripts/somescript.py`) or with uv (`uv run scripts/somescript.py`). `uv` does not require you to set up the environment or a fitting Python version beforehand. If using `uv`, you can set the environment variables in `.env` and then use `uv run --env-file=.env scripts/somescript.py`.
- `setup.py` has default values for the download links for the Gleason 2019 data. If they work, you need to do nothing (see next step). If you encounter an error in the next step, generate download links for the Gleason 2019 challenge data (test and training sets) by copying the link generated by clicking on "Download all" on their sync download page. Alternatively, download the datasets.
- Use `setup.py` to create the data structure as needed (see below).

  | parameter | type | use |
  |---|---|---|
  | `--gleasonxai_data` | string | path to directory / zip file containing the downloaded data (`GleasonXAI_data.zip`) |
  | `--manual_xai_data` | flag | if set: expect GleasonXAI data to already be in the expected structure and place |
  | `--download` | flag | if set: download / unzip datasets and copy to the expected path |
  | `--calibrate` | flag | if set: create micron/pixel calibrated images |
  | `--gleason19_test` | string | download link to the Gleason 2019 challenge test set * |
  | `--gleason19_train` | string | download link to the Gleason 2019 challenge training set * |
  | `--arvaniti` | string | download link to the Harvard Arvaniti et al. data set (has a default value that should work) |
This creates the following expected directory structure (depending on set flags):
```
├── [DATASET_LOCATION]
│   └── GleasonXAI
│       ├── final_filtered_explanations_df.csv
│       ├── label_remapping.json
│       ├── TMA
│       │   ├── original
│       │   │   ├── PR482a_A1.jpg
│       │   │   ├── PR482a_A2.jpg
│       │   │   └── ...
│       │   └── MicronsCalibrated
│       │       ├── PR482a_A1.jpg
│       │       ├── PR482a_A2.jpg
│       │       └── ...
│       └── GleasonFinal2
│           └── label_level1
│               ├── SoftDiceBalanced-1
│               │   └── version_0
│               │       └── checkpoints
│               │           └── best_model.ckpt
│               ├── SoftDiceBalanced-2
│               │   └── version_0
│               │       └── checkpoints
│               │           └── best_model.ckpt
│               └── SoftDiceBalanced-3
│                   └── version_0
│                       └── checkpoints
│                           └── best_model.ckpt
└── ...
```
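A quick sanity check (a sketch, assuming `DATASET_LOCATION` is already set) that the three checkpoint files from the tree above are in place:

```shell
# Count how many of the expected best_model.ckpt files are present.
found=0
for i in 1 2 3; do
  f="$DATASET_LOCATION/GleasonXAI/GleasonFinal2/label_level1/SoftDiceBalanced-$i/version_0/checkpoints/best_model.ckpt"
  if [ -f "$f" ]; then
    found=$((found+1))
  else
    echo "missing: $f"
  fi
done
echo "$found of 3 checkpoints found"
```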
- Call `setup.py` with `--gleasonxai_data` set to unpack the model weights, i.e.:

  ```shell
  python setup.py --gleasonxai_data "/path/to/GleasonXAI_data.zip"
  ```
- Call `run_gleasonXAI.py` with the following parameters:

  | parameter | type | use |
  |---|---|---|
  | `--images` | string | path to images to make predictions on (single image or directory) |
  | `--checkpoint_absolute` | flag | if set: checkpoint paths are absolute paths, else assumed relative to `DATASET_LOCATION` |
  | `--checkpoint_1` | string | path to first of three GleasonXAI models |
  | `--checkpoint_2` | string | path to second of three GleasonXAI models |
  | `--checkpoint_3` | string | path to third of three GleasonXAI models |
  | `--save_path` | string | path to output directory |

  The default values for the checkpoints are set to the paths `setup.py` will move the model weights to. If the setup was done with `setup.py`, call:

  ```shell
  python run_gleasonXAI.py --images "/path/to/imagedir" --save_path "/path/to/outputdir"
  ```
- Call `setup.py` with `--gleasonxai_data`, `--download` and `--calibrate` set, and provide paths or URLs to the Gleason 2019 challenge data, i.e.:

  ```shell
  python setup.py --gleasonxai_data "/path/to/GleasonXAI_data.zip" --download --calibrate --gleason19_train "/path/to/Train Imgs.zip" --gleason19_test "/path/to/Test.zip"
  ```
- Run `test.py` to create predictions on the test set (at least for models in `GleasonFinal2/label_level1/SoftDiceBalanced-{i}/version_0/`).

  | parameter | type | use |
  |---|---|---|
  | `--experiment_path` | string | path to directory containing all model weights and settings |
  | `--checkpoint` | string | path within experiment_path to search for models and settings in |
  | `--glob_checkpoints` | string | glob pattern to select only models trained with the specified loss |

  If `setup.py` was used, the following call generates the predictions for the GleasonXAI:

  ```shell
  python test.py --experiment_path "/[DATASET_LOCATION]/GleasonXAI" --checkpoint "GleasonFinal2/label_level1" --glob_checkpoints "SoftDiceBalanced-*"
  ```
- Open and run the Jupyter notebook `evaluate_paper_results.ipynb` to create the visualizations and figures of the paper.
- Call `setup.py` with `--gleasonxai_data`, `--download` and `--calibrate` set, and provide paths or URLs to the Gleason 2019 challenge data if needed, i.e.:

  ```shell
  python setup.py --gleasonxai_data "/path/to/GleasonXAI_data.zip" --download --calibrate --gleason19_train "/path/to/Train Imgs.zip" --gleason19_test "/path/to/Test.zip"
  ```
- Use the provided configs in `/configs/` to set all hyperparameters and loss functions. Our project uses the hydra framework to parse the tree of configs and configure your training.
- Call `run_training.py`. Example configurations:

  - SoftDiceLoss on soft label explanations:

    ```shell
    python run_training.py dataset.label_level=1 loss_functions=soft_dice_balanced experiment=EXPERIMENTNAME/CAN/CONTAIN/SUBFOLDERS
    ```

  - DiceLoss on majority voted explanations:

    ```shell
    python run_training.py dataset.label_level=1 loss_functions=dice_loss experiment=YOUREXPERIMENTNAME
    ```

  When setting `dataset.label_level` to 0, you can directly train on Gleason patterns (`label_level=2` are the sub-explanations).
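Based on the note above, a training run directly on Gleason patterns would set `dataset.label_level=0`; this is a sketch, and the loss function and experiment name below are placeholders, not values prescribed by the project:

```shell
# Hypothetical variant: train on Gleason patterns (label_level=0) instead of explanations.
python run_training.py dataset.label_level=0 loss_functions=dice_loss experiment=GLEASON_PATTERN_RUN
```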