
SlideLab is a pipeline for preprocessing hematoxylin and eosin (H&E) whole slide images (WSIs) for computational pathology applications. It includes masking, tiling, normalization, quality checks, encoding, and optional whole-slide reconstruction after tiling. It can be run as a customizable pipeline for batch WSI processing, or its functions can be called directly to perform specific tasks; see the arguments below for the available options. As a pipeline, it is designed so that, if it is stopped for any reason, processing can resume from the last completed step. It also produces an error report (listing any error that occurred and where in the pipeline it occurred) and a summary report with statistics such as the percentage of tissue and the time taken to process each WSI.
SlideLab uses several methods to filter out artifacts such as pen marks and blots from slides and to segment tissue sections according to adjustable thresholds. Thresholds such as the minimum tissue percentage in a tile are used to select candidate tiles and can be adjusted. For the parameters associated with masking, refer to the masking params. The hematoxylin and eosin Otsu adaptation was obtained from Schreiber et al. [4].
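
For readers unfamiliar with Otsu thresholding, the sketch below shows the plain method on a one-dimensional set of intensity values. It is not the H&E adaptation from Schreiber et al. and is not taken from SlideLab's source; the function name `otsu_threshold` and the synthetic data are hypothetical and only illustrate how a threshold can be chosen automatically from a bimodal histogram.

```python
import numpy as np


def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Return the bin center that maximizes the between-class variance (plain Otsu)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                      # number of values at or below each bin
    w1 = w0[-1] - w0                          # number of values above each bin
    s0 = np.cumsum(hist * centers)            # running intensity sum of the lower class
    m0 = s0 / np.maximum(w0, 1)               # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)    # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2        # between-class variance, up to a constant
    return float(centers[np.argmax(between)])


# Synthetic bimodal data: two well-separated intensity clusters.
rng = np.random.default_rng(0)
low = rng.normal(0.2, 0.02, 5000)
high = rng.normal(0.8, 0.02, 5000)
t = otsu_threshold(np.concatenate([low, high]))
print(0.2 < t < 0.8)
```

The threshold falls between the two cluster means, so comparing values against `t` separates the two groups.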



To use this package, you need Python (>3.9) installed on your system. Conda is also recommended for installation.
First, clone the repository and change into its directory:
git clone https://github.com/lolmomarchal/SlideLab.git
cd SlideLab
To install the required dependencies, create and activate the conda environment:
conda env create -f environment.yml
conda activate slidelab

Argument | Description | Default |
---|---|---|
-i, --input_path | Path to the input WSI file | None |
-o, --output_path | Path to save the output tiles | None |

Argument | Description | Default |
---|---|---|
-s, --desired_size | Desired size of the tiles (in pixels) | 256 |
-m, --desired_magnification | Desired magnification level (e.g., 20x) | 20 |
-ov, --overlap | Factor of overlap between tiles | 1 (no overlap) |

Overlap example: With a size of 256 and an overlap of 2, tiles would overlap by 128 pixels.
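
The overlap arithmetic above can be sketched in a few lines. This is an illustration only: it assumes the step between tiles is `desired_size // overlap`, which matches the example above but is not taken from SlideLab's source, and the function names are hypothetical.

```python
def tile_stride(size: int, overlap: int) -> int:
    """Step in pixels between consecutive tile origins (assumed: size // overlap)."""
    if overlap < 1:
        raise ValueError("overlap factor must be >= 1")
    return size // overlap


def overlap_pixels(size: int, overlap: int) -> int:
    """Pixels shared by two neighbouring tiles along one axis."""
    return size - tile_stride(size, overlap)


def tile_origins(extent: int, size: int, overlap: int) -> list[int]:
    """Origins of all tiles that fit fully inside `extent` pixels along one axis."""
    stride = tile_stride(size, overlap)
    return list(range(0, extent - size + 1, stride))


print(overlap_pixels(256, 2))      # 128
print(overlap_pixels(256, 1))      # 0
print(tile_origins(1024, 256, 2))  # [0, 128, 256, 384, 512, 640, 768]
```

Under this assumption, an overlap factor of 1 gives a stride equal to the tile size, so neighbouring tiles share no pixels.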
Argument | Description | Default |
---|---|---|
-rb, --remove_blurry_tiles | Remove blurry tiles using a Laplacian filter | False |
-n, --normalize_staining | Normalize staining of the tiles | False |
-e, --encode | Encode tiles into an .h5 file | False |
--extract_high_quality | Extract features for high quality heatmaps | False |
--augmentations | Get various augmentations for encoded tiles for model training | 0 |
--reconstruct_slide | Reconstruct slide to see included tile sections vs excluded | False |

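
Blur removal with a Laplacian filter is commonly implemented by computing the variance of the Laplacian of a grayscale tile and discarding tiles whose variance is low. The sketch below illustrates that idea with NumPy only. It is not SlideLab's implementation: the function names are hypothetical, the kernel is the standard 4-neighbour Laplacian, and the input is assumed to be a grayscale image scaled to [0, 1] (how SlideLab scales tiles before this check is not stated here).

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour discrete Laplacian over the image interior."""
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def is_blurry(gray: np.ndarray, blur_threshold: float = 0.015) -> bool:
    """Flag a tile as blurry when its Laplacian variance is below the threshold."""
    return laplacian_variance(gray) < blur_threshold


rng = np.random.default_rng(0)
sharp = rng.random((64, 64))   # high-frequency noise stands in for a sharp tile
flat = np.full((64, 64), 0.5)  # a constant image has no edges at all
print(is_blurry(sharp), is_blurry(flat))
```

A sharp tile has strong local intensity changes and therefore a high Laplacian variance, while a blurred or featureless tile has a variance close to zero.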
Argument | Description | Default |
---|---|---|
-th, --tissue_threshold | Minimum tissue content to consider a tile valid | 0.7 |
-bh, --blur_threshold | Threshold for Laplacian filter variance (blur detection) | 0.015 |
--red_pen_check | Sanity check for % of red pen detected. If above threshold, red_pen mask will be ignored | 0.4 |
--blue_pen_check | Sanity check for % of blue pen detected. If above threshold, blue_pen mask will be ignored | 0.4 |

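
The tissue threshold and the pen sanity checks can be read as simple fraction tests on binary masks. The sketch below is one way to express them, under assumptions not confirmed by the source: masks are boolean NumPy arrays, fractions are computed over the whole mask, and the comparisons use `>=` for tissue and `>` for pen. The function names are hypothetical.

```python
import numpy as np


def tile_is_valid(tissue_mask: np.ndarray, tissue_threshold: float = 0.7) -> bool:
    """Keep a tile when the tissue fraction of its boolean mask reaches the threshold."""
    return float(tissue_mask.mean()) >= tissue_threshold


def apply_pen_mask(
    tissue_mask: np.ndarray, pen_mask: np.ndarray, pen_check: float = 0.4
) -> np.ndarray:
    """Remove pen pixels from the tissue mask unless the pen mask is implausibly large."""
    if float(pen_mask.mean()) > pen_check:
        return tissue_mask  # sanity check failed: ignore the pen mask entirely
    return tissue_mask & ~pen_mask


tissue = np.ones((10, 10), dtype=bool)

small_pen = np.zeros((10, 10), dtype=bool)
small_pen[0, :] = True                           # 10% of pixels flagged as pen
print(apply_pen_mask(tissue, small_pen).sum())   # 90

large_pen = np.zeros((10, 10), dtype=bool)
large_pen[:5, :] = True                          # 50% flagged: likely a false detection
print(apply_pen_mask(tissue, large_pen).sum())   # 100
```

The sanity check guards against a pen detector that fires on real tissue: when the detected pen area is larger than the check value, the mask is treated as unreliable and skipped.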
Argument | Description | Default |
---|---|---|
--device | Specify device (e.g., GPU/CPU) | None (will use the GPU if available) |
--cpu_processes | Number of CPU processes to use | os.cpu_count() |
--batch_size | Batch size (number of tiles processed per batch) | 16. If using augmentations, batch_size will be recalculated using # of augmentations/batch size |

Argument | Description | Default |
---|---|---|
--min_tiles | Minimum number of valid tiles for a sample to be counted as "valid". Will create an additional filtered sample metadata file. | 0 |

python SlidePreprocessing.py -i /path/to/input/ -o /path/to/output/ \
-s 512 -m 40 --remove_blurry_tiles --normalize_staining --encode \
-th 0.8 -bh 0.02 --device cuda --batch_size 256
Please refer to the tutorials folder!
For questions or requests, please open an issue or email aolmomarchal@ucsd.edu with the subject "SlideLab: [insert question]".
1. M. Macenko et al., "A method for normalizing histology slides for quantitative analysis," 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Boston, MA, USA, 2009, pp. 1107-1110, doi: 10.1109/ISBI.2009.5193250.
2. Barbano, C. A., & Pedersen, A. (2022, August). EIDOSLAB/torchstain: v1.2.0-stable (Version v1.2.0-stable) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.6979540
3. Chen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., et al. Towards a general-purpose foundation model for computational pathology. Nat Med (2024). https://doi.org/10.1038/s41591-024-02857-3
4. B. A. Schreiber, J. Denholm, F. Jaeckle, M. J. Arends, K. M. Branson, C.-B. Schönlieb, and E. J. Soilleux. Bang and the artefacts are gone! Rapid artefact removal and tissue segmentation in haematoxylin and eosin stained biopsies, 2023. URL http://arxiv.org/abs/2308.13304.