A robust and reproducible framework for benchmarking deep learning models on Breast Ultrasound (BUS) image segmentation, designed to ensure fair evaluation and prevent test set leakage.
The core philosophy of this project is to establish a fair and unbiased evaluation pipeline, addressing common pitfalls in machine learning research. By strictly separating cross-validation from final testing, this framework produces publication-ready results that accurately reflect a model's true generalization performance.
- Our Vision
- Key Features
- Supported Models & Datasets
- Project Structure
- Installation
- Running Experiments
- Understanding the Results
- How to Extend
- Acknowledgements
- License & Citation
While research in medical image segmentation is advancing rapidly, fair model comparison remains a significant challenge due to:
- Data Leakage: The test set is often inadvertently used during cross-validation, leading to inflated and unreliable performance metrics.
- Inconsistent Evaluation: Different studies use different data splits, preprocessing steps, and evaluation metrics, making direct comparisons impossible.
- Limited Scope: Most benchmarks focus only on binary (benign vs. malignant) classification, failing to incorporate the 'normal' class, which is crucial for real-world clinical applications.
Awesome-BUS-Benchmark is engineered to solve these problems. We provide a standardized benchmark built on the principles of strict data separation and stratified sampling to ensure that all models are evaluated under the exact same conditions, leading to truly comparable and reproducible results.
- 🥇 Strict Data Splitting: A dedicated, held-out test set is created once before any training or cross-validation begins. This test set is never seen during model development or selection, preventing any form of data snooping and ensuring a truly unbiased final evaluation of generalization performance.
- ⚖️ Stratified K-Fold Cross-Validation: Implements Stratified K-Fold to handle the inherent class imbalance in BUS datasets (e.g., the small number of normal cases). This ensures that each fold's class distribution is representative of the overall dataset, leading to more stable training and reliable validation metrics (a minimal splitting sketch follows this list).
- 📚 Comprehensive Dataset Support: Natively supports multiple public breast ultrasound datasets and, crucially, includes the normal class, offering a more complete and realistic benchmark than typical binary (benign vs. malignant) studies.
- 🧩 Modular & Extensible Architecture: The code is structured to be highly modular. You can easily add new datasets, models (both CNN- and Transformer-based), loss functions, and metrics with minimal code changes.
- ⚙️ Automated & Configurable Pipelines: Shell scripts (run_cnn.sh, run_vit.sh) automate the entire workflow: k-fold training, testing, and results aggregation. All experiment parameters (model choice, learning rate, epochs, etc.) are controlled via central YAML configuration files, allowing for rapid and reproducible experiments.
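To make the splitting strategy concrete, here is a minimal sketch using scikit-learn's `train_test_split` and `StratifiedKFold`: the held-out test set is carved out once, and only the remaining data is used for stratified cross-validation. This is an illustration under assumed class labels (benign/malignant/normal), not the repository's actual implementation, which lives in data/prepare_datasets.py.

```python
from sklearn.model_selection import StratifiedKFold, train_test_split

def make_splits(image_ids, labels, test_size=0.2, n_folds=5, seed=42):
    """Carve out a stratified held-out test set once, then build stratified
    CV folds on the remaining data. Illustrative sketch only."""
    # 1) Held-out test set, created before any training or cross-validation.
    dev_ids, test_ids, dev_labels, _ = train_test_split(
        image_ids, labels, test_size=test_size, stratify=labels, random_state=seed
    )
    # 2) Stratified K-Fold over the development portion only.
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    folds = [
        ([dev_ids[i] for i in train_idx], [dev_ids[i] for i in val_idx])
        for train_idx, val_idx in skf.split(dev_ids, dev_labels)
    ]
    return folds, test_ids
```

Because the test IDs are produced before any fold is formed, no fold-trained model can see them during development.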
| Model | Original code | Reference |
|---|---|---|
| U-Net | Caffe | MICCAI'15 |
| Attention U-Net | PyTorch | MIDL'18 |
| U-Net++ | PyTorch | MICCAI'18 |
| U-Net 3+ | PyTorch | ICASSP'20 |
| TransUNet | PyTorch | arXiv'21 |
| MedT | PyTorch | MICCAI'21 |
| UNeXt | PyTorch | MICCAI'22 |
| SwinUnet | PyTorch | ECCV'22 |
| CMU-Net | PyTorch | ISBI'23 |
| CMUNeXt | PyTorch | ISBI'24 |
| U-KAN | PyTorch | AAAI'25 |
The repository is organized logically to separate concerns and facilitate ease of use and extension.
Awesome_Segmentation_in_Medical/
│
├── data/
│ ├── preprocessing/
│ │ ├── augmentation.py # Data augmentation logic (rotations, flips, etc.)
│ │ └── preprocess.py # Data preprocessing logic (resizing, normalization, etc.)
│ ├── prepare_datasets.py # Script to standardize raw datasets and create CSVs with fold splits
│ └── synthetic_datasets.py # Script for synthetic data generation (optional)
│
├── data_loader/
│ └── data_loaders.py # Defines PyTorch DataLoaders for training/validation/testing
│
├── datasets/ # (User-supplied) Directory to store raw downloaded datasets
│
├── src/
│ ├── models/ # PyTorch model architecture definitions
│ │ ├── cnn_based/ # --- CNN-based models like UNet, AttUNet, UNet++, UNeXt, CMUNet, U-KAN
│ │ └── ViT_based/ # --- Transformer-based models like TransUnet, MedT, Swin-Unet
│ │
│ ├── trainer/
│ │ └── trainer.py # Core training and validation loop logic (epochs, backprop, etc.)
│ │
│ └── utils/ # Core utilities and helper functions
│ ├── LovaszSoftmax/pytorch/ # --- The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
│ │ └── lovasz_losses.py # --- Standalone PyTorch implementation of the Lovász hinge and Lovász-Softmax for the Jaccard index
│ ├── losses.py # --- Loss functions for segmentation (DiceLoss, BCELoss, etc.)
│ ├── metrics.py # --- Evaluation metrics (Dice, IoU, HD95, etc.)
│ ├── parse_config.py # --- Functionality to read and parse the config.json file
│ └── util.py # --- Other useful functions, such as logging
│
├── results/ # Stores all experiment outputs, including CSVs with metrics per fold
│
├── .gitignore # List of files to be ignored by Git
├── config.json # Central configuration file to control all experiments (models, hyperparameters, etc.)
├── environment.yml # File for Conda environment setup
│
├── run_cnn.sh # Entrypoint script to run CNN-based model experiments
├── run_vit.sh # Entrypoint script to run Transformer-based model experiments
├── run_transfer.sh # Entrypoint script for transfer learning experiments
│
├── train.py # Main executable file to start model training
└── test.py # Main executable file to test a trained model
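For orientation only, the sketch below shows the kind of Dice and IoU computations that metrics.py is described as providing; the function names and signatures here are assumptions for illustration, not the repository's actual API.

```python
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    """Dice score for binary masks with values in {0, 1}. Illustrative only."""
    pred, target = pred.float(), target.float()
    intersection = (pred * target).sum()
    return ((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)).item()

def iou_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    """Intersection-over-Union (Jaccard) for binary masks. Illustrative only."""
    pred, target = pred.float(), target.float()
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return ((intersection + eps) / (union + eps)).item()
```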
- Clone the repository

  git clone https://github.com/Jonghwan-dev/Awesome_Segmentation_in_Medical.git
  cd Awesome_Segmentation_in_Medical

- Create the Conda environment (recommended)

  conda env create -f environment.yml
  conda activate awesome_seg
- Download datasets

  - Create a datasets/ folder in the project root.
  - For each public dataset (BUSI, BUSBRA, BUS-UC, BUS-UCLM, Yap2018), follow the original authors' instructions to place images and masks under:

    datasets/
    ├── BUSI/
    │   ├── images/
    │   └── masks/
    └── BUSBRA/
        ├── images/
        └── masks/

- Standardize the raw data and generate CSV manifests with fold splits:

  python -c "from data.prepare_datasets import PrepareDataset; PrepareDataset().run(['busi','busbra','bus_uc','bus_uclm','yap'])"

  This produces:
- All dataset information is managed in CSV files located in the data/csv/ directory.
- Each file (e.g., busi.csv, busbra.csv) corresponds to a single dataset.
- Data splits for training and testing are defined by a split column directly within these files (see the loading sketch below).
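As a rough illustration of how these manifests might be consumed, the sketch below filters one CSV by its split column. The exact column names and split labels ("split", "test", fold indices) are assumptions and may differ from the repository's actual schema.

```python
import pandas as pd

# Load the manifest for one dataset (path follows the layout described above).
df = pd.read_csv("data/csv/busi.csv")

# Select held-out test rows vs. development rows via the split column.
# The label "test" is a hypothetical placeholder.
test_df = df[df["split"] == "test"]
dev_df = df[df["split"] != "test"]

print(f"{len(dev_df)} development samples, {len(test_df)} held-out test samples")
```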
chmod u+x ./run_cnn.sh
bash ./run_cnn.sh

Performs k-fold cross-validation for UNet, AttUNet, UNet++, and UNet 3+, then evaluates each fold's model on the held-out test set.

chmod u+x ./run_vit.sh
bash ./run_vit.sh

Trains TransUNet, Swin-Unet, and MedT with the same splits.
chmod u+x ./run_transfer.sh
bash ./run_transfer.sh

Fine-tunes CNN or Transformer backbones pretrained on natural images.
All logs and per-fold metrics are saved under results/.
The performance metrics reported in the tables represent the mean ± standard deviation derived from a 5-fold cross-validation process. The evaluation methodology is as follows:
- A dedicated, held-out test set is created and separated before any training begins.
- The remaining data is used for 5-fold cross-validation, which results in 5 independently trained models.
- Each of these 5 models is then individually evaluated on the entire held-out test set. This yields 5 separate performance scores for each metric (e.g., 5 Dice scores, 5 IoU scores).
- The final value reported in the tables (e.g., Dice: 0.7095 ± 0.0300) is the mean and standard deviation of these 5 scores (see the aggregation sketch below).
This method effectively demonstrates the model's stability and generalization performance across different training data subsets.
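A minimal sketch of this aggregation step, assuming the five per-fold test scores have already been collected (the example values are made up, and the use of the sample standard deviation is an assumption):

```python
import numpy as np

# Hypothetical Dice scores of the 5 fold-models on the held-out test set.
fold_dice = np.array([0.71, 0.74, 0.68, 0.72, 0.70])

mean = fold_dice.mean()
std = fold_dice.std(ddof=1)  # sample standard deviation across the 5 folds
print(f"Dice: {mean:.4f} ± {std:.4f}")
```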
TBD (To Be Determined): Some models are currently under training/validation. Results will be updated upon completion.
| Model | Dice (DSC) | IoU | HD95 | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| UNet | 0.7095 ± 0.0300 | 0.6290 ± 0.0346 | 36.6824 ± 9.7886 | 50.11 | 34.53 |
| AttUNet | 0.7400 ± 0.0257 | 0.6631 ± 0.0270 | 37.5227 ± 3.6763 | 50.96 | 34.88 |
| UNet++ | 0.7307 ± 0.0220 | 0.6545 ± 0.0230 | 38.3222 ± 7.5356 | 28.73 | 26.90 |
| UNet 3+ | 0.7194 ± 0.0268 | 0.6402 ± 0.0303 | 34.5574 ± 6.3702 | 152.87 | 26.97 |
| UNeXt | 0.6955 ± 0.0305 | 0.6150 ± 0.0322 | 40.1467 ± 6.0638 | 0.42 | 1.47 |
| CMUNet | 0.6913 ± 0.0223 | 0.6129 ± 0.0223 | 41.1387 ± 4.6279 | 69.81 | 49.93 |
| CMUNeXt | 0.7217 ± 0.0092 | 0.6439 ± 0.0092 | 35.5400 ± 6.5903 | 5.66 | 3.15 |
| U-KAN | 0.7427 ± 0.0086 | 0.6689 ± 0.0105 | 37.5375 ± 6.3250 | 5.25 | 9.38 |
| TransUnet | 0.7226 ± 0.0166 | 0.6412 ± 0.0183 | 32.3411 ± 3.4289 | 75.17 | 179.07 |
| MedT | 0.5759 ± 0.0435 | 0.4900 ± 0.0461 | 53.7967 ± 9.0494 | 4.33 | 1.13 |
| SwinUnet | TBD | TBD | TBD | TBD | TBD |
| Model | Dice (DSC) | IoU | HD95 | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| UNet | 0.8905 ± 0.0068 | 0.8189 ± 0.0077 | 16.8561 ± 0.9736 | 50.11 | 34.53 |
| AttUNet | 0.8948 ± 0.0021 | 0.8234 ± 0.0027 | 16.4403 ± 0.6161 | 50.96 | 34.88 |
| UNet++ | 0.9113 ± 0.0075 | 0.8463 ± 0.0099 | 10.8447 ± 1.8236 | 28.73 | 26.90 |
| UNet 3+ | 0.8900 ± 0.0060 | 0.8181 ± 0.0070 | 16.3550 ± 0.8623 | 152.87 | 26.97 |
| UNeXt | 0.8921 ± 0.0153 | 0.8192 ± 0.0182 | 13.8724 ± 1.8751 | 0.42 | 1.47 |
| CMUNet | 0.8983 ± 0.0132 | 0.8271 ± 0.0181 | 11.8369 ± 2.4165 | 69.81 | 49.93 |
| CMUNeXt | 0.9049 ± 0.0025 | 0.8371 ± 0.0037 | 12.5182 ± 0.6259 | 5.66 | 3.15 |
| U-KAN | 0.8978 ± 0.0045 | 0.8282 ± 0.0057 | 12.9284 ± 1.4239 | 5.25 | 9.38 |
| TransUnet | TBD | TBD | TBD | TBD | TBD |
| MedT | 0.8703 ± 0.0046 | 0.7850 ± 0.0045 | 15.8071 ± 1.2704 | 4.33 | 1.13 |
| SwinUnet | TBD | TBD | TBD | TBD | TBD |
| Model | Dice (DSC) | IoU | HD95 | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| UNet | 0.7411 ± 0.0211 | 0.7115 ± 0.0196 | 42.8127 ± 3.2033 | 50.11 | 34.53 |
| AttUNet | 0.7264 ± 0.0444 | 0.6961 ± 0.0468 | 44.5037 ± 4.6944 | 50.96 | 34.88 |
| UNet++ | 0.8214 ± 0.0177 | 0.7908 ± 0.0187 | 25.3912 ± 3.8422 | 28.73 | 26.90 |
| UNet 3+ | 0.7864 ± 0.0336 | 0.7550 ± 0.0332 | 29.9108 ± 8.2026 | 152.87 | 26.97 |
| UNeXt | 0.7744 ± 0.0219 | 0.7434 ± 0.0233 | 33.6618 ± 3.4316 | 0.42 | 1.47 |
| CMUNet | 0.7625 ± 0.0248 | 0.7321 ± 0.0270 | 38.7341 ± 6.3967 | 69.81 | 49.93 |
| CMUNeXt | 0.7821 ± 0.0220 | 0.7490 ± 0.0226 | 34.9975 ± 4.9092 | 5.66 | 3.15 |
| U-KAN | 0.7540 ± 0.0912 | 0.7266 ± 0.0939 | 47.3326 ± 13.5931 | 5.25 | 9.38 |
| TransUnet | 0.7675 ± 0.0221 | 0.7352 ± 0.0219 | 34.6927 ± 7.3986 | 75.17 | 179.07 |
| MedT | TBD | TBD | TBD | TBD | TBD |
| SwinUnet | TBD | TBD | TBD | TBD | TBD |
| Model | Dice (DSC) | IoU | HD95 | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| UNet | 0.8704 ± 0.0063 | 0.7958 ± 0.0080 | 12.8747 ± 1.4224 | 50.11 | 34.53 |
| AttUNet | 0.8748 ± 0.0061 | 0.8021 ± 0.0074 | 11.5454 ± 1.0852 | 50.96 | 34.88 |
| UNet++ | 0.8789 ± 0.0024 | 0.8064 ± 0.0031 | 10.6655 ± 0.6798 | 28.73 | 26.90 |
| UNet 3+ | 0.8787 ± 0.0054 | 0.8053 ± 0.0064 | 11.5249 ± 1.0132 | 152.87 | 26.97 |
| UNeXt | 0.8562 ± 0.0064 | 0.7751 ± 0.0064 | 13.7112 ± 2.1483 | 0.42 | 1.47 |
| CMUNet | 0.8705 ± 0.0050 | 0.7950 ± 0.0061 | 11.2522 ± 1.0243 | 69.81 | 49.93 |
| CMUNeXt | 0.8756 ± 0.0043 | 0.8013 ± 0.0048 | 11.3662 ± 1.2649 | 5.66 | 3.15 |
| U-KAN | 0.8797 ± 0.0064 | 0.8049 ± 0.0065 | 10.1306 ± 0.8699 | 5.25 | 9.38 |
| TransUnet | TBD | TBD | TBD | TBD | TBD |
| MedT | 0.8151 ± 0.0053 | 0.7157 ± 0.0071 | 15.5866 ± 0.6548 | 4.33 | 1.13 |
| SwinUnet | TBD | TBD | TBD | TBD | TBD |
| Model | Dice (DSC) | IoU | HD95 | GFLOPs | Params (M) |
|---|---|---|---|---|---|
| UNet | 0.5451 ± 0.0530 | 0.4506 ± 0.0448 | 62.3235 ± 21.4890 | 50.11 | 34.53 |
| AttUNet | 0.6072 ± 0.0411 | 0.5057 ± 0.0357 | 45.5852 ± 26.4043 | 50.96 | 34.88 |
| UNet++ | 0.5769 ± 0.0541 | 0.4882 ± 0.0534 | 53.9371 ± 21.8446 | 28.73 | 26.90 |
| UNet 3+ | 0.6294 ± 0.0263 | 0.5329 ± 0.0274 | 32.0159 ± 11.0018 | 152.87 | 26.97 |
| UNeXt | 0.5403 ± 0.0315 | 0.4455 ± 0.0317 | 58.4016 ± 10.0436 | 0.42 | 1.47 |
| CMUNet | 0.5511 ± 0.0295 | 0.4524 ± 0.0327 | 55.6650 ± 13.0657 | 69.81 | 49.93 |
| CMUNeXt | 0.6313 ± 0.0463 | 0.5446 ± 0.0466 | 53.3332 ± 17.7294 | 5.66 | 3.15 |
| U-KAN | 0.6083 ± 0.0410 | 0.5130 ± 0.0542 | 43.6525 ± 9.3918 | 5.25 | 9.38 |
| TransUnet | 0.6715 ± 0.0581 | 0.5806 ± 0.0582 | 35.7003 ± 11.6601 | 75.17 | 179.07 |
| MedT | TBD | TBD | TBD | TBD | TBD |
| SwinUnet | TBD | TBD | TBD | TBD | TBD |
- Place your model's .py file in src/models/cnn_based/ or src/models/ViT_based/.
- Import your model in the corresponding __init__.py file.
- Add the new model's name (as a string) to the model_name list in the relevant configs/*.yml file (a hypothetical registration is sketched below).
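A hypothetical registration could look like the sketch below; MyNewModel, the MODEL_REGISTRY dictionary, and build_model are illustrative assumptions, not the repository's actual mechanism.

```python
# src/models/cnn_based/__init__.py (illustrative; the real file may differ)
from .unet import UNet
from .my_new_model import MyNewModel  # your new architecture

# Hypothetical name-to-class mapping used when resolving the model_name
# string from the configuration file.
MODEL_REGISTRY = {
    "UNet": UNet,
    "MyNewModel": MyNewModel,
}

def build_model(name: str, **kwargs):
    """Instantiate a model by the string name given in the config."""
    return MODEL_REGISTRY[name](**kwargs)
```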
- Add the raw dataset folder to the datasets/ directory.
- In data/prepare_datasets.py, add a new preparation method (e.g., _prepare_mynew_dataset) inside the PrepareDataset class.
- Register your new method in the dispatcher dictionary within the run method (see the sketch below).
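A minimal sketch of that registration, where only PrepareDataset, run, and _prepare_mynew_dataset come from the steps above and everything else is an illustrative assumption:

```python
# data/prepare_datasets.py (illustrative sketch, not the actual implementation)
class PrepareDataset:
    def _prepare_busi(self):
        """Existing preparation routine, stubbed here for illustration."""
        print("preparing BUSI ...")

    def _prepare_mynew_dataset(self):
        """New routine: standardize images/masks and write a CSV manifest."""
        print("preparing MyNewDataset ...")

    def run(self, names):
        # Dispatcher mapping dataset keys to their preparation methods.
        dispatch = {
            "busi": self._prepare_busi,
            "mynew": self._prepare_mynew_dataset,  # <-- register the new method here
        }
        for name in names:
            dispatch[name]()

# Usage: PrepareDataset().run(["busi", "mynew"])
```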
- 📚 Papers: Key papers and review materials
- 🗂️ Datasets: Public BUS datasets and download information
- 🤖 Models: Implemented segmentation model architectures
- 📈 Metrics & Evaluation: Main evaluation metrics (Dice, IoU, HD95, etc.)
- ⚙️ Tools & Scripts: Automation scripts and utility functions
- 💡 Examples & Demo: Example code, GIF demonstrations, and usage guides
- 🏆 Benchmarks & Leaderboard: Model performance comparison tables and rankings
- ❓ FAQ: Frequently asked questions and troubleshooting tips
- This project's structure and methodology are heavily inspired by Medical-Image-Segmentation-Benchmarks.
- Helper functions from CMU-Net and Image_Segmentation were also utilized. We extend our gratitude to the authors of these repositories for making their excellent work public.
This project is released under the MIT License. See LICENSE for details. If you use this benchmark in your research, please consider citing it:
@misc{awesomebusbenchmark2024,
  author       = {Jonghwan Kim},
  title        = {Awesome Segmentation in Medical Imaging: A Breast Ultrasound Benchmark},
  year         = {2025},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/Jonghwan-dev/Awesome_Segmentation_in_Medical}},
}