This is the official implementation of our paper:
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation
Yasser Benigmim, Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Raoul de Charette
- 🎯 TL;DR
- ⚙️ Installation
- 🗂️ Data Preparation
- 🔍 Evaluating Models
- 🏆 Computing Template Rankings
- 📈 Results
- 📚 Citation
## 🎯 TL;DR

Key observation: For each class, there exist individual templates (colored dots) that perform better than the conventional averaged classifier using all 80 templates.
FLOSS challenges the conventional wisdom of using averaged templates in CLIP-based OVSS models by:
- Identifying expert templates for each class that outperform averaged classifiers
- Using unsupervised metrics (e.g., entropy) to select these experts without labels (see the sketch after this list)
- Introducing a novel fusion scheme to combine expert predictions
- Providing a plug-and-play solution requiring no training or labels
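For intuition, here is a minimal sketch of the entropy-based ranking idea. The shapes, names, and exact scoring below are illustrative assumptions, not the repository's API; the actual rankings are produced by `tools/test.py --mode compute_metric`, and the fusion scheme is described in the paper.

```python
import torch

def rank_templates_by_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Illustrative: rank templates per class by mean prediction entropy.

    logits: (T, N, C) -- T templates, N pixels, C classes.
    Returns a (C, T) tensor of template indices, most confident expert first.
    """
    probs = logits.softmax(dim=-1)                              # (T, N, C)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)   # (T, N)
    preds = probs.argmax(dim=-1)                                # (T, N)
    T, _, C = probs.shape
    scores = torch.full((C, T), float("inf"))
    for t in range(T):
        for c in range(C):
            mask = preds[t] == c          # pixels template t assigns to class c
            if mask.any():
                scores[c, t] = entropy[t][mask].mean()
    return scores.argsort(dim=1)          # lowest entropy first = expert template

# Toy example: 4 templates, 1000 pixels, 3 classes
print(rank_templates_by_entropy(torch.randn(4, 1000, 3)))
```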
## ⚙️ Installation

- Create and activate a conda environment:
conda create -n floss python=3.9
conda activate floss
- Install numpy and PyTorch with CUDA support:
# For CUDA 12.4 (tested version)
pip install numpy==1.26.4 torch==2.6.0 torchvision==0.21.0
- Install the remaining dependencies:
pip install -r requirements.txt
> [!NOTE]
> This code has been tested with CUDA 12.4. Make sure you have the appropriate CUDA version installed on your system.
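As a quick, optional sanity check that PyTorch sees your GPU:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```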
## 🗂️ Data Preparation

Download and process the datasets following the MMSegmentation guide. Place all datasets in `./data` with the following structure:
data/
├── cityscapes/
├── VOCdevkit/VOC2012/ # PASCAL VOC20
├── VOCdevkit/VOC2010/ # PASCAL Context 59 (CO59)
├── ade20k/
├── coco_stuff164k/
├── mapillary/
├── bdd10k/
└── acdc/ # Night, Fog, Rain, Snow conditions
After downloading, run these conversion scripts:
# COCO-Stuff: Convert annotations to semantic segmentation format
python tools/convert_datasets/coco_stuff164k.py data/coco_stuff164k/ -o data/coco_stuff164k/
# Cityscapes: Generate *labelTrainIds.png
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
# PASCAL Context: Convert annotations (requires Detail package)
python tools/convert_datasets/pascal_context.py data/VOCdevkit data/VOCdevkit/VOC2010/trainval_merged.json
# Mapillary: Convert to Cityscapes format and resize validation set
python tools/convert_datasets/mapillary2cityscape.py data/mapillary data/mapillary/cityscapes_trainIdLabel --train_id
python tools/convert_datasets/mapillary_resize.py data/mapillary/validation/images \
data/mapillary/cityscapes_trainIdLabel/val/label \
data/mapillary/half/val_img \
data/mapillary/half/val_label
## 🔍 Evaluating Models

Download the CLIP-DINOiser checkpoint and place it in `checkpoints/clip_dinoiser/`.
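For example (`<CHECKPOINT_URL>` is a hypothetical placeholder; use the actual link from the CLIP-DINOiser repository):

```bash
# <CHECKPOINT_URL> is a placeholder; get the real link from the CLIP-DINOiser repo
mkdir -p checkpoints/clip_dinoiser
wget -P checkpoints/clip_dinoiser <CHECKPOINT_URL>
```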
To evaluate the baseline models (without FLOSS):
# For CLIP-DINOiser
python ./tools/test.py configs/clipdinoiser.py --dataset DATASET
# For MaskCLIP
python ./tools/test.py configs/maskclip.py --dataset DATASET
# For NACLIP
python ./tools/eval_naclip.py --dataset DATASET
To evaluate with FLOSS (requires pre-computed rankings, available in the `rankings/` directory; see Computing Template Rankings to reproduce them):
# For CLIP-DINOiser + FLOSS
python ./tools/test.py configs/clipdinoiser.py --dataset DATASET --mode fusion
# For MaskCLIP + FLOSS
python ./tools/test.py configs/maskclip.py --dataset DATASET --mode fusion
# For NACLIP + FLOSS
python ./tools/eval_naclip.py --dataset DATASET --mode fusion
Replace `DATASET` with one of: `cityscapes`, `pascalvoc20`, `pascalco59`, `ade20k`, `cocostuff`, `mapillary`, `bdd10k`, `acdc_night`, `acdc_snow`, `acdc_rain`, or `acdc_fog`.
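For example, to evaluate CLIP-DINOiser + FLOSS on Cityscapes:

```bash
python ./tools/test.py configs/clipdinoiser.py --dataset cityscapes --mode fusion
```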
## 🏆 Computing Template Rankings

To reproduce the rankings:
# For CLIP-DINOiser on a regular size dataset
python ./tools/test.py configs/clipdinoiser.py --dataset DATASET --mode compute_metric --split train
# For CLIP-DINOiser on ADE20K (dividing the 80 templates into groups of 10)
python ./tools/test.py configs/clipdinoiser.py --dataset ade20k --mode compute_metric --split train --id-start 0 --id-end 9
python ./tools/test.py configs/clipdinoiser.py --dataset ade20k --mode compute_metric --split train --id-start 10 --id-end 19
python ./tools/test.py configs/clipdinoiser.py --dataset ade20k --mode compute_metric --split train --id-start 20 --id-end 29
# ... continue with remaining template ranges (30-39, 40-49, etc.)
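Equivalently, a small shell loop covers all eight groups (a sketch; each group can also be launched on its own GPU via `CUDA_VISIBLE_DEVICES` to parallelize, as noted in the tip below):

```bash
# Process the 80 templates in groups of 10 (sequential here;
# prefix each run with CUDA_VISIBLE_DEVICES=<gpu_id> to spread groups across GPUs)
for start in 0 10 20 30 40 50 60 70; do
  python ./tools/test.py configs/clipdinoiser.py --dataset ade20k \
    --mode compute_metric --split train --id-start $start --id-end $((start + 9))
done
```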
> [!TIP]
> When computing metrics, you can use the training set by specifying `--split train`. Be aware, however, that processing large datasets like ADE20K and COCO-Stuff requires significant computational resources, ideally a GPU with 40GB of memory 💾. To make this more manageable, process the templates in smaller batches with `--id-start` and `--id-end`: instead of running all 80 templates at once, run them in groups (e.g., templates 0-9, then 10-19, and so on). This also lets you parallelize the work across multiple processes, which helps for large-scale datasets.
## 📈 Results

All numbers are mIoU (%).

| Method | Cityscapes | VOC20 | CO59 | ADE20K | Stuff | Avg |
|---|---|---|---|---|---|---|
| MaskCLIP | 25.0 | 61.8 | 25.5 | 14.2 | 17.5 | 28.7 |
| + FLOSS | 25.8 | 61.8 | 26.2 | 14.9 | 17.8 | 29.3 |
| NACLIP | 35.5 | 79.7 | 35.2 | 17.4 | 23.3 | 38.2 |
| + FLOSS | 37.0 | 80.2 | 35.9 | 18.4 | 23.6 | 39.0 |
| CLIP-DINOiser | 31.1 | 80.9 | 35.9 | 20.0 | 24.6 | 38.5 |
| + FLOSS | 34.6 | 82.3 | 36.2 | 20.7 | 24.7 | 39.7 |
## 📚 Citation

If you find this work useful, please cite our paper:
@misc{benigmim2025flossfreelunchopenvocabulary,
      title={FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation},
      author={Yasser Benigmim and Mohammad Fahes and Tuan-Hung Vu and Andrei Bursuc and Raoul de Charette},
      year={2025},
      eprint={2504.10487},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.10487},
}
We would like to thank the authors of MMSegmentation, CLIP-DINOiser, and NACLIP for making their code publicly available. Our implementation builds upon their excellent work.