
[TMM2025] Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation


zwyang6/UniA


News

  • Our UniA is accepted by TMM 2025.
  • All code, logs, and checkpoints are available now🔥🔥🔥
  • If you have any questions, please feel free to open an issue or contact us at zwyang21@m.fudan.edu.cn.

Overview

UniA pipeline

Weakly supervised semantic segmentation (WSSS) with image-level labels aims to achieve dense predictions without laborious annotations. However, due to ambiguous contexts and fuzzy regions, the performance of WSSS is widely hindered by ambiguity, particularly when generating Class Activation Maps (CAMs) and refining pseudo masks. Nevertheless, the issue has received little attention in previous literature. In this work, we propose UniA, a unified single-staged WSSS framework, to efficiently tackle this issue from the perspectives of uncertainty inference and affinity diversification. When activating class objects, we argue that false activation stems from a bias toward ambiguous regions during feature extraction. Therefore, we formulate a robust feature representation with a Gaussian distribution and introduce uncertainty estimation to avoid the bias. A distribution loss is proposed to supervise the process, which effectively captures the ambiguity and models the complex dependencies among features. When refining pseudo labels, we observe that the affinity from the prevailing refinement methods tends to be overly similar among ambiguities. To this end, we design an affinity diversification module to promote diversity among semantics. A mutual complementing refinement is first proposed to statically rectify the ambiguous affinity with multiple inferred pseudo labels. Then a contrastive affinity loss is further designed to dynamically diversify the relations among unrelated semantics. It stably propagates the diversity into the feature representation and helps generate better pseudo masks. Extensive experiments are conducted on PASCAL VOC, MS COCO, and medical ACDC datasets, which validate the efficiency of UniA in tackling ambiguity and its superiority over recent single-staged or even most multi-staged competitors.
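To make the idea of a Gaussian feature representation concrete, here is a minimal sketch in plain Python. It is not the UniA implementation: the overview does not specify the form of the distribution loss, so the KL term toward a standard normal used here is only a stand-in that is common in probabilistic embeddings, and the function names `sample_feature` and `kl_to_standard_normal` are hypothetical.

```python
import math
import random


def sample_feature(mean, log_var, rng):
    """Draw one sample z = mean + sigma * eps with eps ~ N(0, 1).

    The feature is treated as a diagonal Gaussian; exp(log_var) is the
    per-dimension variance, which plays the role of the uncertainty.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Assumption: used here only as an illustrative regularizer, not as the
    distribution loss described in the paper.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


rng = random.Random(0)
mean = [0.5, -1.0, 2.0]        # hypothetical per-dimension means
log_var = [0.0, -2.0, 1.0]     # hypothetical per-dimension log-variances
z = sample_feature(mean, log_var, rng)
print(len(z), kl_to_standard_normal(mean, log_var))
```

Dimensions with a large variance yield samples that spread widely around the mean, which is one way a model can express that a region is ambiguous.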

Data Preparation

PASCAL VOC 2012

1. Download

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar

2. Segmentation Labels

The augmented annotations come from the SBD dataset and can be downloaded from DropBox. After downloading SegmentationClassAug.zip, unzip it and move the extracted folder to VOCdevkit/VOC2012/.

VOCdevkit/
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
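
Before training, it can help to confirm that the layout above is in place. The following is a small sketch, not part of the repository: the helper name `missing_voc_dirs` and the default root `VOCdevkit` are assumptions for illustration.

```python
from pathlib import Path

# Sub-directories listed in the tree above.
EXPECTED_VOC_DIRS = [
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationClassAug",
    "SegmentationObject",
]


def missing_voc_dirs(devkit_root):
    """Return the expected VOC2012 sub-directories that are absent under devkit_root."""
    voc_root = Path(devkit_root) / "VOC2012"
    return [name for name in EXPECTED_VOC_DIRS if not (voc_root / name).is_dir()]


if __name__ == "__main__":
    # "VOCdevkit" is an assumed location; point this at your own copy.
    missing = missing_voc_dirs("VOCdevkit")
    print("Layout OK" if not missing else f"Missing: {missing}")
```

An empty result means every directory from the tree was found; a missing SegmentationClassAug usually means the augmented annotations were not unzipped into VOC2012.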

MSCOCO 2014

1. Download

wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip

2. Segmentation Labels

To generate VOC style segmentation labels for COCO, you could use the scripts provided at this repo, or just download the generated masks from Google Drive.

COCO/
├── JPEGImages
│    ├── train2014
│    └── val2014
└── SegmentationClass
     ├── train2014
     └── val2014
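
A quick way to sanity-check the COCO layout is to look for images that have no matching mask. The sketch below assumes that masks share the file stem of their image and that images end in `.jpg` and masks in `.png`; neither is stated above, so adjust the extensions if your files differ. The helper name `unpaired_images` is hypothetical.

```python
from pathlib import Path


def unpaired_images(coco_root, split, image_ext=".jpg", mask_ext=".png"):
    """List image stems in JPEGImages/<split> with no mask in SegmentationClass/<split>."""
    root = Path(coco_root)
    image_stems = {p.stem for p in (root / "JPEGImages" / split).glob(f"*{image_ext}")}
    mask_stems = {p.stem for p in (root / "SegmentationClass" / split).glob(f"*{mask_ext}")}
    return sorted(image_stems - mask_stems)


if __name__ == "__main__":
    # "COCO" is an assumed root directory matching the tree above.
    for split in ("train2014", "val2014"):
        print(split, len(unpaired_images("COCO", split)), "images without a mask")
```

A non-zero count is not necessarily an error, but a count equal to the number of images suggests the masks were placed in the wrong directory.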

Requirement

Please refer to requirements.txt.

We incorporate a regularization loss for segmentation. Please refer to the instructions for this Python extension.

Train UniA

### train voc
bash run_train.sh scripts/train_voc.py [gpu_number] [master_port] [gpu_device] train_voc

### train coco
bash run_train.sh scripts/train_coco.py [gpu_numbers] [master_port] [gpu_devices] train_coco

Evaluate UniA

### eval voc
bash run_evaluate_seg_voc.sh tools/infer_seg_voc.py [gpu_device] [checkpoint_path]

### eval coco
bash run_evaluate_seg_coco.sh tools/infer_seg_coco.py [gpu_number] [master_port] [gpu_device] [checkpoint_path]
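
The training and evaluation commands above take their arguments by position, and the VOC evaluation script takes fewer of them than the COCO one. As a sketch of how the placeholders might be filled in, the helper below assembles the command strings. Every concrete value (2 GPUs, port 29501, devices "0,1", the checkpoint path) is a hypothetical placeholder, the comma-separated device format is an assumption, and `build_command` is not part of the repository.

```python
import shlex


def build_command(launcher, script, *args):
    """Join a launcher script, a Python entry point, and positional arguments into one shell command."""
    parts = ["bash", launcher, script, *[str(a) for a in args]]
    return " ".join(shlex.quote(p) for p in parts)


# Hypothetical values: replace them with your own GPU count, port, devices, and checkpoint.
train_voc = build_command("run_train.sh", "scripts/train_voc.py", 2, 29501, "0,1", "train_voc")
eval_voc = build_command("run_evaluate_seg_voc.sh", "tools/infer_seg_voc.py", "0", "path/to/checkpoint.pth")
eval_coco = build_command("run_evaluate_seg_coco.sh", "tools/infer_seg_coco.py", 2, 29501, "0,1", "path/to/checkpoint.pth")

for command in (train_voc, eval_voc, eval_coco):
    print(command)
```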

Main Results

  • Quantitative Results

Semantic segmentation performance on VOC and COCO. Logs are available now.

| Dataset | Backbone | Val | Test | Log | Weight |
|---|---|---|---|---|---|
| PASCAL VOC | ViT-B | 74.1 | 73.6 | log | checkpoints |
| MS COCO | ViT-B | 43.2 | - | log | checkpoints |

  • Qualitative Results

UniA results

Citation

Please cite our work if you find it helpful to your research. 💕

@article{yang2024tackling,
  title={Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation},
  author={Yang, Zhiwei and Meng, Yucong and Fu, Kexue and Wang, Shuo and Song, Zhijian},
  journal={arXiv preprint arXiv:2404.08195},
  year={2024}
}
