ADT-Net: Adaptive Transformation-Driven Text-Based Person Search Network for Enhancing Cross-Modal Retrieval Robustness

Abstract

Text-based person search aims to retrieve person images that match a given textual description. The central challenge is mapping images and textual descriptions into a unified semantic space. This paper introduces ADT-Net, a framework designed to address the excessive intra-class variance and insufficient inter-class variance caused by lighting variations. ADT-Net comprises two key modules: Invariant Representation Learning (IRL), which employs style transfer strategies and multi-scale alignment techniques to learn visually invariant features, and Dynamic Matching Alignment (DMA), which introduces nonlinear transformations and learnable dynamic temperature parameters to optimize the prediction distribution. Experimental results on multiple benchmark datasets demonstrate that ADT-Net outperforms current mainstream baseline methods in both retrieval accuracy and generalization ability. The results show that the proposed method makes cross-modal person retrieval more robust, particularly under varying lighting conditions and shooting angles.
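To make the role of a temperature parameter concrete, here is a minimal pure-Python sketch of a temperature-scaled softmax over image-text similarity scores. It is not taken from the ADT-Net code: the function name `match_distribution`, the similarity values, and the two temperatures are illustrative assumptions, and in DMA the temperature is described as learnable rather than fixed by hand as it is here.

```python
import math
from typing import List


def match_distribution(similarities: List[float], temperature: float) -> List[float]:
    """Turn similarity scores into matching probabilities.

    Hypothetical helper (not from the ADT-Net repository): divides each
    score by `temperature`, then applies a numerically stable softmax.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up cosine similarities between one text query and three gallery images.
sims = [0.62, 0.55, 0.10]

soft = match_distribution(sims, temperature=1.0)    # high temperature: flatter distribution
sharp = match_distribution(sims, temperature=0.02)  # low temperature: mass moves to the best match

print([round(p, 3) for p in soft])
print([round(p, 3) for p in sharp])
```

A smaller temperature concentrates probability on the highest-scoring image, while a larger one spreads it across candidates. Letting the model learn this value is one way to adapt how sharp the prediction distribution should be.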

Usage

Requirements

torch: 1.13.1
torchvision: 0.14.1
transformers: 4.46.3

Prepare Datasets

Download the CUHK-PEDES dataset from here, the ICFG-PEDES dataset from here, and the RSTPReid dataset from here.
Organize them in your dataset root directory as follows:

|-- data/
|   |-- <CUHK-PEDES>/
|       |-- imgs
|           |-- cam_a
|           |-- cam_b
|           |-- ...
|       |-- reid_raw.json
|
|   |-- <ICFG-PEDES>/
|       |-- imgs
|           |-- test
|           |-- train
|       |-- ICFG-PEDES.json
|
|   |-- <RSTPReid>/
|       |-- imgs
|       |-- data_captions.json
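Before training, it can help to confirm that the folders match the tree above. The following is a small standard-library sketch, not part of the repository: the helper `find_missing`, the `EXPECTED_LAYOUT` table, and the `data` root are assumptions, and the folder names assume the `<...>` placeholders are replaced by the dataset names themselves.

```python
from pathlib import Path
from typing import Dict, List

# Expected entries per dataset, copied from the directory tree above.
# Folder names are an assumption; adjust them if your copies are named differently.
EXPECTED_LAYOUT: Dict[str, List[str]] = {
    "CUHK-PEDES": ["imgs", "reid_raw.json"],
    "ICFG-PEDES": ["imgs/train", "imgs/test", "ICFG-PEDES.json"],
    "RSTPReid": ["imgs", "data_captions.json"],
}


def find_missing(root: str) -> List[str]:
    """Return every expected path (relative to `root`) that does not exist."""
    base = Path(root)
    missing = []
    for dataset, entries in EXPECTED_LAYOUT.items():
        for entry in entries:
            path = base / dataset / entry
            if not path.exists():
                missing.append(path.relative_to(base).as_posix())
    return missing


if __name__ == "__main__":
    problems = find_missing("data")  # hypothetical dataset root
    if problems:
        print("Missing dataset entries:")
        for item in problems:
            print("  " + item)
    else:
        print("Dataset layout looks complete.")
```

The check only tests that the paths exist. It does not validate the JSON annotation files or count images.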

Installation Environment

conda create -n adtnet python=3.8

conda activate adtnet

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

pip install -r requirements.txt

Training

python train.py
