News | Usage | Citation | Acknowledgement
This is the official repository for the paper *TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification* (IROS 2025).
Abstract
Visual Place Recognition (VPR) is a crucial capability for long-term autonomous robots, enabling them to identify previously visited locations using visual information. However, existing methods remain limited in indoor settings due to the highly repetitive structures inherent in such environments. We observe that scene text typically appears in indoor spaces, serving to distinguish visually similar but different places. This inspires us to propose TextInPlace, a simple yet effective VPR framework that integrates Scene Text Spotting (STS) to mitigate visual perceptual ambiguity in repetitive indoor environments. Specifically, TextInPlace adopts a dual-branch architecture within a local parameter sharing network. The VPR branch employs attention-based aggregation to extract global descriptors for coarse-grained retrieval, while the STS branch utilizes a bridging text spotter to detect and recognize scene text. Finally, the discriminative text is filtered to compute text similarity and re-rank the top-K retrieved images. To bridge the gap between current text-based repetitive indoor scene datasets and the typical scenarios encountered in robot navigation, we establish an indoor VPR benchmark dataset, called Maze-with-Text. Extensive experiments on both custom and public datasets demonstrate that TextInPlace achieves superior performance over existing methods that rely solely on appearance information.
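For intuition, the coarse-to-fine matching described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration: the function names, descriptor shapes, and word-overlap text similarity are assumptions made for exposition, not the actual TextInPlace implementation.

```python
# Hypothetical sketch of coarse retrieval + text-based re-ranking (not the paper's code).
import numpy as np

def text_similarity(words_a, words_b):
    # Placeholder metric: Jaccard overlap of the spotted (discriminative) words.
    a, b = set(words_a), set(words_b)
    return len(a & b) / max(len(a | b), 1)

def retrieve_and_rerank(query_desc, db_descs, query_words, db_words, top_k=10):
    # Coarse stage: rank the database by global-descriptor similarity
    # (dot product of L2-normalized descriptors, i.e. cosine similarity).
    sims = db_descs @ query_desc
    candidates = np.argsort(-sims)[:top_k]
    # Fine stage: re-rank the top-K candidates by scene-text similarity.
    return sorted(candidates,
                  key=lambda i: text_similarity(db_words[i], query_words),
                  reverse=True)
```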
- 2025-06-23: The code for TextInPlace is publicly available in this repository 📦!
- 2025-06-16: TextInPlace is accepted by IROS 2025. 🎉🎉🎉
You can create a conda environment for TextInPlace with the following commands ⚙️:
```bash
conda create -n stloc python=3.10 -y
conda activate stloc
pip install torch==2.2.0+cu121 torchvision==0.17.0+cu121 --index-url https://download.pytorch.org/whl/cu121
cd detectron2
pip install -e . && cd ..
pip install -r requirements.txt
python setup.py build develop
```
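After setup, you can sanity-check that the CUDA-enabled PyTorch build imports correctly (a generic check, not specific to TextInPlace):

```python
# Verify the CUDA-enabled PyTorch/torchvision install from the steps above.
import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # expect 2.2.0+cu121 and 0.17.0+cu121
print(torch.cuda.is_available())                   # should print True on a CUDA-capable machine
```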
To bridge the gap between current text-based repetitive indoor scene datasets and the typical scenarios encountered in robot navigation, we establish an indoor VPR benchmark dataset called Maze-with-Text. The number of images in the Maze-with-Text dataset is as follows:

| Floor | 1 | 2 | 3 | 4 | 5 | All |
|---|---|---|---|---|---|---|
| Queries | 280 | 253 | 258 | 245 | 269 | 1305 |
| Database | 1368 | 2268 | 1588 | 1720 | 1596 | 8540 |
You can download the Maze-with-Text dataset from Google Drive. After downloading, please unzip the archive and organize the dataset into the following directory structure:
```
|-- Maze-with-Text
    |-- images
        |-- test
            |-- database
            |   |-- @-00.0027@038.6324@5@339@0@.jpg
            |   ......
            |-- queries
            |   |-- @-00.0012@032.7272@5@65@3@.jpg
            |   ......
```
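The image filenames pack metadata between `@` separators. The parser below is a sketch inferred from the two example filenames above; the field meanings (planar coordinates in the first two slots, floor number in the third) are assumptions, not documented semantics:

```python
# Hypothetical parser for Maze-with-Text filenames such as
# "@-00.0027@038.6324@5@339@0@.jpg". Field meanings are assumptions.
from pathlib import Path

def parse_maze_filename(path):
    fields = Path(path).stem.split("@")[1:-1]  # drop the empty ends of "@...@"
    return {
        "x": float(fields[0]),    # assumed planar coordinate
        "y": float(fields[1]),    # assumed planar coordinate
        "floor": int(fields[2]),  # matches the floor numbers 1-5
        "extra": fields[3:],      # remaining fields, meaning unknown
    }

print(parse_maze_filename("@-00.0027@038.6324@5@339@0@.jpg"))
# {'x': -0.0027, 'y': 38.6324, 'floor': 5, 'extra': ['339', '0']}
```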
Evaluation script
```bash
python -W ignore eval.py --backbone ResNet50 --aggregation boq \
    --features_dim 16384 \
    --infer_batch_size 64 \
    --config-file configs/Bridge/TotalText/R_50_poly.yaml \
    --dataset_name Maze-with-Text \
    --datasets_folder <Path with all datasets> \
    --resume <Path with the checkpoint>
```

Before running the evaluation script, please follow the steps below to validate the results of our experiments on the Maze-with-Text dataset.
1. **Download Checkpoint**: Get the checkpoint file.
2. **File Placement**: Move the downloaded checkpoint file to the designated path: `./checkpoints/`.
(Optional) If you want to use an LLM for text-based reranking, set your own API key in `utils/test.py` and add the `--use-llm` flag at the end of the command.
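For example, assuming the dataset lives under `./datasets` and the checkpoint is saved as `./checkpoints/textinplace.pth` (both paths are illustrative; substitute your own), a full invocation with LLM-based reranking enabled would look like:

```bash
python -W ignore eval.py --backbone ResNet50 --aggregation boq \
    --features_dim 16384 \
    --infer_batch_size 64 \
    --config-file configs/Bridge/TotalText/R_50_poly.yaml \
    --dataset_name Maze-with-Text \
    --datasets_folder ./datasets \
    --resume ./checkpoints/textinplace.pth \
    --use-llm
```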
Note
Training code will be released soon; please stay tuned.
If you find TextInPlace helpful for your research, please consider citing:
```bibtex
@inproceedings{tao2025textinplace,
  title={TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification},
  author={Tao, Huaqi and Liu, Bingxi and Chen, Calvin and Huang, Tingjun and Li, He and Cui, Jinqiang and Zhang, Hong},
  booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2025},
  organization={IEEE}
}
```
- Thanks to these great repositories: Bag-of-Queries, SuperPlace, NYC-Indoor-VPR, Bridging-Text-Spotting, DPText-DETR, DiG, and many other inspiring works in the community.
- Contact: taohq2024@mail.sustech.edu.cn
