
An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-centric Semantic Fusion
Mingjie Zhang1, 2, Yuheng Du1, Chengkai Wu1, Jinni Zhou1, Zhenchao Qi1, Jun Ma1, Boyu Zhou2,†
1 The Hong Kong University of Science and Technology (Guangzhou)
2 Southern University of Science and Technology
† Corresponding Author
ApexNav ensures highly reliable object navigation by leveraging Target-centric Semantic Fusion, and boosts efficiency with its Adaptive Exploration Strategy.
- [07/09/2025]: ApexNav has been published in the Early Access area on IEEE Xplore.
- [22/08/2025]: Released the main algorithm of ApexNav.
- [18/08/2025]: ApexNav is conditionally accepted to RA-L 2025.
[RA-L'25] This repository maintains the implementation of "ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-centric Semantic Fusion".
The pipeline of ApexNav is detailed in the overview below.
Tested on Ubuntu 20.04 with ROS Noetic and Python 3.9
sudo apt update
sudo apt-get install libarmadillo-dev libompl-dev
Install FTXUI, a simple cross-platform C++ library for terminal-based user interfaces:
git clone https://github.com/ArthurSonzogni/FTXUI
cd FTXUI
mkdir build && cd build
cmake ..
make -j
sudo make install
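If you want to confirm the FTXUI installation, note that the default install prefix is typically /usr/local; the exact paths in this optional check are an assumption and may differ on your system:
# Optional sanity check (assumes the default /usr/local prefix)
ls /usr/local/include/ftxui        # FTXUI headers
ls /usr/local/lib | grep -i ftxui  # static libraries such as libftxui-component.a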
You can skip the LLM configuration and directly use our pre-generated LLM output results in llm/answers.
Otherwise, install Ollama and pull the model:
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3:8b
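To verify that the local Ollama server is up and the model was pulled, you can query Ollama's standard HTTP API (it listens on port 11434 by default); the prompt below is just a placeholder:
# Optional: quick sanity check of the local Ollama server
curl http://localhost:11434/api/tags   # lists pulled models; qwen3:8b should appear
curl http://localhost:11434/api/generate -d '{"model": "qwen3:8b", "prompt": "Hello", "stream": false}'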
git clone git@github.com:WongKinYiu/yolov7.git # yolov7
git clone https://github.com/IDEA-Research/GroundingDINO.git # GroundingDINO
Download the following model weights and place them in the data/ directory:
- mobile_sam.pt: https://github.com/ChaoningZhang/MobileSAM/tree/master/weights/mobile_sam.pt
- groundingdino_swint_ogc.pth: wget -O data/groundingdino_swint_ogc.pth https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
- yolov7-e6e.pt: wget -O data/yolov7-e6e.pt https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-e6e.pt
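For mobile_sam.pt, only the repository page is linked above; a wget one-liner in the same style as the others might look like this (the raw-file URL is an assumption derived from that repository path, so fall back to a manual download from the linked page if it changes):
wget -O data/mobile_sam.pt https://github.com/ChaoningZhang/MobileSAM/raw/master/weights/mobile_sam.pt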
git clone git@github.com:Robotics-STAR-Lab/ApexNav.git
cd ApexNav
conda env create -f apexnav_environment.yaml -y
conda activate apexnav
# You can use 'nvcc --version' to check your CUDA version.
# CUDA 11.8
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu121
# CUDA 12.4
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124
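After installing PyTorch, a quick check that the GPU build matches your driver can save debugging later; this is a generic verification snippet, not part of the ApexNav tooling:
# Verify that PyTorch was built for your CUDA version and sees the GPU
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"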
We recommend using habitat-lab v0.3.1
# habitat-lab v0.3.1
git clone https://github.com/facebookresearch/habitat-lab.git
cd habitat-lab; git checkout tags/v0.3.1;
pip install -e habitat-lab
# habitat-baselines v0.3.1
pip install -e habitat-baselines
Note: Any numpy-related errors will not affect subsequent operations, as long as numpy==1.23.5 and numba==0.60.0 are correctly installed.
pip install salesforce-lavis==1.0.2 # -i https://pypi.tuna.tsinghua.edu.cn/simple
cd .. # Return to ApexNav directory
pip install -e .
Note: Any numpy-related errors will not affect subsequent operations, as long as numpy==1.23.5 and numba==0.60.0 are correctly installed.
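To confirm that the pinned versions mentioned above survived the previous installs, you can print them directly (a generic check, not an ApexNav script):
# Both should match the versions noted above
python -c "import numpy, numba; print(numpy.__version__, numba.__version__)"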
Official Reference: https://github.com/facebookresearch/habitat-lab/blob/main/DATASETS.md
Note: Both the HM3D and MP3D scene datasets require applying for official permission first. You can follow the commands below; if you encounter any issues, please refer to the official documentation at https://github.com/facebookresearch/habitat-lab/blob/main/DATASETS.md.
- Apply for permission at https://matterport.com/habitat-matterport-3d-research-dataset.
- Download https://api.matterport.com/resources/habitat/hm3d-val-habitat-v0.2.tar.
- Save hm3d-val-habitat-v0.2.tar to the ApexNav/ directory; the following commands will extract it and place it in the correct location:
mkdir -p data/scene_datasets/hm3d/val
mv hm3d-val-habitat-v0.2.tar data/scene_datasets/hm3d/val/
cd data/scene_datasets/hm3d/val
tar -xvf hm3d-val-habitat-v0.2.tar
rm hm3d-val-habitat-v0.2.tar
cd ../..
ln -s hm3d hm3d_v0.2 # Create a symbolic link for hm3d_v0.2
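The commands above leave you inside data/scene_datasets; the lines below return to the ApexNav root and list a few extracted scene folders as an optional sanity check (folder names follow the data tree shown later in this README):
cd ../..                                 # back to the ApexNav root
ls data/scene_datasets/hm3d/val | head   # should show folders like 00800-TEEsavR23oF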
- Apply for download access at https://niessner.github.io/Matterport/.
- After successful application, you will receive a download_mp.py script, which should be run with python2.7 to download the dataset.
- After downloading, place the files in ApexNav/data/scene_datasets.
# Create necessary directory structure
mkdir -p data/datasets/objectnav/hm3d
mkdir -p data/datasets/objectnav/mp3d
# HM3D-v0.1
wget -O data/datasets/objectnav/hm3d/v1.zip https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/hm3d/v1/objectnav_hm3d_v1.zip
unzip data/datasets/objectnav/hm3d/v1.zip -d data/datasets/objectnav/hm3d && mv data/datasets/objectnav/hm3d/objectnav_hm3d_v1 data/datasets/objectnav/hm3d/v1 && rm data/datasets/objectnav/hm3d/v1.zip
# HM3D-v0.2
wget -O data/datasets/objectnav/hm3d/v2.zip https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/hm3d/v2/objectnav_hm3d_v2.zip
unzip data/datasets/objectnav/hm3d/v2.zip -d data/datasets/objectnav/hm3d && mv data/datasets/objectnav/hm3d/objectnav_hm3d_v2 data/datasets/objectnav/hm3d/v2 && rm data/datasets/objectnav/hm3d/v2.zip
# MP3D
wget -O data/datasets/objectnav/mp3d/v1.zip https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/m3d/v1/objectnav_mp3d_v1.zip
unzip data/datasets/objectnav/mp3d/v1.zip -d data/datasets/objectnav/mp3d/v1 && rm data/datasets/objectnav/mp3d/v1.zip
Your final data folder structure should look like this:
data
├── datasets
│   └── objectnav
│       ├── hm3d
│       │   ├── v1
│       │   │   ├── train
│       │   │   ├── val
│       │   │   └── val_mini
│       │   └── v2
│       │       ├── train
│       │       ├── val
│       │       └── val_mini
│       └── mp3d
│           └── v1
│               ├── train
│               ├── val
│               └── val_mini
├── scene_datasets
│   ├── hm3d
│   │   └── val
│   │       ├── 00800-TEEsavR23oF
│   │       ├── 00801-HaxA7YrQdEC
│   │       └── .....
│   ├── hm3d_v0.2 -> hm3d
│   └── mp3d
│       ├── 17DRP5sb8fy
│       ├── 1LXtFkjw3qL
│       └── .....
├── groundingdino_swint_ogc.pth
├── mobile_sam.pt
└── yolov7-e6e.pt
Note that train and val_mini are not required, and you can choose to delete them.
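Before building, it can help to confirm that your layout matches the tree above; this is an optional, generic check run from the ApexNav root:
# Optional: list the top of the data tree and confirm the three model weights are present
find data -maxdepth 3 -type d | sort
ls -lh data/*.pt data/*.pth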
All of the following commands should be run in the apexnav conda environment.
catkin_make -DPYTHON_EXECUTABLE=/usr/bin/python3
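If the build succeeds, the ROS package used by the launch files below should be resolvable; a quick optional check:
# Confirm the workspace built and exploration_manager is on the ROS package path
source ./devel/setup.bash
rospack find exploration_manager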
Each command should be run in a separate terminal.
python -m vlm.detector.grounding_dino --port 12181
python -m vlm.itm.blip2itm --port 12182
python -m vlm.segmentor.sam --port 12183
python -m vlm.detector.yolov7 --port 12184
source ./devel/setup.bash && roslaunch exploration_manager rviz.launch # RViz visualization
source ./devel/setup.bash && roslaunch exploration_manager exploration.launch # ApexNav main algorithm
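If juggling six terminals is inconvenient, one option is to start each command in its own detached tmux session; this is only a convenience sketch (tmux is not required by ApexNav) and assumes you run it from the ApexNav root with the apexnav environment active:
# Optional: launch the four VLM servers and the two ROS launch files in detached tmux sessions
tmux new-session -d -s dino 'python -m vlm.detector.grounding_dino --port 12181'
tmux new-session -d -s blip 'python -m vlm.itm.blip2itm --port 12182'
tmux new-session -d -s sam 'python -m vlm.segmentor.sam --port 12183'
tmux new-session -d -s yolo 'python -m vlm.detector.yolov7 --port 12184'
tmux new-session -d -s rviz 'bash -c "source ./devel/setup.bash && roslaunch exploration_manager rviz.launch"'
tmux new-session -d -s apexnav 'bash -c "source ./devel/setup.bash && roslaunch exploration_manager exploration.launch"'
tmux ls   # all six sessions should be listed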
You can evaluate on all episodes of a dataset.
# Need to source the workspace
source ./devel/setup.bash
# Choose one dataset to evaluate
python habitat_evaluation.py --dataset hm3dv1
python habitat_evaluation.py --dataset hm3dv2 # default
python habitat_evaluation.py --dataset mp3d
# You can also evaluate on one specific episode.
python habitat_evaluation.py --dataset hm3dv2 test_epi_num=10 # episode_id 10
If you want to generate evaluation videos for each episode (videos will be categorized by task results), you can use the following command:
python habitat_evaluation.py --dataset hm3dv2 need_video=true
You can also choose to manually control the agent in the Habitat simulator:
# Need to source the workspace
source ./devel/setup.bash
python habitat_manual_control.py --dataset hm3dv1 # Default episode_id = 0
python habitat_manual_control.py --dataset hm3dv1 test_epi_num=10 # episode_id = 10
- Release the main algorithm of ApexNav
- Complete Installation and Usage documentation
- Add datasets download documentation
- Add acknowledgment documentation
- Add utility tools documentation
- Release the code of real-world deployment
- Add ROS2 support
@ARTICLE{11150727,
author={Zhang, Mingjie and Du, Yuheng and Wu, Chengkai and Zhou, Jinni and Qi, Zhenchao and Ma, Jun and Zhou, Boyu},
journal={IEEE Robotics and Automation Letters},
title={ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-centric Semantic Fusion},
year={2025},
volume={},
number={},
pages={1-8},
keywords={Semantics;Navigation;Training;Robustness;Detectors;Noise measurement;Geometry;Three-dimensional displays;Object recognition;Faces;Search and Rescue Robots;Vision-Based Navigation;Autonomous Agents;},
doi={10.1109/LRA.2025.3606388}}