[CVPR'25] FG²: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

[Arxiv][Poster][Video][BibTeX]

📝 Abstract

We propose a novel fine-grained cross-view localization method that estimates the 3 Degrees of Freedom pose of a ground-level image in an aerial image of the surroundings by matching fine-grained features between the two images. The pose is estimated by aligning a point plane generated from the ground image with a point plane sampled from the aerial image. To generate the ground points, we first map ground image features to a 3D point cloud. Our method then learns to select features along the height dimension to pool the 3D points to a Bird’s-Eye-View (BEV) plane. This selection enables us to trace which feature in the ground image contributes to the BEV representation. Next, we sample a set of sparse matches from computed point correspondences between the two point planes and compute their relative pose using Procrustes alignment. Compared to the previous state-of-the-art, our method reduces the mean localization error by 28% on the VIGOR cross-area test set. Qualitative results show that our method learns semantically consistent matches across ground and aerial view through weakly supervised learning from the camera pose.

📦 Checkpoints

📁 Download pretrained models

⚙️ Setup

git clone https://github.com/vita-epfl/FG2.git
cd FG2
conda env create -f environment.yml
conda activate fg2
mim install "mmcv-full>=1.7.1"

If you encounter errors related to NumPy 2.x, run:

pip install "numpy<2"

VIGOR Dataset

Download the dataset from the official VIGOR repository.

Update the config:
In config.ini, set dataset_root under VIGOR entry to the path where you placed the VIGOR dataset, for example:

dataset_root = /home/username/VIGOR

Corrected labels (recommended):
Download the corrected label splits from SliceMatch (VIGOR_corrected_labels), and follow their instructions to replace the original splits folder with the downloaded splits__corrected folder.

KITTI Dataset

Download and structure the dataset according to [https://github.com/YujiaoShi/HighlyAccurate]. In config.ini, set dataset_root under KITTI entry to the path where you placed the KITTI dataset, for example:

dataset_root = /home/username/KITTI

📊 Evaluation

Run evaluation on the same-area test set with known orientation (use --area crossarea if you wish to evaluate on cross-area test set):

python vigor_eval.py --area samearea -b 24 --random_orientation False --ransac False

🧭 RANSAC Option

To enable robust pose estimation with RANSAC:

--ransac True

🔄 Evaluate with Unknown Orientation (Two-Stage Inference)

First run – predict orientation:

python vigor_eval.py --area samearea -b 24 --random_orientation True --first_run True

Second run – apply predicted orientation for pose estimation:

python vigor_eval.py --area samearea -b 24 --random_orientation True --first_run False

🔍 Visualize Matched Cross-View Correspondences

--idx 0: Selects the sample index. Replace 0 with the index you want to visualize.

python vigor_qualitative_results.py --area samearea --idx 0

📌 Note: Ensure dataset paths are correctly set in config.ini.

🚀 Training

Knwon orientation:

Training on the same-area training set with known orientation (use --area crossarea if you wish to train on cross-area training set):

python vigor_train.py --area samearea -b 24 --random_orientation False

Unknwon orientation:

📌 Note: If you wish to train a model to estimate orientation first (see details about two-step inference in our paper), use a large beta value, for example, beta=100 in config.ini.

python vigor_train.py --area samearea -b 24 --random_orientation True

🚗 KITTI

Evaluation Results Updated: We have revised our evaluation on the KITTI dataset. The updated results differ slightly from the CVPR version. A new ArXiv submission reflecting these changes will be available soon.

Setting	Version	Loc. Mean ↓ (m)	Loc. Median ↓ (m)	Lateral R@1m ↑ (%)	Lateral R@5m ↑ (%)	Long. R@1m ↑ (%)	Long. R@5m ↑ (%)	Orien. Mean ↓ (°)	Orien. Median ↓ (°)	Orien. R@1° ↑ (%)	Orien. R@5° ↑ (%)
Same-Area	CVPR	0.75	0.52	99.73	100.00	86.99	98.75	1.28	0.74	61.17	95.65
Same-Area	Updated	0.75	0.51	95.81	99.66	92.50	99.05	0.93	0.66	67.27	98.91
Cross-Area	CVPR	7.45	4.03	89.46	99.80	12.42	55.73	3.33	1.88	30.34	81.17
Cross-Area	Updated	7.31	4.15	37.89	85.65	21.98	60.77	3.62	2.37	23.03	77.84

Training:

python kitti_train_test.py -b 24 -t train

Test:

python kitti_train_test.py -b 24 -t test

✅ To-Do

Acknowledgements

The implementation of Procrustes analysis and RANSAC for 3DoF pose estimation in this project is adapted from the 6DoF pose estimation framework provided by Mickey.
Many thanks to the authors for their outstanding work and for sharing it with the community!

Citation

@inproceedings{xia2025fg,
  title={FG\^{} 2: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching},
  author={Xia, Zimin and Alahi, Alexandre},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={6362--6372},
  year={2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
DINO_modules		DINO_modules
KITTI_splits		KITTI_splits
att_layers		att_layers
dataloaders		dataloaders
figures		figures
models		models
results/vigor		results/vigor
utils		utils
LICENSE		LICENSE
README.md		README.md
config.ini		config.ini
environment.yml		environment.yml
kitti_train_test.py		kitti_train_test.py
vigor_eval.py		vigor_eval.py
vigor_qualitative_results.py		vigor_qualitative_results.py
vigor_train.py		vigor_train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

[CVPR'25] FG²: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

📝 Abstract

📦 Checkpoints

⚙️ Setup

VIGOR Dataset

KITTI Dataset

📊 Evaluation

🧭 RANSAC Option

🔄 Evaluate with Unknown Orientation (Two-Stage Inference)

🔍 Visualize Matched Cross-View Correspondences

🚀 Training

🚗 KITTI

✅ To-Do

Acknowledgements

Citation

About

Uh oh!

Releases

Packages

Languages

License

vita-epfl/FG2

Folders and files

Latest commit

History

Repository files navigation

[CVPR'25] FG²: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

📝 Abstract

📦 Checkpoints

⚙️ Setup

VIGOR Dataset

KITTI Dataset

📊 Evaluation

🧭 RANSAC Option

🔄 Evaluate with Unknown Orientation (Two-Stage Inference)

🔍 Visualize Matched Cross-View Correspondences

🚀 Training

🚗 KITTI

✅ To-Do

Acknowledgements

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages