This repository contains the winning solution to the SoccerNet Monocular Depth Estimation Challenge at CVPR 2025. Our method predicts accurate depth maps from RGB images of soccer matches, enabling improved 3D spatial understanding in sports analytics.
🏆 1st Place at the SoccerNet Monocular Depth Estimation Challenge (CVPR 2025)
Rank | Team | RMSE | AbsRel | RMSElog | SqRel | SILog |
---|---|---|---|---|---|---|
1️⃣ | Hands-On Computer Vision | 0.00242 | 0.00164 | 0.00432 | 2e-05 | 0.43 |
2 | HUST-iPad | 0.00258 | 0.00179 | 0.00468 | 3e-05 | 0.47 |
3 | bupt miclab | 0.00268 | 0.00186 | 0.00484 | 3e-05 | 0.48 |
4 | jacekm | 0.00275 | 0.00207 | 0.00500 | 3e-05 | 0.50 |
5 | hvrl | 0.00282 | 0.00228 | 0.00502 | 3e-05 | 0.50 |
📺 Presentation Video
📊 Official Leaderboard
- 🔄 Based on Depth Anything V2, a state-of-the-art model for depth estimation.
- 📊 Evaluated using standard metrics: RMSE, AbsRel, SILog, RMSElog, SqRel.
- 🧠 Fine-tuned for soccer-specific imagery and temporal consistency.
- 🖼️ Preserves sharp edges and fine details via specialized losses.
- 📏 Full-resolution training for enhanced accuracy.
The solution uses the Depth Anything V2 architecture, which combines a pre-trained ViT backbone with a DPT head to estimate metric depth. Three encoder sizes are supported: Small, Base, and Large.
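As a rough illustration of how the three encoder sizes might be selected, the sketch below maps a size name to a DPT head configuration. The `features` and `out_channels` values are assumed from the upstream Depth Anything V2 repository and may not match what this repository uses; `ENCODER_CONFIGS`, `ENCODER_ALIASES`, and `resolve_encoder` are hypothetical names introduced only for this example.

```python
# Assumption: head settings copied from the upstream Depth Anything V2 project;
# this repository's own configuration may differ.
ENCODER_CONFIGS = {
    "vits": {"features": 64, "out_channels": [48, 96, 192, 384]},
    "vitb": {"features": 128, "out_channels": [96, 192, 384, 768]},
    "vitl": {"features": 256, "out_channels": [256, 512, 1024, 1024]},
}

# Hypothetical aliases so that "Small", "Base", "Large" can be used directly.
ENCODER_ALIASES = {"small": "vits", "base": "vitb", "large": "vitl"}


def resolve_encoder(name):
    """Return (encoder_key, config) for a size name such as 'Large' or 'vitl'."""
    key = name.strip().lower()
    key = ENCODER_ALIASES.get(key, key)
    if key not in ENCODER_CONFIGS:
        raise ValueError(f"Unknown encoder: {name!r}")
    return key, ENCODER_CONFIGS[key]


if __name__ == "__main__":
    key, cfg = resolve_encoder("Large")
    print(key, cfg)
```

The resolved key corresponds to the value passed to the `--encoder` flag of `challenge.py` (for example `vitl`).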
- Python 3.9
- PyTorch
- CUDA (for accelerated training)
- `data/`: Data loaders and utilities
- `loss/`: Specialized loss functions
- `metrics/`: Evaluation metrics implementation
- `models/`: Model architectures (Depth Anything V2)
Config | Abs Rel ×10⁻³ ↓ | RMSE ×10⁻³ ↓ | RMSE Log ×10⁻³ ↓ | Sq Rel ×10⁻⁴ ↓ | SILog ↓ |
---|---|---|---|---|---|
Best Baseline [Leduc et al., 2024] | 2.429 | 2.343 | 4.002 | 0.121 | 0.400 |
Ours Half-Res | 2.781 | 2.516 | 4.350 | 0.125 | 0.435 |
Ours Full-Res | 1.443 | 1.590 | 2.619 | 0.062 | 0.261 |
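For readers unfamiliar with the metrics reported above, here is a minimal NumPy sketch using their commonly used definitions. It is not the code in `evaluate_depth.py`, which may apply different masking, depth scaling, or averaging. The function name `depth_metrics`, the validity rule (ground truth greater than zero), and the factor of 100 in SILog are assumptions of this example; the factor of 100 follows the usual convention and appears consistent with the scale of the SILog values in the tables.

```python
import numpy as np


def depth_metrics(pred, gt, eps=1e-8):
    """Compute common monocular depth metrics over valid pixels.

    Assumption: pixels with non-positive ground truth are treated as invalid.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    valid = gt > eps
    gt = gt[valid]
    # Clip predictions so the logarithm below is always defined.
    pred = np.clip(pred[valid], eps, None)

    diff = pred - gt
    log_diff = np.log(pred) - np.log(gt)

    return {
        "abs_rel": float(np.mean(np.abs(diff) / gt)),
        "sq_rel": float(np.mean(diff ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "rmse_log": float(np.sqrt(np.mean(log_diff ** 2))),
        # Scale-invariant log error: standard deviation of the log difference, times 100.
        "silog": float(np.sqrt(np.var(log_diff)) * 100.0),
    }


if __name__ == "__main__":
    gt = np.array([[1.0, 2.0], [4.0, 0.0]])  # the 0.0 pixel is ignored
    print(depth_metrics(gt * 2.0, gt))
```

Because SILog depends only on the spread of the log difference, a prediction that is off by a constant scale factor scores near zero on SILog while still being penalized by the other four metrics.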
# Training
bash train.sh
# Evaluation
python evaluate_depth.py --pred_dir [predictions_directory] --gt_dir [ground_truth_directory]
# Inference
python challenge.py --encoder vitl --checkpoint_path [checkpoint_path] --input_dir [input_directory] --output_dir [output_directory]
Training options can be modified in the train.sh file.