Stereo Object Detection with Disparity Depth Estimation

This project develops object detection models for aquatic environments. Stereo video captured with a ZED camera is processed in real time: YOLO models detect the objects, and stereo disparity maps provide the depth estimate. The output consists of bounding boxes annotated with estimated distances and, optionally, an annotated video.
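As background for the depth step, a rectified stereo pair relates depth to disparity through the standard pinhole relation Z = f * B / d, where f is the focal length in pixels, B the baseline in metres, and d the disparity in pixels. The sketch below only illustrates that relation. The function name and the sample numbers are assumptions chosen for the example, not calibration values or code from this project.

```python
import math


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Return depth in metres using Z = f * B / d.

    A disparity of zero or less means no valid stereo match, so the
    function returns infinity instead of dividing by zero.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Illustrative values only: 700 px focal length, 0.12 m baseline.
    print(depth_from_disparity(14.0, 700.0, 0.12))
```

With these illustrative values a disparity of 14 px corresponds to roughly 6 m. Because depth is inversely proportional to disparity, a fixed disparity error produces a larger distance error for far objects than for near ones.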

The proposed neural network models were validated by running inference on video captured with the ZED camera, following the steps below:

1. Requirements

To run the code, you will need to install the following dependencies beforehand:

Ultralytics >=8.3.20
Python
PyTorch
ClearML 1.18.0
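A quick way to confirm that the versioned packages in this list are present is to query the installed distributions. The following is a minimal sketch using only the standard library. It assumes the distribution names are `ultralytics` and `clearml`, and it treats the ClearML version listed above as a minimum, which is an assumption. The helper names are hypothetical.

```python
from importlib import metadata

# Versions taken from the list above. Python and PyTorch have no version
# stated there, so they are not checked here.
REQUIRED = {"ultralytics": "8.3.20", "clearml": "1.18.0"}


def version_tuple(version: str) -> tuple:
    """Turn '8.3.20' into (8, 3, 20); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(required=REQUIRED) -> dict:
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if ok else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

The comparison is deliberately simple and ignores pre-release suffixes. For stricter checks, a dedicated version parser would be more robust.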

2. Python dependencies

The recommended way to install the Python dependencies is inside a virtual environment:

$ sudo apt install virtualenv
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install numpy

To deactivate the virtual environment, run:

$ deactivate

3. Structure and format of the dataset

Datasense@CRAS/
├── train/
│   ├── images/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── .... (other image files)
│   └── labels/
│       ├── label1.txt
│       ├── label2.txt
│       └── .... (other label files)
├── valid/
│   ├── images/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── .... (other image files)
│   └── labels/
│       ├── label1.txt
│       ├── label2.txt
│       └── .... (other label files)
├── test/
│   ├── images/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── .... (other image files)
│   └── labels/
│       ├── label1.txt
│       ├── label2.txt
│       └── .... (other label files)
└── data.yaml
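Before training, it can help to confirm that every image has a label file. The sketch below assumes each split contains `images/` and `labels/` subfolders and that a label file shares its image's file stem (for example `image1.jpg` with `image1.txt`), which is how Ultralytics pairs them. It also parses one line of the YOLO text format: a class index followed by the box centre, width, and height, all normalised to the image size. The function names and the local dataset path are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def missing_labels(split_dir: Path) -> list:
    """Return names of images in split_dir/images with no matching .txt in split_dir/labels."""
    images = split_dir / "images"
    labels = split_dir / "labels"
    missing = []
    for image in sorted(images.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if not (labels / (image.stem + ".txt")).is_file():
            missing.append(image.name)
    return missing


def parse_label_line(line: str) -> tuple:
    """Parse 'class x_center y_center width height' (coordinates normalised to 0..1)."""
    class_id, x, y, w, h = line.split()
    return int(class_id), float(x), float(y), float(w), float(h)


if __name__ == "__main__":
    root = Path("Datasense@CRAS")  # hypothetical local path to the dataset
    for split in ("train", "valid", "test"):
        split_dir = root / split
        if (split_dir / "images").is_dir():
            print(split, missing_labels(split_dir))
```

An image without a label file is not necessarily an error, since it may be a background image with no objects, so the returned list is best read as something to review rather than a failure.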

3.1 Dataset used in this research

Original dataset: Datasense@CRAS.
Original dataset: SeaDroneSee v2.
Our dataset as a contribution: USVDD.

4. Train YOLOv8 object detection on a custom dataset

To launch training, run:

$ yolo detect train data=/path/data.yaml model=yolov8n.pt epochs=150 imgsz=640 batch=16 lr0=0.001 momentum=0.9 weight_decay=0.0005 plots=True save=True amp=True
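The same training run can also be started from Python through the Ultralytics `YOLO` class. The sketch below mirrors the arguments of the command above. The `train` wrapper and the `TRAIN_ARGS` name are hypothetical conveniences, and the dataset path is the same placeholder as in the command.

```python
# Same settings as the CLI command, expressed as keyword arguments.
TRAIN_ARGS = dict(
    epochs=150,
    imgsz=640,
    batch=16,
    lr0=0.001,
    momentum=0.9,
    weight_decay=0.0005,
    plots=True,
    save=True,
    amp=True,
)


def train(data_yaml: str, weights: str = "yolov8n.pt"):
    """Train a YOLOv8 detector on the dataset described by data_yaml.

    Ultralytics is imported inside the function so that this module can be
    loaded on machines where the package is not installed.
    """
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)


# Example call (placeholder path, as in the command above):
# train("/path/data.yaml")
```

Passing `yolov8s.pt` or `yolov8m.pt` as `weights` would start from the small or medium pretrained checkpoint instead of the nano one.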

5. Neural network models

Datasense@CRAS: nano, small, and medium models.
SeaDroneSee v2: nano, small, and medium models.
USVDD: nano, small, and medium models.

6. Running inference on Jetson Orin + ZED 2 camera

6.1 Install dependencies

JetPack SDK
ZED SDK
Ultralytics
CUDA

6.2 Run inference

Our inference code, which runs our models on the videos captured with the ZED camera, is based on the Python script detector.py provided by Stereolabs. We have also developed an equivalent C++ version with similar performance.
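One common way to attach a distance to a detection is to take the median of the valid depth values inside its bounding box, which is less sensitive to background pixels and stereo holes than a single centre pixel. The sketch below illustrates that idea on a plain list-of-rows depth map. It is an assumption made for illustration, not the logic of detector.py, and the function name is hypothetical.

```python
import math
from statistics import median


def box_distance(depth_map, box):
    """Median of the finite, positive depth values inside box = (x1, y1, x2, y2).

    depth_map is a list of rows holding depths in metres. The box uses pixel
    coordinates with x2 and y2 exclusive. Returns None when the box contains
    no valid depth value.
    """
    x1, y1, x2, y2 = box
    values = [
        d
        for row in depth_map[y1:y2]
        for d in row[x1:x2]
        if math.isfinite(d) and d > 0
    ]
    return median(values) if values else None


if __name__ == "__main__":
    nan = float("nan")
    depth = [
        [1.0, 2.0, nan],
        [3.0, 4.0, math.inf],
        [5.0, 6.0, 7.0],
    ]
    print(box_distance(depth, (0, 0, 3, 2)))
```

Non-finite entries are skipped because stereo depth maps typically mark unmatched or out-of-range pixels with such values.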

Testing was performed on videos available in the Videos/ folder, using the medium, small, and nano models trained on the Datasense@CRAS and USVDD datasets for object detection. The results are available in the Videos/ObjectDetection/ folder.

7. Publications

Title: "YOLO-Based Power-Efficient Object Detection on Edge Devices for USVs"
Journal: Journal of Real-Time Image Processing (Published)
DOI: https://doi.org/10.1007/s11554-025-01682-2

8. Acknowledgements

This paper has been partially funded by the EU (FEDER), the Spanish MINECO under grants PID2021-126576NB-I00 and TED2021-130123B-I00 funded by MCIN/AEI/10.13039/501100011033 and by European Union "ERDF A way of making Europe" and the NextGenerationEU/PRT. J.L.M. thanks the National Secretariat of Science, Technology and Innovation (SENACYT) of Panama for financial support during the completion of his PhD.

artecs-group/Yolo-AquaticUSV
