# MMLA Project

**Multi-Environment, Multi-Species, Low-Altitude Drone Dataset**

*Example photo from the MMLA dataset with labels generated by the model: a group of zebras and giraffes at the Mpala Research Centre in Kenya.*


This repo provides scripts to fine-tune YOLO models on the MMLA dataset, a collection of low-altitude aerial footage of various species across multiple environments. The dataset is designed to help researchers and practitioners develop and evaluate object detection models for wildlife monitoring and conservation.

## How to use the scripts in this repo

### Requirements

```bash
# install packages from requirements
conda create --name yolo_env --file requirements.txt
# OR using pip
pip install -r requirements.txt
```

### Model Training

#### Prepare the dataset

```bash
# download the datasets from HuggingFace to the local dataset/ directory
mkdir -p dataset
cd dataset

# wilds dataset
git clone https://huggingface.co/datasets/imageomics/mmla_wilds
# opc dataset
git clone https://huggingface.co/datasets/imageomics/mmla_opc
# mpala dataset
git clone https://huggingface.co/datasets/imageomics/mmla_mpala

# run the script to split the dataset into train and test sets
python prepare_yolo_dataset.py
```
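The split script is not reproduced here; YOLO training typically reads a `data.yaml` that points at the train/val splits. A sketch of what such a file might look like for this dataset (the paths, key layout, and class indices are assumptions; the class names are taken from the results table in this README):

```yaml
# assumed YOLO dataset config -- the file actually produced by
# prepare_yolo_dataset.py may differ in layout and class order
path: ./dataset
train: images/train
val: images/val
names:
  0: Zebra
  1: Giraffe
  2: Onager
  3: Dog
```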

Alternatively, you can create your own dataset from video frames and bounding box annotations:

```bash
python frame_extractor.py --dataset wilds --dataset_path ./mmla_wilds --output_dir ./wildwing_wilds
```

Optional: Downsample the frames to extract a subset of frames from each video:

```bash
python downsample.py --dataset wilds --dataset_path ./mmla_wilds --output_dir ./mmla_wilds --downsample_rate 0.1
```
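`downsample.py` itself is not reproduced here; with `--downsample_rate 0.1` it presumably keeps about 10% of frames. A minimal sketch of evenly spaced frame selection (the function name and signature are illustrative, not the script's actual API):

```python
def downsample_indices(n_frames: int, rate: float) -> list[int]:
    """Return evenly spaced frame indices keeping roughly `rate` of the frames.

    Illustrative only: the real downsample.py may select frames differently.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    stride = max(1, round(1 / rate))  # e.g. rate=0.1 -> keep every 10th frame
    return list(range(0, n_frames, stride))

# keep ~10% of a 100-frame clip
kept = downsample_indices(100, 0.1)
```

Even spacing preserves temporal coverage of each video, which matters more for aerial footage than picking a random subset of frames.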

#### Run the training script

```bash
cd model
# run the training script
python train.py
```

### Evaluation

To evaluate the trained model on the test data:

```bash
# run the validate script
python validate.py
```

Optional: Perform bootstrapping to get confidence intervals:

```bash
cd analysis
# run the evaluation notebook
jupyter notebook bootstrap.ipynb
```
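The notebook's contents are not shown here; the general idea of bootstrapping is to resample per-image scores with replacement and take percentile bounds on the resampled means. A minimal percentile-bootstrap sketch using only the standard library (the function name and parameters are illustrative, not the notebook's actual code):

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-image
    scores. Illustrative sketch, not the repo's bootstrap.ipynb."""
    rng = random.Random(seed)
    n = len(scores)
    # resample with replacement, record the mean of each resample
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lo = means[int(n_resamples * alpha / 2)]
    hi = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lo, hi


# e.g. a 95% CI for the mean over hypothetical per-image scores
low, high = bootstrap_ci([0.7, 0.8, 0.9] * 20, n_resamples=2000)
```

With a fixed seed the interval is reproducible, which is useful when reporting confidence intervals alongside the point estimates in the results table.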

Download the inference results from the baseline and fine-tuned models.

## Results

Our fine-tuned YOLO11m model achieves the following performance on the MMLA dataset:

| Class   | Images | Instances | Box(P) | R     | mAP50 | mAP50-95 |
|---------|-------:|----------:|-------:|------:|------:|---------:|
| all     | 7,658  | 44,619    | 0.867  | 0.764 | 0.801 | 0.488    |
| Zebra   | 4,430  | 28,219    | 0.768  | 0.647 | 0.675 | 0.273    |
| Giraffe | 868    | 1,357     | 0.788  | 0.634 | 0.678 | 0.314    |
| Onager  | 172    | 1,584     | 0.939  | 0.776 | 0.857 | 0.505    |
| Dog     | 3,022  | 13,459    | 0.973  | 0.998 | 0.995 | 0.860    |

## Fine-Tuned Model

See the HuggingFace model repo for details and weights.

## Dataset

See the HuggingFace dataset repo for the MMLA dataset.

## Paper

```bibtex
@article{kline2025mmla,
  title={MMLA: Multi-Environment, Multi-Species, Low-Altitude Aerial Footage Dataset},
  author={Kline, Jenna and Stevens, Samuel and Maalouf, Guy and Saint-Jean, Camille Rondeau and Ngoc, Dat Nguyen and Mirmehdi, Majid and Guerin, David and Burghardt, Tilo and Pastucha, Elzbieta and Costelloe, Blair and others},
  journal={arXiv preprint arXiv:2504.07744},
  year={2025}
}
```
