BirdSense: Multi-Modal Bird Species Recognition System

This repository contains a comprehensive bird species analysis system with three main components:

Image Classification - Identifying bird species from images
Audio Classification - Recognizing birds by their calls and songs
Image Retrieval - Finding similar bird images using feature extraction and similarity measures

Project Overview

The project uses deep learning to analyze and classify birds from two modalities: images and audio. It includes several pre-trained models and provides interactive demos for real-time classification and retrieval.

Key Features

Bird species classification from images using InceptionV3 transfer learning
Audio classification of bird songs and calls using deep learning
Image retrieval system to find similar bird images
Grad-CAM visualization to understand model focus areas
Interactive demo interfaces for all components

Data

The project uses the following datasets:

Images: "100-bird-species" dataset from Kaggle containing 525 different bird species
Audio: Custom dataset of bird calls organized by species

Components

1. Image Classification

The image classification system uses a fine-tuned InceptionV3 model with the following architecture:

Pre-trained InceptionV3 base model (trained on ImageNet)
Global Average Pooling layer
Dropout layer (0.3) for regularization
Dense output layer with softmax activation for 525 bird classes
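
The layer stack listed above could be assembled in Keras roughly as follows. This is a minimal sketch, not the repository's training code: the 224x224 input size, the frozen base, and the function name `build_image_classifier` are assumptions made for illustration.

```python
import tensorflow as tf

NUM_CLASSES = 525  # number of species stated in the Data section


def build_image_classifier(input_shape=(224, 224, 3), weights="imagenet"):
    """InceptionV3 base + global average pooling + dropout(0.3) + softmax head."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=input_shape
    )
    # Assumption: the base is frozen at first and only the new head is trained.
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Images would need to be scaled with `tf.keras.applications.inception_v3.preprocess_input` before being passed to this model. Passing `weights=None` builds the same architecture without downloading the ImageNet weights.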

Model Performance

Test set accuracy: 85.5%
Precision: 80%+
Recall: 80%+
F1-score: 85.2%
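
The README does not say how precision, recall, and F1 were averaged over the 525 classes. The sketch below assumes macro averaging (each class weighted equally) and shows how such figures can be computed from predicted and true labels; the function name `macro_metrics` is hypothetical.

```python
import numpy as np


def macro_metrics(y_true, y_pred, num_classes):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        # Classes with no predictions or no samples score 0 instead of dividing by zero.
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    accuracy = float(np.mean(y_true == y_pred))
    return accuracy, float(np.mean(precisions)), float(np.mean(recalls)), float(np.mean(f1s))
```

scikit-learn, which is listed under Prerequisites, offers `sklearn.metrics.precision_recall_fscore_support` with `average="macro"` for the same calculation.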

2. Audio Classification

The audio classification component processes bird songs through:

Audio preprocessing and segmentation
Mel-spectrogram generation
Classification using a custom CNN architecture
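
To illustrate the first two steps, here is a NumPy-only sketch of fixed-length segmentation followed by a log-mel spectrogram. In practice `librosa.feature.melspectrogram` would normally be used; this version only shows what the steps compute. The 3-second clip length, FFT size, hop length, and number of mel bands are assumptions, as are all function names.

```python
import numpy as np


def segment(signal, sr, seconds=3.0):
    """Split a 1-D signal into non-overlapping fixed-length clips (remainder dropped)."""
    size = int(sr * seconds)
    n = len(signal) // size
    return signal[: n * size].reshape(n, size)


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i : i + 3]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel_spectrogram(clip, sr, n_fft=1024, hop=512, n_mels=64):
    """Log-scaled mel spectrogram of one clip, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(clip) - n_fft) // hop
    frames = np.stack(
        [clip[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)  # decibel scale, small offset avoids log(0)
```

Each clip's spectrogram is a 2-D array that a CNN can treat like a single-channel image. The sketch uses the HTK-style mel formula without filter normalization, so its values will differ slightly from librosa's defaults.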

Audio Model Architecture

Multiple convolutional layers with batch normalization
Global Average Pooling
Dense layers with dropout for regularization
Output layer with softmax activation

Performance

Test set accuracy: 60%
F1-score: 40.5% on handcrafted test set

3. Image Retrieval System

The retrieval system uses feature extraction to find similar bird images:

Feature extraction using InceptionV3
Cosine similarity measurement
Grad-CAM visualization to highlight important regions
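
The similarity step can be sketched as follows, assuming each image has already been reduced to a feature vector (for example InceptionV3's 2048-dimensional pooled output, though the README does not say which layer is used). The function names are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def retrieve(query_vec, gallery_vecs, top_k=5):
    """Return (indices, scores) of the top_k gallery rows most similar to the query."""
    q = l2_normalize(np.asarray(query_vec, dtype=float))
    g = l2_normalize(np.asarray(gallery_vecs, dtype=float))
    scores = g @ q  # dot product of unit vectors equals cosine similarity
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]
```

Normalizing the gallery once and caching it keeps each query down to a single matrix-vector product.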

Installation and Setup

Prerequisites

Python 3.7+
TensorFlow 2.x
Keras
librosa
pandas
numpy
matplotlib
scikit-learn
PIL/Pillow

Installation

# Clone the repository
git clone https://github.com/yourusername/bird-analysis-project.git
cd bird-analysis-project

# Install dependencies
pip install -r requirements.txt

# Download necessary models and data
python download_assets.py

Usage

Image Classification Demo

python demo.py --mode image --image path/to/bird_image.jpg

Audio Classification Demo

python demo.py --mode audio --audio path/to/bird_call.ogg

Image Retrieval Demo

python demo.py --mode retrieval --image path/to/query_image.jpg

Interactive Demo

The project includes an interactive demo with a web interface:

python interactive_demo.py

Navigate to http://localhost:8080 in your browser to access the interactive demo.

Files and Structure

retrival.py - Implementation of the image retrieval system
evaluation.py - Evaluation metrics and testing for audio classification
evaluation (1).py - Evaluation metrics and testing for image classification
modeling.py - Audio model architecture and training
modeling (1).py - Image model architecture and training
demo.py - Command-line demo application (image, audio, and retrieval modes)

Results and Evaluation

Image Classification

The image classifier achieved 85.5% accuracy on the test set, with precision and recall both above 80%. The model performs particularly well on distinctive species but struggles with visually similar birds.

Audio Classification

The audio classifier achieved 60% accuracy on the test set. Performance varies significantly across species: some birds are recognized consistently, while others remain challenging.

Grad-CAM Visualization

Grad-CAM visualizations show that the model focuses on distinctive features such as:

Beaks
Wing patterns
Distinctive coloration
Head features
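
For reference, the core Grad-CAM computation that produces such heatmaps can be sketched in NumPy. The sketch assumes the activations of the last convolutional layer and the gradients of the predicted class score with respect to them have already been obtained (in TensorFlow, typically with `tf.GradientTape`); the function name `grad_cam_map` is hypothetical.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Combine conv activations (H, W, K) with class-score gradients (H, W, K).

    Returns an (H, W) heatmap scaled to [0, 1].
    """
    # Channel importance: spatial average of the gradients.
    weights = gradients.mean(axis=(0, 1))  # (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # (H, W)
    # Keep only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting map is usually upsampled to the input image size and overlaid as a color heatmap.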

Future Work

Improve audio classification accuracy through more sophisticated architectures
Expand the dataset to include more species and variations
Implement ensemble methods to combine audio and visual classifications
Develop a mobile application for field identification

Acknowledgments

Kaggle for providing the "100-bird-species" dataset
TensorFlow and Keras teams for the deep learning framework
The bird watching community for domain expertise and data collection

Repository: pasottimatteo98/BirdSense-Multi-Modal-Bird-Species-Recognition-System
