Blue whale off the coast of San Diego.
Photo credit: Katherine Whitaker
This repository details how to train, validate, and test a ResNet-18 CNN to classify blue whale A and B calls in 30-second spectrogram windows. The open-source Python package OpenSoundscape is used for preprocessing and model training. The model is trained on publicly available acoustic data from the DCLDE 2015 Workshop.
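For orientation, the snippet below is a minimal, hypothetical sketch of the OpenSoundscape pattern this repository builds on (define a ResNet-18 CNN for 30-second clips, then train it on clip-level label tables). The file paths, label-table layout, and hyper-parameters shown here are placeholders; the actual configuration lives in `src/models/train.py`.

```python
# Hypothetical sketch only -- the real training script is src/models/train.py.
import pandas as pd
from opensoundscape import CNN

# Assumed clip-label format: multi-index (file, start_time, end_time) with one
# 0/1 column per class ("A", "B"); the CSV paths are placeholders.
train_df = pd.read_csv("data/processed/train_clips.csv", index_col=[0, 1, 2])
valid_df = pd.read_csv("data/processed/valid_clips.csv", index_col=[0, 1, 2])

# ResNet-18 classifier operating on 30-second spectrogram windows
model = CNN(architecture="resnet18", classes=["A", "B"], sample_duration=30.0)

# Train, saving a checkpoint for each epoch (placeholder hyper-parameters)
model.train(train_df, valid_df, epochs=10, batch_size=32, save_path="models/")
```

The repository is organized as follows: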
```
WhaleSongNet/
├── LICENSE
├── README.md          <- The top-level README for users.
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical train, validation, and test data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained models.
│
├── notebooks          <- Jupyter notebooks.
│   └── plot_spectrograms
│       └── plot_spectrograms.ipynb
│
├── references         <- Relevant literature.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Graphics and figures to be used in reporting.
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment,
│                         e.g. generated with `conda list --export > requirements.txt`.
│
└── src                <- Source code for use in this project.
    ├── __init__.py    <- Makes src a Python module.
    │
    ├── data           <- Scripts to download or generate data.
    │   ├── AudioStreamDescriptor.py
    │   ├── download_data.py
    │   ├── extract_xwav_header.py
    │   ├── make_dataset.py
    │   ├── make_hot_clips.py
    │   └── modify_annotations.py
    │
    └── models         <- Scripts to train models and then use trained models to make predictions.
        ├── train.py
        ├── evaluate_model.py
        └── predict.py
```
1. Clone the repository:

   ```bash
   git clone https://github.com/m1alksne/WhaleSongNet.git
   cd WhaleSongNet
   ```

2. Create a virtual environment:

   ```bash
   conda create -n whalesongnet python=3.9
   conda activate whalesongnet
   ```

3. Install dependencies:

   ```bash
   pip install opensoundscape==0.9.1
   ```

4. Download the data (note: this will take a while! The raw .WAV files are available on figshare):

   ```bash
   python src/data/download_data.py
   ```

5. Preprocess the data by running the two scripts in order:

   ```bash
   python src/data/make_hot_clips.py
   python src/data/make_dataset.py
   ```

6. Train the model:

   ```bash
   python src/models/train.py
   ```

7. Evaluate model performance:

   ```bash
   python src/models/evaluate_model.py
   ```

8. Make predictions:

   ```bash
   python src/models/predict.py
   ```

   (A conceptual sketch of what steps 7 and 8 boil down to follows this list.)
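As promised above, the prediction and evaluation steps amount to something like the following hypothetical sketch. The checkpoint path, CSV path, and metric choice are placeholders; the actual logic lives in `src/models/predict.py` and `src/models/evaluate_model.py`, and `load_model` is assumed to live in `opensoundscape.ml.cnn` as in opensoundscape 0.9.x.

```python
# Hypothetical sketch of steps 7-8 -- see src/models/predict.py and
# src/models/evaluate_model.py for the real implementations.
import pandas as pd
from sklearn.metrics import roc_auc_score
from opensoundscape.ml.cnn import load_model  # assumed module path for 0.9.x

# Load a trained checkpoint (placeholder filename) and score the test clips
model = load_model("models/best.model")
test_df = pd.read_csv("data/processed/test_clips.csv", index_col=[0, 1, 2])
scores = model.predict(test_df, activation_layer="sigmoid", batch_size=32)

# Per-class AUC: compare sigmoid scores against the 0/1 clip labels
for call_type in ["A", "B"]:
    auc = roc_auc_score(test_df[call_type], scores.loc[test_df.index, call_type])
    print(f"Blue whale {call_type} call AUC: {auc:.3f}")
```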
The training, validation, and test clips are already included, along with a saved model for each training epoch, so depending on your use case, steps 5 and 6 may not be necessary. The trained model and the surrounding data-processing and evaluation workflow are intended for educational use. The workflow is flexible: model hyper-parameters are easy to fine-tune in the train.py script, and the overall pipeline can be modified to retrain the model on a different acoustic dataset. The original annotations were generated using LoggerPro in Triton; if you have audio annotations in a similar format, you can follow this repository as a step-by-step guide to train, validate, and test a CNN on your own dataset.
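If your annotations live in a comparable table (one row per call with a file path, start time, end time, and call type), the main repository-specific work is converting them into 30-second clip labels like those used above. Here is a hypothetical, pandas-only sketch of that conversion; all column names and file paths are assumptions, not this repository's exact format.

```python
# Hypothetical annotation-to-clip-label conversion; not one of this repo's scripts.
# Assumes a CSV with columns: file, start_time, end_time, call_type ("A" or "B").
import pandas as pd

CLIP_S = 30.0
annots = pd.read_csv("my_annotations.csv")

rows = []
for wav_file, file_annots in annots.groupby("file"):
    n_clips = int(file_annots["end_time"].max() // CLIP_S) + 1
    for i in range(n_clips):
        clip_start, clip_end = i * CLIP_S, (i + 1) * CLIP_S
        # a clip is positive for a class if any call of that class overlaps it
        overlapping = file_annots[
            (file_annots["start_time"] < clip_end) & (file_annots["end_time"] > clip_start)
        ]
        rows.append({
            "file": wav_file,
            "start_time": clip_start,
            "end_time": clip_end,
            "A": int((overlapping["call_type"] == "A").any()),
            "B": int((overlapping["call_type"] == "B").any()),
        })

labels = pd.DataFrame(rows).set_index(["file", "start_time", "end_time"])
labels.to_csv("data/processed/my_clip_labels.csv")
```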
Remember to always visualize your labels before you train a model (see below)!
Make sure your training data is kept completely separate from your testing data!
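One way to enforce that separation (a sketch using scikit-learn, not part of this repository's scripts) is a group-aware split on the source recording, so that clips cut from the same .wav file never land in both sets:

```python
# Hypothetical group-aware split: all clips from a given .wav file go to one side.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

labels = pd.read_csv("data/processed/my_clip_labels.csv", index_col=[0, 1, 2])
groups = labels.index.get_level_values("file")

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(labels, groups=groups))
train_df, test_df = labels.iloc[train_idx], labels.iloc[test_idx]
```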
The main goal of this repository is to increase computer vision accessibility in the bioacoustics community.
To visualize labeled spectrograms:

```bash
jupyter notebook notebooks/plot_spectrograms/plot_spectrograms.ipynb
```
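For a quick one-off check outside the notebook, a single 30-second window can also be plotted directly with OpenSoundscape (the file path below is a placeholder):

```python
# Plot one 30-second spectrogram window from a raw recording (placeholder path).
from opensoundscape import Audio, Spectrogram

audio = Audio.from_file("data/raw/example.wav", offset=0, duration=30)
Spectrogram.from_audio(audio).plot()
```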
This work was made possible with the support of the Resnick Sustainability Institute, the Computer Vision for Ecology summer workshop, the National Defense and Graduate Engineering Fellowship, the Kitzes Lab, and the Scripps Acoustic Ecology Lab.