Room Acoustics and Microphone Characteristics Show Systematic Impact on Sound Event Recognition

This repository provides all the code, data, and supplementary materials referenced in the INTER-NOISE 2025 article:

Room Acoustics and Microphone Characteristics Show Systematic Impact on Sound Event Recognition

Gabriel Bibbó, Craig Cieciura, Mark D. Plumbley
Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, United Kingdom

Overview

This repository is a companion to the above article. It contains:

Audio generation code (audio_generation.py): Script that generates the standardized 60-minute audio file from AudioSet, including pre-processing (normalization, compression, class mixing, concatenation).
Annotations CSV (annotations.csv): A CSV file containing the YouTube video IDs, timestamps, and class labels used to construct the experimental audio segments.
Experimental results CSV (results.csv): A CSV file with frame-level results for every experimental configuration (room, microphone, class, etc.).
Analysis result images:
- Results from the first 30 minutes (single-class segments)
- Results from the last 30 minutes (overlapping-class segments, focusing on one class of each pair)
- Results from the last 30 minutes (overlapping-class segments, focusing on the complementary class)

These resources enable full transparency and reproducibility of the results discussed in the article.

Repository Structure

room_acoustics_SED/
├── audio_generation.py
├── annotations.csv
├── results.csv
├── results_single_class.png
├── results_overlap_classA.png
├── results_overlap_classB.png
├── README.md
└── IN2025_1070751.pdf

audio_generation.py: Script that generates the audio and processes the annotations.
annotations.csv: Metadata for each audio segment (YouTube ID, timestamp, class label).
results.csv: Frame-level metrics and summary statistics from all experimental runs.
results_single_class.png: Analysis of the first 30 minutes (single classes).
results_overlap_classA.png: Analysis of the last 30 minutes — focus on one class in overlapping pairs.
results_overlap_classB.png: Analysis of the last 30 minutes — focus on the complementary class.
IN2025_1070751.pdf: The full article as submitted to INTER-NOISE 2025.

Getting Started

Clone the repository.

git clone https://github.com/gbibbo/room_acoustics_SED.git
cd room_acoustics_SED

Review the audio generation code
- The code for generating the experimental audio and processing YouTube metadata is in audio_generation.py.
- See the comments in the script for usage instructions.
Explore the data
- annotations.csv: Lists every YouTube video used, the time intervals, and the mapped class label.
- results.csv: Contains all model outputs, including frame-level occurrence, mean probability, and confidence scores for each configuration.
View analysis results
- Figures as shown in the article:
  - results_single_class.png: Performance for each class, room, and microphone (first 30 mins).
  - results_overlap_classA.png: Overlapping classes, impact on primary class (last 30 mins).
  - results_overlap_classB.png: Overlapping classes, impact on complementary class.
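The exact column names of the two CSV files are not listed in this README, so a reasonable first step when exploring the data is to print each file's header row. The following is a minimal sketch using only the Python standard library; the helper name csv_header is illustrative and not part of the repository.

```python
import csv
from pathlib import Path


def csv_header(path):
    """Return the column names found on the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    # File names as listed in the repository structure.
    for name in ("annotations.csv", "results.csv"):
        if Path(name).exists():
            print(name, csv_header(name))
        else:
            print(name, "not found; run this from the repository root")
```

Run it from the repository root after cloning; the printed headers tell you which column names to use in any further analysis.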

Visual Results

Results from the first 30 minutes (single-class segments)

Results from the last 30 minutes (overlapping-class segments, primary class)

Results from the last 30 minutes (overlapping-class segments, complementary class)

Data Description

Audio Generation

The audio file was generated from AudioSet segments, grouped into 15 daily household sound classes.
Segments were normalized, compressed, and concatenated to form:
- 30 minutes of single-class audio (2 minutes per class)
- 30 minutes of overlapping-class audio (15 unique class pairs, 2 minutes per pair)
See audio_generation.py for all processing details.
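The layout of the 60-minute file follows from the figures above: 15 classes at 2 minutes each, then 15 class pairs at 2 minutes each. The sketch below only checks that arithmetic by building a segment timeline. The class names and the way classes are paired are placeholders chosen for illustration; the real class list and pairings are defined by audio_generation.py and annotations.csv.

```python
SEGMENT_SECONDS = 120  # 2 minutes per class or per pair, as stated above


def build_timeline(classes, pairs, segment_seconds=SEGMENT_SECONDS):
    """Return (start_s, end_s, labels) tuples: single classes first, then pairs."""
    timeline = []
    start = 0
    for labels in [(c,) for c in classes] + [tuple(p) for p in pairs]:
        timeline.append((start, start + segment_seconds, labels))
        start += segment_seconds
    return timeline


# Placeholder names and pairings (assumptions, not the article's actual ones).
classes = [f"class_{i:02d}" for i in range(1, 16)]
pairs = [(classes[i], classes[(i + 1) % 15]) for i in range(15)]

timeline = build_timeline(classes, pairs)
total_minutes = timeline[-1][1] / 60
print(len(timeline), total_minutes)  # prints: 30 60.0
```

The first 15 entries cover seconds 0 to 1800 (single-class half) and the last 15 cover seconds 1800 to 3600 (overlapping half).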

Annotations

Each 1-second segment is tracked in annotations.csv with:
- YouTube video ID
- Start/end timestamps
- Assigned class(es)
- Source information for traceability
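As a rough illustration of how these fields could be used, the sketch below sums the annotated duration per class. The column names (youtube_id, start_s, end_s, label) and the sample rows are invented for this example; check the real header of annotations.csv and pass the actual column names instead.

```python
import csv
import io
from collections import Counter

# Invented rows with hypothetical column names, for illustration only.
SAMPLE = """youtube_id,start_s,end_s,label
vidAAAAAAAA,30,31,class_a
vidAAAAAAAA,31,32,class_a
vidBBBBBBBB,10,11,class_b
"""


def seconds_per_label(handle, label_col="label", start_col="start_s", end_col="end_s"):
    """Sum annotated duration (in seconds) for each class label."""
    totals = Counter()
    for row in csv.DictReader(handle):
        totals[row[label_col]] += float(row[end_col]) - float(row[start_col])
    return dict(totals)


print(seconds_per_label(io.StringIO(SAMPLE)))
# prints: {'class_a': 2.0, 'class_b': 1.0}
```

To run it on the real file, replace io.StringIO(SAMPLE) with an open file handle, for example open("annotations.csv", newline="", encoding="utf-8").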

Experimental Results

results.csv contains:
- Room, microphone, class, and overlap configuration
- Frame-level detection occurrence (%)
- Mean probability/confidence assigned to the correct class
- SNR measurements for each configuration
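One way to summarize a table like this is to average the detection occurrence over classes for each room and microphone combination. The sketch below assumes hypothetical column names (room, microphone, class, occurrence_pct) and uses made-up values; it does not reproduce any number from the article, and the real column names in results.csv may differ.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented rows with hypothetical column names, for illustration only.
SAMPLE = """room,microphone,class,occurrence_pct
room_1,mic_1,class_a,80
room_1,mic_1,class_b,60
room_2,mic_1,class_a,50
"""


def mean_occurrence(handle, keys=("room", "microphone"), value_col="occurrence_pct"):
    """Average value_col over all rows sharing the same values in keys."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[tuple(row[k] for k in keys)].append(float(row[value_col]))
    return {group: mean(values) for group, values in groups.items()}


print(mean_occurrence(io.StringIO(SAMPLE)))
# prints: {('room_1', 'mic_1'): 70.0, ('room_2', 'mic_1'): 50.0}
```

Changing keys, for example to ("class",), groups by a different configuration field without touching the rest of the function.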

Result Images

Figures summarize the impact of room acoustics, microphone, and overlapping events, as described in the article:
- results_single_class.png: Classes in isolation
- results_overlap_classA.png: Overlaps, focus on primary class
- results_overlap_classB.png: Overlaps, focus on complementary class

Article Reference

If you use this repository or data, please cite:

Bibbó, G., Cieciura, C., & Plumbley, M. D. (2025). Room Acoustics and Microphone Characteristics Show Systematic Impact on Sound Event Recognition. INTER-NOISE 2025.
The full article is included in this repository as IN2025_1070751.pdf.

License

All code and data are provided for academic/research use under a Creative Commons Attribution (CC BY) license.
See LICENSE file for details.

Contact

For questions about the code or data, please contact Gabriel Bibbó: g.bibbo@surrey.ac.uk
