This is the official code for the paper "Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection".
Paper Link: https://arxiv.org/abs/2411.01174
- Please first install the dependencies: `pip install -r requirements.txt`
Please follow the instructions in the DCASE 2024 Challenge Task 9 baseline to pre-train LASS models: https://github.com/Audio-AGI/dcase2024_task9_baseline

Alternatively, download our pre-trained AudioSep-DP model, released at: https://zenodo.org/records/14208090
In our paper, we used the DESED, AudioSet-Strong, and WildDESED datasets. Please download them from https://project.inria.fr/desed/, https://research.google.com/audioset/, and https://zenodo.org/records/14013803 respectively.
In our work, we used the pre-trained LASS model to extract the sound tracks of different events. Please run `separate_audio.py` to perform this step.
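As an illustration of this step, the sketch below plans one text-queried separation job per event class for a given mixture: it phrases a text query from each class name and derives a per-class output path. The class list is DESED's ten event classes; the query phrasing and the `model.separate(...)` call mentioned in the comment are assumptions for illustration, not the actual interface of `separate_audio.py`.

```python
from pathlib import Path

# DESED's ten sound event classes, assumed here as the query vocabulary.
DESED_CLASSES = [
    "Alarm_bell_ringing", "Blender", "Cat", "Dishes", "Dog",
    "Electric_shaver_toothbrush", "Frying", "Running_water",
    "Speech", "Vacuum_cleaner",
]

def build_separation_jobs(mixture_path, out_dir):
    """For one mixture, plan a (text query, output path) pair per event class.

    The query is simply the class name with underscores replaced by spaces,
    lowercased -- one plausible way to phrase queries for a text-queried
    separation model (an assumption, not the paper's exact prompt format).
    """
    mixture = Path(mixture_path)
    jobs = []
    for cls in DESED_CLASSES:
        query = cls.replace("_", " ").lower()
        out_path = Path(out_dir) / cls / mixture.name
        jobs.append((query, out_path))
    return jobs

jobs = build_separation_jobs("mixtures/clip_001.wav", "separated")
for query, out_path in jobs:
    # A real run would pass the mixture and `query` to the LASS model here,
    # e.g. model.separate(mixture, text=query), and write the estimated
    # track to out_path -- that call is hypothetical.
    print(query, "->", out_path)
```

This yields one separated track per event class for each clip, which the training scripts can then consume alongside the original mixtures.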
Without curriculum learning: run `python train_pretrained.py`

With curriculum learning: run `python train_pretrained_cl.py`
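To sketch the curriculum idea behind the second script: training can start on clean audio and gradually mix in environmental noise at lower and lower SNR, so the model faces harder examples over time. The stage boundaries and SNR values below are illustrative assumptions, not the settings used in `train_pretrained_cl.py`.

```python
def noise_snr_for_epoch(epoch, stages=((0, None), (20, 10), (40, 5), (60, 0))):
    """Pick the noise SNR (dB) to mix at for the current epoch.

    `stages` is a sequence of (start_epoch, snr_db) pairs in ascending
    order of start_epoch; None means no added noise. The schedule moves
    from clean audio to 0 dB noise, a simple stage-based curriculum.
    These particular stages are assumptions for illustration.
    """
    snr = stages[0][1]
    for start_epoch, stage_snr in stages:
        if epoch >= start_epoch:
            snr = stage_snr
    return snr

# Example: clean for the first 20 epochs, then 10 dB, 5 dB, and 0 dB noise.
schedule = [noise_snr_for_epoch(e) for e in (0, 25, 45, 100)]
```

A training loop would call this once per epoch and pass the chosen SNR to its noise-mixing augmentation.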
PS: Please follow the instructions in DCASE 2024 Challenge Task 4 to extract embeddings with the pre-trained BEATs model.
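One practical detail when using such embeddings: BEATs produces frame-level vectors at a different temporal resolution than the SED model's output frames, so the embedding sequence is typically resampled along time before fusion. The sketch below does this with per-dimension linear interpolation; the frame counts (496 in, 156 out) and the interpolation choice are assumptions for illustration, not the DCASE baseline's exact recipe.

```python
import numpy as np

def align_embeddings(emb, target_frames):
    """Resample frame-level embeddings of shape (T, D) to (target_frames, D).

    Each embedding dimension is linearly interpolated along the time axis
    so the sequence matches the SED model's frame rate. Linear
    interpolation is an assumed choice; other alignments are possible.
    """
    T, D = emb.shape
    src = np.linspace(0.0, 1.0, T)
    dst = np.linspace(0.0, 1.0, target_frames)
    out = np.empty((target_frames, D), dtype=emb.dtype)
    for d in range(D):
        out[:, d] = np.interp(dst, src, emb[:, d])
    return out

# Example: shrink an assumed (496, 768) BEATs output to 156 SED frames.
aligned = align_embeddings(np.random.rand(496, 768).astype(np.float32), 156)
```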