This repository contains the dataset and detection code for our paper "AI-Synthesized Voice Detection Using Neural Vocoder Artifacts", accepted at the CVPR Workshop on Media Forensics 2023.

csun22/Synthetic-Voice-Detection-Vocoder-Artifacts

🧠 Synthetic-Voice-Detection-Vocoder-Artifacts


📁 LibriSeVoc Dataset

  1. We are the first to identify neural vocoders as a source of features to expose synthetic human voices.
    The figure below shows how the output of each of the six vocoders differs from the original audio:

    [Figure: comparison of the six vocoders' outputs with the original audio]

  2. We provide LibriSeVoc, a dataset of self-vocoding samples created with six state-of-the-art vocoders, to highlight and exploit the vocoder artifacts.
    The composition of the dataset is shown in the following table:

    [Table: composition of the LibriSeVoc dataset]

    The ground-truth audio in our dataset comes from LibriTTS, so file names follow the LibriTTS naming convention.
    For example:
    27_123349_000006_000000.wav

    • 27 is the reader's ID
    • 123349 is the ID of the chapter
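
The naming convention above can be read programmatically. The following is a minimal sketch, not part of the repository: `SampleName` and `parse_sample_name` are hypothetical helpers, and only the first two fields (reader ID and chapter ID) are interpreted, since those are the ones described here. The remaining fields are kept as-is without assigning them a meaning.

```python
from pathlib import Path
from typing import NamedTuple


class SampleName(NamedTuple):
    """Fields of a LibriTTS-style file name (hypothetical helper type)."""
    reader_id: str
    chapter_id: str
    remaining: tuple  # trailing fields, left uninterpreted in this sketch


def parse_sample_name(path: str) -> SampleName:
    """Split a file name such as 27_123349_000006_000000.wav on underscores."""
    stem = Path(path).stem  # drop any directory part and the ".wav" suffix
    fields = stem.split("_")
    if len(fields) < 2:
        raise ValueError(f"unexpected file name: {path!r}")
    return SampleName(fields[0], fields[1], tuple(fields[2:]))


info = parse_sample_name("27_123349_000006_000000.wav")
print(info.reader_id, info.chapter_id)  # prints: 27 123349
```

Grouping files by `reader_id` this way may be useful for building splits without reader overlap, although the repository may organize its splits differently.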

🎯 Deepfake Detection

We propose a new approach to detecting synthetic human voices by:

  • Exposing signal artifacts left by neural vocoders
  • Improving the RawNet2 baseline by adding a multi-loss training objective

✅ This lowers the error rate from 6.10% to 4.54% on the ASVspoof Dataset.
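
To give a rough idea of what a multi-loss objective can look like, the sketch below combines a real/fake classification loss with a vocoder-identification loss. This is an illustration under assumptions, not the code in `main.py`: the two-head layout, the `weight` parameter, the seven-class setup (six vocoders plus real speech), and the function names are all hypothetical. It is written in plain Python so that it runs without a deep learning framework.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def multi_loss(binary_logits, is_fake, vocoder_logits, vocoder_id, weight=0.5):
    """Weighted sum of a real/fake loss and a vocoder-identification loss.

    Both the weighting scheme and the two output heads are assumptions
    made for illustration.
    """
    loss_binary = cross_entropy(binary_logits, is_fake)
    loss_vocoder = cross_entropy(vocoder_logits, vocoder_id)
    return weight * loss_binary + (1.0 - weight) * loss_vocoder


# One fake sample assigned to hypothetical class 3 of 7
# (assumed layout: six vocoders plus real speech).
loss = multi_loss([0.2, 1.5], 1, [0.1, 0.0, 0.3, 2.0, 0.1, 0.0, 0.2], 3)
print(f"combined loss: {loss:.4f}")
```

The intuition is that the auxiliary vocoder term pushes the network to attend to vocoder-specific artifacts, which is the signal this approach relies on.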

Here is the framework of the proposed synthesized voice detection method:

[Figure: framework of the proposed synthesized voice detection method]

📄 Paper & Dataset


🛠️ Usage

🏋️‍♀️ To train the model, run:

python main.py --data_path /your/path/to/LibriSeVoc/ --model_save_path /your/path/to/models/

🧪 To test the model on your own sample, run:

python eval.py --input_path /your/path/to/sample.wav --model_path /your/path/to/your_model.pth
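
Since `eval.py` takes a single file through `--input_path`, checking a whole folder means calling it once per file. The following is a minimal sketch of one way to do that; `build_eval_commands` is a hypothetical helper, not part of the repository, and it assumes that `eval.py` is in the current directory and accepts exactly the two flags shown above.

```python
import sys
from pathlib import Path


def build_eval_commands(sample_dir, model_path, script="eval.py"):
    """Return one eval.py command (as an argument list) per .wav file."""
    commands = []
    for wav in sorted(Path(sample_dir).glob("*.wav")):
        commands.append([
            sys.executable, script,          # run with the current interpreter
            "--input_path", str(wav),
            "--model_path", str(model_path),
        ])
    return commands


# Paths below are placeholders; nothing is printed if the folder has no .wav files.
for cmd in build_eval_commands("/your/path/to/samples/", "/your/path/to/your_model.pth"):
    print(" ".join(cmd))
```

To actually run the commands instead of printing them, each argument list can be passed to `subprocess.run(cmd, check=True)`.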

📥 Pretrained Model Weights

Download the trained model weights from the link below:

https://drive.google.com/file/d/15qOi26czvZddIbKP_SOR8SLQFZK8cf8E/view?usp=sharing

🌐 In-the-Wild Testing

You can test audio samples live on our lab's Deepfake O Meter platform:

https://zinc.cse.buffalo.edu/ubmdfl/deep-o-meter/landing_page

📄 License

This repository is licensed under the MIT License.
You are free to use, modify, and distribute the code with proper attribution.
