This project detects deepfake audio using Wav2Vec2, a state-of-the-art self-supervised speech representation model developed by Facebook AI. With the rise of AI-generated content, deepfake audio poses serious threats in domains such as cybersecurity, media, legal evidence, and personal privacy. The system fine-tunes the Wav2Vec2 architecture on a labeled dataset containing both real human voices and synthetically generated audio samples, with the goal of automatically classifying an audio clip as real or deepfake with high accuracy.
- Deepfake Detection: Identifies whether an audio sample is real or fake.
- Pre-trained Model Support: Uses state-of-the-art deep learning models.
- User-Friendly Interface: Simple script-based execution for ease of use.
- Scalable & Efficient: Can be integrated into real-time applications.
├── app/ # Main application files
│ └── app.py # Script to run inference
├── models/ # Pre-trained models
├── scripts/ # Utility scripts
├── data/ # Contains raw audio files (ignored in .gitignore)
├── upload/ # For storing temporary files (ignored in .gitignore)
├── requirements.txt # Dependencies
├── training.log # Training history
├── .gitignore # Ignoring unnecessary files
└── README.md # Project documentation (this file)
git clone https://github.com/lakshiitakalyanasundaram/Lakshiita_kalyanasundaram.git
cd Lakshiita_kalyanasundaram
pip install -r requirements.txt
python app/app.py --input path/to/audio.wav
- Preprocessing: The audio file is preprocessed (e.g., noise reduction, feature extraction).
- Model Inference: The trained model classifies the audio as real or fake.
- Output: The result is displayed as real (✅) or deepfake (❌).
- Dataset: Trained on a dataset of real and fake audio samples.
- Model Used: Wav2Vec2 encoder with an LSTM + CNN hybrid classifier.
- Accuracy: Achieved ~72% accuracy in testing.
- Media Verification: Detect and verify the authenticity of voice recordings in journalism and broadcasting.
- Cybersecurity: Prevent voice spoofing attacks in authentication systems, especially in financial and biometric applications.
- Forensic Analysis: Assist law enforcement and legal investigations by identifying AI-generated speech in evidence materials.
- AI Ethics & Policy Compliance: Ensure responsible use of generative AI by identifying deepfake content, helping organizations comply with AI ethics policies.
- AI-Generated Speech Detection: Effectively distinguish synthetically generated voices from real human speech using deep learning.
I have used the 3004lakshu/for-norm dataset available on Hugging Face. It contains labeled samples of real and deepfake audio.
- Install the Hugging Face datasets library:
  pip install datasets
- Load the dataset in your script:
  from datasets import load_dataset
  dataset = load_dataset("3004lakshu/Deepfake-Audio")
- Explore the dataset:
  print(dataset)
Link: https://huggingface.co/datasets/3004lakshu/Deepfake-Audio
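As a quick sanity check, you can also inspect a single example. The field names below (`train`, `audio`, `label`) are assumptions based on typical Hugging Face audio datasets; print the keys first to confirm the actual schema:

```python
from datasets import load_dataset

dataset = load_dataset("3004lakshu/Deepfake-Audio")
sample = dataset["train"][0]  # assumes a "train" split exists
print(sample.keys())          # confirm the available fields
# Typical audio datasets expose the raw waveform and label like this:
# sample["audio"]["array"], sample["audio"]["sampling_rate"], sample["label"]
```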
The classification model is built on top of the Wav2Vec2 encoder. It extracts high-level audio embeddings from the input waveform. These embeddings are passed through a combination of LSTM and CNN layers, which capture temporal and local patterns in the speech signal.
This hybrid model architecture improves classification accuracy by combining both sequence and feature learning capabilities.
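As a rough illustration, the hybrid could be sketched in PyTorch as below; the layer sizes, kernel width, and last-frame pooling are assumptions for readability, not the exact hyperparameters used in training:

```python
import torch.nn as nn
from transformers import Wav2Vec2Model

class DeepfakeClassifier(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        # Pre-trained encoder: raw waveform -> 768-dim frame embeddings.
        self.encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        # 1-D CNN over time captures local patterns in the embeddings.
        self.conv = nn.Conv1d(768, hidden, kernel_size=5, padding=2)
        # LSTM captures longer-range temporal structure.
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        # Single logit: real vs. deepfake.
        self.head = nn.Linear(hidden, 1)

    def forward(self, waveform):                               # (B, samples)
        feats = self.encoder(waveform).last_hidden_state       # (B, T, 768)
        x = self.conv(feats.transpose(1, 2)).transpose(1, 2)   # (B, T, hidden)
        x, _ = self.lstm(x)
        return self.head(x[:, -1])                             # (B, 1) logit
```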
- Preprocessing:
  - Audio normalization
  - Silence trimming
  - Conversion to 16 kHz mono WAV
- Training (a configuration sketch follows this list):
  - Loss Function: Binary Cross Entropy
  - Optimizer: Adam
  - Epochs: 20
  - Dataset Split: 80% train, 20% test
  - Validation Accuracy: ~72%
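A minimal sketch of that training configuration is shown below; the random placeholder tensors, batch size, and learning rate are assumptions (the real pipeline feeds preprocessed 16 kHz waveforms from the dataset), and `DeepfakeClassifier` refers to the architecture sketch above:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Placeholder tensors standing in for preprocessed 1-second 16 kHz clips.
waveforms = torch.randn(100, 16000)
labels = torch.randint(0, 2, (100, 1)).float()   # assumed: 0 = real, 1 = fake

full = TensorDataset(waveforms, labels)
n_train = int(0.8 * len(full))                   # 80% train / 20% test split
train_set, test_set = random_split(full, [n_train, len(full) - n_train])

model = DeepfakeClassifier()
criterion = nn.BCEWithLogitsLoss()               # binary cross entropy on logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(20):                          # 20 epochs, as listed above
    for x, y in DataLoader(train_set, batch_size=8, shuffle=True):
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```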
This repository contains a fine-tuned .pth model using the wav2vec2 architecture for audio deepfake detection. The model has been trained and optimized on a custom dataset for this task.
- Install the Hugging Face Hub library and log in (if you haven't already):
  pip install huggingface_hub
  huggingface-cli login
- Download the model:
  from huggingface_hub import hf_hub_download
  model_path = hf_hub_download(
      repo_id="3004lakshu/wav2vec2_trained",
      filename="deepfake_model.pth",
  )
https://huggingface.co/3004lakshu/wav2vec2_trained
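Once downloaded, the checkpoint can be loaded for inference. Whether the .pth file stores a plain state dict (assumed below) or a full pickled module depends on how it was saved, so adjust accordingly:

```python
import torch

# Assumes the checkpoint holds a state dict for the classifier sketched earlier.
model = DeepfakeClassifier()
state = torch.load(model_path, map_location="cpu")
model.load_state_dict(state)
model.eval()  # disable training-only behavior (e.g., dropout) for inference
```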
To understand the goals, objectives, and evaluation criteria of this internship task, please refer to the detailed documentation below:
Task Document (Google Docs):
View Full Internship Task Document
A screen recording of the Streamlit interface demonstrating the detection process has been included.
Watch the demo here:
Feel free to fork the repo, open an issue, or submit a pull request to improve the project!
MIT License. Use it freely, but give credit where it's due.
If this project helps you, give it a star on GitHub!