Speaker Diarization and Audio Segmentation

Overview

This project processes YouTube videos by extracting audio, performing noise reduction, and identifying distinct speakers using diarization techniques. The processed audio is segmented and organized for further analysis.
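As a rough illustration of the audio-extraction step, the sketch below builds an `ffmpeg` command that drops the video stream and writes a mono WAV file at a fixed sample rate. This is an assumption for illustration only: the README does not say which tool the project uses, and the helper name `build_extract_command`, the file names, and the 16 kHz default are hypothetical.

```python
from pathlib import Path


def build_extract_command(video_path, wav_path, sample_rate=16000):
    """Return an ffmpeg argument list that writes a mono WAV file.

    Hypothetical helper: the tool choice and the 16 kHz default are
    assumptions, not taken from the project.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video_path),    # input: the downloaded video
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # mix down to a single (mono) channel
        "-ar", str(sample_rate),  # resample to a fixed sample rate
        str(Path(wav_path).with_suffix(".wav")),
    ]


# Build (but do not run) the command for a placeholder file name.
cmd = build_extract_command("video.mp4", "audio.wav")
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, provided `ffmpeg` is installed and on the `PATH`.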

Features

  • YouTube Video Processing: Accepts a YouTube link, downloads the video, and extracts the audio.
  • Audio Standardization: Converts the audio to WAV format, sets a mono channel, and normalizes the sample rate.
  • Noise Reduction: Applies denoising techniques to improve audio quality.
  • Speaker Diarization: Identifies individual speakers and generates timestamped labels.
  • Visualization: Displays speaker transitions and overlaps graphically.
  • Audio Segmentation: Splits the audio into 10-second speaker-specific segments.
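The segmentation step can be pictured with a minimal sketch: given timestamped speaker labels, cut each speaker turn into chunks of at most 10 seconds and group them by speaker. The `Turn` class, the function names, and the `SPEAKER_00`-style labels are hypothetical, and this sketch keeps a shorter final chunk per turn; the project may handle remainders differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One diarization label: a speaker talking from start to end (seconds)."""
    speaker: str
    start: float
    end: float


def split_turns(turns, max_len=10.0):
    """Cut every turn into consecutive chunks of at most max_len seconds."""
    chunks = []
    for turn in turns:
        t = turn.start
        while t < turn.end:
            chunk_end = min(t + max_len, turn.end)
            chunks.append(Turn(turn.speaker, t, chunk_end))
            t = chunk_end
    return chunks


def group_by_speaker(chunks):
    """Map each speaker label to its list of (start, end) chunks."""
    grouped = {}
    for chunk in chunks:
        grouped.setdefault(chunk.speaker, []).append((chunk.start, chunk.end))
    return grouped


# Illustrative input: two turns with made-up timestamps.
turns = [Turn("SPEAKER_00", 0.0, 25.0), Turn("SPEAKER_01", 25.0, 32.0)]
segments = group_by_speaker(split_turns(turns))
print(segments)
```

Each `(start, end)` pair could then be used to slice the standardized WAV file into one file per chunk.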

Installation

Prerequisites

Ensure you have Python 3.10.12 installed.

Steps to Install Dependencies

  1. Clone this repository:
    git clone https://github.com/myselfbasil/speaker-diarization
    cd speaker-diarization
  2. Create and activate a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  3. Install required dependencies:
    pip install -r requirements.txt

Usage

  1. Run the Jupyter Notebook:

    jupyter notebook main.ipynb

    Provide a YouTube link and execute the notebook cells step by step to process the audio.

  2. Run the Python script:
    Basic Usage

    python diarization.py "https://youtube.com/watch?v=..."

    Advanced Usage

    python diarization.py "https://youtube.com/watch?v=..." \
     -n 3 \
     -o ./results \
     --window 0.4 \
     --period 0.2 \
     --workers 8 \
     --debug
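To process several videos with the same options, the script can be driven from Python. The sketch below only assembles the command lines shown above and, by default, does not execute them. `build_command` and `run_batch` are hypothetical helpers, and the flags are passed through unchanged because their meaning is not documented here.

```python
import subprocess
import sys


def build_command(url, extra_args=()):
    """Return the argv for one diarization.py run on a single YouTube URL."""
    return [sys.executable, "diarization.py", url, *extra_args]


def run_batch(urls, extra_args=(), dry_run=True):
    """Build one command per URL; run them in order unless dry_run is set."""
    commands = [build_command(url, extra_args) for url in urls]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


# Placeholder URL copied from the usage example; replace with real links.
urls = ["https://youtube.com/watch?v=..."]
commands = run_batch(urls, extra_args=["-o", "./results", "--debug"])
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_batch(urls, extra_args, dry_run=False)` from the repository root would launch the runs one after another.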

Applications

  • Podcast and video transcription
  • Speaker analysis in discussions and interviews
  • Content segmentation for research and media archiving

About

This project focuses on audio diarization and speaker classification.
