VADCoreESP32

Version: 1.0.2
Author: Rakib Hasan
Category: Audio Data Processing

Overview

VADCoreESP32 is a Voice Activity Detection (VAD) library for ESP32 devices. It uses floating-point Fast Fourier Transform (FFT) calculations to detect speech activity in real time. The library is optimized for ESP32 hardware and runs the detection in its own FreeRTOS task.

Features

Real-time Voice Activity Detection using FFT
Configurable core and priority for FreeRTOS tasks
Gain adjustment and noise thresholding
Flexible I2S configuration for audio input
Smooth energy calculation for improved detection accuracy
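
The energy smoothing named in the last feature can be pictured as an exponential moving average over per-frame energy. The sketch below is an illustration only: `EnergySmoother` and its `alpha` parameter are hypothetical names, not part of the VADCoreESP32 API, and the library's actual smoothing method may differ.

```cpp
// Hypothetical helper, NOT part of the VADCoreESP32 API.
// Exponentially smooths per-frame energy so that a single loud or quiet
// frame does not immediately flip the detection result.
class EnergySmoother {
public:
    // alpha in (0, 1]: larger values follow new frames more quickly.
    explicit EnergySmoother(float alpha) : alpha_(alpha) {}

    // Blend the newest frame energy into the running estimate and return it.
    float update(float frameEnergy) {
        if (!primed_) {
            // The first frame seeds the estimate directly.
            smoothed_ = frameEnergy;
            primed_ = true;
        } else {
            smoothed_ = alpha_ * frameEnergy + (1.0f - alpha_) * smoothed_;
        }
        return smoothed_;
    }

    // Current smoothed energy (0 until the first update).
    float value() const { return smoothed_; }

private:
    float alpha_;
    float smoothed_ = 0.0f;
    bool primed_ = false;
};
```

A smaller `alpha` gives steadier output at the cost of slower reaction to the start and end of speech.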

Installation

Download the Library

You can download the latest version of VADCoreESP32 from the GitHub repository.

Add to Your Arduino Libraries

- Copy the VADCoreESP32 folder into your Arduino libraries directory.
- Restart the Arduino IDE if it was open during the copy.

Usage

Example Sketch

#include <VADCoreESP32.h>

// Define your I2S pin configuration
#define I2S_PORT I2S_NUM_0
#define I2S_WS 16
#define I2S_SD 7
#define I2S_SCK 15

VADCoreESP32 myVad;

void setup() {
    Serial.begin(115200);
    myVad.i2sInit(I2S_PORT, I2S_SCK, I2S_WS, I2S_SD);
    myVad.setCore(0);      // Optional: run the VAD task on core 0 (dual-core boards only)
    myVad.setPriority(10); // Optional: FreeRTOS task priority, 0 (lowest) to 24 (highest)
}

void loop() {
    if (Serial.available() > 0) {
        String cmd = Serial.readString();
        cmd.trim(); // Strip the line ending many serial monitors append
        if (cmd.compareTo("start") == 0) {
            myVad.start(); // Start VAD task
            Serial.println("Start Listening Command Received");
        }
    }

    if (myVad.getState() == VAD_SILENCE) {
        Serial.println("[Idle Heap:" + String(ESP.getFreeHeap()) + " Core:" + String(xPortGetCoreID()) + "]");
    }

    delay(100);
}

Class Methods

void setCore(int coreId)

Sets the core for the VAD task. Accepts 0 or 1 for ESP32 cores.

void setPriority(UBaseType_t priority)

Sets the priority of the VAD task. Use a value between 0 (lowest) and 24 (highest).

bool getState()

Returns the current state of the VAD: VAD_VOICE if speech is detected, otherwise VAD_SILENCE. Once VAD_SILENCE is triggered, the VAD task is deleted automatically.

void start()

Initializes and starts the VAD task with the configured core and priority.

Configuration

I2S_SAMPLE_RATE: The sample rate for I2S audio input. Default is 16000.
FFT_SIZE: The size of the FFT. Default is 256.
SPEECH_THRESHOLD: The threshold for detecting speech activity. Default is 3000.
NOISE_THRESHOLD: The threshold for distinguishing noise. Default is 1000.
SPEECH_FREQ_MIN and SPEECH_FREQ_MAX: The frequency range for detecting speech. Defaults are 300 Hz and 3400 Hz, respectively.
GAIN_FACTOR: The gain factor for audio samples. Default is 1.5.
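
To see how these defaults relate to each other, the sketch below maps the speech frequency range onto FFT bin indices and sums the magnitudes in that band. It is a rough illustration under stated assumptions: the constant and function names (`kSampleRate`, `firstBinAtOrAbove`, `bandEnergy`, and so on) are hypothetical, and the library's internal computation may differ. With the defaults, each bin spans 16000 / 256 = 62.5 Hz, so the 300 Hz to 3400 Hz band covers bins 5 through 54.

```cpp
#include <cmath>

// Default values copied from the configuration list above.
// The names are hypothetical stand-ins for the library's settings.
constexpr int   kSampleRate    = 16000;   // I2S_SAMPLE_RATE
constexpr int   kFftSize       = 256;     // FFT_SIZE
constexpr float kSpeechFreqMin = 300.0f;  // SPEECH_FREQ_MIN
constexpr float kSpeechFreqMax = 3400.0f; // SPEECH_FREQ_MAX

// Width of one FFT bin in Hz.
constexpr float binWidthHz() {
    return static_cast<float>(kSampleRate) / kFftSize;
}

// Index of the first bin whose center frequency is at or above freqHz.
int firstBinAtOrAbove(float freqHz) {
    return static_cast<int>(std::ceil(freqHz / binWidthHz()));
}

// Index of the last bin whose center frequency is at or below freqHz.
int lastBinAtOrBelow(float freqHz) {
    return static_cast<int>(std::floor(freqHz / binWidthHz()));
}

// Sum the magnitudes of the bins inside the speech band.
// `magnitudes` holds `numBins` values (typically kFftSize / 2).
float bandEnergy(const float* magnitudes, int numBins) {
    const int lo = firstBinAtOrAbove(kSpeechFreqMin);
    int hi = lastBinAtOrBelow(kSpeechFreqMax);
    if (hi > numBins - 1) {
        hi = numBins - 1; // Never read past the end of the spectrum.
    }
    float sum = 0.0f;
    for (int k = lo; k <= hi; ++k) {
        sum += magnitudes[k];
    }
    return sum;
}
```

A value like this band energy is the kind of quantity that thresholds such as SPEECH_THRESHOLD and NOISE_THRESHOLD would plausibly be compared against, though the exact comparison the library performs is not documented here.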

License

This library is released under the MIT License. See the LICENSE file for details.
