A simple real-time voice analysis tool using Vosk, Parselmouth, Librosa, and Electron to provide feedback on speaking habits like pitch (F0), volume (RMS), upward inflection, staccato rhythm, and vocal shakiness (jitter/shimmer).
This is a pet project primarily developed for macOS.
- Real-time calculation of RMS (volume) and F0 (pitch).
- Utterance-level analysis of:
  - Jitter & Shimmer (using Praat via Parselmouth)
  - Staccato Rhythm (based on pause/word duration statistics from Vosk)
  - Upward Inflection (based on F0 slope at utterance end)
- Simple Electron UI displaying metrics and visual alerts.
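
As a rough illustration of two of the metrics listed above, here is a minimal sketch of frame-level RMS and an F0-slope check for upward inflection. It is not the code in voice.py: the function names, the 0.5 s tail window, and the 40 Hz/s threshold are all assumptions chosen for the example, and it expects an F0 track where unvoiced frames are reported as 0.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def f0_slope(times, f0_values):
    """Least-squares slope in Hz per second, using voiced frames only.

    Frames with f0 <= 0 are treated as unvoiced and skipped.
    """
    pts = [(t, f) for t, f in zip(times, f0_values) if f > 0]
    if len(pts) < 2:
        return 0.0
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    denom = sum((t - mean_t) ** 2 for t, _ in pts)
    if denom == 0:
        return 0.0
    return sum((t - mean_t) * (f - mean_f) for t, f in pts) / denom


def has_upward_inflection(times, f0_values,
                          tail_seconds=0.5,          # hypothetical window
                          min_slope_hz_per_s=40.0):  # hypothetical threshold
    """Flag an utterance whose pitch rises over its final tail_seconds."""
    if not times:
        return False
    cutoff = times[-1] - tail_seconds
    tail = [(t, f) for t, f in zip(times, f0_values) if t >= cutoff]
    slope = f0_slope([t for t, _ in tail], [f for _, f in tail])
    return slope >= min_slope_hz_per_s
```

Fitting a line over the tail rather than comparing the last two frames makes the check less sensitive to a single noisy pitch estimate. Both defaults would likely need tuning per speaker.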
- Backend: Python 3
- Speech-to-Text: Vosk
- Acoustic Analysis: Parselmouth (Praat), Librosa
- Audio I/O: sounddevice
- Frontend: Electron, Node.js
- Clone Repository:
  git clone <your-repo-url>
  cd voicecoach
- Python Setup (Requires Python 3.9+):
  - Create a virtual environment:
    python3 -m venv venv
  - Activate the environment:
    - macOS/Linux:
      source venv/bin/activate
    - Windows:
      .\venv\Scripts\activate
  - Install Python dependencies:
    pip install -r requirements.txt
- Download Vosk Model:
  - Download the model vosk-model-small-en-us-0.15 from https://alphacephei.com/vosk/models.
  - Extract the downloaded archive.
  - IMPORTANT: Place the extracted folder (which should be named vosk-model-small-en-us-0.15) directly into the root directory of this project (voicecoach/).
- Node.js Setup (Requires Node >= v22, npm >= v10):
  - Install Node.js and npm if you haven't already: https://nodejs.org/
  - Install Node dependencies:
    npm install
Ensure your Python virtual environment is deactivated. From the project's root directory (voicecoach/), run:
npm start
The application window should open and automatically start listening.
- voice.py: Python backend script handling audio capture, analysis, and JSON output.
- main.js: Electron main process script, manages the app window and Python child process.
- preload.js: Electron preload script for secure IPC.
- renderer.js: Electron renderer process script, handles UI logic and updates.
- index.html: Defines the UI structure.
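
Since the Python backend runs as a child process of Electron and produces JSON output, one common way to wire this up is to write one JSON object per line to stdout and flush after each message. The sketch below shows that pattern only. The helper names and the `type`, `rms`, and `f0` fields are hypothetical and may not match the schema voice.py actually uses.

```python
import json
import sys


def format_message(kind, **fields):
    """Serialize one message as a single line of JSON (no embedded newlines)."""
    # "type" is a hypothetical discriminator field, not a confirmed schema.
    return json.dumps({"type": kind, **fields})


def emit(kind, stream=sys.stdout, **fields):
    """Write one JSON line and flush so the parent process sees it at once."""
    stream.write(format_message(kind, **fields) + "\n")
    stream.flush()


if __name__ == "__main__":
    # Example real-time update with made-up values.
    emit("realtime", rms=0.042, f0=182.5)
```

Flushing after every line matters because stdout is block-buffered when it is a pipe, so without it the parent process may receive updates in delayed bursts.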
- Primarily tested on macOS.
- Shakiness detection uses basic Jitter/Shimmer thresholds that may need tuning (renderer.js).
- Staccato detection rules are experimental (voice.py).
- Requires the specific Vosk model vosk-model-small-en-us-0.15 placed in the root folder.
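
To give a sense of what tuning the staccato rules involves, here is a minimal sketch that derives pause and word-duration statistics from a Vosk result with word timings enabled (the `result` list of `word`/`start`/`end` entries). The rule itself and every threshold in it are assumptions made for illustration, not the rules in voice.py.

```python
import json
import statistics


def rhythm_stats(vosk_result_json):
    """Compute word-duration and pause statistics from a Vosk result string.

    Expects word-level timings: {"result": [{"word", "start", "end"}, ...]}.
    """
    words = json.loads(vosk_result_json).get("result", [])
    durations = [w["end"] - w["start"] for w in words]
    # Pause = gap between the end of one word and the start of the next.
    pauses = [max(0.0, nxt["start"] - cur["end"])
              for cur, nxt in zip(words, words[1:])]
    return {
        "word_count": len(words),
        "mean_word_duration": statistics.mean(durations) if durations else 0.0,
        "mean_pause": statistics.mean(pauses) if pauses else 0.0,
    }


def looks_staccato(stats,
                   min_words=4,             # hypothetical: ignore very short utterances
                   max_word_duration=0.25,  # hypothetical: seconds
                   min_pause=0.15):         # hypothetical: seconds
    """Flag utterances made of short words separated by noticeable gaps."""
    return (stats["word_count"] >= min_words
            and stats["mean_word_duration"] <= max_word_duration
            and stats["mean_pause"] >= min_pause)
```

Requiring a minimum word count avoids flagging one- or two-word replies, where pause statistics are not meaningful.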
