
Terminal-based voice logger with real-time playback and optional compression -- built for fast feedback, idea capture, and speech awareness.


eugenpt/ep_voicetrain


Voice Logger with Real-Time Playback and Compression

This tool continuously records audio input, displays real-time volume levels, and plays back speech segments after pauses. It saves all audio to .wav files and logs each playback chunk in .jsonl. Optionally, finished .wav files are compressed to .mp3 using ffmpeg if it is available.


🔧 Features

  • 🔊 Continuous audio recording
  • 🎛 Real-time volume meter and adjustable threshold
  • ⏸ Automatic detection of pauses of adjustable duration
  • ↺ Playback after each detected pause
  • 📏 Saves all audio (including silence)
  • 🗒 One-line JSON log per playback (JSONL format)
  • 📁 Hourly file splitting with timestamped filenames
  • 🎼 Optional compression to high-quality MP3 with ffmpeg
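The pause detection described above can be sketched roughly as follows. This is an illustrative sketch only, not the tool's actual code: the function names `rms` and `find_segments` are hypothetical, and it assumes the volume level is an RMS value per chunk compared against the threshold, with a pause declared once enough consecutive quiet chunks have passed.

```python
import math
from typing import Iterable, Iterator, Sequence, Tuple


def rms(chunk: Sequence[float]) -> float:
    """Root-mean-square level of one chunk of samples in [-1.0, 1.0]."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def find_segments(
    levels: Iterable[float],
    threshold: float,
    pause_duration: float,
    chunk_duration: float,
) -> Iterator[Tuple[int, int]]:
    """Yield (first_loud_chunk, last_loud_chunk) for each phrase.

    A phrase ends once the level has stayed below `threshold` for at
    least `pause_duration` seconds (hypothetical rule, see lead-in).
    """
    pause_chunks = max(1, round(pause_duration / chunk_duration))
    start = None      # index of the first loud chunk of the current phrase
    last_loud = None  # index of the most recent loud chunk
    for i, level in enumerate(levels):
        if level >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= pause_chunks:
            # Enough quiet chunks in a row: the phrase is complete.
            yield (start, last_loud)
            start = None


if __name__ == "__main__":
    levels = [0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0]
    print(list(find_segments(levels, 0.2, 0.2, 0.1)))
```

Raising the threshold makes the detector ignore more background noise, while a longer pause duration lets you hesitate mid-sentence without triggering playback.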

▶ How to Use

1. Install Dependencies

  1. (Optional) Install FFmpeg and add it to your system PATH for .mp3 compression.
  2. Run the installation script:
install.bat
  • If a virtual environment (venv/) already exists, the script will ask whether to reuse it or delete and recreate it.
  • All required Python packages will be installed into the virtual environment.

2. Run the Logger

run.bat

Speak naturally. When you pause for longer than the configured pause duration, the last spoken phrase is played back automatically.


🎹 Controls

| Key | Action |
| --- | --- |
| ↑ / ↓ | Increase / Decrease volume threshold |
| ← / → | Increase / Decrease pause duration |
| n | Start a new recording file immediately |
| Space / Enter | Stop playback early |
| q | Quit the program |

📁 Output

All session data is saved to the recordings/ folder:

  • .wav file (raw audio)
  • .jsonl file (playback chunk metadata)
  • .mp3 file (optional, if ffmpeg is available)

Example log entry in .jsonl (pretty-printed here for readability; in the file, each entry occupies a single line):

{
  "fs": 16000,
  "chunk_duration": 0.1,
  "start_chunk": 103,
  "end_chunk": 156,
  "start_time": 1718546724.13,
  "end_time": 1718546729.72,
  "threshold": 0.2,
  "pause_duration": 1.5
}
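A log entry like the one above could be read back with a few lines of Python. This is a minimal sketch: the helper name `segment_info` is hypothetical, and it assumes that `chunk_duration` is in seconds, that `start_time` and `end_time` are Unix timestamps, and that chunk indices count chunks from the start of the recording file. Whether `end_chunk` is inclusive is not stated, so the chunk-based duration is only approximate.

```python
import json


def segment_info(line: str) -> dict:
    """Derive a few convenience values from one .jsonl log line."""
    entry = json.loads(line)
    n_chunks = entry["end_chunk"] - entry["start_chunk"]
    return {
        # Duration according to the wall-clock timestamps.
        "wall_seconds": entry["end_time"] - entry["start_time"],
        # Duration according to the chunk indices (approximate).
        "chunk_seconds": n_chunks * entry["chunk_duration"],
        # Number of audio samples in one chunk at sample rate `fs`.
        "samples_per_chunk": round(entry["fs"] * entry["chunk_duration"]),
    }


if __name__ == "__main__":
    line = (
        '{"fs": 16000, "chunk_duration": 0.1, "start_chunk": 103, '
        '"end_chunk": 156, "start_time": 1718546724.13, '
        '"end_time": 1718546729.72, "threshold": 0.2, "pause_duration": 1.5}'
    )
    print(segment_info(line))
```

To process a whole session, call the function once per line of the .jsonl file, since each line is an independent JSON object.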

✅ Requirements

  • Windows OS
  • Python 3.8+
  • Microphone input
  • (Optional) ffmpeg in PATH for .mp3 compression

📌 Notes

  • The tool saves everything, including silence.
  • Playback covers only the speech segment that precedes each detected pause.
  • If ffmpeg is missing, the .wav file will still be saved — compression will simply be skipped.
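The "compress only if ffmpeg is available" behaviour could look something like the sketch below. It is not the tool's actual implementation: the function names are hypothetical, and the encoder settings (`libmp3lame` with `-q:a 2`) are an assumed choice for "high-quality MP3", not something the project documents.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(wav_path: Path, ffmpeg: str = "ffmpeg") -> List[str]:
    """Build an ffmpeg command that encodes `wav_path` to an .mp3 beside it."""
    mp3_path = wav_path.with_suffix(".mp3")
    return [
        ffmpeg,
        "-y",                      # overwrite an existing output file
        "-i", str(wav_path),       # input file
        "-codec:a", "libmp3lame",  # MP3 encoder (assumed choice)
        "-q:a", "2",               # high-quality VBR setting (assumed choice)
        str(mp3_path),
    ]


def compress_if_possible(wav_path: Path) -> Optional[Path]:
    """Compress to .mp3 if ffmpeg is on PATH; otherwise skip and return None."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return None  # ffmpeg missing: keep the .wav and skip compression
    subprocess.run(build_ffmpeg_command(wav_path, ffmpeg), check=True)
    return wav_path.with_suffix(".mp3")


if __name__ == "__main__":
    print(build_ffmpeg_command(Path("recordings") / "example.wav"))
```

Checking with `shutil.which` up front keeps a missing ffmpeg from raising an error, which matches the note that the .wav file is always kept.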

📄 License

MIT License. Use it, tweak it, share it.


Enjoy your real-time voice playback! 🎙️
