h1pot/live_interview_agent

Live Interview Agent

🎤 A one-PC Python agent that helps you handle AI-based interviews.
It listens to the interviewer, transcribes with Whisper, generates concise English answers with GPT, and shows them in a floating always-on-top window.

✨ Features:

  • Whisper STT + GPT answers in real time
  • Always-on-top floating GUI
  • VU meter (text + graphical) to confirm audio input
  • Translation (transcript / answer / both) with hotkeys
  • Chat timeline pane with save/clear options
  • Optional TTS to hear answers in your ear
  • Pro hotkeys: STAR mode, Concise vs Elaborate, copy, transparency, etc.

Quick Start

0) Prereqs

  • Python 3.10+
  • An OpenAI API key set as an env var:
    • Windows (PowerShell): setx OPENAI_API_KEY "sk-..." then restart terminal
    • macOS/Linux (bash): export OPENAI_API_KEY="sk-..."
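
Before launching, it can help to confirm that the key is actually visible to Python, since `setx` only affects terminals opened afterwards. The following is a minimal sketch; the helper name `api_key_status` is hypothetical and not part of the agent.

```python
import os


def api_key_status(env=os.environ):
    """Return a short status string for OPENAI_API_KEY without leaking it."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "missing"
    # Show only a short prefix so the secret never ends up in logs.
    return f"set ({key[:3]}..., {len(key)} chars)"


if __name__ == "__main__":
    print("OPENAI_API_KEY:", api_key_status())
```

If this prints `missing`, open a new terminal (Windows) or re-run the `export` line (macOS/Linux) and try again.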

1) Install dependencies

pip install -r requirements.txt

2) Route system audio to the agent

  • Windows: Install VB-CABLE or use Voicemeeter Banana. Send your browser/app output (AI’s voice) to CABLE Input, and select CABLE Output as the agent input.
  • macOS: Install BlackHole (2ch). Route browser output into BlackHole, then set the agent to record from it.

💡 Tip: Use headphones to avoid feedback.

3) Run

python live_interview_agent_v5.py --device-index <index_of_CABLE_Output>
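
To find the value for `--device-index`, one option is to list the input-capable audio devices and look for the virtual cable. The sketch below assumes the third-party `sounddevice` package; this README does not say which audio backend the agent uses, so the indices it expects may differ from the ones shown here. The helper names `input_devices` and `query_system_devices` are hypothetical.

```python
def input_devices(devices, name_filter=""):
    """Return (index, name) pairs for devices that can record audio.

    `devices` is a sequence of dicts with "name" and "max_input_channels"
    keys, which is the shape sounddevice.query_devices() returns.
    """
    needle = name_filter.lower()
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0 and needle in dev["name"].lower()
    ]


def query_system_devices():
    """Ask sounddevice for the real device list (assumed backend)."""
    import sounddevice as sd  # assumption: pip install sounddevice

    return sd.query_devices()
```

Calling `input_devices(query_system_devices(), "cable")` on Windows, or with `"blackhole"` on macOS, should narrow the list down to the virtual device.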

4) During interview

  • Keep the floating window visible.
  • Read/paraphrase the on-screen answer naturally.
  • The interviewer hears only your mic; the agent listens through the virtual cable.

Hotkeys

  • F1: Toggle TTS (on/off)
  • F2: Clear live fields
  • F3: Pause/resume listening
  • F4: Copy last answer to clipboard
  • F5: Toggle STAR mode (Situation → Task → Action → Result)
  • F6: Toggle Concise vs Elaborate answers
  • F7/F8: Adjust window transparency
  • F9: Cycle translation mode (off / transcript / answer / both)
  • F10: Change target language code
  • F11: Save timeline to file (txt)
  • F12: Clear timeline pane
  • ESC: Quit
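
The F9 cycle can be pictured as stepping through a fixed tuple and wrapping around at the end. This is an illustrative sketch only; `next_translate_mode` is a hypothetical name and the agent's actual handler may be written differently.

```python
TRANSLATE_MODES = ("off", "transcript", "answer", "both")


def next_translate_mode(current):
    """Return the mode that follows `current`, wrapping back to "off"."""
    position = TRANSLATE_MODES.index(current)
    return TRANSLATE_MODES[(position + 1) % len(TRANSLATE_MODES)]
```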

Configuration (via env vars)

  • AGENT_CHUNK_SECONDS (default 2.5) — audio slice length
  • AGENT_SAMPLE_RATE (default 16000) — capture rate
  • AGENT_SILENCE_THRESHOLD (default 0.003) — RMS gate for silence
  • AGENT_STT_MODEL (default whisper-1)
  • AGENT_GPT_MODEL (default gpt-4o-mini)
  • AGENT_SHOW_TRANSCRIPT (1 to display recognized question)
  • AGENT_TTS_DEFAULT (1 to start with TTS enabled)
  • AGENT_TRANSLATE (off|transcript|answer|both)
  • AGENT_TARGET_LANG (default es)
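
As a rough picture of how these variables could be read, here is a minimal sketch that loads them into one object. The `AgentConfig` class and `load_config` function are hypothetical names. The defaults for `AGENT_SHOW_TRANSCRIPT`, `AGENT_TTS_DEFAULT`, and `AGENT_TRANSLATE` are not stated above, so the sketch assumes they start disabled.

```python
import os
from dataclasses import dataclass

TRANSLATE_MODES = ("off", "transcript", "answer", "both")


@dataclass(frozen=True)
class AgentConfig:
    chunk_seconds: float
    sample_rate: int
    silence_threshold: float
    stt_model: str
    gpt_model: str
    show_transcript: bool
    tts_default: bool
    translate: str
    target_lang: str


def load_config(env=os.environ):
    """Build an AgentConfig from a mapping of environment variables."""
    translate = env.get("AGENT_TRANSLATE", "off")  # assumed default: off
    if translate not in TRANSLATE_MODES:
        raise ValueError(f"AGENT_TRANSLATE must be one of {TRANSLATE_MODES}")
    return AgentConfig(
        chunk_seconds=float(env.get("AGENT_CHUNK_SECONDS", "2.5")),
        sample_rate=int(env.get("AGENT_SAMPLE_RATE", "16000")),
        silence_threshold=float(env.get("AGENT_SILENCE_THRESHOLD", "0.003")),
        stt_model=env.get("AGENT_STT_MODEL", "whisper-1"),
        gpt_model=env.get("AGENT_GPT_MODEL", "gpt-4o-mini"),
        # Assumed defaults: both flags are off unless set to "1".
        show_transcript=env.get("AGENT_SHOW_TRANSCRIPT", "0") == "1",
        tts_default=env.get("AGENT_TTS_DEFAULT", "0") == "1",
        translate=translate,
        target_lang=env.get("AGENT_TARGET_LANG", "es"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try out settings without touching the real environment.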

Example (Windows PowerShell; setx only affects terminals opened afterwards, so restart the terminal before running the agent):

setx AGENT_SHOW_TRANSCRIPT "1"
setx AGENT_TRANSLATE "both"
setx AGENT_TARGET_LANG "fr"

Troubleshooting

  • No VU activity → check that browser/app output is routed to CABLE Input.
  • VU moves but no transcript → raise volume or lower AGENT_SILENCE_THRESHOLD.
  • ImportError → run pip install -r requirements.txt.
  • TTS issues → disable with F1 or set AGENT_TTS_DEFAULT=0.
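
The "VU moves but no transcript" case is easier to reason about with the silence gate written out. The sketch below assumes samples are floats normalized to the range -1.0 to 1.0 and that a chunk is skipped when its RMS falls below `AGENT_SILENCE_THRESHOLD`; the function names are hypothetical and the agent's own gate may differ in detail.

```python
import math


def rms(samples):
    """Root-mean-square level of a chunk of normalized audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples, threshold=0.003):
    """True when the chunk is quieter than the gate and would be skipped."""
    return rms(samples) < threshold
```

Under these assumptions, a quiet source whose level sits around 0.001 never passes the default gate of 0.003, which is why raising the playback volume or lowering the threshold both help.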

Safety & Ethics

This tool is meant to support comprehension and confidence. Use responsibly and check the rules of your interview platform.

About

An AI-powered real-time interview companion that runs entirely on your PC.
