QA-BOT is an AI-powered solution for monitoring and evaluating customer service and BPO agent interactions in real time. It processes audio data, analyzes sentiment, detects profanity, evaluates tonality, and provides actionable insights to improve service quality.
Real-Time Transcription – Fast & accurate speech-to-text conversion with timestamps.
Speaker Diarization – Identifies & labels different speakers in a conversation.
Sentiment Analysis – Detects emotional states & conversation context.
Profanity Detection – Automatically flags inappropriate language.
Pause Detection – Analyzes response times & conversation flow.
Tonality Checking – Evaluates voice tone & speech patterns.
Knowledge Accuracy (Upcoming Feature) – Assesses information correctness.
| Component | Technology Used |
|---|---|
| Transcription | VOSK ASR + Kaldi Recognizer |
| Sentiment Analysis | NLTK's Sentiment Intensity Analyzer (VADER) |
| Profanity Detection | Better Profanity Library |
| Audio Processing | PyAudio, Pydub |
| Machine Learning (Future) | Hugging Face Transformers |
Key Aspects: High-speed real-time transcription, accuracy, timestamping.
- Approach:
- Captures live audio via PyAudio.
- Segments audio into frames for processing.
- Uses VOSK + Kaldi Recognizer for real-time parsing.
- Outputs structured transcripts with timestamps.
Key Aspects: Multi-speaker detection, agent identification.
- Approach:
- Detects speaker change using pause-based analysis.
- Converts sentences into embeddings for voice pattern comparison.
- Identifies agents based on a dictionary of common phrases.
Key Aspects: Understanding emotional tone, detecting negativity.
- Approach:
- Converts speech to text.
- Uses VADER Lexicon (NLTK) for sentiment polarity.
- Labels speech as Positive, Negative, Neutral, Angry, or Sarcastic.
Key Aspects: Detects inappropriate language, raises alerts.
- Approach:
- Uses Better Profanity Library for real-time monitoring.
- Flags offensive words in transcripts.
Key Aspects: Evaluates speech tone, conversational style.
- Approach:
- Uses pre-trained NLP models (e.g., Hugging Face Transformers).
- Categorizes tone as Formal, Casual, Assertive, or Urgent.
Ensure you have Python 3.8+ installed.
Install dependencies using:
pip install pyaudio vosk nltk better_profanity transformers torch sentencepiece

