# SoulSync

An AI companion that understands how you feel.

## Table of Contents
- Features
- Demo
- Installation
- Usage
- Configuration
- Supported Models
- Technical Stack
- Development
- Contributing
- License
- Contact
## Features

### ✨ Multimodal Emotion Detection
- Real-time facial expression analysis via webcam
- Text sentiment analysis from chat input
- Combined emotion scoring for accurate understanding
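The combined scoring step can be sketched as a weighted blend of the facial and text emotion distributions. This is only an illustration — `fuse_emotions` and its weights are assumptions, not the project's actual implementation:

```python
def fuse_emotions(facial, text, w_face=0.6, w_text=0.4):
    """Blend facial and text emotion scores into one normalized distribution.

    Hypothetical helper: the real project may weight the modalities differently.
    """
    labels = set(facial) | set(text)
    fused = {k: w_face * facial.get(k, 0.0) + w_text * text.get(k, 0.0)
             for k in labels}
    total = sum(fused.values()) or 1.0  # guard against all-zero input
    return {k: v / total for k, v in fused.items()}

# Webcam reads mostly happy; the typed message leans sad.
combined = fuse_emotions({"happy": 0.7, "neutral": 0.3},
                         {"happy": 0.4, "sad": 0.6})
dominant = max(combined, key=combined.get)
```

Weighting the face stream more heavily reflects the assumption that facial cues are harder to mask than word choice.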
### 🎙️ Voice Integration
- Speech-to-text transcription
- Natural text-to-speech responses
- Audio emotion tone detection
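Audio tone detection can start from something as simple as signal energy. The sketch below computes the RMS loudness of 16-bit PCM audio as a crude arousal proxy — the function names and threshold are illustrative, not SoulSync's actual pipeline, which would use richer features (pitch, spectral shape):

```python
import math
import struct

def rms_energy(pcm_bytes):
    """RMS amplitude of 16-bit little-endian mono PCM audio."""
    n = len(pcm_bytes) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm_bytes[:n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)

def tone_label(energy, loud_threshold=8000.0):
    # Hypothetical threshold: loud speech is read as an excited/agitated tone.
    return "excited" if energy > loud_threshold else "calm"

# Synthetic sine waves standing in for quiet vs. loud speech.
def sine_pcm(amplitude, n=1600, freq=440, rate=16000):
    return struct.pack(f"<{n}h", *(int(amplitude * math.sin(2 * math.pi * freq * i / rate))
                                   for i in range(n)))

quiet = tone_label(rms_energy(sine_pcm(500)))
loud = tone_label(rms_energy(sine_pcm(20000)))
```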
### 💡 Intelligent Responses
- Emotion-aware response generation
- Multiple LLM backend support
- Contextual conversation memory
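Emotion-aware generation usually comes down to conditioning the prompt on the detected state while keeping a rolling window of past turns. A sketch under those assumptions (the class name and prompt format are hypothetical):

```python
from collections import deque

class ConversationMemory:
    """Keep the last `max_turns` exchanges as generation context."""

    def __init__(self, max_turns=5):
        self.turns = deque(maxlen=max_turns)

    def add(self, user_msg, reply):
        self.turns.append((user_msg, reply))

    def build_prompt(self, user_msg, emotion):
        # Prefix the detected emotion so the LLM can adapt its tone.
        lines = [f"[The user currently seems {emotion}.]"]
        for past_user, past_reply in self.turns:
            lines.append(f"User: {past_user}")
            lines.append(f"AI: {past_reply}")
        lines.append(f"User: {user_msg}")
        lines.append("AI:")
        return "\n".join(lines)

memory = ConversationMemory(max_turns=2)
memory.add("Hi!", "Hello! How are you feeling today?")
prompt = memory.build_prompt("Rough day at work.", "sad")
```

The bounded deque is what keeps context length under control as the conversation grows.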
### 🖥️ Interactive Interface
- Live face mesh visualization
- Adjustable AI parameters
- Clean, modern UI with Gradio
## Installation

### Prerequisites

- Python 3.8+
- GPU with CUDA support (recommended)
- Webcam and microphone
1. Clone the repository:

   ```bash
   git clone https://github.com/shivapreetham/SoulSync.git
   cd SoulSync
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Download models:

   ```bash
   # Vosk speech recognition model
   wget https://alphacephei.com/vosk/models/vosk-model-en-us-0.42-gigaspeech.zip
   unzip vosk-model-en-us-0.42-gigaspeech.zip -d models/

   # (Optional) Download additional LLM weights
   python download_models.py
   ```
## Usage

Start the application:

```bash
python main.py
```
- Model Selection: Choose from the available LLMs in the dropdown
- Video Call:
- Click "Start Video Call" to enable webcam
- Use "Start Recording" to begin emotion analysis
- Chat:
- Type messages in the text box
- View emotion analysis in real-time
- Settings:
- Adjust the temperature, top-k, and top-p sampling parameters
- Modify max token length
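The temperature, top-k, and top-p settings all reshape the token distribution the model samples from. A pure-Python sketch of that filtering (the generation backend applies the real versions internally; this helper is only illustrative):

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature scaling, then top-k and nucleus (top-p) filtering.

    Returns renormalized probabilities; filtered tokens get probability 0.
    """
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    probs = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    total = sum(probs)
    probs = [p / total for p in probs]

    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, cumulative = set(), 0.0
    for rank, i in enumerate(order):
        if top_k and rank >= top_k:
            break  # past the k most likely tokens
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the nucleus already covers enough probability mass
    kept = sum(probs[i] for i in keep)
    return [probs[i] / kept if i in keep else 0.0 for i in range(len(probs))]
```

Lower temperature sharpens the distribution; smaller top-k or top-p values restrict sampling to the most likely tokens, trading creativity for coherence.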
## Configuration

Edit `config.ini` to customize settings:
```ini
[models]
default = microsoft/DialoGPT-large
cache_dir = ./model_cache

[audio]
sample_rate = 16000
channels = 1
voice = english

[interface]
theme = soft
width = 800
height = 600
```
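These settings can be read with the standard-library `configparser`. A sketch, loading from an inline string so the example is self-contained (`main.py`'s actual loading code may differ):

```python
import configparser

# Inline copy of the config.ini contents above, for a self-contained example.
CONFIG_TEXT = """
[models]
default = microsoft/DialoGPT-large
cache_dir = ./model_cache

[audio]
sample_rate = 16000
channels = 1
voice = english

[interface]
theme = soft
width = 800
height = 600
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)
# In the app this would be: config.read("config.ini")

model_name = config.get("models", "default")
sample_rate = config.getint("audio", "sample_rate")
```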
## Supported Models

| Model Name     | Parameters | Best For               |
|----------------|------------|------------------------|
| DialoGPT Large | 762M       | General chat           |
| BlenderBot 3B  | 3B         | Longer conversations   |
| GPT-2 XL       | 1.5B       | Creative responses     |
| Custom Model   | Variable   | Specialized use cases  |
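One way the model dropdown can map display names to checkpoints is a small registry of Hugging Face model ids (the registry itself is an assumption; the ids are real hub checkpoints):

```python
# Hypothetical registry: dropdown label -> Hugging Face checkpoint id.
MODEL_REGISTRY = {
    "DialoGPT Large": "microsoft/DialoGPT-large",
    "BlenderBot 3B": "facebook/blenderbot-3B",
    "GPT-2 XL": "gpt2-xl",
}

def resolve_model(label, default="microsoft/DialoGPT-large"):
    """Return the checkpoint id for a dropdown selection, with a safe fallback."""
    return MODEL_REGISTRY.get(label, default)
```

The loader code could then pass the resolved id to Transformers' `from_pretrained`, falling back to the default when a custom label is unrecognized.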
## Technical Stack

- Natural Language: HuggingFace Transformers
- Computer Vision: MediaPipe, OpenCV
- Speech Processing: Vosk, PyAudio, pyttsx3
- Interface: Gradio, Matplotlib
## Development

### Project Structure

```
SoulSync/
├── main.py               # Main application
├── llm_utils.py          # Model loading/generation
├── chat_utils.py         # Conversation handling
├── emotion_detection.py  # Vision/audio analysis
├── config.ini            # Configuration
└── requirements.txt      # Dependencies
```
### Running Tests

```bash
python -m pytest tests/
```
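pytest collects `test_*.py` files under `tests/`; a minimal test might look like this (the helper and file name are hypothetical — the repository's actual tests may differ):

```python
# tests/test_params.py (illustrative)

def clamp_temperature(value, low=0.1, high=2.0):
    """Hypothetical helper: keep the sampling temperature in a safe range."""
    return max(low, min(high, value))

def test_clamp_temperature():
    assert clamp_temperature(5.0) == 2.0   # clipped to the upper bound
    assert clamp_temperature(0.0) == 0.1   # clipped to the lower bound
    assert clamp_temperature(0.7) == 0.7   # in-range values pass through
```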
## Contributing

We welcome contributions! Please see our Contribution Guidelines for details.

1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Open a Pull Request
## License

MIT License - see LICENSE.md for details.