An interactive AI VTuber experience with VRM avatar support, real-time speech recognition, and dynamic responses. Chat with your AI companion through voice or text while watching them react with lifelike animations and expressions.
- Load Custom Models: Upload your own `.vrm` files for personalized avatars
- Lifelike Animations: Real-time facial expressions, mouth movement, and audio-reactive body gestures
- Full Customization: Adjust avatar position, scale, rotation, and fine-tune arm/elbow positioning
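As a rough illustration of how audio-reactive mouth movement can work, the sketch below turns one frame of audio samples into a mouth-open weight between 0 and 1. The function names and the `gain`, `noiseFloor`, and `factor` values are assumptions made for this example, not taken from the project's source.

```javascript
// Root-mean-square level of one frame of audio samples in the range [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Map a level to a mouth-open weight in [0, 1].
// `gain` and `noiseFloor` are illustrative tuning values (assumptions).
function mouthOpenness(level, { gain = 4, noiseFloor = 0.02 } = {}) {
  if (level <= noiseFloor) return 0;
  return Math.min(1, (level - noiseFloor) * gain);
}

// Exponential smoothing so the mouth does not flicker from frame to frame.
function smooth(previous, target, factor = 0.5) {
  return previous + (target - previous) * factor;
}
```

In a browser, the samples would typically come from an `AnalyserNode` via `getFloatTimeDomainData`, and the smoothed weight would be applied to whichever mouth expression the loaded avatar exposes.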
- Whisper AI Integration: High-accuracy, local speech-to-text using `Xenova/transformers`
- Optimized Performance: Fast 10-second chunk processing with 2-second overlap for real-time feel
- Voice Hotkeys: Set hotkey (Shift, Ctrl, Alt, Space) for always-on voice activation
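To make the chunking figures concrete, here is a minimal sketch of how 10-second windows with a 2-second overlap could be laid out over a recording. `chunkWindows` is a hypothetical helper written for this example; the project's actual chunking code may differ.

```javascript
// Split a recording into windows of `chunkSeconds`, where each window starts
// `chunkSeconds - overlapSeconds` after the previous one.
// The 10 s / 2 s defaults mirror the figures quoted above.
function chunkWindows(totalSeconds, chunkSeconds = 10, overlapSeconds = 2) {
  if (overlapSeconds >= chunkSeconds) {
    throw new RangeError('overlap must be shorter than the chunk length');
  }
  const step = chunkSeconds - overlapSeconds;
  const windows = [];
  for (let start = 0; start < totalSeconds; start += step) {
    const end = Math.min(start + chunkSeconds, totalSeconds);
    windows.push({ start, end });
    if (end === totalSeconds) break; // last window reached the end of the audio
  }
  return windows;
}

// A 25-second recording yields three windows: 0-10, 8-18, and 16-25.
console.log(chunkWindows(25));
```

The overlap means a word cut off at the end of one window is heard whole in the next. Transformers.js also exposes `chunk_length_s` and `stride_length_s` options on its speech-recognition pipeline; whether this project relies on them is not stated in this README.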
- Google Gemini: Latest Gemini models with fast, creative responses
- OpenAI: GPT-4o, GPT-4 Turbo, and other OpenAI models
- Ollama: Local AI models running completely offline
- Smart Request Queue: Prevents API overwhelm while maintaining responsiveness
- 30-Second Timeouts: No more hanging requests or frozen responses
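The request queue and timeout behaviour described above can be sketched as follows. This is one possible shape under stated assumptions: `withTimeout` and `RequestQueue` are names invented for this example, and the real implementation may queue or cancel requests differently.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
// Note: this stops waiting, but does not cancel the underlying work.
function withTimeout(promise, ms = 30000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Runs async tasks one at a time, in the order they were enqueued,
// so a burst of messages cannot fire many API calls at once.
class RequestQueue {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.tail = Promise.resolve();
  }

  // `task` is a function returning a promise (for example, a fetch call).
  enqueue(task) {
    const result = this.tail.then(() => withTimeout(task(), this.timeoutMs));
    // Swallow failures on the internal chain so one bad request
    // does not block the requests queued behind it.
    this.tail = result.catch(() => {});
    return result;
  }
}
```

A caller would use it as `queue.enqueue(() => fetch(url, options))`. To actually abort a timed-out `fetch` rather than just stop waiting for it, an `AbortController` signal can be passed in the request options.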
- Azure Neural Voices: Premium, natural-sounding voices (24kHz, 48kbit quality)
- Browser TTS Fallback: Free built-in alternative with voice selection
- Voice Customization: Adjust pitch, rate, volume, and speaking styles
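For the browser TTS fallback, pitch, rate, and volume map onto properties of the Web Speech API's `SpeechSynthesisUtterance`. The sketch below clamps user settings to the ranges that API accepts before speaking; the helper names are assumptions for this example, and Azure voices would go through a separate path not shown here.

```javascript
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Web Speech API ranges: rate 0.1 to 10, pitch 0 to 2, volume 0 to 1.
function normalizeVoiceSettings({ rate = 1, pitch = 1, volume = 1 } = {}) {
  return {
    rate: clamp(rate, 0.1, 10),
    pitch: clamp(pitch, 0, 2),
    volume: clamp(volume, 0, 1),
  };
}

// Browser-only: returns false where speech synthesis is unavailable.
function speakWithBrowserTts(text, settings) {
  if (typeof speechSynthesis === 'undefined') return false;
  const utterance = new SpeechSynthesisUtterance(text);
  Object.assign(utterance, normalizeVoiceSettings(settings));
  speechSynthesis.speak(utterance);
  return true;
}
```

Clamping up front keeps out-of-range slider values from reaching the speech engine, where browsers may handle them inconsistently.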
- Twitch Chat: Direct integration with Twitch channel chat
- Stream Mode: Overlay interface designed for streaming
- Message Queue System: Smart batching and prioritization of chat messages
- Real-time Chat Display: Live chat overlay with user messages
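The README does not spell out how chat messages are batched or prioritized, so the following is only an illustrative sketch. `ChatBatcher`, its `priority` field, and the limits are all hypothetical; a real ranking might weigh mentions, subscribers, or moderators.

```javascript
// Collects incoming chat messages and hands them out in small batches,
// highest priority first, oldest first within the same priority.
class ChatBatcher {
  constructor({ maxBatch = 5, maxPending = 50 } = {}) {
    this.maxBatch = maxBatch;
    this.maxPending = maxPending;
    this.pending = [];
  }

  // `message` is e.g. { user: 'alice', text: 'hi', priority: 1 }.
  push(message) {
    this.pending.push({ priority: 0, ...message });
    // Drop the oldest message when the backlog grows too large.
    if (this.pending.length > this.maxPending) this.pending.shift();
  }

  nextBatch() {
    // Array.prototype.sort is stable, so arrival order is kept within a priority.
    const sorted = [...this.pending].sort((a, b) => b.priority - a.priority);
    const batch = sorted.slice(0, this.maxBatch);
    this.pending = this.pending.filter((m) => !batch.includes(m));
    return batch;
  }
}
```

Each batch could then be folded into a single AI prompt, so a busy chat produces one response per batch instead of one request per message.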
- Custom Backgrounds: Images or videos as avatar backdrop
- Visual Effects: Curve and bend effects for immersive scenes
- Real-time Controls: Adjust all settings without reloading
- A modern web browser (Chrome/Edge recommended for full feature support).
- For AI features: API keys for your chosen provider (Google Gemini or OpenAI).
- For premium TTS: An Azure Cognitive Services key (optional).
- For local AI: A running instance of Ollama (optional). See the Ollama Setup Guide for detailed instructions.
- Clone the repository:

      git clone https://github.com/xsploit/WEBWAIFU.git
      cd WEBWAIFU

- Open in browser:
  - Simply open the `index.html` file in your web browser
  - No build process or servers required - pure client-side application!
- Upload VRM Model: In settings panel → "📁 VRM Model" → upload your `.vrm` file
- Configure AI: Go to "🤖 AI Configuration" → select provider → enter API key
- Configure Voice: Set up microphone and TTS under "🎙️ Voice Controls" and "🔊 Text-to-Speech"
- Start Chatting: Use text input or voice hotkey (default: Shift) to interact with your AI companion!
- Frontend: Vanilla JavaScript (ES6), HTML5, CSS3 - no framework or build step
- 3D Rendering: Three.js with the `@pixiv/three-vrm` library
- Speech Recognition: Xenova/transformers.js for client-side Whisper
- Audio Processing: Web Audio API for real-time microphone analysis and TTS mouth sync
- Performance: Smart request queuing, 30s API timeouts, optimized Whisper chunks
- Storage: Local browser storage for all settings (no server required)
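Keeping all settings in browser storage can be as simple as the sketch below. The storage key and the default values are hypothetical placeholders, not the project's actual names; the `storage` parameter exists only so the functions can be exercised outside a browser.

```javascript
const SETTINGS_KEY = 'webwaifu-settings'; // hypothetical key name
const DEFAULT_SETTINGS = { provider: 'gemini', hotkey: 'Shift', ttsRate: 1 }; // illustrative defaults

// Load saved settings, falling back to defaults for missing or corrupt data.
function loadSettings(storage = globalThis.localStorage) {
  try {
    const raw = storage.getItem(SETTINGS_KEY);
    return { ...DEFAULT_SETTINGS, ...(raw ? JSON.parse(raw) : {}) };
  } catch {
    return { ...DEFAULT_SETTINGS };
  }
}

// Persist the full settings object as JSON.
function saveSettings(settings, storage = globalThis.localStorage) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}
```

Merging saved values over the defaults means that a newly added setting gets a sensible value even for users whose stored data predates it.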
- Fast Speech Processing: 10-second Whisper chunks (a third of the 30-second window Whisper uses by default)
- Smart AI Queue: Prevents API spam while maintaining responsiveness
- Timeout Protection: All API calls have 30-second timeouts to prevent hanging
- Memory Management: Proper cleanup of audio contexts and event listeners
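One common way to get the kind of cleanup mentioned above is to register every listener and resource in one place and tear them all down together. The sketch below assumes a hypothetical `CleanupRegistry` class; it relies on the standard `signal` option of `addEventListener`, which removes a listener when its `AbortController` is aborted.

```javascript
// Tracks event listeners and other resources so they can be released together.
class CleanupRegistry {
  constructor() {
    this.controller = new AbortController();
    this.disposers = [];
  }

  // Attach a listener that is removed automatically on dispose().
  listen(target, type, handler) {
    target.addEventListener(type, handler, { signal: this.controller.signal });
  }

  // Register any other cleanup step, e.g. () => audioContext.close().
  add(disposer) {
    this.disposers.push(disposer);
  }

  dispose() {
    this.controller.abort(); // detaches every listener added via listen()
    for (const dispose of this.disposers.splice(0)) dispose();
  }
}
```

Calling `dispose()` when the microphone is switched off or a model is replaced helps avoid leaked listeners and open audio contexts piling up over a long session.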
- VRM Models: `.vrm` (VRM 0.x specification)
- Backgrounds: JPG, PNG, GIF, WEBP, MP4, MOV, AVI
- Audio: High-quality Azure TTS (24kHz) or browser TTS fallback
- Speech Recognition Not Working?
  - Check browser microphone permissions
  - Verify correct microphone selected in settings
  - Refresh page to reload Whisper model
  - Use Shift key (default hotkey) to activate
- VRM Model Not Loading?
  - Ensure valid `.vrm` file (VRM 0.x specification)
  - Check browser console for error messages
  - Try smaller VRM files first
- AI Not Responding?
  - Verify API key is correct and active
  - Check internet connection stability
  - Look for timeout errors in console (should auto-retry)
  - Try different AI provider if one fails
- Performance Issues?
  - Close other browser tabs to free memory
  - Disable background extensions
  - Use Chrome/Edge for best performance
  - Check console for memory warnings
GitHub: https://github.com/xsploit/WEBWAIFU.git
Built upon the excellent VU-VRM project foundation (https://github.com/Automattic/VU-VRM) by itsTallulahhh and powered by these open-source libraries:
- Three.js - 3D graphics engine
- three-vrm - VRM avatar support
- Transformers.js - Client-side AI models
This project is open source. See the repository for license details.
Future Features Coming Soon:
- Azure TTS Streaming for real-time voice synthesis
- LLM Streaming for faster response generation
- Enhanced VRM expression controls
- More AI provider integrations