WEBWAIFU 🤖💖

An interactive AI VTuber experience with VRM avatar support, real-time speech recognition, and dynamic responses. Chat with your AI companion through voice or text while watching them react with lifelike animations and expressions.

✨ Features

🎭 VRM Avatar Support

Load Custom Models: Upload your own .vrm files for personalized avatars
Lifelike Animations: Real-time facial expressions, mouth movement, and audio-reactive body gestures
Full Customization: Adjust avatar position, scale, rotation, and fine-tune arm/elbow positioning

🎤 Advanced Speech Recognition

Whisper AI Integration: High-accuracy, local speech-to-text using Xenova/transformers
Optimized Performance: Fast 10-second chunk processing with 2-second overlap for real-time feel
Voice Hotkeys: Set hotkey (Shift, Ctrl, Alt, Space) for always-on voice activation

🤖 Multi-Provider AI Support

Google Gemini: Latest Gemini models with fast, creative responses
OpenAI: GPT-4o, GPT-4 Turbo, and other OpenAI models
Ollama: Local AI models running completely offline
Smart Request Queue: Prevents API overwhelm while maintaining responsiveness
30-Second Timeouts: No more hanging requests or frozen responses

🔊 High-Quality Text-to-Speech

Azure Neural Voices: Premium, natural-sounding voices (24kHz, 48kbit quality)
Browser TTS Fallback: Free built-in alternative with voice selection
Voice Customization: Adjust pitch, rate, volume, and speaking styles

📺 Streaming & Chat Integration

Twitch Chat: Direct integration with Twitch channel chat
Stream Mode: Overlay interface designed for streaming
Message Queue System: Smart batching and prioritization of chat messages
Real-time Chat Display: Live chat overlay with user messages

🎨 Customization Options

Custom Backgrounds: Images or videos as avatar backdrop
Visual Effects: Curve and bend effects for immersive scenes
Real-time Controls: Adjust all settings without reloading

🚀 Quick Start

Prerequisites

A modern web browser (Chrome/Edge recommended for full feature support).
For AI features: API keys for your chosen provider (Google Gemini or OpenAI).
For premium TTS: An Azure Cognitive Services key (optional).
For local AI: A running instance of Ollama (optional). See the Ollama Setup Guide for detailed instructions.

Installation

Clone the repository:

git clone https://github.com/xsploit/WEBWAIFU.git
cd WEBWAIFU

Open in browser:
- Simply open the index.html file in your web browser
- No build process or servers required - pure client-side application!

First-Time Setup

Upload VRM Model: In settings panel → "📁 VRM Model" → upload your .vrm file
Configure AI: Go to "🤖 AI Configuration" → select provider → enter API key
Configure Voice: Set up microphone and TTS under "🎙️ Voice Controls" and "🔊 Text-to-Speech"
Start Chatting: Use text input or voice hotkey (default: Shift) to interact with your AI companion!

🛠️ Technical Details

Architecture

Frontend: Vanilla JavaScript (ES6), HTML5, CSS3 - zero dependencies
3D Rendering: Three.js with @pixiv/three-vrm library
Speech Recognition: Xenova/transformers.js for client-side Whisper
Audio Processing: Web Audio API for real-time microphone analysis and TTS mouth sync
Performance: Smart request queuing, 30s API timeouts, optimized Whisper chunks
Storage: Local browser storage for all settings (no server required)

Performance Optimizations

Fast Speech Processing: 10-second Whisper chunks (3x faster than default)
Smart AI Queue: Prevents API spam while maintaining responsiveness
Timeout Protection: All API calls have 30-second timeouts to prevent hanging
Memory Management: Proper cleanup of audio contexts and event listeners

Supported File Formats

VRM Models: .vrm (VRM 0.x specification)
Backgrounds: JPG, PNG, GIF, WEBP, MP4, MOV, AVI
Audio: High-quality Azure TTS (24kHz) or browser TTS fallback

🔧 Troubleshooting

Common Issues

Speech Recognition Not Working?
- Check browser microphone permissions
- Verify correct microphone selected in settings
- Refresh page to reload Whisper model
- Use Shift key (default hotkey) to activate
VRM Model Not Loading?
- Ensure valid .vrm file (VRM 0.x specification)
- Check browser console for error messages
- Try smaller VRM files first
AI Not Responding?
- Verify API key is correct and active
- Check internet connection stability
- Look for timeout errors in console (should auto-retry)
- Try different AI provider if one fails
Performance Issues?
- Close other browser tabs to free memory
- Disable background extensions
- Use Chrome/Edge for best performance
- Check console for memory warnings

🔗 Repository

GitHub: https://github.com/xsploit/WEBWAIFU.git

🙏 Acknowledgments

Built upon the excellent VU-VRM project foundation (https://github.com/Automattic/VU-VRM) by itsTallulahhh and powered by these open-source libraries:

Three.js - 3D graphics engine
three-vrm - VRM avatar support
Transformers.js - Client-side AI models

📄 License

This project is open source. See the repository for license details.

Future Features Coming Soon:

Azure TTS Streaming for real-time voice synthesis
LLM Streaming for faster response generation
Enhanced VRM expression controls
More AI provider integrations

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.gitignore		.gitignore
1768080537824958249.vrm		1768080537824958249.vrm
OLLAMA_SETUP.md		OLLAMA_SETUP.md
README.md		README.md
index.html		index.html
script.js		script.js
style.css		style.css
whisper-worker.js		whisper-worker.js
ww.png		ww.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

WEBWAIFU 🤖💖

✨ Features

🎭 VRM Avatar Support

🎤 Advanced Speech Recognition

🤖 Multi-Provider AI Support

🔊 High-Quality Text-to-Speech

📺 Streaming & Chat Integration

🎨 Customization Options

🚀 Quick Start

Prerequisites

Installation

First-Time Setup

🛠️ Technical Details

Architecture

Performance Optimizations

Supported File Formats

🔧 Troubleshooting

Common Issues

🔗 Repository

🙏 Acknowledgments

📄 License

About

Uh oh!

Releases

Packages

Languages

xsploit/WEBWAIFU

Folders and files

Latest commit

History

Repository files navigation

WEBWAIFU 🤖💖

✨ Features

🎭 VRM Avatar Support

🎤 Advanced Speech Recognition

🤖 Multi-Provider AI Support

🔊 High-Quality Text-to-Speech

📺 Streaming & Chat Integration

🎨 Customization Options

🚀 Quick Start

Prerequisites

Installation

First-Time Setup

🛠️ Technical Details

Architecture

Performance Optimizations

Supported File Formats

🔧 Troubleshooting

Common Issues

🔗 Repository

🙏 Acknowledgments

📄 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages