English | 简体中文
Interactive-LLM-VTuber is an innovative platform for interactive virtual streamers, leveraging advanced AI technologies to deliver an immersive user experience. The project supports voice input, text generation, and voice output, with high scalability. Currently in development are features like long-term memory, image recognition, and sentiment analysis. Future plans include local deployment, deep reinforcement learning, system integration, framework optimization, and embedded device support to build an intelligent VTuber ecosystem.
- Real-time Voice Interaction: Enables natural language input via automatic speech recognition (ASR).
- Intelligent Conversation: Supports multiple large language models (LLMs), including Tongyi Qianwen, Deepseek (online), and Ollama2.5:7b (local offline).
- Speech Synthesis: Utilizes Edge-TTS for smooth text-to-speech output.
- Dynamic Front-end: Built with Flask, HTML, JavaScript, and CSS for an intuitive user interface.
- Modular Design: Facilitates feature expansion and third-party integration.
*Demo: showcasing real-time interaction with the VTuber model.*
- Programming Language: Python
- Speech Recognition (ASR): `speech_recognition` (online)
- Large Language Models (LLMs):
  - Tongyi Qianwen (online)
  - Deepseek (online)
  - Ollama2.5:7b (local offline)
- Text-to-Speech (TTS): `edge-tts` (online)
- Front-end and Back-end Interaction: Flask + HTML + JavaScript + CSS
Note: Some models may require specific configurations for compatibility.
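The components above form a single voice loop: ASR turns audio into text, an LLM generates a reply, and TTS speaks it back. A minimal sketch of that flow, with placeholder stages standing in for the real `speech_recognition`, LLM, and `edge-tts` calls (all function names here are illustrative assumptions, not the project's actual API):

```python
# Sketch of the ASR -> LLM -> TTS loop described above.
# Each stage is a placeholder; the real project wires in
# speech_recognition, an LLM backend, and edge-tts instead.

def recognize_speech(audio_bytes: bytes) -> str:
    """Placeholder ASR stage: would call speech_recognition here."""
    return audio_bytes.decode("utf-8")  # pretend the audio is already text

def generate_reply(prompt: str) -> str:
    """Placeholder LLM stage: would call Tongyi Qianwen, Deepseek, or Ollama."""
    return f"You said: {prompt}"

def synthesize_speech(text: str) -> bytes:
    """Placeholder TTS stage: would call edge-tts here."""
    return text.encode("utf-8")

def voice_loop(audio_in: bytes) -> bytes:
    """One turn of the interaction pipeline."""
    text = recognize_speech(audio_in)
    reply = generate_reply(text)
    return synthesize_speech(reply)

print(voice_loop(b"hello"))  # b'You said: hello'
```

Because each stage is a plain function, any one of them can be swapped out independently, which is the modular design the feature list refers to.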
- Windows: Fully tested and stable.
- Linux: Theoretically compatible (testing recommended).
- Install VSCode or PyCharm.
- Install Python 3.11 interpreter.
- (Optional) Use a virtual environment to isolate dependencies.
1. Clone the project and enter the directory:

   ```bash
   git clone https://github.com/toke648/AI-Interactive-LLM-VTuber.git
   cd AI-Interactive-LLM-VTuber
   ```

2. Create and activate a virtual environment:

   - Windows:

     ```bash
     python -m venv vtuber
     vtuber\Scripts\activate
     ```

   - Conda:

     ```bash
     conda create -n vtuber python=3.11
     conda activate vtuber
     ```

   - Linux/macOS:

     ```bash
     python -m venv vtuber
     source vtuber/bin/activate
     ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
4. Configure the API:

   Edit `mainsetting.py` to configure API keys (e.g., for Tongyi Qianwen or Ollama) and other settings.
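For orientation, a settings file like `mainsetting.py` typically looks something like the sketch below. The variable names and values here are hypothetical assumptions, not the file's actual contents — check the file itself for the real setting names:

```python
# Hypothetical sketch of mainsetting.py -- actual names may differ.
API_KEY = "your-api-key-here"          # key for the online LLM service
LLM_BACKEND = "ollama"                 # e.g. "qianwen", "deepseek", or "ollama"
OLLAMA_URL = "http://localhost:11434"  # default local Ollama endpoint
HOST = "127.0.0.1"                     # address the Flask server binds to
PORT = 5000                            # change this to run on another port
```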
Start the project:
python server.py
Or use the one-click startup script (Windows):
setup.bat
- Port Modification: Adjust the port or other settings in `mainsetting.py`.
- Model Switching: Modify the `cubism4Model` variable in `static/js/appserver.js` to switch VTuber models (not yet integrated into the UI).
- System Settings: Access the configuration page via the "Settings" button in the UI. Restart the project to apply changes.
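Supporting several LLM backends (Tongyi Qianwen, Deepseek, Ollama) usually comes down to dispatching on a single setting. A minimal illustrative sketch of that pattern — the function names are assumptions, and each stub marks where the real API call would go:

```python
# Illustrative backend dispatch -- names are hypothetical, not the project's API.

def call_qianwen(prompt: str) -> str:
    return f"[qianwen] {prompt}"   # would call the Tongyi Qianwen API here

def call_deepseek(prompt: str) -> str:
    return f"[deepseek] {prompt}"  # would call the Deepseek API here

def call_ollama(prompt: str) -> str:
    return f"[ollama] {prompt}"    # would POST to the local Ollama server here

BACKENDS = {
    "qianwen": call_qianwen,
    "deepseek": call_deepseek,
    "ollama": call_ollama,
}

def generate(prompt: str, backend: str = "ollama") -> str:
    """Route a prompt to the configured LLM backend."""
    try:
        return BACKENDS[backend](prompt)
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}")
```

Adding a new backend then means writing one function and registering it in the dictionary, without touching the rest of the pipeline.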
- One-click Startup: Added
setup.bat
script to simplify the startup process for Windows users. - Model Switching: Supports manual VTuber model switching by modifying the path in
static/js/appserver.js
. - System Configuration Page: Added a settings interface, accessible via the “Settings” button. Restart the project to apply changes.
- Ensure API keys and environment variables are correctly configured for LLM and TTS functionality.
- Linux users may need to verify compatibility. Feedback is welcome via GitHub Issues.
- The project is actively updated. Follow the GitHub repository for the latest updates.
This project is licensed under the MIT License. Contributions and suggestions are warmly welcomed!