PandaLora is an interactive application featuring a 3D talking panda powered by AI. Engage in conversations via text or voice, and experience a unique AI personality with a penchant for the gothic. This project combines a React-based frontend with a Python FastAPI backend.
User Experience & Frontend:
- Interactive 3D Panda: A visually engaging 3D panda avatar that animates while "talking."
- Dual Input Modes: Communicate via traditional text chat or use your voice for a hands-free experience.
- Real-time Feedback: Visual cues for when the panda is "listening," "typing" (processing), or "talking."
- Dynamic Speech Bubble: AI responses are displayed in a speech bubble above the panda.
- Responsive Design: Interface designed for a pleasant user experience.
AI & Backend Capabilities:
- Intelligent Conversations: Powered by Google's Gemini AI, providing context-aware and engaging responses.
- Unique Goth Panda Personality: The AI embodies a calm, reserved, and subtly goth persona with dry humor.
- Speech-to-Text: Converts your voice input into text for the AI to process.
- Text-to-Speech (Backend Driven): AI text responses are converted to audio by the backend (using Gemini's TTS features) and played by the frontend, giving the panda its voice.
- Conversation History: Maintains context within a session for more coherent interactions.
- Efficient Backend: Built with FastAPI for high performance.
- Streaming Support (Backend): The backend can stream responses, which may reduce perceived latency. The current frontend might not fully use SSE/WebSocket streaming when displaying messages.
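To illustrate the Conversation History item above, here is a minimal sketch of per-session history kept in memory. It is not the project's actual code: the class name ConversationStore, the MAX_TURNS cap, and the role/text field names are all hypothetical and chosen only for this example.

```python
from collections import defaultdict, deque

# Hypothetical cap on remembered turns per session; not taken from the project.
MAX_TURNS = 20


class ConversationStore:
    """In-memory chat history keyed by session ID (illustrative only)."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        # Each session gets a bounded deque, so the oldest turns drop off automatically.
        self._sessions = defaultdict(lambda: deque(maxlen=max_turns))

    def add(self, session_id: str, role: str, text: str) -> None:
        """Append one turn to the given session."""
        self._sessions[session_id].append({"role": role, "text": text})

    def history(self, session_id: str) -> list:
        """Return a copy of the session's turns, oldest first."""
        return list(self._sessions[session_id])


store = ConversationStore(max_turns=2)
store.add("abc", "user", "Hello, panda.")
store.add("abc", "model", "Oh. It's you.")
store.add("abc", "user", "How are you?")
print(store.history("abc"))  # only the two most recent turns remain
```

A store like this loses its contents when the server restarts, which is usually acceptable for history that only needs to last for one session.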
The project is organized into two main directories:
- pandalora/: Contains the frontend React application.
  - src/: Main source code for the React app.
    - App.js: Main application component.
    - pandachat.jsx: Core chat interface component.
    - api/pandaApi.js: Client for backend API communication.
    - components/glbviewer.jsx: Renders the 3D panda model.
- pandacore/: Contains the backend FastAPI application.
  - app/: Main source code for the FastAPI app.
    - main.py: FastAPI application entry point and core configuration.
    - api.py: Defines API routes for chat, speech, etc.
    - services.py: Business logic, AI integration (Gemini), speech processing.
    - models.py: Pydantic models for data validation.
  - requirements.txt: Python dependencies.
  - .env (you'll need to create this): For environment variables like API keys.
- Frontend (pandalora):
  - React
  - JavaScript (JSX)
  - Axios (for API calls)
  - React Three Fiber & Drei (for 3D rendering)
  - Browser audio APIs: Web Speech API (SpeechRecognition) and MediaStream Recording API (MediaRecorder)
  - CSS (inline styles and basic CSS)
- Backend (pandacore):
  - Python 3.9+
  - FastAPI (web framework)
  - Uvicorn (ASGI server)
  - Google Generative AI SDK (for Gemini text and speech)
  - SpeechRecognition (library for audio input processing)
  - Pydub (for audio manipulation)
  - Python-dotenv (for environment variables)
- Development & Workflow:
  - Node.js & npm
  - Python Virtual Environment
  - Concurrently (to run frontend and backend simultaneously)
  - Git
Before you begin, ensure you have the following installed:
- Node.js: Latest LTS version recommended (e.g., v18+). This includes npm.
- Python: Version 3.9 or higher. This includes pip.
- Git: For cloning the repository.
Follow these steps to get the project running locally:
- Clone the Repository:

      git clone <your-repository-url>
      cd <repository-name>  # e.g., cd gothicco

- Backend Setup (pandacore):
  - Navigate to the backend directory:

        cd pandacore

  - Create and activate a Python virtual environment:

        python -m venv .venv
        # On macOS/Linux:
        source .venv/bin/activate
        # On Windows (PowerShell):
        # .\.venv\Scripts\Activate.ps1
        # On Windows (CMD):
        # .\.venv\Scripts\activate.bat

  - Install Python dependencies:

        pip install -r requirements.txt

  - Create an environment file: Copy .env.example to .env if an example file exists, or create a new .env file in the pandacore directory. Add your Google Gemini API key:

        # /home/gothicco/pandacore/.env
        GEMINI_API_KEY="YOUR_GEMINI_API_KEY_HERE"
        # Optional: You can also define HOST and PORT if needed, but defaults are usually fine.
        # HOST=0.0.0.0
        # PORT=8000

- Frontend Setup (pandalora):
  - Navigate to the frontend directory (from the project root):

        cd pandalora/

  - Install Node.js dependencies:

        npm install

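Before starting the servers, it can help to confirm that the .env file from the backend setup actually defines the key. The sketch below uses only the standard library to parse simple KEY=VALUE lines; the project itself lists python-dotenv for loading environment variables, so this parser is a stand-in for illustration, and the function names read_env_file and has_gemini_key are hypothetical.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes, as in GEMINI_API_KEY="...".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_gemini_key(path):
    """Return True if the file sets a non-placeholder GEMINI_API_KEY."""
    key = read_env_file(path).get("GEMINI_API_KEY", "")
    return bool(key) and key != "YOUR_GEMINI_API_KEY_HERE"


if __name__ == "__main__":
    env_path = Path(".env")  # assumes the script is run from inside pandacore/
    if not env_path.exists():
        print("No .env file found in the current directory.")
    elif has_gemini_key(env_path):
        print("GEMINI_API_KEY is set.")
    else:
        print("GEMINI_API_KEY is missing or still the placeholder.")
```

This only checks that a value is present. It does not verify that the key is accepted by the Gemini API, and it does not handle trailing inline comments on a value line.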
Once both backend and frontend are set up, you can start the entire application with a single command:
- Navigate to the frontend directory (pandalora):

      # If you are not already there:
      cd /path/to/your/project/gothicco/pandalora

- Run the development script:

      npm run dev

  This command uses concurrently (defined in pandalora/package.json) to:
  - Start the React frontend development server (usually on http://localhost:3000).
  - Start the FastAPI backend server (usually on http://127.0.0.1:8000).

- Open the Application: Open your web browser and navigate to http://localhost:3000.

You should now see the PandaLora application running!
- The React frontend (pandalora) renders the chat interface and the 3D panda.
- User input (text or voice) is captured by pandachat.jsx.
  - Voice input is processed using the browser's SpeechRecognition and MediaRecorder APIs.
- The pandaApi.js client sends the processed input to the FastAPI backend (pandacore).
- The backend's api.py routes the request to the appropriate service in services.py.
  - SpeechService transcribes audio to text if needed.
  - GeminiAIService interacts with the Google Gemini API, incorporating the "goth panda" system prompt and conversation history, to generate a text response. It then uses Gemini's TTS capabilities to generate audio data for this response.
  - ConversationService manages the chat history for the session.
- The backend sends back a JSON response containing the AI's text reply and the base64 encoded audio data.
- The frontend receives this response:
  - Displays the AI's text message.
  - Plays the received audio data, making the panda "speak."
  - Updates UI animations (e.g., talking panda).
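The JSON response described above carries binary audio as base64 text. The following sketch shows that round trip using only the standard library. The field names text and audio_base64 and both function names are assumptions made for this example; the actual response schema is defined in the backend's models.py and may differ.

```python
import base64
import json


def build_chat_response(text, audio_bytes):
    """Serialize a reply and its audio into a JSON string."""
    return json.dumps({
        "text": text,  # hypothetical field name
        # Base64 turns raw bytes into ASCII so they can travel inside JSON.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),  # hypothetical field name
    })


def parse_chat_response(body):
    """Recover the reply text and the raw audio bytes from the JSON string."""
    payload = json.loads(body)
    return payload["text"], base64.b64decode(payload["audio_base64"])


fake_audio = b"RIFF....WAVEfmt "  # stand-in bytes, not real audio
body = build_chat_response("How delightfully bleak.", fake_audio)
text, audio = parse_chat_response(body)
assert audio == fake_audio  # the bytes survive the round trip unchanged
```

Base64 output is about one third larger than the raw bytes, which is a common trade-off for keeping text and audio in a single JSON response.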
- GEMINI_API_KEY not found: Ensure your .env file is correctly placed in the pandacore directory and contains your valid API key. The backend logs will indicate if the key is missing.
- Backend not starting (No module named uvicorn or similar): Make sure you have activated the Python virtual environment for pandacore before running npm run dev or when installing dependencies. The npm run dev script in pandalora/package.json attempts to use the Python executable from pandacore/.venv/bin/python. Verify this path is correct for your setup; a virtual environment created on Windows places the interpreter at .venv\Scripts\python.exe instead.
- Port conflicts: If localhost:3000 or localhost:8000 is in use, npm run dev might fail or one of the services might not start. Ensure these ports are free.
- Microphone Permissions: The browser will ask for microphone permission for voice input. Ensure you grant it.
- Frontend Warnings (ESLint, Source Maps): The console might show some frontend warnings (e.g., unused variables, source map issues). These are generally non-critical for functionality but should be addressed for code quality.
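For the port-conflict item above, a quick way to see whether something is already listening on the default ports is a short standard-library check. This is a minimal sketch; the helper name port_in_use is hypothetical, and the ports are the defaults mentioned earlier in this document.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (3000, 8000):
        state = "in use" if port_in_use(port) else "free"
        print(f"Port {port}: {state}")
```

The check only probes the IPv4 loopback address, so a process bound exclusively to IPv6 would not be detected by this sketch.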
Enjoy your conversations with PandaLora!