
A modern web application featuring real-time webcam analysis powered by Google's Gemma 3n AI model. The frontend provides an intuitive interface for webcam capture, AI-powered image analysis, and persona-based interactions, with a backend API for model management.
- Interactive Frontend: Modern React-based interface with real-time webcam capture
- AI-Powered Analysis: Image analysis using the Google Gemma 3n model
- Persona System: Customizable AI personalities for different analysis scenarios
- Webcam Integration: Live camera feed with instant capture and analysis
- Python 3.11+
- Node.js 22+ (for frontend development)
- CUDA-compatible GPU (optional, but recommended)
- Webcam/Camera device
- 64GB+ RAM
- WSL (Windows Subsystem for Linux) for frontend builds
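The requirements above can be roughly checked before installing anything. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper that is not part of the project, and finding `nvidia-smi` on `PATH` is only a loose proxy for a usable CUDA GPU. It does not verify the Node.js version, the amount of RAM, or the presence of a webcam.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 11)):
    """Return rough prerequisite checks as a dict (hypothetical helper)."""
    return {
        # Compare (major, minor) of the running interpreter to the minimum.
        "python": sys.version_info[:2] >= min_python,
        # Only checks that a `node` executable is on PATH, not its version.
        "node": shutil.which("node") is not None,
        # Assumption: an NVIDIA driver install exposes `nvidia-smi` on PATH.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```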
If you don't want to modify the project or build the image from scratch, you can use the prebuilt Docker image.
Pull the image from a terminal:
docker pull grctest/gemma3n_webcam_app
Once the image has downloaded, start the container:
docker run -p 8080:8080 --gpus all grctest/gemma3n_webcam_app
Once the Docker container is running, open the web app at http://127.0.0.1:8080/
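The container may take a while before it answers requests, since the model has to load first. As a minimal sketch, the script below polls the address above until it responds; `wait_for_app` and its timeout values are illustrative assumptions, not part of the project.

```python
import time
import urllib.request


def wait_for_app(url="http://127.0.0.1:8080/", timeout=60.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers URLError, HTTPError, and refused or reset connections:
            # the server is not ready yet, so retry after a short pause.
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_app()` with the defaults returns `True` as soon as the page loads and `False` if nothing answers within a minute.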
# Clone the repository
git clone https://github.com/grctest/g3n-fastapi-webcam-docker.git
cd g3n-fastapi-webcam-docker
# Create and activate environment
conda create -n g3n python=3.11
conda activate g3n
# Install dependencies
pip install -r requirements.txt
# Download the Gemma model
pip install -U "huggingface_hub[cli]"
huggingface-cli login --token YOUR_HUGGINGFACE_TOKEN
huggingface-cli download google/gemma-3n-E2B-it --local-dir app/models/google/gemma-3n-E2B-it
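Because the model download is large, it can help to confirm that the target directory is populated before starting the server. This is a heuristic sketch only: it assumes the usual Hugging Face layout of a `config.json` plus one or more `.safetensors` weight files, and `model_dir_looks_complete` is a hypothetical helper rather than something the project provides.

```python
from pathlib import Path


def model_dir_looks_complete(model_dir="app/models/google/gemma-3n-E2B-it"):
    """Heuristic: directory exists, has config.json and at least one weight file."""
    path = Path(model_dir)
    if not path.is_dir():
        return False
    has_config = (path / "config.json").is_file()
    # Assumption: weights are stored as *.safetensors files at the top level.
    has_weights = any(path.glob("*.safetensors"))
    return has_config and has_weights


if __name__ == "__main__":
    print("model directory looks complete:", model_dir_looks_complete())
```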
- Open WSL (Windows Subsystem for Linux)
- Navigate to the frontend directory and build it:
cd frontend
npm install
npm run build
- Exit WSL
The following command starts the server, which both exposes the REST API for the Python functions and hosts the web app's frontend:
uvicorn app.main:app --host 0.0.0.0 --port 8080
Alternatively, build the Docker image manually and run it in a dev container:
docker build -t gemma3n_webcam_app .
docker run -p 8080:8080 --gpus all gemma3n_webcam_app
- Open your browser to http://localhost:8080/
- Allow webcam access when prompted
- Select or create a persona
- Either capture frames manually or enable interval captures to process the webcam footage.
- View results once the frame has been processed by Gemma 3n.
The backend API documentation is available at http://localhost:8080/docs
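The interactive docs page is generated from an OpenAPI schema, which can also be read programmatically. The sketch below lists the available routes; it assumes the app keeps FastAPI's default schema location of `/openapi.json`, and both function names are illustrative rather than part of the project.

```python
import json
import urllib.request


def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url="http://localhost:8080"):
    """Download the schema; assumes FastAPI's default /openapi.json URL."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)
```

With the server running, `for method, path in list_routes(fetch_schema()): print(method, path)` prints one line per endpoint.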
This project is licensed under the MIT License - see the LICENSE file for details.