This repository contains practical applications and examples demonstrating the capabilities of Google's Gemma 3n, a multimodal AI model optimized for on-device inference.

gemma3n_multimodal_chat.py
Type: Streamlit Web App
Description: Multimodal chat application supporting text, image, and audio inputs
Features: Multiple API providers, Real-time chat, File uploads, Conversation history
Complexity: Intermediate

gemma3n_voice_assistant.py
Type: Streamlit Voice App
Description: Voice-activated AI assistant with speech recognition and text-to-speech
Features: Voice input/output, Multiple languages, Conversation context, Manual text fallback
Complexity: Advanced

gemma3n_coding_agent.py
Type: Command Line Tool
Description: Local coding assistant for code analysis, generation, and debugging
Features: Interactive chat, Code analysis, Project generation, Debug assistance
Complexity: Advanced

gemma3n_document_analyzer.py
Type: Streamlit Web App
Description: Document analysis tool for PDFs and images with AI-powered insights
Features: PDF processing, Multiple analysis types, Data export, Batch processing
Complexity: Advanced

requirements.txt
Type: Configuration
Description: Python package dependencies for all applications
Features: Complete dependency list, Version specifications, Optional packages
Complexity: Basic

setup.sh
Type: Installation Script
Description: Automated setup script for Ollama and Python environment
Features: Cross-platform support, Ollama installation, Virtual environment setup
Complexity: Intermediate

Docker Compose file
Type: Docker Configuration
Description: Docker Compose configuration for containerized deployment
Features: Ollama service, App containerization, Network configuration
Complexity: Intermediate

Dockerfile
Type: Docker Configuration
Description: Docker container definition for the applications
Features: Python environment, System dependencies, Non-root user
Complexity: Intermediate

- Python 3.8+
- Ollama installed and running
- Gemma 3n model downloaded
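
The three prerequisites above can be checked from Python before launching any app. The following is a minimal sketch, not part of the repository: it assumes Ollama is listening on its default local address (`http://localhost:11434`) and uses Ollama's `GET /api/tags` endpoint to list installed models. The helper names (`python_ok`, `installed_models`, `has_gemma3n`, `fetch_tags`) are hypothetical.

```python
import json
import sys
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def python_ok(version=sys.version_info):
    """Return True if the interpreter meets the Python 3.8+ requirement."""
    return tuple(version[:2]) >= (3, 8)


def installed_models(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_gemma3n(model_names):
    """Return True if any installed model is a gemma3n tag (e.g. gemma3n:e4b)."""
    return any(name.split(":")[0] == "gemma3n" for name in model_names)


def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Ask the Ollama server which models are installed; None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    tags = fetch_tags()
    print("Ollama running:", tags is not None)
    print("Gemma 3n downloaded:",
          tags is not None and has_gemma3n(installed_models(tags)))
```

If the server is not reachable, `fetch_tags` returns `None` rather than raising, so the script reports a failed check instead of a traceback.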
- Automated Setup (Recommended):
    chmod +x setup.sh
    ./setup.sh
- Manual Setup:
    # Install Ollama
    curl -fsSL https://ollama.com/install.sh | sh
    # Download Gemma 3n
    ollama pull gemma3n:e4b
    ollama pull gemma3n:e2b
    # Install Python dependencies
    pip install -r requirements.txt
- Multimodal Chat:
    streamlit run gemma3n_multimodal_chat.py
- Voice Assistant:
    streamlit run gemma3n_voice_assistant.py
- Coding Agent:
    python gemma3n_coding_agent.py --interactive
- Document Analyzer:
    streamlit run gemma3n_document_analyzer.py
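
Before starting an app, it can help to confirm that a pulled model actually responds. The sketch below sends a single text prompt to a local Ollama server through its `POST /api/generate` endpoint. It is an illustration only and does not claim to mirror how the applications in this repository call the model; the function names and the default address `http://localhost:11434` are assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_generate_request(prompt, model="gemma3n:e4b"):
    """Build the JSON body for Ollama's POST /api/generate endpoint."""
    # stream=False asks for one complete JSON reply instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="gemma3n:e4b", base_url=OLLAMA_URL, timeout=120):
    """Send one prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

With Ollama running, `print(generate("Explain on-device inference in one sentence."))` should print a short reply; passing `model="gemma3n:e2b"` targets the smaller variant.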
For containerized deployment:
docker-compose up -d
- Multimodal Chat: Customer support, education, content creation
- Voice Assistant: Accessibility, hands-free operation, smart home integration
- Coding Agent: Development assistance, code review, debugging
- Document Analysis: Legal document review, research, data extraction
Each application is designed to be easily customizable:
- API Providers: Switch between Ollama, Together AI, and Google AI Studio
- Models: Support for both E2B and E4B variants
- UI/UX: Streamlit components can be modified or replaced
- Features: Add new analysis types, commands, or integrations
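
One way to make the provider and model choices above switchable without editing code is to read them from the environment. This is a rough sketch under stated assumptions, not the repository's actual configuration mechanism: the environment variable names `GEMMA_PROVIDER` and `GEMMA_VARIANT`, the `BackendConfig` class, and the registry layout are all hypothetical, and the hosted-provider base URLs should be verified against each provider's documentation.

```python
import os
from dataclasses import dataclass

# Ollama tags for the two variants, as used in the manual setup commands.
OLLAMA_TAGS = {"E2B": "gemma3n:e2b", "E4B": "gemma3n:e4b"}

# Hypothetical provider registry; verify each base URL against the provider's docs.
PROVIDER_BASE_URLS = {
    "ollama": "http://localhost:11434",
    "together": "https://api.together.xyz/v1",
    "google": "https://generativelanguage.googleapis.com/v1beta",
}


@dataclass(frozen=True)
class BackendConfig:
    provider: str
    base_url: str
    variant: str


def load_config(env=os.environ):
    """Read provider and model variant from (hypothetical) environment variables."""
    provider = env.get("GEMMA_PROVIDER", "ollama").lower()
    variant = env.get("GEMMA_VARIANT", "E4B").upper()
    if provider not in PROVIDER_BASE_URLS:
        raise ValueError(f"unknown provider: {provider}")
    if variant not in OLLAMA_TAGS:
        raise ValueError(f"unknown variant: {variant}")
    return BackendConfig(provider, PROVIDER_BASE_URLS[provider], variant)


def ollama_tag(config):
    """Map a variant to its Ollama tag; other providers use their own model IDs."""
    return OLLAMA_TAGS[config.variant]
```

Failing fast on an unknown provider or variant keeps a typo in the environment from surfacing later as an opaque HTTP error.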
Feel free to submit issues, feature requests, or pull requests to improve these applications.
This project is open source and available under the MIT License.