A modular Python application that uses Google's Gemini API to perform intelligent image editing operations including analysis, global adjustments, and local object manipulation.
- 🤖 Intelligent Routing: Automatically determines the right editing approach
- 📊 Image Analysis: Detailed image information and content analysis
- 🌈 Global Edits: Brightness, contrast, saturation, and color temperature adjustments
- 🎯 Local Edits: Object detection and inpainting for targeted modifications
- 💬 Chat-based Editing: Interactive conversational editing sessions
# Clone or download the project files
# Install dependencies using UV (Python package manager)
uv sync
# Copy the environment template
cp .env.example .env
# Edit .env and add your API key
# .env file:
GEMINI_API_KEY="your_actual_api_key_here"
# Gemini API information: https://gemini.readthedocs.io/en/latest/
This project uses uv
for Python package management. When running Python scripts or managing dependencies, use uv run
commands:
# Run the CLI interface (without GUI)
uv run main.py
Launch the modern Gradio web interface:
# Launch the web UI
uv run launch_ui.py [--use-gemini-local-edit]
Then open your browser to http://localhost:7860
to access the interface.
Web UI Features:
- 📤 Image Upload: Upload images via drag-and-drop, file browser, or webcam
- 🖌️ Mask Drawing: Built-in brush and eraser tools for precise inpainting masks
- 💬 Chat Interface: Real-time conversation with the AI assistant
- 🔄 Live Updates: See edits applied in real-time
- 📥 Download: Save your edited images instantly
- 🎛️ Image Tools: Crop, flip, rotate, and transform images
- Analyzes user prompts to determine the appropriate editing action
- Routes requests to the appropriate specialized agent: info, global_edit, local_edit, or advanced
- Uses Gemini API to understand user intent and select the best agent for the task
- Analyzes images to provide detailed information about content and characteristics
- Generates concise descriptions (under 150 words) in a single paragraph format
- Identifies objects, scenes, colors, lighting conditions, and other visual elements
- Performs whole-image adjustments like brightness, contrast, saturation
- Handles color temperature, sharpness, and other global parameters
- Applies filters and overall image enhancements
- Performs targeted edits on specific objects or regions
- Handles object detection, selection, and manipulation
- Supports inpainting for object removal and replacement