An intelligent image tagging application powered by AI that automatically generates descriptions, captions, and SEO-optimized keywords for your photos. Supports both Google's Gemini AI and local Ollama inference.
- Username/Password Authentication: Simple and secure login system
- User Registration: Create new accounts with optional email for password reset
- User Management: Individual user accounts with admin privileges
- Session Management: Secure session handling with cookies
- User Isolation: Each user sees only their own images and data
- Default Admin: Automatic admin user creation for initial setup
- Multiple AI Providers: Choose between cloud and local AI:
  - Google Gemini: Cloud-based advanced AI analysis
  - Ollama: Local AI inference with privacy and no API costs
- Comprehensive Analysis: Generates detailed image descriptions, SEO-optimized captions, relevant keywords, and confidence scores
- Folder Processing: Process entire folders recursively
- Duplicate Detection: Automatically skip already processed images
- Real-time Progress: Live status updates with detailed metrics
- Error Handling: Comprehensive error reporting and recovery
- Background Processing: Non-blocking batch operations
- Standard Formats: JPEG, PNG, TIFF
- RAW Formats: CR2 (Canon), NEF (Nikon), ARW (Sony), DNG, RAF (Fujifilm), ORF (Olympus), RW2 (Panasonic)
- Large Files: Support for files up to 50MB
- Keyword Search: Click any keyword to find related images
- Full-text Search: Search across all metadata fields
- Pagination: Efficient browsing of large image collections
- Real-time Filtering: Instant search results
- Drag-and-drop Upload: Intuitive file upload experience
- Responsive Design: Works on desktop, tablet, and mobile
- Real-time Updates: Live processing status and progress
- Gallery View: Beautiful grid layout with metadata display
- SQLite Database: Efficient local storage
- Metadata Preservation: Complete EXIF data retention
- Thumbnail Generation: Automatic preview creation
- Status Tracking: Processing state management
- TypeScript - Type-safe server development
- Express.js - Web framework
- SQLite - Lightweight database
- Sharp - High-performance image processing
- Multer - File upload handling
- Gemini AI - Google's generative AI for image analysis
- Ollama - Local AI inference support
- ExifR - EXIF data extraction and RAW preview extraction
- React - UI framework
- TypeScript - Type-safe frontend development
- Vite - Fast build tool
- React Router - Client-side routing
- Axios - HTTP client
- Node.js 18+
- npm or yarn
- For Gemini AI: Google Gemini API key
- For Ollama: Ollama installation with a vision model (e.g., llava:latest)
# Clone the repository
git clone <repository-url>
cd image-tagger
# Run the setup script (Unix/Linux/Mac)
./scripts/setup.sh
# Or for Windows
scripts\setup.bat
- Clone the repository:
  git clone <repository-url>
  cd image-tagger
- Install all dependencies:
  npm run install:all
- Set up environment variables:
  cp .env.example .env
  Edit .env and add your Gemini API key: GEMINI_API_KEY=your_actual_api_key_here
- Build the project:
  npm run build:all
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the generated key and add it to your .env file
The application uses username/password authentication with the following features:
- Simple Login: Username and password authentication
- User Registration: Create new accounts with optional email
- Password Reset: Email required for password reset functionality
- Default Admin: Pre-created admin account for initial access
Run the database migration to set up user authentication:
npm run migrate:username
This will:
- Create the users table with username/password authentication
- Add user_id column to existing tables
- Create a default admin user (username: admin, password: admin123)
- Assign existing images to the admin user
Default Admin Credentials:
- Username: admin
- Password: admin123
- Important: Change the default password after first login!
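As a rough illustration of signing in with these credentials from a script or client, the sketch below builds a request for the `POST /api/auth/login` endpoint listed in the API section. The JSON field names (`username`, `password`), the helper names, and the use of the session cookie via `credentials: "include"` are assumptions; this README does not document the request body, so check the server's route handler for the actual contract.

```typescript
// Hypothetical helper: builds the fetch() arguments for a login request.
// Assumption: the server accepts a JSON body with `username` and `password`.
interface LoginRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    credentials: "include";
    body: string;
  };
}

function buildLoginRequest(
  baseUrl: string,
  username: string,
  password: string,
): LoginRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/auth/login`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // "include" lets a browser store and resend the session cookie.
      credentials: "include",
      body: JSON.stringify({ username, password }),
    },
  };
}

// Sends the request; resolves to true when the server answers with a 2xx status.
async function login(
  baseUrl: string,
  username: string,
  password: string,
): Promise<boolean> {
  const { url, init } = buildLoginRequest(baseUrl, username, password);
  const response = await fetch(url, init);
  return response.ok;
}
```

For example, `await login("http://localhost:3001", "admin", "admin123")` would attempt a login against the development server with the default admin account.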
# Start both server and client (Unix/Linux/Mac)
./scripts/start.sh
# Or for Windows
scripts\start.bat
# Stop both server and client (Unix/Linux/Mac)
./scripts/stop.sh
# Or for Windows
scripts\stop.bat
npm run dev:both
- Start the backend server:
  npm run dev:server
  Server will run on http://localhost:3001
- Start the frontend development server:
  npm run dev:client
  Frontend will run on http://localhost:5173
- Open your browser and navigate to http://localhost:5173
# Build both server and client
npm run build:all
# Start the production server
npm start
- ./scripts/setup.sh - Initial project setup (Unix/Linux/Mac)
- ./scripts/start.sh - Start both server and client (Unix/Linux/Mac)
- ./scripts/stop.sh - Stop all processes (Unix/Linux/Mac)
- scripts\setup.bat - Initial project setup (Windows)
- scripts\start.bat - Start both server and client (Windows)
- scripts\stop.bat - Stop all processes (Windows)
- npm run dev - Start server only
- npm run dev:server - Start server only
- npm run dev:client - Start client only
- npm run dev:both - Start both server and client concurrently
- npm run build - Build server only
- npm run build:all - Build both server and client
- npm run install:all - Install dependencies for both server and client
- npm run stop - Stop all processes (Unix/Linux/Mac only)
- npm start - Start production server
For detailed script documentation, see scripts/README.md.
Image Tagger includes powerful batch processing capabilities for handling large collections of images efficiently.
- Recursive Folder Scanning: Automatically discovers all images in folders and subfolders
- Duplicate Detection: Skips files that have already been processed (configurable)
- Real-time Progress: Live updates showing processing status and metrics
- Error Handling: Comprehensive error reporting with detailed logs
- Background Processing: Non-blocking operations that don't freeze the UI
- Configurable Options: Customize thumbnail size, quality, and processing behavior
- Navigate to the main gallery page
- Click the "Batch Processing" button in the header
- Or visit http://localhost:5173/batch directly
Folder Path: /path/to/your/images/folder
Thumbnail Size: 300px (default)
AI Analysis Size: 1024px (default)
JPEG Quality: 85% (default)
Skip Duplicates: ✓ (recommended)
- Click "Start Batch Processing"
- Monitor real-time progress with detailed metrics
- View processing status: Total, Processed, Success, Duplicates, Errors
- Successful Files: Appear in the main gallery with AI analysis
- Duplicate Files: Listed in the error report (if skip duplicates is enabled)
- Error Files: Detailed error messages for troubleshooting
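This README does not say how already-processed files are recognized. One common approach is to compare a hash of each file's bytes, and the sketch below illustrates that idea only. The choice of SHA-256, the `contentHash` function, and the in-memory `DuplicateIndex` class are all assumptions made for illustration, not the application's actual mechanism.

```typescript
import { createHash } from "node:crypto";

// Assumption: two files count as duplicates when their bytes produce the
// same SHA-256 digest.
function contentHash(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical in-memory index. A real application would more likely
// persist the hashes, for example in the SQLite database.
class DuplicateIndex {
  private readonly seen = new Map<string, string>();

  // Returns the path of an earlier file with identical content, or null
  // (and records this file) when the content has not been seen before.
  check(path: string, data: Uint8Array): string | null {
    const hash = contentHash(data);
    const existing = this.seen.get(hash);
    if (existing !== undefined) {
      return existing;
    }
    this.seen.set(hash, path);
    return null;
  }
}
```

Hashing the content rather than comparing file names means a renamed or moved copy is still detected, at the cost of reading every file once.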
# Example folder structure
/Photos/
├── 2023/
│   ├── Vacation/
│   │   ├── IMG_001.jpg
│   │   ├── IMG_002.CR2
│   │   └── ...
│   └── Events/
│       ├── Wedding/
│       └── Birthday/
└── 2024/
    ├── Travel/
    └── Family/
- JPEG/PNG: Standard web formats
- TIFF: High-quality images
- RAW Files: CR2, NEF, ARW, DNG, RAF, ORF, RW2
- Large Files: Up to 50MB per file
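The format and size rules above can be expressed as a small pre-flight check. This is a minimal sketch: the extension lists and the 50,000,000-byte limit come from this README, while the `checkFile` function and its result type are hypothetical names, and checking by extension alone (rather than inspecting file contents) is a simplifying assumption.

```typescript
// Extensions taken from the supported-format lists in this README.
const STANDARD_EXTENSIONS = ["jpg", "jpeg", "png", "tiff", "tif"];
const RAW_EXTENSIONS = ["cr2", "nef", "arw", "dng", "raf", "orf", "rw2"];

// Matches the MAX_FILE_SIZE default shown in the configuration section.
const MAX_FILE_SIZE_BYTES = 50000000;

type FileCheck =
  | { ok: true; kind: "standard" | "raw" }
  | { ok: false; reason: string };

// Decides whether a file looks processable, based only on name and size.
function checkFile(fileName: string, sizeBytes: number): FileCheck {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";

  if (sizeBytes > MAX_FILE_SIZE_BYTES) {
    return { ok: false, reason: `file exceeds ${MAX_FILE_SIZE_BYTES} bytes` };
  }
  if (STANDARD_EXTENSIONS.includes(ext)) {
    return { ok: true, kind: "standard" };
  }
  if (RAW_EXTENSIONS.includes(ext)) {
    return { ok: true, kind: "raw" };
  }
  return { ok: false, reason: `unsupported extension: ${ext || "(none)"}` };
}
```

With this sketch, `checkFile("IMG_002.CR2", 1000)` is accepted as a RAW file, while a `.gif` file or a 60 MB JPEG is rejected with a reason string.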
The system gracefully handles:
- Unsupported formats: Skipped with clear error messages
- Corrupted files: Logged and processing continues
- Permission issues: Detailed error reporting
- Network interruptions: Automatic retry mechanisms
- Processing Speed: ~2-5 seconds per image (depending on size and AI analysis)
- Memory Usage: Optimized for large batches with streaming processing
- Storage: Thumbnails and processed images stored efficiently
- Concurrent Processing: Background AI analysis doesn't block file processing
- Progress Bar: Visual progress indicator
- Live Metrics: Updated every 2 seconds
- Status Indicators: Processing, completed, error states
- Time Estimates: Duration and remaining time
- Categorized Errors: Duplicates, processing errors, unsupported files
- File-specific Details: Exact error messages for each failed file
- Expandable Lists: Click to view detailed error information
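The progress percentage and remaining-time estimate mentioned above can be derived from a few counters. The sketch below shows one simple way to do that, assuming the remaining files take the same average time as the ones already finished. The `BatchSnapshot` shape and the `estimateProgress` function are hypothetical; the application's real status payload is not documented here.

```typescript
// Hypothetical snapshot of a running batch. The counter names loosely
// mirror the labels shown in the UI (Total, Processed).
interface BatchSnapshot {
  total: number;
  processed: number;
  startedAtMs: number;
}

interface ProgressEstimate {
  percent: number; // 0 to 100
  elapsedMs: number;
  remainingMs: number | null; // null until at least one file is done
}

function estimateProgress(
  snapshot: BatchSnapshot,
  nowMs: number,
): ProgressEstimate {
  const { total, processed, startedAtMs } = snapshot;
  const elapsedMs = Math.max(0, nowMs - startedAtMs);
  const percent = total > 0 ? Math.min(100, (processed / total) * 100) : 0;

  if (processed <= 0 || total <= 0) {
    // No basis for an estimate yet.
    return { percent, elapsedMs, remainingMs: null };
  }

  // Assume remaining files take the same average time as finished ones.
  const averageMs = elapsedMs / processed;
  const remainingMs = Math.max(0, total - processed) * averageMs;
  return { percent, elapsedMs, remainingMs };
}
```

For instance, with 25 of 100 files done after 50 seconds, the average is 2 seconds per file, so the estimate is 25% complete with 150 seconds remaining.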
# Start batch processing
POST /api/images/batch/process
{
"folderPath": "/path/to/images",
"options": {
"skipDuplicates": true,
"thumbnailSize": 300,
"geminiImageSize": 1024,
"quality": 85
}
}
# Get batch status
GET /api/images/batch/:batchId
# Get all batches
GET /api/images/batch
# Delete batch
DELETE /api/images/batch/:batchId
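A client could call the batch endpoint along the following lines. This is a sketch under stated assumptions: the option names and default values are copied from the request body and configuration form above, but the helper names (`buildBatchBody`, `startBatch`), the cookie-based authentication, and the untyped response are assumptions, since the response format is not documented in this README.

```typescript
// Option names are copied from the request body shown above.
interface BatchOptions {
  skipDuplicates: boolean;
  thumbnailSize: number;
  geminiImageSize: number;
  quality: number;
}

// Defaults follow the values listed in the batch configuration form.
const DEFAULT_BATCH_OPTIONS: BatchOptions = {
  skipDuplicates: true,
  thumbnailSize: 300,
  geminiImageSize: 1024,
  quality: 85,
};

// Serializes the request body, filling in defaults for omitted options.
function buildBatchBody(
  folderPath: string,
  overrides: Partial<BatchOptions> = {},
): string {
  if (folderPath.trim() === "") {
    throw new Error("folderPath must not be empty");
  }
  return JSON.stringify({
    folderPath,
    options: { ...DEFAULT_BATCH_OPTIONS, ...overrides },
  });
}

// Starts a batch and returns the parsed response. The response shape is
// not documented, so it is left as `unknown` for the caller to inspect.
async function startBatch(
  baseUrl: string,
  folderPath: string,
  overrides?: Partial<BatchOptions>,
): Promise<unknown> {
  const response = await fetch(`${baseUrl}/api/images/batch/process`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    credentials: "include", // assumption: the session cookie authenticates the call
    body: buildBatchBody(folderPath, overrides),
  });
  if (!response.ok) {
    throw new Error(`Batch request failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the body construction in a separate pure function makes the defaults easy to test without a running server.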
- Organize Your Images: Use clear folder structures for better organization
- Check Disk Space: Ensure sufficient space for thumbnails and processed images
- Monitor Progress: Keep the batch processing page open to monitor progress
- Handle Errors: Review error reports and fix issues before reprocessing
- Backup Important Files: Always backup original images before processing
image-tagger/
├── src/                    # Backend source code
│   ├── routes/             # API routes
│   ├── services/           # Business logic services
│   ├── models/             # Database models
│   ├── types/              # TypeScript type definitions
│   └── utils/              # Utility functions
├── client/                 # Frontend React application
│   ├── src/
│   │   ├── components/     # React components
│   │   ├── services/       # API client services
│   │   └── assets/         # Static assets
├── uploads/                # Uploaded images storage
├── thumbnails/             # Generated thumbnails
├── database.sqlite         # SQLite database file
└── dist/                   # Compiled backend code
- POST /api/auth/login - Login with username/password
- POST /api/auth/register - Register new user account
- GET /api/auth/user - Get current user information
- GET /api/auth/status - Check authentication status
- POST /api/auth/logout - Logout current user

- GET /api/health - Health check
- GET /api/images - Get all images (supports pagination: ?page=1&limit=12)
- GET /api/images/:id - Get specific image
- GET /api/images/:id/analysis - Get image analysis
- POST /api/images/upload - Upload new image
- POST /api/images/:id/analyze - Trigger manual analysis

- GET /api/images/search?q=searchTerm - Search across all metadata fields
- GET /api/images/search/keyword/:keyword - Search by specific keyword

- POST /api/images/batch/process - Start batch processing
- GET /api/images/batch - Get all batch jobs
- GET /api/images/batch/:batchId - Get specific batch status
- DELETE /api/images/batch/:batchId - Delete batch job

- GET /api/images/test/gemini - Test Gemini API connection
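When calling the listing and search endpoints from code, query values and path segments need to be encoded. The helpers below are a minimal sketch of that; the paths and the `page`, `limit`, and `q` parameters come from the endpoint list above, while the function names and the default of 12 items per page (taken from the example query string) are assumptions.

```typescript
// Builds the paginated listing URL (GET /api/images?page=1&limit=12).
function imagesUrl(baseUrl: string, page = 1, limit = 12): string {
  const params = new URLSearchParams({
    page: String(page),
    limit: String(limit),
  });
  return `${baseUrl}/api/images?${params.toString()}`;
}

// Builds the full-text search URL (GET /api/images/search?q=...).
function searchUrl(baseUrl: string, term: string): string {
  const params = new URLSearchParams({ q: term });
  return `${baseUrl}/api/images/search?${params.toString()}`;
}

// Builds the keyword search URL. The keyword is a path segment, so it is
// percent-encoded to keep characters such as "/" from changing the route.
function keywordUrl(baseUrl: string, keyword: string): string {
  return `${baseUrl}/api/images/search/keyword/${encodeURIComponent(keyword)}`;
}
```

For example, `searchUrl("http://localhost:3001", "golden gate")` yields `http://localhost:3001/api/images/search?q=golden+gate`.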
Environment variables in .env:
# AI Provider Configuration
AI_PROVIDER=gemini # 'gemini' or 'ollama'
# Gemini AI Configuration (when AI_PROVIDER=gemini)
GEMINI_API_KEY=your_gemini_api_key_here
# Ollama Configuration (when AI_PROVIDER=ollama)
OLLAMA_BASE_URL=http://localhost:11434 # Ollama server URL
OLLAMA_MODEL=llava:latest # Vision model name
OLLAMA_TIMEOUT=300000 # Request timeout (5 minutes)
# Authentication Configuration
SESSION_SECRET=your-super-secret-session-key-change-in-production
# Server Configuration
PORT=3001
NODE_ENV=development
CLIENT_URL=http://localhost:5173
# Database Configuration
DATABASE_PATH=./database.sqlite
# Upload Configuration
UPLOAD_DIR=./uploads
THUMBNAIL_DIR=./thumbnails
MAX_FILE_SIZE=50000000
# Image Processing Configuration
THUMBNAIL_SIZE=300
AI_IMAGE_SIZE=1024 # Image size for AI analysis
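To show how these variables might be read and validated at startup, here is a minimal sketch of a configuration loader. The variable names and default values are taken from the listing above; the `loadConfig` and `intFromEnv` functions, the `AppConfig` shape, and the choice to default `AI_PROVIDER` to `gemini` when unset are assumptions, not the application's actual code.

```typescript
type AiProvider = "gemini" | "ollama";
type Env = Record<string, string | undefined>;

interface AppConfig {
  aiProvider: AiProvider;
  geminiApiKey: string | undefined;
  ollamaBaseUrl: string;
  ollamaModel: string;
  ollamaTimeoutMs: number;
  port: number;
  maxFileSize: number;
  thumbnailSize: number;
  aiImageSize: number;
}

// Parses a positive integer, falling back to the default when unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  // Assumption: fall back to "gemini" when AI_PROVIDER is not set.
  const rawProvider = env["AI_PROVIDER"] ?? "gemini";
  let aiProvider: AiProvider;
  if (rawProvider === "gemini" || rawProvider === "ollama") {
    aiProvider = rawProvider;
  } else {
    throw new Error(`AI_PROVIDER must be "gemini" or "ollama", got "${rawProvider}"`);
  }

  const geminiApiKey = env["GEMINI_API_KEY"];
  if (aiProvider === "gemini" && !geminiApiKey) {
    throw new Error("GEMINI_API_KEY environment variable is required");
  }

  return {
    aiProvider,
    geminiApiKey,
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://localhost:11434",
    ollamaModel: env["OLLAMA_MODEL"] ?? "llava:latest",
    ollamaTimeoutMs: intFromEnv(env, "OLLAMA_TIMEOUT", 300000),
    port: intFromEnv(env, "PORT", 3001),
    maxFileSize: intFromEnv(env, "MAX_FILE_SIZE", 50000000),
    thumbnailSize: intFromEnv(env, "THUMBNAIL_SIZE", 300),
    aiImageSize: intFromEnv(env, "AI_IMAGE_SIZE", 1024),
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`. Failing fast on a malformed value keeps a typo in `.env` from surfacing later as a confusing runtime error.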
- Standard: JPG, JPEG, PNG, TIFF, TIF
- RAW: CR2 (Canon), NEF (Nikon), ARW (Sony), DNG (Adobe), RAF (Fujifilm), ORF (Olympus), RW2 (Panasonic)
- Get a Gemini API key:
  - Visit Google AI Studio
  - Create a new API key
  - Add it to your .env file as GEMINI_API_KEY
- Set the provider:
  AI_PROVIDER=gemini
macOS:
# Using Homebrew (recommended)
brew install ollama
# Or download from website
# Visit https://ollama.ai/download and download the macOS installer
Linux:
# Using the official install script
curl -fsSL https://ollama.ai/install.sh | sh
# Or manually download and install
# Visit https://ollama.ai/download for manual installation
Windows:
- Visit https://ollama.ai/download
- Download the Windows installer
- Run the installer and follow the setup wizard
- Ollama will be available in your system PATH
After installing Ollama, you need to download a vision model that can analyze images:
# Download the recommended LLaVA model (7B parameters, ~4.7GB)
ollama pull llava:latest
# Alternative models (choose one):
# Larger, more accurate model (13B parameters, ~7.3GB)
ollama pull llava:13b
# BakLLaVA model (alternative implementation)
ollama pull bakllava:latest
# Moondream model (smaller, faster, ~1.7GB)
ollama pull moondream:latest
Model Comparison:
- llava:latest (7B): Best balance of speed and accuracy (recommended)
- llava:13b: Higher accuracy but slower and requires more RAM
- bakllava:latest: Alternative LLaVA implementation
- moondream:latest: Fastest but lower accuracy
# Start the Ollama server (required for the application to work)
ollama serve
# The server will start on http://localhost:11434
# Keep this terminal window open while using the application
Verification: Test that Ollama is running correctly:
# Test the server is responding
curl http://localhost:11434/api/tags
# Test your vision model
ollama run llava:latest "Describe this image: /path/to/test/image.jpg"
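The same check can be done programmatically. Ollama's `GET /api/tags` endpoint returns the installed models in a `models` array, each with a `name` field. The sketch below uses that to confirm a model is present; the `hasModel` and `checkOllama` helper names are hypothetical, and treating an untagged name as `:latest` is an assumption made for convenience.

```typescript
// Subset of the JSON returned by Ollama's GET /api/tags endpoint.
interface OllamaTagsResponse {
  models: Array<{ name: string }>;
}

// True when `model` is installed. Assumption: a name without a tag is
// treated as ":latest".
function hasModel(tags: OllamaTagsResponse, model: string): boolean {
  const wanted = model.includes(":") ? model : `${model}:latest`;
  return tags.models.some((entry) => entry.name === wanted);
}

// Queries a running Ollama server and reports whether the model is installed.
async function checkOllama(baseUrl: string, model: string): Promise<boolean> {
  const response = await fetch(`${baseUrl}/api/tags`);
  if (!response.ok) {
    throw new Error(`Ollama responded with HTTP ${response.status}`);
  }
  const tags = (await response.json()) as OllamaTagsResponse;
  return hasModel(tags, model);
}
```

Calling `await checkOllama("http://localhost:11434", "llava:latest")` distinguishes a server that is down (the call throws) from a server that is up but missing the model (the call resolves to `false`).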
Edit your .env file to use Ollama:
# Set Ollama as the AI provider
AI_PROVIDER=ollama
# Ollama server configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llava:latest
OLLAMA_TIMEOUT=300000
# Optional: Adjust timeout for large images (in milliseconds)
# Default is 5 minutes, increase if you have very large images
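To make the role of `OLLAMA_TIMEOUT` concrete, the sketch below sends one image to Ollama's `POST /api/generate` endpoint and aborts the request if it exceeds the timeout. The request fields (`model`, `prompt`, `images`, `stream`) and the `response` field in the reply follow Ollama's REST API; the function names and the fixed prompt are hypothetical, and this README does not show how the application itself calls Ollama.

```typescript
interface GenerateRequest {
  model: string;
  prompt: string;
  images: string[]; // base64-encoded image data
  stream: false; // request a single JSON reply instead of a stream
}

function buildGenerateRequest(
  model: string,
  prompt: string,
  imageBase64: string,
): GenerateRequest {
  return { model, prompt, images: [imageBase64], stream: false };
}

// Asks the vision model to describe one image. The default timeout mirrors
// OLLAMA_TIMEOUT=300000 (5 minutes) from the configuration above.
async function describeImage(
  baseUrl: string,
  model: string,
  imageBase64: string,
  timeoutMs = 300000,
): Promise<string> {
  const response = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(
      buildGenerateRequest(model, "Describe this image", imageBase64),
    ),
    // Aborts the request once the timeout elapses.
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) {
    throw new Error(`Ollama responded with HTTP ${response.status}`);
  }
  const data = (await response.json()) as { response: string };
  return data.response;
}
```

Large RAW previews or a slow CPU-only machine are the usual reasons to raise the timeout; with a smaller model such as `moondream:latest` a lower value may be enough.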
- Start the Image Tagger application:
  npm run dev:both
- Test the connection:
  - Go to http://localhost:5173
  - Upload a test image
  - Check that AI analysis works properly
Common Issues:
- "Connection refused" error:
  # Make sure Ollama server is running
  ollama serve
  # Check if port 11434 is available
  netstat -an | grep 11434
- "Model not found" error:
  # List installed models
  ollama list
  # Make sure your model name in .env matches exactly
  # Check OLLAMA_MODEL=llava:latest
- Slow processing:
  # Check system resources
  htop
  # Consider using a smaller model
  ollama pull moondream:latest
  # Then update .env: OLLAMA_MODEL=moondream:latest
- Out of memory errors:
  - Close other applications to free RAM
  - Use a smaller model like moondream:latest
  - Increase system swap space
Performance Tips:
- RAM Requirements: 8GB+ recommended for llava:latest, 16GB+ for llava:13b
- CPU: Better performance with more CPU cores
- GPU: Ollama can use GPU acceleration if available (NVIDIA/AMD)
- Storage: Models require 2-8GB disk space each
-
Gemini:
- Higher accuracy for complex scenes
- Better language understanding
- Requires internet connection
- API costs apply
-
Ollama:
- Complete privacy (local processing)
- No API costs after setup
- Works offline
- Requires more local resources
The application provides several new endpoints for AI provider management:
- GET /api/images/ai/provider/info - Get current provider information
- GET /api/images/ai/providers - List all available providers
- GET /api/images/ai/provider/test - Test current provider connection
- GET /api/images/test/gemini - Legacy endpoint (now tests current provider)
- Upload fails with large files
  - Check the MAX_FILE_SIZE setting in .env
  - Default limit is 50MB
- "Failed to extract RAW preview"
  - Some RAW formats may not be fully supported
  - Try converting to JPEG/TIFF first
- "GEMINI_API_KEY environment variable is required"
  - Make sure you've set up your .env file with a valid Gemini API key
  - Ensure AI_PROVIDER=gemini is set
- Gemini API connection fails
  - Check your API key is valid and active
  - Verify internet connectivity
  - Check API quotas and billing in Google Cloud Console
- "Ollama server not accessible"
  - Ensure Ollama is running: ollama serve
  - Check if the base URL is correct in .env
  - Verify port 11434 is not blocked by firewall
- "Model not found in Ollama"
  - Install the vision model: ollama pull llava:latest
  - Check available models: ollama list
  - Verify the model name in OLLAMA_MODEL matches exactly
- Ollama requests timeout
  - Increase OLLAMA_TIMEOUT for large images
  - Consider using a smaller/faster model
  - Check system resources (RAM, CPU)
- Poor quality results with Ollama
  - Try a larger model: ollama pull llava:13b
  - Experiment with different vision models
  - Adjust the custom prompt for better results
- Database errors
  - Delete database.sqlite to reset the database
  - The database will be recreated automatically
We welcome contributions from the community! Whether you're fixing bugs, adding features, improving documentation, or suggesting enhancements, your help is appreciated.
- Fork the repository on GitHub
- Clone your fork locally
- Set up the development environment: ./scripts/setup.sh
- Create a feature branch: git checkout -b feature/your-feature-name
- Make your changes and test thoroughly
- Submit a pull request with a clear description
- Testing: Add unit and integration tests
- Documentation: Improve guides and API docs
- Bug Fixes: Fix issues and improve stability
- New Features: Add new functionality
- UI/UX: Enhance user interface and experience
- Performance: Optimize code and improve speed
- Check existing issues and pull requests
- Read our Contributing Guidelines
- Follow our coding standards and best practices
- Test your changes thoroughly
For detailed contribution guidelines, see CONTRIBUTING.md.
Image Tagger is licensed under a Non-Commercial Use License.
You may:
- Use for personal, educational, and non-commercial purposes
- Study, modify, and distribute the source code
- Create derivative works for non-commercial use
You may not:
- Use for commercial purposes without permission
- Sell, rent, or lease the software
- Use in commercial products or services
For commercial licensing, please contact: lists@anands.net
See LICENSE.md for complete license terms.
- Google Gemini AI - Advanced image analysis capabilities
- Sharp - High-performance image processing library
- ExifR - Comprehensive RAW file format support
- React - Modern frontend framework
- Express.js - Fast, minimalist web framework
- SQLite - Reliable embedded database
- TypeScript - Type-safe JavaScript development
- Documentation: Check this README and CONTRIBUTING.md
- Issues: Report bugs and request features on GitHub Issues
- Discussions: Join conversations on GitHub Discussions
- Email: Contact the maintainer at lists@anands.net
Made with ❤️ by Anand Kumar Sankaran