🖼️ Web app for generating detailed image captions via OpenAI's GPT API or Ollama, perfect for LoRA model training. Upload images, add custom prefixes/suffixes, and download captions as a ZIP file.

aleksa-codes/gpt-flux-img-captioner

GPT Image Captioner 🖼️

A web app that generates AI-powered image captions. Ideal for LoRA model training on platforms like fal LoRA Trainer and Replicate LoRA Trainer.

Live Demo

✨ Features

  • Dual Model Support: OpenAI API (GPT-4.1 series) or Ollama (local models)
  • Batch Processing: Upload and caption multiple images at once
  • Customization: Add prefix/suffix to captions
  • Export: Download all captions as a ZIP file
  • API Key Management: Securely store OpenAI keys in-app

🧠 Model Options

  • OpenAI: GPT-4.1 (high-quality), GPT-4.1-mini (balanced), and GPT-4.1-nano (faster, cheaper)
  • Ollama: Local vision models (LLaVA, moondream, bakLLaVA), no API key needed

Note: The deployed web app runs in your browser and must be able to reach the Ollama server on your machine. There are two ways to set this up:

  1. Use ngrok to create a secure tunnel to your local Ollama server. Learn more.
  2. Configure Ollama to allow additional web origins using the OLLAMA_ORIGINS environment variable. Learn more and check out LobeHub's Ollama provider documentation.
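Whichever option you pick, it can help to confirm that the Ollama server is reachable at the URL you plan to enter in the app. The sketch below is not taken from this project's source: the helper names `ollamaEndpoint` and `listOllamaModels` are hypothetical, and it assumes Ollama's default port 11434 and its `/api/tags` endpoint, which lists locally installed models. The fetch function is passed in so the same code works with the browser's `fetch` or Node's.

```typescript
// Hypothetical helpers for illustration; not part of this repository.
const DEFAULT_OLLAMA_URL = "http://localhost:11434"; // Ollama's default address

// Minimal shape of the fetch function this sketch needs.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Join a base URL (local server or tunnel URL) with an API path,
// tolerating trailing slashes on the base and a missing leading slash on the path.
function ollamaEndpoint(baseUrl: string, path: string): string {
  const base = baseUrl.trim().replace(/\/+$/, "");
  const suffix = path.charAt(0) === "/" ? path : "/" + path;
  return base + suffix;
}

// Return the names of the models installed on the Ollama server.
// Throws if the server answers with a non-2xx status.
async function listOllamaModels(
  fetchFn: FetchLike,
  baseUrl: string = DEFAULT_OLLAMA_URL
): Promise<string[]> {
  const res = await fetchFn(ollamaEndpoint(baseUrl, "/api/tags"));
  if (!res.ok) {
    throw new Error("Ollama responded with HTTP " + res.status);
  }
  const data = (await res.json()) as { models?: { name: string }[] };
  return (data.models ?? []).map((m) => m.name);
}

// Example (not executed here):
//   listOllamaModels(fetch).then((names) => console.log(names));
```

If the call fails in the browser but works from a terminal, the likely cause is that the page's origin has not been allowed via OLLAMA_ORIGINS.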

🛠️ Tech Stack

Next.js 14, Tailwind CSS, shadcn/ui, Lucide React, Vercel AI SDK

🚀 Quick Start

Prerequisites

  • Node.js (v18.17+, the minimum version supported by Next.js 14)
  • Yarn
  • OpenAI API key (if using OpenAI)
  • Ollama installed locally (if using local models)

Install & Run

# Clone repo
git clone https://github.com/aleksa-codes/gpt-flux-img-captioner.git
cd gpt-flux-img-captioner

# Install dependencies
yarn install

# Start development server
yarn dev

Open http://localhost:3000 in your browser.

💡 Usage

  1. Choose between OpenAI or Ollama
  2. Upload one or more images
  3. Add optional prefix/suffix
  4. Generate captions
  5. Download as ZIP
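To make steps 3 to 5 concrete, here is a minimal sketch of how a prefix and suffix might be combined with a generated caption, and how each caption could be named inside the ZIP. It is an illustration, not the app's actual code: `buildCaption` and `captionFileName` are hypothetical names, joining the parts with a single space is an assumption, and the one-`.txt`-per-image naming reflects a common LoRA training convention rather than anything confirmed by this README.

```typescript
// Hypothetical helpers for illustration; not part of this repository.

// Combine an optional prefix and suffix with the generated caption.
// Assumption: parts are trimmed and joined with a single space; empty parts are skipped.
function buildCaption(raw: string, prefix: string = "", suffix: string = ""): string {
  return [prefix, raw, suffix]
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(" ");
}

// Derive the caption file name from the image file name by swapping
// the extension for ".txt" (e.g. "cat.png" becomes "cat.txt").
function captionFileName(imageName: string): string {
  const dot = imageName.lastIndexOf(".");
  const stem = dot > 0 ? imageName.slice(0, dot) : imageName;
  return stem + ".txt";
}

// Example: a trigger word as prefix and a style hint as suffix.
const caption = buildCaption("a red car parked on a street", "TOK,", "high quality");
const fileName = captionFileName("car_01.png");
```

A prefix is often used for a trigger word so that every caption in the training set starts the same way.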

Using Ollama

  1. Install Ollama
  2. Pull a vision model: ollama pull llava
  3. Start the Ollama server: ollama serve
  4. Select "Ollama" in the app and choose your model
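For readers curious about what a captioning request to a local vision model can look like, the sketch below builds a request body for Ollama's `/api/generate` endpoint. It is an assumption-laden illustration and not this app's implementation (which lists the Vercel AI SDK in its tech stack): the helper names and the prompt text are hypothetical, while the `model`, `prompt`, `images`, and `stream` fields are part of Ollama's documented API.

```typescript
// Hypothetical helpers for illustration; not part of this repository.

interface OllamaGenerateRequest {
  model: string;    // e.g. "llava", as pulled in step 2
  prompt: string;   // instruction for the vision model
  images: string[]; // base64-encoded images, without a "data:" URL prefix
  stream: boolean;  // false asks for one JSON object instead of a stream
}

// Browsers often produce data URLs ("data:image/png;base64,...");
// Ollama expects only the base64 payload after the comma.
function stripDataUrlPrefix(image: string): string {
  const comma = image.indexOf(",");
  const isDataUrl = image.slice(0, 5) === "data:" && comma !== -1;
  return isDataUrl ? image.slice(comma + 1) : image;
}

// Build the JSON body for POST /api/generate.
function buildGenerateRequest(
  model: string,
  prompt: string,
  image: string
): OllamaGenerateRequest {
  return {
    model,
    prompt,
    images: [stripDataUrlPrefix(image)],
    stream: false,
  };
}

// Example (not executed here): send the request and read the caption
// from the "response" field of the returned JSON.
//   const body = buildGenerateRequest("llava", "Describe this image in detail.", dataUrl);
//   const res = await fetch("http://localhost:11434/api/generate", {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(body),
//   });
//   const { response } = await res.json();
```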

🤝 Contributing

Contributions welcome! Fork the repo, create a feature branch, and submit a pull request.

📝 License

MIT License - see the LICENSE file for details.


Made with ❤️ by aleksa.codes