🖼️ Image Caption Generator

Automatically generate captions for images using a deep learning model.

🔗 Live App: https://image-caption-generator-using-cnn-lstm-nddimension.streamlit.app/

📓 Notebook: https://www.kaggle.com/code/nddimension/image-captioning-using-cnn-rnn

🚀 Model: https://drive.google.com/file/d/1d-qOyZaU34_N-cxEDtFPG9iApCbLrlmu/view?usp=drive_link

🗣️ Dataset: https://drive.google.com/file/d/1QNCjQCsQBoxlMyc9WFM_fg2LU5NEh5iJ/view?usp=drive_link

🎯 Project Overview

Image Caption Generator is an AI-powered web app that generates natural language descriptions for images using a deep learning model. It combines a CNN for image feature extraction and an LSTM decoder to produce coherent captions.

- 📷 Upload or select a sample image
- 🧠 AI generates descriptive captions
- 🗣️ Powered by a pre-trained CNN-LSTM model
- 🚀 Interactive and educational experience

- ✅ Pre-trained models loaded automatically
- ✅ Sample images included for quick testing
- ✅ Supports image uploads (JPG, PNG, JPEG)
- ✅ Built with Streamlit for ease of use

🔍 Features

Feature	Description
🖼️ Image Upload	Upload your own image or select from sample images
🧠 AI Captioning	Generate natural-language captions using deep learning
📝 Caption Display	Clean, styled caption output with real-time preview
⚙️ Model Caching	Speeds up inference using Streamlit caching
📖 Educational Sections	Learn how the model architecture works
🔍 Debug Mode	Optional debug panel for technical details

📌 Workflow

Load Pre-trained Models
Image Preprocessing
- Resize, normalize, and format image for the CNN
Feature Extraction
- A CNN backbone (e.g., ResNet or Inception) extracts image features
Caption Generation
- LSTM decoder predicts words one by one (auto-regressive)
Display Output
- Caption is cleaned and shown in real time
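
The "Display Output" step can be illustrated with a small sketch. It assumes the decoder emits a raw string wrapped in the startseq and endseq markers described in this README; the function name `clean_caption` and the capitalization and trailing period are illustrative choices, not necessarily what app.py does.

```python
def clean_caption(raw: str, start_token: str = "startseq", end_token: str = "endseq") -> str:
    """Strip the sequence markers from a raw decoder output and tidy it for display.

    Hypothetical helper: the token names follow this README, the formatting
    rules (capital first letter, trailing period) are assumptions.
    """
    words = raw.split()

    # Drop the leading start marker if the decoder kept it.
    if words and words[0] == start_token:
        words = words[1:]

    # Keep only the words before the first end marker.
    if end_token in words:
        words = words[: words.index(end_token)]

    text = " ".join(words)
    if not text:
        return ""
    return text[0].upper() + text[1:] + "."


print(clean_caption("startseq a dog runs on the grass endseq"))
```

In a Streamlit app the returned string would typically be passed to a display call; that part is left out here so the sketch runs without extra dependencies.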

⚙️ How It Works

Architecture
- A CNN (e.g., ResNet) is used to extract image features
- A pre-trained LSTM model takes these features and generates a caption word-by-word
Tokenizer & Sequence
- A tokenizer encodes/decodes the text data
- Input sequences are padded to a fixed max length
Inference
- Starts with the startseq token
- Predicts the next word from a softmax distribution over the vocabulary
- Stops at endseq or when the max length is reached
Interface
- Streamlit UI allows users to upload images or choose from samples
- Captions are generated and displayed on the same page
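
The tokenizer, padding, and inference steps above can be sketched as a greedy decoding loop. This is a dependency-free illustration rather than the project's actual code: `pad_sequence`, `greedy_caption`, and the toy `word_index` and `toy_predict_logits` are hypothetical stand-ins for the real tokenizer and the trained CNN-LSTM model. Left-padding with zeros mirrors the default behavior of Keras `pad_sequences`, which is assumed here.

```python
import math
from typing import Callable, Dict, List, Sequence


def pad_sequence(seq: Sequence[int], max_length: int) -> List[int]:
    """Left-pad with zeros (index 0 is reserved for padding); keep the last max_length items."""
    trimmed = list(seq)[-max_length:]
    return [0] * (max_length - len(trimmed)) + trimmed


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_caption(
    predict_logits: Callable[[object, List[int]], Sequence[float]],
    image_features: object,
    word_index: Dict[str, int],
    max_length: int,
) -> str:
    """Generate a caption one word at a time, always taking the most probable word."""
    index_word = {i: w for w, i in word_index.items()}
    words = ["startseq"]
    for _ in range(max_length):
        seq = pad_sequence([word_index[w] for w in words], max_length)
        probs = softmax(predict_logits(image_features, seq))
        next_index = max(range(len(probs)), key=probs.__getitem__)
        next_word = index_word.get(next_index)
        # Stop on the end marker or on an index with no word (e.g. padding).
        if next_word is None or next_word == "endseq":
            break
        words.append(next_word)
    return " ".join(words[1:])


# Toy vocabulary and model, for illustration only.
word_index = {"startseq": 1, "a": 2, "dog": 3, "runs": 4, "endseq": 5}


def toy_predict_logits(image_features: object, padded_seq: List[int]) -> List[float]:
    """Stand-in for the trained model: favors the word whose index follows the last token."""
    logits = [0.0] * (len(word_index) + 1)
    logits[min(padded_seq[-1] + 1, len(word_index))] = 5.0
    return logits


caption = greedy_caption(toy_predict_logits, None, word_index, max_length=10)
print(caption)
```

With a real model, `predict_logits` would wrap the model's prediction call on the image features and the padded sequence; if the model already outputs softmax probabilities, the `softmax` step can be dropped.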

🎹 App Preview

🧠 Image + Caption

📦 Requirements

Install everything using:

pip install -r requirements.txt

🚀 Getting Started

1️⃣ Clone the repository

git clone https://github.com/NDDimension/Image-Caption-Generator-using-CNN-LSTM.git
cd Image-Caption-Generator-using-CNN-LSTM

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Run the Streamlit App

streamlit run app.py

✨ Highlights

- ✅ Automatic download of pre-trained models and tokenizer
- ✅ Streamlit-based interactive interface
- ✅ Works with sample and user-uploaded images
- ✅ Educational explanations included
- ✅ Debug mode for inspecting internals

🔮 Future Improvements

- 🧠 Add beam search for more accurate caption generation
- 🌐 Deploy to HuggingFace Spaces
- 📤 Allow batch caption generation
- 🗂️ Add support for custom training datasets
- 🎯 Add attention visualization for interpretability
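
As a rough idea of what the beam search improvement could look like, here is a minimal sketch. It is not part of the current app: `beam_search`, `TABLE`, and `toy_next_log_probs` are hypothetical names, and the toy probability table stands in for the trained decoder. Scores are summed log-probabilities without length normalization, which a real implementation might want to add.

```python
import math
from typing import Callable, Dict, List, Tuple


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    start: str,
    end: str,
    beam_width: int,
    max_length: int,
) -> List[str]:
    """Keep the beam_width best partial captions at each step instead of only one."""
    beams: List[Tuple[List[str], float]] = [([start], 0.0)]
    for _ in range(max_length):
        candidates: List[Tuple[List[str], float]] = []
        for seq, score in beams:
            if seq[-1] == end:
                # Finished captions are carried forward unchanged.
                candidates.append((seq, score))
                continue
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        # Keep the highest-scoring candidates (the sort is stable, so ties keep their order).
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end for seq, _ in beams):
            break
    return beams[0][0]


# Toy next-word probabilities keyed by the previous word, for illustration only.
TABLE = {
    "startseq": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"endseq": 1.0},
    "cat": {"endseq": 1.0},
}


def toy_next_log_probs(seq: List[str]) -> Dict[str, float]:
    """Stand-in for the decoder: log-probabilities of each possible next word."""
    return {token: math.log(p) for token, p in TABLE[seq[-1]].items()}


greedy = beam_search(toy_next_log_probs, "startseq", "endseq", beam_width=1, max_length=5)
wider = beam_search(toy_next_log_probs, "startseq", "endseq", beam_width=2, max_length=5)
print(greedy)
print(wider)
```

In this toy table, a beam width of 1 behaves like greedy decoding and commits to "a" (overall probability 0.3), while a width of 2 keeps "the" alive long enough to find the more probable caption "the dog" (0.36).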

🙌 Credits & Contributors

Notebook Revamped & Curated by: NISHTHA SHARMA

📌 GitHub: https://github.com/711nishtha

📌 Kaggle: https://www.kaggle.com/nishtha711

App and Training by: DHANRAJ SHARMA

📌 GitHub: https://github.com/NDDimension

Inspired by:

Show and Tell Model (Google)
Flickr8k Dataset
TensorFlow & Keras captioning tutorials
Streamlit open-source community

📜 License

Licensed under the MIT License.

Image Caption Generator — AI that sees and speaks. ❤️ Made with love by Dhanraj Sharma.
