
Food Segmentation Using GroundingDINO and MobileSAM


A comprehensive project for prompt-guided food segmentation using state-of-the-art pre-trained models, with a focus on Ghanaian dishes. It combines GroundingDINO for text-prompted object detection with MobileSAM for precise segmentation, and provides both a Google Colab notebook for experimentation and a Flask web application for easy deployment.

📚 IMPORTANT: Research First!

Before diving into the implementation, we strongly recommend reading through the research papers in the research/ folder to understand the theoretical foundations and capabilities of the pre-trained models used in this project:

  • GroundingDINO Research: Understanding prompt-guided object detection
  • Guided Diffusion Model for Adversarial Purification: Advanced model techniques
  • Image Segmentation Using Text and Image Prompts: Core segmentation concepts

These papers provide valuable insights into how the models work, their limitations, and best practices for optimal results.

🚀 Features

  • Prompt-guided segmentation: Upload an image and provide a text prompt to segment specific food items
  • Real-time processing: Fast inference using pre-trained models
  • Multiple interfaces:
    • Google Colab notebook for experimentation and analysis
    • Flask web application for easy deployment
  • User-friendly interface: Clean, responsive web interface
  • Transparent background: Segmented objects are saved with transparent backgrounds
  • Comprehensive testing: Built-in test suite to verify functionality
  • Multiple model support: Includes both MobileSAM and MobileSAMv2 for enhanced performance
  • Automatic model download: Models are automatically downloaded if not present
  • Error handling: Robust error handling and validation for various edge cases
  • Health check endpoint: Built-in health monitoring for production deployment

πŸ“ Project Structure

Food-segmentation-with-pre-trained-model/
├── Food_Segmentation.ipynb     # Google Colab notebook for experimentation
├── webapp/                     # Flask web application
│   ├── app.py                  # Main Flask application with segmentation logic
│   ├── model_loader.py         # Model loading and initialization with auto-download
│   ├── test_app.py             # Test script for validation
│   ├── requirements.txt        # Python dependencies
│   ├── static/                 # Static files (generated images)
│   │   ├── images/            # Uploaded and processed images
│   │   └── GeneratedImages/   # Segmentation results with transparent backgrounds
│   ├── GroundingDINO/         # GroundingDINO model files
│   └── MobileSAM/             # MobileSAM and MobileSAMv2 model files
│       ├── MobileSAMv2/       # Enhanced MobileSAMv2 implementation
│       └── weights/           # Model weights
├── images/                     # Sample food images for testing (40+ images)
├── Results/                    # Segmentation results and analysis
│   ├── accurateresults/       # Successful segmentation results
│   ├── inaccuracies/          # Failed segmentation cases
│   └── result.json            # Detailed results data (7,000+ lines)
└── readme.md                  # This file

πŸ› οΈ Prerequisites

Make sure you have the following installed:

  • Python 3.7+
  • PyTorch
  • OpenCV
  • Flask
  • Access to Google Colab (for notebook experimentation)
  • Other dependencies listed in webapp/requirements.txt

📦 Installation

Option 1: Using Google Colab (Recommended for Development)

  1. Open the Jupyter notebook in Google Colab:

    • Click the "Open in Colab" button in the notebook
    • Or manually upload Food_Segmentation.ipynb to Google Colab
  2. The notebook will automatically:

    • Set up the environment
    • Clone the required model repositories
    • Install all dependencies
    • Download model weights
    • Load the models for experimentation
  3. Run the cells sequentially to perform food segmentation experiments

Option 2: Using the Web Application

  1. Navigate to the webapp directory:
cd webapp
  2. Install dependencies:
pip install -r requirements.txt
  3. The application will automatically download the model files if they don't exist:
    • GroundingDINO checkpoint: GroundingDINO/groundingdino_swint_ogc.pth
    • MobileSAM checkpoint: MobileSAM/weights/mobile_sam.pt
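
As a rough sketch of what this auto-download step does, a download-if-missing helper might look like the following. The checkpoint URLs are taken from the upstream GroundingDINO and MobileSAM release pages and are an assumption here, not a transcription of model_loader.py:

import urllib.request
from pathlib import Path

# Assumed upstream checkpoint URLs; the real model_loader.py may differ.
CHECKPOINTS = {
    "GroundingDINO/groundingdino_swint_ogc.pth":
        "https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth",
    "MobileSAM/weights/mobile_sam.pt":
        "https://github.com/ChaoningZhang/MobileSAM/raw/master/weights/mobile_sam.pt",
}

def ensure_checkpoints() -> None:
    """Download each checkpoint only if it is not already on disk."""
    for rel_path, url in CHECKPOINTS.items():
        path = Path(rel_path)
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            print(f"Downloading {path.name} ...")
            urllib.request.urlretrieve(url, str(path))

ensure_checkpoints()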

🚀 Usage

Web Application

  1. Start the web application:
cd webapp
python app.py

  2. Open one of the local addresses printed in the terminal in your default web browser:
    • http://127.0.0.1:5001
    • http://192.168.0.181:5001
  3. Upload an image and enter a prompt describing the food item you want to segment (e.g., "Banku", "Jollof Rice", "Tomato Stew")
  4. Click "Segment Food" to process the image
  5. View the results showing both the original image and the segmented object

Google Colab Notebook

  1. Open Food_Segmentation.ipynb in Google Colab
  2. Run the cells sequentially to:
    • Set up the environment (automatic in Colab)
    • Clone and install model repositories
    • Download and load models
    • Perform segmentation on sample images
    • Analyze results

🔧 API Endpoints (Web Application)

  • GET /: Main web interface
  • POST /segment: Process image segmentation (expects multipart form data with image_file and prompt)
  • GET /health: Health check endpoint for monitoring
  • GET /static/<filename>: Serve static files (images)
  • GET /static/images/<filename>: Serve processed images
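
For programmatic access, the /segment endpoint can be exercised with a small Python client along these lines. The field names image_file and prompt come from the endpoint description above; the sample path and prompt are placeholders, and since the shape of the response body (HTML vs. JSON) depends on the app, only the status and content type are printed:

import requests

BASE_URL = "http://127.0.0.1:5001"

# Liveness check against the built-in health endpoint.
health = requests.get(f"{BASE_URL}/health", timeout=10)
print("health:", health.status_code)

# Send an image plus a text prompt as multipart form data.
with open("images/sample.jpg", "rb") as f:  # placeholder path
    resp = requests.post(
        f"{BASE_URL}/segment",
        files={"image_file": f},
        data={"prompt": "Jollof Rice"},
        timeout=120,
    )
print("segment:", resp.status_code, resp.headers.get("Content-Type"))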

🤖 Model Information

  • GroundingDINO: Used for object detection based on text prompts
  • MobileSAM: Used for precise segmentation of detected objects
  • MobileSAMv2: Enhanced version with object-aware prompt sampling
    • Available in the MobileSAM directory
    • Faster segmentation with improved accuracy
  • Both models run on CPU by default (GPU support available if CUDA is installed)
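
To make the division of labor concrete, here is a minimal sketch of how the two models can be chained: GroundingDINO turns a text prompt into bounding boxes, and MobileSAM turns the best box into a pixel mask that is saved as a cutout with a transparent background. It assumes the upstream groundingdino and mobile_sam packages are installed; the file paths, prompt, and thresholds are illustrative, not a transcription of app.py:

import cv2
import numpy as np
import torch
from groundingdino.util import box_ops
from groundingdino.util.inference import load_model, load_image, predict
from mobile_sam import sam_model_registry, SamPredictor

# 1. Detect the prompted food item with GroundingDINO (CPU by default).
dino = load_model(
    "GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py",  # illustrative path
    "GroundingDINO/groundingdino_swint_ogc.pth",
    device="cpu",
)
image_source, image = load_image("images/sample.jpg")  # RGB array + model tensor
boxes, logits, phrases = predict(
    model=dino, image=image, caption="jollof rice",
    box_threshold=0.35, text_threshold=0.25, device="cpu",
)
if len(boxes) == 0:
    raise SystemExit("No object matched the prompt")

# GroundingDINO returns normalized (cx, cy, w, h); convert to pixel (x0, y0, x1, y1).
h, w, _ = image_source.shape
box_xyxy = (box_ops.box_cxcywh_to_xyxy(boxes) * torch.tensor([w, h, w, h]))[0].numpy()

# 2. Segment inside the detected box with MobileSAM.
sam = sam_model_registry["vit_t"](checkpoint="MobileSAM/weights/mobile_sam.pt")
sam.eval()
predictor = SamPredictor(sam)
predictor.set_image(image_source)
masks, scores, _ = predictor.predict(box=box_xyxy, multimask_output=False)
mask = masks[0]  # boolean H x W mask

# 3. Save the cutout with a transparent background (mask becomes the alpha channel).
rgba = cv2.cvtColor(image_source, cv2.COLOR_RGB2BGRA)
rgba[..., 3] = (mask * 255).astype(np.uint8)
cv2.imwrite("segmented.png", rgba)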

🧪 Testing

Web Application Testing

Run the test script to verify everything is working:

cd webapp
python test_app.py
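
Beyond running test_app.py, a quick smoke test against the health endpoint can be written with Flask's built-in test client. This sketch assumes app.py exposes its Flask instance under the usual name app:

# Run from inside webapp/. Assumes app.py defines a Flask instance named `app`.
from app import app

def test_health_endpoint():
    client = app.test_client()
    assert client.get("/health").status_code == 200

if __name__ == "__main__":
    test_health_endpoint()
    print("health endpoint OK")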

Manual Testing

  1. Use the sample images in the images/ directory (40+ food images available)
  2. Try different prompts to test segmentation accuracy
  3. Check the Results/ directory for example outputs

📊 Results

The project includes comprehensive testing results:

  • Sample Results: Check Results/accurateresults/ for successful segmentations
  • Analysis: Review Results/inaccuracies/ for cases where segmentation failed
  • Data: Detailed results in Results/result.json (7,000+ lines of analysis data)
  • Generated Images: Processed images in webapp/static/GeneratedImages/

Performance Metrics

  • Successfully tested on 40+ food images
  • Supports various food types: burgers, pizza, fruits, vegetables, etc.
  • Real-time processing with automatic error handling

πŸ” Troubleshooting

  1. Import errors: Make sure all dependencies are installed
  2. Model loading errors: Models are automatically downloaded if missing
  3. CUDA errors: The app defaults to CPU mode. For GPU acceleration, ensure CUDA is properly installed
  4. Memory issues: Large images may require more RAM. Consider downscaling images before processing (see the sketch after this list)
  5. Health check: Use the /health endpoint to verify application status
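
For the memory point above, downscaling an image while preserving its aspect ratio is usually enough; the 1024-pixel cap below is an arbitrary illustrative choice:

import cv2

def downscale(src: str, dst: str, max_side: int = 1024) -> None:
    """Shrink an image so its longest side is at most max_side pixels."""
    img = cv2.imread(src)
    h, w = img.shape[:2]
    scale = max_side / max(h, w)
    if scale < 1.0:  # never upscale
        img = cv2.resize(img, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_AREA)
    cv2.imwrite(dst, img)

downscale("images/sample.jpg", "images/sample_small.jpg")  # placeholder paths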

πŸ“ Dependencies

Key dependencies include:

  • torch>=1.9.0
  • torchvision>=0.10.0
  • supervision>=0.3.0
  • opencv-python>=4.5.0
  • numpy>=1.21.0
  • flask>=2.0.0
  • transformers>=4.20.0
  • ultralytics>=8.0.0
  • gradio>=3.0.0
  • streamlit>=1.20.0

For a complete list, see webapp/requirements.txt.

🆕 Recent Updates

  • Enhanced Model Support: Added MobileSAMv2 for improved segmentation performance
  • Automatic Model Download: Models are automatically downloaded if not present
  • Improved Error Handling: Better validation and error messages
  • Health Monitoring: Added health check endpoint for production deployment
  • Transparent Background: Segmented objects are saved with transparent backgrounds
  • Comprehensive Testing: Extensive testing on 40+ food images with detailed results

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project uses pre-trained models from:

  • GroundingDINO (IDEA-Research)
  • MobileSAM (ChaoningZhang)

Please refer to their respective licenses for model usage terms.

🎯 Project Status

✅ Completed Features:

  • GroundingDINO integration for object detection
  • MobileSAM integration for segmentation
  • Flask web application with user interface
  • Google Colab for experimentation
  • Automatic model downloading
  • Error handling and validation
  • Health check endpoint
  • Comprehensive testing suite
  • MobileSAMv2 support
  • Enhanced UI/UX improvements

🔄 In Progress:

  • Performance optimization for large images
  • Additional model fine-tuning options
  • Food Nutritional Content Analysis

Note: This project is designed for experimental purposes. The models are pre-trained and may not work perfectly on all types of food images. For production use, consider fine-tuning the models on your specific dataset.
