atoniolo76/obstacle_classifier


Semantic Segmentation with MobileNet + UNet

This project implements semantic segmentation using a MobileNet encoder with UNet decoder for 7-class image segmentation.

Dataset Classes

  • 0: sky
  • 1: water
  • 2: bridge
  • 3: obstacle
  • 4: living_obstacle
  • 5: background
  • 6: self
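For reference in code, the class IDs above can be kept in a small lookup table. This is a minimal sketch: the names `CLASS_NAMES`, `NUM_CLASSES`, and `class_name` are illustrative and not taken from `model.py`.

```python
# Class IDs as listed above; the dictionary name is illustrative.
CLASS_NAMES = {
    0: "sky",
    1: "water",
    2: "bridge",
    3: "obstacle",
    4: "living_obstacle",
    5: "background",
    6: "self",
}
NUM_CLASSES = len(CLASS_NAMES)


def class_name(class_id: int) -> str:
    """Return the label for a class ID, or raise for IDs outside 0-6."""
    if class_id not in CLASS_NAMES:
        raise ValueError(f"unknown class id: {class_id}")
    return CLASS_NAMES[class_id]
```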

Features

  • MobileNet Encoder: Pretrained MobileNetV2 for feature extraction
  • UNet Decoder: U-Net architecture for pixel-wise classification
  • GPU-Optimized: Requires a CUDA-capable GPU (see Setup)
  • Comprehensive Metrics: Pixel accuracy and mean IoU tracking
  • Visualization: Training curves, prediction visualization, confusion matrix
  • Logging: Detailed logging throughout training
  • Model Checkpointing: Saves the best model based on validation IoU

Setup

⚠️ GPU Required: This project requires a CUDA-compatible GPU. CPU-only execution is not supported.

  1. Install dependencies:
pip install -r requirements.txt
  2. Ensure your dataset structure is:
├── images/          # Input images (.png files)
├── segmentations/   # Segmentation masks (.png files)
└── model.py         # Training script
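Before training, it can help to confirm that every image has a mask. The sketch below assumes each mask in `segmentations/` has the same file name as its image in `images/`; that naming rule is an assumption, and the helper `pair_images_and_masks` is hypothetical rather than part of this repository.

```python
from pathlib import Path


def pair_images_and_masks(root="."):
    """Return sorted (image, mask) path pairs that share a file name.

    Assumption: each mask in segmentations/ has the same file name as its
    image in images/. Adjust the matching rule if your naming differs.
    """
    root = Path(root)
    images = {p.name: p for p in (root / "images").glob("*.png")}
    masks = {p.name: p for p in (root / "segmentations").glob("*.png")}
    missing = sorted(set(images) - set(masks))
    if missing:
        # Show only the first few names to keep the message readable.
        raise FileNotFoundError(f"no mask found for: {missing[:5]}")
    return [(images[name], masks[name]) for name in sorted(images)]
```

Calling `pair_images_and_masks(".")` from the project root returns the pairs in file-name order, or raises if an image has no matching mask.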

Usage

Run the training script:

python model.py

Training Process

  • Data Split: 80% training, 20% validation
  • Epochs: Up to 50, with early stopping based on validation IoU
  • Batch Size: 8 (adjustable)
  • Learning Rate: 0.001 with StepLR scheduler
  • Image Size: 224x224 pixels
  • Normalization: Standard ImageNet mean/std normalization, as expected by ImageNet-pretrained models
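The settings above can be sketched in plain Python, without any framework. The 80/20 split, the 0.001 base learning rate, and the use of validation IoU come from the list. The seed, `step_size`, `gamma`, and `patience` are placeholder values chosen for illustration, since this README does not state them, and none of these helper names are taken from `model.py`.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle sample indices and split them into train and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def step_lr(epoch, base_lr=0.001, step_size=10, gamma=0.1):
    """Learning rate under a StepLR-style schedule.

    The rate is multiplied by gamma once every step_size epochs.
    step_size and gamma are placeholders, not values from this project.
    """
    return base_lr * gamma ** (epoch // step_size)


class EarlyStopping:
    """Stop when validation IoU has not improved for `patience` epochs."""

    def __init__(self, patience=10):  # patience is an assumed value
        self.patience = patience
        self.best_iou = float("-inf")
        self.epochs_without_improvement = 0

    def update(self, val_iou):
        """Record one epoch; return True if this epoch is the new best."""
        if val_iou > self.best_iou:
            self.best_iou = val_iou
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def should_stop(self):
        return self.epochs_without_improvement >= self.patience
```

In a training loop, a `True` result from `update` is the natural point to save a checkpoint such as `best_model.pth`, and `should_stop` ends the loop before the epoch limit is reached.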

Output Files

  • best_model.pth: Best model checkpoint
  • training_curves.png: Loss, accuracy, and IoU curves
  • predictions_visualization.png: Sample predictions
  • confusion_matrix.png: Confusion matrix heatmap

Model Architecture

  • Encoder: MobileNetV2 with multi-scale feature extraction
  • Decoder: UNet-style decoder with skip connections
  • Output: 7-class segmentation map

Metrics

  • Pixel Accuracy: Percentage of correctly classified pixels
  • Mean IoU: Average Intersection over Union across all classes
  • Per-class IoU: Individual IoU for each semantic class
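These three metrics can all be derived from a single confusion matrix. The NumPy sketch below is one common formulation, not necessarily the one in `model.py`: the function names are illustrative, and skipping classes that appear in neither the prediction nor the ground truth when averaging is an assumption about how mean IoU is computed.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes=7):
    """Count pixels per (true class, predicted class) pair."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()
    index = target * num_classes + pred
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(pred, target, num_classes=7):
    """Return (pixel accuracy, mean IoU, per-class IoU array)."""
    cm = confusion_matrix(pred, target, num_classes)
    true_positive = np.diag(cm)
    pixel_accuracy = true_positive.sum() / cm.sum()
    # Union = predicted pixels + ground-truth pixels - intersection.
    union = cm.sum(axis=0) + cm.sum(axis=1) - true_positive
    # Classes absent from both arrays get NaN and are skipped by nanmean.
    per_class_iou = np.where(union > 0, true_positive / np.maximum(union, 1), np.nan)
    mean_iou = np.nanmean(per_class_iou)
    return pixel_accuracy, mean_iou, per_class_iou
```

As a worked case, with prediction `[[0, 1], [1, 1]]` and ground truth `[[0, 1], [0, 1]]`, three of four pixels match, so pixel accuracy is 0.75. The IoU is 1/2 for class 0 and 2/3 for class 1, giving a mean IoU of 7/12 over the two classes present.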

Using the Trained Model

After training, you can use the saved model for inference on new images in several ways:

1. Simple Inference Example

python simple_inference_example.py

This runs inference on a sample image and shows the basic usage pattern.

2. Command Line Inference Tool

Single Image:

python inference.py --image_path path/to/your/image.png

Batch Processing:

python inference.py --input_folder path/to/image/folder

Demo with Training Data:

python inference.py  # Uses first image from training data

3. Programmatic Usage

from inference import load_model, preprocess_image, predict_segmentation

# Load trained model
model = load_model('best_model.pth')

# Process an image
image_tensor, original_size = preprocess_image('your_image.png')
prediction = predict_segmentation(model, image_tensor)

Model Files

  • best_model.pth: The trained model checkpoint (created during training)
  • inference_results/: Directory containing inference outputs
    • *_result.png: Visualization of predictions
    • *_prediction.npy: Raw prediction arrays
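A saved prediction array can be summarised without re-running the model. The sketch below assumes each `*_prediction.npy` file holds a 2D integer array of class IDs in the range 0-6; that layout is an assumption about what `inference.py` writes, and `class_fractions` is a hypothetical helper.

```python
import numpy as np

CLASS_NAMES = ["sky", "water", "bridge", "obstacle",
               "living_obstacle", "background", "self"]


def class_fractions(prediction):
    """Return the fraction of pixels assigned to each class name.

    Assumption: `prediction` is a 2D integer array of class IDs in 0-6.
    """
    ids = np.asarray(prediction, dtype=np.int64).ravel()
    counts = np.bincount(ids, minlength=len(CLASS_NAMES))
    total = counts.sum()
    return {name: count / total for name, count in zip(CLASS_NAMES, counts)}


# Usage (hypothetical file name following the *_prediction.npy pattern):
# prediction = np.load("inference_results/your_image_prediction.npy")
# print(class_fractions(prediction))
```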

Downloading the Model

The model is saved locally as best_model.pth after training. To use it elsewhere:

  1. Copy the best_model.pth file
  2. Copy the model.py file (contains model architecture)
  3. Copy the inference.py file (contains inference functions)
  4. Install the same dependencies (requirements.txt)

Model Size and Performance

  • Model size: ~2.3MB (MobileNet-based, very efficient)
  • Input size: 224x224 pixels
  • Output: Full-resolution segmentation maps
  • Classes: 7 semantic classes (sky, water, bridge, obstacle, living_obstacle, background, self)

About

Lightweight CNN for detecting boat obstacles, trained on the MassMIND dataset
