Semantic Segmentation with MobileNet + UNet

This project implements 7-class semantic segmentation using a pretrained MobileNetV2 encoder with a UNet-style decoder.

Dataset Classes

0: sky
1: water
2: bridge
3: obstacle
4: living_obstacle
5: background
6: self
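
As a quick sanity check on a mask, the class IDs above can be mapped back to their names. The following is a minimal sketch, assuming a mask is available as a 2D grid of integer class IDs (plain nested lists here); `CLASS_NAMES` and `class_pixel_counts` are illustrative names and are not taken from model.py.

```python
from collections import Counter

# Index in this tuple == class ID from the list above.
CLASS_NAMES = (
    "sky", "water", "bridge", "obstacle",
    "living_obstacle", "background", "self",
)


def class_pixel_counts(mask):
    """Count pixels per class name in a 2D mask of integer class IDs."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {CLASS_NAMES[class_id]: n for class_id, n in sorted(counts.items())}


if __name__ == "__main__":
    tiny_mask = [[0, 0, 1],
                 [1, 1, 6]]
    print(class_pixel_counts(tiny_mask))
```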

Features

MobileNet Encoder: Pretrained MobileNetV2 for feature extraction
UNet Decoder: U-Net architecture for pixel-wise classification
GPU-Optimized: Designed to run on a CUDA GPU (see Setup)
Comprehensive Metrics: Pixel accuracy and mean IoU tracking
Visualization: Training curves, prediction visualization, confusion matrix
Logging: Detailed logging throughout training
Model Checkpointing: Saves the best model based on validation IoU

Setup

⚠️ GPU Required: This project requires a CUDA-compatible GPU. CPU-only execution is not supported.

Install dependencies:

pip install -r requirements.txt

Ensure your dataset structure is:

├── images/          # Input images (.png files)
├── segmentations/   # Segmentation masks (.png files)
└── model.py         # Training script
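
Before training, it can help to confirm that the layout above is complete. The sketch below assumes each mask in segmentations/ has the same filename as its image in images/; the README does not state this, so treat it as an assumption. `pair_images_with_masks` is a hypothetical helper, not a function from model.py.

```python
from pathlib import Path


def pair_images_with_masks(root):
    """Return (pairs, missing) for the images/ and segmentations/ folders.

    pairs   -- sorted list of (image_path, mask_path) tuples
    missing -- images that have no same-named mask (assumed naming scheme)
    """
    root = Path(root)
    pairs, missing = [], []
    for image_path in sorted((root / "images").glob("*.png")):
        mask_path = root / "segmentations" / image_path.name
        if mask_path.is_file():
            pairs.append((image_path, mask_path))
        else:
            missing.append(image_path)
    return pairs, missing


if __name__ == "__main__":
    pairs, missing = pair_images_with_masks(".")
    print(f"{len(pairs)} image/mask pairs, {len(missing)} images without a mask")
```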

Usage

Run the training script:

python model.py

Training Process

Data Split: 80% training, 20% validation
Epochs: Up to 50, with early stopping based on validation IoU
Batch Size: 8 (adjustable)
Learning Rate: 0.001 with StepLR scheduler
Image Size: 224x224 pixels
Normalization: Standard ImageNet mean/std normalization, matching the pretrained encoder
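
The settings above can be sketched in plain Python to show how they interact. This is an illustrative sketch, not the code in model.py: the shuffle seed, the StepLR `step_size` and `gamma`, and the early-stopping `patience` are not given in this README, so the values used here are assumptions. `step_lr` follows the usual StepLR rule of multiplying the learning rate by `gamma` every `step_size` epochs.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items and split it into (train, val) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an assumed value
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate at a given epoch under a StepLR-style schedule."""
    return base_lr * gamma ** (epoch // step_size)


class EarlyStopping:
    """Track the best validation IoU and signal when to stop."""

    def __init__(self, patience):
        self.patience = patience
        self.best_iou = float("-inf")
        self.epochs_without_improvement = 0

    def update(self, val_iou):
        """Record one epoch; return True on a new best (checkpoint here)."""
        if val_iou > self.best_iou:
            self.best_iou = val_iou
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def should_stop(self):
        return self.epochs_without_improvement >= self.patience


if __name__ == "__main__":
    train, val = split_dataset(range(100))
    print(len(train), len(val))
    stopper = EarlyStopping(patience=5)  # assumed patience
    for epoch in range(50):
        lr = step_lr(0.001, epoch, step_size=10, gamma=0.5)  # assumed schedule
        val_iou = 0.0  # placeholder for the real validation IoU
        if stopper.update(val_iou):
            pass  # this is where best_model.pth would be saved
        if stopper.should_stop:
            break
```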

Output Files

best_model.pth: Best model checkpoint
training_curves.png: Loss, accuracy, and IoU curves
predictions_visualization.png: Sample predictions
confusion_matrix.png: Confusion matrix heatmap

Model Architecture

Encoder: MobileNetV2 with multi-scale feature extraction
Decoder: UNet-style decoder with skip connections
Output: 7-class segmentation map

Metrics

Pixel Accuracy: Percentage of correctly classified pixels
Mean IoU: Average Intersection over Union across all classes
Per-class IoU: Individual IoU for each semantic class
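
All three metrics can be derived from a single confusion matrix. The sketch below uses plain Python lists over flattened label sequences to keep it dependency-free. It assumes that classes absent from both the ground truth and the prediction are skipped when averaging; the project's own implementation may handle that case differently. The function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_iou(matrix):
    """IoU_c = TP / (TP + FP + FN); None if the class never appears."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                      # truth c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, truth other
        union = tp + fp + fn
        ious.append(tp / union if union else None)
    return ious


def mean_iou(matrix):
    """Average IoU over classes that appear in truth or prediction."""
    valid = [v for v in per_class_iou(matrix) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2]  # flattened ground-truth pixels
    y_pred = [0, 1, 1, 1, 2, 0]  # flattened predicted pixels
    cm = confusion_matrix(y_true, y_pred, num_classes=3)
    print(pixel_accuracy(cm), per_class_iou(cm), mean_iou(cm))
```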

Using the Trained Model

After training, you can use the saved model for inference on new images in several ways:

1. Simple Inference Example

python simple_inference_example.py

This runs inference on a sample image and shows the basic usage pattern.

2. Command Line Inference Tool

Single Image:

python inference.py --image_path path/to/your/image.png

Batch Processing:

python inference.py --input_folder path/to/image/folder

Demo with Training Data:

python inference.py  # Uses first image from training data

3. Programmatic Usage

from inference import load_model, preprocess_image, predict_segmentation

# Load trained model
model = load_model('best_model.pth')

# Process an image
image_tensor, original_size = preprocess_image('your_image.png')
prediction = predict_segmentation(model, image_tensor)

Model Files

best_model.pth: The trained model checkpoint (created during training)
inference_results/: Directory containing inference outputs
- *_result.png: Visualization of predictions
- *_prediction.npy: Raw prediction arrays

Downloading the Model

The model is saved locally as best_model.pth after training. To use it elsewhere:

Copy the best_model.pth file
Copy the model.py file (contains model architecture)
Copy the inference.py file (contains inference functions)
Install the same dependencies (requirements.txt)

Model Size and Performance

Model size: ~2.3MB (MobileNet-based, very efficient)
Input size: 224x224 pixels
Output: Full-resolution segmentation maps
Classes: 7 semantic classes (sky, water, bridge, obstacle, living_obstacle, background, self)

Repository: atoniolo76/obstacle_classifier