📌 Landslide Detection with Satellite Imagery & Machine Learning. A CNN model trained on multi-channel satellite images achieved an 89.2% F1-score using NDVI, slope, and elevation features. Model accuracy was optimized through hyperparameter tuning and threshold adjustments.

mallickboy/land-slide-prediction

Landslide Prediction Using CNN (Achieved 89% F1-Score)

This repository showcases a deep learning-based solution for landslide prediction using Convolutional Neural Networks (CNN). By analyzing satellite imagery and incorporating various environmental features, this model predicts the likelihood of landslides, enabling more effective disaster management and early warning systems.

Dataset Overview

The dataset consists of 3,799 labeled images with associated masks, which are used for training and evaluation. Each image represents a satellite view of a region, and its corresponding mask indicates areas at risk of landslides. The dataset includes the following:

  • Image Shape: (128, 128, 14). Each image has 14 channels, capturing multiple environmental variables.
  • Mask Shape: (128, 128, 1). Each mask is a single-channel binary image marking areas with potential landslide risk.

Channels used from each image:

  1. Channel 1: Blue
  2. Channel 2: Green
  3. Channel 3: Red
  4. Channel 7: Near Infrared (NIR)
  5. Channel 13: Elevation
  6. Channel 14: Slope
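The channel selection above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's loading code: the helper name `select_channels` is hypothetical, the input is a synthetic array, and the sketch assumes the channel numbers in the list are 1-based (so they are shifted by one for NumPy indexing).

```python
import numpy as np

# Assumption: channel numbers in the README list are 1-based, so we subtract 1
# to get NumPy indices. If your files use 0-based numbering, drop the "- 1".
CHANNELS_1_BASED = {
    "blue": 1,
    "green": 2,
    "red": 3,
    "nir": 7,
    "elevation": 13,
    "slope": 14,
}


def select_channels(image: np.ndarray) -> dict:
    """Split a (H, W, 14) image into named (H, W) float32 bands."""
    if image.ndim != 3 or image.shape[-1] != 14:
        raise ValueError(f"expected shape (H, W, 14), got {image.shape}")
    return {
        name: image[..., idx - 1].astype(np.float32)
        for name, idx in CHANNELS_1_BASED.items()
    }


# Synthetic stand-in for one dataset image (random values, not real data)
rng = np.random.default_rng(0)
image = rng.random((128, 128, 14))
bands = select_channels(image)
```

Keeping the bands in a dictionary keyed by name avoids scattering raw channel indices through the rest of the preprocessing code.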

Feature Engineering

We calculated the Normalized Difference Vegetation Index (NDVI) using the formula:

\[ \text{NDVI} = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}} \]

This helps to assess the vegetation health of the region, an important factor for landslide prediction. Additionally, we normalized the slope and elevation data to improve model performance.
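A minimal sketch of these two steps is shown below. The NDVI function follows the formula above; the small `eps` term is an added safeguard against division by zero and is not part of the formula. The README does not say which normalization was applied to slope and elevation, so the min-max scaling here is an assumption, and both function names are illustrative.

```python
import numpy as np


def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red). eps guards against division by zero."""
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    return (nir - red) / (nir + red + eps)


def min_max_normalize(band: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale a band to roughly [0, 1]. Min-max scaling is an assumed choice."""
    band = band.astype(np.float32)
    lo, hi = band.min(), band.max()
    return (band - lo) / (hi - lo + eps)


# Tiny hand-made example (values are illustrative, not from the dataset)
nir = np.array([[0.8, 0.5], [0.2, 0.0]])
red = np.array([[0.2, 0.5], [0.6, 0.0]])
index = ndvi(nir, red)

elevation = np.array([[100.0, 300.0], [200.0, 500.0]])
elevation_norm = min_max_normalize(elevation)
```

NDVI lies between -1 and 1: values near 1 indicate dense vegetation, while values near 0 or below indicate bare soil, rock, or water.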

Training Data Features

For model training, we utilized the following 6-channel dataset:

  1. Normalized Red
  2. Normalized Green
  3. Normalized Blue
  4. NDVI
  5. Normalized Slope
  6. Normalized Elevation

The shape of the training data is: (128, 128, 6).
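As a rough sketch, the six feature maps can be stacked along the last axis to produce that shape. The function name `build_features` is hypothetical and the inputs are random placeholders. The channel order follows the list above.

```python
import numpy as np


def build_features(red, green, blue, ndvi, slope, elevation) -> np.ndarray:
    """Stack six (H, W) arrays into one (H, W, 6) float32 array.

    The order matches the feature list in this README:
    red, green, blue, NDVI, slope, elevation.
    """
    stacked = np.stack([red, green, blue, ndvi, slope, elevation], axis=-1)
    return stacked.astype(np.float32)


# Random placeholders standing in for already-normalized bands
rng = np.random.default_rng(0)
bands = [rng.random((128, 128)) for _ in range(6)]
features = build_features(*bands)
print(features.shape)  # (128, 128, 6)
```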

Model Architecture and Training

We employed Convolutional Neural Networks (CNNs) for this task, which are known for their ability to extract spatial hierarchies of features from image data. The model was trained using the selected channels, and extensive experimentation with hyperparameters and architecture optimization was performed.

Here is an image depicting the model training process:

[Image: Model Training]

Model Evaluation and Comparisons

Underfitting Model

Initially, we observed underfitting: the model performed poorly on both the training and validation sets. This typically happens when the model has too little capacity or has not been trained long enough.

[Image: Underfitting Model]

Hyperparameter Tuning and Finetuning

Through hyperparameter optimization and fine-tuning, the model's performance improved significantly. The next image shows the results after these adjustments:

[Image: Finetuned Model]

Best Model with 89% F1-Score

After extensive fine-tuning, the best-performing model achieved an F1-score of 89% (89.2%), indicating a strong balance between precision and recall. This was the best result among the models we trained.

[Image: Best Model]
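For reference, the F1-score is the harmonic mean of precision and recall. The standard definitions, written in terms of true positives (TP), false positives (FP), and false negatives (FN), are given below. Whether the reported score was computed per pixel or per image is not stated in this README.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}
\]
```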

Confusion Matrix

The confusion matrix below illustrates the model's predictive performance on the test set. It shows how effectively the model distinguishes between landslide and non-landslide areas:

[Image: Confusion Matrix]
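The sketch below shows how the four confusion-matrix counts can be computed from a ground-truth mask and a predicted mask. It assumes pixel-wise evaluation on binary masks. The function names and toy arrays are illustrative and are not taken from the repository.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for two binary arrays of the same shape."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    y_pred = np.asarray(y_pred).astype(bool).ravel()
    tp = int(np.sum(y_true & y_pred))    # landslide predicted as landslide
    tn = int(np.sum(~y_true & ~y_pred))  # background predicted as background
    fp = int(np.sum(~y_true & y_pred))   # background predicted as landslide
    fn = int(np.sum(y_true & ~y_pred))   # landslide predicted as background
    return tn, fp, fn, tp


def f1_from_counts(fp: int, fn: int, tp: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when there are no positives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy masks (1 = landslide, 0 = background)
y_true = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
y_pred = np.array([[1, 0, 0, 0], [1, 1, 0, 0]])

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
f1 = f1_from_counts(fp, fn, tp)
```

On this toy example the counts are 4 true negatives, 1 false positive, 1 false negative, and 2 true positives, giving an F1-score of 4/6, or about 0.67.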

Threshold Selection

The classification threshold controls the trade-off between false positives and false negatives. We compared predictions at a low and a high threshold:

  • Low Threshold:

    [Image: Low Threshold]

  • High Threshold:

    [Image: High Threshold]

A threshold of 50% (0.5) was selected for final classification, as it gave the best balance between sensitivity and specificity among the thresholds tested.
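One way to compare thresholds is to score each candidate against the ground-truth mask, as in the sketch below. Using F1 as the selection criterion is an assumption, since this README does not state which metric drove the choice. The probabilities and labels are made-up toy values and do not reproduce the project's results.

```python
import numpy as np


def f1_at_threshold(probs: np.ndarray, mask: np.ndarray, threshold: float) -> float:
    """Binarize probabilities at `threshold` and return the pixel-wise F1-score."""
    pred = probs >= threshold
    truth = mask.astype(bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


# Toy predicted probabilities and ground-truth labels (illustrative only)
probs = np.array([0.95, 0.80, 0.60, 0.55, 0.30, 0.10])
mask = np.array([1, 1, 0, 1, 0, 0])

# Candidate thresholds: low, middle, high (hypothetical values)
scores = {t: f1_at_threshold(probs, mask, t) for t in (0.2, 0.5, 0.9)}
best = max(scores, key=scores.get)
```

In this toy case the low threshold produces extra false positives and the high threshold misses most positives, so the middle threshold scores best. The same trade-off motivates the comparison shown in the images above.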

Final Prediction on Labeled Data (50% Threshold)

After applying the 50% threshold, the model predicted areas at risk of landslides on the labeled dataset, as shown in the following image:

[Image: Final Prediction on Labeled Data]
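Applying the threshold amounts to binarizing the model's per-pixel probabilities. The sketch below assumes the model outputs a probability map with values in [0, 1]. The helper name `to_mask` is hypothetical, the input is random data rather than real model output, and treating values exactly equal to the threshold as positive is an arbitrary choice.

```python
import numpy as np


def to_mask(probs, threshold: float = 0.5) -> np.ndarray:
    """Binarize (H, W) or (H, W, 1) probabilities into a uint8 mask of shape (H, W, 1)."""
    probs = np.asarray(probs, dtype=np.float32)
    if probs.ndim == 2:
        probs = probs[..., np.newaxis]  # add the channel axis used by the dataset masks
    return (probs >= threshold).astype(np.uint8)


# Random stand-in for a model's probability output
rng = np.random.default_rng(0)
probs = rng.random((128, 128))
mask = to_mask(probs)

# Fraction of pixels flagged as at risk
risk_fraction = float(mask.mean())
```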

Final Predictions on Unlabeled Data

The trained model was then applied to unlabeled data to predict regions at risk of landslides:

  • Prediction 1:

    [Image: Prediction 1]

  • Prediction 2:

    [Image: Prediction 2]

  • Prediction 3:

    [Image: Prediction 3]

Conclusion

This project demonstrates the application of Convolutional Neural Networks (CNNs) to landslide prediction. By combining satellite imagery with environmental data, we built a model that achieved an 89% F1-score through careful data preprocessing, hyperparameter optimization, and model fine-tuning.

Key Highlights:

  • Hyperparameter Tuning: Extensive experiments were conducted to fine-tune the model, resulting in improved performance.
  • Feature Engineering: NDVI and normalization of slope and elevation significantly contributed to the model's predictive power.
  • Threshold Selection: A 50% threshold was selected to balance false positives and false negatives.

Technologies Used:

  • Python
  • TensorFlow / Keras
  • OpenCV
  • NumPy
  • Matplotlib

Future Work:

  • Expand the dataset to include more diverse environmental factors, such as rainfall and soil type.
  • Incorporate temporal data to improve prediction accuracy over time.
  • Explore segmentation architectures such as U-Net for better segmentation-based predictions.

For more details on the code implementation and model training, please explore the repository and feel free to contribute or reach out with questions.
