Skin Cancer Detection

The goal is to build a CNN-based model that can accurately detect melanoma. Melanoma is a type of cancer that can be deadly if not detected early. It accounts for 75% of skin cancer deaths. A solution that can evaluate images and alert dermatologists to the presence of melanoma has the potential to reduce much of the manual effort needed in diagnosis.

Dataset

The dataset has been uploaded to Kaggle: https://www.kaggle.com/datasets/kk20krishna/skin-cancer-data-isic

The dataset consists of 2357 images of malignant and benign oncological diseases, collected from the International Skin Imaging Collaboration (ISIC). All images were sorted according to the ISIC classification, and all subsets contain the same number of images, with the exception of melanomas and moles, which have slightly more images.

The dataset contains the following 9 diseases:

Actinic keratosis
Basal cell carcinoma
Dermatofibroma
Melanoma
Nevus
Pigmented benign keratosis
Seborrheic keratosis
Squamous cell carcinoma
Vascular lesion

Augmentation

Data augmentation was performed using the Augmentor library to:

Handle class imbalance
Increase training data, enhance variety, and reduce overfitting
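
As a rough sketch of how the balancing step might look: the helper names (`count_images`, `samples_to_add`, `augment_class`), the one-folder-per-class layout, and the rotation and flip settings below are assumptions made for illustration, not values taken from the notebook.

```python
from pathlib import Path


def count_images(root):
    """Count files in each class sub-folder (assumed layout: root/<class>/<image>)."""
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(Path(root).iterdir())
        if d.is_dir()
    }


def samples_to_add(counts, target):
    """Number of augmented images each class needs to reach `target`."""
    return {name: max(0, target - n) for name, n in counts.items()}


def augment_class(class_dir, n_samples):
    """Generate `n_samples` augmented images for one class with Augmentor."""
    # Third-party library, imported lazily so the helpers above work without it.
    import Augmentor

    pipeline = Augmentor.Pipeline(str(class_dir))  # writes to <class_dir>/output by default
    # Hypothetical augmentation settings; tune them for the actual data.
    pipeline.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
    pipeline.flip_left_right(probability=0.5)
    pipeline.sample(n_samples)
```

For instance, `samples_to_add({"a": 120, "b": 40}, 100)` gives `{"a": 0, "b": 60}`: only the under-represented class receives new samples.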

CNN Design

Cancer Classification Model Design

Model Architecture

The model is a Convolutional Neural Network (CNN) designed to classify images into the 9 classes listed in the Dataset section. The input images have a resolution of 180x180 pixels with 3 color channels (RGB).

The design choices made for this model are explained in the notebook and summarized below:

Input Preprocessing

Rescaling: The input pixel values are normalized to the range [0,1] using Rescaling(scale=1./255). Keeping inputs in a small, consistent range helps stabilize training.

Feature Extraction using Convolutional Blocks

The model consists of four convolutional blocks, each having:
1. Conv2D Layer: Extracts spatial features using 3x3 filters with 'same' padding to maintain spatial dimensions.
2. ReLU Activation: Introduces non-linearity, allowing the model to learn complex patterns.
3. MaxPooling Layer: Reduces spatial dimensions and computational cost while preserving important features.
- First Convolutional Block:
  - 32 filters, 3x3 kernel, ReLU activation, 'same' padding.
  - Followed by MaxPooling to reduce spatial size.
- Second Convolutional Block:
  - 64 filters, 3x3 kernel, ReLU activation, 'same' padding.
  - Followed by MaxPooling.
- Third Convolutional Block:
  - 128 filters, 3x3 kernel, ReLU activation, 'same' padding.
  - Followed by MaxPooling.
- Fourth Convolutional Block:
  - 256 filters, 3x3 kernel, ReLU activation, 'same' padding.
  - Followed by MaxPooling.
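
To see how the four blocks shrink the feature maps, the spatial sizes can be traced by hand. This sketch assumes 2x2 max pooling with stride 2 (the Keras `MaxPooling2D` default), which the description above does not state explicitly; `trace_shapes` is a name made up for this example.

```python
def trace_shapes(size=180, filters=(32, 64, 128, 256), pool=2):
    """Return (height, width, channels) after each conv + max-pool block.

    'same' padding keeps height and width unchanged through the convolution,
    so only the pooling step (floor division by `pool`) shrinks the map.
    """
    shapes = []
    for n_filters in filters:
        size = size // pool
        shapes.append((size, size, n_filters))
    return shapes


for block, shape in enumerate(trace_shapes(), start=1):
    print(f"block {block}: {shape}")
# block 1: (90, 90, 32)
# block 2: (45, 45, 64)
# block 3: (22, 22, 128)
# block 4: (11, 11, 256)
```

Under that assumption, flattening the final map yields 11 x 11 x 256 = 30,976 features.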
Regularization using Dropout
- A Dropout layer (50%) is introduced after the fourth convolutional block to reduce overfitting by randomly setting neuron outputs to zero during training.
Fully Connected Layers
- Flatten Layer: Converts the feature maps into a 1D vector.
- Dense Layer with 128 neurons (ReLU activation): Helps in learning complex feature representations.
- Dropout Layer (50%): Provides additional regularization.
Output Layer

Dense Layer with 9 neurons: Uses Softmax activation to output probabilities for the 9 cancer classes.
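
The layer stack described above could be assembled in Keras roughly as follows. This is a sketch rather than the notebook's exact code: the function name `build_model` and the default 2x2 pool size are assumptions, and compilation (optimizer, loss, metrics) is left out because this description does not specify it.

```python
def build_model(input_shape=(180, 180, 3), num_classes=9):
    """Assemble the described layer stack as a Keras Sequential model."""
    # Imported inside the function so this file can be loaded without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Rescaling(scale=1.0 / 255),  # map pixel values to [0, 1]
    ])
    # Four convolutional blocks: Conv2D (3x3, 'same', ReLU) followed by max pooling.
    for n_filters in (32, 64, 128, 256):
        model.add(layers.Conv2D(n_filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())  # assumption: default 2x2 pool
    model.add(layers.Dropout(0.5))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```

Calling `build_model().summary()` prints the per-layer output shapes and parameter counts, which is a quick way to check the sketch against the notebook.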

Image created using https://alexlenail.me/NN-SVG/AlexNet.html

Conclusions

1. Overview

This report provides an evaluation of the Convolutional Neural Network (CNN) model trained for melanoma detection and cancer classification. The evaluation metrics include accuracy, precision, recall, F1-score, and loss for both training and validation datasets.

2. Performance Metrics

Training Performance
- Final Training Accuracy: 90.08%
- Final Training F1-score: 89.83%
- Final Training Precision: 91.81%
- Final Training Recall: 88.71%
- Final Training Loss: 0.2508
Validation Performance
- Best Validation Accuracy: 81.82%
- Final Validation Accuracy: 81.47%
- Final Validation F1-score: 80.88%
- Final Validation Precision: 83.06%
- Final Validation Recall: 80.22%
- Final Validation Loss: 0.7534
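
For reference, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the final precision and recall listed above; `f1_from` is a helper name invented here. The recomputed values come out slightly higher than the reported F1-scores, which may be because the metric was averaged per batch or per class during training, though the source does not say.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Final precision and recall reported above, expressed as fractions.
train_f1 = f1_from(0.9181, 0.8871)
val_f1 = f1_from(0.8306, 0.8022)
print(f"train F1 ~ {train_f1:.4f}, validation F1 ~ {val_f1:.4f}")

# Gap between final training and validation accuracy, in percentage points.
accuracy_gap = 90.08 - 81.47
print(f"accuracy gap ~ {accuracy_gap:.2f} points")
```

The harmonic means are about 0.902 (training) and 0.816 (validation), and the accuracy gap is about 8.6 percentage points.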

3. Training vs Validation Trends

Accuracy
- Training accuracy remained consistently above 89%.
- Validation accuracy fluctuated between 81% and 82%, peaking at 81.82%.
- The gap between training and validation accuracy suggests minor overfitting.
F1-score
- Training F1-score improved steadily, reaching 89.83%.
- Validation F1-score stabilized around 80.88%, indicating the model maintains good balance in classification.
Loss Analysis
- Training loss steadily decreased to 0.2508, showing good convergence.
- Validation loss plateaued around 0.75, indicating room for further generalization improvements.
4. Learning Rate Analysis
- The learning rate had decayed by the end of training, which suggests the model was close to convergence.
5. Confusion Matrix

Key Takeaways
- 'Nevus' and 'Melanoma' have the highest confusion. This makes sense since melanoma can visually resemble a nevus.
- 'Actinic Keratosis' is often confused with 'Squamous Cell Carcinoma' and 'Nevus.'
- 'Vascular Lesion' and 'Seborrheic Keratosis' show strong classification performance.
- 'Melanoma' misclassification is concerning since early detection is critical for treatment.
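
The takeaways above are read off a confusion matrix. As a minimal illustration of how such counts, the most-confused class pair, and per-class recall can be computed, the sketch below uses a handful of made-up labels (not the project's predictions) and helper names invented for this example.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))


def most_confused_pair(counts):
    """Return the unordered pair of distinct classes with the most mix-ups."""
    totals = Counter()
    for (true, pred), n in counts.items():
        if true != pred:
            totals[frozenset((true, pred))] += n
    pair, n = totals.most_common(1)[0]
    return tuple(sorted(pair)), n


def recall(counts, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    total = sum(n for (true, _), n in counts.items() if true == label)
    return counts[(label, label)] / total


# Toy labels for illustration only.
y_true = ["melanoma", "melanoma", "melanoma", "nevus", "nevus", "vascular lesion"]
y_pred = ["melanoma", "nevus", "nevus", "melanoma", "nevus", "vascular lesion"]

counts = confusion_counts(y_true, y_pred)
print(most_confused_pair(counts))
print(recall(counts, "melanoma"))
```

Per-class recall for 'Melanoma' is the number to watch here, since a missed melanoma is the costliest error.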
6. Observations
- Slight Overfitting: The training accuracy is higher than validation accuracy, indicating overfitting. Consider applying stronger regularization techniques such as dropout, weight decay, or data augmentation.
- Validation Performance Plateau: Validation accuracy remained stable around 81%, indicating that performance improvements may require architectural modifications or additional data.
- Early Stopping Activation: Training stopped at epoch 74, suggesting further training would not yield significant improvements.
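
Early stopping and learning rate decay are typically wired in through Keras callbacks. The sketch below assumes the decay came from `ReduceLROnPlateau` and that both callbacks monitor validation loss; the source confirms neither, and the patience, factor, and minimum learning rate values are placeholders.

```python
def make_callbacks(patience_stop=10, patience_lr=5):
    """Early stopping plus learning-rate reduction, both watching validation loss."""
    # Imported inside the function so this file can be loaded without TensorFlow.
    from tensorflow.keras import callbacks

    return [
        # Stop when val_loss has not improved for `patience_stop` epochs.
        callbacks.EarlyStopping(
            monitor="val_loss",
            patience=patience_stop,
            restore_best_weights=True,
        ),
        # Halve the learning rate when val_loss stalls for `patience_lr` epochs.
        callbacks.ReduceLROnPlateau(
            monitor="val_loss",
            factor=0.5,
            patience=patience_lr,
            min_lr=1e-6,
        ),
    ]
```

The returned list would be passed as `callbacks=make_callbacks()` to `model.fit(...)`.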
7. Conclusion

The model achieved 90.08% training accuracy and 81.82% best validation accuracy for cancer detection.

Technologies Used

Augmentor
shutil
pathlib
tensorflow
matplotlib
numpy
pandas
os
PIL
keras
kagglehub
cv2 (OpenCV)
scikit-learn
seaborn

Additionally, various TensorFlow and Keras modules were utilized for building and training the model, including layers, models, callbacks, and regularizers.

Acknowledgements

Course content from the upGrad course
CNN design image created using https://alexlenail.me/NN-SVG/AlexNet.html

Contact

Created by @kk20krishna
