This project classifies cell images as either infected with malaria or uninfected, using Convolutional Neural Networks (CNNs) and transfer learning. It uses the dataset provided by the National Institutes of Health (NIH), which comprises thousands of cell images.
- Overview
- Dataset
- Requirements
- Installation
- Usage
- Model Architecture
- Training
- Evaluation
- Results
- Conclusion
- Acknowledgements
- References
The dataset used in this project can be downloaded from the official NIH dataset repository or Kaggle. It contains cell images categorized into two classes:
- Parasitized: Cells infected with malaria parasites.
- Uninfected: Healthy cells without any infection.
The dataset is split into training, validation, and testing sets.
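A minimal sketch of how such a split could be produced is shown here. The 80/10/10 ratio, the fixed seed, and the `stratified_split` helper are illustrative assumptions, not the project's documented settings; `src/data_loader.py` may work differently.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_frac=0.1, test_frac=0.1, seed=42):
    """Split (path, label) pairs into train/val/test, keeping class balance.

    The 80/10/10 default is an assumption made for illustration only.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_val = int(len(items) * val_frac)
        n_test = int(len(items) * test_frac)
        val.extend(items[:n_val])
        test.extend(items[n_val:n_val + n_test])
        train.extend(items[n_val + n_test:])
    return train, val, test


# Hypothetical file names standing in for the real image paths.
samples = [(f"Parasitized/img_{i}.png", "Parasitized") for i in range(50)]
samples += [(f"Uninfected/img_{i}.png", "Uninfected") for i in range(50)]
train, val, test = stratified_split(samples)
print(len(train), len(val), len(test))
```

Splitting each class separately keeps the Parasitized/Uninfected ratio the same in all three sets, which makes validation and test metrics easier to compare.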
- Python 3.8+
- TensorFlow 2.5+
- Keras
- NumPy
- Matplotlib
- Scikit-learn
- Pandas
- Clone this repository:
git clone https://github.com/yourusername/Malaria-Cell-Classification.git
cd Malaria-Cell-Classification
- Create and activate a virtual environment (optional but recommended):
python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
- Install the required packages:
pip install -r requirements.txt
- Prepare the data:
python src/data_loader.py
- Train the model:
python src/train.py
- Evaluate the model:
python src/evaluate.py
We use a CNN architecture with transfer learning. Models pre-trained on the ImageNet dataset, such as VGG16, ResNet50, or InceptionV3, serve as the base model, and custom layers are added on top for fine-tuning on the malaria cell dataset.
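The sketch below shows one way such an architecture could be assembled with Keras, using VGG16 as the base. The input size, the head (global pooling, a 128-unit dense layer, dropout), the learning rate, and the `build_transfer_model` name are all assumptions chosen for illustration; the project's actual layers and hyperparameters may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16


def build_transfer_model(input_shape=(128, 128, 3), weights="imagenet"):
    """Frozen VGG16 base with a small binary classification head (a sketch)."""
    base = VGG16(include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the pre-trained convolutional filters fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation="relu")(x)   # assumed head size
    x = layers.Dropout(0.5)(x)                    # assumed dropout rate
    outputs = layers.Dense(1, activation="sigmoid")(x)  # single binary output

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` downloads the ImageNet weights on first use; passing `weights=None` builds the same structure with random initialization, which is handy for a quick shape check.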
Training scripts and Jupyter notebooks are provided for step-by-step guidance on training the model. The key steps include:
- Data preprocessing and augmentation.
- Loading the pre-trained model.
- Adding custom layers for classification.
- Compiling the model with appropriate loss functions and optimizers.
- Training the model with the training dataset and validating it with the validation set.
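The preprocessing and augmentation step can be illustrated with a small NumPy sketch. The specific transforms (random flips and 90-degree rotations) and the `preprocess` and `augment` helpers are assumptions for illustration; the training scripts may rely on different augmentations or on Keras utilities instead.

```python
import numpy as np


def preprocess(image):
    """Scale 8-bit pixel values to floats in [0, 1]."""
    return image.astype(np.float32) / 255.0


def augment(image, rng):
    """Apply random flips and a random 90-degree rotation to an HxWxC image."""
    if rng.random() < 0.5:
        image = np.fliplr(image)   # horizontal flip
    if rng.random() < 0.5:
        image = np.flipud(image)   # vertical flip
    k = int(rng.integers(0, 4))    # number of 90-degree turns
    return np.rot90(image, k=k, axes=(0, 1))


rng = np.random.default_rng(0)
# Random pixels standing in for a square cell image.
raw = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = augment(preprocess(raw), rng)
print(augmented.shape, augmented.dtype)
```

Flips and right-angle rotations only rearrange pixels, so they should not change whether a cell counts as infected, which is why they are a common low-risk choice for microscopy images.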
The evaluation script and notebooks provide detailed analysis of the model's performance on the test dataset. Key metrics include:
- Accuracy
- Precision
- Recall
- F1-score
Additionally, confusion matrices and ROC curves are plotted for a comprehensive evaluation.
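For reference, the four metrics listed above can be derived from the confusion-matrix counts as in the sketch below. The `binary_metrics` helper and the toy labels are hypothetical; treating label `1` as the positive (Parasitized) class is an assumption about the label encoding. Scikit-learn's `sklearn.metrics` module offers equivalent functions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1,
            "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn}}


# Toy labels: 1 = Parasitized (assumed encoding), 0 = Uninfected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall on the infected class is often the metric to watch, since a false negative means a missed infection.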
- Convolutional Neural Network (CNN): A custom CNN model was designed and trained from scratch for the classification of malaria-infected and uninfected cells.
- Transfer Learning: Pre-trained models such as VGG16, ResNet50, and InceptionV3 were fine-tuned on the malaria cell dataset to leverage learned features from large-scale datasets.
- Accuracy: The accuracy of the custom CNN model was compared with that of the transfer learning models. The transfer learning models typically outperformed the custom CNN, likely because their pre-learned features generalize better.
- Precision, Recall, F1-Score: These metrics were used to evaluate the classification performance more comprehensively. Transfer learning models generally showed higher precision, recall, and F1-score.
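- The custom CNN model showed higher training and validation loss than the transfer learning models, which suggests it fit the data less well. Higher loss on both sets is more typical of underfitting than of overfitting, where training loss would usually be low.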
- Transfer learning models had lower training and validation losses, suggesting better generalization.
- The confusion matrix indicated that transfer learning models had fewer false positives and false negatives compared to the custom CNN model.
- Transfer learning significantly improved the performance of malaria cell classification. Pre-trained models such as MobileNet and VGG19 achieved higher accuracy and better overall performance metrics than a custom CNN trained from scratch.
- Transfer learning models demonstrated superior generalization abilities. They were more effective in correctly identifying both infected and uninfected cells, as evidenced by higher precision, recall, and F1-scores.
- Among the transfer learning models, ResNet50 and InceptionV3 were particularly effective, showing the highest accuracy and best performance in handling the classification task.
- For practical applications, transfer learning with models like MobileNet or VGG19 is recommended for malaria cell classification due to their robust performance and ability to generalize well across different datasets.
- Further improvements can be achieved by fine-tuning these models and possibly combining them with ensemble techniques to enhance classification performance.
- National Institutes of Health (NIH) for providing the dataset.
- TensorFlow and Keras for the deep learning frameworks.