This project demonstrates how to classify SMS messages as spam or ham using Natural Language Processing (NLP) and machine learning models.
Suggestions and feedback are welcome. You can message me on
- Spam SMS Classication.ipynb — Jupyter Notebook covering data preprocessing, feature extraction (TF-IDF), model training and evaluation.
- Spam SMS Collection — Dataset containing SMS messages labeled as spam or ham.
- requirements.txt — Python dependencies required to run the notebook.
- Text preprocessing (cleaning, tokenization, stopword removal)
- Feature extraction using Bag of Words and TF-IDF
- Classification models (Naive Bayes, Logistic Regression, SVM)
- Evaluation using Accuracy, Precision, Recall, and F1-score
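As a rough illustration of how these pieces fit together, the sketch below chains text cleaning, TF-IDF features, and a Naive Bayes classifier with scikit-learn. It is not the notebook's code: the sample messages are made up, `clean_text` is a hypothetical helper, and scikit-learn's built-in English stop word list stands in for NLTK to keep the example self-contained.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up sample, for illustration only (not taken from the dataset).
messages = [
    "WINNER!! Claim your free prize now",
    "Free entry: text WIN to claim cash prize",
    "Urgent! You have won a free holiday, call now",
    "Are we still meeting for lunch today?",
    "Can you send me the notes from class?",
    "I'll call you when I get home",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]


def clean_text(text: str) -> str:
    """Lowercase the text and replace every non-letter character with a space."""
    return re.sub(r"[^a-z\s]", " ", text.lower())


# TfidfVectorizer tokenizes the cleaned text, drops English stop words,
# and weights the remaining terms by TF-IDF before Naive Bayes sees them.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text, stop_words="english"),
    MultinomialNB(),
)
model.fit(messages, labels)

predictions = model.predict(["Claim your free cash prize", "See you at lunch"])
print(list(predictions))
```

Swapping `MultinomialNB()` for `LogisticRegression()` or `LinearSVC()` would leave the rest of the pipeline unchanged, which makes it easy to compare the model families listed above on the same features.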
The dataset contains SMS messages labeled as spam or ham, useful for supervised learning in NLP.
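A minimal loading sketch follows, assuming the file stores one message per line as a label, a tab, and the message text. That layout is an assumption and may not match the file in this repository; the in-memory sample and the `load_messages` helper are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumption: each line looks like "<label>\t<message>".
# This made-up in-memory sample stands in for the real dataset file.
sample = io.StringIO(
    "ham\tAre we still meeting for lunch today?\n"
    "spam\tFree entry: text WIN to claim your prize\n"
    "ham\tI'll call you when I get home\n"
)


def load_messages(handle):
    """Return (labels, texts) parsed from a tab-separated file-like object."""
    # QUOTE_NONE keeps quotes and apostrophes in messages as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    labels, texts = [], []
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        labels.append(row[0])
        texts.append(row[1])
    return labels, texts


labels, texts = load_messages(sample)
print(Counter(labels))  # class balance is worth checking before training
```

To read from disk instead, the same function can be passed an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points at the dataset file.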
- Python
- scikit-learn
- Pandas, NumPy
- NLTK
- Matplotlib, Seaborn
- Clone or download this repository.
- Install the dependencies:
pip install -r requirements.txt
- Launch Jupyter Notebook:
jupyter notebook
- Open and run Spam SMS Classication.ipynb.
- Models are evaluated using accuracy and F1-score.
- Notebook includes classification metrics and confusion matrix visualizations.
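For reference, the sketch below shows how these metrics follow from the four confusion-matrix counts, treating spam as the positive class. The labels and predictions are made up for illustration and are not results from the notebook; `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute confusion-matrix counts and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # ham kept as ham

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels and predictions, for illustration only.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "spam"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because ham messages usually outnumber spam, accuracy alone can look good even when a model misses much of the spam, which is why F1 is reported alongside it. In practice, scikit-learn's `classification_report` and `confusion_matrix` compute the same quantities.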
This project is intended for learning and as a portfolio showcase. It can be extended with more advanced models such as LSTMs or Transformers.