
A machine learning application that detects fake and spam messages on social media using LSTM neural networks and traditional ML methods. Includes a Flask web interface.


BaverYldz/SpamHandler-DL


Fake Message Detector

A machine learning-based solution for detecting fake/spam messages on social media platforms, using an LSTM network and a TF-IDF + Logistic Regression model.

Python 3.7+ · TensorFlow

Overview

This project implements a web application that detects fake or spam messages using natural language processing and deep learning. It analyzes a message's text and classifies it as legitimate or spam/fake; reported accuracy ranges from 85% to 88% depending on the model (see Model Performance).

Features

  • Multiple Classification Models:

    • LSTM neural network for sequence analysis
    • TF-IDF + Logistic Regression for comparison
    • Ensemble method combining both approaches
  • Interactive Web UI:

    • Real-time message analysis
    • Prediction confidence scores
    • History tracking of previous detections
    • Statistics dashboard
  • Text Processing:

    • Advanced NLP preprocessing
    • URL, emoji, and special character handling
    • Stop word removal and lemmatization
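The text-processing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in src/preprocessing.py: the function name `clean_text`, the tiny stop-word set, the `url` placeholder token, and the emoji ranges are all assumptions, and lemmatization is left out because it needs an NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real pipeline would
# likely use a full list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Rough emoji coverage (assumption): common pictograph and symbol blocks only.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_text(text: str) -> str:
    """Lowercase, normalize URLs, strip emoji/punctuation, drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" url ", text)    # keep a placeholder: links can signal spam
    text = EMOJI_RE.sub(" ", text)      # remove emoji
    text = NON_ALNUM_RE.sub(" ", text)  # remove remaining special characters
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("WIN a FREE prize!!! Visit http://spam.example/win 🎉 now"))
# prints: win free prize visit url now
```

Replacing URLs with a placeholder rather than deleting them is one possible design choice; it keeps the fact that a link was present available to the classifier.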

Project Structure

├── data/                    # Dataset files
│   ├── twitter_spam_data.csv    # Raw dataset
│   └── processed_twitter_spam.csv # Preprocessed dataset
├── models/                  # Trained model files
│   ├── lstm_model.h5        # LSTM neural network model
│   ├── tokenizer.pickle     # Text tokenizer
│   ├── tfidf_vectorizer.pickle # TF-IDF vectorizer
│   └── lr_model.pickle      # Logistic regression model
├── src/                     # Source code
│   ├── preprocessing.py     # Text cleaning and preprocessing
│   ├── main.py              # Model training script
│   ├── predict.py           # Prediction functionality
│   └── data_helper.py       # Dataset utilities
├── static/                  # Web assets
│   ├── css/                 # Stylesheets
│   └── js/                  # JavaScript files
├── templates/               # HTML templates
├── run_flask.py             # Flask web application
├── run.py                   # Main runner script with CLI
└── README.md                # Project documentation

Installation

# Clone the repository
git clone https://github.com/BaverYldz/SpamHandler-DL.git
cd SpamHandler-DL

# Install Git LFS (if not installed)
# For Windows: https://git-lfs.github.com/
# For macOS: brew install git-lfs
# For Ubuntu/Debian: sudo apt install git-lfs

# Set up Git LFS
git lfs install

# Create virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Dataset

The project uses a Twitter spam dataset of approximately 5,500 messages, each labeled as spam (1) or legitimate (0). Each record contains:

  • Message text
  • Class label (spam/legitimate)
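Before training, it can help to check how balanced the two classes are. The sketch below counts labels with the standard library; the column names `text` and `label` are assumptions (the README does not state the CSV header), and the in-memory sample stands in for the real file.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_column="label"):
    """Count rows per class label (1 = spam, 0 = legitimate)."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row[label_column]) for row in reader)


# Tiny made-up sample for illustration only; the column names are hypothetical.
sample = io.StringIO(
    "text,label\n"
    "Win a free phone now,1\n"
    "See you at lunch,0\n"
    "Claim your prize today,1\n"
)
counts = label_counts(sample)
print(f"spam: {counts[1]}, legitimate: {counts[0]}")
# prints: spam: 2, legitimate: 1

# With the real dataset, assuming it has the same header:
# with open("data/twitter_spam_data.csv", newline="", encoding="utf-8") as f:
#     print(label_counts(f))
```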

Usage

1. Data Preparation

python src/data_helper.py

2. Training Models

python src/main.py

3. Running the Web Application

python run_flask.py

Then open your browser and navigate to: http://localhost:5000

4. Using the CLI Runner

python run.py

Choose from the interactive menu options to prepare data, train models, or launch the web app.

Model Performance

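| Model       | Accuracy | Precision | Recall | F1-Score |
|-------------|----------|-----------|--------|----------|
| LSTM        | 87%      | 84%       | 81%    | 82.5%    |
| TF-IDF + LR | 85%      | 82%       | 80%    | 81%      |
| Ensemble    | 88%      | 85%       | 82%    | 83.5%    |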
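The F1-Score column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns as a sanity check. A small sketch, with the helper name `f1_score` chosen here for illustration:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the table above.
reported = {
    "LSTM": (0.84, 0.81),
    "TF-IDF + LR": (0.82, 0.80),
    "Ensemble": (0.85, 0.82),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.1%}")
# prints:
# LSTM: F1 = 82.5%
# TF-IDF + LR: F1 = 81.0%
# Ensemble: F1 = 83.5%
```

The recomputed values agree with the reported F1 scores to one decimal place.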

Future Improvements

  • Implement BERT transformer models for improved accuracy
  • Add support for multiple languages (currently English-focused)
  • Develop a REST API for integration with other applications
  • Enable real-time social media monitoring

License

This project is licensed under the MIT License - see the LICENSE file for details.
