Vishnu3377/fake-news-nlp-project

📰 Fake News Detection & NLP Analysis

This project applies natural language processing (NLP) and machine learning classification to a dataset of news articles in order to distinguish Fake News from Factual News. The pipeline covers preprocessing, tokenization, Named Entity Recognition (NER), sentiment analysis, topic modeling, and classification.


📁 Dataset

  • fake_news_data.csv: Contains 198 news articles labeled as either "Fake News" or "Factual News".

Dataset Columns:

  • title: Headline of the news article.
  • text: Body of the news article.
  • date: Publication date.
  • fake_or_factual: Label ("Fake News" or "Factual News").
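
As a quick sanity check on the schema above, the following is a minimal, dependency-free sketch that reads rows with these four columns and counts the labels. The two sample rows, the date format, and the helper name `load_articles` are placeholders invented for illustration; they are not taken from fake_news_data.csv. In the project itself the same file would more likely be read with `pandas.read_csv`.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; NOT real data from fake_news_data.csv.
SAMPLE_CSV = """title,text,date,fake_or_factual
Example headline A,Example body text A.,2017-01-01,Fake News
Example headline B,Example body text B.,2017-01-02,Factual News
"""

EXPECTED_COLUMNS = ["title", "text", "date", "fake_or_factual"]
ALLOWED_LABELS = {"Fake News", "Factual News"}


def load_articles(handle):
    """Read articles from an open CSV handle and validate columns and labels."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    for row in rows:
        if row["fake_or_factual"] not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {row['fake_or_factual']!r}")
    return rows


# To read the real file instead (assuming it sits in the working directory):
#   with open("fake_news_data.csv", newline="", encoding="utf-8") as fh:
#       articles = load_articles(fh)
articles = load_articles(io.StringIO(SAMPLE_CSV))
label_counts = Counter(row["fake_or_factual"] for row in articles)
print(label_counts)
```

Validating the label values up front catches stray spellings before they silently become a third class during training.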

🔧 Libraries Used

  • pandas, matplotlib, seaborn
  • spacy, nltk, re
  • vaderSentiment
  • gensim
  • scikit-learn (imported as sklearn)

🧹 Preprocessing & Feature Engineering

  • Lowercasing, punctuation removal, stopword filtering
  • Tokenization using nltk
  • Lemmatization using WordNetLemmatizer
  • Named Entity Recognition with spaCy
  • Sentiment scoring with VADER
  • Bag of Words and TF-IDF features
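
The cleaning and vectorization steps above can be sketched without any third-party packages. This is an illustrative approximation, not the project's code: the stopword set is a tiny hand-picked stand-in for the nltk list, tokenization is a plain whitespace split, lemmatization is omitted, and the TF-IDF weighting is the textbook `tf * log(N / df)` form, which differs from library implementations that add smoothing and normalization. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword set (assumption); nltk's English list is much longer.
STOPWORDS = {"the", "a", "an", "is", "in", "of", "and", "to", "on"}


def preprocess(text):
    """Lowercase, strip punctuation, split on whitespace, drop stopwords."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # remove anything that is not a word char or space
    return [tok for tok in text.split() if tok not in STOPWORDS]


def bag_of_words(docs):
    """Map each tokenized document to its raw term counts."""
    return [Counter(doc) for doc in docs]


def tf_idf(docs):
    """Weight each term by (term frequency) * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Made-up sentences, used only to exercise the functions.
raw_docs = ["The senator voted on the bill.", "Aliens voted in the election!"]
token_docs = [preprocess(doc) for doc in raw_docs]
bow = bag_of_words(token_docs)
weights = tf_idf(token_docs)
print(token_docs)
print(weights[0])
```

A term that appears in every document (here `voted`) gets a TF-IDF weight of zero, which is the reason TF-IDF can down-weight words that Bag of Words counts treat as informative.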

📊 Exploratory Data Analysis

  • Distribution of fake vs factual news
  • Part-of-speech tagging frequency
  • Common named entities in each category
  • Sentiment analysis across news types
  • Top unigrams after preprocessing
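
For the "top unigrams" comparison, one simple approach is to pool the preprocessed tokens per label and rank them by frequency. The sketch below assumes the articles are already tokenized; the function name `top_unigrams` and the sample tokens are invented for illustration.

```python
from collections import Counter, defaultdict


def top_unigrams(labelled_tokens, n=3):
    """Return the n most frequent tokens for each label.

    labelled_tokens is an iterable of (label, list_of_tokens) pairs.
    """
    counts = defaultdict(Counter)
    for label, tokens in labelled_tokens:
        counts[label].update(tokens)
    return {label: counter.most_common(n) for label, counter in counts.items()}


# Made-up tokens, used only to show the shape of the result.
sample = [
    ("Fake News", ["shocking", "claim", "shocking"]),
    ("Factual News", ["report", "said", "report", "report", "said"]),
    ("Fake News", ["claim", "shocking"]),
]
top = top_unigrams(sample, n=1)
print(top)
```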

🧠 Topic Modeling

  • LDA (Latent Dirichlet Allocation)
  • LSA (Latent Semantic Analysis)
  • Coherence-score plots used to choose the number of topics

🤖 Machine Learning Models

Two models were trained using Bag of Words features:

Logistic Regression

  • Accuracy: 90%
  • Precision/Recall:
    • Fake News: 93% / 86%
    • Factual News: 88% / 94%

SGDClassifier (Linear SVM)

  • Accuracy: 83%
  • Precision/Recall:
    • Fake News: 91% / 72%
    • Factual News: 78% / 94%
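
To make the per-class figures above easier to read, here is a small sketch of how precision, recall, and accuracy are derived from true and predicted labels. The label lists are toy values and do not reproduce the reported results; the helper names are hypothetical. In scikit-learn, `sklearn.metrics.classification_report` produces the same kind of per-class table.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall) treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


FAKE, FACTUAL = "Fake News", "Factual News"

# Toy labels: one fake article is misclassified as factual.
y_true = [FAKE, FAKE, FAKE, FACTUAL, FACTUAL, FACTUAL]
y_pred = [FAKE, FAKE, FACTUAL, FACTUAL, FACTUAL, FACTUAL]

fake_precision, fake_recall = per_class_metrics(y_true, y_pred, FAKE)
factual_precision, factual_recall = per_class_metrics(y_true, y_pred, FACTUAL)
print(f"accuracy={accuracy(y_true, y_pred):.2f}")
print(f"fake: precision={fake_precision:.2f} recall={fake_recall:.2f}")
print(f"factual: precision={factual_precision:.2f} recall={factual_recall:.2f}")
```

The toy example shows the same pattern as the reported scores: when some fake articles are predicted as factual, Fake News recall drops while Factual News precision drops with it.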

📈 Visualizations

  • Count plots
  • POS and NER distribution bars
  • Sentiment bar charts
  • LDA/LSA topic charts

🚀 How to Run

  1. Clone the repository.
  2. Make sure fake_news_data.csv is in the root directory.
  3. Install the dependencies:
     pip install -r requirements.txt
  4. Run the analysis in a Jupyter Notebook or Python script.

🧾 Author

Vishnu M
LinkedIn: linkedin.com/in/vishnu-m737
