With the rise of misinformation, it is crucial to have an effective system for detecting fake news. This project focuses on detecting fake Hindi news using various machine learning models and analyzing their performance.
- Preprocessing: Cleaning and preparing Hindi news datasets.
- Fake News Detection: Implementing and evaluating multiple models:
  - Linear Regression
  - Support Vector Machine (SVM)
  - Random Forest
  - Naïve Bayes
- Model Analysis: Comparing performance metrics like accuracy, precision, recall, and F1-score.
- Result Visualization: Graphical representation of model comparisons.
- Programming Language: Python 🐍
- Libraries Used: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, NLTK
- Dataset: Hindi Fake News Dataset (Custom)
- Data Collection: Gather a dataset containing real and fake Hindi news.
- Data Preprocessing:
  - Text cleaning (tokenization, removal of stop words and special characters)
  - Vectorization using TF-IDF or CountVectorizer
- Model Implementation:
  - Train & test multiple ML models
  - Hyperparameter tuning for optimal results
- Performance Evaluation:
  - Compare models using accuracy, precision, recall, and F1-score
  - Plot the metrics side by side so differences between models are easy to see
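The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's actual `main.py`: the toy sentences, the short stop-word list, and the helper names `clean_text` and `evaluate_models` are all invented for the example, only three of the listed models are included, and hyperparameters are left at their defaults rather than tuned.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Hypothetical, deliberately short stop-word list (assumption for this sketch).
HINDI_STOP_WORDS = {"है", "और", "का", "की", "के", "में", "से", "को", "पर", "यह"}


def clean_text(text):
    """Strip non-Devanagari characters, tokenize on whitespace, drop stop words."""
    # U+0900-U+0963 covers Devanagari letters and vowel signs; danda,
    # digits, Latin text, and symbols are replaced by spaces.
    text = re.sub(r"[^\u0900-\u0963\s]", " ", text)
    tokens = text.split()
    return " ".join(t for t in tokens if t not in HINDI_STOP_WORDS)


def evaluate_models(texts, labels, seed=0):
    """Train each model on TF-IDF features and return its test metrics."""
    cleaned = [clean_text(t) for t in texts]
    X_train, X_test, y_train, y_test = train_test_split(
        cleaned, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    # Split on whitespace: the default token pattern relies on \w, which
    # can break Devanagari words apart at vowel signs.
    vectorizer = TfidfVectorizer(token_pattern=r"\S+")
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)

    models = {
        "SVM": SVC(kernel="linear"),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Naive Bayes": MultinomialNB(),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train_vec, y_train)
        pred = model.predict(X_test_vec)
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, pred, average="binary", zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


# Invented toy corpus (1 = fake, 0 = real); far too small for meaningful scores.
TOY_TEXTS = [
    "चमत्कारी दवा से एक दिन में सब बीमारी ठीक",
    "सरकार कल से सभी को मुफ्त सोना देगी",
    "वैज्ञानिकों ने माना कि चाँद पनीर से बना है",
    "इस संदेश को आगे भेजो और लाखों रुपये जीतो",
    "संसद में आज नया बजट पेश किया गया",
    "मौसम विभाग ने भारी बारिश की चेतावनी दी",
    "भारतीय टीम ने क्रिकेट मैच जीत लिया",
    "शहर में नई मेट्रो लाइन का उद्घाटन हुआ",
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

results = evaluate_models(TOY_TEXTS, TOY_LABELS)
for name, metrics in results.items():
    print(name, metrics)
```

On a real dataset the same structure applies, with the toy lists replaced by the loaded news texts and labels; a tuning step such as scikit-learn's `GridSearchCV` could wrap each model before the evaluation loop.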
| Model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| SVM | 89% | 88% | 90% | 89% |
| Random Forest | 87% | 85% | 86% | 85.5% |
| Naïve Bayes | 73% | 81% | 80% | 80.5% |
| Linear Regression | 85% | 74% | 72% | 73% |
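The F1-score column follows from the precision and recall columns, since F1 is the harmonic mean of the two. The short check below recomputes it from the values in the table; the helper name `f1_from` is chosen here for illustration only.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, copied from the results table.
reported = {
    "SVM": (88, 90),
    "Random Forest": (85, 86),
    "Naïve Bayes": (81, 80),
    "Linear Regression": (74, 72),
}

f1_by_model = {name: round(f1_from(p, r), 1) for name, (p, r) in reported.items()}
for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1}%")
```

Rounded to one decimal place, the recomputed values are 89.0, 85.5, 80.5, and 73.0, which agree with the table.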
Conclusion: SVM performed best among the tested models, with the highest accuracy (89%), recall (90%), and F1-score (89%).
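For the result visualization, one possible approach is a grouped bar chart of the reported metrics. The sketch below uses Matplotlib with the numbers from the results table; the output file name `model_comparison.png` and the chart layout are assumptions made for this example, not taken from the project's code.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

metrics = ["Accuracy", "Precision", "Recall", "F1-score"]
# Scores in percent, copied from the results table.
scores = {
    "SVM": [89, 88, 90, 89],
    "Random Forest": [87, 85, 86, 85.5],
    "Naïve Bayes": [73, 81, 80, 80.5],
    "Linear Regression": [85, 74, 72, 73],
}

x = np.arange(len(metrics))  # one group of bars per metric
width = 0.2                  # four bars per group fit within 0.8

fig, ax = plt.subplots(figsize=(8, 4))
for i, (model, values) in enumerate(scores.items()):
    # Offset each model's bars so the group is centered on the tick.
    ax.bar(x + (i - 1.5) * width, values, width, label=model)

ax.set_xticks(x)
ax.set_xticklabels(metrics)
ax.set_ylabel("Score (%)")
ax.set_ylim(0, 100)
ax.set_title("Model comparison")
ax.legend()
fig.tight_layout()
fig.savefig("model_comparison.png")  # hypothetical output path
```

Grouping by metric rather than by model keeps the four models adjacent for each metric, which makes the per-metric ranking easier to read.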
Ensure you have Python and the required libraries installed.
pip install pandas numpy scikit-learn nltk matplotlib seaborn
- Clone the repository:
git clone https://github.com/yourusername/hindi-news-fake-detection.git
cd hindi-news-fake-detection
- Execute the script:
python main.py
Thanks to these amazing people:
- Ankit Kumar
- Sarthak Gupta
- Aditya Birwatkar
For queries or collaboration, reach out to: 📧 Email: ankitk22it@student.mes.ac.in