This project explores an effective way to identify misinformation using sophisticated big data tools, including Apache Spark, Apache Kafka, and Databricks. It focuses on processing large, continuous data streams, integrating real-time RSS feeds with static news databases. The system improves efficiency through preprocessing steps such as normalization and tokenization, along with data management practices that reduce computational load. The approach combines global, local, and ensemble modeling techniques to achieve well-rounded performance, with accuracy, precision, and recall each around 72.5% and an F1 score of 72.3%.
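As a rough illustration of the normalization and tokenization steps mentioned above, here is a minimal, framework-agnostic Python sketch. The actual pipeline runs on Spark/Databricks, and the function names here are hypothetical, not taken from the project code:

```python
import re

def normalize(text: str) -> str:
    """Lowercase the text, strip punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)      # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace

def tokenize(text: str) -> list:
    """Split normalized text into word tokens."""
    return normalize(text).split()

tokens = tokenize("BREAKING: Scientists 'confirm' shocking claim!!")
# tokens == ['breaking', 'scientists', 'confirm', 'shocking', 'claim']
```

In the Spark version of the pipeline, the same idea would typically be expressed with built-in transformers (e.g. `pyspark.ml.feature.Tokenizer`) so that preprocessing scales across the streamed data.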
The model code is in 'databricks/model', the Kafka code is in 'databricks/kafka', and the data ingestion code is in 'databricks/data-pipeline'.