Movie-Reviews-Sentiment-Analysis-using-PySpark

This repository contains PySpark code for sentiment analysis of movie reviews. Each review is scored by counting the positive and negative words it contains, and a sentiment label is assigned from those counts.
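The counting approach can be sketched in plain Python before Spark is involved. This is a minimal illustration, not the notebook's actual code: the function name `score_review`, the lowercase regex tokenizer, the sample word sets, and the rule that a tie is labeled 'negative' are all assumptions.

```python
import re

def score_review(text, positive_words, negative_words):
    """Return (positive_count, negative_count, sentiment) for one review."""
    # Assumption: lowercase the text and treat runs of letters/apostrophes as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_count = sum(1 for token in tokens if token in positive_words)
    negative_count = sum(1 for token in tokens if token in negative_words)
    # Assumption: a tie is labeled 'negative'; the notebook may break ties differently.
    sentiment = "positive" if positive_count > negative_count else "negative"
    return positive_count, negative_count, sentiment

# Tiny hypothetical word sets, standing in for pos.txt and neg.txt.
positive = {"good", "great", "brilliant"}
negative = {"bad", "boring", "dull"}

print(score_review("A great cast, but a boring and dull plot.", positive, negative))
# (1, 2, 'negative')
```

Keeping the word lists as sets makes each membership check constant time, which matters once the lists grow to thousands of entries.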

Prerequisites

Before running the code, ensure you have the following installed:

Java 8: sudo apt-get install openjdk-8-jdk-headless
Apache Spark 3.2.0: Download it from the Apache Spark downloads archive and extract it.
Python packages: findspark

Setup

Install Java 8 and extract Apache Spark 3.2.0.
Install required Python packages: pip install findspark.
Clone this repository: git clone <repository-url>.
Place your movie review files in the directory the code reads from: /content/drive/MyDrive/Analysis/data_moviereviews (a Google Drive path as mounted in Colab; change it if you run elsewhere).
Edit pos.txt and neg.txt to adjust the lists of positive and negative words.
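Once Java and Spark are in place, the Python side usually needs to know where they live. The sketch below shows one way to do that with findspark. Both default paths and the names `configure_spark_env` and `start_session` are assumptions for illustration; point the paths at wherever Java and Spark were actually installed.

```python
import os

# Assumed install locations; adjust to match your machine.
DEFAULT_JAVA_HOME = "/usr/lib/jvm/java-8-openjdk-amd64"
DEFAULT_SPARK_HOME = "/content/spark-3.2.0-bin-hadoop3.2"

def configure_spark_env(java_home=DEFAULT_JAVA_HOME, spark_home=DEFAULT_SPARK_HOME):
    """Export the environment variables that findspark and Spark read."""
    os.environ["JAVA_HOME"] = java_home
    os.environ["SPARK_HOME"] = spark_home
    return {"JAVA_HOME": java_home, "SPARK_HOME": spark_home}

def start_session(app_name="MovieReviewSentiment"):
    """Create a local SparkSession; needs findspark and a Spark install."""
    import findspark  # imported lazily so this module loads without Spark
    findspark.init()  # puts the pyspark found under SPARK_HOME on sys.path
    from pyspark.sql import SparkSession
    return SparkSession.builder.master("local[*]").appName(app_name).getOrCreate()

# Typical use in a notebook cell:
# configure_spark_env()
# spark = start_session()
```

Calling `configure_spark_env()` before `start_session()` matters, because `findspark.init()` looks up `SPARK_HOME` when it runs.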

Usage

Open Sentiment_Analysis.ipynb in Jupyter, or in another Python environment where Spark is set up.
Run the notebook cells in order, or execute the equivalent Python script in your environment.

File Descriptions

data_moviereviews: Directory containing movie review files.
pos.txt: File containing positive words.
neg.txt: File containing negative words.
Sentiment_Analysis.ipynb: Jupyter Notebook containing the code for sentiment analysis.
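The word lists can be loaded into Python sets with a few lines. This is a sketch that assumes pos.txt and neg.txt hold one word per line in UTF-8; the helper name `load_word_list` is hypothetical, and the real files may use a different layout.

```python
def load_word_list(path):
    """Read a word-list file into a set of lowercase words."""
    # Assumption: one word per line; surrounding whitespace and blank lines are ignored.
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}

# Typical use, with the files from this repository in the working directory:
# positive_words = load_word_list("pos.txt")
# negative_words = load_word_list("neg.txt")
```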

Output

The analysis produces a DataFrame with the following columns:

filename: Name of the movie review file.
positive_count: Count of positive words in the review.
negative_count: Count of negative words in the review.
sentiment: Assigned sentiment based on word counts (either 'positive' or 'negative').
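A DataFrame with these four columns could be assembled roughly as follows. This is a sketch under stated assumptions, not the notebook's code: `build_sentiment_frame`, `filename_from_uri`, and the `score` callable (mapping review text to a `(positive_count, negative_count, sentiment)` tuple) are hypothetical names, and the column types in `OUTPUT_SCHEMA` are a guess based on the column descriptions.

```python
import os
from urllib.parse import urlparse

# Assumed column types for the four output columns.
OUTPUT_SCHEMA = "filename string, positive_count int, negative_count int, sentiment string"

def filename_from_uri(uri):
    """Reduce a path such as 'file:/data/reviews/a.txt' to 'a.txt'."""
    return os.path.basename(urlparse(uri).path)

def build_sentiment_frame(spark, data_dir, score):
    """Score every file directly under data_dir and return the output DataFrame.

    `score` is a hypothetical callable: review text in,
    (positive_count, negative_count, sentiment) out.
    """
    # wholeTextFiles yields one (file URI, file content) pair per file.
    pairs = spark.sparkContext.wholeTextFiles(data_dir)
    rows = pairs.map(lambda kv: (filename_from_uri(kv[0]),) + tuple(score(kv[1])))
    return spark.createDataFrame(rows, schema=OUTPUT_SCHEMA)

# Typical use, given a SparkSession `spark` and a scoring function `score`:
# df = build_sentiment_frame(spark, "/content/drive/MyDrive/Analysis/data_moviereviews", score)
# df.show(5, truncate=False)
```

`wholeTextFiles` suits this layout because each review is a separate file and the filename has to survive into the output; reading with `spark.read.text` would split reviews into lines instead.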

tina-joseph/Movie-Reviews-Sentiment-Analysis-using-PySpark

Folders and files

Latest commit

History

Repository files navigation

Movie-Reviews-Sentiment-Analysis-using-PySpark

Prerequisites

Setup

Usage

File Descriptions

Output

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages