Movie Recommendation System: A content-based recommendation platform built with Python, Pinecone, and Streamlit. The system provides personalized movie suggestions based on genres and metadata, allowing users to explore tailored recommendations. With interactive genre filtering and a clean interface, the app enhances movie discovery. Hosted on Render.

gupta-v/movie-recommendation-system

Movie Recommendation System

This is a Movie Recommendation System built with content-based filtering and hosted on Render. The system leverages metadata such as movie genres, titles, and tags to recommend movies based on similarity. The recommendation engine uses Pinecone for vector similarity search and TF-IDF vectorization for feature extraction.

Project Overview

The Movie Recommendation System takes a user-selected movie and recommends similar movies by analyzing metadata such as genres and tags. It uses content-based filtering: movies whose features (genres, tags) are similar to those of the selected movie are recommended. Recommendations can also be filtered by genre to better match a user's preferences.

The application uses Streamlit for the front-end interface and Pinecone for fast vector-based similarity search.

Features

  • Movie Recommendations: Get personalized recommendations based on selected movies.
  • Genre Filter: Users can filter recommendations by genre to receive more relevant suggestions.
  • Interactive UI: A user-friendly interface with Streamlit to browse and explore movies.
  • Poster Display: Movie posters are fetched from the OMDB API for better visualization of results.
  • Deployed Live: Access the live version here.
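The poster display described above could be wired up roughly as follows. This is a minimal sketch, not the project's actual code: it assumes OMDb's lookup-by-IMDb-ID request (`i` and `apikey` query parameters) and that the dataset may store `imdbId` as a bare number that needs the `tt` prefix and zero padding. The helper names are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

OMDB_ENDPOINT = "https://www.omdbapi.com/"


def to_imdb_id(raw_id) -> str:
    """Normalise an IMDb ID to the 'tt0114709' form.

    Assumption: the dataset may hold the ID as a bare integer (114709).
    """
    text = str(raw_id).strip()
    if text.startswith("tt"):
        return text
    return "tt" + text.zfill(7)


def build_poster_request(imdb_id, api_key: str) -> str:
    """Build the OMDb lookup URL for one movie."""
    query = urlencode({"i": to_imdb_id(imdb_id), "apikey": api_key})
    return OMDB_ENDPOINT + "?" + query


def extract_poster(payload: dict) -> Optional[str]:
    """Return the poster URL from an OMDb JSON payload, or None if absent."""
    poster = payload.get("Poster")
    if payload.get("Response") == "True" and poster and poster != "N/A":
        return poster
    return None


def fetch_poster(imdb_id, api_key: str, timeout: float = 5.0) -> Optional[str]:
    """Perform the HTTP request (needs network access and a valid key)."""
    url = build_poster_request(imdb_id, api_key)
    with urlopen(url, timeout=timeout) as response:
        return extract_poster(json.load(response))
```

Keeping URL construction and payload parsing separate from the network call makes it easy to fall back to a placeholder image when `extract_poster` returns `None`.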

Uses and Scope

The Movie Recommendation System offers an intuitive platform for users to discover personalized movie suggestions, while also providing potential for future enhancements:

  • Personalized Movie Recommendations: Users can input their favorite movie titles and receive tailored recommendations based on similar genres and metadata, enhancing their movie-watching experience.

  • Interactive Genre Filtering: The system allows users to filter recommendations by selecting genres like Action, Comedy, or Sci-Fi, enabling more precise discovery of movies that match their preferences.

  • Content-Based Filtering: By analyzing movie metadata, the system delivers recommendations closely aligned with the content and characteristics of previously watched movies, ensuring relevance in suggestions.

  • Expansion to Collaborative Filtering: Future iterations could integrate collaborative filtering, allowing recommendations based on user behavior and preferences, enhancing the accuracy and diversity of suggestions.

  • Integration with Additional Movie Databases: Incorporating more movie sources such as IMDb or TMDb would enrich the dataset, improving both the variety and accuracy of recommendations.

  • Real-Time Feedback and Learning: The system could be expanded to collect real-time feedback (e.g., likes/dislikes), enabling dynamic learning and more personalized recommendations over time.
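To illustrate the genre filtering mentioned above, here is a small sketch that narrows a list of recommendations to the genres a user selected. It assumes each recommendation is a dict with a `genres` string in a pipe-separated form such as `Action|Sci-Fi`; the actual storage format in the dataset is not stated, and the function names are hypothetical.

```python
def parse_genres(raw: str) -> set:
    """Split a pipe-separated genre string into a lower-cased set.

    Assumption: genres are stored like 'Action|Sci-Fi'.
    """
    return {genre.strip().lower() for genre in raw.split("|") if genre.strip()}


def filter_by_genres(recommendations, selected):
    """Keep recommendations that share at least one selected genre.

    An empty selection means "no filter", so everything is returned.
    """
    wanted = {genre.strip().lower() for genre in selected}
    if not wanted:
        return list(recommendations)
    return [
        movie for movie in recommendations
        if wanted & parse_genres(movie["genres"])
    ]
```

This version matches on any selected genre. Requiring all selected genres instead would only need `wanted <= parse_genres(...)` in place of the intersection.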

File Structure

movie-recommendation-system
│
├── app
│   ├── streamlit-app.py                         # Streamlit application script
│   └── preprocessings.py                        # Preprocessing functions used in Streamlit App
│
├── data
│   └── content_based_filtering_dataset.csv      # Dataset for content-based filtering
│
├── steps
│   ├── data_ingestion.py                        # Script for loading data
│   ├── data_preprocessing.py                    # Script for data preprocessing
│   ├── database_setup.py                        # Script for Pinecone setup and vectorizing Index
│   └── model.py                                 # Script for recommendation engine implementation
│
├── .env.example                                 # Example environment variables file
├── .gitignore                                   # Files and directories to be ignored by Git
├── requirements.txt                             # Python dependencies
└── README.md                                    # Project documentation

Software and Tools Requirements

  1. GitHub Account
  2. Render Account
  3. Pinecone Account
  4. OMDB API
  5. Python 3
  6. VSCode IDE
  7. Git CLI

Getting Started

Prerequisites

  • Python: Version 3.7 or higher
  • pip: Ensure pip is installed for managing project dependencies

Installation

  • Clone the repository:

    git clone https://github.com/gupta-v/movie-recommendation-system.git
    
  • Navigate to the project directory:

    cd movie-recommendation-system
    
  • Install the required packages using requirements.txt:

    pip install -r requirements.txt
  • Set up your environment variables:

    cp .env.example .env
    

    Edit the .env file with the necessary details (API keys for OMDB, Pinecone, etc.).
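A quick way to confirm the environment is configured before starting the app is to check for the expected keys. The sketch below is illustrative only: the variable names `PINECONE_API_KEY` and `OMDB_API_KEY` are assumptions (use the names listed in `.env.example`), and it reads from the process environment, so the `.env` file must already have been loaded by whatever mechanism the project uses.

```python
import os

# Hypothetical variable names; check .env.example for the real ones.
REQUIRED_KEYS = ("PINECONE_API_KEY", "OMDB_API_KEY")


def missing_settings(environ=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or blank."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```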

Data Description

The dataset used for content-based filtering includes movie metadata such as titles, genres, tags, relevant tags, combined features, and movie IDs. The dataset is loaded from content_based_filtering_dataset.csv, located in the data/ folder.

Key columns:

  • movieId: Unique ID for each movie.
  • title: The name of the movie.
  • genres: List of genres the movie belongs to.
  • tag: List of tags given to the movies.
  • relevant_tags: List of tags that are relevant according to users.
  • combined_features: This column contains the string of all genres, tags & relevant_tags for each movie row.
  • imdbId: The IMDb ID used to fetch movie posters.
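The combined_features column can be pictured as a simple concatenation of the other text columns. The following sketch shows one plausible way to build such a string; the exact cleaning rules used for the dataset are not documented here, so lower-casing and whitespace normalisation are assumptions, and `combine_features` is a hypothetical helper.

```python
def combine_features(genres, tags, relevant_tags) -> str:
    """Join genres, tags and relevant tags into one space-separated string.

    Assumption: values are lower-cased and blank entries are dropped.
    """
    tokens = []
    for group in (genres, tags, relevant_tags):
        for value in group:
            cleaned = " ".join(str(value).lower().split())
            if cleaned:
                tokens.append(cleaned)
    return " ".join(tokens)


example = combine_features(["Adventure", "Animation"], ["pixar "], ["toys", ""])
print(example)  # adventure animation pixar toys
```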

Usage

  • Create a Pinecone Index:

    • Visit Pinecone.io and create an index for your data vectors with the following settings:
    • Index Name: movie-recommendation-system
    • Dimension: 500
    • Metric: Cosine
  • Run the following script to set up the index with your data vectors in Pinecone:

    python steps/database_setup.py
  • Start the Streamlit app:

    streamlit run app/streamlit-app.py
    

Project Steps

1. Data Ingestion:

  • Script: data_ingestion.py
  • Description: Loads the movie metadata from data/content_based_filtering_dataset.csv using pandas and logs the data loading process.
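As a rough picture of this step, the sketch below loads a CSV file and logs what happened. To stay dependency-free it uses the standard library's `csv` module rather than pandas, so it is a stand-in for the project's script, not a copy of it; with pandas, `pd.read_csv` would fill the same role. The function name `load_movies` is hypothetical.

```python
import csv
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("data_ingestion")


def load_movies(csv_path):
    """Read the dataset into a list of row dicts, logging progress."""
    path = Path(csv_path)
    logger.info("Loading dataset from %s", path)
    if not path.is_file():
        logger.error("Dataset not found: %s", path)
        raise FileNotFoundError(path)
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    logger.info("Loaded %d rows", len(rows))
    return rows
```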

2. Data Preprocessing:

  • Script: data_preprocessing.py
  • Description: Preprocesses the data by vectorizing the combined_features column. It computes the TF-IDF vectors and the cosine metric for them, and reduces the dimensionality of the TF-IDF vectors so that they can be stored in a vector database without memory errors.
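To make the TF-IDF part of this step concrete, here is a small pure-Python sketch. It is an illustration under assumptions, not the project's implementation: it tokenises on whitespace, uses a smoothed IDF of the form ln((1 + N) / (1 + df)) + 1, and L2-normalises each row. The dimensionality reduction down to the 500 dimensions used by the index is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one L2-normalised row per document."""
    tokenised = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenised for tok in toks})
    column = {term: i for i, term in enumerate(vocabulary)}
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    idf = {
        term: math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
        for term in vocabulary
    }
    vectors = []
    for toks in tokenised:
        row = [0.0] * len(vocabulary)
        for term, count in Counter(toks).items():
            row[column[term]] = count * idf[term]
        norm = math.sqrt(sum(value * value for value in row))
        vectors.append([value / norm for value in row] if norm else row)
    return vocabulary, vectors
```

Because the rows are unit length, the dot product of two rows equals their cosine similarity, which is the metric the Pinecone index is configured with.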

3. Database Setup:

  • Script: database_setup.py
  • Description: Sets up the Pinecone vector database to use the cosine metric for recommendations.
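Uploading vectors to an index is usually done in batches. The sketch below prepares records in the `id` / `values` / `metadata` shape that Pinecone's upsert call accepts and splits them into batches. The batch size of 100, the metadata fields, and the helper names are assumptions for illustration; the actual upload is shown only as a comment because it needs a live index handle.

```python
def to_records(movie_ids, vectors, titles):
    """Build one upsert record per movie. Pinecone IDs must be strings."""
    records = []
    for movie_id, vector, title in zip(movie_ids, vectors, titles):
        records.append({
            "id": str(movie_id),
            "values": [float(value) for value in vector],
            "metadata": {"title": title},  # hypothetical metadata layout
        })
    return records


def batched(records, batch_size=100):
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# With the Pinecone Python client and an existing `index` handle
# (both assumed here), the upload loop would look like:
#
#     for batch in batched(records):
#         index.upsert(vectors=batch)
```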

4. Model:

  • Script: model.py
  • Description: Contains the recommend function and supporting logic that calls the Pinecone API and queries the index for recommendations.
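A recommend function along these lines might look like the sketch below. It is an assumption-laden illustration rather than the contents of model.py: it expects an `index` object exposing Pinecone's `query(vector=..., top_k=..., include_metadata=...)` call, assumes each match carries a `title` in its metadata, and asks for one extra result so the query movie itself can be dropped.

```python
def recommend(index, query_vector, query_id, top_k=12):
    """Return up to top_k (title, score) pairs, excluding the query movie.

    Assumes `index` behaves like a Pinecone index handle and that
    vectors were stored with {"title": ...} metadata.
    """
    response = index.query(
        vector=list(query_vector),
        top_k=top_k + 1,  # one extra, since the movie matches itself
        include_metadata=True,
    )
    results = []
    for match in response["matches"]:
        if match["id"] == str(query_id):
            continue
        results.append((match["metadata"]["title"], match["score"]))
    return results[:top_k]
```

Passing the index in as an argument keeps the function easy to exercise with a stand-in object that returns canned matches, without network access.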

Model Evaluation

  • Metrics Used:
    • Cosine Similarity
  • Output: Recommends the top 12 matching movies for the given movie title.
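For a small sample, the ranking can be sanity-checked locally without Pinecone. The sketch below computes cosine similarity by brute force and returns the best matches for a title; the in-memory `vectors` mapping and the function names are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_title, vectors, k=12):
    """Rank every other title by similarity to query_title, best first."""
    query = vectors[query_title]
    scored = [
        (title, cosine_similarity(query, vector))
        for title, vector in vectors.items()
        if title != query_title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```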

Future Enhancements

Scalability:

  • Distributed Pinecone Setup: As more data is added, Pinecone's distributed architecture allows for seamless scaling of vector similarity search.
  • Expanding Dataset: Integrating more features like actor names, director names, and plot keywords could improve recommendation quality.
  • Hybrid Recommendation System: Future iterations could combine content-based filtering with collaborative filtering techniques (such as using user ratings) for better accuracy.

Possible Features:

  • User Profiles: Implementing user profiles to provide personalized recommendations based on past behavior.
  • Collaborative Filtering: Add collaborative filtering to include recommendations based on similar users’ tastes.
  • More Customization: Add more filters (e.g., release year, popularity) to fine-tune recommendations.

Hosted Link

Acknowledgments

  • Inspired by the diversity and enthusiasm of movie lovers who explore the vast world of entertainment.
  • Grateful to the open-source community for providing the tools, libraries, and resources that made this project possible.
