This project builds a hybrid recommendation engine that suggests movies a pair of users are both likely to enjoy, based on a combination of their individual preferences. It blends Collaborative Filtering and Content-Based Filtering to produce personalized movie recommendations, even in cold-start scenarios where certain movies have no rating data.
Our goal was to develop a system that recommends unseen movies to a pair of users, drawing on both their past ratings and movie content (such as genre and IMDb score). The hybrid approach offsets the limitations of each method used on its own, so recommendations remain available even when rating data is sparse.
- Python
- Pandas
- Scikit-learn
- scikit-surprise
- Matplotlib / Seaborn
- IMDb Top 1000 Dataset
- Simulated Ratings Dataset
- Collaborative Filtering (CF): Uses the SVD algorithm to predict how much a user would like a movie based on similar users' behavior.
- Content-Based Filtering (CB): Uses cosine similarity on genre vectors, combined with IMDb rating, to evaluate how well a movie aligns with the users' preferences.
- Hybrid Score: For each movie, we compute a weighted score that blends CF prediction and CB relevance.
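The content-based and hybrid steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names, the averaged pair profile, the `genre_weight` and `alpha` weights, the 1 to 5 rating scale, and the fallback to CB when no CF prediction exists are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shared_profile(profile_a, profile_b):
    """Assumed pair profile: element-wise mean of both users' genre vectors."""
    return [(x + y) / 2 for x, y in zip(profile_a, profile_b)]


def cb_relevance(profile, genre_vec, imdb_rating, genre_weight=0.7):
    """Assumed CB score: genre similarity mixed with IMDb rating scaled to [0, 1]."""
    return genre_weight * cosine(profile, genre_vec) + (1 - genre_weight) * imdb_rating / 10.0


def hybrid_score(cf_pred, cb_score, alpha=0.5, rating_max=5.0):
    """Weighted blend of CF and CB scores.

    cf_pred is a predicted rating on an assumed 1..rating_max scale, or None
    when the movie has no rating data (cold start), in which case the CB
    score is used on its own.
    """
    if cf_pred is None:
        return cb_score
    return alpha * (cf_pred / rating_max) + (1 - alpha) * cb_score


if __name__ == "__main__":
    # Hypothetical genre order: [Action, Comedy, Drama]
    pair = shared_profile([1, 0, 1], [1, 1, 0])
    cb = cb_relevance(pair, [1, 0, 0], imdb_rating=8.0)
    print(hybrid_score(4.0, cb))   # movie with a CF prediction
    print(hybrid_score(None, cb))  # cold-start movie, CB only
```

For real genre matrices, `sklearn.metrics.pairwise.cosine_similarity` computes the same similarity in bulk. The weights here are placeholders and would need tuning against held-out ratings.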
The system produces a ranked list of movies that:
- Neither user has rated yet
- Are predicted to match shared taste
- Are labeled by method (CF or CB)
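As a rough sketch of how such a ranked list could be assembled, the function below filters out movies either user has already rated and sorts the rest by score. The function name, the `scores` structure, and the placeholder titles are hypothetical and exist only for this example.

```python
def recommend_for_pair(movies, rated_a, rated_b, scores, top_n=5):
    """Rank movies that neither user has rated.

    movies:  iterable of movie titles
    rated_a: set of titles rated by the first user
    rated_b: set of titles rated by the second user
    scores:  dict mapping title -> (score, method), method being "CF" or "CB"
    Returns a list of (title, score, method) tuples, best first.
    """
    seen = set(rated_a) | set(rated_b)
    candidates = [
        (title, scores[title][0], scores[title][1])
        for title in movies
        if title not in seen and title in scores
    ]
    # Highest score first; Python's sort is stable, so ties keep input order.
    candidates.sort(key=lambda row: row[1], reverse=True)
    return candidates[:top_n]


if __name__ == "__main__":
    # Placeholder titles and scores, not real data.
    movies = ["Movie A", "Movie B", "Movie C", "Movie D"]
    scores = {
        "Movie A": (0.90, "CF"),
        "Movie B": (0.80, "CF"),
        "Movie C": (0.60, "CB"),
        "Movie D": (0.75, "CF"),
    }
    for row in recommend_for_pair(movies, {"Movie A"}, {"Movie B"}, scores):
        print(row)
```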
- CF performance may be limited when user rating history is sparse.
- Currently assumes access to movie metadata and at least basic user ratings.
- Install dependencies:
pip install pandas scikit-learn scikit-surprise matplotlib seaborn