🎬 Binary movie recommendation system (like/dislike) using Restricted Boltzmann Machines and PyTorch for probabilistic filtering (👍/👎)

Ahmadhammam03/movie-recommendation-rbm

🎬 Movie Recommendation System with Restricted Boltzmann Machines (RBM)

Binary movie recommendation system using Restricted Boltzmann Machines - predicts whether users will LIKE or DISLIKE movies with probabilistic deep learning.

A probabilistic deep learning approach to movie recommendations using Restricted Boltzmann Machines (RBM) implemented in PyTorch. This project applies an energy-based model to collaborative filtering with binary rating predictions (like/dislike), which suits thumbs-up/thumbs-down recommendation systems.

🎯 Looking for rating predictions (1-5 stars)? Check out my Stacked AutoEncoder implementation for continuous rating predictions!

🌟 Features

  • Probabilistic Model: RBM with Gibbs sampling for recommendation generation
  • Binary Classification: Converts ratings to liked/not-liked predictions
  • Energy-Based Learning: Unsupervised feature learning through energy minimization
  • Contrastive Divergence: CD-k algorithm for efficient training
  • PyTorch Implementation: Modern deep learning framework with GPU support

📊 Results

  • Training Loss: ~0.247 after 10 epochs
  • Test Loss: ~0.227 (comparable to the training loss, so no sign of overfitting)
  • Binary Accuracy: High precision in like/dislike predictions
  • Architecture: [nb_movies → 100 hidden units]
  • Training Time: ~2 minutes for 10 epochs

🚀 Quick Start

Prerequisites

# Python 3.7 or higher
python --version

# Install required packages
pip install -r requirements.txt

Installation

  1. Clone the repository:
git clone https://github.com/Ahmadhammam03/movie-recommendation-rbm.git
cd movie-recommendation-rbm
  2. Download the MovieLens datasets:
# Create data directory
mkdir -p data/ml-1m data/ml-100k

# Download datasets (or use provided links)
# ML-1M: https://grouplens.org/datasets/movielens/1m/
# ML-100K: https://grouplens.org/datasets/movielens/100k/
  3. Run the training:
python train_rbm.py

πŸ“ Project Structure

movie-recommendation-rbm/
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ README.md          # Dataset documentation
β”‚   β”œβ”€β”€ ml-1m/
β”‚   β”‚   β”œβ”€β”€ movies.dat
β”‚   β”‚   β”œβ”€β”€ ratings.dat
β”‚   β”‚   └── users.dat
β”‚   └── ml-100k/
β”‚       β”œβ”€β”€ u1.base
β”‚       β”œβ”€β”€ u1.test
β”‚       └── ...
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ README.md          # Model storage documentation
β”‚   β”œβ”€β”€ best_rbm_model.pth # Trained model (after running main.py)
β”‚   └── checkpoints/       # Training checkpoints
β”œβ”€β”€ notebooks/
β”‚   └── rbm.ipynb          # Jupyter notebook implementation
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ __init__.py        # Package initialization
β”‚   β”œβ”€β”€ model.py           # RBM model architecture
β”‚   β”œβ”€β”€ data_loader.py     # Data preprocessing utilities
β”‚   └── trainer.py         # Training logic & experiment class
β”œβ”€β”€ main.py                # Main script to run full pipeline
β”œβ”€β”€ test_recommendations.py # Script to test trained model
β”œβ”€β”€ train_rbm.py           # Alternative training script
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ LICENSE                # MIT License
└── README.md              # Project documentation

🔧 Model Architecture

Restricted Boltzmann Machine

Visible Layer: nb_movies neurons (binary ratings)
Hidden Layer: 100 neurons (learned features)
Weights: W (100 x nb_movies)
Visible Bias: b (1 x nb_movies)
Hidden Bias: a (1 x 100)
Activation: Sigmoid
Sampling: Gibbs sampling with k=10 steps

Key Differences from Autoencoders:

  • Stochastic vs Deterministic: RBMs use probabilistic sampling
  • Energy-Based: Learns joint probability distribution
  • Binary Outputs: Natural for like/dislike recommendations
  • Bidirectional: Can generate both hidden and visible states
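The bidirectional pass is also what turns a trained RBM into a recommender: push a user's known likes up to the hidden layer, come back down, and rank the movies the user has not rated. The sketch below is illustrative only. The function name `recommend`, the use of probabilities instead of Bernoulli samples (so the output is deterministic), and the choice to treat unrated entries as 0 on the upward pass are all assumptions, not the repository's actual `test_recommendations.py` logic.

```python
import torch

def recommend(W, a, b, user_row, top_n=3):
    """Rank a user's unrated movies (-1 entries) by reconstructed like-probability.

    Hypothetical helper. W has shape (nh, nv), a is (1, nh), b is (1, nv),
    and user_row is (1, nv) with entries in {-1, 0, 1}.
    """
    # Assumption for this sketch: unrated entries count as 0 on the upward pass.
    v = user_row.clamp(min=0.0)
    p_h = torch.sigmoid(v @ W.t() + a)      # visible -> hidden
    p_v = torch.sigmoid(p_h @ W + b)        # hidden -> visible
    scores = p_v.squeeze(0).clone()
    # Exclude movies the user already rated from the ranking.
    scores[user_row.squeeze(0) >= 0] = float("-inf")
    n = min(top_n, int((user_row < 0).sum()))
    return torch.topk(scores, n).indices.tolist()

# Toy model: one hidden feature that favours movies 0-2 and disfavours movie 3.
W = torch.tensor([[2.0, 2.0, 2.0, -2.0]])
a = torch.zeros(1, 1)
b = torch.zeros(1, 4)
user = torch.tensor([[1.0, -1.0, 1.0, -1.0]])   # liked 0 and 2, has not rated 1 and 3

ranking = recommend(W, a, b, user, top_n=2)
print(ranking)  # [1, 3]
```

Using probabilities rather than samples keeps the ranking stable between runs; a sampled version would follow the same two passes with `torch.bernoulli` applied at each layer.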

💻 Usage

Basic Training

from src.trainer import RBMExperiment

# Initialize experiment
experiment = RBMExperiment(data_path="data/", model_save_path="models/")

# Run complete pipeline
experiment.load_and_prepare_data(binary=True)
experiment.initialize_model(n_hidden=100)
experiment.train_model(nb_epochs=10)
experiment.evaluate_model()
experiment.save_model("best_rbm_model.pth")

Quick Start with Main Script

# Train the model
python main.py

# Test recommendations
python test_recommendations.py

Custom Configuration

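# Import paths are assumed from the project structure (src/model.py, src/trainer.py)
from src.model import RBM
from src.trainer import RBMTrainer

# Modify architecture (nb_movies is the number of movies in the loaded dataset)
rbm = RBM(
    nv=nb_movies,
    nh=200,              # More hidden units
    k=15                 # More Gibbs sampling steps
)

# Different learning parameters
trainer = RBMTrainer(
    rbm,
    batch_size=50,
    learning_rate=0.01,
    momentum=0.9
)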

📈 Training Details

Hyperparameters

  • Hidden Units: 100
  • Batch Size: 100
  • Epochs: 10
  • Gibbs Steps (k): 10
  • Learning Rate: Implicit (weight updates)

Binary Rating Conversion

Ratings 1-2 → 0 (Not Liked)
Ratings 3-5 → 1 (Liked)
Rating 0    → -1 (Not Rated)
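A minimal sketch of this mapping on a PyTorch tensor is shown below. The function name `binarize_ratings` is hypothetical and not taken from `src/data_loader.py`. The masks are built from the original ratings, so an entry that has already been rewritten (for example a 0 turned into -1) cannot be matched again by a later rule.

```python
import torch

def binarize_ratings(ratings: torch.Tensor) -> torch.Tensor:
    """Map 0 -> -1 (not rated), 1-2 -> 0 (not liked), 3-5 -> 1 (liked)."""
    out = ratings.clone().float()
    # All masks come from the untouched input, so assignment order does not matter.
    out[ratings == 0] = -1.0
    out[(ratings == 1) | (ratings == 2)] = 0.0
    out[ratings >= 3] = 1.0
    return out

# One user's row: unrated, 1 star, 2 stars, 3 stars, 5 stars
row = torch.tensor([0, 1, 2, 3, 5])
binary_row = binarize_ratings(row)
print(binary_row.tolist())  # [-1.0, 0.0, 0.0, 1.0, 1.0]
```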

Loss Progression

| Epoch | Training Loss |
|-------|---------------|
| 1     | 0.3432        |
| 5     | 0.2471        |
| 10    | 0.2475        |

🎯 Key Algorithms

Contrastive Divergence (CD-k)

  1. Positive Phase: Sample hidden units from visible data
  2. Negative Phase: Reconstruct visible units after k Gibbs steps
  3. Weight Update: Difference between positive and negative statistics
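The three steps can be sketched as a single update function. This is an illustration under stated assumptions, not the repository's trainer: the name `cd_k_update`, the explicit `lr` argument, and the averaging over the batch are choices made here, and the toy batch contains only 0/1 entries, so the handling of unrated (-1) entries is left out.

```python
import torch

def cd_k_update(W, a, b, v0, k=10, lr=0.01):
    """One in-place CD-k update. W: (nh, nv), a: (1, nh), b: (1, nv), v0: (batch, nv)."""
    # Positive phase: hidden probabilities given the data.
    ph0 = torch.sigmoid(v0 @ W.t() + a)
    # Negative phase: k alternating Gibbs steps starting from the data.
    vk = v0.clone()
    for _ in range(k):
        hk = torch.bernoulli(torch.sigmoid(vk @ W.t() + a))
        vk = torch.bernoulli(torch.sigmoid(hk @ W + b))
    phk = torch.sigmoid(vk @ W.t() + a)
    # Update: data statistics minus reconstruction statistics, averaged over the batch.
    n = v0.shape[0]
    W += lr * (ph0.t() @ v0 - phk.t() @ vk) / n
    b += lr * (v0 - vk).mean(dim=0, keepdim=True)
    a += lr * (ph0 - phk).mean(dim=0, keepdim=True)
    return vk

torch.manual_seed(0)
nv, nh = 6, 3
W = 0.1 * torch.randn(nh, nv)
a = torch.zeros(1, nh)
b = torch.zeros(1, nv)
batch = torch.tensor([[1., 0., 1., 1., 0., 0.],
                      [0., 0., 1., 1., 1., 0.]])

vk = cd_k_update(W, a, b, batch, k=10)
print("mean reconstruction error:", (batch - vk).abs().mean().item())
```

Passing `k=1` gives the cheaper CD-1 variant; larger `k` runs the chain longer before the negative statistics are collected.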

Gibbs Sampling

# Sample hidden given visible
p(h|v) = sigmoid(v·W^T + a)
h ~ Bernoulli(p(h|v))

# Sample visible given hidden
p(v|h) = sigmoid(h·W + b)
v ~ Bernoulli(p(v|h))
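In PyTorch the two conditionals translate almost line for line. The class below is a hypothetical stand-in (`TinyRBM` is not the class in `src/model.py`) that stores W with shape (hidden x visible), as in the Model Architecture section, and uses `torch.bernoulli` to draw the binary states.

```python
import torch

class TinyRBM:
    """Hypothetical minimal RBM holding only what Gibbs sampling needs."""

    def __init__(self, nv: int, nh: int):
        self.W = torch.randn(nh, nv)   # weights, shape (nh, nv)
        self.a = torch.randn(1, nh)    # hidden bias
        self.b = torch.randn(1, nv)    # visible bias

    def sample_h(self, v):
        # p(h|v) = sigmoid(v W^T + a), then a Bernoulli draw per hidden unit
        p_h = torch.sigmoid(v @ self.W.t() + self.a)
        return p_h, torch.bernoulli(p_h)

    def sample_v(self, h):
        # p(v|h) = sigmoid(h W + b), then a Bernoulli draw per visible unit
        p_v = torch.sigmoid(h @ self.W + self.b)
        return p_v, torch.bernoulli(p_v)

torch.manual_seed(0)
rbm = TinyRBM(nv=6, nh=3)
v0 = torch.tensor([[1., 0., 1., 1., 0., 0.]])

p_h, h = rbm.sample_h(v0)   # one half-step: visible -> hidden
p_v, v1 = rbm.sample_v(h)   # other half-step: hidden -> visible
print(tuple(p_h.shape), tuple(v1.shape))  # (1, 3) (1, 6)
```

One call to `sample_h` followed by one call to `sample_v` is a single Gibbs step; repeating the pair k times gives the k-step chain used during training.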

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • GroupLens for providing the MovieLens dataset
  • Geoffrey Hinton for pioneering work on RBMs and Deep Belief Networks
  • PyTorch community for the excellent deep learning framework

πŸ‘¨β€πŸ’» Author

Ahmad Hammam

🔬 Theory Behind RBMs

Energy Function

E(v,h) = -b^T·v - a^T·h - h^T·W·v

Probability Distribution

P(v,h) = exp(-E(v,h)) / Z

where Z = Σ_{v,h} exp(-E(v,h)) is the partition function (the normalization constant), summed over all joint configurations of v and h.
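For a toy RBM the partition function can be computed by brute force, which makes the definition concrete. The sketch below uses made-up weights for 3 visible and 2 hidden units and plain Python lists. With W stored as (hidden x visible), as in the Model Architecture section, the interaction term is the sum over h_j · W[j][i] · v_i. Enumeration is only feasible at this scale, since the number of states grows as 2^(nv + nh), which is why training relies on Contrastive Divergence instead.

```python
import itertools
import math

def energy(v, h, W, a, b):
    """Energy of a joint state; W is a list of nh rows with nv weights each."""
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = sum(aj * hj for aj, hj in zip(a, h))
    interaction = sum(h[j] * W[j][i] * v[i]
                      for j in range(len(h)) for i in range(len(v)))
    return -visible_term - hidden_term - interaction

# Illustrative parameters (not learned from data): nh = 2, nv = 3
W = [[0.5, -0.3, 0.8], [-0.2, 0.4, 0.1]]
a = [0.1, -0.1]        # hidden bias
b = [0.0, 0.2, -0.2]   # visible bias

# Every joint binary configuration (v, h): 2^3 * 2^2 = 32 states
states = [(v, h)
          for v in itertools.product([0, 1], repeat=3)
          for h in itertools.product([0, 1], repeat=2)]

Z = sum(math.exp(-energy(v, h, W, a, b)) for v, h in states)
probs = {s: math.exp(-energy(*s, W, a, b)) / Z for s in states}

print(len(states), round(sum(probs.values()), 6))  # 32 1.0
```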

Learning Rule

ΔW = ε(<h·v^T>_data - <h·v^T>_model)

📊 Comparison: RBM vs SAE

| Feature      | RBM (This Project)            | SAE (Link)                      |
|--------------|-------------------------------|---------------------------------|
| Output Type  | Binary (Like/Dislike)         | Continuous (1-5 Stars)          |
| Use Case     | Thumbs Up/Down Systems        | Rating Prediction               |
| Model Type   | Generative (Energy-based)     | Discriminative (Reconstruction) |
| Learning     | Probabilistic Sampling        | Deterministic Encoding          |
| Architecture | Bipartite Graph               | Multi-layer Network             |
| Training     | Contrastive Divergence        | Backpropagation                 |
| Best For     | Binary preferences, discovery | Precise rating prediction       |

When to Use Which:

  • 🔥 RBM (This Project): Netflix-style thumbs up/down, Spotify-like discovery, binary feedback systems
  • ⭐ SAE (Other Project): Amazon-style star ratings, detailed preference modeling, rating prediction

⭐ If you find this project useful, please consider giving it a star!
