Binary movie recommendation system using Restricted Boltzmann Machines - predicts whether users will LIKE or DISLIKE movies with probabilistic deep learning.
A probabilistic deep learning approach to movie recommendations using Restricted Boltzmann Machines (RBM) implemented in PyTorch. This project demonstrates energy-based models for collaborative filtering with binary rating predictions (like/dislike), perfect for thumbs-up/thumbs-down recommendation systems.
🎯 Looking for rating predictions (1-5 stars)? Check out my Stacked AutoEncoder implementation for continuous rating predictions!
- Probabilistic Model: RBM with Gibbs sampling for recommendation generation
- Binary Classification: Converts ratings to liked/not-liked predictions
- Energy-Based Learning: Unsupervised feature learning through energy minimization
- Contrastive Divergence: CD-k algorithm for efficient training
- PyTorch Implementation: Modern deep learning framework with GPU support
- Training Loss: ~0.247 after 10 epochs
- Test Loss: ~0.227 (slightly below the training loss, so no sign of overfitting)
- Binary Accuracy: High precision in like/dislike predictions
- Architecture: [nb_movies → 100 hidden units]
- Training Time: ~2 minutes for 10 epochs
# Python 3.7 or higher
python --version
# Install required packages
pip install -r requirements.txt
- Clone the repository:
git clone https://github.com/Ahmadhammam03/movie-recommendation-rbm.git
cd movie-recommendation-rbm
- Download the MovieLens datasets:
# Create data directory
mkdir -p data/ml-1m data/ml-100k
# Download datasets (or use provided links)
# ML-1M: https://grouplens.org/datasets/movielens/1m/
# ML-100K: https://grouplens.org/datasets/movielens/100k/
- Run the training:
python train_rbm.py
movie-recommendation-rbm/
├── data/
│   ├── README.md                # Dataset documentation
│   ├── ml-1m/
│   │   ├── movies.dat
│   │   ├── ratings.dat
│   │   └── users.dat
│   └── ml-100k/
│       ├── u1.base
│       ├── u1.test
│       └── ...
├── models/
│   ├── README.md                # Model storage documentation
│   ├── best_rbm_model.pth       # Trained model (after running main.py)
│   └── checkpoints/             # Training checkpoints
├── notebooks/
│   └── rbm.ipynb                # Jupyter notebook implementation
├── src/
│   ├── __init__.py              # Package initialization
│   ├── model.py                 # RBM model architecture
│   ├── data_loader.py           # Data preprocessing utilities
│   └── trainer.py               # Training logic & experiment class
├── main.py                      # Main script to run full pipeline
├── test_recommendations.py      # Script to test trained model
├── train_rbm.py                 # Alternative training script
├── requirements.txt             # Python dependencies
├── LICENSE                      # MIT License
└── README.md                    # Project documentation
Visible Layer: nb_movies neurons (binary ratings)
Hidden Layer: 100 neurons (learned features)
Weights: W (100 x nb_movies)
Visible Bias: b (1 x nb_movies)
Hidden Bias: a (1 x 100)
Activation: Sigmoid
Sampling: Gibbs sampling with k=10 steps
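To make the parameter shapes above concrete, here is a small sketch that tabulates them and counts the trainable values. The helper names are hypothetical, and `nb_movies=1682` (the ML-100K catalogue size) is only an example value; the project derives `nb_movies` from whichever dataset is loaded.

```python
def rbm_parameter_shapes(nb_movies, nb_hidden=100):
    """Return the shape of each RBM parameter (hypothetical helper)."""
    return {
        "W": (nb_hidden, nb_movies),  # one row of weights per hidden unit
        "b": (1, nb_movies),          # visible bias
        "a": (1, nb_hidden),          # hidden bias
    }


def count_parameters(shapes):
    """Total number of trainable values across all parameters."""
    return sum(rows * cols for rows, cols in shapes.values())


# Example value only: ML-100K contains 1682 movies
shapes = rbm_parameter_shapes(nb_movies=1682)
total = count_parameters(shapes)
print(shapes["W"], total)  # (100, 1682) 169982
```

Almost all of the parameters sit in `W`, so the model size grows linearly with both the catalogue size and the number of hidden units.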
- Stochastic rather than Deterministic: Unit states are sampled from probabilities instead of computed as fixed activations
- Energy-Based: Learns a joint probability distribution over visible and hidden units
- Binary Outputs: Natural for like/dislike recommendations
- Bidirectional: The same weights map visible states to hidden states and back
from src.trainer import RBMExperiment
# Initialize experiment
experiment = RBMExperiment(data_path="data/", model_save_path="models/")
# Run complete pipeline
experiment.load_and_prepare_data(binary=True)
experiment.initialize_model(n_hidden=100)
experiment.train_model(nb_epochs=10)
experiment.evaluate_model()
experiment.save_model("best_rbm_model.pth")
# Train the model
python main.py
# Test recommendations
python test_recommendations.py
# Modify architecture
rbm = RBM(
    nv=nb_movies,
    nh=200,  # More hidden units
    k=15     # More Gibbs sampling steps
)
# Different learning parameters
trainer = RBMTrainer(
    rbm,
    batch_size=50,
    learning_rate=0.01,
    momentum=0.9
)
- Hidden Units: 100
- Batch Size: 100
- Epochs: 10
- Gibbs Steps (k): 10
- Learning Rate: Implicit (weight updates)
Ratings 1-2 → 0 (Not Liked)
Ratings 3-5 → 1 (Liked)
Rating 0 → -1 (Not Rated)
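The mapping above can be sketched in a few lines of plain Python. This is an illustration rather than the project's `data_loader.py`: the function names are hypothetical, the input is assumed to be `(user_id, movie_id, rating)` triples with 1-based ids, and plain lists stand in for the tensors the project uses.

```python
def binarize_rating(rating):
    """Map a raw rating (0 = missing, 1-5 = stars) to the binary scheme."""
    if rating == 0:
        return -1  # not rated
    if rating <= 2:
        return 0   # not liked
    return 1       # liked


def build_user_vectors(rows, nb_users, nb_movies):
    """Build one binarized rating vector per user.

    rows: iterable of (user_id, movie_id, rating) with 1-based ids (assumed).
    """
    matrix = [[0] * nb_movies for _ in range(nb_users)]
    for user_id, movie_id, rating in rows:
        matrix[user_id - 1][movie_id - 1] = rating
    return [[binarize_rating(r) for r in user] for user in matrix]


# Toy data standing in for a real ratings file
rows = [(1, 1, 5), (1, 3, 2), (2, 2, 3), (2, 3, 1)]
vectors = build_user_vectors(rows, nb_users=2, nb_movies=3)
print(vectors)  # [[1, -1, 0], [-1, 1, 0]]
```

Keeping unrated movies at -1 instead of 0 lets later steps tell "not rated" apart from "not liked".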
Epoch | Training Loss |
---|---|
1 | 0.3432 |
5 | 0.2471 |
10 | 0.2475 |
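The README does not state how the training loss is computed. One common choice for RBM recommenders, assumed here purely for illustration, is the mean absolute difference between the original and reconstructed vectors, restricted to movies the user actually rated. The function name is hypothetical.

```python
def masked_mean_abs_error(original, reconstructed):
    """Mean |v0 - vk| over rated entries only (v0 >= 0). Assumed metric."""
    errors = [abs(v0 - vk)
              for v0, vk in zip(original, reconstructed)
              if v0 >= 0]
    if not errors:
        raise ValueError("no rated entries to score")
    return sum(errors) / len(errors)


v0 = [1, -1, 0, 1]  # -1 marks an unrated movie and is skipped
vk = [1, 1, 1, 1]   # binary reconstruction after Gibbs sampling
loss = masked_mean_abs_error(v0, vk)
print(round(loss, 4))  # 0.3333
```

Under that assumption, and with binary reconstructions, a loss of 0.2475 would mean that roughly one in four rated entries is reconstructed incorrectly.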
- Positive Phase: Sample hidden units from visible data
- Negative Phase: Reconstruct visible units after k Gibbs steps
- Weight Update: Difference between positive and negative statistics
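The weight update in the last step can be sketched for a single training vector as follows. This is a dependency-free illustration, not the project's trainer: `cd_update` and its `lr` argument are hypothetical, the hidden probabilities `ph0 = p(h|v0)` and `phk = p(h|vk)` are assumed to be computed already, and the visible vectors are assumed to contain only 0/1 entries.

```python
def outer(col, row):
    """Outer product of two vectors as a nested list."""
    return [[c * r for r in row] for c in col]


def cd_update(W, a, b, v0, ph0, vk, phk, lr=1.0):
    """One contrastive-divergence update for a single example.

    W: nh x nv weights, a: hidden bias (nh), b: visible bias (nv)
    v0/ph0: data vector and p(h|v0); vk/phk: vector after k Gibbs
    steps and p(h|vk). lr is a hypothetical step size.
    """
    positive = outer(ph0, v0)  # statistics from the data
    negative = outer(phk, vk)  # statistics from the reconstruction
    new_W = [[w + lr * (p - n) for w, p, n in zip(w_row, p_row, n_row)]
             for w_row, p_row, n_row in zip(W, positive, negative)]
    new_a = [ai + lr * (p - n) for ai, p, n in zip(a, ph0, phk)]
    new_b = [bi + lr * (x0 - xk) for bi, x0, xk in zip(b, v0, vk)]
    return new_W, new_a, new_b


W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
a = [0.0, 0.0]
b = [0.0, 0.0, 0.0]
v0, ph0 = [1, 0, 1], [0.5, 0.5]
vk, phk = [1, 1, 0], [0.5, 0.5]
W, a, b = cd_update(W, a, b, v0, ph0, vk, phk)
print(W)  # [[0.0, -0.5, 0.5], [0.0, -0.5, 0.5]]
print(b)  # [0.0, -1.0, 1.0]
```

Weights rise where the data is active and the reconstruction is not, and fall in the opposite case; where both agree, nothing changes.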
# Sample hidden given visible
p(h|v) = sigmoid(v·W^T + a)
h ~ Bernoulli(p(h|v))
# Sample visible given hidden
p(v|h) = sigmoid(h·W + b)
v ~ Bernoulli(p(v|h))
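The two sampling rules above translate into code almost line for line. The sketch below uses plain Python lists instead of PyTorch tensors so it runs without dependencies. The function names are illustrative, and freezing unrated entries (-1) during the chain is an assumption about how missing ratings are handled, not something the README confirms.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_h(v, W, a, rng):
    """p(h|v) = sigmoid(v·W^T + a), then one Bernoulli draw per hidden unit."""
    probs = [sigmoid(sum(w * x for w, x in zip(row, v)) + a_j)
             for row, a_j in zip(W, a)]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_v(h, W, b, rng):
    """p(v|h) = sigmoid(h·W + b), then one Bernoulli draw per visible unit."""
    probs = [sigmoid(sum(h[j] * W[j][i] for j in range(len(h))) + b[i])
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def gibbs_chain(v0, W, a, b, k, rng):
    """Alternate hidden and visible sampling for k steps, starting at v0."""
    vk = list(v0)
    for _ in range(k):
        _, h = sample_h(vk, W, a, rng)
        _, vk = sample_v(h, W, b, rng)
        # Assumption: unrated entries (-1) stay frozen at their original value
        vk = [x0 if x0 < 0 else xk for x0, xk in zip(v0, vk)]
    return vk


rng = random.Random(0)           # seeded for reproducibility
W = [[0.5, -0.3, 0.2],           # 2 hidden units x 3 movies
     [-0.1, 0.4, 0.1]]
a = [0.0, 0.0]
b = [0.0, 0.0, 0.0]
v0 = [1, -1, 0]                  # liked, unrated, not liked
vk = gibbs_chain(v0, W, a, b, k=10, rng=rng)
print(vk)
```

Because every step draws from a Bernoulli distribution, different seeds give different reconstructions for the same input.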
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- GroupLens for providing the MovieLens dataset
- Geoffrey Hinton for pioneering work on RBMs and Deep Belief Networks
- PyTorch community for the excellent deep learning framework
Ahmad Hammam
- GitHub: @Ahmadhammam03
- LinkedIn: Ahmad Hammam
$$E(v, h) = -b^\top v - a^\top h - h^\top W v$$

$$P(v, h) = \frac{\exp(-E(v, h))}{Z}$$
Where Z is the partition function (normalization constant)
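For a tiny RBM the partition function can be computed by brute force, which makes the definitions above easy to check. The sketch below takes `W` with shape (hidden × visible), matching the architecture listing, so the interaction term is evaluated as h^T·W·v. The function names are illustrative, and full enumeration is only feasible for a handful of units.

```python
import itertools
import math


def energy(v, h, W, a, b):
    """E(v,h) = -b·v - a·h - h^T W v, with W of shape (n_hidden, n_visible)."""
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = sum(aj * hj for aj, hj in zip(a, h))
    interaction = sum(h[j] * W[j][i] * v[i]
                      for j in range(len(h)) for i in range(len(v)))
    return -visible_term - hidden_term - interaction


def joint_probabilities(W, a, b):
    """Enumerate every binary (v, h) state and normalize exp(-E) by Z."""
    nv, nh = len(b), len(a)
    states = [(v, h)
              for v in itertools.product([0, 1], repeat=nv)
              for h in itertools.product([0, 1], repeat=nh)]
    unnormalized = {s: math.exp(-energy(s[0], s[1], W, a, b)) for s in states}
    Z = sum(unnormalized.values())
    return {s: u / Z for s, u in unnormalized.items()}, Z


W = [[1.0, -1.0]]  # 1 hidden unit, 2 visible units
a = [0.0]
b = [0.0, 0.0]
probs, Z = joint_probabilities(W, a, b)
print(energy((1, 0), (1,), W, a, b))  # -1.0
print(round(sum(probs.values()), 6))  # 1.0
```

Real models have far too many states to enumerate, since the number of configurations doubles with every unit. That is why training relies on sampling instead of evaluating Z directly.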
$$\Delta W = \varepsilon \left( \langle h v^\top \rangle_{\text{data}} - \langle h v^\top \rangle_{\text{model}} \right)$$
Feature | RBM (This Project) | SAE (Link) |
---|---|---|
Output Type | Binary (Like/Dislike) | Continuous (1-5 Stars) |
Use Case | Thumbs Up/Down Systems | Rating Prediction |
Model Type | Generative (Energy-based) | Discriminative (Reconstruction) |
Learning | Probabilistic Sampling | Deterministic Encoding |
Architecture | Bipartite Graph | Multi-layer Network |
Training | Contrastive Divergence | Backpropagation |
Best For | Binary preferences, discovery | Precise rating prediction |
- 🎥 RBM (This Project): Netflix-style thumbs up/down, Spotify-like discovery, binary feedback systems
- ⭐ SAE (Other Project): Amazon-style star ratings, detailed preference modeling, rating prediction
⭐ If you find this project useful, please consider giving it a star!