An end-to-end modular pipeline for book recommendation using clustering techniques, built with best practices in MLOps, data engineering, and cloud deployment.
- Modular, maintainable pipeline components (data ingestion, validation, transformation, training, prediction)
- Data versioning and pipeline orchestration using DVC for reproducibility
- Secure management of credentials via AWS Secrets Manager
- Containerised with Docker for easy deployment on cloud platforms (e.g. AWS EC2)
- Interactive Streamlit app interface for exploring book recommendations
- Automated download of datasets from Kaggle and external sources
- Comprehensive logging and exception handling for robustness
- Git: https://git-scm.com/
- Data link: https://www.kaggle.com/datasets/ra4u12/bookrecommendation
- Config file (Constants)
- Config Entity (Return values)
- App Config (Read config file)
- Components (Pipeline code files)
- Pipeline (Run components)
- Main file (run pipeline)
- App file (User interface)
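
The layers listed above can be sketched roughly as follows. This is a minimal illustration of the pattern, not the repository's actual code: every name here (`ARTIFACTS_DIR`, `DataIngestionConfig`, `AppConfiguration`, `DataIngestion`, `TrainingPipeline`, `PipelineException`, `main`) is hypothetical, and the ingestion step writes a placeholder file instead of downloading the Kaggle dataset.

```python
import logging
from dataclasses import dataclass
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("book_recommender")

# Config file (constants): hypothetical values, for illustration only.
ARTIFACTS_DIR = "artifacts"
RAW_DATA_FILE = "books.csv"


class PipelineException(Exception):
    """Wraps an error together with the pipeline stage where it occurred."""

    def __init__(self, stage: str, error: Exception):
        super().__init__(f"[{stage}] {type(error).__name__}: {error}")
        self.stage = stage


# Config entity: a typed, immutable bundle of values handed to one component.
@dataclass(frozen=True)
class DataIngestionConfig:
    raw_data_path: Path


# App config: turns the constants into config entities.
class AppConfiguration:
    def __init__(self, artifacts_dir: str = ARTIFACTS_DIR):
        self.artifacts_dir = Path(artifacts_dir)

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        return DataIngestionConfig(raw_data_path=self.artifacts_dir / RAW_DATA_FILE)


# Component: one pipeline stage, driven entirely by its config entity.
class DataIngestion:
    def __init__(self, config: DataIngestionConfig):
        self.config = config

    def run(self) -> Path:
        try:
            logger.info("Ingesting data into %s", self.config.raw_data_path)
            self.config.raw_data_path.parent.mkdir(parents=True, exist_ok=True)
            # Placeholder for the real download step (assumption).
            self.config.raw_data_path.write_text("title,author\n")
            return self.config.raw_data_path
        except Exception as e:
            raise PipelineException("data_ingestion", e) from e


# Pipeline: wires components together in order.
class TrainingPipeline:
    def __init__(self, app_config: AppConfiguration):
        self.app_config = app_config

    def start(self) -> Path:
        ingestion = DataIngestion(self.app_config.get_data_ingestion_config())
        return ingestion.run()


# Main file: the entry point that runs the pipeline.
def main(artifacts_dir: str = ARTIFACTS_DIR) -> Path:
    return TrainingPipeline(AppConfiguration(artifacts_dir)).start()
```

Calling `main()` runs the single ingestion stage and returns the path of the file it wrote. Further stages (validation, transformation, training) would follow the same shape: a config entity, a getter on the app config, and a component with a `run` method.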
Clone the repository:
git clone https://github.com/razyousuf/Book-Recommendation-Clustering-Pipeline.git
conda create -n book python=3.10 -y
conda activate book
pip install -r requirements.txt
Now run the app:
streamlit run app.py
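Note: Open and map the port the app listens on. This note originally specified 8080, while the `docker run` command below publishes 8501 (the Streamlit default), so use whichever port the container actually serves on.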
sudo apt-get update -y
sudo apt-get upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
git clone "your-project-repository-url"
docker build -t raz/app:latest .
docker images -a
docker run -d -p 8501:8501 raz/app
docker ps
docker stop <container_id>
docker rm $(docker ps -a -q)
docker login
docker push raz/app:latest
docker rmi raz/app:latest
docker pull raz/app
git add .
git commit -m "Updated"
git push origin main