A production-inspired portfolio project to build, monitor, and scale a real-time recommender system — starting locally, scaling to bare-metal Kubernetes.
This project is intentionally structured to reflect the real-world constraints and decisions machine learning engineers face every day. We begin by developing locally with Docker Compose, because the homelab cluster isn’t available yet. But rather than view this as a blocker, we treat it as a learning opportunity.
The Kubernetes cluster isn't up yet — and that’s by design.
We want to build this system step-by-step and understand each part in isolation before deploying everything into a distributed environment. This mirrors how real-world systems evolve:
- Local development first
- Component-level testing and instrumentation
- Gradual migration to production-like infrastructure
By doing this:
- We gain confidence in the architecture
- We build robust observability practices from day one
- We simulate production workflows locally, so scaling up becomes a matter of deployment — not rework
After years of working on large-scale recommendation pipelines at Tumblr, I wanted to rebuild the fundamentals in a hands-on, public way — from model training to serving to observability — and share the learning journey openly.
This project is not just about coding. It's about thinking like an ML systems engineer.
- API Design — How do you expose a real-time recommender over HTTP?
- Caching & Retrieval — How do you store and retrieve precomputed candidates with speed?
- Batch Processing — What does data prep look like using PySpark?
- Monitoring — What metrics matter? How do you collect them?
- Containerization — How do you compose this into a working system?
- DevOps Readiness — How do you go from "it works on my laptop" to "it works on a cluster"?
And eventually...
- Migration to Kubernetes — How do you take a working local system and deploy it to a real cluster with production-like configs?
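To make the serving question concrete, here is a minimal, framework-free sketch of the core endpoint logic: look up precomputed candidates for a user, and fall back to popular items for cold-start users. The names (`get_recommendations`, `CANDIDATE_CACHE`, `POPULAR_ITEMS`) are illustrative assumptions, not the project's actual API — in the real service the cache is Redis and the entry point is a FastAPI route.

```python
# Sketch of the recommendation-serving logic, independent of any web
# framework. A dict stands in for Redis so the idea is runnable as-is.

# Precomputed candidates per user (normally written by the batch job).
CANDIDATE_CACHE = {
    "user_1": ["item_42", "item_7", "item_99"],
}

# Global fallback for users with no precomputed candidates (cold-start).
POPULAR_ITEMS = ["item_1", "item_2", "item_3"]

def get_recommendations(user_id: str, k: int = 3) -> list[str]:
    """Return up to k candidates, falling back to popular items."""
    candidates = CANDIDATE_CACHE.get(user_id, POPULAR_ITEMS)
    return candidates[:k]
```

Swapping the dict for a Redis `GET` (and the function for a FastAPI path operation) is the deployment step, not a redesign — which is exactly the point of building the components in isolation first.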
- Project initialized
- FastAPI recommendation endpoint
- Redis + embedding storage
- Prometheus + Grafana monitoring
- CTR simulation + A/B framework
- K8s deployment on Homelab
- Write-up and video walkthrough
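For the CTR simulation and A/B framework item above, a useful starting point is deterministic bucket assignment: hash the user id together with the experiment name so each user lands in a stable variant with no stored state. This is a sketch under assumptions; the project's eventual framework may assign buckets differently.

```python
import hashlib

def ab_bucket(user_id: str, experiment: str,
              variants: tuple[str, ...] = ("control", "treatment")) -> str:
    """Deterministically assign a user to a variant.

    Hashing (experiment, user_id) keeps assignments stable across requests
    and independent across experiments, without any assignment storage.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Because the hash is uniform, large user populations split roughly evenly across variants, and re-running the simulation reproduces the same assignments.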
- FastAPI – high-performance Python API
- Redis / Faiss – fast similarity search
- PySpark – candidate generation / batch prep
- Docker – containerized microservices
- Prometheus – metrics collection
- Grafana – dashboards and observability
- Locust – load testing & traffic simulation
- Docker Compose locally
- Prefect / Airflow – orchestration
- GitHub Actions – CI/CD for testing & linting
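Wired together with Docker Compose, the local stack might look like the fragment below. Service names, images, and ports are illustrative assumptions, not the project's actual configuration:

```yaml
services:
  api:
    build: ./api          # FastAPI recommendation service
    ports:
      - "8000:8000"
    depends_on:
      - redis
  redis:
    image: redis:7        # candidate/embedding cache
    ports:
      - "6379:6379"
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
```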
[WIP] Start simple and scale later.
- Serve recommendations in <100ms
- Monitor API latency & CTR simulation
- Modular components for offline prep, online serving
- Simulated A/B testing for cold-start & feedback delay
- Self-hosted & cloud-native options
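To sanity-check the sub-100ms goal locally before Prometheus is wired up, a small stdlib helper that records call durations and reports a percentile is enough. This is a hedged sketch for local experiments; the project's real monitoring uses Prometheus histograms scraped into Grafana.

```python
import time
from statistics import quantiles

class LatencyRecorder:
    """Collect call durations (ms) and report percentiles, stdlib only."""

    def __init__(self) -> None:
        self.samples_ms: list[float] = []

    def record(self, fn, *args, **kwargs):
        """Call fn, store its wall-clock duration, and return its result."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.samples_ms.append((time.perf_counter() - start) * 1000)
        return result

    def p99(self) -> float:
        # quantiles(..., n=100) returns 99 cut points; the last one is p99.
        return quantiles(self.samples_ms, n=100)[-1]
```

Wrapping the recommendation call in `record` during a load test gives a quick p99 readout, which can then be compared against the same metric from the Prometheus/Grafana stack once it is running.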
```shell
git clone https://github.com/stefannieuwenhuis/open-rec-sphere.git
cd open-rec-sphere
docker-compose up
```
[WIP] Posts will be added as the project progresses.
- Project Wiki: Detailed documentation, including system architecture, API references, and tutorials, can be found in the Project Wiki.
- API Documentation: Interactive API docs are available at `http://localhost:8000/docs` once the application is running.
- Docker Documentation: Learn more about Docker and Docker Compose here.
- FastAPI Documentation: Official FastAPI documentation is available here.
We welcome contributions from the community! To get started:
- Fork the repository.
- Create a new branch: `git checkout -b feature/your-feature-name`.
- Make your changes and commit them: `git commit -m 'Add your message here'`.
- Push to the branch: `git push origin feature/your-feature-name`.
- Open a pull request.
Please read our Contributing Guide for more details on our code of conduct, and the process for submitting pull requests.
This project is licensed under the MIT License. See the LICENSE file for details.
Happy coding! If you find this project helpful, please give it a star ⭐ and share it with others.