This repository is a template demonstrating a full end-to-end, MLOps-style R project.
The toy example trains a model on a CSV dataset to predict water temperature throughout the year for Sydney beaches. The project shows how you might structure this kind of work to run in a ‘Production’ context.
I go into more detail in my workshop, "Data Science Workflows in R: An introduction to deploying production quality R code".
This includes:
- Data import and cleaning with the {tidyverse}
- Data documentation
- Documented R functions for processing and model training using {roxygen2}
- R package structure
- Model training and inference pipeline using {quarto}
- Unit tests with {testthat}
- Monitoring dashboard in interactive R Markdown and {flexdashboard}
- Model card for documenting model ethics, decisions and recommended use cases
- Prediction API endpoint creation using {plumber}
- Model versioning and metadata storage using {vetiver} and {pins}
- Fully containerized deployment using Docker
- Dependency management using {renv}
- Project website and documentation using {pkgdown}
- Continuous Integration (CI) pipeline automated using GitHub Actions
You can install the development version of beachwatch from GitHub with:
# install.packages("pak")
pak::pak("deanmarchiori/r-model-template")
Then run devtools::install()
The model pipeline can be stepped through in model_notebook.qmd
This is a basic example of using the deployed outputs from this project.
- Clone the repository:
git clone https://github.com/deanmarchiori/r-model-template.git
- To build your Docker image, from the root of the cloned repository:
docker build -t beach .
- To run it:
docker run -p 8080:8080 beach
The endpoint should now be live at http://127.0.0.1:8080
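A quick way to confirm the container is answering is to ask for an HTTP status code. The following is a minimal sketch, assuming `curl` is installed and the container from the previous step is running locally on port 8080. The `BASE_URL` and `STATUS` variable names are illustrative, and only the base address is probed because the API's routes are not listed here.

```shell
#!/bin/sh
# Assumption: the container started with `docker run -p 8080:8080 beach` is still running.
BASE_URL="${BASE_URL:-http://127.0.0.1:8080}"

# Ask curl for the HTTP status code only; curl reports 000 when it cannot connect.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$BASE_URL/" 2>/dev/null || true)
# Fall back to 000 if curl itself is unavailable and printed nothing.
STATUS="${STATUS:-000}"

if [ "$STATUS" = "000" ]; then
  echo "No response from $BASE_URL (is the container running?)"
else
  echo "Server responded at $BASE_URL with HTTP status $STATUS"
fi
```

Any status other than 000 means the server is reachable. A 404 at the base address would still indicate a running API that simply has no route defined there.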
To deploy this in a production context, the project can be extended with a Continuous Deployment (CD) workflow in GitHub Actions that builds the Docker image and pushes it to your preferred container registry, where your production systems can pull and run it.
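The commands such a CD job would run can be sketched in shell. This is an illustration under stated assumptions and is not part of the repository: the registry host, namespace, and tag are hypothetical placeholders, and the `run` helper and `DRY_RUN` switch are illustrative names that print each command by default so nothing is pushed by accident.

```shell
#!/bin/sh
# Hypothetical registry coordinates; replace with your own.
REGISTRY="${REGISTRY:-registry.example.com}"
NAMESPACE="${NAMESPACE:-your-namespace}"
TAG="${TAG:-latest}"
IMAGE_REF="$REGISTRY/$NAMESPACE/beach:$TAG"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the image as in the local steps, tag it for the registry, then push it.
run docker build -t beach .
run docker tag beach "$IMAGE_REF"
run docker push "$IMAGE_REF"
```

Running the script with `DRY_RUN=0` executes the commands for real. That assumes you have already authenticated with `docker login` against the registry, which a CI job would typically do using stored secrets.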