In this course, we use Databricks serverless version 3
In our examples, we use UV. Check out the documentation on how to install it: https://docs.astral.sh/uv/getting-started/installation/
To create a new environment and create a lockfile, run:
uv sync --extra dev
Using the Marvel Characters Dataset from Kaggle.
This dataset contains detailed information about Marvel characters (e.g., name, powers, physical attributes, alignment, etc.). It is used to build classification and feature engineering models for various MLOps tasks, such as predicting character attributes or status.
01.process_data.py
: Loads and preprocesses the Marvel dataset, splits into train/test, and saves to the catalog.02.train_register_fe_model.py
: Performs feature engineering and trains the Marvel character model.03.deploy_model.py
: Deploys the trained Marvel model to a Databricks model serving endpoint.04.post_commit_status.py
: Posts status updates for Marvel integration tests to GitHub.05.refresh_monitor.py
: Refreshes monitoring tables and dashboards for Marvel model serving.