MLRun is an open-source MLOps orchestration framework that streamlines the entire machine learning lifecycle, from development to production. It automates data preparation, model training, and deployment as elastic, serverless functions, dramatically reducing time to production and engineering effort.
This blueprint demonstrates how MLRun can orchestrate NVIDIA's NeMo Microservices platform to continuously discover and promote more efficient models, as shown in the original NVIDIA Data Flywheel Foundational Blueprint. MLRun provides automatic tracking, logging, scaling and other MLOps best practices, while reducing boilerplate code and glue logic.
Note: This blueprint is a direct clone of the NVIDIA Data Flywheel Foundational Blueprint repository. Refer to that repository to learn more about data flywheels and about the blueprint's architecture and components.
MLRun is integrated into the original blueprint in the following order:
- Blueprint deployment now installs MLRun.
- An MLRun functions base image is created and set as the default image of the blueprint's MLRun project.
- MLRun code turns the original blueprint tasks into modular MLRun functions.
- An MLRun project is created.
- The blueprint runs from MLRun's Jupyter using this notebook.
- Use MLRun to orchestrate the original NVIDIA Data Flywheel Foundational Blueprint workflow, turning each NeMo Microservice into a runnable MLRun function.
- Deploy NIMs as Nuclio serverless functions via MLRun, allowing for auto-scaling and resource management.
- Remove redundant boilerplate code and glue logic, including the MongoDB requirement, since all runs are stored within MLRun. This makes the codebase cleaner and more maintainable.
- Add auto-logging capability for NeMo Evaluator and Customizer runs, logging and visualizing the jobs via MLRun.
- Generalize the blueprint to accept any dataset.
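
As a rough illustration of the points above, the following is a minimal sketch of how one blueprint task might be expressed as an MLRun function handler that logs its results to the run. It is not the blueprint's actual code: the function names, the `records` layout (`prediction` and `expected` keys), the exact-match metric, the project name, the image name, and the `flywheel_steps.py` file name are all assumptions made for illustration. The MLRun calls in `register_with_mlrun` assume a recent MLRun release and a reachable MLRun service, so that part is shown for orientation only.

```python
from typing import Any, Dict, List


def exact_match_accuracy(records: List[Dict[str, str]]) -> float:
    """Fraction of records whose 'prediction' equals 'expected'.

    A stand-in metric (assumption); the blueprint's real evaluation is
    performed by NeMo Evaluator.
    """
    if not records:
        return 0.0
    hits = sum(
        1 for r in records if r["prediction"].strip() == r["expected"].strip()
    )
    return hits / len(records)


def evaluate_candidate(
    context: Any, records: List[Dict[str, str]], model_name: str
) -> float:
    """Hypothetical handler: score a candidate model and log the outcome.

    When executed by MLRun, `context` is the run context; only its
    `log_result(key, value)` method is used here.
    """
    score = exact_match_accuracy(records)
    context.log_result("model_name", model_name)
    context.log_result("exact_match_accuracy", score)
    return score


def register_with_mlrun(source_file: str = "flywheel_steps.py") -> None:
    """Register and run the handler in an MLRun project (illustrative only)."""
    import mlrun  # imported lazily so the handler above works without MLRun

    # Project name and image are placeholders, not the blueprint's values.
    project = mlrun.get_or_create_project("data-flywheel-sketch", context="./")
    project.set_default_image("example.registry/flywheel-base:latest")
    project.set_function(
        func=source_file,
        name="evaluate-candidate",
        kind="job",
        handler="evaluate_candidate",
    )
    project.run_function(
        "evaluate-candidate",
        params={
            "records": [{"prediction": "yes", "expected": "yes"}],
            "model_name": "candidate-a",
        },
    )
```

Because the handler only relies on `context.log_result`, it can be exercised locally with a small stand-in object before it is registered in a project, and any dataset that can be mapped to the assumed record layout can be passed in.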
In addition to the original blueprint's disclaimer, the purpose of this Blueprint is to showcase MLRun's integration with NeMo Microservices and educate the community on MLOps best practices. This code is provided "as-is" as a reference implementation, and should not be used in production.
The MLRun-related code is licensed under the Apache License, Version 2.0. Refer to the NVIDIA license for the NVIDIA AI BLUEPRINT license. For more information about the NVIDIA license and additional third-party license data, refer to the NVIDIA readme. Review all licenses before use.