Merge branch 'integrated-mlops-docs' into develop

zeryx · zeryx · commit f61aecdbe2fe · 2022-07-04T16:14:38.000-03:00
diff --git a/README.md b/README.md
@@ -27,6 +27,8 @@ This document will describe the following:
 - What is an Algorithm Development Kit
 - Changes to Algorithm development
 - Example workflows you can use to create your own Algorithms.
+- The Model Manifest System
+- DataRobot MLOps integration support
 
 
 ## What is an Algorithm Development Kit
@@ -209,6 +211,145 @@ algorithm.init({"data": "https://i.imgur.com/bXdORXl.jpeg"})
 
 ```
 
+## The Model Manifest System
+A Model Manifest is an optional file, named `model_manifest.json`, that you can provide with your algorithm to
+define its important model files, their locations, and their metadata.
+<!-- embedme examples/pytorch_image_classification/model_manifest.json -->
+```json
+{
+  "required_files" : [
+      { "name": "squeezenet",
+      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
+      "fail_on_tamper": true,
+      "metadata": {
+        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
+      }
+    },
+    {
+      "name": "labels",
+      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
+      "fail_on_tamper": true,
+      "metadata": {
+        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
+      }
+    }
+  ],
+  "optional_files": [
+      {
+        "name": "mobilenet",
+        "source_uri": "data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
+        "fail_on_tamper": false,
+        "metadata": {
+            "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
+          }
+      }
+  ]
+}
+```
+With the Model Manifest system, you can also "freeze" your `model_manifest.json`, creating a `model_manifest.json.freeze` file.
+This file encodes the hashes of your model files, so tampering can be detected once frozen, permanently locking a version of your algorithm code to your model files.
+<!-- embedme examples/pytorch_image_classification/model_manifest.json.freeze -->
+```json
+{
+   "required_files":[
+      {
+         "name":"squeezenet",
+         "source_uri":"data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
+         "fail_on_tamper":true,
+         "metadata":{
+            "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
+         },
+         "md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
+      },
+      {
+         "name":"labels",
+         "source_uri":"data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
+         "fail_on_tamper":true,
+         "metadata":{
+            "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
+         },
+         "md5_checksum":"c2c37ea517e94d9795004a39431a14cb"
+      }
+   ],
+   "optional_files":[
+      {
+         "name":"mobilenet",
+         "source_uri":"data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
+         "fail_on_tamper":false,
+         "metadata":{
+            "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
+         }
+      }
+   ],
+   "timestamp":"1633450866.985464",
+   "lock_checksum":"24f5eca888d87661ca6fc08042e40cb7"
+}
+```
+
+Because a manifest can point to both hosted data collections and AWS/GCP/Azure-based block storage, you can link your algorithm code to your model files wherever they live today.
+
+
+## DataRobot MLOps Integration
+As part of the integration with DataRobot, we've added support for the [DataRobot MLOps Agent](https://docs.datarobot.com/en/docs/mlops/deployment/mlops-agent/index.html).
+When you pass `mlops=True` to the ADK `init()` function, the ADK configures and sets up the MLOps Agent so that it can write content directly back to DataRobot.
+
+
+For this, you'll need to select an MLOps-enabled environment and set up a DataRobot External Deployment.
+Once that is set up, define your `mlops.json` file, including your deployment and model IDs.
+
+<!-- embedme examples/mlops_hello_world/mlops.json -->
+```json
+{
+  "model_id": "YOUR_MODEL_ID",
+  "deployment_id": "YOUR_DEPLOYMENT_ID",
+  "datarobot_mlops_service_url": "https://app.datarobot.com"
+}
+```
+
+Once you have also defined `DATAROBOT_MLOPS_API_TOKEN` as a secret for your Algorithm, you're ready to start sending MLOps data back to DataRobot!
+
+<!-- embedme examples/mlops_hello_world/src/Algorithm.py -->
+```python
+from Algorithmia import ADK
+from datarobot.mlops.mlops import MLOps  # provided by the datarobot-mlops package
+from time import time
+import pandas as pd
+
+# API calls will begin at the apply() method, with the request body passed as 'input'
+# For more details, see algorithmia.com/developers/algorithm-development/languages
+
+def load(state):
+    # Initialize the final components of the MLOps plugin and prepare it for sending info back to DataRobot.
+    state['mlops'] = MLOps().init()
+    return state
+
+def apply(input, state):
+    t1 = time()
+    df = pd.DataFrame(columns=['id', 'values'])
+    df.loc[0] = ["abcd", 0.25]
+    df.loc[0, 'values'] += input  # a single .loc call, so the update reaches df itself rather than a copy
+    association_ids = df.iloc[:, 0].tolist()
+    reporting_predictions = df.loc[0, 'values']
+    t2 = time()
+    # As we're only making 1 prediction, our reporting tool should show only 1 prediction being made
+    state['mlops'].report_deployment_stats(1, t2 - t1)
+
+    # Report the predictions data: features, predictions, association_ids
+    state['mlops'].report_predictions_data(features_df=df,
+                                           predictions=reporting_predictions,
+                                           association_ids=association_ids)
+    return reporting_predictions
+
+algorithm = ADK(apply, load)
+algorithm.init(0.25, mlops=True)
+
+```
+
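+The example calls `report_deployment_stats()` to report the prediction count and execution time, and `report_predictions_data()` to report the features, predictions, and association IDs.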
+
+
+
+
 ## Readme publishing
 To compile the template readme, please check out [embedme](https://github.com/zakhenry/embedme) utility
 and run the following:
diff --git a/README_template.md b/README_template.md
@@ -8,6 +8,8 @@ This document will describe the following:
 - What is an Algorithm Development Kit
 - Changes to Algorithm development
 - Example workflows you can use to create your own Algorithms.
+- The Model Manifest System
+- DataRobot MLOps integration support
 
 
 ## What is an Algorithm Development Kit
@@ -55,6 +57,44 @@ Check out these examples to help you get started:
 ```python
 ```
 
+## The Model Manifest System
+A Model Manifest is an optional file, named `model_manifest.json`, that you can provide with your algorithm to
+define its important model files, their locations, and their metadata.
+<!-- embedme examples/pytorch_image_classification/model_manifest.json -->
+```json
+```
+With the Model Manifest system, you can also "freeze" your `model_manifest.json`, creating a `model_manifest.json.freeze` file.
+This file encodes the hashes of your model files, so tampering can be detected once frozen, permanently locking a version of your algorithm code to your model files.
+<!-- embedme examples/pytorch_image_classification/model_manifest.json.freeze -->
+```json
+```
+
+Because a manifest can point to both hosted data collections and AWS/GCP/Azure-based block storage, you can link your algorithm code to your model files wherever they live today.
+
+
+## DataRobot MLOps Integration
+As part of the integration with DataRobot, we've added support for the [DataRobot MLOps Agent](https://docs.datarobot.com/en/docs/mlops/deployment/mlops-agent/index.html).
+When you pass `mlops=True` to the ADK `init()` function, the ADK configures and sets up the MLOps Agent so that it can write content directly back to DataRobot.
+
+
+For this, you'll need to select an MLOps-enabled environment and set up a DataRobot External Deployment.
+Once that is set up, define your `mlops.json` file, including your deployment and model IDs.
+
+<!-- embedme examples/mlops_hello_world/mlops.json -->
+```json
+```
+
+Once you have also defined `DATAROBOT_MLOPS_API_TOKEN` as a secret for your Algorithm, you're ready to start sending MLOps data back to DataRobot!
+
+<!-- embedme examples/mlops_hello_world/src/Algorithm.py -->
+```python
+```
+
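+The example calls `report_deployment_stats()` to report the prediction count and execution time, and `report_predictions_data()` to report the features, predictions, and association IDs.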
+
+
+
+
 ## Readme publishing
 To compile the template readme, please check out [embedme](https://github.com/zakhenry/embedme) utility
 and run the following:
diff --git a/examples/mlops_hello_world/mlops.json b/examples/mlops_hello_world/mlops.json
@@ -0,0 +1,5 @@
+{
+  "model_id": "YOUR_MODEL_ID",
+  "deployment_id": "YOUR_DEPLOYMENT_ID",
+  "datarobot_mlops_service_url": "https://app.datarobot.com"
+}
diff --git a/examples/mlops_hello_world/requirements.txt b/examples/mlops_hello_world/requirements.txt
@@ -0,0 +1,4 @@
+algorithmia>=1.0.0,<2.0
+datarobot-mlops==8.0.7
+pyaml==21.10.1
+pillow<9.0
diff --git a/examples/mlops_hello_world/src/Algorithm.py b/examples/mlops_hello_world/src/Algorithm.py
@@ -0,0 +1,32 @@
+from Algorithmia import ADK
+from datarobot.mlops.mlops import MLOps  # provided by the datarobot-mlops package
+from time import time
+import pandas as pd
+
+# API calls will begin at the apply() method, with the request body passed as 'input'
+# For more details, see algorithmia.com/developers/algorithm-development/languages
+
+def load(state):
+    # Initialize the final components of the MLOps plugin and prepare it for sending info back to DataRobot.
+    state['mlops'] = MLOps().init()
+    return state
+
+def apply(input, state):
+    t1 = time()
+    df = pd.DataFrame(columns=['id', 'values'])
+    df.loc[0] = ["abcd", 0.25]
+    df.loc[0, 'values'] += input  # a single .loc call, so the update reaches df itself rather than a copy
+    association_ids = df.iloc[:, 0].tolist()
+    reporting_predictions = df.loc[0, 'values']
+    t2 = time()
+    # As we're only making 1 prediction, our reporting tool should show only 1 prediction being made
+    state['mlops'].report_deployment_stats(1, t2 - t1)
+
+    # Report the predictions data: features, predictions, association_ids
+    state['mlops'].report_predictions_data(features_df=df,
+                                           predictions=reporting_predictions,
+                                           association_ids=association_ids)
+    return reporting_predictions
+
+algorithm = ADK(apply, load)
+algorithm.init(0.25, mlops=True)
diff --git a/examples/mlops_hello_world/src/__init__.py b/examples/mlops_hello_world/src/__init__.py