Commit f61aecd

Merge branch 'integrated-mlops-docs' into develop
2 parents 13a682f + 5dc12aa commit f61aecd

6 files changed

+222
-0
lines changed

README.md

Lines changed: 141 additions & 0 deletions
@@ -27,6 +27,8 @@ This document will describe the following:
- What is an Algorithm Development Kit
- Changes to Algorithm development
- Example workflows you can use to create your own Algorithms.
- The Model Manifest System
- DataRobot MLOps integrations support

## What is an Algorithm Development Kit
@@ -209,6 +211,145 @@ algorithm.init({"data": "https://i.imgur.com/bXdORXl.jpeg"})
## The Model Manifest System
Model Manifests are optional files that you can provide to your algorithm to define important model files, their locations, and their metadata. This file is called `model_manifest.json`.
<!-- embedme examples/pytorch_image_classification/model_manifest.json -->
```python
{
  "required_files" : [
    { "name": "squeezenet",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
      "fail_on_tamper": true,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    },
    {
      "name": "labels",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
      "fail_on_tamper": true,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ],
  "optional_files": [
    {
      "name": "mobilenet",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
      "fail_on_tamper": false,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ]
}
```
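
A minimal sketch of how a manifest like the one above could be read, assuming only the field names shown in the example (`required_files`, `optional_files`, `name`, `source_uri`, `fail_on_tamper`). The helper `list_manifest_files` is hypothetical and is not part of the ADK; it only illustrates the shape of the file.

```python
import json

# Trimmed copy of the example manifest; only the field names come from the README.
MANIFEST_TEXT = """
{
  "required_files": [
    {"name": "squeezenet",
     "source_uri": "data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
     "fail_on_tamper": true}
  ],
  "optional_files": [
    {"name": "mobilenet",
     "source_uri": "data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
     "fail_on_tamper": false}
  ]
}
"""


def list_manifest_files(manifest):
    """Hypothetical helper: flatten a manifest into (name, source_uri, required) tuples."""
    entries = []
    for section, required in (("required_files", True), ("optional_files", False)):
        for item in manifest.get(section, []):
            entries.append((item["name"], item["source_uri"], required))
    return entries


manifest = json.loads(MANIFEST_TEXT)
files = list_manifest_files(manifest)
for name, uri, required in files:
    print(f"{name}: {uri} (required={required})")
```

Keeping required and optional entries in separate lists lets a loader decide up front which missing files should stop the algorithm from starting.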

With the Model Manifest system, you can also "freeze" your `model_manifest.json`, creating a `model_manifest.json.freeze`.
This file encodes the hash of the model file, preventing tampering once frozen and forever locking a version of your algorithm code to your model file.
<!-- embedme examples/pytorch_image_classification/model_manifest.json.freeze -->
```python
{
  "required_files":[
    {
      "name":"squeezenet",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
      "fail_on_tamper":true,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      },
      "md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
    },
    {
      "name":"labels",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
      "fail_on_tamper":true,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      },
      "md5_checksum":"c2c37ea517e94d9795004a39431a14cb"
    }
  ],
  "optional_files":[
    {
      "name":"mobilenet",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
      "fail_on_tamper":false,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ],
  "timestamp":"1633450866.985464",
  "lock_checksum":"24f5eca888d87661ca6fc08042e40cb7"
}
```
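
The README does not show how the ADK performs its tamper check, so the following is only one plausible reading of the `md5_checksum` and `fail_on_tamper` fields in the frozen manifest. The functions `md5_of_file` and `check_entry` are hypothetical names, and the sketch does not attempt to reproduce `lock_checksum`, whose derivation is not described here.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_entry(entry, local_path):
    """Hypothetical check of a frozen entry against a local copy of the file.

    Returns True on a match or when no checksum is recorded, False on a
    tolerated mismatch, and raises when the entry sets fail_on_tamper.
    """
    expected = entry.get("md5_checksum")
    if expected is None:
        return True
    if md5_of_file(local_path) == expected:
        return True
    if entry.get("fail_on_tamper", False):
        raise ValueError(f"checksum mismatch for {entry['name']}")
    return False


# Demo with a throwaway file standing in for a downloaded model.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"model bytes")
    model_path = tmp.name

entry = {"name": "demo", "fail_on_tamper": True,
         "md5_checksum": md5_of_file(model_path)}
matches_before = check_entry(entry, model_path)

with open(model_path, "ab") as handle:
    handle.write(b"tampered")  # simulate the file changing after the freeze

try:
    check_entry(entry, model_path)
    tamper_detected = False
except ValueError:
    tamper_detected = True
finally:
    os.remove(model_path)

print(matches_before, tamper_detected)
```

Under this reading, an entry with `fail_on_tamper` set to `false` would report a mismatch without stopping the algorithm.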

Because you can link to both hosted data collections and AWS/GCP/Azure based block storage media, you can link your algorithm code with your model files, wherever they live today.

## DataRobot MLOps Integration
As part of the integration with DataRobot, we've built out integration support for the [DataRobot MLOps Agent](https://docs.datarobot.com/en/docs/mlops/deployment/mlops-agent/index.html).
By passing `mlops=True` to the ADK `init()` function, the ADK will configure and set up the MLOps Agent to support writing content directly back to DataRobot.

For this, you'll need to select an MLOps Enabled Environment, and you will need to set up a DataRobot External Deployment.
Once that is set up, you will need to define your `mlops.json` file, including your deployment and model ids.

<!-- embedme examples/mlops_hello_world/mlops.json -->
```python
{
  "model_id": "YOUR_MODEL_ID",
  "deployment_id": "YOUR_DEPLOYMENT_ID",
  "datarobot_mlops_service_url": "https://app.datarobot.com"
}
```

Once you have also defined your `DATAROBOT_MLOPS_API_TOKEN` as a secret for your Algorithm, you're ready to start sending MLOps data back to DataRobot!
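
Before deploying, it can help to confirm that both pieces of configuration are in place. The sketch below assumes the three keys shown in `mlops.json` above and assumes the secret is exposed to the process as an environment variable named `DATAROBOT_MLOPS_API_TOKEN`. The helper `load_mlops_config` is hypothetical and not part of the ADK.

```python
import json
import os

REQUIRED_KEYS = ("model_id", "deployment_id", "datarobot_mlops_service_url")

# Same content as the mlops.json example; the placeholder ids are not real.
EXAMPLE_TEXT = """
{
  "model_id": "YOUR_MODEL_ID",
  "deployment_id": "YOUR_DEPLOYMENT_ID",
  "datarobot_mlops_service_url": "https://app.datarobot.com"
}
"""


def load_mlops_config(text, env=os.environ):
    """Hypothetical helper: parse mlops.json text and check the expected pieces exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError(f"mlops.json is missing: {', '.join(missing)}")
    if not env.get("DATAROBOT_MLOPS_API_TOKEN"):
        raise KeyError("DATAROBOT_MLOPS_API_TOKEN is not set")
    return config


# Pass a fake environment so the example runs without a real token.
config = load_mlops_config(EXAMPLE_TEXT, env={"DATAROBOT_MLOPS_API_TOKEN": "dummy-token"})
print(config["deployment_id"])
```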

<!-- embedme examples/mlops_hello_world/src/Algorithm.py -->
```python
from Algorithmia import ADK
import pandas as pd
# Assumption: MLOps is provided by the datarobot-mlops package this example depends on.
from datarobot.mlops.mlops import MLOps
from time import time

# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages

def load(state):
    # Let's initialize the final components of the MLOps plugin and prepare it for sending info back to DataRobot.
    state['mlops'] = MLOps().init()
    return state

def apply(input, state):
    t1 = time()
    df = pd.DataFrame(columns=['id', 'values'])
    df.loc[0] = ["abcd", 0.25]
    # Use a single .loc call; chained indexing (df.loc[0][1]) may write to a copy.
    df.loc[0, 'values'] += input
    association_ids = df.iloc[:, 0].tolist()
    reporting_predictions = df.loc[0, 'values']
    t2 = time()
    # As we're only making 1 prediction, our reporting tool should show only 1 prediction being made
    state['mlops'].report_deployment_stats(1, t2 - t1)

    # Report the predictions data: features, predictions, association_ids
    state['mlops'].report_predictions_data(features_df=df,
                                           predictions=reporting_predictions,
                                           association_ids=association_ids)
    return reporting_predictions


algorithm = ADK(apply, load)
algorithm.init(0.25, mlops=True)
```

report_deployment_stats()
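
The `report_deployment_stats()` call in `apply()` above is given a prediction count and an elapsed time. The sketch below isolates that timing pattern, assuming (the README does not say) that the elapsed time is expected in milliseconds. `FakeReporter` and `timed_predict` are hypothetical names so the snippet runs without the DataRobot library installed.

```python
from time import perf_counter


class FakeReporter:
    """Hypothetical stand-in for the MLOps client; it only records what it is given."""

    def __init__(self):
        self.calls = []

    def report_deployment_stats(self, num_predictions, execution_time_ms):
        self.calls.append((num_predictions, execution_time_ms))


def timed_predict(reporter, predict, payload):
    """Run predict(payload), then report one prediction and its latency in milliseconds."""
    start = perf_counter()
    result = predict(payload)
    elapsed_ms = (perf_counter() - start) * 1000.0
    reporter.report_deployment_stats(1, elapsed_ms)
    return result


reporter = FakeReporter()
result = timed_predict(reporter, lambda x: x + 0.25, 0.25)
print(result, reporter.calls[0][0])
```

Wrapping the measurement in one helper keeps the reported count and latency tied to the same prediction call.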

## Readme publishing
To compile the template readme, please check out the [embedme](https://github.com/zakhenry/embedme) utility
and run the following:

README_template.md

Lines changed: 40 additions & 0 deletions
@@ -8,6 +8,8 @@ This document will describe the following:
- What is an Algorithm Development Kit
- Changes to Algorithm development
- Example workflows you can use to create your own Algorithms.
- The Model Manifest System
- DataRobot MLOps integrations support

## What is an Algorithm Development Kit
@@ -55,6 +57,44 @@ Check out these examples to help you get started:
```python
```

## The Model Manifest System
Model Manifests are optional files that you can provide to your algorithm to define important model files, their locations, and their metadata. This file is called `model_manifest.json`.
<!-- embedme examples/pytorch_image_classification/model_manifest.json -->
```python
```
With the Model Manifest system, you can also "freeze" your `model_manifest.json`, creating a `model_manifest.json.freeze`.
This file encodes the hash of the model file, preventing tampering once frozen and forever locking a version of your algorithm code to your model file.
<!-- embedme examples/pytorch_image_classification/model_manifest.json.freeze -->
```python
```

Because you can link to both hosted data collections and AWS/GCP/Azure based block storage media, you can link your algorithm code with your model files, wherever they live today.

## DataRobot MLOps Integration
As part of the integration with DataRobot, we've built out integration support for the [DataRobot MLOps Agent](https://docs.datarobot.com/en/docs/mlops/deployment/mlops-agent/index.html).
By passing `mlops=True` to the ADK `init()` function, the ADK will configure and set up the MLOps Agent to support writing content directly back to DataRobot.

For this, you'll need to select an MLOps Enabled Environment, and you will need to set up a DataRobot External Deployment.
Once that is set up, you will need to define your `mlops.json` file, including your deployment and model ids.

<!-- embedme examples/mlops_hello_world/mlops.json -->
```python
```

Once you have also defined your `DATAROBOT_MLOPS_API_TOKEN` as a secret for your Algorithm, you're ready to start sending MLOps data back to DataRobot!

<!-- embedme examples/mlops_hello_world/src/Algorithm.py -->
```python
```

report_deployment_stats()

## Readme publishing
To compile the template readme, please check out the [embedme](https://github.com/zakhenry/embedme) utility
and run the following:

examples/mlops_hello_world/mlops.json

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
  "model_id": "YOUR_MODEL_ID",
  "deployment_id": "YOUR_DEPLOYMENT_ID",
  "datarobot_mlops_service_url": "https://app.datarobot.com"
}
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
algorithmia>=1.0.0,<2.0
datarobot-mlops==8.0.7
pyaml==21.10.1
pillow<9.0
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
from Algorithmia import ADK
import pandas as pd
# Assumption: MLOps is provided by the datarobot-mlops package this example depends on.
from datarobot.mlops.mlops import MLOps
from time import time

# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages

def load(state):
    # Let's initialize the final components of the MLOps plugin and prepare it for sending info back to DataRobot.
    state['mlops'] = MLOps().init()
    return state

def apply(input, state):
    t1 = time()
    df = pd.DataFrame(columns=['id', 'values'])
    df.loc[0] = ["abcd", 0.25]
    # Use a single .loc call; chained indexing (df.loc[0][1]) may write to a copy.
    df.loc[0, 'values'] += input
    association_ids = df.iloc[:, 0].tolist()
    reporting_predictions = df.loc[0, 'values']
    t2 = time()
    # As we're only making 1 prediction, our reporting tool should show only 1 prediction being made
    state['mlops'].report_deployment_stats(1, t2 - t1)

    # Report the predictions data: features, predictions, association_ids
    state['mlops'].report_predictions_data(features_df=df,
                                           predictions=reporting_predictions,
                                           association_ids=association_ids)
    return reporting_predictions


algorithm = ADK(apply, load)
algorithm.init(0.25, mlops=True)

examples/mlops_hello_world/src/__init__.py

Whitespace-only changes.
