Commit 244d9d0

Add in new generate_requirements_json.ipynb example.
1 parent bec73a7 commit 244d9d0

2 files changed: +202 −6 lines

Lines changed: 6 additions & 6 deletions
@@ -1,14 +1,14 @@
 [
     {
-        "step": "install pandas",
-        "command": "pip install pandas==1.2.4"
+        "step": "install scikit-learn",
+        "command": "pip install scikit-learn==1.2.0"
     },
     {
         "step": "install numpy",
-        "command": "pip install numpy==1.20.1"
+        "command": "pip install numpy==1.23.5"
     },
     {
-        "step": "install sklearn",
-        "command": "pip install sklearn"
+        "step": "install pandas",
+        "command": "pip install pandas==1.5.3"
     }
-]
+]
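
Each entry in the updated requirements.json pairs a human-readable "step" with a pip "command". As a rough sketch only of how such a file could be consumed (this is not SAS Model Manager's actual install mechanism, and the file path is an assumption), a container build step might simply run each command in order:

# Rough illustration only: iterate over requirements.json and run each
# "command" entry. The file location here is an assumption for this sketch.
import json
import subprocess
from pathlib import Path

req_path = Path("requirements.json")
for entry in json.loads(req_path.read_text()):
    print(f"Running step: {entry['step']}")
    # Each command is a pip install statement, e.g. "pip install pandas==1.5.3"
    subprocess.run(entry["command"].split(), check=True)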
Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "09361f03-99d3-4cbf-a3ba-a75ca2c74b35",
   "metadata": {},
   "source": [
    "Copyright © 2023, SAS Institute Inc., Cary, NC, USA. All Rights Reserved.\n",
    "SPDX-License-Identifier: Apache-2.0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9b8cb7c-1974-4af5-8992-d51f90fcfe5b",
   "metadata": {},
   "source": [
"# Automatic Generation of the requirements.json File\n",
18+
"In order to validate Python models within a container publishing destination, the Python packages which contain the modules that are used in the Python score code file and its score resource files must be installed in the run-time container. You can install the packages when you publish a Python model or decision that contains a Python model to a container publishing destination by adding a `requirements.json` file that includes the package install statements to your model.\n",
19+
"\n",
20+
"This notebook provides an example execution and assessment of the create_requirements_json() function added in python-sasctl v1.8.0. The aim of this function is help to create the instructions (aka the `requirements.json` file) for a lightweight Python container in SAS Model Manager. Lightweight here meaning that the container will only install the packages found in the model's pickle files and python scripts.\n",
21+
"\n",
22+
"### **User Warnings**\n",
23+
"The methods utilized in this function can determine package dependencies and versions from provided scripts and pickle files, but there are some stipulations that need to be considered:\n",
24+
"\n",
25+
"1. If run outside of the development environment that the model was created in, the create_requirements_json() function **CANNOT** determine the required package _versions_ accurately. \n",
26+
"2. Not all Python packages have matching import and install names and as such some of the packages added to the requirements.json file may be incorrectly named (i.e. `import sklearn` vs `pip install scikit-learn`).\n",
27+
"\n",
28+
"As such, it is recommended that the user check over the requirements.json file for package name and version accuracy before deploying to a run-time container in SAS Model Manager."
29+
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef68334e-7fa3-481a-bc39-9aa6c389f925",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4613074a-a138-4d93-810a-1bbfca79e957",
   "metadata": {},
   "source": [
"As an example, let's create the requirements.json file for the HMEQ Decision Tree Classification model created and uploaded in pzmmModelImportExample.ipynb. Simply import the function and aim it at the model directory."
45+
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "654a1382-9576-4215-bf47-ac7fc69428e5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from sasctl import pzmm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "1df8e5d3-c62e-4c35-993c-765a48d25444",
   "metadata": {},
   "outputs": [],
   "source": [
    "model_dir = Path.cwd() / \"data/hmeqModels/DecisionTreeClassifier\"\n",
    "requirements_json = pzmm.JSONFiles.create_requirements_json(model_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ced96ece-8221-413f-a5b5-a03fa93be8fd",
   "metadata": {},
   "source": [
    "Let's take a quick look at what packages were determined for the Decision Tree Classifier model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "2e3b29e6-aef5-4a02-a54b-57bf7e853cf0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[\n",
      "    {\n",
      "        \"command\": \"pip install sklearn\",\n",
      "        \"step\": \"install sklearn\"\n",
      "    },\n",
      "    {\n",
      "        \"command\": \"pip install numpy==1.23.5\",\n",
      "        \"step\": \"install numpy\"\n",
      "    },\n",
      "    {\n",
      "        \"command\": \"pip install pandas==1.5.3\",\n",
      "        \"step\": \"install pandas\"\n",
      "    }\n",
      "]\n"
     ]
    }
   ],
   "source": [
    "import json\n",
    "print(json.dumps(requirements_json, sort_keys=True, indent=4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0b11bc8-a1f3-46ff-a232-90b93b1bdabc",
   "metadata": {},
   "source": [
"Note how we have returned the `sklearn` import, which is attempting to refer to the scikit-learn package, but would fail to install the correct package via `pip install sklearn` and also could not collect a package version.\n",
115+
"\n",
116+
"Let's modify the name and add the version in Python and rewrite the requirements.json file to match."
117+
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "49721dc9-38e2-4d63-86e1-6555b364f4d6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[\n",
      "    {\n",
      "        \"command\": \"pip install scikit-learn==1.2.0\",\n",
      "        \"step\": \"install scikit-learn\"\n",
      "    },\n",
      "    {\n",
      "        \"command\": \"pip install numpy==1.23.5\",\n",
      "        \"step\": \"install numpy\"\n",
      "    },\n",
      "    {\n",
      "        \"command\": \"pip install pandas==1.5.3\",\n",
      "        \"step\": \"install pandas\"\n",
      "    }\n",
      "]\n"
     ]
    }
   ],
   "source": [
    "scikit_learn_install = {\n",
    "    \"command\": \"pip install scikit-learn==1.2.0\",\n",
    "    \"step\": \"install scikit-learn\"\n",
    "}\n",
    "requirements_json[0].update(scikit_learn_install)\n",
    "print(json.dumps(requirements_json, sort_keys=True, indent=4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "90da05c4-cd05-423d-8626-97125937f72b",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(Path(model_dir) / \"requirements.json\", \"w\") as req_file:\n",
    "    req_file.write(json.dumps(requirements_json, indent=4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53e5f3ca-f990-4ca7-9e92-2505087ff985",
   "metadata": {},
   "source": [
"Now we have a complete and accurate requirements.json file for deploying models to containers in SAS Model Manager!"
172+
]
173+
}
174+
],
175+
"metadata": {
176+
"kernelspec": {
177+
"display_name": "dev-py38",
178+
"language": "python",
179+
"name": "dev-py38"
180+
},
181+
"language_info": {
182+
"codemirror_mode": {
183+
"name": "ipython",
184+
"version": 3
185+
},
186+
"file_extension": ".py",
187+
"mimetype": "text/x-python",
188+
"name": "python",
189+
"nbconvert_exporter": "python",
190+
"pygments_lexer": "ipython3",
191+
"version": "3.8.16"
192+
}
193+
},
194+
"nbformat": 4,
195+
"nbformat_minor": 5
196+
}
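
For readers who want to reproduce the notebook's workflow outside Jupyter, here is a condensed sketch of the same steps (generate requirements.json, correct the sklearn entry, and write the file). It assumes the same model directory as the notebook and python-sasctl 1.8.0 or later; the loop that matches on the "install sklearn" step and the pinned scikit-learn version are illustrative assumptions, so check the generated entries for your own model.

# Condensed sketch of the notebook's workflow (assumptions noted above).
import json
from pathlib import Path

from sasctl import pzmm

model_dir = Path.cwd() / "data/hmeqModels/DecisionTreeClassifier"
requirements_json = pzmm.JSONFiles.create_requirements_json(model_dir)

# Correct entries whose import name differs from the PyPI install name,
# e.g. `sklearn` -> `scikit-learn`; the pinned version is an assumption.
for entry in requirements_json:
    if entry["step"] == "install sklearn":
        entry["step"] = "install scikit-learn"
        entry["command"] = "pip install scikit-learn==1.2.0"

with open(model_dir / "requirements.json", "w") as req_file:
    req_file.write(json.dumps(requirements_json, indent=4))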
