Commit 198eac5

Updated docs.
1 parent 5c8f615 commit 198eac5

File tree

3 files changed: +277 -3 lines changed
Lines changed: 2 additions & 2 deletions
You can call the ``.summary_status()`` method after a model serialization instance such as ``GenericModel``, ``SklearnModel``, ``TensorFlowModel``, ``EmbeddingONNXModel``, or ``PyTorchModel`` is created. The ``.summary_status()`` method returns a Pandas dataframe that guides you through the entire workflow. It shows which methods are available to call and which ones aren't, outlines what each method does, and, if extra actions are required, lists those actions.

The following image displays an example summary status table created after a user initiates a model instance. The table's Status column displays Done for the initiate step, and the ``Details`` column explains what the initiate step did, such as generating a ``score.py`` file. The Step column also lists the ``prepare()``, ``verify()``, ``save()``, ``deploy()``, and ``predict()`` methods for the model, and the Status column shows which method is available next. After the initiate step, the ``prepare()`` method is available, so the next step is to call ``prepare()``.
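For example, a minimal sketch, assuming ``model`` is any of these model serialization instances:

.. code-block:: python3

    # returns a Pandas dataframe listing each workflow step, its status,
    # and any remaining actions
    model.summary_status()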

docs/source/user_guide/model_registration/framework_specific_instruction.rst

Lines changed: 1 addition & 1 deletion
frameworks/lightgbmmodel
frameworks/xgboostmodel
frameworks/huggingfacemodel
frameworks/embeddingonnxmodel
frameworks/automlmodel
frameworks/genericmodel
Lines changed: 274 additions & 0 deletions
EmbeddingONNXModel
******************

See `API Documentation <../../../ads.model.framework.html#ads.model.framework.embedding_onnx_model.EmbeddingONNXModel>`__

Overview
========

The ``ads.model.framework.embedding_onnx_model.EmbeddingONNXModel`` class in ADS is designed to rapidly get an embedding ONNX model into production. The ``.prepare()`` method creates the model artifacts that are needed without requiring any configuration or custom code. However, you can customize the required ``score.py`` file.

.. include:: ../_template/overview.rst

The following steps take the `sentence-transformers/all-MiniLM-L6-v2 <https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2>`_ model and deploy it into production with a few lines of code.

**Download Embedding Model from HuggingFace**
.. code-block:: python3

    import tempfile
    import os
    import shutil
    from huggingface_hub import snapshot_download

    local_dir = tempfile.mkdtemp()

    # download files needed for this demonstration to a local folder
    snapshot_download(
        repo_id="sentence-transformers/all-MiniLM-L6-v2",
        local_dir=local_dir,
        allow_patterns=[
            "onnx/model.onnx",
            "config.json",
            "special_tokens_map.json",
            "tokenizer_config.json",
            "tokenizer.json",
            "vocab.txt"
        ]
    )

    artifact_dir = tempfile.mkdtemp()

    # copy all downloaded files to the artifact folder
    for root, dirs, files in os.walk(local_dir):
        for file in files:
            src_path = os.path.join(root, file)
            shutil.copy(src_path, artifact_dir)
Install Conda Pack
==================

To deploy the embedding ONNX model, start with the ONNX conda pack with the slug ``onnxruntime_p311_gpu_x86_64``.

.. code-block:: bash

    odsc conda install -s onnxruntime_p311_gpu_x86_64
Prepare Model Artifact
======================

Instantiate an ``EmbeddingONNXModel()`` object with the embedding ONNX model. All model-related files are saved under ``artifact_dir``. ADS auto-generates the ``score.py`` and ``runtime.yaml`` files that are required for the deployment.

For detailed information on the parameters that ``EmbeddingONNXModel`` takes, refer to the `API Documentation <../../../ads.model.framework.html#ads.model.framework.embedding_onnx_model.EmbeddingONNXModel>`__.

.. code-block:: python3

    import ads
    from ads.model import EmbeddingONNXModel

    # other options are `api_keys` or `security_token` depending on where the code is executed
    ads.set_auth("resource_principal")

    embedding_onnx_model = EmbeddingONNXModel(artifact_dir=artifact_dir)
    embedding_onnx_model.prepare(
        inference_conda_env="onnxruntime_p311_gpu_x86_64",
        inference_python_version="3.11",
        model_file_name="model.onnx",
        force_overwrite=True
    )
Summary Status
==============

.. include:: ../_template/summary_status.rst

.. figure:: ../figures/summary_status.png
    :align: center
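For example, to display the table for this model after ``prepare()``:

.. code-block:: python3

    # display the workflow table; verify() should now be the next available step
    embedding_onnx_model.summary_status()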
Verify Model
============

Call the ``verify()`` method to check if the model can be executed locally.

.. code-block:: python3

    embedding_onnx_model.verify(
        {
            "input": ["What are activation functions?", "What is Deep Learning?"],
            "model": "sentence-transformers/all-MiniLM-L6-v2"
        },
    )
If successful, results similar to the following are returned.

.. code-block:: python3

    {
        'object': 'list',
        'data': [{
            'object': 'embedding',
            'embedding': [[
                -0.11011122167110443,
                -0.39235609769821167,
                0.38759472966194153,
                -0.34653618931770325,
                ...,
            ]]
        }]
    }
Register Model
==============

Save the model artifacts and create a model entry in the OCI Data Science Model Catalog.

.. code-block:: python3

    embedding_onnx_model.save(display_name="sentence-transformers/all-MiniLM-L6-v2")
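After registration, the catalog entry's OCID can be used to reference the model. As a minimal sketch, assuming the OCID is exposed on the instance through a ``model_id`` attribute, as with other ADS model classes:

.. code-block:: python3

    # hypothetical: print the OCID of the registered model entry,
    # assuming it is stored on the instance after save()
    print(embedding_onnx_model.model_id)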
Deploy and Generate Endpoint
============================

Create a model deployment from the embedding ONNX model in the Model Catalog. The process takes several minutes, and the deployment configuration is displayed once it completes.

.. code-block:: python3

    embedding_onnx_model.deploy(
        display_name="all-MiniLM-L6-v2 Embedding Model Deployment",
        deployment_log_group_id="<log_group_id>",
        deployment_access_log_id="<access_log_id>",
        deployment_predict_log_id="<predict_log_id>",
        deployment_instance_shape="VM.Standard.E4.Flex",
        deployment_ocpus=20,
        deployment_memory_in_gbs=256,
    )
Run Prediction against Endpoint
===============================

Call ``predict()`` to check the model deployment endpoint.

.. code-block:: python3

    embedding_onnx_model.predict(
        {
            "input": ["What are activation functions?", "What is Deep Learning?"],
            "model": "sentence-transformers/all-MiniLM-L6-v2"
        },
    )
If successful, results similar to the following are returned.

.. code-block:: python3

    {
        'object': 'list',
        'data': [{
            'object': 'embedding',
            'embedding': [[
                -0.11011122167110443,
                -0.39235609769821167,
                0.38759472966194153,
                -0.34653618931770325,
                ...,
            ]]
        }]
    }
Run Prediction with OCI CLI
===========================

Model deployment endpoints can also be invoked with the OCI CLI.

.. code-block:: bash

    oci raw-request --http-method POST --target-uri <deployment_endpoint> --request-body '{"input": ["What are activation functions?", "What is Deep Learning?"], "model": "sentence-transformers/all-MiniLM-L6-v2"}' --auth resource_principal
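The ``<deployment_endpoint>`` placeholder is the predict URI of the model deployment. As a minimal sketch, assuming the deployment's base URL is exposed through the instance's ``model_deployment`` attribute, as with other ADS model classes:

.. code-block:: python3

    # hypothetical: compose the predict URI from the deployment's base URL
    endpoint = f"{embedding_onnx_model.model_deployment.url}/predict"
    print(endpoint)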
Example
=======

.. code-block:: python3

    import tempfile
    import os
    import shutil
    import ads
    from ads.model import EmbeddingONNXModel
    from huggingface_hub import snapshot_download

    # other options are `api_keys` or `security_token` depending on where the code is executed
    ads.set_auth("resource_principal")

    local_dir = tempfile.mkdtemp()

    # download files needed for the demonstration to a local folder
    snapshot_download(
        repo_id="sentence-transformers/all-MiniLM-L6-v2",
        local_dir=local_dir,
        allow_patterns=[
            "onnx/model.onnx",
            "config.json",
            "special_tokens_map.json",
            "tokenizer_config.json",
            "tokenizer.json",
            "vocab.txt"
        ]
    )

    artifact_dir = tempfile.mkdtemp()

    # copy all downloaded files to the artifact folder
    for root, dirs, files in os.walk(local_dir):
        for file in files:
            src_path = os.path.join(root, file)
            shutil.copy(src_path, artifact_dir)

    # initialize an EmbeddingONNXModel instance and prepare the score.py, runtime.yaml, and openapi.json files
    embedding_onnx_model = EmbeddingONNXModel(artifact_dir=artifact_dir)
    embedding_onnx_model.prepare(
        inference_conda_env="onnxruntime_p311_gpu_x86_64",
        inference_python_version="3.11",
        model_file_name="model.onnx",
        force_overwrite=True
    )

    # validate the model locally
    embedding_onnx_model.verify(
        {
            "input": ["What are activation functions?", "What is Deep Learning?"],
            "model": "sentence-transformers/all-MiniLM-L6-v2"
        },
    )

    # save the model to the OCI Model Catalog
    embedding_onnx_model.save(display_name="sentence-transformers/all-MiniLM-L6-v2")

    # deploy the model
    embedding_onnx_model.deploy(
        display_name="all-MiniLM-L6-v2 Embedding Model Deployment",
        deployment_log_group_id="<log_group_id>",
        deployment_access_log_id="<access_log_id>",
        deployment_predict_log_id="<predict_log_id>",
        deployment_instance_shape="VM.Standard.E4.Flex",
        deployment_ocpus=20,
        deployment_memory_in_gbs=256,
    )

    # check the model deployment endpoint
    embedding_onnx_model.predict(
        {
            "input": ["What are activation functions?", "What is Deep Learning?"],
            "model": "sentence-transformers/all-MiniLM-L6-v2"
        },
    )
