
Commit 18d43ba

Merge commit 'ff08a04a7d7c2183cd72c66bf063be594b5b68c1'

Conflicts:
    docs/source/user_guide/large_language_model/deploy_langchain_application.rst

2 parents: ef64304 + ff08a04

File tree: 5 files changed, +197 -34 lines


THIRD_PARTY_LICENSES.txt

Lines changed: 12 additions & 0 deletions
@@ -54,6 +54,12 @@ docker
 * Source code: https://github.com/docker
 * Project home: https://www.docker.com/

+evaluate
+* Copyright 2023 HuggingFace Inc.
+* License: Apache-2.0 license
+* Source code: https://github.com/huggingface/evaluate
+* Project home: https://huggingface.co/docs/evaluate/index
+
 fastavro
 * Copyright (c) 2011 Miki Tebeka
 * License: MIT License
@@ -133,6 +139,12 @@ jinja2
 * Source code: https://github.com/pallets/jinja/
 * Project home: https://palletsprojects.com/p/jinja/

+langchain
+* Copyright (c) 2023 LangChain, Inc.
+* License: MIT license
+* Source code: https://github.com/langchain-ai/langchain
+* Project home: https://www.langchain.com/
+
 lightgbm
 * Copyright (c) 2023 Microsoft Corporation
 * License: MIT license

ads/llm/serialize.py

Lines changed: 0 additions & 1 deletion
@@ -15,7 +15,6 @@
 from langchain.chains import RetrievalQA
 from langchain.chains.loading import load_chain_from_config
 from langchain.llms import loading
-from langchain.load import dumpd
 from langchain.load.load import Reviver
 from langchain.load.serializable import Serializable
 from langchain.schema.runnable import RunnableParallel

ads/llm/serializers/retrieval_qa.py

Lines changed: 11 additions & 5 deletions
@@ -1,3 +1,9 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*--
+
+# Copyright (c) 2023 Oracle and/or its affiliates.
+# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/
+
 import base64
 import json
 import os
@@ -44,18 +50,18 @@ def save(obj):
         serialized["type"] = "constructor"
         serialized["_type"] = OpenSearchVectorDBSerializer.type()
         kwargs = {}
-        for key, val in obj.__dict__.items():
-            if key == "client":
-                if isinstance(val, OpenSearch):
-                    client_info = val.transport.hosts[0]
+        for component_name, component in obj.__dict__.items():
+            if component_name == "client":
+                if isinstance(component, OpenSearch):
+                    client_info = component.transport.hosts[0]
                     opensearch_url = (
                         f"https://{client_info['host']}:{client_info['port']}"
                     )
                     kwargs.update({"opensearch_url": opensearch_url})
                 else:
                     raise NotImplementedError("Only support OpenSearch client.")
                 continue
-            kwargs[key] = dump(val)
+            kwargs[component_name] = dump(component)
         serialized["kwargs"] = kwargs
         return serialized
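
A minimal sketch of how this serializer is exercised (an illustration, not part of the commit; it assumes the enclosing OpenSearchVectorDBSerializer class named in the hunk header and a configured OpenSearchVectorSearch instance called vectorstore):

    # Sketch: the OpenSearch client is reduced to its URL; every other
    # attribute of the vector store is serialized with langchain's dump().
    serialized = OpenSearchVectorDBSerializer.save(vectorstore)
    assert serialized["type"] == "constructor"
    assert "opensearch_url" in serialized["kwargs"]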

docs/source/user_guide/large_language_model/deploy_langchain_application.rst

Lines changed: 8 additions & 5 deletions
@@ -68,15 +68,18 @@ In this example, we're using a temporary folder generated by ``tempfile.mkdtemp(
 Prepare the Model Artifacts
 ***************************

-Call ``prepare()`` from ``ChainDeployment`` to generate the ``score.py`` and serialize the LangChain application to ``chain.yaml`` file under ``artifact_dir`` folder.
-Parameters ``inference_conda_env`` and ``inference_python_version`` are passed to define the conda environment where your LangChain application will be running on OCI.
-Here we're using ``pytorch21_p39_gpu_v1`` with python 3.9.
+Call ``prepare`` from ``ChainDeployment`` to generate ``score.py`` and serialize the LangChain application to a ``chain.yaml`` file under the ``artifact_dir`` folder.
+The ``inference_conda_env`` and ``inference_python_version`` parameters define the conda environment where your LangChain application will run on OCI.
+Here, replace ``custom_conda_environment_uri`` with the URI of a conda environment that has the latest ADS 2.9.1, and replace ``python_version`` with that conda environment's Python version.
+
+.. note::
+    For how to customize and publish a conda environment, refer to `Publishing a Conda Environment to an Object Storage Bucket <https://docs.oracle.com/en-us/iaas/data-science/using/conda_publishs_object.htm>`_.

 .. code-block:: python3

     chain_deployment.prepare(
-        inference_conda_env="pytorch21_p39_gpu_v1",
-        inference_python_version="3.9",
+        inference_conda_env="<custom_conda_environment_uri>",
+        inference_python_version="<python_version>",
     )

 Below is the ``chain.yaml`` file that was saved from ``llm_chain`` object.
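
As a quick sanity check after ``prepare`` (a sketch, not part of the commit; ``artifact_dir`` is the temporary folder mentioned in the hunk header above):

    import os

    # prepare() generates score.py and serializes the chain to chain.yaml,
    # so both files should now exist in the artifact folder.
    print(os.listdir(artifact_dir))  # expect score.py and chain.yaml among the entries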

docs/source/user_guide/large_language_model/retrieval.rst

Lines changed: 166 additions & 23 deletions
@@ -1,20 +1,167 @@
 .. _vector_store:

-########################
-Vector Store integration
-########################
+#################################################
+Integration with OCI Generative AI and OpenSearch
+#################################################

 .. versionadded:: 2.9.1

-Current version of Langchain does not support serialization of any vector stores. This will be a problem when you want to deploy a langchain application with the vector store being one of the components using data science model deployment service. To solve this problem, we extended our support of vector stores serialization:
+OCI Generative Embedding
+========================
+
+The Generative AI Embedding Models convert textual input - ranging from phrases and sentences to entire paragraphs - into a structured format known as embeddings. Each piece of text input is transformed into a numerical array consisting of 1024 distinct numbers. The following pretrained model is available for creating text embeddings:
+
+- embed-english-light-v2.0
+
+To find the latest supported embedding models, check the `documentation <https://docs.oracle.com/en-us/iaas/Content/generative-ai/embed-models.htm>`_.
+
+The following code snippet shows how to use the Generative AI Embedding Models:
+
+.. code-block:: python3
+
+    import ads
+    from ads.llm import GenerativeAIEmbeddings
+
+    ads.set_auth("resource_principal")
+
+    oci_embedings = GenerativeAIEmbeddings(
+        compartment_id="ocid1.compartment.####",
+        client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com")  # this can be omitted after the Generative AI service is GA.
+    )
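
A quick usage sketch for the embeddings object above (an illustration, not part of the commit; it relies on the standard langchain embeddings interface that ``GenerativeAIEmbeddings`` implements):

    # Embed a single query: the result is one vector of 1024 numbers,
    # per the embedding model description above.
    vector = oci_embedings.embed_query("What is OCI OpenSearch?")
    assert len(vector) == 1024

    # Embed several documents in one call.
    vectors = oci_embedings.embed_documents(["first document", "second document"])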
+
+Retrieval QA with OpenSearch
+============================
+
+OCI OpenSearch
+--------------
+
+OCI Search with OpenSearch is a fully managed service that makes it fast and easy to search vast datasets and get quick results. In the world of large language models, you can use it as a vector store: store your documents in it and run keyword or semantic search with the help of a text embedding model. For a complete walkthrough of spinning up an OCI OpenSearch cluster, see `Search and visualize data using OCI Search Service with OpenSearch <https://docs.oracle.com/en/learn/oci-opensearch/index.html#introduction>`_.
+
+Semantic Search with OCI OpenSearch
+-----------------------------------
+
+With OCI OpenSearch and the OCI Generative AI embeddings, you can run semantic search using langchain. The following code snippet shows how:
+
+.. code-block:: python3
+
+    import os
+
+    from langchain.vectorstores import OpenSearchVectorSearch
+
+    os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
+    os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
+    os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "False"
+
+    # specify the index name that you would like to conduct semantic search on.
+    INDEX_NAME = "your_index_name"
+
+    opensearch_vector_search = OpenSearchVectorSearch(
+        "https://localhost:9200",  # your OCI OpenSearch private endpoint
+        embedding_function=oci_embedings,
+        index_name=INDEX_NAME,
+        engine="lucene",
+        http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
+        verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"],
+    )
+    opensearch_vector_search.similarity_search("your query", k=2, size=2)
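
The snippet above assumes the index already contains embedded documents. To index documents in the first place, the generic langchain vector store interface applies (a sketch, not part of the commit):

    # add_texts() embeds each text with the configured embedding function
    # and writes the resulting vectors into the OpenSearch index.
    opensearch_vector_search.add_texts(["first document", "second document"])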
+
+Retrieval QA Using OCI OpenSearch as a Retriever
+------------------------------------------------
+
+Since the search results usually cannot be used directly to answer a specific question, a more practical solution is to send the original query along with the retrieved results to a large language model and let it compose a more coherent answer. To do this, you can use OCI OpenSearch as the retriever of a retrieval QA chain. The following code snippet shows how:
+
+.. code-block:: python3
+
+    import ads
+    from langchain.chains import RetrievalQA
+    from ads.llm import GenerativeAI
+
+    ads.set_auth("resource_principal")
+
+    oci_llm = GenerativeAI(
+        compartment_id="ocid1.compartment.####",
+        client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com")  # this can be omitted after the Generative AI service is GA.
+    )
+
+    retriever = opensearch_vector_search.as_retriever(search_kwargs={"vector_field": "embeds",
+                                                                     "text_field": "text",
+                                                                     "k": 3,
+                                                                     "size": 3})
+    qa = RetrievalQA.from_chain_type(
+        llm=oci_llm,
+        chain_type="stuff",
+        retriever=retriever,
+        chain_type_kwargs={
+            "verbose": True
+        }
+    )
+    qa.run("your question")
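
If you also want to inspect which documents the retriever fed to the LLM, langchain's ``RetrievalQA`` can return them alongside the answer (a sketch, not part of the commit; ``return_source_documents`` is standard langchain behavior):

    qa = RetrievalQA.from_chain_type(
        llm=oci_llm,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True,
    )
    # Calling the chain with a dict returns the answer and the source documents.
    result = qa({"query": "your question"})
    print(result["result"], result["source_documents"])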
+
+Retrieval QA with FAISS
+=======================
+
+FAISS as Vector DB
+------------------
+
+A lot of the time your documents are not that large, and you may not have an OCI OpenSearch cluster set up. In that case, you can use ``FAISS`` as your in-memory vector store, which can also do similarity search very efficiently.
+
+The following code snippet shows how to use ``FAISS`` along with the OCI embedding model to do semantic search:
+
+.. code-block:: python3
+
+    from langchain.document_loaders import TextLoader
+    from langchain.text_splitter import CharacterTextSplitter
+    from langchain.vectorstores import FAISS
+
+    loader = TextLoader("your.txt")
+    documents = loader.load()
+    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
+    docs = text_splitter.split_documents(documents)
+
+    # embed the chunks in batches of 16 documents per request
+    l = len(docs)
+    embeddings = []
+    for i in range(l // 16 + 1):
+        subdocs = [item.page_content for item in docs[i * 16: (i + 1) * 16]]
+        embeddings.extend(oci_embedings.embed_documents(subdocs))
+
+    texts = [item.page_content for item in docs]
+    text_embedding_pairs = [(text, embed) for text, embed in zip(texts, embeddings)]
+    db = FAISS.from_embeddings(text_embedding_pairs, oci_embedings)
+    db.similarity_search("your query", k=2, size=2)
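
Since this ``FAISS`` store lives purely in memory, you may want to persist it between runs; langchain's FAISS wrapper supports this directly (a sketch, not part of the commit):

    # Write the index to a local folder, then load it back with the same
    # embeddings object so new queries are embedded consistently.
    db.save_local("faiss_index")
    db = FAISS.load_local("faiss_index", oci_embedings)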
+
+Retrieval QA Using FAISS Vector Store as a Retriever
+----------------------------------------------------
+
+Similarly, you can use the FAISS vector store as a retriever to build a retrieval QA engine using langchain. The following code snippet shows how to use FAISS as a retriever:
+
+.. code-block:: python3
+
+    import ads
+    from langchain.chains import RetrievalQA
+    from ads.llm import GenerativeAI
+
+    ads.set_auth("resource_principal")
+
+    oci_llm = GenerativeAI(
+        compartment_id="ocid1.compartment.####",
+        client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com")  # this can be omitted after the Generative AI service is GA.
+    )
+    retriever = db.as_retriever()
+    qa = RetrievalQA.from_chain_type(
+        llm=oci_llm,
+        chain_type="stuff",
+        retriever=retriever,
+        chain_type_kwargs={
+            "verbose": True
+        }
+    )
+    qa.run("your question")
+
+Deployment of Retrieval QA
+==========================
+
+As of version 0.0.346, Langchain does not support serialization of any vector stores. This is a problem when you want to deploy a retrieval QA langchain application. To solve this problem, we extended our support of vector store serialization:

 - ``OpenSearchVectorSearch``
 - ``FAISS``

 OpenSearchVectorSearch Serialization
 ------------------------------------

-langchain does not automatically support serialization of ``OpenSearchVectorSearch``. However, ADS provides a way to serialize ``OpenSearchVectorSearch``. To serialize ``OpenSearchVectorSearch``, you need to use environment variables to pass in the credentials. The following variables can be passed in through the corresponding environment variables:
+langchain does not automatically support serialization of ``OpenSearchVectorSearch``, but ADS provides a way to serialize it. To serialize ``OpenSearchVectorSearch``, you need to use environment variables to store the credentials. The following settings can be passed in through the corresponding environment variables (see the sketch after this list):

 - http_auth: (``OCI_OPENSEARCH_USERNAME``, ``OCI_OPENSEARCH_PASSWORD``)
 - verify_certs: ``OCI_OPENSEARCH_VERIFY_CERTS``
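
Concretely (a sketch, not part of the commit): set the variables before serializing, and the serialized artifact stores only the OpenSearch URL while the credentials are re-read from the environment at load time.

    import os

    # Credentials are kept out of the serialized artifact; only the
    # opensearch_url is captured by the serializer shown earlier.
    os.environ["OCI_OPENSEARCH_USERNAME"] = "username"
    os.environ["OCI_OPENSEARCH_PASSWORD"] = "password"
    os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] = "False"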
@@ -52,10 +199,10 @@ During deployment, it is very important that you remember to pass in those envir
         "OCI_OPENSEARCH_PASSWORD": "<oci_opensearch_password>",
         "OCI_OPENSEARCH_VERIFY_CERTS": "<oci_opensearch_verify_certs>",)

-OpenSearchVectorSearch Deployment
----------------------------------
+Deployment of Retrieval QA with OpenSearch
+------------------------------------------

-Here is an example code snippet for OpenSearchVectorSearch deployment:
+Here is an example code snippet for the deployment of Retrieval QA using OpenSearch as a retriever:

 .. code-block:: python3

@@ -66,12 +213,12 @@ Here is an example code snippet for OpenSearchVectorSearch deployment:
     ads.set_auth("resource_principal")

     oci_embedings = GenerativeAIEmbeddings(
-        compartment_id="ocid1.compartment.oc1..aaaaaaaapvb3hearqum6wjvlcpzm5ptfxqa7xfftpth4h72xx46ygavkqteq",
+        compartment_id="ocid1.compartment.####",
         client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
     )

     oci_llm = GenerativeAI(
-        compartment_id="ocid1.compartment.oc1..aaaaaaaapvb3hearqum6wjvlcpzm5ptfxqa7xfftpth4h72xx46ygavkqteq",
+        compartment_id="ocid1.compartment.####",
         client_kwargs=dict(service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
     )

@@ -95,8 +242,7 @@ Here is an example code snippet for OpenSearchVectorSearch deployment:
     retriever = opensearch_vector_search.as_retriever(search_kwargs={"vector_field": "embeds",
                                                                      "text_field": "text",
                                                                      "k": 3,
-                                                                     "size": 3},
-                                                      max_tokens_limit=1000)
+                                                                     "size": 3})
     qa = RetrievalQA.from_chain_type(
         llm=oci_llm,
         chain_type="stuff",
@@ -108,7 +254,8 @@ Here is an example code snippet for OpenSearchVectorSearch deployment:
     from ads.llm.deploy import ChainDeployment
     model = ChainDeployment(qa)
     model.prepare(force_overwrite=True,
-                  inference_conda_env="your_conda_pack",
+                  inference_conda_env="<custom_conda_environment_uri>",
+                  inference_python_version="<python_version>",
     )

     model.save()
@@ -124,16 +271,10 @@ Here is an example code snippet for OpenSearchVectorSearch deployment:
     model.predict("your prompt")


-FAISS Serialization
--------------------
-
-If your documents are not too large and you dont have a OCI OpenSearch cluster, you can use ``FAISS`` as your in-memory vector store, which can also do similarty search very efficiently. For ``FAISS``, you can just use it and deploy it as it is.
+Deployment of Retrieval QA with FAISS
+-------------------------------------

-
-FAISS Deployment
-----------------
-
-Here is an example code snippet for FAISS deployment:
+Here is an example code snippet for the deployment of Retrieval QA using FAISS as a retriever:

 .. code-block:: python3

@@ -181,8 +322,10 @@ Here is an example code snippet for FAISS deployment:
     )

     from ads.llm.deploy import ChainDeployment
+    model = ChainDeployment(qa)
     model.prepare(force_overwrite=True,
-                  inference_conda_env="your_conda_pack",
+                  inference_conda_env="<custom_conda_environment_uri>",
+                  inference_python_version="<python_version>",
     )

     model.save()
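
To complete the FAISS flow after ``model.save()`` (a sketch, not part of the commit; the deploy and predict steps are elided by this hunk but mirror the OpenSearch example above, which ends with ``model.predict``):

    # Assumption: ChainDeployment follows the usual ADS flow of
    # prepare() -> save() -> deploy() -> predict().
    model.deploy()
    model.predict("your prompt")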
