LangChain Integration
*********************

.. versionadded:: 2.11.19

LangChain-compatible models/interfaces are needed for LangChain applications to invoke LLMs deployed on the OCI Data Science Model Deployment service.

.. admonition:: LangChain Community
  :class: note

  Integrations from ADS may provide additional or experimental features in the latest updates, while the stable integrations (such as ``OCIModelDeploymentVLLM`` and ``OCIModelDeploymentTGI``) are also available from `LangChain Community <https://python.langchain.com/docs/integrations/llms/oci_model_deployment_endpoint>`_.

If you deploy an LLM on the OCI Data Science Model Deployment service using `AI Quick Actions <https://github.com/oracle-samples/oci-data-science-ai-samples/blob/main/ai-quick-actions/model-deployment-tips.md>`_ or `HuggingFace TGI <https://huggingface.co/docs/text-generation-inference/index>`_, you can use the integration models described on this page to build your application with LangChain.

Authentication
==============

By default, the integration uses the same authentication method configured with ``ads.set_auth()``. Optionally, you can also pass the ``auth`` keyword argument when initializing the model to use a specific authentication method for the model. For example, to use resource principal for all OCI authentication:

.. code-block:: python3

    import ads
    from ads.llm import ChatOCIModelDeploymentVLLM

    ads.set_auth(auth="resource_principal")

    llm = ChatOCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Optionally you can specify additional keyword arguments for the model, e.g. temperature.
        temperature=0.1,
    )

Alternatively, you may use a specific authentication method for the model:

.. code-block:: python3

    import ads
    from ads.llm import ChatOCIModelDeploymentVLLM

    llm = ChatOCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Use security token authentication for the model.
        auth=ads.auth.security_token(profile="my_profile"),
        # Optionally you can specify additional keyword arguments for the model, e.g. temperature.
        temperature=0.1,
    )
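
The ``auth`` argument accepts the signer configurations built by the ``ads.auth`` helper functions. As one more illustration, here is a minimal sketch using API key authentication with a non-default profile from your OCI config file; the profile name is a placeholder:

.. code-block:: python3

    import ads
    from ads.llm import ChatOCIModelDeploymentVLLM

    llm = ChatOCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Use API key authentication with the (hypothetical) "my_profile" profile
        # from the default OCI config file (~/.oci/config).
        auth=ads.auth.api_keys(profile="my_profile"),
    )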

Completion Models
=================

Completion models take a text string as input and return a string with the completion. To use completion models, your model should be deployed with the completion endpoint (``/v1/completions``). The following example shows how you can use the ``OCIModelDeploymentVLLM`` class for a model deployed with the vLLM container. If you deployed the model with a TGI container, you can use ``OCIModelDeploymentTGI`` similarly.

.. code-block:: python3

    from ads.llm import OCIModelDeploymentVLLM

    llm = OCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Optionally you can specify additional keyword arguments for the model.
        max_tokens=32,
    )

    # Invoke the LLM. The completion will be a string.
    completion = llm.invoke("Who is the first president of United States?")

    # Stream the completion
    for chunk in llm.stream("Who is the first president of United States?"):
        print(chunk, end="", flush=True)

    # Invoke asynchronously
    completion = await llm.ainvoke("Who is the first president of United States?")

    # Stream asynchronously
    async for chunk in llm.astream("Who is the first president of United States?"):
        print(chunk, end="", flush=True)
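
Like any LangChain LLM, the completion model can be composed with a prompt template into a chain. Below is a minimal sketch reusing the ``llm`` defined above; the translation prompt is illustrative:

.. code-block:: python3

    from langchain_core.prompts import PromptTemplate

    # Template for the input text; {text} is filled in at invocation time.
    template = PromptTemplate.from_template(
        "Translate English into French.\nEnglish: {text}\nFrench: "
    )

    # Compose the prompt template and the completion model into a chain.
    chain = template | llm

    # The output of the chain is a string completion.
    chain.invoke({"text": "Hello!"})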

Chat Models
===========

Chat models take `chat messages <https://python.langchain.com/docs/concepts/#messages>`_ as input and return an additional chat message (usually an `AIMessage <https://python.langchain.com/docs/concepts/#aimessage>`_) as output. To use chat models, your model must be deployed with the chat completions endpoint (``/v1/chat/completions``). The following example shows how you can use the ``ChatOCIModelDeploymentVLLM`` class for a model deployed with the vLLM container. If you deployed the model with a TGI container, you can use ``ChatOCIModelDeploymentTGI`` similarly.

.. code-block:: python3

    from langchain_core.messages import HumanMessage, SystemMessage
    from ads.llm import ChatOCIModelDeploymentVLLM

    llm = ChatOCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Optionally you can specify additional keyword arguments for the model.
        max_tokens=32,
    )

    messages = [
        SystemMessage(content="You're a helpful assistant providing concise answers."),
        HumanMessage(content="Who's the first president of United States?"),
    ]

    # Invoke the LLM. The response will be an `AIMessage`.
    response = llm.invoke(messages)
    # Print the text of the response
    print(response.content)

    # Stream the response. Note that each chunk is an `AIMessageChunk`.
    for chunk in llm.stream(messages):
        print(chunk.content, end="", flush=True)

    # Invoke asynchronously
    response = await llm.ainvoke(messages)
    print(response.content)

    # Stream asynchronously
    async for chunk in llm.astream(messages):
        print(chunk.content, end="")
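
Chat models compose with prompt templates and output parsers in the same way. Here is a minimal sketch reusing the ``llm`` defined above:

.. code-block:: python3

    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate

    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", "You're a helpful assistant providing concise answers."),
            ("human", "{question}"),
        ]
    )

    # Parse the `AIMessage` output into a plain string.
    chain = prompt | llm | StrOutputParser()

    chain.invoke({"question": "Who's the first president of United States?"})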

Tool Calling
============

The vLLM container supports `tool/function calling <https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#automatic-function-calling>`_ on some models (e.g., Mistral and Hermes models). To use tool calling, you must customize the "Model deployment configuration" to use ``--enable-auto-tool-choice`` and specify ``--tool-call-parser`` when deploying the model with the vLLM container. A customized ``chat_template`` is also needed for tool/function calling to work with vLLM. ADS includes a convenient way to import the example templates provided by vLLM.

.. code-block:: python3

    from ads.llm import ChatOCIModelDeploymentVLLM, ChatTemplates

    llm = ChatOCIModelDeploymentVLLM(
        model="odsc-llm",
        endpoint="https://modeldeployment.oci.customer-oci.com/<OCID>/predict",
        # Set tool_choice to "auto" to enable tool/function calling.
        tool_choice="auto",
        # Use the modified Mistral template provided by vLLM.
        chat_template=ChatTemplates.mistral(),
    )

The following is an example of creating an agent with a tool to get the current exchange rate:

.. code-block:: python3

    import requests
    from langchain_core.tools import tool
    from langchain_core.prompts import ChatPromptTemplate
    from langchain.agents import create_tool_calling_agent, AgentExecutor

    @tool
    def get_exchange_rate(currency: str) -> dict:
        """Obtain the current exchange rates of a currency in ISO 4217 three-letter currency code."""
        response = requests.get(f"https://open.er-api.com/v6/latest/{currency}")
        return response.json()

    tools = [get_exchange_rate]
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", "You are a helpful assistant"),
            ("placeholder", "{chat_history}"),
            ("human", "{input}"),
            ("placeholder", "{agent_scratchpad}"),
        ]
    )

    agent = create_tool_calling_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(
        agent=agent, tools=tools, verbose=True, return_intermediate_steps=True
    )
    agent_executor.invoke({"input": "what's the currency conversion of USD to Yen"})
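
If you prefer to inspect tool calls directly instead of running an agent, you can bind the tools to the chat model yourself (``create_tool_calling_agent`` relies on the same binding mechanism). A minimal sketch reusing the ``llm`` and ``get_exchange_rate`` defined above:

.. code-block:: python3

    # Attach the tool schema to the model so it can request tool invocations.
    llm_with_tools = llm.bind_tools([get_exchange_rate])

    # The response is an `AIMessage`; any tool invocations requested by the
    # model are available in its `tool_calls` attribute.
    response = llm_with_tools.invoke("What's the exchange rate of USD?")
    print(response.tool_calls)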