Maintain context persistence for the agent #26259
Unanswered · juliouyehara asked this question in Q&A
Hello, @juliouyehara! I'm here to help you with your coding questions and bug fixes. Let's tackle this challenge together! To configure the agent so that the template context is passed just once, you can use the `state_modifier` parameter of `create_react_agent`:

```python
from langgraph.prebuilt import create_react_agent
from langgraph.prebuilt.chat_agent_executor import AgentState
from langchain_core.prompts import ChatPromptTemplate

# Define the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("placeholder", "{messages}"),
    ]
)

# Define the state modifier function
def _modify_state_messages(state: AgentState):
    # Pass the template context just once
    return prompt.invoke({"messages": state["messages"]}).to_messages()

# Create the agent with the state modifier
app = create_react_agent(model, tools, state_modifier=_modify_state_messages)

# Invoke the agent
messages = app.invoke({"messages": [("human", query)]})
print(
    {
        "input": query,
        "output": messages["messages"][-1].content,
    }
)
```

This setup ensures that the template context is passed only once, and subsequent calls will only spend tokens on the output [1].
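As background on the behavior being asked about, here is a minimal pure-Python sketch (no LangChain required; the function and message shapes are illustrative, not LangChain's actual API) of what a chat prompt template does on each call: the static system message is prepended to the running conversation, so its tokens are part of the input payload every time unless the provider caches the prompt.

```python
# Illustrative stand-in for a chat prompt template's static context.
SYSTEM = ("system", "You are a helpful assistant.")

def apply_template(messages):
    # Rough analogue of prompt.invoke({"messages": messages}).to_messages():
    # combine the static template context with the conversation so far.
    return [SYSTEM] + list(messages)

history = [("human", "What is 2 + 2?")]
first_call = apply_template(history)

history.append(("ai", "4"))
history.append(("human", "And times 3?"))
second_call = apply_template(history)

# The system message appears in the payload of both calls.
print(first_call[0] == second_call[0])
```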
Example Code
Follow this format:

```
Question: input question to answer
Thought: consider previous and subsequent steps
Action:
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
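Purely as illustration, one pass through that Thought/Action/Observation loop might read as follows (the question, tool name, and observation are made up):

```python
# Hypothetical filled-in example of the format above.
steps = [
    "Question: What is 12 * 7?",
    "Thought: I should use the calculator tool.",
    "Action: calculator",
    "Observation: 84",
    "Thought: I know what to respond",
    "Action: respond with 84",
]
print("\n".join(steps))
```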
Description
When I configure the agent, it consumes the tokens in the template, and every subsequent agent call consumes the template tokens again. Is there a way to pass the template context just once, so that each new call spends tokens only on the output?
System Info
platform: Windows
python version: 3.11.9