What is the method for fine-tuning an LLM with function calling for a Langchain Agent? Or, how can I build a custom dataset for fine-tuning an LLM with function calling for a Langchain Agent?
#30890
Checked other resources
I added a very descriptive title to this question.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
Commit to Help
I commit to help with one of those options 👆
Example Code
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_core.prompts import ChatPromptTemplate
from typing import List
from dotenv import load_dotenv

# Load API key information
load_dotenv()


@tool
def add(xy_pairs: List[tuple]) -> List[dict]:
    """Add each (x, y) pair of integers in xy_pairs."""
    results = []
    for x, y in xy_pairs:
        results.append({f"{x}+{y}": x + y})
    return results


@tool
def multiply(xy_pairs: List[tuple]) -> List[dict]:
    """Multiply each (x, y) pair of integers in xy_pairs."""
    results = []
    for x, y in xy_pairs:
        results.append({f"{x}*{y}": x * y})
    return results


tools = [add, multiply]

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.6)
llm_with_tools = llm.bind_tools(tools)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

query = "What is 3 * 12?"
result = agent_executor.invoke({"input": query})
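The tool bodies can be checked without a model or API key. A plain-Python sketch of the two decorated functions above (same logic, LangChain removed):

```python
from typing import List

def add(xy_pairs: List[tuple]) -> List[dict]:
    # Same logic as the @tool-decorated add above, minus LangChain.
    return [{f"{x}+{y}": x + y} for x, y in xy_pairs]

def multiply(xy_pairs: List[tuple]) -> List[dict]:
    # Same logic as the @tool-decorated multiply above.
    return [{f"{x}*{y}": x * y} for x, y in xy_pairs]

print(multiply([(3, 12)]))    # [{'3*12': 36}]
print(add([(1, 2), (3, 4)]))  # [{'1+2': 3}, {'3+4': 7}]
```

The first print matches the tool observation `[{'3*12': 36}]` that appears in the agent log below, so any difference in agent behavior comes from the model, not the tools.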
Description
I built a LangChain agent using OpenAI, and the agent uses the tools effectively and produces the results I intend. Now I want to build a LangChain agent using an open-source local LLM. When I build an agent using llama3.2, it doesn't use the tools well. I have therefore been researching the topic of "fine-tuning an LLM for function calling", but when I apply what I find to the LangChain agent I built, it doesn't work well.
For example, running the code above produces the following log:
> Entering new AgentExecutor chain...
Invoking: `multiply` with `{'xy_pairs': [[3, 12]]}`
[{'3*12': 36}]
3 * 12 is 36.
> Finished chain.
You can see results like the ones below:
[{'actions': [ToolAgentAction(tool='multiply', tool_input={'xy_pairs': [[3, 12]]}, log="\nInvoking: multiply with ~
'messages': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call ~
{'steps': [AgentStep(action=ToolAgentAction(tool='multiply', tool_input={'xy_pairs': [[3, 12]]}, log="\nInvoking ~
'messages': [FunctionMessage(content='[{"3*12": 36}]', additional_kwargs={}, response ~
{'output': 'The result of \(3 \times 12\) is \(36\).',
'messages': [AIMessage(content='The result of \(3 \times ~
However, when using a fine-tuned LLM, I have seen the following results.
Entering new AgentExecutor chain...
{"type":"function","function":{"name":"multiply","arguments":[{"x":3, "y":12}]}}
The result of the multiplication operation between 3 and 12 is 36.
Finished chain.
The log history is:
{'output': '{"type":"function","function":{"name":"multiply","arguments":[{"x":3, "y":12}]}}\nThe result of the multiplication operation between 3 and 12 is 36.',
'messages': [AIMessage(content='{"type":"function","function":{"name":"multiply","arguments":[{"x":3, "y":12}]}}\nThe result of the multiplication operation between 3 and 12 is 36.', additional_kwargs={}, response_metadata={})]}]
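The failure mode is visible when the two logs are compared: the OpenAI-backed model returns the call in a structured `tool_calls` field, which `AgentExecutor` dispatches to the tool, while the fine-tuned model writes the same call as plain text inside `content`, so the executor sees an ordinary answer and never runs the tool. A minimal pure-Python illustration (the dictionaries are simplified stand-ins for the real message objects, not LangChain's actual types):

```python
# Output shape from the OpenAI-backed agent: the call arrives as
# structured tool_calls metadata, which the executor routes to `multiply`.
structured = {
    "content": "",
    "tool_calls": [
        {"name": "multiply", "args": {"xy_pairs": [[3, 12]]}, "id": "call_1"}
    ],
}

# Output shape from the fine-tuned local model: the same call is embedded
# as plain text, so the executor treats it as a final answer and stops.
plain_text = {
    "content": '{"type":"function","function":{"name":"multiply","arguments":[{"x":3, "y":12}]}}',
    "tool_calls": [],
}

def will_invoke_tool(message):
    # The executor only dispatches when tool_calls is populated.
    return bool(message["tool_calls"])

print(will_invoke_tool(structured))  # True
print(will_invoke_tool(plain_text))  # False
```

So the fine-tuned model has learned to *describe* a function call, but not to emit it in the format the serving stack (here, Ollama's tool-calling interface) turns into structured tool calls.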
I want to know in detail how to build a dataset for a LangChain agent from scratch. I want to fine-tune a small local LLM so that it follows the same flow as the trajectories of the OpenAI-backed LangChain agent as closely as possible. How should I do this?
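One common approach (not LangChain-specific; the exact message layout below is an assumption and must be matched to the chat template your fine-tuning framework expects) is to run the OpenAI-backed agent with `return_intermediate_steps=True`, then serialize each captured trajectory into a chat-format training record, one per line of a JSONL file. A sketch of the serialization step:

```python
import json

# Hypothetical helper: turn one captured agent trajectory into a
# chat-format training record. The single-tool-call layout is an
# assumption; multi-step trajectories would repeat the assistant/tool
# message pair once per intermediate step.
def trajectory_to_record(query, tool_name, tool_args, tool_output, answer):
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": query},
            {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "type": "function",
                        "function": {
                            "name": tool_name,
                            "arguments": json.dumps(tool_args),
                        },
                    }
                ],
            },
            {"role": "tool", "name": tool_name, "content": json.dumps(tool_output)},
            {"role": "assistant", "content": answer},
        ]
    }

record = trajectory_to_record(
    query="What is 3 * 12?",
    tool_name="multiply",
    tool_args={"xy_pairs": [[3, 12]]},
    tool_output=[{"3*12": 36}],
    answer="The result of 3 * 12 is 36.",
)
print(json.dumps(record))  # one JSONL line of the fine-tuning dataset
```

The key property to preserve is that the tool call lives in a dedicated `tool_calls` field rather than in free text, so the fine-tuned model learns the structured format the agent executor can act on.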
System Info
langchain==0.3.23
langchain-chroma==0.2.2
langchain-community==0.3.21
langchain-core==0.3.51
langchain-experimental==0.3.4
langchain-google-genai==2.1.2
langchain-huggingface==0.1.2
langchain-ollama==0.3.1
langchain-openai==0.3.12
langchain-text-splitters==0.3.8