Replies: 1 comment
-
@strawgate just want to let you know I've seen your note and it aligns closely with a few internal conversations we've had. I'm traveling and don't have time to write a full response, but would you be up for chatting live early next week so we can explore together? You can email me at my first name at prefect.io!
-
@jlowin tl;dr: I'm deeply interested in using FastMCP significantly more, and I would love to contribute these capabilities (or at least start the conversation on how we could include them) in `contrib` or `main`.

From here I'll do some stage setting and then we'll get to the good stuff. Feel free to skip ahead to the good stuff.
Problem 1: Bad Tools make bad Agents
Every MCP Server has a set of tools. Information about these tools is included in the instructions for the AI Agent, and it's up to the AI Agent to figure out, based on the provided names, descriptions, and arguments, how to leverage those tools to solve the user's question. These tool instructions end up having a huge influence on the AI Agent's behavior and performance! When there's a problem with the instructions, the AI Agent's performance suffers.
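To make this concrete, here is a minimal sketch of a FastMCP tool (an illustrative example, not the actual MCP Time Server code). The function name, docstring, and parameter types are what get serialized into the server's tool listing, and the client typically injects them straight into the model's context:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

from fastmcp import FastMCP

mcp = FastMCP("Time Utilities")

@mcp.tool()
def get_current_time(timezone: str) -> str:
    """Get the current time in the given IANA timezone (e.g. 'Europe/London')."""
    # The name, docstring, and parameter schema of this function are what the
    # AI Agent sees: they are the tool's "instruction book".
    return datetime.now(ZoneInfo(timezone)).isoformat()

if __name__ == "__main__":
    mcp.run()
```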
An Example
Imagine you have a flight booking Agent that uses the MCP Time Server, and you give it a prompt describing its job.
Tools
To help the AI Agent convert timezones, you also give it the `convert_time` tool from the MCP Time Server. When you do this, the "instruction book" that the author of `convert_time` has written is automatically injected into your Agent's prompt, and you're off to the races.
A Problem
But there's a problem! Every time someone on the west coast asks a question that requires time conversion, the LLM uses the tool and an error occurs. Unfortunately, `America/San_Francisco` (the example value for `target_timezone`) is not actually a valid IANA timezone, so every time the AI Agent uses the tool it gets an error back instead of the converted time. Because these descriptions are included in the Agent's prompt, the model is significantly more likely to use this invalid IANA timezone in a request.
You can't change the MCP Server, you can't change the tool, and you can't change the documentation. So you have to give your Agent specific instructions for using this tool.
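To illustrate, here is roughly what the injected tool definition and the resulting prompt patch look like. The schema below is an approximation of the MCP Time Server's `convert_time` tool, not its exact wording; the key point is the misleading example value, and the only fix available to you is more agent-specific prompt text:

```python
# Approximate shape of what a client receives from tools/list for convert_time.
# The exact field text of the real MCP Time Server may differ; what matters is
# that the example value in the description is not a valid IANA timezone.
convert_time_tool = {
    "name": "convert_time",
    "description": "Convert time between timezones",
    "inputSchema": {
        "type": "object",
        "properties": {
            "source_timezone": {"type": "string", "description": "Source IANA timezone name"},
            "time": {"type": "string", "description": "Time to convert, 24-hour format (HH:MM)"},
            "target_timezone": {
                "type": "string",
                "description": "Target IANA timezone name (e.g. 'America/San_Francisco')",
            },
        },
        "required": ["source_timezone", "time", "target_timezone"],
    },
}

# The only lever you have: tool-specific guidance baked into *your* Agent's prompt.
AGENT_SYSTEM_PROMPT = """\
You are a flight booking assistant.
When calling convert_time, note that 'America/San_Francisco' is NOT a valid IANA
timezone; use 'America/Los_Angeles' for the US west coast.
"""
```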
Problem 2: Generic Tools are Bad Tools
We now have thousands of third-party MCP Servers. With MCP, you hook all of these generic Tools into your various AI Agents and let each Agent decide which ones are the right ones for the job.
Imagine I want to reply to a user's question on GitHub. I decide that the best way to formulate a great response is to read the GitHub issue, look up the customer in Salesforce, query Elasticsearch for related context, and write a comment back on the issue, so I hook the GitHub, Salesforce, and Elasticsearch MCP Servers into my Agent.
The Agent now has almost 100 tools available to it. Each one is a shiny distraction on the path to solving the user's problem. These tools do almost the right thing most of the time, but the Agent can no longer reliably complete tasks: it gets distracted, performs unnecessary and expensive tool calls, and generally does not do what you want it to do.
Solving with PromptOps?
So you begin to limit tools and write lengthy prompts that tell the Agent which tools to use, when to use them, and how.
In addition, you run into issues with the LLM trying to consume too much data: LLMs have strict limits on how much information they can handle before they start producing errors. There is no way for the AI Agent to know ahead of time whether searching for 100 issues on GitHub will return 100 megabytes or 100 kilobytes of data. If the AI Agent receives too large a response from the tool, everything blows up.
So, to avoid these footguns, you add even more guidance to the prompt about limiting searches and keeping result sizes small.
Unfortunately, you've now embedded all of this knowledge into your Agent's prompt. Your Agent has a mega prompt that tries to make it an expert at everything.
When you go to build more Agents, you copy this mega prompt over and over again. All of this just to work around the "bad tools" problem.
Problem 3: Specialized Tools don't scale
So, like me, you decide that your AI Agents shouldn't have 100 tools; they should only know about the exact tools they need to complete the task at hand.
You restrict the Agent to only the tools it needs, but even when you get it down to 4 tools (read GitHub issue, read Salesforce Customer, query Elasticsearch, write Comment), those 4 tools still hold a lot of power.
So ultimately, to constrain the Agent, you figure out exactly what you need and write specialty tools:
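A sketch of the kind of specialty tool this leads to. The server name, repo, and token handling below are hypothetical; the point is that each tool is narrowed down to exactly one job in exactly one place:

```python
import httpx
from fastmcp import FastMCP

REPO = "my-org/my-repo"  # hypothetical: the single repo this Agent is allowed to touch
mcp = FastMCP("Support Reply Helpers")

@mcp.tool()
def get_issue_thread(issue_number: int) -> dict:
    """Fetch one issue and its comments from the supported repo (read-only, one repo)."""
    base = f"https://api.github.com/repos/{REPO}/issues/{issue_number}"
    issue = httpx.get(base).json()
    comments = httpx.get(f"{base}/comments").json()
    return {"issue": issue, "comments": comments}

@mcp.tool()
def post_issue_reply(issue_number: int, body: str) -> dict:
    """Post a reply to an issue in the supported repo; no other write access."""
    return httpx.post(
        f"https://api.github.com/repos/{REPO}/issues/{issue_number}/comments",
        json={"body": body},
        headers={"Authorization": "Bearer <GITHUB_TOKEN>"},  # placeholder token
    ).json()

if __name__ == "__main__":
    mcp.run()
```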
But now you're no longer benefiting from the thousands of third-party MCP Servers that are out there. You're just writing your own tools and MCP might as well not exist.
Specializing Tools and Embedded Agents
A built-in tool-rewriting framework
I don't want to build specialized tools, but I do want specialized tools. Instead of building a tool for every possible use case, I want to take existing tools and tailor them to my specific use-case.
A FastMCP-based tool-rewriting framework would allow you to take off-the-shelf tools from third-party MCP Servers, improve them for your use case, share those improvements with the community, and optionally specialize the tools for the task at hand.
I've made a draft PR that offers this capability here: #599
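As a rough illustration of the idea (not necessarily how #599 implements it), you can already get part of the way there by hand with plain FastMCP primitives: re-expose the upstream tool under a corrected description by wrapping the proxied call. The upstream URL and tool signature below are assumptions:

```python
from fastmcp import Client, FastMCP

upstream = Client("http://localhost:8000/mcp")  # assumption: where the third-party server runs
mcp = FastMCP("Improved Time Tools")

@mcp.tool()
async def convert_time(time: str, source_timezone: str, target_timezone: str) -> str:
    """Convert a time between *valid* IANA timezones.

    For the US west coast use 'America/Los_Angeles', not 'America/San_Francisco'.
    """
    # Forward the call to the original tool; only the description and examples change.
    async with upstream:
        result = await upstream.call_tool(
            "convert_time",
            {"time": time, "source_timezone": source_timezone, "target_timezone": target_timezone},
        )
    return str(result)

if __name__ == "__main__":
    mcp.run()
```

A built-in rewriting framework would make this declarative instead of requiring a hand-written wrapper per tool.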
A framework for Embedded Agents
A FastMCP-based Agent Framework would allow you to stop teaching every AI Agent how to use every tool, and instead have each MCP Server contain an embedded AI Agent that leverages its expertise on the tools to help you solve problems.
Each embedded Agent is exposed as just another tool on the MCP Server. On the same server that has a query-based `search_issues` tool, you could have an Agent-based `find_related_issues` tool that takes a textual description of the issue and returns a list of related issues. Both tools return the same kind of response to the client, but one, the direct tool call, requires deep knowledge of GitHub query syntax, while the other simply entails the agent asking for what it wants.
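A minimal sketch of what that could look like. Here the "embedded agent" is reduced to a single LLM step that turns a free-text description into GitHub search syntax; a real implementation would run a full tool-calling loop. The model name, repo scope, and use of the OpenAI client are assumptions for illustration:

```python
import httpx
from fastmcp import FastMCP
from openai import OpenAI

mcp = FastMCP("GitHub Issues")
llm = OpenAI()  # assumption: any LLM client would do

def _search(query: str) -> list[dict]:
    """Hit the GitHub issue search API with a raw search-syntax query."""
    resp = httpx.get("https://api.github.com/search/issues", params={"q": query})
    return resp.json().get("items", [])

@mcp.tool()
def search_issues(query: str) -> list[dict]:
    """Search issues using raw GitHub search syntax (expert use)."""
    return _search(query)

@mcp.tool()
def find_related_issues(description: str) -> list[dict]:
    """Describe an issue in plain language; an embedded agent finds related issues."""
    completion = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Turn the user's description into a GitHub issue search query. "
                "Reply with search syntax only, scoped to repo:jlowin/fastmcp.",
            },
            {"role": "user", "content": description},
        ],
    )
    return _search(completion.choices[0].message.content)

if __name__ == "__main__":
    mcp.run()
```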
I have a repo here which contains some initial work towards this capability on top of FastMCP: https://github.com/strawgate/fastmcp-agents
Late-binding wrapping/rewriting of third-party MCP Servers
With FastMCP's built-in proxying capabilities, tool rewriting, and an embedded Agent framework, you could wrap third-party MCP servers without code, at run-time, improve their tool descriptions and parameters, and provide an expert agent on top of them.
You could then use the result anywhere an MCP Server can be used.
- No-code embedding of expert agents on top of any third-party MCP Server
- No-code tool-rewriting of third-party MCP Servers: for example, fixing the tool descriptions and parameters of a third-party server without writing any code
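A sketch of what that late-binding path could look like, assuming FastMCP's proxying API (`FastMCP.as_proxy`) and a reachable upstream server; the rewriting layer and embedded agent proposed above would then hook in on top of the proxy:

```python
from fastmcp import FastMCP

# Wrap a third-party MCP server at run-time; the URL is a placeholder.
proxy = FastMCP.as_proxy("https://example.com/upstream-mcp")

# The proposed tool-rewriting layer and embedded expert agent would be applied
# here, on `proxy`, before serving it like any other MCP server.

if __name__ == "__main__":
    proxy.run()
```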