1️⃣ Dynamic MCP Tool Filtering using Embedding & 2️⃣ Confident tool calls with BAML #2893
Replies: 7 comments 1 reply
-
Here is what I propose for the MCP specification update to facilitate dynamic tool discovery. Please support the idea here.
-
Another elegant solution would be to allow MCP Clients to utilize the …
-
Here is a paper that may be relevant.
-
I think this is relevant: it addresses LLMs' limitations in generating sophisticated long-form outputs. A survey analyzing over 1,400 research papers:
https://github.com/Meirtz/Awesome-Context-Engineering
-
I believe the MCP Client should be able to do RAG over MCP tool metadata to improve quality and scalability.
-
Related: #5963
-
Context rot affects LLM performance: longer input does not guarantee consistent results. 🔍 Chroma researchers tested 18 LLMs on simple tasks.
Research results: https://github.com/chroma-core/context-rot
-
Hi team 👋
First, thank you for the amazing work you're doing on Roo-Code! I've been diving into this ecosystem and wanted to propose a feature that I think would benefit many developers working with large toolchains across multiple MCP servers.
🤖 The Problem
In LLM-based environments like Cline, we've seen that the reliability of tool usage drops significantly once the number of available tools exceeds roughly 20. This is a known issue: LLMs struggle to reason effectively when presented with too many tool options.
Now consider a real-world setup: several MCP servers enabled at once, each exposing a handful of tools, so the combined toolset quickly grows past 30 tools.
Currently, users are forced to manually disable MCP servers just to stay under the tool limit. That's not scalable, and it limits the utility of what MCP can offer.
🎯 The Opportunity with Embeddings
Embedding-based filtering could be the first half (1️⃣) of a solution to this scaling problem. Imagine dynamically filtering tools using semantic similarity between the user's request and each tool's name and description.
By narrowing the tool list down to the most relevant 20 (or a configurable limit), you:
✅ Keep the toolset within the LLM comfort zone
✅ Increase the accuracy of tool selection and calls
✅ Avoid manual management of tool/server lists
For example, if a user enters:
"Commit all my modified files and log their last modified timestamps before pushing to GitHub."
then only the 2-3 relevant tools from 2-3 MCP servers are selected dynamically. There's no need to expose the full universe of 30+ tools.
This would allow MCP to scale elegantly even as more servers and tools come online. Ideally, this would support dynamic registration/deregistration of MCP servers and on-the-fly tool filtering.
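To make the idea concrete, here is a minimal sketch of the filtering step. The tool names, descriptions, and the `embed` function are all hypothetical: a real client would call an actual embedding model, whereas this placeholder uses bag-of-words vectors so the example is self-contained. The ranking logic is the same either way.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Placeholder "embedding": a bag-of-words count vector. A real MCP
    # client would call an embedding model here; only this function changes.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_tools(query: str, tools: dict[str, str], limit: int = 20) -> list[str]:
    # Rank every registered tool by similarity between the user query and
    # the tool's description, then expose only the top `limit` to the LLM.
    q = embed(query)
    ranked = sorted(tools, key=lambda name: cosine(q, embed(tools[name])),
                    reverse=True)
    return ranked[:limit]

# Hypothetical tools aggregated from several MCP servers.
tools = {
    "git_commit":     "commit modified files to the git repository",
    "git_push":       "push commits to a remote such as GitHub",
    "file_stat":      "log last modified timestamps of files",
    "weather_lookup": "get the current weather forecast",
}
query = ("Commit all my modified files and log their last modified "
         "timestamps before pushing to GitHub.")
print(filter_tools(query, tools, limit=3))
# → ['file_stat', 'git_commit', 'git_push']  (weather tool never shown)
```

With a configurable `limit`, the LLM always sees a short, relevant tool list no matter how many servers are registered.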
2️⃣ Execute tool calls more reliably with BAML
The second half of the solution is calling the selected tools confidently. BAML's capabilities (similar to Pydantic's) let us reliably populate all the required fields and parameters for the tool being called, instead of hoping the LLM emits well-formed arguments.
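The guarantee being described can be sketched in plain Python. This is not BAML syntax: BAML is its own DSL whose runtime parses LLM output against a declared schema. The hypothetical `CommitArgs` schema and `build_call` helper below only illustrate the contract, i.e. a tool is never invoked unless every required field is present with the declared type.

```python
from dataclasses import dataclass, fields

# Hypothetical argument schema for a commit tool. In BAML (or Pydantic)
# this would be a typed class definition that the runtime validates
# LLM output against before any tool runs.
@dataclass
class CommitArgs:
    message: str
    add_all: bool

def build_call(schema, raw: dict):
    # Accept a tool call only if every declared field is present with the
    # right type; otherwise fail loudly instead of invoking the tool
    # with malformed arguments.
    kwargs = {}
    for f in fields(schema):
        if f.name not in raw:
            raise ValueError(f"missing required field: {f.name}")
        if not isinstance(raw[f.name], f.type):
            raise TypeError(f"{f.name}: expected {f.type.__name__}")
        kwargs[f.name] = raw[f.name]
    return schema(**kwargs)

# A well-formed LLM output passes validation...
call = build_call(CommitArgs, {"message": "log mtimes, then push",
                               "add_all": True})
print(call)
# ...while output missing `add_all` raises ValueError before the tool runs.
```

BAML adds more on top of this sketch (schema-aware parsing of loosely formatted model output), but the reliability win for MCP tool calls comes from exactly this kind of enforced contract.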
💡 Proposed Enhancements:
✅ Embedding-based dynamic filtering of MCP tools down to a configurable limit (e.g. top 20)
✅ Dynamic registration/deregistration of MCP servers with on-the-fly tool filtering
✅ BAML-style schema-validated population of tool-call parameters
📚 References
BAML with MCP tools – example notebook
Large-scale classification with BAML
hello.py – example with Embeddings filtering
pick_best_category.baml
Would love to hear your thoughts on this! It could make the toolchain smarter, lighter, and way more user-friendly.
Thanks so much 🙏
Damein