Anthropic’s New MCP Tool Search: The End of Context Pollution?

Anthropic has just released a fix for one of the biggest issues facing the Model Context Protocol (MCP): context window pollution caused by tool definitions. If you are building MCP clients or servers, this update changes how tools are loaded and utilized.

Here is why this update is critical and a guide to the best practices for implementing it.

Why It Is Needed: Solving the Context Bloat

The core problem with the standard MCP setup is how it handles tool connections. By default, when you connect a new MCP server, the system loads the definitions for all available tools on that server immediately. This happens before a single conversational message is sent, meaning a massive portion of your context window can be occupied by tools you might not even use.

The GitHub Example: To illustrate the severity of this issue, consider the GitHub MCP. It contains approximately 91 different tools. Upon connection, these definitions consume about 46,000 tokens—roughly 22% of the entire context window available to Claude 3.5 Opus.

If you connect 5 to 10 different MCP servers, each with 10 to 20 tools, your context window becomes heavily polluted.

The Solution: Anthropic’s new MCP Tool Search allows Claude to load tools dynamically into the context only when they are needed, rather than pre-loading everything,. This approach can theoretically result in an 85% reduction in token usage. instead of holding every definition in memory, Claude loads a single "Tool Search" tool. When a query is made, the system searches the catalog, finds 3-5 relevant tools, and loads the full definitions for only those specific tools.

Best Practice Guide: Implementing MCP Tool Search

While the potential savings are massive, this feature requires specific implementation strategies to work effectively.

1. When to Use It

This feature is not necessary for every scenario. You should implement Tool Search if:

Your MCP tool definitions occupy more than 10% of the context window,.
You have 10 or more MCP tools.

Skip it if: You only have 3-5 tools, if all your tools are frequently used, or if latency is absolutely critical (as the search mechanism adds latency).

2. Choosing Your Search Variant

There are two search mechanisms available, and you should choose based on your naming conventions:

Regular Expression (Regex) Based: Best when your tools follow consistent naming conventions (e.g., get*data). Claude writes patterns to match tool names.
BM25 (Keyword Based): Best when tool names and descriptions vary. This is a semantic search with relevance ranking where Claude writes natural language queries like "tool for weather",.

3. Client-Side Implementation Checklist

If you are building an MCP client, follow this four-step process:

Enable the Beta Header: Add the required beta string to your API request header; without it, tool search will not work.
Add the Tool Search Tool: Add either the Regex or BM25 tool to your tools area. Crucially, do not set deferred loading on this tool—it must be loaded immediately,.
Mark Tools for Deferred Loading: Add the deferred loading: true keyword to non-essential tools. This tells the system not to load them upfront but to wait until Claude searches for them.
Keep Essential Tools Loaded: Identify 3-5 of your most frequently used tools and set deferred loading: false. This ensures immediate access for common operations while searching for everything else.

4. Developer Optimization: Writing Better Descriptions

If you are an MCP server developer, your biggest lever for making tools findable is optimizing tool descriptions. Every word costs tokens, so you must be ruthless.

Rules for Tool Descriptions:

Lead with Function: The first sentence should clearly state what the tool does (e.g., "Get current weather forecast").
Keep it Concise: Limit descriptions to 1-2 sentences.
Add Keywords: Use searchable synonyms like "fetch," "get," or "retrieve" to help Claude find the tool.
Separate Schema from Description: Put constraints in the input schema (for validation), not in the description (which is for discovery).
Avoid "Too Short": A description like "get weather" is worse for search relevance than "get current weather forecast or historical data".

5. Utilizing Server Instructions

You can use a "server instructions" field (similar to a system prompt) to guide Claude on tool workflows. For example, you can instruct the model that for PR operations, it must "first check PR status" before attempting to "view PR or approve it". This context helps Claude understand when to search for which tool.

6. Monitoring and Testing

To ensure the implementation is working:

Test with 30 or more tools to see the most significant improvements.
Monitor your context usage before and after implementation to track token savings.
Avoid the common pitfall of defer-loading everything. If you defer load nothing, you get no benefit; if you defer load the search tool itself, the system breaks.