Anthropic released Claude Opus 4.5 today. Here’s the TLDR:
- Token efficiency: 76% fewer output tokens at medium effort, 48% at max effort
- Automatic context summarization: Long conversations no longer hit walls
- Tool Search: Discover tools on-demand instead of loading definitions upfront
- Effort parameter: API control to balance speed/cost vs capability
- Pricing: $5/$25 per million tokens (down from $15/$75)
- Claude Code: Improved Plan Mode, multi-agent parallel sessions in desktop app
The feature that matters most for my workflow: Tool Search.
The Problem I Wrote About
Last month I published Isolating MCP Context in Claude Code. The core issue:
Chrome DevTools MCP? 20,000 tokens just to load the tool definitions. Add a few more MCP servers and you’re starting every conversation with 50% of your context window already gone.
My workaround: slash commands that spawn isolated Claude instances with separate MCP configs. /chrome for browser debugging, /db for database queries. Each runs in its own context, reports back results, keeps the main conversation clean.
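The workaround pattern boils down to "spawn a fresh Claude with a narrower config." A minimal sketch in Python, assuming a hypothetical config path and prompt, and using the CLI flags this post refers to (-p, --mcp-config, --allowed-tools):

```python
# Sketch of the isolation workaround: each slash command spawns a separate
# Claude instance with its own MCP config, so tool definitions and debugging
# output never touch the main conversation's context.
# The config path, prompt, and tool pattern are placeholder examples.

def build_isolated_command(prompt: str, mcp_config: str, allowed_tools: list[str]) -> list[str]:
    """Build the argv for an isolated, non-interactive Claude Code run."""
    return [
        "claude",
        "-p", prompt,                                 # print mode: run once, report, exit
        "--mcp-config", mcp_config,                   # load ONLY this server's tools
        "--allowed-tools", ",".join(allowed_tools),   # permission boundary
    ]

cmd = build_isolated_command(
    prompt="Open localhost:3000 and report any console errors",
    mcp_config=".claude/mcp-chrome.json",
    allowed_tools=["mcp__chrome-devtools__*"],
)
# subprocess.run(cmd, capture_output=True) would return only the final report
```

The point of the pattern: the spawned instance pays the 20K-token tool-definition cost, not your main session.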
It worked. But it was a workaround for a problem that shouldn’t exist.
The Native Solution
Opus 4.5 ships with Tool Search. Instead of loading all tool definitions upfront, you mark tools with defer_loading: true:
```json
{
  "type": "mcp_toolset",
  "mcp_server_name": "chrome-devtools",
  "default_config": {"defer_loading": true}
}
```
Claude sees only the Tool Search Tool initially. When it needs specific capabilities, it searches for relevant tools. Matching tools get expanded into full definitions on-demand.
Deferred tools aren’t loaded into context initially. Claude uses regex or BM25-based search to discover tools when needed. You can also implement custom search using embeddings.
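As a toy illustration of what regex-based discovery does (the registry and search_tools helper below are invented for this sketch; the real search runs server-side inside the API, not in your code):

```python
import re

# Hypothetical registry of deferred tools: name -> short description.
DEFERRED_TOOLS = {
    "chrome_navigate": "Navigate the browser to a URL",
    "chrome_console_logs": "Read console log messages",
    "chrome_network_trace": "Capture network requests",
    "db_query": "Run a read-only SQL query",
}

def search_tools(pattern: str) -> list[str]:
    """Return names of deferred tools whose name or description matches."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        name for name, desc in DEFERRED_TOOLS.items()
        if rx.search(name) or rx.search(desc)
    ]

# Only the matching tools get expanded into full definitions in context:
matches = search_tools("console|network")  # the other tools stay deferred
```

Everything that doesn't match stays out of context entirely, which is where the token savings come from.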
The numbers are dramatic:
- Traditional: ~72K tokens upfront for 50+ MCP tools
- With Tool Search: ~500 tokens initially, plus 3-5 tools (~3K) on-demand
- Result: 85% token reduction while maintaining full tool access
Accuracy improved too. Opus 4.5 on MCP evaluations: 79.5% → 88.1% with Tool Search enabled. Fewer tools in context means less confusion between similar tools like notification-send-user vs notification-send-channel.
> The tool search tool lets agents work with hundreds of tools by dynamically discovering and loading only what they need instead of loading all definitions upfront. — Anthropic, Advanced Tool Use
Does This Make Isolation Patterns Obsolete?
Mostly yes. For pure token savings, defer_loading wins. No slash commands, no subprocess spawning, no context juggling. Just a flag.
But the isolation pattern survives for specific cases:
- Complete context separation: Tool Search reduces token overhead. It doesn’t prevent Chrome DevTools output (console logs, network traces, DOM snapshots) from filling your conversation. If you want debugging artifacts isolated from your main coding context, spawn a separate instance.
- Specialized system prompts: My /chrome command includes a focused system prompt for browser debugging. Tool Search doesn't help with cognitive specialization.
- Permission boundaries: --allowed-tools constrains what a spawned instance can access. Tool Search is about discovery, not restriction.
- Self-contained reports: Isolated instances return concise results. No conversation history pollution.
The practical split: Use defer_loading for token efficiency. Use isolation patterns when you need cognitive or permission boundaries.
What Else Is New
The Advanced Tool Use post covers two other features worth knowing:
- Programmatic Tool Calling (PTC): Claude writes Python to orchestrate tools in a sandbox. Only final output returns to context. 37% token reduction on complex research tasks, eliminates 19+ inference passes when running 20+ tool calls.
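To see why PTC saves tokens, imagine the kind of script Claude writes in the sandbox. This sketch uses invented stand-in tool functions; in real PTC the calls would be routed to actual tools, and only the final printed summary re-enters the model's context:

```python
# Illustrative sketch of a PTC-style orchestration script.
# get_team_members and get_expenses are invented stand-ins for real tools.
# Every intermediate result stays inside the sandbox.

def get_team_members(dept: str) -> list[str]:
    return {"eng": ["ana", "bo", "chen"]}.get(dept, [])

def get_expenses(user: str) -> list[dict]:
    data = {
        "ana": [{"amount": 120.0}, {"amount": 80.0}],
        "bo": [{"amount": 450.0}],
        "chen": [{"amount": 60.0}, {"amount": 40.0}],
    }
    return data.get(user, [])

# One call plus N calls run here in code, not as N+1 inference passes.
totals = {
    member: sum(e["amount"] for e in get_expenses(member))
    for member in get_team_members("eng")
}
over_budget = {m: t for m, t in totals.items() if t > 200}

print(over_budget)  # only this summary returns to the model
```

Run as a loop of individual tool calls, each raw expense report would land in context; run as code, only the filtered dict does.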
- Tool Use Examples: the input_examples parameter provides sample tool calls. Improved accuracy from 72% → 90% on complex parameter handling.
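A tool definition with examples might look roughly like this (an illustrative sketch; the tool and its fields are invented, only the input_examples parameter itself comes from the feature):

```json
{
  "name": "create_ticket",
  "description": "Create a support ticket",
  "input_schema": {
    "type": "object",
    "properties": {
      "title": {"type": "string"},
      "priority": {"type": "string", "enum": ["low", "high"]}
    },
    "required": ["title"]
  },
  "input_examples": [
    {"title": "Login page returns 500", "priority": "high"},
    {"title": "Typo on pricing page"}
  ]
}
```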
Both require the same beta header as Tool Search.
The Bigger Picture
Tool Search is the headline feature. Combined with Opus 4.5’s other improvements:
- 76% fewer output tokens means longer runs before hitting limits
- Automatic context summarization handles conversation overflow
- Effort parameter lets you dial quality vs cost per-task
This is infrastructure for the SDLC collapse I wrote about. When agents can sustain multi-hour reasoning across planning, building, testing, and documentation, they need efficient context management. Tool Search + token efficiency + auto-summarization = longer autonomous runs with less human intervention.
Simon Willison tested Opus 4.5 and couldn’t identify meaningful capability differences from Sonnet 4.5 in practice. The improvements might be more about efficiency than raw capability. That’s still valuable, but temper expectations.
What I’m Changing
- Removing MCP isolation entirely: defer_loading handles token savings natively. No more /chrome or /db slash commands for context isolation.
- Still using libraries over MCPs: As I wrote in my MCP Code Wrapper post, native libraries (pg, Playwright) are faster and lower friction than MCP wrappers. That hasn't changed.
- Keeping multi-model slash commands: /codex, /gemini, and /nano-banana remain. These aren't about context isolation. They're about cognitive specialization: different models for different thinking modes.
Try It
Tool Search is currently API-only with a beta header:
```python
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    betas=["advanced-tool-use-2025-11-20"],
    model="claude-opus-4-5-20251101",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Debug the failing checkout page"}],
    tools=[
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
        # Your MCP tools with defer_loading: true
    ],
)
```
As of writing, Tool Search and defer_loading aren’t available in Claude Code. The v2.0.51 release added Opus 4.5 and improved Plan Mode, but MCP tool discovery is still API-only. Expect this to change soon.
The decision rule:
- Tool Search: you have many MCP tools and want token efficiency. Most cases.
- Isolation patterns: you need context separation, specialized prompts, or permission boundaries. Specific cases.
The workaround I built was necessary at the time. Now there’s a better way for most of it. That’s progress.


