Anthropic released Claude Opus 4.5 today. Here’s the TLDR:

  • Token efficiency: 76% fewer output tokens at medium effort, 48% at max effort
  • Automatic context summarization: Long conversations no longer hit walls
  • Tool Search: Discover tools on-demand instead of loading definitions upfront
  • Effort parameter: API control to balance speed/cost vs capability
  • Pricing: $5/$25 per million tokens (down from $15/$75)
  • Claude Code: Improved Plan Mode, multi-agent parallel sessions in desktop app

The feature that matters most for my workflow: Tool Search.

The Problem I Wrote About

Last month I published Isolating MCP Context in Claude Code. The core issue:

Chrome DevTools MCP? 20,000 tokens just to load the tool definitions. Add a few more MCP servers and you’re starting every conversation with 50% of your context window already gone.

My workaround: slash commands that spawn isolated Claude instances with separate MCP configs. /chrome for browser debugging, /db for database queries. Each runs in its own context, reports back results, keeps the main conversation clean.

It worked. But it was a workaround for a problem that shouldn’t exist.

The Native Solution

Opus 4.5 ships with Tool Search. Instead of loading all tool definitions upfront, you mark tools with defer_loading: true:

{
  "type": "mcp_toolset",
  "mcp_server_name": "chrome-devtools",
  "default_config": {"defer_loading": true}
}

Claude sees only the Tool Search Tool initially. When it needs specific capabilities, it searches for relevant tools. Matching tools get expanded into full definitions on-demand.

How it works

Deferred tools aren’t loaded into context initially. Claude uses regex or BM25-based search to discover tools when needed. You can also implement custom search using embeddings.

The numbers are dramatic:

  • Traditional: ~72K tokens upfront for 50+ MCP tools
  • With Tool Search: ~500 tokens initially, plus 3-5 tools (~3K) on-demand
  • Result: 85% token reduction while maintaining full tool access

Accuracy improved too. Opus 4.5 on MCP evaluations: 79.5% → 88.1% with Tool Search enabled. Fewer tools in context means less confusion between similar tools like notification-send-user vs notification-send-channel.

The tool search tool lets agents work with hundreds of tools by dynamically discovering and loading only what they need instead of loading all definitions upfront.

— Anthropic, Advanced Tool Use

Does This Make Isolation Patterns Obsolete?

Mostly yes. For pure token savings, defer_loading wins. No slash commands, no subprocess spawning, no context juggling. Just a flag.

But the isolation pattern survives for specific cases:

  • Complete context separation: Tool Search reduces token overhead. It doesn’t prevent Chrome DevTools output (console logs, network traces, DOM snapshots) from filling your conversation. If you want debugging artifacts isolated from your main coding context, spawn a separate instance.
  • Specialized system prompts: My /chrome command includes a focused system prompt for browser debugging. Tool Search doesn’t help with cognitive specialization.
  • Permission boundaries: --allowed-tools constrains what a spawned instance can access. Tool Search is about discovery, not restriction.
  • Self-contained reports: Isolated instances return concise results. No conversation history pollution.

The practical split: Use defer_loading for token efficiency. Use isolation patterns when you need cognitive or permission boundaries.

What Else Is New

The Advanced Tool Use post covers two other features worth knowing:

  • Programmatic Tool Calling (PTC): Claude writes Python to orchestrate tools in a sandbox. Only final output returns to context. 37% token reduction on complex research tasks, eliminates 19+ inference passes when running 20+ tool calls.
  • Tool Use Examples: input_examples parameter provides sample tool calls. Improved accuracy from 72% → 90% on complex parameter handling.

Both require the same beta header as Tool Search.

The Bigger Picture

Tool Search is the headline feature. Combined with Opus 4.5’s other improvements:

  • 76% fewer output tokens means longer runs before hitting limits
  • Automatic context summarization handles conversation overflow
  • Effort parameter lets you dial quality vs cost per-task

This is infrastructure for the SDLC collapse I wrote about. When agents can sustain multi-hour reasoning across planning, building, testing, and documentation, they need efficient context management. Tool Search + token efficiency + auto-summarization = longer autonomous runs with less human intervention.

The skeptical view

Simon Willison tested Opus 4.5 and couldn’t identify meaningful capability differences from Sonnet 4.5 in practice. The improvements might be more about efficiency than raw capability. That’s still valuable, but temper expectations.

What I’m Changing

  • Removing MCP isolation entirely: defer_loading handles token savings natively. No more /chrome or /db slash commands for context isolation.
  • Still using libraries over MCPs: As I wrote in my MCP Code Wrapper post, native libraries (pg, Playwright) are faster and lower friction than MCP wrappers. That hasn’t changed.
  • Keeping multi-model slash commands: /codex, /gemini, and /nano-banana remain. These aren’t about context isolation. They’re about cognitive specialization: different models for different thinking modes.

Try It

Tool Search is currently API-only with a beta header:

client.beta.messages.create(
    betas=["advanced-tool-use-2025-11-20"],
    model="claude-opus-4-5-20251101",
    tools=[
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
        # Your MCP tools with defer_loading: true
    ]
)
Not in Claude Code yet

As of writing, Tool Search and defer_loading aren’t available in Claude Code. The v2.0.51 release added Opus 4.5 and improved Plan Mode, but MCP tool discovery is still API-only. Expect this to change soon.

When to use which pattern

Tool Search: You have many MCP tools and want token efficiency. Most cases.

Isolation patterns: You need context separation, specialized prompts, or permission boundaries. Specific cases.

The workaround I built was necessary at the time. Now there’s a better way for most of it. That’s progress.