I tried something in my shell config that isn’t documented anywhere.
export ENABLE_EXPERIMENTAL_MCP_CLI=true
One environment variable. Zero documentation. 32,000 tokens back.
The Problem
MCP tools are expensive. Not in API costs - in context. Every tool schema gets loaded into Claude’s context window at startup, whether you use it or not.
Two Chrome-related MCPs (claude-in-chrome and chrome-devtools) cost me 31.7k tokens before I typed a single message. That’s 16% of my context window gone. Add a few database MCPs, a git server, maybe Slack - you’re easily burning 50-100k tokens on tool definitions alone.
I’ve written about this before. Built a wrapper that generates TypeScript Skills with progressive discovery. Anthropic documented the pattern for their API. The community has been requesting lazy loading for months.
Turns out Claude Code already has a solution. It’s just not public yet.
The Flag
export ENABLE_EXPERIMENTAL_MCP_CLI=true
Add this to your shell config. Restart Claude Code.
This is undocumented. I found it by accident. It could change, break, or disappear in any release. Use at your own risk.
Before and After
Without the flag (2 Chrome MCPs):
Context Usage
claude-opus-4-5-20251101 · 99k/200k tokens (49%)
⛁ System prompt: 3.9k tokens (1.9%)
⛁ System tools: 16.6k tokens (8.3%)
⛁ MCP tools: 31.7k tokens (15.9%)
⛁ Custom agents: 232 tokens (0.1%)
⛁ Memory files: 1.4k tokens (0.7%)
⛁ Messages: 8 tokens (0.0%)
⛶ Free space: 101k (50.6%)
With the flag:
Context Usage
claude-opus-4-5-20251101 · 67k/200k tokens (34%)
⛁ System prompt: 3.9k tokens (1.9%)
⛁ System tools: 16.6k tokens (8.3%)
⛁ Custom agents: 232 tokens (0.1%)
⛁ Memory files: 1.4k tokens (0.7%)
⛁ Messages: 8 tokens (0.0%)
⛶ Free space: 133k (66.4%)
MCP tools line: gone. 31.7k tokens saved. System tools unchanged - the CLI wrapper adds negligible overhead.
For context, that’s enough tokens for roughly 25,000 words of conversation, or reading several large source files. With heavy MCP usage (5+ servers), you could recover 50-70k tokens.
How It Works
Instead of loading tool schemas into context, Claude accesses MCP tools through a Bash-based CLI wrapper:
# Discover available tools
mcp-cli tools [server]
mcp-cli grep <pattern>
# Get tool schema (required before calling)
mcp-cli info <server>/<tool>
# Invoke the tool
mcp-cli call <server>/<tool> '<json>'
The pattern is progressive discovery:
- Claude doesn’t know tool schemas at startup
- When it needs a tool, it calls mcp-cli info to fetch the schema
- Then it calls mcp-cli call with the correct parameters
- Results come back through stdout
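Put together, one round trip for the screenshot example in the next section looks something like this. A sketch only: mcp-cli exists inside a flag-enabled Claude Code session, not in your normal terminal, so the tool invocations are shown as comments.

```shell
# Discover: find candidate tools by name (runs inside Claude Code only)
#   mcp-cli grep screenshot
# Inspect: fetch the tool's schema before calling it
#   mcp-cli info claude-in-chrome/computer
# Invoke: pass the arguments as a single JSON string
payload='{"action": "screenshot", "tabId": 684655928}'
#   mcp-cli call claude-in-chrome/computer "$payload"
echo "$payload"
```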
This is essentially Anthropic’s Tool Search Tool pattern (defer_loading: true) implemented at the CLI level instead of the API level.
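For comparison, the API-level version of the pattern marks individual tools as deferred in the Messages API request. A rough sketch of the relevant fragment - the tool-search tool's exact type string is version-dated in Anthropic's beta docs, so tool_search_tool below is a placeholder, and the schema is truncated:

```json
{
  "tools": [
    { "type": "tool_search_tool", "name": "tool_search_tool" },
    {
      "name": "screenshot",
      "description": "Capture a screenshot of the current tab",
      "input_schema": { "type": "object", "properties": {} },
      "defer_loading": true
    }
  ]
}
```

Only the tool's name and description are searchable upfront; the full schema loads when the model actually asks for it - the same economics as the CLI wrapper.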
Real Usage
Here’s what an MCP call looks like with the flag enabled:
Bash(mcp-cli call claude-in-chrome/computer '{"action": "screenshot", "tabId": 684655928}')
⎿ MCP Result: [1 image], [2 text blocks]
Successfully captured screenshot (1440x761, jpeg) - ID: ss_1191684qv
Claude treats MCP tools as Bash commands. The JSON parameters are passed as arguments. Results parse back into the conversation.
Trade-offs
Extra round-trip per tool. Every MCP call requires an info call first to fetch the schema. For heavy tool usage in a single session, this adds latency. In practice, it’s barely noticeable - the context savings dwarf the extra calls.
Bash escaping edge cases. Complex JSON with nested quotes can get messy. The CLI handles most cases, but I’ve seen occasional parsing issues with deeply nested structures.
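One way to sidestep most of the quoting pain is to build the JSON in a quoted heredoc instead of inline. With a single-quoted delimiter the shell does no expansion, so nested and escaped quotes pass through untouched. A sketch - the server and tool names here are made up:

```shell
# A single-quoted heredoc delimiter ('EOF') disables all shell expansion,
# so nested and escaped quotes reach the CLI byte-for-byte:
payload=$(cat <<'EOF'
{"filter": {"title": "it's a \"nested\" title"}}
EOF
)
# mcp-cli call my-server/search "$payload"   # hypothetical server/tool
echo "$payload"
```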
No native tool UI. In standard mode, Claude Code shows MCP tools in the tool picker. With the flag, they’re invisible until you invoke them. You need to know what tools exist.
Experimental stability. This could change or break. No guarantees.
When to Use It
Enable if:
- You run multiple MCP servers (3+)
- Context pressure is a problem
- You’re comfortable with undocumented features
- You know which MCP tools you need
Skip if:
- You rely on tool discoverability in the UI
- You’re running mission-critical workflows
- You want stability over savings
Comparison to Other Approaches
| Approach | Token Savings | Setup | Stability |
|---|---|---|---|
| Native MCP (default) | 0% | None | Stable |
| mcp-code-wrapper | 90-97% | Generate Skills | Experimental |
| ENABLE_EXPERIMENTAL_MCP_CLI | 100% | One env var | Experimental |
| Anthropic Tool Search API | ~85% | API changes | Beta |
The flag is the simplest option if you’re willing to accept its experimental status. My mcp-code-wrapper generates Skills but requires a preprocessing step, and savings vary. The API approach requires code changes.
The Bigger Picture
MCP’s context cost has been the elephant in the room since the protocol launched. Loading 50+ tool schemas upfront never made sense - context windows are neither unlimited nor free.
The community has been asking for lazy loading since September. Multiple GitHub issues, proof-of-concept implementations, workarounds. This flag suggests Anthropic is working on it internally.
The best hidden features are the ones that solve problems everyone’s complaining about.
Whether this becomes official, gets renamed, or disappears entirely - the pattern is clear. Progressive MCP discovery is coming. This flag is just an early glimpse.
Try It
# Add to ~/.zshrc or ~/.bashrc
export ENABLE_EXPERIMENTAL_MCP_CLI=true
# Restart Claude Code
claude
# Check context usage
/context
If your MCP tools line disappears, it’s working. If something breaks, remove the flag and restart.
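If the MCP tools line is still there, first confirm the variable actually survived shell startup - my guess is the most common failure mode is the export living in an rc file that the shell launching claude never reads:

```shell
# Verify the flag is exported in the shell that will launch claude:
export ENABLE_EXPERIMENTAL_MCP_CLI=true   # normally lives in ~/.zshrc or ~/.bashrc
[ "$ENABLE_EXPERIMENTAL_MCP_CLI" = "true" ] && echo "flag is set"
```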
No documentation. No support. No guarantees. But 32,000 tokens back is hard to ignore.


