I tried something in my shell config that isn’t documented anywhere.
export ENABLE_EXPERIMENTAL_MCP_CLI=true
One environment variable. Zero documentation. 32,000 tokens back.
The Problem
MCP tools are expensive. Not in API costs - in context. Every tool schema gets loaded into Claude’s context window at startup, whether you use it or not.
Two Chrome-related MCPs (claude-in-chrome and chrome-devtools) cost me 31.7k tokens before I typed a single message. That’s 16% of my context window gone. Add a few database MCPs, a git server, maybe Slack - you’re easily burning 50-100k tokens on tool definitions alone.
I’ve written about this before. Built a wrapper that generates TypeScript Skills with progressive discovery. Anthropic documented the pattern for their API. The community has been requesting lazy loading for months.
Turns out Claude Code already has a solution. It’s just not public yet.
The Flag
export ENABLE_EXPERIMENTAL_MCP_CLI=true
Add this to your shell config. Restart Claude Code.
This is undocumented. I found it by accident. It could change, break, or disappear in any release. Use at your own risk.
Before and After
Without the flag (2 Chrome MCPs):
Context Usage
claude-opus-4-5-20251101 · 99k/200k tokens (49%)
⛁ System prompt: 3.9k tokens (1.9%)
⛁ System tools: 16.6k tokens (8.3%)
⛁ MCP tools: 31.7k tokens (15.9%)
⛁ Custom agents: 232 tokens (0.1%)
⛁ Memory files: 1.4k tokens (0.7%)
⛁ Messages: 8 tokens (0.0%)
⛶ Free space: 101k (50.6%)
With the flag:
Context Usage
claude-opus-4-5-20251101 · 67k/200k tokens (34%)
⛁ System prompt: 3.9k tokens (1.9%)
⛁ System tools: 16.6k tokens (8.3%)
⛁ Custom agents: 232 tokens (0.1%)
⛁ Memory files: 1.4k tokens (0.7%)
⛁ Messages: 8 tokens (0.0%)
⛶ Free space: 133k (66.4%)
MCP tools line: gone. 31.7k tokens saved. System tools unchanged - the CLI wrapper adds negligible overhead.
For context, that’s enough tokens for roughly 25,000 words of conversation, or reading several large source files. With heavy MCP usage (5+ servers), you could recover 50-70k tokens.
How It Works
Instead of loading tool schemas into context, Claude accesses MCP tools through a Bash-based CLI wrapper:
# Discover available tools
mcp-cli tools [server]
mcp-cli grep <pattern>
# Get tool schema (required before calling)
mcp-cli info <server>/<tool>
# Invoke the tool
mcp-cli call <server>/<tool> '<json>'
The pattern is progressive discovery:
- Claude doesn’t know tool schemas at startup
- When it needs a tool, it calls mcp-cli info to fetch the schema
- Then it calls mcp-cli call with the correct parameters
- Results come back through stdout
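Put together, one round trip for the screenshot example in the next section looks something like this. A sketch only: mcp-cli exists inside a flag-enabled Claude Code session, not in your normal terminal, so the tool invocations are shown as comments.

```shell
# Discover: find candidate tools by name (runs inside Claude Code only)
#   mcp-cli grep screenshot
# Inspect: fetch the tool's schema before calling it
#   mcp-cli info claude-in-chrome/computer
# Invoke: pass the arguments as a single JSON string
payload='{"action": "screenshot", "tabId": 684655928}'
#   mcp-cli call claude-in-chrome/computer "$payload"
echo "$payload"
```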
This is essentially Anthropic’s Tool Search Tool pattern (defer_loading: true) implemented at the CLI level instead of the API level.
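For comparison, the API-level version of the pattern marks individual tools as deferred in the Messages API request. A rough sketch of the relevant fragment - the tool-search tool's exact type string is version-dated in Anthropic's beta docs, so tool_search_tool below is a placeholder, and the schema is truncated:

```json
{
  "tools": [
    { "type": "tool_search_tool", "name": "tool_search_tool" },
    {
      "name": "screenshot",
      "description": "Capture a screenshot of the current tab",
      "input_schema": { "type": "object", "properties": {} },
      "defer_loading": true
    }
  ]
}
```

Only the tool's name and description are searchable upfront; the full schema loads when the model actually asks for it - the same economics as the CLI wrapper.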
Real Usage
Here’s what an MCP call looks like with the flag enabled:
Bash(mcp-cli call claude-in-chrome/computer '{"action": "screenshot", "tabId": 684655928}')
⎿ MCP Result: [1 image], [2 text blocks]
Successfully captured screenshot (1440x761, jpeg) - ID: ss_1191684qv
Claude treats MCP tools as Bash commands. The JSON parameters are passed as arguments. Results parse back into the conversation.
Trade-offs
Extra round-trip per tool. Every MCP call requires an info call first to fetch the schema. For heavy tool usage in a single session, this adds latency. In practice, it’s barely noticeable - the context savings dwarf the extra calls.
Bash escaping edge cases. Complex JSON with nested quotes can get messy. The CLI handles most cases, but I’ve seen occasional parsing issues with deeply nested structures.
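One way to sidestep most of the quoting pain is to build the JSON in a quoted heredoc instead of inline. With a single-quoted delimiter the shell does no expansion, so nested and escaped quotes pass through untouched. A sketch - the server and tool names here are made up:

```shell
# A single-quoted heredoc delimiter ('EOF') disables all shell expansion,
# so nested and escaped quotes reach the CLI byte-for-byte:
payload=$(cat <<'EOF'
{"filter": {"title": "it's a \"nested\" title"}}
EOF
)
# mcp-cli call my-server/search "$payload"   # hypothetical server/tool
echo "$payload"
```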
No native tool UI. In standard mode, Claude Code shows MCP tools in the tool picker. With the flag, they’re invisible until you invoke them. You need to know what tools exist.
Experimental stability. This could change or break. No guarantees.
When to Use It
Enable if:
- You run multiple MCP servers (3+)
- Context pressure is a problem
- You’re comfortable with undocumented features
- You know which MCP tools you need
Skip if:
- You rely on tool discoverability in the UI
- You’re running mission-critical workflows
- You want stability over savings
Comparison to Other Approaches
| Approach | Token Savings | Setup | Stability |
|---|---|---|---|
| Native MCP (default) | 0% | None | Stable |
| mcp-code-wrapper | 90-97% | Generate Skills | Experimental |
| ENABLE_EXPERIMENTAL_MCP_CLI | 100% | One env var | Experimental |
| Anthropic Tool Search API | ~85% | API changes | Beta |
The flag is the simplest option if you’re willing to accept its experimental status. My mcp-code-wrapper generates Skills but requires a preprocessing step, and savings vary. The API approach requires code changes.
The Bigger Picture
MCP’s context cost has been the elephant in the room since the protocol launched. Loading 50+ tool schemas upfront never made sense - context windows are neither unlimited nor free.
The community has been asking for lazy loading since September. Multiple GitHub issues, proof-of-concept implementations, workarounds. This flag suggests Anthropic is working on it internally.
The best hidden features are the ones that solve problems everyone’s complaining about.
Whether this becomes official, gets renamed, or disappears entirely - the pattern is clear. Progressive MCP discovery is coming. This flag is just an early glimpse.
Try It
# Add to ~/.zshrc or ~/.bashrc
export ENABLE_EXPERIMENTAL_MCP_CLI=true
# Restart Claude Code
claude
# Check context usage
/context
If your MCP tools line disappears, it’s working. If something breaks, remove the flag and restart.
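If the MCP tools line is still there, first confirm the variable actually survived shell startup - my guess is the most common failure mode is the export living in an rc file that the shell launching claude never reads:

```shell
# Verify the flag is exported in the shell that will launch claude:
export ENABLE_EXPERIMENTAL_MCP_CLI=true   # normally lives in ~/.zshrc or ~/.bashrc
[ "$ENABLE_EXPERIMENTAL_MCP_CLI" = "true" ] && echo "flag is set"
```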
No documentation. No support. No guarantees. But 32,000 tokens back is hard to ignore.


