The Promise
Claude Code Skills are reusable capabilities that Claude automatically invokes when relevant. You write a SKILL.md file with a description, Claude reads it, and autonomously decides when to activate it based on your request.
The pitch is compelling:
- Simpler than MCP: Just Markdown with YAML frontmatter, not a whole protocol
- More powerful than slash commands: Context-aware auto-invocation
- Token-efficient when idle: Only ~50 tokens for name + description until activated
Skills are positioned as ambient intelligence - Claude magically knows what tools are available and reaches for them when appropriate.
Skills are model-invoked capabilities stored in `~/.claude/skills/` (user-level) or `.claude/skills/` (project-level); the model, not you, decides when they activate.
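For orientation, here's a minimal sketch of a SKILL.md (the frontmatter fields follow the documented format; the skill itself is a hypothetical example):

```markdown
# .claude/skills/arch-review/SKILL.md
---
name: arch-review
description: Review software architecture decisions and compare design patterns. Use when the user asks about architecture tradeoffs or pattern choices.
---

## Instructions

1. Identify the architectural question in the request
2. Survey relevant code with Glob/Grep before recommending anything
3. Present tradeoffs, not a single "right" answer
```

Until Claude decides a request matches that description, only the name and description (the ~50 tokens mentioned above) sit in context; the instructions load on activation.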
The Problem: No Control Over Invocation
Here’s the catch: you can’t control when skills activate. Claude decides using its own semantic understanding of your request - there’s no algorithmic matching, it’s LLM reasoning about whether your intent matches a skill description. You have no override mechanism. It’s not predictable code logic - it’s non-deterministic model behavior.
This creates friction for engineering workflows:
- Can’t force-invoke: You need architectural analysis right now, but Claude doesn’t think your question matches the skill description
- Can’t prevent invocation: You’re debugging a caching issue, but Claude triggers web search to look up “cache strategies” when you just wanted to grep the codebase
- No visibility into decisions: Claude silently chooses when to load skill instructions; you don’t see the token cost until it happens
- Unpredictable context consumption: A skill might consume 10k tokens of context for instructions you didn’t need right now
- Breaks context engineering: You’re carefully managing what’s in the context window, then Claude auto-loads skill instructions you didn’t ask for
The control principle: Context engineering means controlling what Claude processes. Skills that auto-activate break that discipline - you need to decide what’s relevant, not let the model guess.
Context isn’t just about cost. It’s about signal-to-noise ratio. Every token in the window competes for Claude’s attention. Auto-invoked skills add noise when you need signal. Control over your context is control over quality.
Real Example: Web Search
In the Claude.ai web interface, web search has a toggle - you explicitly enable it. In Claude Code, web search is available as a tool that Claude invokes based on context judgment.
The problem surfaces when:
- You’re asking about a framework concept that exists in your codebase
- Claude decides “this sounds like a web search query”
- Burns tokens searching the web instead of grepping your code
- You wanted local answers, you got web results
What you need: “Search the codebase for authentication patterns”
What Claude might do: trigger web search for “authentication best practices 2025”
You didn’t want web search. Claude thought you did. No way to prevent it.
The Ultrathink Illusion
“Ultrathink” sounds like a skill - some kind of deep reasoning mode Claude activates. It’s not. Unlike skills, which use LLM semantic reasoning, ultrathink is just hardcoded string matching that sets token budgets:
- “think”: 4,000 tokens
- “megathink”: 10,000 tokens (also responds to “think hard”, “think deeply”, “think more”)
- “ultrathink”: 31,999 tokens (oddly specific, but that’s the hardcoded value)
When Claude Code sees these strings, it sets `max_thinking_length` to the corresponding value. This only works in the CLI, not in the web interface or API. It feels like magic, but it’s just a config flag triggered by string matching.
Breaking the illusion: This isn’t Claude choosing to think harder based on problem complexity. It’s you typing a keyword to change a parameter. The “intelligence” is marketing; the reality is `if (prompt.includes("ultrathink")) { max_thinking = 31999; }`.
The irony: Skills use unpredictable LLM reasoning to decide invocation. Ultrathink uses predictable string matching. Neither gives you explicit control, but for opposite reasons - one’s too fuzzy, the other’s just hidden.
Ultrathink only works in Claude Code’s terminal interface. In claude.ai or the API, it’s just regular text. The “skill” is actually hardcoded keyword detection that sets a thinking budget parameter.
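Here’s what that detection amounts to - not Claude Code’s actual source, just the reported behavior reconstructed as a TypeScript sketch with made-up names:

```typescript
// Reported keyword-to-budget mapping, ordered most-specific-first so
// "ultrathink" isn't caught by the bare "think" pattern.
const THINKING_BUDGETS: Array<[RegExp, number]> = [
  [/\bultrathink\b/i, 31999],
  [/\bmegathink\b|\bthink (hard|deeply|more)\b/i, 10000],
  [/\bthink\b/i, 4000],
];

// Returns a thinking budget if the prompt contains a trigger keyword,
// otherwise undefined (default behavior).
function maxThinkingLength(prompt: string): number | undefined {
  for (const [pattern, budget] of THINKING_BUDGETS) {
    if (pattern.test(prompt)) return budget; // first (most specific) match wins
  }
  return undefined;
}

// maxThinkingLength("ultrathink: design the cache layer") -> 31999
// maxThinkingLength("think hard about this bug")          -> 10000
// maxThinkingLength("think about this bug")               -> 4000
```

There is no model judgment anywhere in that path - just string matching on your prompt.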
What Skills Don’t Solve
Skills are positioned as ambient intelligence, but once you see past the automation, the gaps become obvious. They can’t deliver on several critical needs for engineering workflows:
- Explicit invocation: No `claude --skill=analysis` flag to force a skill when you need it
- Prevention controls: No `--disable-skill=web-search` flag to block auto-invocation
- Context visibility: No token budget shown before skill instructions load into your context
- Invocation logging: No way to see why Claude chose to activate a skill
- Debugging: Can’t debug LLM reasoning - you can’t deterministically reproduce why Claude matched (or didn’t match) a skill
- Conditional logic: Can’t say “only use this skill in X directory” or “only after Y tool fails”
The fundamental issue: skills use LLM reasoning for invocation, which is inherently non-deterministic. For exploratory conversations that’s fine. For engineering workflows where context engineering and predictable behavior matter, it’s a problem.
The Alternative: Slash Commands
Slash commands give you what skills don’t: explicit control over invocation.
Here’s how they differ:
Skills (Auto-Invoked)
- Claude decides when to use based on semantic reasoning
- No explicit invocation syntax
- Context loaded when Claude thinks it’s relevant
- Non-deterministic (LLM behavior, not code logic)
- Good for: Ambient capabilities, exploratory work where unpredictability is acceptable
Slash Commands (User-Invoked)
- You type `/command` to trigger it
- Explicit invocation syntax
- Context loaded only when you call it
- Deterministic (runs when you say, not when Claude guesses)
- Good for: Engineering workflows, context engineering, predictable behavior
Create slash commands that wrap skill-like logic. You get reusable capabilities with explicit invocation control. Best of both worlds.
Real Example: /codex for Architectural Analysis
Here’s a working pattern that gives you skill-like reusability with command-like control:
```markdown
# ~/.claude/commands/codex.md
---
allowed-tools:
  - Bash(codex:*)
  - Read(~/projects/**)
  - Glob
  - Grep
description: Launch Codex for software architecture analysis and research
---

Invoke Codex (OpenAI) to analyze software architecture, research design patterns,
or provide senior-level technical insights.

## Steps:
1. Gather context (what's the user working on?)
2. Build prompt with project context
3. Execute: codex exec --search -C [project] -s read-only --full-auto "[prompt]"
4. Return filtered output
```
When you type `/codex should we use BLoC or Hooks?`, you get:
- Explicit invocation: You decided to pay for Codex analysis
- Clear boundaries: Codex runs, returns results, exits
- Isolated context: Codex analysis doesn’t pollute main conversation
- Predictable cost: You know you’re invoking an external tool
This pattern works for any specialized capability:
- `/chrome` - Chrome DevTools debugging (isolated MCP context)
- `/db` - Database queries and schema inspection
- `/perf` - Performance analysis and profiling
- `/codex` - Senior-level architectural analysis
Each one is skill-like (reusable, documented) but command-like (explicit, controlled, predictable).
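For example, a `/db` command could follow the same shape as the codex one above - a hypothetical sketch, where the psql invocation and tool permissions are assumptions to adapt for your stack:

```markdown
# ~/.claude/commands/db.md
---
allowed-tools:
  - Bash(psql:*)
  - Read(./migrations/**)
description: Inspect the database schema and run read-only queries
---

Answer schema and data questions against the project database.

## Steps:
1. Read ./migrations/** to understand the current schema
2. For live data questions, run read-only queries: psql "$DATABASE_URL" -c "[query]"
3. Summarize results; never run INSERT/UPDATE/DELETE or DDL
```

Each command stays inert until you type it - no metadata cost, no surprise activation.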
Having both a skill and a slash command doesn’t solve the control problem: if a skill exists with execution logic, Claude can still auto-invoke it, and the slash command doesn’t prevent that. You have to choose: skills (auto-invoke) or slash commands (explicit control).
What I Want
Skills could work for engineering if they exposed the controls instead of hiding behind automation. Here’s what that looks like:
- Explicit invocation: `claude --skill=analysis "question"` to force a skill when needed
- Disable toggles: `--disable-skill=web-search` to prevent auto-invocation
- Invocation logging: Show which skills activated and why
- Context visibility: Show token cost before loading skill instructions
- Conditional logic: Activate skills based on directory, previous tool results, or user intent patterns
Until these controls exist, slash commands are the better choice for predictable engineering workflows.
Where Skills Might Make Sense
Skills do solve a real problem: progressive disclosure for teams with many specialized patterns.
In a very large codebase (think ~1M LOC monorepo), you might have dozens of domain-specific patterns that don’t fit cleanly in CLAUDE.md. Skills let you:
- Load ~50 tokens of metadata initially (“Backend service design pattern”)
- Load the full instructions (~500+ tokens) only when Claude judges the skill relevant
- Avoid bloating your system prompt with every possible pattern upfront
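Concretely, that might look like a skills directory where each SKILL.md contributes only its description line until activated (hypothetical skill names):

```text
.claude/skills/
├── backend-service/SKILL.md   # idle cost: "Backend service design pattern"
├── event-sourcing/SKILL.md    # idle cost: "Event sourcing conventions for orders"
└── mobile-widgets/SKILL.md    # idle cost: "Widget composition rules for the app"
```

Three skills idle at roughly 150 tokens; the full instruction bodies stay out of the window until a match.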
This is a legitimate use case. The question is whether the tradeoffs are worth it:
- 50 tokens per skill adds up: 10 skills = 500 tokens of metadata before anything loads. For comparison, “follow the established pattern in the codebase” costs ~10 tokens and often works just as well.
- Non-deterministic matching is a risk for critical workflows: If you’re building production automation that needs reliability, LLM semantic matching means you can’t guarantee Claude will load the right skill at the right time.
- The use case is specific: Most valuable for large teams with many specialized patterns in complex codebases. For smaller projects, the overhead may not be worth it.
If you’re in that large monorepo scenario, Skills could genuinely help. For most teams, slash commands give you similar reusability with explicit control.
The Bottom Line
Skills are great for:
- Large codebases (~1M LOC monorepos) with many domain-specific patterns
- Teams that need progressive disclosure to avoid bloating CLAUDE.md
- Exploratory work where unpredictability is acceptable
Skills are problematic for:
- Engineering workflows that need predictability
- Context engineering (you want to control what’s in the window)
- Scenarios where you know better than Claude when to use a tool
Slash commands are better when:
- You want explicit control over invocation
- You’re practicing context engineering (managing signal-to-noise)
- You’re building specialized workflows that should run on-demand
The ideal future: skills with explicit invocation controls and disable flags. Until then, treat Claude as a tool that needs explicit controls, not magic that guesses your intent.
Engineering maturity means seeing past the magic and demanding control. Choose explicit over ambient when context matters.
Control is a recurring theme in effective Claude Code usage:
Slash commands give you explicit invocation control - see Isolating MCP Context with Slash Commands for Chrome DevTools isolation and When Claude Needs a Second Opinion for architectural analysis.
Plan mode gives you approval control - Claude shows you what it will do before executing. You get to review the plan, not discover 500 lines of unwanted code after the fact. See Stop Speedrunning Claude Code for why mastering this core loop matters.
Both patterns solve the same problem: treating Claude as a tool that needs explicit controls, not magic that guesses your intent.