Opus 4.6 got the blog post. Agent teams got their own. Both deserved it. But versions 2.1.32 through 2.1.36 shipped three other features in the same week that say more about where Claude Code is heading than the headline model does.

A persistent memory system. A new pricing dimension. And a steady stream of multi-agent fixes. One thread connects them.

Auto Memory: Claude Writes Its Own CLAUDE.md

Coding agents have amnesia. Every session starts from zero. I wrote about Beads solving this, then covered Anthropic productizing the pattern into Tasks. Auto memory is the next step: each project gets a persistent directory at ~/.claude/projects/<project>/memory/ where Claude writes a MEMORY.md file (capped at 200 lines) that loads into the system prompt every session. Topic files like debugging.md or patterns.md handle overflow.
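
On disk, it’s roughly this layout. MEMORY.md and the topic-file names come from the rollout notes; the tree beyond that is my reconstruction, so treat it as a sketch:

    ~/.claude/projects/<project>/memory/
    ├── MEMORY.md       # capped at 200 lines, loaded into the system prompt each session
    ├── debugging.md    # overflow topic file
    └── patterns.md     # overflow topic file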

The design choice matters: plain markdown. No embeddings. No vector database. No retrieval pipeline. Claude writes notes to itself and reads them next time.
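
To make that concrete, here’s the kind of thing a MEMORY.md might accumulate. This is invented content, not pulled from a real session:

    ## Build & test
    - Unit tests are fast; the full suite spins up containers and takes minutes
    - Lint runs as a pre-commit hook, so fix warnings before committing

    ## Gotchas
    - The auth tests are flaky on a fresh clone; re-run once before digging in

Nothing structured, nothing indexed. Just notes.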

But Beads wasn’t the only attempt at solving this. The community has been building “brains” for months. Cline’s Memory Bank prescribes structured files for product context and decisions. Roo Code forked the idea as an MCP server. BMAD-METHOD encodes entire agent personas as versioned markdown with orchestrator-managed memory. SpecMem adds vector search across Kiro, Cursor, Claude Code, and others. And plenty of developers just hand-curate elaborate CLAUDE.md files as project brains.

The pattern across all of them: structure, infrastructure, or manual effort. Anthropic’s answer is none of that. Just let the model write freeform markdown and trust its judgment. It’s a bet that model intelligence makes memory infrastructure unnecessary.

Enable it now

Auto memory is in gradual rollout. Force-enable it by setting CLAUDE_CODE_DISABLE_AUTO_MEMORY=0 in your shell config.
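
Concretely, that’s one line in your shell config (~/.zshrc, ~/.bashrc, or equivalent):

    export CLAUDE_CODE_DISABLE_AUTO_MEMORY=0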

The tradeoffs: auto memory lives outside the repo, so it’s per-user and not team-shared. No built-in sync between machines either (open feature request). If you work across devices, point a dotfile sync tool like Tether at ~/.claude/projects/ or your memories stay siloed. The 200-line cap forces prioritization, but whether Claude writes useful context or noise is unproven. And unlike the community tools, there’s no cross-agent portability: Claude Code only.
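
If you don’t want another tool in the loop, a plain rsync push is a workable stopgap. The hostname and paths here are placeholders, and it only syncs one way:

    rsync -a ~/.claude/projects/ other-machine:~/.claude/projects/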

Fast Mode: Buying Latency

Claude Code now offers a “fast” mode: the same Opus 4.6 model, optimized for lower latency, at $30 input / $150 output per million tokens. For context, standard Opus 4.6 pricing is $5/$25.

The signal isn’t the price. It’s the pricing dimension. Anthropic is now selling speed as an axis independent of intelligence. You’re not choosing a dumber model for faster responses. You’re choosing the same model and paying for the infrastructure to run it hotter.

Cost adds up fast

Fast mode bypasses your subscription quota entirely. It’s extra-usage only, billed per token. A heavy coding session could run into the hundreds of dollars without you noticing. There’s a 50% launch discount through February 16, but after that the premium is steep.
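
Rough math: a long session that burns through 5 million input tokens and 1 million output tokens (illustrative numbers, not a benchmark) costs

    fast mode:  5 × $30 + 1 × $150 = $300
    standard:   5 × $5  + 1 × $25  = $50

at the full rate. Same tokens, same model, six times the bill.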

Combined with effort levels (which trade quality for cost at the model layer), Claude Code now has two independent speed knobs. Effort level adjusts how hard the model thinks. Fast mode adjusts how quickly the infrastructure returns results. One is a quality tradeoff. The other is a cost tradeoff. Both are user-controlled.

This is a quiet precedent for how AI tools will be priced going forward: not just by model tier, but by latency guarantee.

Agent Teams Keep Shipping

The agent teams launch was rough around the edges. Two days of patch releases (2.1.33 through 2.1.36) show Anthropic moving fast on stabilization:

  • Messaging fixes: tmux-based inter-agent communication got reliability patches. Crash fixes for teammates dropping mid-task.
  • New hooks: TeammateIdle and TaskCompleted events. If you’ve built hooks for Claude Code, you can now trigger actions when agents finish work or go idle. This is the event model that multi-agent orchestration needs.
  • Spawn restrictions: Agent frontmatter now supports Task(agent_type) restrictions, letting you control what types of sub-agents a given agent can spawn. Guardrails for the recursion problem.
  • Agent memory: A memory field in agent frontmatter with user, project, and local scopes. Individual agents can now have persistent memory. Not just the lead, the whole team. (Frontmatter sketch below.)
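
A sketch of what that frontmatter could look like, combining both fields. The Task(agent_type) restriction and the user/project/local scopes are from the release notes; the surrounding field names and exact syntax are my guesses, so check the docs before copying:

    ---
    name: reviewer
    description: Reviews diffs produced by other teammates
    tools: Read, Grep, Task(code-searcher)   # can only spawn code-searcher sub-agents (syntax assumed)
    memory: project                          # scope assumed to be one of: user, project, local
    ---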

The pattern

Every launch post covers what shipped. The patch notes tell you what’s actually getting used.

I’ve been running agent teams heavily since launch. It’s genuinely good. The velocity here is the story: agent teams went from feature-flagged experiment to actively-maintained product feature in under a week.

The Thread

These three features point in the same direction:

  • Auto memory gives Claude persistence across sessions
  • Fast mode gives users cost differentiation within the same model
  • Agent refinements stabilize multi-agent as a first-class workflow

Together they sketch a tool that remembers what it learned yesterday, lets you choose how fast and expensive each session is, and coordinates multiple agents without falling over.

The per-session coding assistant is a transitional form. What ships next assumes Claude is still there tomorrow.

Claude Code is becoming a long-running collaborator. The headline model got the attention. These features are building the scaffolding for it to stick around.