Emergent Minds

Recent Posts

The Expensive Middle
llm

Sonnet 5 lands within a few points of Opus 4.8 on most work and looks 2.5x cheaper, but that discount inverts on real tasks: at high effort Sonnet is so token-hungry it often bills more per task than Opus. The usage squeeze, meanwhile, is self-inflicted: agentic work now fans out dozens of subagents across parallel workstreams. Opus 4.8 became the expensive middle, though its real problem was never the price. It's the position.

Your Code Was Never Pristine
ai-coding

There's a myth, loudest from senior engineers and architects, that before AI the codebase was a cathedral and now it's slop. It was never a cathedral. 'Technical debt' was coined in 1992, the world runs on 220 billion lines of COBOL, and the thing that actually mattered was never how the code looked. It was whether you could prove it works.

Trending
GPT-5.6 Is Out. Twenty Companies Can Use It.
llm

OpenAI shipped a Mythos-class frontier model on June 26, then handed the guest list to the US government. Twenty approved customers, classified criteria, no published rules - a de facto license, applied to the labs that cooperate and useless against the open weights shipping freely out of China.

Don't Send Your Recon to Beijing
security

The open model that engages with authorized security work also has a default route that ships your client's data through Chinese infrastructure. Here's how to run GLM-5.2 from the cloud for real engagements - minimal false refusals, data kept in the US, no Beijing tax.

Trending
GLM-5.2: The Receipts Came In
llm

Eleven days ago I flagged GLM-5.2's launch claims as unverified. The receipts arrived: independent benchmarks above Fable 5, a security eval beating Claude Code at a sixth of the cost, a 2-bit quant running on a Mac Studio, and a model trained without a single NVIDIA chip.

Who Does the Refusal Actually Stop?
security

Over-broad AI safety refusals block the defenders who follow the rules and cost attackers nothing - they just self-host. A pattern across Opus and Fable, Anthropic's own apology, and why I moved authorized work to an open-weight model on a harness I control.

The Editor Is Now a Host
dev-tools

Cognition killed Windsurf overnight via an over-the-air update, rebranded it Devin Desktop, made the default UI an agent command center instead of a code editor, and shipped an open Agent Client Protocol so Codex, Claude, and OpenCode can all run inside it. The bet underneath: the IDE wins by being the place agents report for work, not by having the best autocomplete. The editor was always the wrong center of gravity.


Showing 15 of 207 posts