Three Ways to Build Deep Research with Claude
From 20 lines of shell to production apps. Anthropic renamed Claude Code SDK to Agent SDK because deep research is now a first-class use case.
Read more →
Anthropic credits Steve Yegge's Beads as inspiration for Claude Code's new task system. The pattern-to-product cycle continues.
Read more →
Claude Code can now watch your PRs in the cloud, fix CI failures, and address reviewer comments while you're away. It's the logical next step after auto mode - and it raises the same trust questions, harder.
Read more →
Most agent discourse is about coding agents. The agents quietly running in production right now are simpler, dumber, and more useful: stale-ticket sweepers, board monitors, three-line incident explainers, leadership briefs from a fixed template. Different species. Different shape. Already shipping.
Read more →
OpenAI publishes weekly active developers. Anthropic publishes annual recurring revenue. Each company brags about the metric it can defend. Codex went 600K to 4M in four months. The 'Claude Code is better' discourse is a quality argument because quality is what's left when scale goes the other way.
Read more →
Opus 4.5 scores 80.9% on SWE-bench Verified. The same model scores 45.89% on the contamination-free Pro split. OpenAI has quietly stopped reporting Verified at all. Vendor benchmark cards are marketing.
Read more →
Lightrun: 43% of AI-generated code changes need debugging in production after passing QA. CodeRabbit: 1.7x bugs, 2.74x security, 8x I/O. METR: 19% slower while feeling 20% faster. The numerator is what gets reported. The denominator is what nobody puts in the deck.
Read more →
GitHub paused Copilot Pro signups, killed Opus on the Pro plan, and leaked a June 1 move to token-based billing. Three moves, one vendor, three different ways not to say 'price hike.'
Read more →
Anthropic markets MCP as the universal AI tooling standard, but a 200,000-server RCE class is 'expected behavior.' You can't be both.
Read more →
Two weeks after Anthropic said Mythos was too dangerous to release, OpenAI shipped a model with comparable cyber capabilities to anyone with a $20 ChatGPT subscription. The gating posture didn't survive a single news cycle.
Read more →
Anthropic's April 23 postmortem confirms three Claude Code regressions, including one where Opus 4.7 caught a bug Opus 4.6 shipped past human and automated review. What happens when the reviewer is a version of the product being reviewed?
Read more →
Uncle Bob, TDD's best-known evangelist, posted on X that TDD is 'very inefficient for AIs' and that the agent is best thought of as 'a highly focused idiot savant.' Testing didn't die. It got more important. And the review target flipped.
Read more →
Traditional coders touched a file and tidied it. The Boy Scout Rule. Now nobody does. Agents add, they don't subtract, and the codebase accretes faster than ever. A technique for putting cleanup back in as an explicit gate, not a virtue you hope for.
Read more →
A CTO once told me not to take people's Legos away. I ignored him, solved the team's problems myself, and got exactly what I optimised for: a sound plan and a team that couldn't stand me. In 2026, with agents doing the bricks, this is the lesson that matters.
Read more →
Vercel got breached through Context.ai, an AI tool an employee installed with OAuth scopes into Google Workspace. It's the latest in a pattern: Trivy into litellm, axios maintainer hijack, now this. The safest AI tool is the one you didn't install.
Read more →