Four days ago I wrote about Project Glasswing and closed with the image of a velvet rope: the best cyber defense ever built, reserved for the customers who need it least. Offense would democratize through a competitor release or a weights leak; the defense would stay gated behind Anthropic’s $25/$125 per million tokens.

The rope lasted four days.

On April 11, AISLE published a replication study that took Mythos’s flagship showcase vulnerabilities and ran them through tiny, cheap, open models. Not the kind of models you’d associate with cutting-edge security research. The kind you can run on a laptop for eleven cents per million tokens.

Eight models out of eight caught the crown-jewel exploit. The one Anthropic said Mythos found autonomously in a 17-year-old codebase. The one they positioned as the proof that only a restricted frontier model could do this work. A model roughly 700x cheaper than Mythos spotted it in a single pass, computed the math correctly, and rated it critical.

The defensive layer of Mythos isn’t behind the velvet rope. It’s on Hugging Face.

What Actually Happened

AISLE is not a random replication hobbyist. They’re a security startup that’s been doing the work Glasswing was launched to do, and doing it for nearly a year before Anthropic announced anything. Their track record: 180+ externally validated vulnerabilities across 30+ projects since mid-2025. Twelve out of twelve zero-days in a single OpenSSL security release, including bugs over 25 years old. Production integration into OpenSSL and curl pull request workflows. The OpenSSL CTO is on record praising their work.

They took Anthropic’s two showcase bugs and ran them as plain API calls. No agents. No tool loops. No iterative scaffold. Just “here’s a function, what’s wrong with it?”
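For concreteness, a single-pass setup like the one described above is small enough to sketch. Everything here is illustrative: the prompt wording, the model name, and the local endpoint are my assumptions, not AISLE’s actual configuration.

```python
import json

# Hypothetical prompt for a one-shot audit of a single function.
# No agents, no tool loops: one request, one answer.
AUDIT_PROMPT = (
    "You are a security auditor. Here is a function. "
    "Identify any vulnerability, explain the exploit math, "
    "and rate severity (none/low/medium/high/critical).\n\n{code}"
)

def build_audit_request(code: str, model: str = "open-model-0.11") -> dict:
    """Build a single chat-completion payload for one suspect function."""
    return {
        "model": model,  # hypothetical cheap open model
        "messages": [{"role": "user", "content": AUDIT_PROMPT.format(code=code)}],
        "temperature": 0,  # deterministic output makes replication easier
    }

# Sending it is one POST to any OpenAI-compatible server (e.g. a local
# llama.cpp or vLLM instance) -- no scaffold beyond this:
#   requests.post("http://localhost:8000/v1/chat/completions",
#                 json=build_audit_request(suspect_function))

payload = build_audit_request("int f(char *s) { char b[8]; strcpy(b, s); }")
print(json.dumps(payload, indent=2))
```

That really is the whole method under test: a laptop, a cheap model, and a question.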

The results:

  • 8 out of 8 models caught Mythos’s flagship exploit. Including models with a fraction of the parameters, at a fraction of the price. Every model computed the vulnerability math correctly.
  • A small open model recovered the full attack chain for the 27-year-old bug Anthropic highlighted as Mythos’s subtlest find. Single call. Correct mitigation, essentially matching the actual patch.
  • On a basic reasoning test, small models outperformed most frontier models. AISLE ran a simple “is this actually vulnerable?” test using a well-known security exercise. The code looks like a textbook vulnerability but isn’t. Eleven of thirteen Claude models failed it. A tiny open model costing $0.11/Mtok got it right. So did DeepSeek R1. So did Gemini 2.5 Pro. Claude Opus 4.6 only barely passed.

Bigger Was Not Better

The reasoning test showed something close to inverse scaling. The cheapest model passed. The most expensive models failed. Rankings reshuffled completely across different tasks. There is no stable “best model for security.” The capability frontier is jagged, not a smooth curve you climb with more compute.

What This Flips About Glasswing

My Glasswing post treated Mythos as a phase change in defensive capability that Anthropic had chosen to gate. That framing now needs a significant amendment.

The bug-finding layer is already in the open. Not in six to eighteen months when a competitor ships. Not after a weights leak. Today. The Mythos pricing ($25/$125 per million tokens, roughly 5x Opus 4.6) buys no detection capability, at least none demonstrated in Anthropic’s flagship case studies, that isn’t already reachable with open models anyone can download.

The moat is the system, not the model. AISLE says this plainly, and they have the receipts. Their pipeline rotates models per task because no single model is best at everything. The value is in the targeting, the verification, the triage, and the maintainer trust. Anthropic’s own scaffold - launch a container, prompt the model, validate the results - is the same reference architecture everyone in the field is already building.

What Anthropic probably still has is frontier capability on the creative weaponization step: the part where Mythos chains multiple vulnerabilities together, engineers novel delivery mechanisms, and constructs working attacks under tight constraints. AISLE tested this too. Small models proposed plausible alternatives but didn’t arrive at Mythos’s specific trick. That layer may still be frontier-bound.

But here’s the thing: that’s the offensive capability. The part that makes Mythos dangerous. For the defensive use case - the thing Project Glasswing is ostensibly about - you don’t need to construct working exploits. You need to find the bugs, verify they’re real, and ship patches. That pipeline just got reproduced on open weights for $0.11 per million tokens.

The Part That Wasn’t Good News

Not everything in AISLE’s results was rosy. When they ran the same models against patched code (code where the bug had been fixed), most of them still flagged it as vulnerable. They fabricated technical arguments for why the fix didn’t work. Only one model reliably distinguished patched from unpatched code.

A security tool that screams about everything is useless. It’s what killed curl’s bug-bounty program. The fix for this is exactly the scaffold: verification layers, differential analysis, human triage. Which reinforces AISLE’s point, not Anthropic’s. The differentiator isn’t the model’s raw intelligence. It’s the plumbing around the model that catches the model when it’s wrong.
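One piece of that plumbing, the differential check, is worth making concrete. The sketch below is a toy: `flags_vulnerable` stands in for any real model judgment, and the `strcpy` heuristic is purely illustrative. The structure is the point: a finding only survives triage if the signal disappears when the patch is applied.

```python
from typing import Callable

def differential_filter(
    unpatched: str,
    patched: str,
    flags_vulnerable: Callable[[str], bool],
) -> bool:
    """Keep a finding only if it's flagged before the fix and not after."""
    return flags_vulnerable(unpatched) and not flags_vulnerable(patched)

# Toy judge: flags any code containing strcpy.
toy_judge = lambda code: "strcpy" in code
# A model that "screams about everything" always says vulnerable.
noisy_judge = lambda code: True

report = differential_filter(
    "strcpy(b, s);", "strlcpy(b, s, sizeof b);", toy_judge
)
noise = differential_filter(
    "strcpy(b, s);", "strlcpy(b, s, sizeof b);", noisy_judge
)
```

The noisy judge flags the patched code too, so its report dies in triage instead of reaching a maintainer. That is the scaffold doing the work the raw model can’t.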

Discovery-grade AI cybersecurity capabilities are broadly accessible with current models, including cheap open-weights alternatives. The priority for defenders is to start building now: the scaffolds, the pipelines, the maintainer relationships. The models are ready.

— AISLE, April 11, 2026

The Closed Loop, Revisited

In the Glasswing post I argued the vulnerability backlog Mythos was digging through is being actively refilled by AI coding tools. The same company selling the cure is selling the disease. That loop is still real.

But the rigged-tier argument needs an update. I said “offense will democratize, defense won’t.” AISLE’s study shows the defensive layer has already democratized. A solo maintainer keeping a critical library alive at 2am can’t buy Mythos access. But they can absolutely run a cheap open model on their laptop and get Mythos-class detection on isolated functions.

The honest update: the defense democratized first. Not last, as I predicted. The offensive creativity gap may still favor frontier. But the find-and-fix pipeline is broadly accessible right now, in the open. That’s the opposite of the worst-case scenario. It’s arguably the best available outcome.

What I’m Watching

  • The Glasswing 90-day report. If Anthropic publishes substantive findings that open-model pipelines genuinely can’t reproduce, the Mythos premium is justified. If the report is vague, or reproducible on open models within a week, the premium was narrative.
  • Whether Anthropic responds. There’s no honest response that doesn’t either concede the moat is the scaffold (undercutting “too dangerous to release”) or escalate the exclusivity claim (inviting more replication). Silence is the only move that doesn’t make it worse.
  • The maintainer trust layer. AISLE’s real moat isn’t their model choice. It’s that the OpenSSL CTO trusts their patch quality. That relationship is built over a year of landed PRs. You don’t reproduce that by downloading a model. The scaffold is the moat. The maintainer relationships are the scaffold.

The Actual Lesson

If you’re building anything in AI security, stop waiting for a Mythos-class model to become available. The models are already good enough for most of the pipeline. The bottleneck is the scaffold: the targeting, the verification, the false-positive filter, the maintainer trust. That’s the work. That has always been the work.

The Glasswing announcement was supposed to be Anthropic’s victory lap after the worst month in its history. It reframed a leaked dangerous model as a defensive contribution and lined up a Fortune 100 partner list to validate the narrative. What it did not account for is that the same capability that makes a model useful for finding bugs also makes the category trivially easy to replicate the moment anyone with a cheap model bothers to try.

Someone bothered. It took four days.

The defense democratized first. My Glasswing post has an amendment to make.