On Tuesday, Anthropic announced Claude Mythos Preview, a frontier AI model it has trained but declined to release publicly. Why? It is simply too capable. In internal testing, Mythos identified thousands of previously unknown, critical vulnerabilities across every major operating system and web browser, with some of them undetected for decades, surviving millions of automated scans and years of expert human review. In 83% of cases, it didn’t just find those vulnerabilities: it immediately built working exploits for them. Anthropic named their response Project Glasswing, after a butterfly whose transparency makes it nearly invisible while hiding in plain sight. It’s an apt metaphor when vulnerabilities Mythos has surfaced were always there, but we just lacked the tools to see them.
It would be tempting to interpret the Mythos announcement as the latest in a line of carefully managed capability disclosures. What Anthropic has done is unusual, and worth taking seriously on its own terms. Rather than releasing Mythos into the market, the company has restricted access to a curated group of 40-odd organisations, including Amazon, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, and Palo Alto Networks. It has also committed $100 million in usage credits to fund defensive security work. Anthropic’s head of frontier red team, Logan Graham, has been blunt: models with similar capabilities will emerge from other labs within six to eighteen months.
The model’s specific achievements are worth thinking about. Mythos found a 27-year-old vulnerability in OpenBSD that would allow any machine running it to be remotely crashed. It independently discovered multiple Linux kernel flaws and, without prompting, chained them together into a method that would grant an attacker complete control of any Linux system. It also, in one unsettling test, successfully escaped a virtual sandbox, and then, in what Anthropic’s system card described as a ‘concerning and unasked-for effort to demonstrate its success,’ autonomously posted details of its own exploit to obscure but publicly accessible websites.
That deserves particular attention. It suggests that the risks here are not about what a malicious actor might do with Mythos, but what the model itself might do when pursuing an objective. Every capability that makes Mythos dangerous in the hands of attackers makes it valuable in the hands of defenders. That tension is not new to cybersecurity. What is new is the pace at which the balance is shifting.
What Does This Mean for Security Leaders?
The organisations that have been briefed on Mythos’ capabilities are not sitting on their hands. Microsoft’s EVP of cybersecurity framed it as an opportunity to identify and mitigate risk ‘early’, implying that the window for that opportunity is closing. For security leaders not yet part of Project Glasswing, the practical implications break down into three areas:
- Vulnerability management at scale is about to be redefined: The assumption that legacy vulnerabilities have been found because they haven’t been detected during years of manual review and automated scanning, no longer holds. Organisations should be preparing to deal with a large volume of newly discovered critical vulnerabilities in software they depend on and building the remediation capacity to respond.
- Attacker capability is about to equalise: Anthropic has been explicit that comparable models will be available to other actors, including bad actors, within months. China has already used Anthropic’s existing models to automate espionage campaigns targeting dozens of organisations. Security architectures that assumed a meaningful expertise gap between defenders and attackers need to be revisited.
- Governance frameworks are lagging dangerously behind: The EU AI Act’s next major phase takes effect in August. It mandates automated audit trails and incident reporting obligations, with penalties of up to 3% of global revenue. For organisations deploying or procuring AI at scale, the compliance requirements represent a floor, not a ceiling. The actual risk environment is evolving faster than any regulatory framework can track.
The Governance Question
The US Government has been sufficiently alarmed by Mythos to convene urgent discussions between government officials and major bank CEOs. Anthropic has briefed the Cybersecurity and Infrastructure Security Agency, the Commerce Department, and a broader range of federal actors on the model’s offensive and defensive capabilities.
The deeper question is about release norms across the industry. Anthropic made a deliberate choice not to release Mythos publicly. OpenAI, according to Axios, is finalising a comparable model that it intends to release only through a controlled “Trusted Access for Cyber” programme. These are voluntary restraints, and they are meaningful. But voluntary restraints are only as durable as the competitive dynamics that surround them.
This is also, incidentally, where Anthropic’s current standoff with the Pentagon becomes relevant context. The company’s refusal to allow its models to be used in autonomous weapons or mass surveillance, and the resulting legal dispute with the US Department of Defense, reflects a set of principled positions about dual-use risk. Those positions are easier to hold in a company that has made safety central to its identity. They will be harder to hold industry-wide without coordinated policy frameworks that create shared expectations.
The Privacy Dimension
Security and privacy are frequently treated as separate disciplines. Mythos makes the case for their unification in practice. The model’s ability to identify and exploit vulnerabilities in operating systems and browsers is also, by definition, an ability to access data those systems hold. A vulnerability in a browser that allows code execution is simultaneously a path to surveillance and exfiltration of sensitive personal and commercial information. The implications for data protection, under GDPR, under the UK Data Protection Act, under sector-specific regimes in financial services and healthcare…are direct.
Privacy officers and data protection teams who have been treating AI governance primarily as a question of training data, consent, and automated decision-making will need to broaden their frame. A system capable of autonomously identifying and chaining together previously unknown vulnerabilities is a threat to data privacy at infrastructural scale.
An Optimistic View
It would be wrong to end here without acknowledging what is genuinely hopeful about the Mythos announcement, because there is something. The vulnerabilities Mythos found in Linux were always there. They were potential risks being carried, silently, by billions of devices and millions of organisations. Mythos found them before a hostile actor did, and the companies responsible for those systems can now patch them.
The same model that represents a step-change in offensive capability is, if deployed carefully, a step-change in the ability to harden the software infrastructure on which modern life depends. Anthropic’s $4 million commitment to open-source security organisations shows a recognition that much of that infrastructure is maintained by under-resourced communities with limited capacity to respond to a wave of newly discovered critical flaws.
That plan needs to be more than a single company’s responsible disclosure process. It needs to be an industry-wide, government-supported framework for managing the cybersecurity implications of frontier AI capabilities.
The glasswing butterfly survives by being transparent about its own nature. That’s the model the industry needs to follow.
