Anthropic Mythos Breach: Unauthorized Group Gets Into “Too Dangerous” Cyber-AI Model

The CyberSec Guru


If you like this post, then please share it:

Buy me A Coffee!

Support The CyberSec Guru’s Mission

🔐 Fuel the cybersecurity crusade by buying me a coffee! Why your support matters: Zero paywalls: Keep the content 100% free for learners worldwide, Writeup Access: Get complete writeup access within 12 hours of machine drop along with scripts and commands.

“Your coffee keeps the servers running and the knowledge flowing in our fight against cybercrime.”☕ Support My Work

Buy Me a Coffee Button

An unauthorized group, operating through a Discord community focused on unreleased AI, has gained access to "Mythos", a frontier AI model Anthropic built for elite-level cybersecurity. They got in through a chain that reports describe as "low-sophistication, high-impact": guessing URL patterns based on Anthropic's known naming conventions, using leaked metadata from a prior breach at AI startup Mercor, and exploiting still-active credentials belonging to a contractor. Mythos can find and exploit zero-day vulnerabilities in Windows, Linux, macOS, and major web browsers at machine speed. Anthropic has confirmed it is investigating unauthorized access through a "third-party vendor environment," but says there is no evidence anyone touched its core internal systems. The group also claims access to Anthropic's entire unreleased model pipeline, not just Mythos.

A Bad Day for AI Security

Reports surfaced today, cautiously confirmed by Anthropic, that an unauthorized group has gained working access to Claude Mythos, a model the company had decided was too dangerous to release.

For weeks, the tech industry had been quietly buzzing about "Project Glasswing," Anthropic's initiative to share this so-called superhacker AI with a handful of select partners: Apple, Microsoft, Amazon, and Google. The idea was to use the model to find and fix the world's most critical software bugs before anyone with bad intentions could. According to a report, the people with bad intentions, or at least the unauthorized ones, are already in.

This isn’t a leaked chatbot. It’s a tool that Anthropic CEO Dario Amodei previously suggested could “hack every major OS and browser.”

How They Got In

It started not at Anthropic but at Mercor, an AI training startup that does contracting work for Anthropic. A data leak there reportedly exposed Anthropic's internal model naming conventions. Armed with that, the Discord group, a private community of people obsessed with unreleased AI, didn't need to crack anything. They just guessed the URL. Using Anthropic's known endpoint structure, they found where the private Mythos preview was hosted.
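To see why predictable naming conventions are dangerous, consider a minimal sketch of how endpoint guessing works. Every hostname, path segment, and model slug below is invented for illustration; the article does not disclose Anthropic's actual endpoint structure, and this only demonstrates how few guesses a known naming pattern requires.

```python
from itertools import product

# All values here are hypothetical placeholders, not real endpoints.
BASE = "https://preview.example-ai.com"            # invented host
ENV_SEGMENTS = ["internal", "partner", "staging"]  # invented path segments
MODEL_SLUGS = ["mythos-preview", "mythos-rc1"]     # invented model names

def candidate_urls(base, envs, models):
    """Enumerate plausible endpoint URLs from a known naming pattern."""
    return [f"{base}/{env}/{model}" for env, model in product(envs, models)]

# With 3 environments and 2 slugs, the whole search space is 6 URLs --
# small enough to check by hand, let alone with a script.
for url in candidate_urls(BASE, ENV_SEGMENTS, MODEL_SLUGS):
    print(url)
```

The point is not the code but the math: once the naming convention leaks, "guessing" becomes trivial enumeration, which is why security through obscure URLs is not security at all.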

The second piece was a contractor. There are reports that one member of the group works at a firm contracted by Anthropic, and that person’s credentials were still active. That was enough to get through what was left of the perimeter.

“We’re investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments,” an Anthropic spokesperson said. The company’s position is that its internal servers weren’t touched, but when you’re dealing with an agentic AI, access to its environment is functionally the same as access to the model.

What Mythos Actually Does

During internal testing, Anthropic pointed Mythos at OpenBSD, an operating system with a near-obsessive focus on security, and within hours the model found a high-severity vulnerability that had gone undetected by human researchers for 27 years.

Beyond that, Mythos reportedly found zero-day flaws in FFmpeg (the library that powers most web video), chained together multiple Linux kernel vulnerabilities to get root access, and built exploit chains for the latest versions of Chromium and Safari.

Unlike Claude Sonnet or Opus, Mythos wasn’t built with Constitutional AI guardrails. It won’t refuse to write exploit code, social engineering scripts, or network scanners. It was built as an offensive tool for defensive purposes. That offense is now outside Anthropic’s walls.

Who Did This?

The group describes itself as “model hunters” – people who track down unreleased AI models to probe their capabilities. They say they’re curious, not destructive.

“We are interested in playing around with new models, not wreaking havoc with them,” a source from the group told an analyst. They provided screenshots of internal system prompts and live demonstrations as proof of access.

But the group also made a claim that’s harder to shrug off: they say they have access to the “whole pipeline.” That would mean other unreleased Anthropic models, including possible successors to Opus and specialized reasoning models, may also be compromised.

Why This Matters

When a software bug is found, defenders normally have a window of days, weeks, sometimes months to patch it before it gets exploited widely. Mythos collapses that window to near-zero. The model can find a vulnerability and write a working exploit in seconds. If an unauthorized group has it running, they can theoretically automate the discovery and exploitation of every unpatched server on the internet.

Anthropic’s Responsible Scaling Policy was supposed to prevent this kind of thing. But the weak link wasn’t alignment or safety training. It was a contractor’s credentials and a predictable URL.

FAQ

Is my personal data at risk? Not directly. Mythos finds software flaws; it doesn’t rifle through databases. But if someone uses it to find vulnerabilities in software you depend on, such as Windows or your bank’s web portal, those systems become exposed.

Can I use Mythos? No. Anthropic has no plans to release it publicly.

What is Project Glasswing? Anthropic’s program to share Mythos with Apple, Microsoft, and others for defensive red-teaming and patching.

How did they get in? A data leak at Mercor revealed naming conventions, the group guessed the model’s URL, and a contractor’s credentials provided the rest.

