Anthropic Mythos Breach: Unauthorized Group Gets Into “Too Dangerous” Cyber-AI Model

The CyberSec Guru


If you like this post, then please share it:

Buy me A Coffee!

Support The CyberSec Guru’s Mission

🔐 Fuel the cybersecurity crusade by buying me a coffee! Why your support matters: Zero paywalls: Keep the content 100% free for learners worldwide, Writeup Access: Get complete writeup access within 12 hours of machine drop along with scripts and commands.

“Your coffee keeps the servers running and the knowledge flowing in our fight against cybercrime.”☕ Support My Work

Buy Me a Coffee Button

An unauthorized group, operating through a Discord community focused on unreleased AI, has gained access to "Mythos", a frontier AI model Anthropic built for elite-level cybersecurity. They got in through a chain that reports describe as "low-sophistication, high-impact": guessing URL patterns based on Anthropic's known naming conventions, using leaked metadata from a prior breach at AI startup Mercor, and exploiting still-active credentials belonging to a contractor. Mythos can find and exploit zero-day vulnerabilities in Windows, Linux, macOS, and major web browsers at machine speed. Anthropic has confirmed it is investigating unauthorized access through a "third-party vendor environment," but says there is no evidence anyone touched its core internal systems. The group also claims access to Anthropic's entire unreleased model pipeline, not just Mythos.

A Bad Day for AI Security

Reports surfaced today, cautiously confirmed by Anthropic, that an unauthorized group has gained working access to Claude Mythos, a model the company had decided was too dangerous to release.

For weeks, the tech industry had been quietly buzzing about "Project Glasswing," Anthropic's initiative to share this so-called superhacker AI with a handful of select partners: Apple, Microsoft, Amazon, and Google. The idea was to use the model to find and fix the world's most critical software bugs before anyone with bad intentions could. According to a report, the people with bad intentions, or at least the unauthorized ones, are already in.

This isn’t a leaked chatbot. It’s a tool that Anthropic CEO Dario Amodei previously suggested could “hack every major OS and browser.”

How They Got In

It started not at Anthropic but at Mercor, an AI training startup that does contracting work for Anthropic. A data leak there reportedly exposed Anthropic's internal model naming conventions. Armed with that, the Discord group, a private community of people obsessed with unreleased AI, didn't need to crack anything. They just guessed the URL. Using Anthropic's known endpoint structure, they found where the private Mythos preview was hosted.
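To see why predictable naming conventions are dangerous, consider a minimal sketch of how endpoint guessing works. Every hostname, path segment, and model slug below is invented for illustration; the article does not disclose Anthropic's actual endpoint structure, and this only demonstrates how few guesses a known naming pattern requires.

```python
from itertools import product

# All values here are hypothetical placeholders, not real endpoints.
BASE = "https://preview.example-ai.com"            # invented host
ENV_SEGMENTS = ["internal", "partner", "staging"]  # invented path segments
MODEL_SLUGS = ["mythos-preview", "mythos-rc1"]     # invented model names

def candidate_urls(base, envs, models):
    """Enumerate plausible endpoint URLs from a known naming pattern."""
    return [f"{base}/{env}/{model}" for env, model in product(envs, models)]

# With 3 environments and 2 slugs, the whole search space is 6 URLs --
# small enough to check by hand, let alone with a script.
for url in candidate_urls(BASE, ENV_SEGMENTS, MODEL_SLUGS):
    print(url)
```

The point is not the code but the math: once the naming convention leaks, "guessing" becomes trivial enumeration, which is why security through obscure URLs is not security at all.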

The second piece was a contractor. There are reports that one member of the group works at a firm contracted by Anthropic, and that person’s credentials were still active. That was enough to get through what was left of the perimeter.

“We’re investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments,” an Anthropic spokesperson said. The company’s position is that its internal servers weren’t touched, but when you’re dealing with an agentic AI, access to its environment is functionally the same as access to the model.

What Mythos Actually Does

During internal testing, Anthropic pointed Mythos at OpenBSD, an operating system with a near-obsessive focus on security, and within hours the model found a high-severity vulnerability that had gone undetected by human researchers for 27 years.

Beyond that, Mythos reportedly found zero-day flaws in FFmpeg (the library that powers most web video), chained together multiple Linux kernel vulnerabilities to get root access, and built exploit chains for the latest versions of Chromium and Safari.

Unlike Claude Sonnet or Opus, Mythos wasn’t built with Constitutional AI guardrails. It won’t refuse to write exploit code, social engineering scripts, or network scanners. It was built as an offensive tool for defensive purposes. That offense is now outside Anthropic’s walls.

Who Did This?

The group describes itself as “model hunters” – people who track down unreleased AI models to probe their capabilities. They say they’re curious, not destructive.

“We are interested in playing around with new models, not wreaking havoc with them,” a source from the group told an analyst. They provided screenshots of internal system prompts and live demonstrations as proof of access.

But the group also made a claim that’s harder to shrug off: they say they have access to the “whole pipeline.” That would mean other unreleased Anthropic models, including possible successors to Opus and specialized reasoning models, may also be compromised.

Why This Matters

When a software bug is found, defenders normally have a window of days, weeks, sometimes months to patch it before it gets exploited widely. Mythos collapses that window to near-zero. The model can find a vulnerability and write a working exploit in seconds. If an unauthorized group has it running, they can theoretically automate the discovery and exploitation of every unpatched server on the internet.

Anthropic’s Responsible Scaling Policy was supposed to prevent this kind of thing. But the weak link wasn’t alignment or safety training. It was a contractor’s credentials and a predictable URL.

FAQ

Is my personal data at risk? Not directly. Mythos finds software flaws; it doesn’t rifle through databases. But if someone uses it to find vulnerabilities in software you depend on, such as Windows or your bank’s web portal, those systems become exposed.

Can I use Mythos? No. Anthropic has no plans to release it publicly.

What is Project Glasswing? Anthropic’s program to share Mythos with Apple, Microsoft, and others for defensive red-teaming and patching.

How did they get in? A data leak at Mercor revealed naming conventions, the group guessed the model’s URL, and a contractor’s credentials provided the rest.

