Claude Fable 5: Anthropic's First Public Mythos Model — Banned in 72 Hours

In One Sentence
On June 9, 2026, Anthropic released Claude Fable 5 — the first publicly available Mythos-class model. It scored 80.3% on SWE-Bench Pro, a 22-point lead over GPT-5.5 (58.6%). Three days later, the U.S. government ordered a suspension of access for foreign nationals on national security grounds. Anthropic disabled the model worldwide.
This article covers: Fable 5's real benchmark results, a full comparison with Opus 4.8 / GPT-5.5 / Gemini 3.1 Pro, how the safety classifier system works, what Mythos 5 (the unrestricted version) can do, and the export control saga.
Quick Comparison Table
| Dimension | Claude Fable 5 | Claude Opus 4.8 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| Vendor | Anthropic | Anthropic | OpenAI | Google DeepMind |
| Release date | 2026-06-09 | 2026-05-28 | 2026-04 | 2026-05 |
| Input price ($/M tokens) | $10 | $5 | $5 | $2 |
| Output price ($/M tokens) | $50 | $25 | $30 | $12 |
| SWE-Bench Pro | 80.3% | 69.2% | 58.6% | 54.2% |
| FrontierCode Diamond | 29.3% | 13.4% | 5.7% | — |
| GDP.pdf (vision, no tools) | 29.8% | 22.5% | 24.9% | 16.7% |
| Safety mechanism | Classifier → Opus 4.8 fallback | Standard | Standard | Standard |
| Current status | ⛔ Suspended globally | ✅ Available | ✅ Available | ✅ Available |
Sources: Anthropic official, Vellum analysis
What Is Mythos Tier?
Anthropic's model tiers run: Haiku → Sonnet → Opus → Mythos. Fable 5 is the first publicly available Mythos-class model, representing Anthropic's current ceiling.
Compared to Opus 4.8:
- 2× pricing ($10/$50 vs $5/$25)
- Major coding lead (SWE-Bench Pro 80.3% vs 69.2%)
- But safety classifiers intercept some queries, falling back to Opus 4.8
The underlying Mythos 5 is the same model with some safety restrictions removed, available only through Project Glasswing for U.S. cyber defenders and critical infrastructure operators.
Benchmark Results
All figures cross-verified from Anthropic, Vellum, and LushBinary.
Coding: Fable 5's Dominance
| Benchmark | Fable 5 | Opus 4.8 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| SWE-Bench Pro | 80.3% | 69.2% | 58.6% | 54.2% |
| FrontierCode Diamond (hardest) | 29.3% | 13.4% | 5.7% | — |
| Terminal-Bench 2.1★ | 88.0%★ | 82.7% | 83.4% | 70.7% |
★ Mythos 5 (unrestricted) score. Fable 5 falls back to Opus 4.8 on security-related tasks.
Fable 5's 80.3% on SWE-Bench Pro is a clear lead over the field. Real-world case: Stripe used it to perform a codebase-wide migration across 50 million lines of Ruby — finished in one day vs an estimated 2+ months for a full team.
Cursor founder Michael Truell: "Fable 5 is the state of the art model on CursorBench. It's opened up a class of long-horizon problems that were out of reach."
Knowledge Work & Vision
| Benchmark | Fable 5 | Opus 4.8 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| GDPval-AA (knowledge, ELO) | 1932 | 1890 | 1769 | 1314 |
| GDP.pdf (vision-only, no tools) | 29.8% | 22.5% | 24.9% | 16.7% |
| OSWorld-Verified (computer use) | 85.0% | 83.4% | 78.7% | 76.2% |
| AutomationBench (tool use) | 17.4% | 15.5% | 12.9% | 9.6% |
| Legal Agent Benchmark | 13.3% | 10.4% | 2.1% | 0.0% |
The vision leap is significant — Fable 5 can reconstruct web app source code from screenshots alone. Anthropic's CTO says tasks that required 100 prompts a year ago can now be "one-shotted."
Memory & Long-Running Tasks
Fable 5 holds focus across millions of tokens and self-improves using persistent notes:
- Slay the Spire test: With file-based memory, Fable 5 improved 3× more than Opus 4.8 and reached the final act 3× as often.
- Physics research (Matthew Pines, Alignment Lab): Fable 5 achieved in 36 hours what took GPT-5.5 4 days.
Key behavioral shift: stays on task longer, validates its own work before declaring completion.
Mythos 5 (Unrestricted) Scientific Capabilities
| Benchmark | Mythos 5 | Opus 4.8 |
|---|---|---|
| BioMysteryBench | 46.1% | 40.0% |
| ExploitBench (cybersecurity) | 78.0% | 40.0% |
| HealthBench Professional | 66.0% | 56.9% |
| Humanity's Last Exam (with tools) ★ | 64.5% | 57.9% |
Source: LushBinary comparison
Key finding: In blind tests, scientists preferred Mythos 5's molecular biology hypotheses ~80% of the time over Opus-class models. One novel mechanism for an E. coli protein was independently corroborated by another lab.
In drug design: accelerated parts of the workflow by ~10× — 9 of 14 protein targets yielded strong candidates.
The Safety Classifier System

Fable 5's core innovation — and biggest controversy — is the safety classifier system. A separate AI monitors input queries and determines if they touch sensitive domains (cybersecurity, biology/chemistry, model distillation):
- Blocked queries → routed to Claude Opus 4.8 for response
- Unblocked queries → Fable 5 at full capability (≈ Mythos 5 level)
Key stats:
- <5%** of sessions trigger fallback. **>95% run with full Fable 5 capability.
- On cybersecurity evaluations, Mythos 5 scores 78.0%, but Fable 5 in blocking mode scores 0% on offensive cyber tasks.
- Bug bounty ran 1,000+ hours with no universal jailbreak found. UK AI Safety Institute made progress within a brief window.
Controversy: the classifiers are overly broad. Legitimate cybersecurity research questions can get intercepted. This drew community criticism.
Additionally, 30-day data retention is mandatory for all Mythos-class model traffic. Anthropic pledges no training use; data is deleted after 30 days.
The Core Event: A 72-Hour Lifecycle
| Date | Event |
|---|---|
| 2026-06-09 | Anthropic releases Fable 5 + Mythos 5 |
| 2026-06-09 to 06-12 | Normal service |
| 2026-06-12, 5:21 PM ET | U.S. export control directive takes effect |
| 2026-06-12 | Anthropic disables Fable 5 + Mythos 5 worldwide |
| Present | Still unavailable (awaiting government consultation) |
The Dispute
The government claimed to have discovered a jailbreak method that allowed Fable 5 to perform restricted cybersecurity tasks.
Anthropic's response:
"We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass." — Anthropic, 2026-06-13
Anthropic called the government's action a "misunderstanding" and argued that a narrow, non-universal potential jailbreak should not trigger a recall of a widely deployed commercial model.
Behind the scenes:
- Amazon (Anthropic's largest investor) raised concerns with the White House after internal testing allegedly showed Fable 5 could provide attack-relevant information under certain prompts.
- Semafor (2026-06-14) reported the move was motivated by fears that a China-linked group had accessed Mythos 5.
Current Impact
- Fable 5 and Mythos 5 globally unavailable
- Opus 4.8 and other Claude models unaffected
- Anthropic is negotiating for restored access
- Enterprise users should default to Opus 4.8 until Fable 5 returns
Pricing Analysis
Standard Rates
| Model | Input ($/M tokens) | Cached Input | Output ($/M tokens) |
|---|---|---|---|
| Fable 5 / Mythos 5 | $10 | $1 | $50 |
| Opus 4.8 | $5 | $0.50 | $25 |
| GPT-5.5 | $5 | $0.50 | $30 |
| Gemini 3.1 Pro | $2 | $0.20 | $12 |
Source: LushBinary pricing table
Per-Task Cost Comparison (200K input + 50K output, no cache)
| Model | Input (200K) | Output (50K) | Total per task |
|---|---|---|---|
| Fable 5 | $2.00 | $2.50 | $4.50 |
| GPT-5.5 | $1.00 | $1.50 | $2.50 |
| Opus 4.8 | $1.00 | $1.25 | $2.25 |
| Gemini 3.1 Pro | $0.40 | $0.60 | $1.00 |
Cost is Fable 5's weakness — 80% more expensive than GPT-5.5 per task, 2× Opus 4.8. But the real metric is cost per successful task: if Fable 5 solves in one pass while a cheaper model needs 2-3 retries, total cost may favor Fable 5.
Subscription Note
Pro/Max/Team subscribers got free Fable 5 access June 9–22. After June 23, Fable 5 required usage credits. This is moot now — the model is suspended.
Decision Guide: What to Use Until Fable 5 Returns
| Task Profile | Recommended Model | Rationale |
|---|---|---|
| Hard coding, framework migrations, multi-step autonomous programming | Opus 4.8 (interim Fable 5 replacement) | Best available coding model at $5/$25 |
| Routine high-volume, latency-sensitive: classification, summarization, translation | Opus 4.8 | Half the price of Fable 5 |
| Codex CLI investment, diagram-heavy reasoning | GPT-5.5 | Strong agentic coding at 56% of Fable 5 cost |
| Google ecosystem (Vertex AI, Workspace), high throughput | Gemini 3.1 Pro | ~4.5× cheaper; gap not fatal for most tasks |
| Cybersecurity or biology workloads | Test Opus 4.8 or GPT-5.5 carefully | Even with Fable 5 available, these queries fall back to Opus 4.8 |
Summary
Claude Fable 5's 72-hour existence highlights the two core tensions in 2026 AI:
- Capability vs. Regulation — Mythos-class coding and science capabilities are real and unprecedented. That very capability triggered the government response.
- Openness vs. Safety — Anthropic's safety classifier system is the industry's most advanced, yet it still wasn't enough for regulators. The strict safety mechanisms (30-day retention, fallback strategy) also frustrated developers.
For most developers, the practical impact is manageable: Opus 4.8 is still the best available coding model, while GPT-5.5 holds unique advantages in the Codex CLI ecosystem. Fable 5 was a glimpse of the ceiling — a demonstration of what's possible, waiting to return.
Important Note
All benchmark figures come from Anthropic's official reports or reputable third-party evaluations. Scores are influenced by the testing scaffold: e.g., GPT-5.5 uses Codex CLI for Terminal-Bench, Gemini uses Gemini CLI. SWE-Bench Pro and FrontierCode control for this variance.
Sources
- Anthropic: Claude Fable 5 and Mythos 5
- Anthropic: Fable and Mythos Access Update
- The Hacker News: U.S. Orders Anthropic to Suspend Fable 5
- Vellum: Fable 5 & Mythos 5 Benchmarks Explained
- LushBinary: Fable 5 vs GPT-5.5 vs Gemini 3.1 Pro
- Morphllm: Claude Benchmarks 2026
Stay Ahead of AI Model Releases
Bookmark AI Tools Insight for honest, data-driven AI model comparisons. No hype, just what works.
Subscribe
Comments & Danmaku
Leave a comment — it flies across the page as danmaku!