The AI Daily Brief: "Anthropic Accidentally Revealed Their Most Powerful Model Ever"
Host: Nathaniel Whittemore (NLW)
Date: March 27, 2026
Episode Overview
This episode explores recent breaking news in the world of artificial intelligence, most notably the accidental revelation of Anthropic’s newest and most powerful AI, Claude Mythos. NLW examines the implications of this leak, reviews significant new product announcements from Google and Shopify, discusses shifts in business models around vertical AI and in-house model training, and considers the enduring relevance of “the bitter lesson” in AI development. The episode’s central focus is on the emergence of high-performing vertical and post-trained models, questioning how they might disrupt the current dominance of general AI models from leading labs.
Major Headlines and Key Discussion Points
1. Anthropic’s Accidental Leak: Claude Mythos
- [01:02] A late-night leak revealed Anthropic is trialing a new model codenamed Claude Mythos, described as “by far the most powerful AI model we've ever developed.”
- The leaked draft blog post claims Mythos significantly outperforms the previous flagship, Claude Opus 4.6, especially in coding, academic reasoning, and cybersecurity.
- Anthropic is cautious: “We want to act with extra caution and understand the risks it poses, even beyond what we learn in our own testing.”
- The release will be gradual—currently, only a small set of early-access customers can test it, in part to probe cybersecurity risks.
- The model is expensive to run, both for Anthropic and customers; efficiency work is ongoing before general release.
- The model was also referred to as “Capybara”—unclear whether this was an internal codename or an alternate launch name.
- The leak was part of a cache of some 3,000 unpublished Anthropic blog assets.
- Industry Reaction: Much of the commentary focused on the naming, with some noting its mythological resonance (“Mythos” evoking Lovecraft’s Cthulhu mythos) and others poking fun at OpenAI’s potato-themed codename “Spud.”
- Quote: "I like how anthropic's mysterious, spooky new model is codenamed mythos, while OpenAI named theirs after a frickin potato." — Jason Botterell [~03:40]
- Quote: “It will only go faster from here.” — Gavin Purcell [03:55]
- No public release timing; this was not a formal launch, but an advance warning.
2. Other Notable AI Product Developments
Google: Gemini 3.1 Flash Live Voice Model
- [05:15] Google released “Gemini 3.1 Flash Live,” a small, real-time voice model enabling continuous human-like conversation (rather than the usual turn-based, stilted experience).
- Outperforms prior voice models on audio benchmarks, including multi-step function calling.
- Early deployment at Home Depot showed marked improvement in handling complex audio (product codes, noisy environments).
- Expected to raise the quality of personal voice agents, with implications for Apple’s new Siri, which is reportedly powered by Gemini.
- Humorous take: “The long winter of our discontent of Siri not understanding a single damn word we say may finally be coming to an end.” [07:19]
Shopify: Tinker Mobile App
- [08:00] Shopify released Tinker, a free app featuring 100+ AI tools for e-commerce merchants—generating logos, product photos, ad videos, and more.
- Designed to flatten the learning curve, with easy outcome-based navigation and example-driven interfaces.
- Empowers small business entrepreneurship—Shopify as a positive force in “normalizing” AI.
- Quote: “If you want more artists, lower the cost of paint. And cost isn't just money, it's the time spent keeping up, the friction...” — Rousseau Kazi, Director of Product, Shopify [09:20]
- NLW’s take: This could play a big role in shaping broader AI adoption perceptions.
OpenAI
- Codex Plugins Upgrade: Codex now supports more complex “real work”—planning, research, coordination around code, and broader workflows.
- Responds directly to recent Anthropic restrictions: as Anthropic limited free access to Claude, OpenAI relaxed limits for experimentation.
- Quote (Codex team): “You can just build unlimited things with Codex. Have fun.” [11:35]
- Adult Mode Cancellation: OpenAI has indefinitely shelved its plans for an “Adult Mode” (erotic chatbot features), favoring focus on coding/enterprise sales.
- Safety, age-detection flaws, staff controversy, and advisory council objections prompted the change.
- NLW: “All of the costs of going down this route were so obviously going to be higher than the upside for OpenAI.” [13:55]
- OpenAI is also reportedly culling other underperforming products (suggesting focus rather than flailing).
- IPO Race: Rumors indicate Anthropic may IPO as early as October, which could pressure OpenAI to move faster.
- Quote: “2026 is the year of the mega IPO.” — Noel Moldvay [14:45]
3. Main Theme: The Rise of Vertical and Post-Trained Models
(Main Discussion begins ~15:00)
The “Bitter Lesson” in AI
- [16:30] Revisits Rich Sutton’s “bitter lesson”:
- General, compute-heavy methods ultimately outperform efforts to encode domain-specific human knowledge.
- Example: In chess, brute-force search/modeling defeated more artisanal, human-knowledge approaches.
The Changing Equation: Vertical Models and Post-Training
Intercom’s Fin Apex & Custom Models
- [17:59] Intercom announces Fin Apex, its vertical customer-service model: claims it’s “objectively the highest performing, fastest and cheapest model for customer service,” surpassing GPT 5.4 and Opus 4.5.
- Built on billions of proprietary annotated customer service interactions—“last mile” usage data.
- Domain-specific post-training is the breakthrough—it lets the model outperform general foundation models on focused tasks.
- Quote (Paul Adams, Intercom): “We have built a brand new model for Fin called Apex, which has a higher resolution rate, fewer hallucinations and is far cheaper than any other model…and it isn’t close.” [18:41]
- The “flywheel” effect: more usage, better data, better models.
- Quote: “It means that vertical models can and will outperform general models.” — Paul Adams [19:12]
Cursor’s Composer 2 Controversy
- [20:00] Similar claims: Composer 2, Cursor’s coding model, beat state-of-the-art frontier models on coding benchmarks, despite being fine-tuned from an open-weight base (Kimi K2.5, refined with RL).
- Quote (Lee Robinson, Cursor Product Manager): “Yup, Composer 2 started from an open source base. We will do full pre training in the future. Only a quarter of the compute...came from the base. The rest is from our training…” [21:30]
- Some critique Cursor’s lack of transparency, but the technical community is impressed by the rapid progress and efficacy of post-training.
- Quote (Leetllm): “Seeing an open weight Kimi K2.5 fine tune actually beat [Opus 4.6] on coding benchmarks is wild.” [22:15]
Industry Chatter on Verticalization
- [23:01] Many software companies (Pinterest, Airbnb, Notion, Cursor, Intercom) are shifting to in-house or post-trained open models, reducing reliance on API calls to frontier labs.
- Custom post-training is rapidly eroding the “infrastructure moat.”
- Quote (Clem Delangue, Hugging Face): “I believe the majority of AI workflows will be in house based on open source versus API. It took much more time than we anticipated, but it’s happening now.” [23:57]
Decagon’s Multi-model Architecture
- [24:20] Novel approach: a network of specialized models, not a single monolith, each optimized for steps in customer interaction (detection, orchestration, etc.).
Implications and Industry Impact
API Economics Shifting
- The “API tax” is becoming unsustainable. As open models get stronger and cheaper, internal adoption will accelerate, as was the case in cloud computing a decade ago.
- Quote (Adriana Sabatta): “The API tax is starting to look like the cloud markup of 10 years ago.” [25:08]
Future Model Competition
- “Classic disruption is now at [the labs’] door. The only way out is to disrupt themselves by building cheaper specialized models too.” [25:44]
- Battle lines: big labs excel in horizontal general intelligence; vertical players can now outperform on special-purpose tasks using high-quality, domain-specific post-training.
Theoretical Foundation: Experience vs. Human Knowledge
- [26:07] Clip from Richard Sutton (on the Dwarkesh podcast): Predicts systems that learn from “experience” (not just encoded human knowledge) will outperform.
- Quote (Richard Sutton): “I…expect there to be systems that can learn from experience, and those could, well, perform much, much better and be much more scalable. In which case it will be another instance of the bitter lesson that the things that used human knowledge were eventually superseded by things that just trained from experience and computation.” [26:07]
- NLW synthesizes: Apex and Composer 2 are “post-trained from experience, exactly as Sutton said.” [26:52]
What Comes Next?
- Not every company will succeed with a bespoke model: effective post-training is still a rare skill and requires massive, high-quality, annotated data.
- NLW’s forecast: Expect a significant uptick in experimentation with vertical and post-trained models among companies with strong last-mile data and post-training chops.
- Watch for data partnerships, M&A activity, and “hyper-specific” model providers going head to head with the labs.
Notable Quotes & Timestamps
- “We’ve finished training a new AI model, Claude Mythos…By far the most powerful AI model we've ever developed.” — Anthropic draft blog post [01:30]
- “I like how anthropic's mysterious, spooky new model is codenamed mythos, while OpenAI named theirs after a frickin potato.” — Jason Botterell [03:40]
- “The long winter of our discontent of Siri not understanding a single damn word we say may finally be coming to an end.” — NLW [07:19]
- “If you want more artists, lower the cost of paint. And cost isn't just money, it's the time spent…We wanted to lower all of it.” — Rousseau Kazi, Shopify [09:20]
- “You can just build unlimited things with Codex. Have fun.” — Thibaut, OpenAI Codex [11:35]
- “We have built a brand new model for Fin called Apex…higher resolution rate, fewer hallucinations and is far cheaper than any other model…And it isn't close.” — Paul Adams, Intercom [18:41]
- "Seeing an open weight QEMI 2.5 fine tune actually beat it on coding benchmarks is wild." — Leetllm [22:15]
- "The API tax is starting to look like the cloud markup of 10 years ago.” — Adriana Sabatta [25:08]
- "I…expect there to be systems that can learn from experience, and those could, well perform much, much better…" — Richard Sutton [26:07]
Thematic Takeaways
- Anthropic’s leak underscores fierce competition and rapid, sometimes erratic, development cycles in cutting-edge AI.
- There is a swift, industry-wide shift underway towards vertical, domain-specialized AI models, powered by proprietary post-training on experiential (rather than human-encoded) data.
- The “bitter lesson” remains true, but is evolving: massive experiential data and compute win—but the source of data is shifting from general to task-specific.
- API/closed lab dominance is facing a new wave of disruption by increasingly powerful open models plus company-specific fine-tuning.
- Practical implication: Expect more in-house AI, less outsourcing to general-purpose models, and a rise of specialized vertical solutions.
Final Thoughts
NLW closes by emphasizing that, while not every company will be able to take advantage of vertical models, the recent results from Intercom, Cursor, and other players are prompting many to experiment. The competitive landscape for AI is evolving faster than ever, making it a critical domain to watch for both practitioners and industry followers alike. “We will continue to explore this trend,” NLW concludes, “but for now, that’s going to do it for today’s AI Daily Brief.”
Summary by The AI Daily Brief Podcast Summarizer – covering all major themes, industry shifts, and memorable moments for listeners who want a detailed yet clear encapsulation of today’s AI news.
