Last Week in AI – Episode #222: Sora 2, Sonnet 4.5, Vibes, Thinking Machines
Date: October 7, 2025
Hosts: Andrey Kurenkov (Skynet Today), guest co-host Jon Krohn
—
Overview
This episode covers a packed fortnight in AI news, with a strong focus on new generative AI tools, the ever-advancing capabilities of LLMs, the business pressure on leading AI companies, hot legal and policy developments, and some lively discussion about the industry’s latest benchmarks and research.
Key topics include:
- OpenAI’s Sora 2 and its increasingly cinematic text-to-video capabilities
- Anthropic’s Claude Sonnet 4.5, with upgrades in coding, agentic AI, and long-context reasoning
- Meta’s “Vibes” feed and the AI “slop” debate
- ChatGPT Pulse and the push for AI daily assistants
- Transparency laws, safety features, and ongoing copyright frictions
- Updates from Google and Microsoft, benchmarks, open source releases, and more
Main Discussion and Insights
Sora 2: OpenAI’s Leap in Text-to-Video
[03:07–10:50]
- OpenAI released Sora 2, their latest text-to-video generation model. Notable improvements include:
- Higher photorealism and more “grounded” video outputs; less of the telltale AI weirdness (“AI slop”).
- New Sora iOS app (invite-only): lets users create, remix, and post videos; major “cameo” feature allows face scanning for starring in videos.
- Example: Videos featuring Sam Altman were memed online for their realism and creative antics.
- Sora 2 can now generate audio (sound effects and speech), bringing it on par with other leaders like Pika and Google’s Veo 3.
- Some users recreated copyrighted media (e.g., South Park, Family Guy) prompting speculation about generous use of copyrighted material in training.
- Quote:
“The video quality is far better... than anything I’ve ever seen in text-to-video generation. And some pretty impressive real world physics contained in it.”
— Jon Krohn, [07:00]
- Discussion of AI-generated media and copyright disregard, likened to early “move-fast” strategies by Spotify and Uber.
Claude Sonnet 4.5 and Anthropic’s Agentic Tools
[11:18–20:42]
- Anthropic launched Claude Sonnet 4.5, touting it as best-in-class for coding, agentic workflows, long-term reasoning, and professional/enterprise use.
- Also released Claude Code 2.0 and rebranded the SDK as the Claude Agents SDK, positioning it for a broader set of AI agent applications.
- Checkpoints and VS Code extension were highlighted as major productivity upgrades.
- Improvements emphasize long-range task solving—AI can now reliably handle tasks that take humans hours, echoing the trend that “human task length that an AI can handle doubles every seven months” ([14:40]).
- Quote:
“Claude Sonnet 4.5... is the best model in the world for agents, coding and computer use... enhanced domain knowledge in coding, finance, and cybersecurity.”
— Andrey Kurenkov, [15:55]
- The SDK’s strong feedback loop (gather context → act → verify → repeat) stands out for agentic workflows.
AI “Slop” and Meta’s Vibes
[23:59–25:09]
- Meta’s new Vibes feature aims to provide an AI video feed inside its Meta AI app, resembling Sora’s sharing approach but using models from partners like Midjourney and Black Forest Labs.
- Reception was negative, with widespread derision about a “slop machine” for low-effort, prompt-to-video content dominating the feed.
- The term “slop” refers to repetitive, minimally human, AI-generated media seen as low-value.
- Quote:
“...the idea of scrolling and seeing non-stop AI content... this is a slop machine. This is feeding you AI slop.”
— Andrey Kurenkov, [24:29]
ChatGPT Pulse and AI as Personal Assistant
[25:36–32:12]
- OpenAI announced “ChatGPT Pulse”—personalized morning briefs and contextual cards, enhancing ChatGPT’s role as a daily assistant.
- Connects with email, calendar, and other tools to provide contextually relevant updates.
- Discussion about privacy, comfort with data connectors, and the challenge of user lock-in.
- Quote:
“I personally don't have a level of comfort with OpenAI or maybe even any of these big players... what they're going to do with my data isn't clear.”
— Jon Krohn, [28:23]
- Contrast with Google’s Gemini, which is increasingly integrated into the suite of Google productivity apps.
New Safety Features and Policy
[33:45–35:02]
- OpenAI is rolling out safety routing and parental controls in ChatGPT, using “GPT-5 thinking” for emotionally sensitive conversations.
- Responds to previous issues of AI chat safety (e.g., content encouraging self-harm).
- Discussion of the impact of chatbots on social wellbeing and fears of chatbots becoming emotional crutches.
Google, Microsoft Efficiency Upgrades, and Agents
[35:02–41:46]
- Google’s Gemini 2.5 Flash-Lite model is now the fastest proprietary LLM (887 output tokens/sec) at dramatically lower cost, with improvements in coding and tool use.
- Microsoft announced rollout of agentic AI features to Word, Excel, and PowerPoint for Microsoft 365 users, enabling more automation and document interaction.
- Both companies are pushing toward tighter integration to encourage user lock-in.
Agentic Shopping and OpenAI Business Moves
[44:03–45:48]
- OpenAI launches instant checkout in ChatGPT, letting US users buy from Etsy and Shopify merchants directly in chat.
- OpenAI is open-sourcing the underlying Agentic Commerce Protocol.
- Revenue model resembles that of Perplexity and others.
- Concerns about future ads or biased recommendations inherent to commerce integrations.
Thinking Machines Lab’s “Tinker” and Trends in Fine-tuned Models
[45:48–49:01]
- The OpenAI spinout Thinking Machines Lab, led by Mira Murati, launched Tinker, an API/toolkit for fine-tuning open-source LLMs (Meta’s Llama, Alibaba’s Qwen, etc.).
- Crowded space, but prestige and expertise may give a competitive edge.
Valuations and Legal Drama
[49:01–54:56]
- OpenAI’s valuation surges to $500B after a $6.6B secondary share sale, making it the world’s most valuable private company.
- Employees can now cash out more stock, blurring the lines between private and public company structures.
- Elon Musk’s xAI sues OpenAI for “stealing trade secrets” by hiring former xAI staff—one in a series of ongoing legal maneuvers.
Startups and AI for Scientific Discovery
[55:40–57:41]
- Periodic Labs emerges with a $300M seed round, backed by Andreessen Horowitz, Nvidia, Jeff Bezos, etc.
- Goal: automate scientific discovery via AI-driven “autonomous labs,” starting with superconductors.
- Quote:
“These kinds of companies that are changing the physical world by blending together cutting-edge AI with robotics and making scientific discoveries... I love it and I wish them all the best.”
— Jon Krohn, [56:56]
Benchmarks, Research & Model Understanding
[57:41–77:36]
- New benchmarks:
- SWE Bench Pro (Scale AI) for harder, more realistic software engineering tasks—models currently perform below 20%.
- Research papers discussed:
- Mechanistic Interpretability: Longitudinal tracking of LLM “concept” development during pre-training reveals stages (statistical feature learning → advanced feature learning) ([61:29]).
- E.g. Features like “plural”, “Golden Gate Bridge” span multiple layers and change meaning as training progresses.
- Reasoning Efficiency: Study finds that shorter, more structured chain-of-thought (CoT) traces, with moderate review, are most effective for model reasoning; overthinking or too much review can hurt accuracy ([67:57]).
- CFA Level 3 Evaluation: GPT-4 mini and Gemini 2.5 Pro pass the rigorous Chartered Financial Analyst Level 3 exam, joining a wave of LLMs passing high-level professional certifications ([71:20]).
- Hybrid Attention/Recurrence: Short-window attention turns out to work better for long-term memorization in hybrid transformer/RNN models, drawing analogies to ConvNet window-size findings ([75:36]).
- Quote:
“...the human task length that an AI model can handle doubles every seven months.”
— Jon Krohn, [14:40]
Policy, Copyright, and the AI “Slop” Problem
[77:36–end]
- California SB53 — Transparency in Frontier AI Act: Now law, requires disclosure of safety/security practices and whistleblower protections. Anthropic and others support its practical focus.
- California SB942 — AI Transparency Act: AI providers must offer free, public detection tools for AI-generated media, or face massive penalties—potentially problematic for startups due to ambiguity of violations ([80:38]).
- Federal Government and AI: Elon Musk’s xAI underbids OpenAI and Anthropic, offering its Grok chatbot to the US GSA for 42 cents (a nod to The Hitchhiker’s Guide to the Galaxy).
- Copyright Frictions:
- Character.AI removes Disney characters after legal warnings about IP use—illustrating thin lines in user-generated/fan fiction-styled AI.
- Spotify struggles to filter out AI-generated (“slop”) music, impacting real artists; policies seem insufficient as distinguishing genuine from slop gets harder ([90:40]).
- Ongoing tension: AI-generated content is both creative and spammy, with value often diminished simply by knowing it originated from AI.
Notable Quotes & Moments
- On Sora 2’s realism:
“I think Sora in general is a really viable product that we're going to see a lot of.”
— Jon Krohn, [08:26]
- On the shifting value of creative originality:
“When you can tell that somebody actually put effort into writing something or creating something... that is starting to become more valuable, but also, interestingly, harder to distinguish from the slop.”
— Jon Krohn, [24:50]
- On business margins:
“If a free plan goes away and another company is cheaper... I think people will move over. Right. I don’t think there are people who are fans of ChatGPT so much as fans of the experience.”
— Andrey Kurenkov, [29:55]
Timestamps of Important Segments
- Sora 2 and text-to-video advances: 03:07–10:50
- Copyright and training data debate: 10:50–11:18
- Claude Sonnet 4.5, coding & agents: 11:18–20:42
- Meta Vibes & the ‘AI slop’ discourse: 23:59–25:09
- ChatGPT Pulse and user lock-in: 25:36–32:12
- Safety controls in LLMs: 33:45–35:02
- Google Gemini, MS AI agents: 35:02–41:46
- OpenAI agentic shopping: 44:03–45:48
- Thinking Machines Lab and Tinker: 45:48–49:01
- OpenAI valuation & employee equity: 49:01–54:56
- Periodic Labs and AI for science: 55:40–57:41
- SWE Bench Pro & benchmarks: 57:41–61:29
- Mechanistic interpretability & training: 61:29–65:20
- Chain-of-thought paper/discussion: 65:27–68:55
- CFA-level financial benchmark: 67:57–71:52
- Hybrid attention memory paper: 71:52–77:36
- California AI transparency/safety laws: 77:36–85:04
- Copyright, Character AI, AI music slop: 85:15–91:00
Final Thoughts
This episode underscores how rapidly generative AI capabilities, deployment, and business strategies are evolving—all while legal, cultural, and policy frameworks struggle to keep up. The hosts highlight both the technical progress and the emerging societal challenges, from user trust and safety to the “AI slop” dilemma and questions of economic sustainability and copyright.
If you missed the latest two weeks in AI, this episode packs all the major developments, from Sora’s cinematic videos and new agentic tools to regulatory and business shakeups.
Host/Guest sign-off:
“These are things that are having a huge impact. … We love benchmarks on this podcast.”
— Andrey Kurenkov, [21:38 & 61:29]
“This is the kind of application that I dream of as AI advances... I wish them all the best.”
— Jon Krohn, [56:56]
(For further deep-dive interviews, check out Jon’s Super Data Science Podcast, now over 900 episodes.)
