Last Week in AI – Episode #217

Date: July 23, 2025
Hosts: Andrey Kurenkov, Jon Krohn
Theme: Weekly AI news roundup with focus on new agentic tools (notably OpenAI’s ChatGPT Agent), major open-source developments (Kimi K2), business and hiring drama among AI giants, and societal impacts of emerging AI.

Episode Overview

This episode dives into a whirlwind week on the AI front, headlined by OpenAI's new ChatGPT Agent capable of controlling a user's computer, the open-source Kimi K2 model out of China, and a series of high-profile hiring moves and business shakeups in Silicon Valley. The hosts also discuss the ongoing march of agentic tools, developments in AI regulatory policy, and concerning applications like deepfake websites and government surveillance tech. On a lighter note, they address the surprising turn of Grok's new "companion" features and the resolution of a video game actor strike about AI-generated voices.

Key Discussion Points & Insights

1. OpenAI’s ChatGPT Agent: A Major Leap in Agentic AI

[03:10–06:38]

What is it?
OpenAI’s newest ChatGPT Agent mode allows the AI to control an entire computer, autonomously carrying out tasks ranging from web browsing to creating spreadsheets and slideshows. It combines and surpasses the earlier Operator and Deep Research modes.
Performance:
Clearly outperforms previous agent options and even other leading models on benchmark tasks.
User Experience:
Users can watch the agent as it works, interrupt its actions, and even take over tasks.
Safety Concerns:
Requires user permissions for sensitive actions like sending emails or making bookings. Financial transactions are restricted.

Quote:

“It seems like it would be comparable to having a McKinsey analyst working for you, except that they can get their work done in minutes instead of days or weeks.” – Jon Krohn [05:28]

2. Kimi K2: Open-Source Model Rivals BigTech

[07:08–08:48]

Background:
Alibaba-backed Moonshot releases Kimi K2, a 1 trillion parameter mixture-of-experts model (32B parameters active at once) that scores above proprietary models like ChatGPT and Claude on benchmarks, at lower cost.
Community Reaction:
Generally positive; noted for strong creative writing and a distinctive (possibly China-influenced) style.
Technical Note:
Kimi K2 was scaled up using a new optimizer, Muon, and it arrives despite China’s hardware constraints.

Quote:

“Six months after a proprietary model comes out, you can expect similar capability in open-source. And that’s what we’re seeing here.” – Jon Krohn [08:20]

3. Agentic Coding Tools: Amazon’s Kiro and the Crowded Space

[09:20–12:20]

Amazon’s new Kiro IDE enters the agentic coding tool fray, targeting the “vibe coding” trend: tools that help users write and organize code via agents.
Hosts assess Amazon's approach:
Seen as a big company “throwing stuff at the wall”; their LLMs trail the cutting edge, but they maintain a presence through partnerships and ecosystem plays.
Market Outlook:
Cursor and VS Code remain popular; Kiro’s long-term impact is uncertain.

Quote:

“Amazon hasn’t been…anywhere near the cutting edge.” – Jon Krohn [11:07]

4. Claude Code & Frontier Models: Usage Limits and Competition

[12:20–15:46]

Anthropic tightens usage limits for Claude Code without transparent communication, frustrating paid users who find themselves hitting limits.
Context:
All AI labs are bleeding money from generous max plans; tightening restrictions is inevitable as usage and compute costs soar.
Personal Insight:
Host Andrey notes he “spends $2,000 in tokens on a $200 per month plan,” highlighting unsustainable economics.

Quote:

“It’s a rare own goal, I’d say, for Anthropic.” – Jon Krohn [14:58]

5. Mistral, Deep Research, and the New "Table Stakes"

[15:52–17:43]

Mistral rolls out deep research features, image editing, and multilingual reasoning in its Le Chat platform, matching market leaders.
Education Plug:
Jon Krohn mentions his free, ad-free YouTube course on agentic engineering.

6. Grok’s "Companions": Flirty AI Avatars & Ethical Concerns

[17:43–21:04]

X (formerly Twitter) launches “Companions” in Grok:
Animated, customizable chatbots (including flirty anime personas) with audio interaction and simulated emotional attachment.
Controversy:
Some companions’ system prompts encourage overtly flirty or sexualized interaction, surprising for a mainstream platform.

Quote:

“It’s not safe for work entirely. This might have some interesting effects on people if they really do start to bond with it. Whoa.” – Andrey Kurenkov [19:36]

7. Robotaxis: Uber’s Multi-Partner Play

[21:04–23:25]

Uber partners with Baidu to expand robotaxi services beyond the US/China, adding to their collaboration with Waymo.
Social/Economic Impacts:
Potential disruption as millions of driver jobs become automated.

Quote:

“It is going to be very disruptive to all these people who have this kind of job today. So retraining programs will need to come into place.” – Jon Krohn [23:25]

8. Business & Hiring Drama: Windsurf, Cognition, and Meta’s Talent War

[24:25–31:10]

Windsurf’s OpenAI deal collapses:
CEO and top staff go to Google in a $2.4B licensing/talent deal, with Cognition swooping in to buy the remainder.
Weird new normal:
Big tech does non-acquisition acqui-hires, leaving rank-and-file employees and smaller shareholders adrift.
Hiring rollercoaster:
- Meta poaching top researchers (Jason Wei, Hyung Won Chung) from OpenAI.
- Anthropic regaining staff from Cursor days after their departure.
- Meta recruits Apple AI leads, continuing their aggressive expansion.
- “Trading cards” meme emerges for talent moves.
Stress & burnout play a role:
Intense frontier lab environments and massive offers drive staff movement.
Insane AI startup valuations:
Mira Murati’s Thinking Machines closes a $2B seed round (at a $12B post-money valuation) with no product yet, reflecting AGI expectations in the air.

Quote:

“If you’re not going to take a hundred million dollar contract from Mark Zuckerberg… the thing to do is exactly what Mira Murati has done.” – Jon Krohn [32:25]

9. "Vibe Coding" Startups: Lovable’s Rapid Rise

[33:32–34:31]

Lovable raises $200M eight months after launch (now valued at $1.8B).
2.3 million users, 180,000 paid, $75M annual revenue—great early success for no-code/low-code agentic tools.

10. Musk Mega-Moves: Tesla and SpaceX Pour Billions into xAI

[34:31–35:57]

SpaceX and possibly Tesla funnel multi-billion dollar funding into xAI.
Satire:
“All of this $2 billion went to an alien themed sex chatbot, is that right?” – Jon Krohn [35:45]

11. Working at OpenAI: Engineer’s Perspective

[36:29–38:05]

A former OpenAI engineer, Calvin French-Owen, shares a non-dramatic, informative blog post about life and scale at the company:
- Lightning-speed org growth (tripling headcount in a year)
- Slack-over-email culture, high initiative, intense pace

Quote:

“Everything runs on Slack, there are no emails. If you’re a software engineer, that’s a very interesting detail.” – Andrey Kurenkov [37:45]

12. Research: RL, Benchmarks, and Data Contamination

[38:15–41:21]

Paper: “Reasoning or Memorization: Unreliable Results of Reinforcement Learning due to Data Contamination.”
Shows that previous “RL for reasoning” studies were undermined by data leakage; Qwen models may just be recalling benchmark data.
Big-picture meta-point:
Research in AI is rapidly self-correcting, even if peer review lags.
Related resources:
Jon Krohn mentions his YouTube course and podcast ep. 903 for deeper dives.

13. Policy & Safety

[41:21–47:04]

US DoD awards $200M+ contracts to Anthropic, Google, OpenAI, and xAI for military AI agents.
- “Everyone wants government money,” notes Andrey.
California revisits AI regulation via SB 53, proposing stricter reporting for large models.
Deepfake/AI nudes industry generates $36M+ yearly and reaches 18M+ users, posing dire ethical and legal issues.
ICE facial recognition:
Internal app “Mobile Fortify” links 200M+ face records, expands US immigration enforcement’s surveillance reach.

Quote:

“If you think state surveillance is concerning… there’s more reasons to be concerned as a result of AI, clearly.” – Andrey Kurenkov [47:04]

14. Art & Society: Video Game Actors Strike Ends with AI Voice Protections

[48:47–50:38]

SAG-AFTRA’s year-long strike ends with new agreements on protection/compensation for AI-generated video game voices.
Hosts note the unique scale and future of voice acting for games amid ElevenLabs and similar tech.

Memorable Moments & Quotes

“It’s a cool interface. I like it. Powerful.” – Jon Krohn on ChatGPT Agent [05:15]
“There’s now trading cards that you can see on Twitter for when people swap companies from OpenAI to Meta… It’s quite a meme, I suppose, at this point.” – Andrey Kurenkov [29:33]
“Imagine if there was no gravity, baby.” – Andrey Kurenkov (joking about xAI’s focus) [35:59]
“Video Game Actors Strike Officially Ends After AI Deal… protections for rights to their voice, wage increases, things like that.” – Andrey Kurenkov [48:47]

Timestamps for Major Segments

OpenAI’s ChatGPT Agent: 03:10–06:38
Kimi K2 Open-Source Model: 07:08–08:48
Amazon’s Kiro & Coding Tools: 09:20–12:20
Claude Code Usage Limits: 12:20–15:46
Agentic Features Race (Mistral, etc.): 15:52–17:43
Grok Companions (AI Avatars): 17:43–21:04
Uber x Robotaxi Partnerships: 21:04–23:25
Hiring & Business Drama: 24:25–31:10
Lovable’s Rise: 33:32–34:31
xAI/Tesla/SpaceX Funding: 34:31–35:57
OpenAI Engineer Blog: 36:29–38:05
Data Leakage & RL in Research: 38:15–41:21
US DoD AI Contracts & CA Bill: 41:21–45:10
Deepfake Sites & ICE Surveillance: 45:10–47:04
Voice Actor Strike & AI: 48:47–50:38

Overall Tone & Takeaways

The hosts are upbeat, sharp, and sometimes tongue-in-cheek, especially when tackling wild new product features and Silicon Valley chaos. They express both excitement and caution: amazed by technological leaps but concerned about regulatory, safety, and ethical challenges. The episode vividly paints a world where AI is not just advancing but rapidly reshaping society, business, and individual lives—with both eye-popping and eye-rolling moments along the way.

Essential for listeners who want a quick, yet thorough, pulse check on the relentless developments in the world of AI.

#217 - ChatGPT Agent, Kimi k2, Hiring Drama
