#161: GPT-5, Google DeepMind Genie 3, Cloudflare vs. Perplexity, OpenAI’s Open Source Models, Claude 4.1 & New Data on AI Layoffs - The Artificial Intelligence Show


The Artificial Intelligence Show - Episode #161 Summary

Release Date: August 12, 2025

In Episode #161 of The Artificial Intelligence Show, hosts Paul Roetzer and Mike Kaput cover the latest advancements and controversies in AI, from the long-awaited release of OpenAI's GPT-5 to Google DeepMind's Genie 3. Below is a summary of the key discussions, insights, and conclusions from the episode.

1. GPT-5 Launch and Community Reaction

Overview: OpenAI unveiled GPT-5, which it calls its smartest, fastest, and most useful model to date. GPT-5 combines quick-response chat with deeper reasoning, offers a context window of 400,000 tokens (with up to 128,000 output tokens), and is described by OpenAI as 45% less likely to contain factual errors than GPT-4o.

Key Features:

Unified System: Combines multiple models with a routing system that selects the appropriate model based on the task.
Performance Enhancements: Improved coding, writing, health advice, and multimodal reasoning.
User Access: Free users receive GPT-5 by default, while Plus and Pro subscribers gain access to GPT-5 Pro with higher limits.
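
To make the "Unified System" idea concrete, here is a minimal sketch of prompt-based routing. OpenAI has not published how its router works, so everything here is an assumption for illustration: the model names `fast-chat` and `deep-reasoning`, the keyword hints, and the word-count threshold are all hypothetical. The sketch also returns the reason for each decision, which is the kind of transparency users asked for after launch.

```python
from dataclasses import dataclass

# Hypothetical model names; these are NOT OpenAI's actual model identifiers.
FAST_MODEL = "fast-chat"
REASONING_MODEL = "deep-reasoning"

# Hypothetical keyword heuristic; a production router would likely use a learned classifier.
REASONING_HINTS = ("prove", "step by step", "debug", "analyze")


@dataclass
class RoutingDecision:
    model: str   # which backend model would handle the prompt
    reason: str  # human-readable explanation of the choice


def route(prompt: str, long_prompt_words: int = 200) -> RoutingDecision:
    """Pick a model for a prompt and record why it was picked."""
    text = prompt.lower()
    for hint in REASONING_HINTS:
        if hint in text:
            return RoutingDecision(REASONING_MODEL, f"matched hint '{hint}'")
    if len(text.split()) > long_prompt_words:
        return RoutingDecision(REASONING_MODEL, "long prompt")
    return RoutingDecision(FAST_MODEL, "default")


if __name__ == "__main__":
    for p in ("What is the capital of France?", "Please debug this function"):
        decision = route(p)
        print(f"{p!r} -> {decision.model} ({decision.reason})")
```

Surfacing `reason` alongside `model` is a design choice rather than a requirement: it costs almost nothing and lets a user see which model answered and why.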

Community Feedback: The launch sparked mixed reactions. While some praise the advancements, others express frustration over the removal of previous model options without transparency.

Notable Quotes:

Paul Roetzer [06:56]: “It is not this life-changing model that we all have been anticipating for like a year and a half now.”
Alastair McClay [15:XX]: “OpenAI seems to have been so focused on the benefits their new router could provide to their less sophisticated users, which automatically switches the underlying model without telling them, that they totally overlook the user group that actually matters the most.”

Discussion Highlights:

Lack of Transparency: Users are irritated by the inability to select specific models, leading to distrust and calls for the return of previous versions.
Competitive Landscape: Paul suggests that Google's upcoming Gemini 3 and other competitors like Claude and Grok may soon surpass GPT-5, indicating that the frontier models are becoming commoditized.

2. Google DeepMind's Genie 3

Overview: Google DeepMind introduced Genie 3, a groundbreaking world model capable of generating fully interactive and photorealistic environments in real-time. This advancement is seen as a pivotal step towards Artificial General Intelligence (AGI).

Key Features:

Real-Time Rendering: Generates environments at 24 frames per second with visual and physical consistency.
Dynamic Interactions: Users can alter environments on the fly, such as changing weather or adding objects.
Applications: Potential uses in robotics, education, storytelling, and video game development.

Notable Quotes:

Paul Roetzer [27:36]: “World models are huge. [...] This could come into play in storytelling, where you're trying to create these narratives, video game development, where it's rendering in real time the environment.”

Discussion Highlights:

Path to AGI: World models provide a virtual training ground for AI agents, enabling them to learn and adapt in simulated environments.
Future Implications: The technology opens doors for complex simulations in various fields, though current limitations include short interaction durations and constrained actions.

3. Cloudflare vs. Perplexity: Stealth Crawling Controversy

Overview: Cloudflare accused AI search startup Perplexity of employing stealth crawling techniques to bypass website restrictions, including disguising bots as legitimate browsers and rotating IP addresses.

Key Points:

Cloudflare's Claim: Perplexity's practices violate site boundaries by circumventing robots.txt rules and firewall blocks.
Perplexity's Response: The company denies intentional wrongdoing, asserting that their AI assistants fetch specific pages in real-time without systematic scraping or long-term storage.
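
The robots.txt mechanism at the center of this dispute is a voluntary convention keyed to the crawler's declared user agent. The sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved client checks the rules, and why a crawler that presents itself as a generic browser is not matched by a rule aimed at a named bot. The bot name `ExampleAIBot`, the rules, and the URLs are hypothetical and do not represent Perplexity's or Cloudflare's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI bot entirely,
# and keep every other client out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    # A crawler that honestly declares itself is told to stay out.
    print(is_allowed("ExampleAIBot", page))
    # The same request under a browser-style user agent falls through to the '*' rules.
    print(is_allowed("Mozilla/5.0", page))
```

Because compliance depends on the client identifying itself truthfully, site operators who want enforcement pair robots.txt with network-level controls such as firewall rules, which is the layer Cloudflare says was also circumvented.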

Notable Quotes:

Paul Roetzer [34:25]: “The reality at the end of the day is the rules of the web and business are being rewritten.”

Discussion Highlights:

Ethical Implications: The controversy underscores the ongoing struggle between AI developers and content publishers regarding data access and usage rights.
Future Challenges: As AI agents become more prevalent, establishing clear guidelines and agreements will be crucial to navigate content accessibility and intellectual property concerns.

4. OpenAI’s Open Source Models Release

Overview: OpenAI released its first open-weight language models since GPT-2: gpt-oss-120b and gpt-oss-20b. These models are available under the Apache 2.0 license, allowing users to run them locally, fine-tune them, and use them commercially.

Key Features:

Performance: The models support chain-of-thought reasoning, tool use, and code execution, performing on par with, or better than, some proprietary systems.
Accessibility: The 20B parameter model can run on high-end consumer laptops, democratizing access to advanced AI capabilities.

Notable Quotes:

Mike Kaput [40:54]: “It's kind of a no-risk way, at least no risk of cannibalizing your existing products to get developer goodwill.”

Discussion Highlights:

Strategic Move: By releasing open-weight models that do not compete directly with its premium offerings, OpenAI aims to build developer goodwill and foster innovation without cannibalizing its existing products.
Industry Impact: This release facilitates broader experimentation and development within the AI community, potentially accelerating advancements and applications.

5. Claude Opus 4.1 by Anthropic and System Prompt Insights

Overview: Anthropic launched Claude Opus 4.1, an enhanced version of their AI model with significant improvements in coding, research, and reasoning tasks. The update also includes tweaks to the system prompt to refine the model's behavior and interactions.

Key Features:

Enhanced Performance: Scores 74.5% on SWE-bench Verified, a benchmark of real-world software engineering tasks.
System Prompt Updates: Reduces overly casual language, minimizes unnecessary swearing, and improves handling of mental health-related queries.

Notable Quotes:

Paul Roetzer [46:47]: “The verification gap that we've talked about numerous times is sort of illuminates why the verifiers would be so valuable.”

Discussion Highlights:

Transparency: Anthropic's Amanda Askell provided insights into the system prompt changes, offering a rare glimpse into the inner workings of the model.
Ethical Enhancements: The updates aim to make interactions more balanced, critical, and supportive, particularly in sensitive situations.

6. AI’s Impact on the Economy and Job Cuts

Overview: A report by outplacement firm Challenger, Gray & Christmas linked over 10,000 US job cuts in the first seven months of 2025 to employers adopting generative AI. Concurrently, researchers like Erik Brynjolfsson argue that AI's economic impact is visible more in consumer surplus than in traditional GDP metrics.

Key Points:

Job Cuts: Significant reductions attributed to AI-driven efficiency and automation.
Economic Metrics: Traditional GDP fails to capture the $97 billion in consumer surplus Americans gained from free or low-cost AI tools in 2024.

Notable Quotes:

Paul Roetzer [52:51]: “The logic of the value not being counted in the GDP makes sense.”
Mike Kaput [54:33]: “People are scratching their heads about like AI, we're seeing productivity gains in our own work. Is it just not diffused enough into the economy?”

Discussion Highlights:

Measurement Challenges: The discrepancy between AI’s tangible economic benefits for consumers and the lack of reflection in GDP statistics highlights the evolving nature of value creation in the digital age.
Future Implications: Economists and policymakers need to adapt their metrics to account for non-traditional forms of economic value introduced by AI technologies.

7. Additional AI Developments

Meta's Acquisitions and AI Voice Technology

Acquisitions: Meta recently acquired AI voice startups WaveForms and PlayAI to bolster its Superintelligence Labs.
Focus: Developing AI speech indistinguishable from humans and building emotional general intelligence to respond to emotional cues.
Discussion: Paul and Mike speculate that Meta is steering towards hyper-personalized voice assistants and companions, aligning with Zuckerberg's vision for voice-centric interactions.

ElevenLabs' Entry into AI-Generated Music

Product Launch: ElevenLabs introduced Eleven Music, an AI tool capable of generating fully produced songs from text prompts.
Ethical Approach: Claims to train only on licensed songs, aiming to differentiate from other AI music models accused of unauthorized scraping.
Discussion: While steps towards ethical AI development are noted, skepticism remains regarding the veracity of licensing claims, emphasizing the need for transparency and accountability in AI training practices.

8. Google’s AI Education Initiative

Overview: Google committed $1 billion to AI education, training, and research in the US, extending the offer to students in Japan, Indonesia, Korea, and Brazil. The initiative includes free AI Pro plans, custom training, and career certificates aimed at fostering AI literacy among the next generation.

Key Features:

AI Tools for Students: Access to Gemini 2.5 Pro, NotebookLM, Veo 3 for AI-generated videos, and AI coding agents.
Guided Learning: Introduces a mode that walks students through problems step-by-step to enhance understanding.
Career Development: Offers Google Career Certificates to prepare students for AI-centric roles in the workforce.

Notable Quotes:

Paul Roetzer [71:27]: “AI literacy is absolutely critical to the future of work and innovation, not just in the U.S. but beyond that.”

Discussion Highlights:

Alignment with Policy: The initiative aligns with recent White House executive orders promoting AI education, indicating a collaborative effort between government and industry to prepare the workforce for an AI-driven future.
Long-Term Impact: By embedding AI training in education systems, Google aims to cultivate a generation of "AI natives" equipped with the skills necessary to thrive in a technology-centric world.

9. OpenAI's Federal Deal and Enterprise Expansion

Overview: OpenAI secured a deal to make ChatGPT Enterprise available across the US federal executive branch for one year. The agreement includes access to advanced tools, custom training, and consulting support aimed at enhancing government efficiency.

Key Features:

Cost-Effective Access: Agencies receive the top model for just $1 per agency, along with additional perks.
Efficiency Gains: Early pilots indicate significant time savings in routine tasks, with high satisfaction rates among participants.

Discussion Highlights:

Government Integration: This move signifies OpenAI's deepening penetration into federal operations, potentially setting a precedent for broader governmental adoption of AI tools.
Potential Backlash: Paul hints at possible future scrutiny regarding the terms of such deals, drawing parallels with other tech-government interactions.

10. Conclusion and Final Thoughts

Overall Impressions: Despite the extensive advancements, both hosts express a sense of underwhelming impact from GPT-5's immediate release, suggesting that the true significance of these developments will unfold over time. They emphasize the importance of staying informed and proactive in understanding and leveraging AI technologies.

Notable Quotes:

Mike Kaput [73:26]: “We're going to look back and be like, okay, that might have been a subtle turning point.”
Paul Roetzer [73:54]: “The general public is waiting, waiting, waiting for a year and a half. The general public, like, could care less.”

Final Thoughts: Paul and Mike underscore the rapid pace of AI innovation and the necessity for continuous education and adaptation. They encourage listeners to engage with AI tools, stay curious, and recognize that the transformative effects of AI may be more nuanced and gradual than headline-grabbing releases suggest.

Upcoming Events and Resources:

AI Academy Relaunch: Scheduled for August 19th, featuring new courses and live sessions.
Intro to AI 50th Edition: Available on August 14th, offering fundamentals of AI with interactive Q&A.
Additional Resources: Links to webinars, courses, and further readings are available in the show notes.

Stay updated with the latest in AI by subscribing to SmarterX AI and joining the community for more insights, courses, and events tailored to empower your AI journey.


Transcript

[00:00]
Paul Roetzer
So the question has always been, does OpenAI have a secret sauce? Is there something they're doing that was going to allow them to get that 6 to 12 month lead over everybody else? The answer is no. Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of SmarterX and Marketing AI Institute and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all. Welcome to episode 161 of the Artificial Intelligence Show. I'm your host Paul Roetzer along with my co-host Mike Kaput. We are coming to recording Monday, August 11th at 11am ish Eastern Time. Our long awaited GPT-5 has arrived. Our team was like messaging us on Friday, like are we going to do an emergency podcast, talk about GPT-5? And I'm like, you're either going to get these AI Academy courses finished or you're going to get an emergency podcast. So Mike and I chose to focus on getting the AI Academy courses ready for launch instead of the emergency pod. But we will have plenty to discuss about GPT-5 today. All right, so this episode is brought to us by AI Academy by SmarterX, which I was just talking about. We are having our kind of relaunch event, I guess. We first introduced AI Academy in 2020. We have spent the last almost year now completely reimagining what Academy is, how it functions, the technology behind it, how to infuse AI into it, the overall learner experience, how to build learning journeys. Like everything has just been completely revised, updated, improved everything. And so on August 19th at noon Eastern time, we will have a launch event. There's a webinar you can sign up for to hear all about it. 
We're going to go through the vision and roadmap for AI Academy. We're going to talk about all the new on demand courses and professional certificates that we are developing and launching that day. A bunch of them are coming out that day. We're going to talk about the new AI Academy live, which I'm super excited about, which is going to be a regularly scheduled occurrence where members are actually going to be able to join in live talk, you know, not only with Mike and I, but go through deep dives, go through AI Transformation spotlights, book clubs, things like that. There's a new learning management system coming later this year. We're going to preview that, how to build personalized learning journeys. We're going to talk about new business accounts where companies, universities, people can come in, get five plus licenses. You get a whole bunch of features and benefits specific to those, plus dramatically reduced pricing. And then we're going to have an Ask Us Anything session with me and Mike and Kathy. So all kinds of stuff coming out. We have a new AI Fundamentals series, third edition of our Piloting AI series, a second edition of our Scaling AI series which I am finalizing literally between meetings and the podcast today. Mike did a new AI for professional services. Also Mike created a new AI and marketing series. So all these are launching along with a bunch of other stuff. So go to SmarterX.ai. At the top of the page there's a banner you can click on to register for the webinar and we will also drop that link in. So again that webinar is free and it is happening on August 19th. This episode is also brought to us by Intro to AI. So this is I have been teaching this class free every month since November, October of 2021. So we are having our 50th edition of Intro to AI. This is happening Thursday, August 14th at noon. So you can register. We've had, I think close to 40,000 people have gone through this class since I started doing it almost four years ago. 
So it's about 30, 35 minutes. I do a live kind of go through the fundamentals of AI and then we leave the last 25 minutes for questions. We usually get anywhere between 50 and 100 questions. We do our best to answer as many as we can and then the ones we can't get to we then do. The week later we do an Intro to AI special for the podcast where we go through a bunch of other questions that we got. So Intro to AI 50th edition coming Thursday, August 14th and then we'll do a follow up podcast with some questions we didn't get to. So I will put a link to the show notes in the show notes to Intro to AI as well and we will share all of that information. All right, so two great live events coming up August 14th and August 19th. Check those out. And now Mike, the long awaited GPT-5. Let's get into it.
[04:58]
Mike Kaput
All right, so first topic. Predictably OpenAI has unveiled GPT-5. They are calling it their smartest, fastest and most useful model yet. It is the first unified system from the company. It combines quick response chat with deeper reasoning when needed. You don't really need to tweak any settings. Instead, GPT-5 will route your requests to the right type of model that it deems to be correct for the job, depending if it needs to think for longer or act faster. The company says it outperforms earlier versions in coding, writing, health advice and multimodal reasoning. There are big reductions in hallucinations and it says it has a more honest approach when tasks cannot be completed. It also has a context window of 400,000 tokens and 128,000 max output tokens. Now another note on those hallucinations. OpenAI says it has significantly fewer hallucinations than GPT-4o and is 45% less likely to contain factual errors compared to GPT-4. Okay, for coders, GPT-5 can spin up full apps from a single prompt. It's got really good design sensibility and debugging skills. For health, it is far less error prone and more proactive about flagging issues. And creative work has also gotten a lift with more nuanced writing and better taste in design. Now this launch includes GPT-5 Pro for extended reasoning. There's new preset personalities that change how the model responds, and API access across three different model sizes. Now free users are now getting GPT-5 as the default, while Plus and Pro subscribers get higher limits and access to GPT-5 Pro. Now, Paul, there's a lot to unpack here. There's a few different angles we're going to talk about here, but maybe let's kick off by saying what are your initial impressions of GPT-5?
[06:57]
Paul Roetzer
A lot of my initial impressions come from curating opinions of other people online whom I trust and you know, I've read lots of their reviews. I have experimented with it a bit myself. I didn't have, you know, I was working on the courses all weekend so I couldn't like really put it through a bunch of experiments, but I was, you know, dabbling in it. So when you follow the people we follow online, they generally were the people who weren't super happy about this. So I think like I want to, I want to. My caveat here is like it seems like a really good model. It is not this life changing model that we all kind of have been anticipating for like a year and a half now. GPT-5, it's always been like, well once GPT-5 gets here then everything changes. So I will say one as part of the Academy we are introducing a new gen AI app series and Mike and I are talking this morning and he's going to do a GPT-5 review as the first course in that series. So we'll have more to say. It's like a 15, 20 minute product review basically. So that'll be dropping next week for Academy members. But here's my, my take. Mike, I'm going to try and like, you can hear a lot of like here's the, you know, Ethan Mollick has a bunch of great stuff, like Brian Brickman, our friend. Like people have done like these great reviews. Allie Miller had it, people who had access to it beforehand. There's all these great reviews. So I'm going to give more of like a zoom out. Like what's the here. So first, it is not multimodal from the ground up. So when they say unified model, what they mean is it's still like four or five different models that are packaged as one thing called GPT-5. And then there's a router that based on your prompt decides which model it's going to use. So if it's going to use one that has reasoning, if it's going to use the traditional chat, if it's going to use image generation, video generation, like all that's not in a single model. 
So I assume GPT-6 will be that, it'll be truly multimodal from the ground up. As far as I know, they didn't give any updates on image generation or Sora, their video generation, as part of this. I think they made some tweaks to voice capabilities maybe, I think they improved the voice a little bit. So we on this podcast have for a while talked about the confusion of the model choice and when you would go into ChatGPT last week there was eight models to choose from. And the point we always made was the average user has no idea what the difference is between those, you know, 4o, o3-mini. Like the average user has no idea. And so they would just use whatever default. And so our point was always why for the average user would you make them choose from a list of models that they don't understand what the difference is? And so it would seem that this router is sort of heading in the right direction, but it actually caused chaos because there is a small fraction of ChatGPT users who do understand what the different models are and have preferred models that they like to use. And what OpenAI did, sort of their first misstep, and we'll go through a series of missteps that they made in this process, is that they almost just ignored the loudest, the most vocal online users who do actually understand the different models and really like some of the other models. Because what OpenAI did is they turned on GPT-5 and removed all the other models. And then when the router was doing its work, I go into ChatGPT, I give a prompt, help me write a business plan for this idea I have. I would have no idea which model it was actually using. So there was no transparency into what model was actually being used. And if there was a model I used to like that I liked the tone, the personality, the style, the format, it was gone. And so people were pissed by like end of the day Thursday, people are like, give me my model back. Like, I want 4o. Like I like talking to 4o. 
And so kind of surprisingly, Mike, it's like, it's almost like OpenAI didn't understand their user base. Like there was obviously people who wanted that choice and then there was this other faction of people who obviously were very attached to specific models and almost like emotionally attached to like 4o, and 5 is a very different personality. It responds in like shorter bursts. Like it doesn't have, you know, it's not as like comforting and things like that. Like it's just missing some of that. So there was one user and I didn't know this guy previously on X, but I thought he gave a great synopsis. I'll just read this one. Put the link in. Alastair McClay is his name. He said OpenAI forgot who actually matters. Power users always lead the culture curve. They set the vibes for a product, especially in consumer software. They are the loudest, most passionate and have the highest expectations. They are your biggest asset as a consumer company and you need to keep them front of mind at all times. With the GPT-5 launch in ChatGPT, OpenAI seems to have been so focused on the benefits their new router could provide to their less sophisticated users, which automatically switches the underlying model without telling them, that they totally overlook the user group that actually matters the most. If you put yourself in the shoes of a ChatGPT power user, it's blatantly obvious they will continue to want the ability to hard switch between models. It's obvious they will expect transparency in which model is being used by the router at any point in any time. And most important of all, it's obvious they will expect to have a reasonable notice period before their existing models are deprecated. The response we saw was inevitable. The power users who make up the majority of the noise online quickly set the vibes of frustration, disappointment and broken trust. People who used 4o or 4.5 for writing were suddenly left with no good alternative. 
Plus users who had access to o4-mini and o3 suddenly found themselves with a 200 message weekly cap on GPT-5 Thinking and a router that wouldn't tell them which model they were actually talking to. Not to mention most people I've spoken to had no idea there's now a cap on GPT-5 Thinking. You only find out when you hit it and lose access for the rest of the week. So it's like that's a pretty good synopsis of what was going on. And then OpenAI immediately realized this. Like Sam Altman was in full blown crisis communications mode by which told you like they just missed this. Like they didn't think this through. So Altman tweeted and we'll put links to all these tweets. So this was August 10th, this was on Sunday. If you've been following the GPT-5 rollout, one thing you might be noticing is how much of an attachment some people have to specific AI models. It feels different and stronger than the kinds of attachments people have to previous kinds of technology. And so suddenly deprecating old models that users depended on in their workflows was a mistake. This is something we've been closely tracking for the past year or so, but still hasn't gotten much mainstream attention. People have used technology, including AI, in self-destructive ways. If a user is in a mentally fragile state and prone to delusion, we do not want the AI to reinforce that. Most users can keep a clear line between reality and fiction or roleplay, but a small percentage cannot. We value user freedom as a core principle, but we also feel responsible in how we introduce new technology with new risks. So that's the attachment thing, the rate limit thing was like almost just like sideswipe people. So this is an interesting one Mike, because not only did Sam tweet about this on Sunday, other OpenAI researchers were also tweeting about this. So you know that this one was like a real hot button internally and with their users. 
And the thing that I think about with this one is their restrictions on capacity, compute capacity to do inference. So quick, like, you know, there's compute to train these models, but then when you and I use them, that's inference. So when it delivers an answer, reasoning, which is now baked into this, requires way more compute at inference than a standard chat, as does video, as does image, things like that. And so the fact that they're straight up saying this is an issue with capacity opens the door for Google in my opinion. Like this is a really interesting play where OpenAI's lack of maturity and infrastructure when it comes to compute and data centers is not an issue for Google as much. So here was Sam's tweet again on Sunday. Said today we are significantly increasing rate limits for reasoning for ChatGPT Plus users and all model class limits will shortly be higher than they were before GPT-5. And then today being Monday or Tuesday, they expect to share their thinking on how we are going to make capacity trade offs over the coming months. Meaning we, a lot of people like our product. We have 700 million users and the more they use reasoning the more these things like we're, we're going to just run out of capacity. Like we have to set rate limits but people don't want them. And then there was a couple other OpenAI people who also talked about the rate limits. Then the other one was that this was the first time we've seen this data that I thought was very fascinating. Mike was we assumed and we've talked about this. Like I've said, I go do talks all the time. I ask rooms of hundreds of people like who's ever used a reasoning model, who's used o3 and you get like five hands. And so our like vibe check or like just you know, eyeball check was I don't know, less than 1%, less than 3% of people have any clue what a reasoning model even is. And this is as of like a month ago, OpenAI verified that for us. 
So the vast majority of OpenAI's users have no clue that reasoning models exist or what they do. So they have 700 million users. For many people, GPT-5 is the first time they're going to interact with a reasoning model. But they probably won't know it now because it's just baked into it. So Sam tweeted the percentage of users using reasoning models each day is significantly increasing. For example, for free users we went from less than 1% to 7% and for Plus users 7% to 24%. Now that's a big jump. But that means that people who were paying, the Plus is 200 bucks a month. Right Mike, isn't that Plus?
[17:31]
Mike Kaput
Plus is 20 and then Pro is 200. So the paying tier of subscribers.
[17:37]
Paul Roetzer
So of the people paying 20 bucks a month, only 7% were using the reasoning models. Which is wild. Yeah, so and that would tell you, like once you go from 7 to 24, now all of a sudden the compute capacity becomes massive. And then three other quick thoughts here. The big question with GPT-5 that we've all been waiting for an answer for is was it going to be a leap forward over the other frontier models? GPT-4, when it came out in March 2023, was state of the art for a year and a half. It took, it took a year and a half for Google and others to create something on par with GPT-4. So the question has always been, does OpenAI have a secret sauce? Is there something they're doing that was going to allow them to get that even 6 to 12 month lead over everybody else? The answer is no. My guess is Gemini 3 from Google, the next version of Claude, the next version of Grok, they will all leapfrog over GPT-5. There's some arguments that like Gemini 2.5 Pro is probably already like better than GPT-5 in some capacities. So we kind of have our answer that the frontier models have been commoditized. Like there is no apparent secret sauce at the moment, which means we're back into the game of distribution. Who can put a comparable model in front of enough users? So OpenAI has 700 million. That's huge. But Apple, like you're back in the game. Like if you're Apple you realize like, hey, we don't need like the best, we don't have to build our own frontier model. If you're Google, you have seven products with over a billion users. Seven, seven power platforms and product like distribution becomes massive again. And then the big question I was like, well, what about GPTs? I didn't hear anything about GPTs. And so I went and looked and it looks like the only thing that changed is the model selector, the recommended model, as the creator of the GPT, is now 5, 5 Thinking, or 5 Pro, like that's kind of all I can see. So again I just wanted to zoom out and be like high level. 
The things we were really waiting for was like the model choice issue, was it going to be a different frontier model than everything else that would cause people to switch back to ChatGPT if they were like loving Claude or Gemini, things like that. And overall it just seems like it's probably a really, really smart model. The average user isn't going to notice the difference. And there's, there's lots they touted but there's very little that seems truly differentiated at this point. You spent more time with it, though, Mike. Did you have any other different impressions of it or any other initial feedback?
[20:14]
Mike Kaput
Yeah, no, I largely agree with your take. I will say it just really struck me how much preferences matter here because personally, and this will seem crazy to some people, I love this model. Like, I genuinely find it more useful simply because it is smarter, it is faster, really fast, which is really helpful. I get a lot more done. All of my prompts and workflows I've tested so far with it work better, which is amazing. I personally don't have as much preference for switching models. I thought 4o was a little too dumb. o3 was brilliant.
[20:50]
Paul Roetzer
But o3 Pro is like my favorite model to use.
[20:53]
Mike Kaput
Very much so. However, I would also get frustrated a bit sometimes with the formatting and the slowness, not being able to just go back and forth rapidly and kind of iterate and converse. For me, this model, like, squares that circle and really provides the perfect balance. For me personally, I like the tone a lot more. That's all personal preference. I'm really glad we have it. I think some people hate that it exists. It's really interesting to see. And I would also add, too, if you want to go down a horrifying rabbit hole, go to the ChatGPT subreddit. I don't know how much of this is, like, too played up and, like, viral, but there are so many posts of people deeply emotionally attached to 4o that you feel like the posts are written by people going through withdrawal. And it's really, really weird.
[21:45]
Paul Roetzer
And that's, I think, what Sam was referring to with that. Like, hey, some people get really attached as therapists, as friends, as companions, and like, we have a tough job here to balance like, what is unhealthy because they can see the chats. Like, yeah, they know what people are doing with these things and they're trying to balance, like, what is good for mental health versus like, what is acceptable personal choice.
[22:12]
Mike Kaput
It's really interesting to just see that play out. And they did have an interesting emphasis on health throughout all their launch materials. So I think they're really just understanding that people, for better or for worse, are turning to this for emotional and physical health needs.
[22:28]
Paul Roetzer
Have you run a comparison? Like, do you use 2.5 Pro from Gemini much?
[22:34]
Mike Kaput
Yeah.
[22:34]
Paul Roetzer
How do you get quite a bit? Like, have you done any side by sides?
[22:38]
Mike Kaput
I haven't done too much yet. I really like and rely on Gemini 2.5 Pro for a lot of things, but I usually just cycle between that and either o3 or 4o depending on the use case. Obviously it's way better than 4o, but just in terms of speed or the complexity of it, that's kind of my next big thing, is like, okay, let's run, you know, because I have GPTs and Gems built out for some of the same stuff. Let's see how these stack up. I'll be interested to see how that plays out. And also I think we've been seeing more and more chatter even this morning that Google is releasing something, like, today.
[23:13]
Paul Roetzer
Or I'm convinced they're just sitting there waiting. Like, I think they know that they probably have maybe something that'll perform better, at least on the evals. And they were just like it was a game of chicken, like, you go ahead and release yours first. Because OpenAI has done that to them so many times. So I would not be surprised at all if Google came out with something comparable or better in ways.
[23:35]
Mike Kaput
And just one kind of final note or impression or kind of perspective here is I genuinely would encourage people, just go without any bias. Go use this model as extensively as you can. I mean, again, I find it extremely impressive. I also think we all might need to take a breath too, because it's so easy when we're in this bubble to be like, you know, you're going to see whatever Google comes out with and you're going to be like, OpenAI's dead or ChatGPT sucks. And it's like, this is the first thing that felt like minimum viable AGI to me, to be perfectly honest. But I feel like you could make that argument of 4o in a different context. Right. So I think it's worthwhile to keep some perspective because this is a genuinely useful model to me and it just works a lot of the time. And I really appreciate that.
[24:20]
Paul Roetzer
Yep. Yeah, I agree. I think, yeah, get in there, try it. And again, like, if people weren't using reasoning models.
[24:29]
Mike Kaput
Yes.
[24:30]
Paul Roetzer
And all GPT-5 does is inject reasoning into their workflows without them even knowing it, it will feel like a leap forward. Yes. Because that's the biggest thing. Mike and I have talked about this many times. Using 2.5 Pro, using o3 from ChatGPT. That is, like, at least for me, the majority of my uses, is reasoning models now for higher level strategic thinking. So if you weren't using those, then you don't really comprehend how far along these models are toward changing work, the nature of work.
[25:04]
Mike Kaput
And I wonder, once we get past this kind of initial freak out, like, how many other stories we'll see given those numbers you shared. I mean, giving 4, 5, 6x the amount of people suddenly access to using reasoning models based on those numbers and how they've jumped. I wonder what we're going to hear people say about this model moving forward too. Yeah.
[25:23]
Paul Roetzer
All right, good. Well, I'm looking forward to your course next week.
[25:26]
Mike Kaput
Yeah, me too.
[25:29]
Paul Roetzer
Awesome.
[25:30]
Mike Kaput
All right, so next up, Google DeepMind has unveiled Genie 3. This is a breakthrough in what they call a world model, one that can generate fully interactive photorealistic environments in real time. So unlike earlier versions of Genie, Genie 3 can render at 24 frames per second, maintain visual and physical consistency for minutes at a time, and respond instantly to both navigation and text-based prompts. So this model can do things like simulate an entire virtual world of volcanic landscapes, enchanted forests. It can recreate historical sites like ancient Athens, all based on a short description. And those worlds evolve. Think of being in a kind of dynamically evolving video game. They evolve as you explore. And there's these promptable world events that let users change conditions on the fly, from altering weather to adding new objects. So DeepMind actually says they see world models as a key step towards AGI because they give a kind of limitless virtual training ground for AI agents to use to learn and adapt. So Genie 3's long-horizon consistency essentially means agents can now tackle multi-step goals. So this kind of opens the door for really complex simulations in fields like robotics, education, and science. But right now this is still somewhat limited. There's pretty short interaction durations, constrained actions, and it is in a limited research preview. So you can go to that. We'll provide the link in the show notes. You can go test out some kind of pre-made examples, but you cannot directly use this yourself. But DeepMind still calls it kind of a pretty significant moment in the evolution of these generative environments. Now Paul, I mean, I realize, like, world models, this can kind of seem a little bit sci-fi to a lot of people. It's not available yet to the general public. We've got massive news with GPT-5 coming out. But we did want to talk about this because it seems like world models are pretty important to the trajectory of where AI is going long term.
So maybe you could talk us through why they matter so much.
[27:36]
Paul Roetzer
Yeah, it's been a pursuit of labs for years. This idea of giving the machine the ability to understand the physical world, to create simulations that follow the laws of physics. And DeepMind in particular, and Demis Hassabis specifically, have been talking a lot more about them over the last year. Like, I was going back when I was kind of getting ready for today and just looking at the different times that we've featured quotes from Demis on the podcast where he was talking about world models and their importance. And they talked about, like, even with Veo, the video generation, how it just, I mean, this is their words, like, it just emerges. Like, when you train it on enough video data, it starts to, like, understand the laws of physics. And when you then ask it to produce simulations, it just seems to do it. Now, there's tons of limitations, and they highlight those in the launch post. But, I mean, in essence, it does open all of these possibilities for applications. And, you know, I think that this idea of the path to AGI, when they really start to think about embodying intelligence in, like, humanoid robots, and those robots being able to, like, see something happening and kind of, like, think out ahead: because I understand the laws of physics, I understand human nature, like, what is likely to be happening next. And that comes whether you're, you know, training autonomous vehicles or you're training a robot to work in a human environment, all these things become kind of essential. And so there are some cool examples, as you mentioned, Mike, you can play around with, like, modeling physical properties of the world. So, like, water and lightning and complex environmental interactions, simulating the natural world. So they talk about generating vibrant ecosystems, from animal behaviors to intricate plant life. So again, it just, like, kind of learns and then it's able to recreate these things.
And so this could come into play in storytelling, where you're trying to create these narratives, or video game development, where it's rendering the environment in real time. So imagine, like, right now, programmers write all the code to create everything that happens in the game. They create all the environments, all this stuff. This is what they're envisioning, and Elon Musk talks a lot about this. He actually tweeted this week, and he thinks by next year this will be a reality where you could go in and prompt your own video game and, like, everything just starts happening in real time, creating everything that you see. And that's kind of wild. And then even, like, another tangible example is, like, right now in a Tesla, when you have autonomous driving going, it shows a very, like, video-game-like simulation. It's showing your car, and it shows cars of, like, approximate size. It'll show a truck or a motorcycle. But it's not like watching a live stream video of the road around you. What this is saying, and what Elon Musk implies Tesla is going toward, is when you're driving a Tesla and you're watching the full self-driving do its thing, it will actually render the physical world to show on the display. But it's not a live stream. It's actually, like, a rendering occurring where it's simulating this whole world. It's really crazy, and it becomes massive in robotics because now you can, like, simulate these environments and the robots can train in them and all these kinds of things. So world models are huge. We talked about Fei-Fei Li. Spatial Intelligence is the company she created. I forget what episode that is. We can drop the link in the show notes. But she's someone who's been working intensely on this in addition to the research that's going on in the major labs.
[31:06]
Mike Kaput
Yeah. It's a good reminder too that regardless of the hype or the buildup of something like GPT-5, regardless of where the verdict ends on that, I mean, progress is happening on a lot of different fronts in AI and it is not slowing down on many of them.
[31:22]
Paul Roetzer
Yeah. And it's commonly, like, six to 12 months ahead of what the public is aware of. So if they're releasing this, they're obviously already probably far beyond this within the lab itself. Yeah. And you get people like Elon Musk who just straight up tweet and say, I think this is coming in three months. And so, I mean, again, like, you have to filter, like, the stuff from Elon Musk you want to read. But, like, if you want a true inside, like, just clear train of thought of what someone thinks is possible, nobody is more honest than Elon about what he thinks is going to happen and his opinions of these other models and kind of where they're going. And while he has a history of sort of overhyping when technology will arrive, dude built a frontier model in, like, a year and a half that caught up to the best models in the world. So he knows a few things about science and technology. He's kind of worth paying attention to from that side.
[32:20]
Mike Kaput
All right, our next or third big kind of main topic this week is that Cloudflare says that AI search startup Perplexity has been disguising its web crawlers to bypass site blocks. This is a practice known as stealth crawling. According to Cloudflare, when Perplexity's bots hit a robots.txt rule or a firewall block, they sometimes swap their identity from what's called PerplexityBot to something like Google Chrome on macOS and rotate to IP addresses that aren't on its official list. So basically, Cloudflare says the company is doing things to dodge detection, including also changing its network identifiers, which is a tactic it claims has been used across tens of thousands of domains, making millions of requests each day. Perplexity has pushed back pretty hard against Cloudflare's claims. In a detailed rebuttal, they denied intentional wrongdoing. They called Cloudflare's post a publicity stunt and said Cloudflare mixed up legitimate user-triggered requests with bot activity. Now, according to Perplexity, its AI assistants aren't really traditional web crawlers. They don't systematically scrape and store the Internet. Instead, they fetch specific pages in real time. When a user asks a question, they use that content to answer it, and then they discard it with no training or long-term storage. So in response, Cloudflare has now delisted Perplexity as a verified bot and rolled out new methods to block its crawlers. Now, Paul, this seems a little technical on the surface, kind of in the weeds, but it does seem like a pretty important issue because, correct me if I'm wrong, it seems like at its core, this is about how AI companies are or are not respecting the boundaries set up by publishers and websites of how their content can and can't be accessed and used.
And there's this big fear, given how models were trained, how the content's already been used, that this material is going to get scraped and used to train models or used to essentially bypass websites entirely.
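To make the robots.txt mechanism described above concrete, here is a minimal Python sketch using the standard library's `urllib.robotparser`. Everything specific in it is an assumption for illustration: the site (`example.com`), the robots.txt contents, and the browser user-agent string are hypothetical and not taken from Cloudflare's report. It only shows why a rule keyed to a declared bot name stops applying once the same client presents itself under a different name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site: block only the declared bot.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /
"""

# Illustrative user-agent strings (assumptions, not captured traffic).
DECLARED_BOT = "PerplexityBot"
GENERIC_BROWSER = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return whether the sample robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The rule matches on the self-reported name, so only the declared bot is blocked.
    print("declared bot allowed:   ", is_allowed(DECLARED_BOT))
    print("generic browser allowed:", is_allowed(GENERIC_BROWSER))
```

Because robots.txt is advisory and matches on a self-reported name, a site that wants to enforce the rule has to rely on something beyond the user-agent header, such as published IP ranges or network-level fingerprinting, which is the kind of verification the dispute is about.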
[34:26]
Paul Roetzer
Yeah, which has been going on for the last few years. Like, that's the thing, is, like, none of this is... Well, I mean, I guess the agent side is new, but yeah, I mean, part of the issue with, like, the New York Times lawsuit against OpenAI and others was that they were bypassing paywalls, like, to get access to information and stuff. And so, you know, I think in the case of Perplexity, the problem they're running into here is this is their M.O. I'd have to go back and find the podcast episode where we talked about it, when Aravind was literally bragging about the fact that they used to scrape LinkedIn against the terms of use. Like, that is just what they do. And he was proud of the fact that they did it. And it's kind of like, we're going to do it until we get caught. So when you're on the record saying you constantly do these kinds of things, it's really hard to have credibility when you come out saying, no, we're not doing anything wrong. It's like, dude, you've admitted to things like this before, so you have to consider the company itself and its history when you're looking at this. But when you take that out, the reality at the end of the day is the rules of the web and business are being rewritten. Like, we're going to have these messy instances where you have semantics of, like, yeah, but we're not really scraping. It's an agent, and an agent's being requested by a user. So it's actually really the user that's visiting the website. So however this gets played out, whether it's through business agreements or court cases or whatever, we're going to have this very prolonged transitional phase where we start running into these kinds of issues, and AI agents are going to be a massive part of this. The more traffic on the web that comes from AI agents, the more challenging it's going to be for brands to deal with, for publishers to deal with.
It's kind of similar to, you know, how we're struggling with copyright and, like, were the models allowed to steal it or weren't they allowed to steal it, and was it fair use or not fair use? There's just going to be so many unanswered questions that we're going to come up against as agents permeate the web and more and more of the traffic and actions taken online are taken by agents.
[36:35]
Mike Kaput
Yeah. The fact they're already having issues with this now, before we even have a real explosion of AI agents, tells me that we are not ready for whatever's about to happen.
[36:46]
Paul Roetzer
Yeah. I mean, as a publisher of a website, as a brand, you can just, like, say, well, we don't want these users or these agents or these, you know, bots to crawl our site. But then what, you're just going to stay out of the chatbot, AI assistant, AI agent economy? Like, your content's not going to show up anywhere. Yeah. There's no simple answers. And again, like, when you look at, like, where's the future of work? Like, there's going to be people whose job is just to kind of figure this sort of stuff out, to, like, wade through all the issues and challenges and figure out plans for this stuff. But yeah, this is kind of a messy one. I think it's just the tip of the spear, basically. Like, there's a lot more coming, for sure.
[37:34]
Mike Kaput
All right, let's dive into rapid fire this week. First up, OpenAI has released its first open-weight language models since GPT-2. There are two new models, gpt-oss-120b and gpt-oss-20b, that are free to download under the Apache 2.0 license, meaning anyone can run them locally, fine-tune them, and even use them commercially. They support chain-of-thought reasoning, tool use, and code execution. The 120B and 20B stand for the 120 billion and 20 billion parameter versions, and the smaller 20 billion parameter version is able to run on a high-end consumer laptop. OpenAI says the models perform on par with some of its proprietary systems and in certain benchmarks even exceed them, all while being cheaper and faster to operate. CEO Sam Altman framed this release as a way to keep innovation in open models happening in the US amid competition from places like China's DeepSeek. So Paul, I'm curious about OpenAI's motivations here. Obviously they are doing a few things, they've got a few things on their plate at the moment. So why spend a bunch of precious time and resources competing in open source at all when your entire business model relies on selling access to closed models?
[38:54]
Paul Roetzer
Yeah, I mean, they've talked about the fact they were going to do this for a long time, that they were committed to, you know, the open source community or just, you know, open weights. So we've known it was coming. I think the way the labs are looking at this now, and we've talked a little bit about this before, I know Demis Hassabis has said point blank this is what they're doing, is the open source versions that they'll release are basically, like, last year's proprietary models. So the proprietary models that they're selling keep getting better, keep getting smarter, more generally capable. Let's say every 8 to 12 months is the release cycle for a next version. GPT-5 obviously took a little longer, but for the most part the labs are looking at kind of that 8 to 12 month release cycle of the next version. And so every roughly 12 months, the prior version that's now kind of outdated, you open source, as long as it's safe to open source it. And the belief obviously is that the paying users are still going to pay for the premier version of what's available, plus, you know, they're still able to, you know, service the developer community, build those relationships, integrate, you know, APIs, still drive a lot of revenue for these, you know, labs. Specifically OpenAI and Anthropic get a ton of their revenue through their APIs. So it's just having to service that developer community and be a part of it, and then just overall, like, the mission of the organization. Now we've seen some pullback a little bit on this. Like, Zuckerberg, who's been the ultimate champion of open source, has said already, like, they may move off of that. They may, you know, keep some of their technology more in house. But again, I think what they'll do is they keep the current frontier model proprietary and then you open source the prior generations, accepting that there's a small portion of users who will just use the open source and not pay for the other stuff.
But it's just kind of the standard model the labs seem to be following now.
[40:54]
Mike Kaput
So it's kind of a no risk way, at least no risk of cannibalizing your existing products to get developer goodwill, move the ecosystem forward, remain relevant with people still building on your open source models.
[41:07]
Paul Roetzer
Yeah, and I mean, in some organizations they're going to want to build on the open source too. Like, you get into an enterprise, so you may have enterprises that have 5,000 ChatGPT Enterprise licenses, but then the IT team's, you know, also building on top of the open source model, things like that.
[41:22]
Mike Kaput
All right, next up, some more OpenAI news. They are in early talks to let employees cash out some of their shares at a valuation of around $500 billion. So this is a secondary stock sale. It's a deal that would potentially be worth billions, giving current and former staff a way to turn their paper wealth into real money while helping the company retain talent in an era where Meta is trying to poach people for, like, nine figures. This would basically create a huge jump in OpenAI's valuation, going to 500 billion from the last $300 billion valuation, when they did a $40 billion financing round led by SoftBank. And it comes on the heels of an $8.3 billion funding boost that was oversubscribed, and as OpenAI aggressively pushes on product. So we've got open-weight models. GPT-5. We'll talk in a second about a federal deal to provide ChatGPT to the federal government. Paul, I guess as we're looking at employees being able to cash out of OpenAI, what motivates a move like this right now?
[42:25]
Paul Roetzer
Yeah, I mean, they're being drawn by a lot of money from other labs and you have to find ways to motivate people to stay. You have to give that ability to get something off the table. So it makes sense. I'm just looking, Mike, real quick. I searched largest companies in the world by market cap just to provide some perspective on the significance of a half a trillion dollars. So ExxonMobil, which was the largest company in the world for quite a while, their market cap is 455 billion, Netflix is 515 billion, Mastercard's 519, Visa's 649. There's only, what have we got at the trillion dollar plus mark? We have Tesla, Berkshire Hathaway, TSMC or TSM, Broadcom, Meta, Amazon, Alphabet, Apple, Microsoft, Nvidia. That's it. That's the list of companies in the world that are a trillion or more. Yeah, and there's actually only two between a half a trillion and a trillion. Or, well, I guess there's seven. It's one of, like, the 20 to 25 biggest companies in the world.
[43:35]
Mike Kaput
Yeah.
[43:35]
Paul Roetzer
At a half a trillion is what I'm saying.
[43:38]
Mike Kaput
That's incredible.
[43:39]
Paul Roetzer
It's a big number.
[43:41]
Mike Kaput
So we're going to start seeing a whole host of other AI researchers being deca-millionaires, hundred-millionaires, billionaires at some point.
[43:49]
Paul Roetzer
Yeah, there was a crazy stat. I'd have to find it. But so don't quote me on like the exact numbers here, but it's. Go look it up. The number of Nvidia employees who are millionaires and the number who are worth like more than 25 million. It's absurd because their stock in the company, if they've been there for any amount of time, like go back, say nine years or more, you're 10, 20 million, like it's crazy.
[44:15]
Mike Kaput
That's wild.
[44:15]
Paul Roetzer
Yeah, it's a large percentage, but that's what's going to happen within some of these, you know, massive AI companies is everybody who's a part of them are just going to make a ton of money.
[44:26]
Mike Kaput
All right, next up, Anthropic has released Claude Opus 4.1, and it is a notable step up from Opus 4 in coding, research, and reasoning tasks. It hits a 74.5% rating on SWE-bench, a benchmark that is a tough test for real-world coding. Some companies are reporting it's better at pinpointing exact corrections in code without making unnecessary changes. The coding startup Windsurf says the improvement is roughly on par with the leap from Sonnet 3.7 to Sonnet 4 on their junior developer benchmark. And beyond code, Opus 4.1 has stronger agentic search and detail tracking. It's more effective for deep research and data analysis. And this upgrade is available to paid users via Claude Code, the API, Amazon Bedrock, and Google Cloud's Vertex AI, all at the same price as before. Now, interestingly, and related to this, just after the release, Anthropic researcher Amanda Askell shared some more information about the overall updates to Claude's system prompt. This is the master prompt that essentially influences how the model behaves and responds. So in addition to a new model, we got a look kind of under the hood at how Claude works. These are basically a bunch of updates and tweaks to how Claude interacts with users. So, for example, Askell shared that one change was made that reins in overly casual language and needless swearing from the model. Another nudges Claude to be even-handed and critical rather than hyping up every idea it hears. Claude will also be more direct if it suspects someone might be dealing with a mental health issue, instead of only dropping subtle hints. So, Paul, really cool. I mean, in any other news cycle, this would be a huge story. Obviously GPT-5 overshadows everything, but it was really cool to see Amanda giving us a peek under the hood of the system prompt too, because, I mean, correct me if I'm wrong, this is at least more transparent than it seems
some of the labs have been about system prompts, at least until they're forced to, right, when there's a huge change to a system prompt, like they did when GPT-4o had the really controversial change in its personality, or unfortunately, when Grok had some really recent unhinged racist behavior due to some system prompt issues. So maybe talk me through what was cool to see about this system prompt stuff here.
[46:48]
Paul Roetzer
Yeah, Amanda's sort of the lead on the personality behind Claude, so she's great to follow. She's pretty transparent on X about that stuff. The system prompts, you know, the labs aren't very forthright with them, but they're not hard to extract. So there's a, I assume, I think it's a guy, I don't know, but there's a user on X called Pliny the Liberator. The handle is elder_plinius. So we'll put a link in, and the guy drops the system prompts within, like, an hour after every major update. So he's a hacker and he's able to get into, you know, the system and figure out what the system prompts are, and then he publishes the entire system prompt on X. So, like, if you ever want to know what the system prompt is, just follow Pliny and you'll know it. And I know he's been recruited by a lot of the labs. Anthropic in particular was trying to hire him recently and he talked a little bit about that online. So the system prompts are intriguing. You actually learn a lot by seeing how they talk, you know, tell the systems to behave and things like that. Semi-related: I've been grinding to get these courses done and, like, my brain has been on overdrive every day. So I've started a new thing where I just go for a run every night. So I run, like, three miles or something. And I've been listening to a lot of podcasts, so I put it on 1.75 speed and you can get through a lot of podcasts, you know, taking a three-mile run every night. And so I had, like, five I listened to last week that were all really good, and maybe I'll list them out in the newsletter this weekend. But one in particular, just to the whole point of the story: Big Technology Podcast had an interview with Dario Amodei. Mike, you gotta listen to this interview. Dario is pissed. Like, it was the most... I don't know. Like, he's generally a pretty authentic guy and he kind of seems to wear his emotions on his sleeve a little bit.
But there was a quote where Jensen Huang, CEO of Nvidia, sort of accused him of being a doomer. And here's the quote: "I get very angry when people call me a doomer. When someone's like, 'This guy's a doomer, he wants to slow things down.'" He says, "You heard what I just said." And he's talking about, like, his efforts to, like, advance and accelerate AI. "My father died because of cures that could have happened a few years later. I understand the benefit of this technology." And this is now the host asking this: "I'm sure you've heard the criticism from people like Jensen who say, well, Dario thinks he's the only one who can build this safely and therefore wants to control the entire industry." Dario said, "I've never said anything like that. That's an outrageous lie. That's the most outrageous lie I've ever heard." And it's just, like, he was edgy, like, the whole thing. It's fascinating about their model, their rivalry with OpenAI, how they make money, all this stuff. But, like, the doomerism and Anthropic's approach to safety and how they choose to release models when they release them, things like that, safety of the model. So we'll put the link in. It's a really good interview. It's, like, an hour long, but it's worth it. It's good.
[49:57]
Mike Kaput
All right, next up, we are still kind of trying to get a clear picture of AI's impact on the economy, and we might be making a little progress. So first, we got a report that outplacement firm Challenger, Gray and Christmas announced that more than 10,000 US job cuts were directly linked to employers adopting generative AI in the first seven months of 2025. They also said that AI appears in four times as many descriptions compared to the previous period. Now, at the same time, though, according to some other reports, including one in the Wall Street Journal, a core question is baffling economists. If AI is so valuable in, say, replacing human labor, producing productivity gains, why isn't it showing up in terms of impact in the form of increased productivity at the macroeconomic level? Because so far, economists say that AI is not showing up at all in GDP numbers, which is where they would expect to see AI's impact if it was truly transforming the economy. But according to a new study from researchers including Erik Brynjolfsson, who we've mentioned before, he studies AI's impact on the economy, AI's impact may be showing up in some other numbers. So Brynjolfsson and his colleagues argue that while government data barely registers the value of generative AI, Americans gained an estimated $97 billion in what they call consumer surplus from free or low cost AI tools in 2024 alone. Now, the way they define and quantify this is they basically estimated how much money a US adult would need to be paid to give up the usage of a free or low cost AI tool. And they estimated this based on a survey they ran at $98 per month. In other words, kind of the implicit estimate of the value that the user was getting out of those tools each month. Then they went and multiplied that by an estimated number of regular users of AI, and they come up with that $97 billion number. Essentially, they say consumers are getting $97 billion in value out of these tools. 
These are benefits that don't appear in GDP because they accrue to users, not companies. Traditionally, GDP counts transactions, so this kind of thing would be invisible. And Brynjolfsson and his colleagues say this is similar to the paradox that economists spotted with computers starting in the 1980s. You start to see the technology everywhere except in the productivity stats. So, Paul, it's interesting to see real data on AI's job impact. With those 10,000 jobs, it seems clear it's having an impact. We know anecdotally through the conversations we're having that it's having an impact, but it's not really showing up in the economic data. Can you maybe walk through the contradictions here in what we're seeing?
[52:52]
Paul Roetzer
So the opinion piece is based on a forthcoming paper called GDP-B: Accounting for the Value of New and Free Goods. So I read this article three times. I think I was trying to, like, comprehend what they're saying. So where I kind of landed on this, since this is a rapid fire item: the logic of the value not being counted in the GDP makes sense. So the reason they give as to why it's not showing up in GDP is very logical and pretty straightforward. The math to get to 97 billion seems pretty subjective and like some math gymnastics. Like, it's a really nice number to put in a headline: 97 billion. The consumer surplus concept, and how they calculate it, is by saying, "Mike, how much would it take for you to not use ChatGPT?" And you're like, "I don't know, a hundred dollars?" Like, how do you come up with that number? So again, I will withhold any judgment. I love the fact that we're doing this. I love that economists are trying to find other ways to measure value. I think it's great, and the paper itself may end up being exceptional and make perfect sense. In the form of a 500-word opinion piece, it's kind of hard to understand how they're coming up with that number and how valid that number is. It makes for a nice headline though, and it's probably research worth reading through when it comes out.
[54:25]
Mike Kaput
Yeah, I feel like they should have waited for the paper to be.
[54:27]
Paul Roetzer
Yeah, I don't get it. Way too complex of a concept to try and do in a 500 word opinion piece.
[54:33]
Mike Kaput
But, and I won't go down the rabbit hole here since it is rapid fire, the point here too is even if this research ends up being terrible, people are scratching their heads about AI. Like, we're seeing productivity gains in our own work. Is it just not diffused enough into the economy? Like, where are the numbers showing up? But we've talked in the past about how we are also sometimes skeptical. Are economists measuring the right thing? Are they aware of the productivity gains happening in other areas? So it's definitely a relevant conversation that we need to keep tabs on.
[55:03]
Paul Roetzer
Yeah, and again, I don't want to spend too much time on this, but this is what it says: "Rather than asking what people pay for a good, we ask what they would need to be paid to give it up." So let's play this out with ChatGPT. Let's assume you were a ChatGPT user, maybe paying 20 bucks a month, who was in the camp that had never tried the reasoning model and didn't know the full value of the system.
[55:23]
Mike Kaput
Yeah.
[55:24]
Paul Roetzer
So I ask you, as someone who's never used the reasoning model, what would it take for you to give it up? And it's like, I don't know, 25 bucks, 50 bucks, 100 bucks? You ask me or Mike, like, dude.
[55:33]
Mike Kaput
I don't even know. 5,000?
[55:35]
Paul Roetzer
Like, it's just worth a lot of money to us. And so then it says, "Our own survey found their average valuation to forego these tools for one month is $98. Multiply that by 82 million users and 12 months, and the $97 billion surplus surfaces." It's like, wait, what? It just seems like quite a leap to get the 97 billion. But again, I like the direction and I'm anxious to see the actual paper. They're respected economists and authors.
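The multiplication quoted above can be checked directly. This is a minimal Python sketch, assuming the rounded inputs as read on the show ($98 per month, 82 million users, 12 months). The function name `consumer_surplus` is hypothetical and does not come from the paper or the op-ed.

```python
def consumer_surplus(monthly_value_usd: int, users: int, months: int = 12) -> int:
    """Annual consumer surplus: per-user monthly valuation x users x months."""
    return monthly_value_usd * users * months


# Inputs are the rounded figures quoted in the episode (assumed, not from the paper).
total = consumer_surplus(98, 82_000_000)
print(f"${total / 1e9:.1f} billion")  # prints "$96.4 billion"
```

With these rounded inputs the product is about $96.4 billion, slightly under the $97 billion headline. The gap presumably reflects rounding in the quoted survey average or user count, which is an assumption here and not something stated in the episode.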
[56:01]
Mike Kaput
So, next up, a couple new articles are giving us a peek under the hood of ChatGPT. One of them tackles it from a highly technical perspective, the other from a behavioral one. Both are pretty important to understand if you want to understand where ChatGPT and AI are headed. So first, The Information reports that OpenAI is now using something called a "universal verifier" as a "secret weapon" within ChatGPT. So basically, a universal verifier is a technique for checking whether an AI's answers are not just plausible, but actually correct. Basically like a referee AI model grading another model's work, pulling in research from multiple sources. For example, in math, it would essentially have AI verifying each step that AI follows to solve a math problem. The Information speculates that universal verifiers may have actually helped OpenAI's latest model score a gold medal at the International Math Olympiad, which we talked about in past weeks. Researchers say the approach could boost performance in domains that are subjective or hard to score, from business decision making to creative tasks. Now, second, OpenAI themselves published a post called What We're Optimizing ChatGPT For. In it, they kind of lay out a short philosophy for how they're optimizing ChatGPT. They say they are not trying to keep you in the app longer. They are trying to help you get what you need, then get back to your life. They wrote, "Instead of measuring success by time spent or clicks, we care more about whether you leave the product having done what you came for." They also point out that people are increasingly relying on ChatGPT for emotional and personal needs. And some new updates reflect that. ChatGPT will now give gentle break reminders during long sessions. It will refuse to make decisions for you on high stakes personal matters and provide more thoughtful, grounded support when you are struggling.
Apparently, OpenAI says they have worked with more than 90 physicians in over 30 countries, plus researchers in mental health and human computer interaction, to fine tune how the model responds in sensitive moments. So, Paul, these are two really different looks at how ChatGPT works under the hood, but I think they're both useful to understand. So maybe first let's really quickly touch on why universal verifiers matter and then maybe talk about OpenAI's, like, emotional and behavioral approach to how this works.
[58:30]
Paul Roetzer
The verification gap that we've talked about numerous times sort of illuminates why the verifiers would be so valuable. It's the more you can have other agents or AI that can look at the output. So like if you get a deep research product that's 42 pages long and the human has to go through and verify it, well, if they build a really smart verifier on top of that and it checks all the stats and, you know, makes sure all the citations are correct and the data is real and, you know, does lookups of those things, it's just increasingly able to do higher value work for humans. So they're going to be critical not only in the training of the models, the reinforcement learning of the models, but in the actual use of them. Calling it a secret weapon seems like it's probably a bit of an exaggeration. I know for a fact the other labs are working on these kinds of things. They've talked about them publicly, so I can't imagine, I mean, maybe OpenAI's a month or two ahead on their use of a verifier, but that seems like a pretty standard practice within labs to be building agents that can do the verification process.
[59:36]
Mike Kaput
And you know, it did strike me too that some of their commentary around kind of the other side of it, like the emotional and behavioral stuff, like, was really interesting. I could, like, I feel like there are a couple companies they weren't naming that they were taking aim at in saying, you know, we're not trying to engage you on the app and keep you clicking and eyeballs on it, et cetera.
[59:58]
Paul Roetzer
Yeah, I think it was also part recruiting and part retention of talent. They're basically saying, like, listen, if you go work for xAI or Meta, you're just selling yourself off to monetize this technology and keep people on platform. That's what they need to do with their social platforms. It's clicks and time on site and daily active use, hourly active use, whatever their metrics are. And that's not what we're doing here. So it's sort of like a mission thing of like it's more than money. Like we're here to actually make the world better, not make more money on ads and clicks and time on site. So yeah, it was a pretty not so subtle dig at, I would imagine, Meta and xAI in particular.
[60:43]
Mike Kaput
All right, next up, OpenAI has struck a deal to make ChatGPT Enterprise available across the entire US federal executive branch for the next year. So under the agreement, each agency that participates will, for just $1 per agency, get access to OpenAI's top models and get an extra 60 days of unlimited use of advanced tools like Deep Research and Advanced Voice Mode. This also includes some custom training, a dedicated government user community, and consulting support from Slalom and Boston Consulting Group. Obviously, this program aims to cut time spent on red tape and paperwork, freeing public servants to focus on core missions. OpenAI cites some early pilots that show promise. In Pennsylvania, employees saved about 95 minutes a day on routine tasks. In North Carolina, 85% of staff in a 12 week trial reported positive experiences. So, Paul, the focus on the executive branch is interesting. They call out literally in the announcement the AI Action Plan. So I'm guessing this is somewhat related to or motivated by that. This definitely seems like a trend of OpenAI getting more embedded in federal and local governments, doesn't it?
[61:57]
Paul Roetzer
Yeah, and obviously the administration is just very, very aggressively moving in and doing deals on these things. Like, it came out over the weekend that Nvidia is now allowed to sell their H20 chips, I think it is, to China. And then I think Financial Times had the story that they in essence bribed the government to allow it to happen. So like 15% of the revenue for all those sales goes back to the federal government. So they basically bought an exclusion on the tariffs.
[62:25]
Mike Kaput
Yeah.
[62:25]
Paul Roetzer
And so we know that the government is wheeling and dealing all over the place. And so yes, on its surface, great, it is probably going to make for more efficient government. No doubt. My guess is sometime within the next 30 days, The Information or Financial Times or Bloomberg, somebody has the story of what was the quid pro quo here. Like, what did OpenAI get in exchange for giving the federal government these licenses for a dollar? Yeah, I don't know. There's always layers to this stuff, but on the surface, great. It'll make for more efficient governments if they're trained how to actually use this stuff.
[62:59]
Mike Kaput
Right. All right, next up, ElevenLabs, which is best known for its AI voice technology, is now stepping into music with Eleven Music, an AI generator that can create fully produced songs from a simple text prompt in minutes. It can generate any genre or style with or without vocals, and blend instruments and traditions into seamless original tracks. It is apparently built for both creativity and commerce. It has licensing options for film, TV, ads, gaming, podcasts and more. And the company frames it as a way for creators to kind of skip the stock music grind and produce fully unique soundscapes. Interestingly, AI expert and copyright advocate Ed Newton-Rex, who we talk about often, posted about how the company's approach, at least initially, seems to differ from market incumbents. He said, "Co-founder of ElevenLabs confirms that their new AI music models train only on songs they've licensed. That is really good to see. When a handful of AI companies try to tell you generative AI can only be built with scraped copyrighted work, remember that the majority of AI music models license their training data, including now ElevenLabs' model. Very embarrassing for the couple of AI music companies that are known to train on people's music without permission." Now, Paul, Ed Newton-Rex, in some follow-up comments in this thread on X, did say he'd like to see evidence backing up the claim of ElevenLabs' co-founder. He also asked a few times if they trained their voice model only on audio they licensed. He did not get an answer. But at least this does seem like a step in the right direction.
[64:31]
Paul Roetzer
Yeah, the tech's awesome. It's like anything else. Like, all these tools are great. Image generation, video generation, music, whatever. There's always this underlying "Yeah, but it was trained illegally at some point." I mean, the story's not gonna go away. I don't want the story to go away per se. I think people like Ed need to keep the pressure on these labs and find ways to compensate creators. I don't know the answer to how that happens, but so many of the AI labs just seem to have kind of moved on. It's like, "Man, of course we took their stuff. Leave us alone." That's like the general gist of how the labs respond whenever they're called out on it. It just is what it is. So I don't know. I don't know when we're going to finally have, like, a court case that changes anything or some industry agreement that changes things. But up until then, every time we talk about how awesome something is, there's always the yeah, they did. They still got proof material.
[65:32]
Mike Kaput
Now some more AI audio news. Meta has quietly snapped up a company called Waveforms, which is a fast rising AI voice startup, for an undisclosed sum. It is Meta's second major AI audio acquisition in just about a month; it follows their purchase of PlayAI. And this is all part of their new AI unit, Superintelligence Labs. Waveforms was founded only eight months ago, but it already raised $40 million from Andreessen Horowitz. They hit a $160 million valuation. The company's tech is focused on passing the so-called "speech Turing test," so basically making AI speech indistinguishable from humans, and on building what they call emotional general intelligence to detect and respond to emotional cues. Two of the co-founders, Alexis Conneau, a former Meta and OpenAI researcher who helped develop GPT-4o's advanced voice, and Coralie Lemaitre, a former Google ad strategist, have both reportedly joined Meta as part of this. So Paul, Meta acquired PlayAI back in June. That's a startup that uses AI to generate human sounding voices. Waveforms is building emotional general intelligence. We've been talking in past episodes about Meta's aspirations to build personal superintelligence. I don't know. This really seems to me like we're heading in the direction of Meta building hyper personalized voice assistants or companions. Like, what do you think?
[67:00]
Paul Roetzer
Definitely seems to be going in that direction. I mean, I think Zuckerberg's been on record in recent podcasts talking about voice plus glasses. You know, they basically think that touch goes away, largely, as an interface, and that most of your interactions with intelligence, with agents, with assistants happen through voice and your interactions with the world around you. And so it makes sense that they would be kind of making lots of investments in this direction. And again, it gets back to that distribution question. Like, obviously OpenAI is going in the same direction. They've been putting a ton in voice. It seems like OpenAI probably had a lead, maybe they still do, on voice. Google's obviously making major plays into voice. I do think, like, as you were saying this, the one thing that crossed my mind, I don't know if you have this issue, Mike, because I think you use ChatGPT voice as well. I love it, but I often use it when I'm driving.
[67:58]
Mike Kaput
Yeah.
[67:59]
Paul Roetzer
And it drops in like dead zones all the time. It drives me crazy. And that goes to the whole, like, open source question, or like the opportunity for Apple to put a smaller voice model on the phone, like on device, where I don't have to be going off device to have that conversation. Those are like the windows of opportunity for someone like a Google with Pixel or Apple, you know, with the iPhone, where I don't have to leave and I can just have that uninterrupted voice conversation. Where like I'm talking, talking, talking, and like three minutes goes by and then I realize I lost the connection and the voice wasn't there anymore. And you're like, oh, everything I just said was perfect. I don't want to have to repeat that.
[68:38]
Mike Kaput
100%. That happens all the time. And I feel like, despite how amazing advanced voice mode is, voice is underrated or underutilized at the moment. So, yeah, it's not only just having it on device, but the type of device. Right. Like phone is the form factor right now. We know OpenAI is coming out with some type of device. We don't know what. Wearables maybe is the play. I feel like, yeah, AirPods would be incredible. Just feels like this could be a real big unlock.
[69:05]
Paul Roetzer
Yeah. And it just seemed like a year ago OpenAI was knocking on the door, like they had basically solved it with their Whisper technology and built it in. And then it just feels like they lost momentum or they ran out of compute. It's very possible they just couldn't launch it because they didn't have enough compute to do all the other stuff. But again, this is where Apple, Google, the stalwarts, the people with the distribution and the devices, that's where the opportunity is. I assume whatever they're building with Jony Ive, like, that's probably tied to voice in some capacity. So, yeah, I think there's just gonna be a lot more to come with voice, you know, probably still in 2025.
[69:45]
Mike Kaput
Yeah, for sure. All right, last but not least, we have, as our last topic here, Google is making a big push to offer its most advanced AI tools to college students for free. It is committing a billion dollars to AI education, training and research in the US. So starting now, students in the US, and they also added on Japan, Indonesia, Korea and Brazil, can sign up for a free 12 month Google AI Pro plan that includes Gemini 2.5 Pro for homework help and research, NotebookLM for organizing ideas, Veo 3 for AI generated videos, higher limits on Google's AI coding agents and 2 terabytes of storage. This release also debuts Guided Learning, which is a mode in Gemini that doesn't just give answers but actually walks students through problems step by step to deepen their understanding. In the US, Google also reports that over 100 colleges have already joined their new AI for Education accelerator, which is awesome, offering free AI training and Google Career Certificates to college students. CEO Sundar Pichai says the goal is to put top tier AI in students' hands and teach them how to use it well, helping them thrive as the first true generation of what he calls "AI natives." Now Paul, I feel like this might have flown a bit under the radar with all the other news. I mean, I would have to benchmark it, but a billion dollars in commitments to US schools over three years seems pretty significant. The offer of free AI training and Google Career Certificates to every student. I mean, I just feel like I have a fair amount of conversations, I know you do too, with teachers, higher ed institutions. This feels like something that could really move the needle if they stick the landing on it.
[71:27]
Paul Roetzer
Yeah, it's great to see, and I don't know what the connection is, but in April there was the executive order from the White House on Advancing Artificial Intelligence Education for American Youth, and then they just came out, I think it was like last month or something, with the policy plan. Because the executive order basically said that it is the policy of the United States to promote AI literacy and proficiency among Americans by promoting the appropriate integration of AI into education, providing comprehensive AI training for educators, and fostering early exposure to AI concepts and technology to develop an AI-ready workforce and the next generation of American innovators. So that was like saying, hey, we're going to do this, we're going to create a task force, and in 90 days, 180 days, whatever, this is the plan. I don't know if this is connected to that and a commitment from Google related to that, but it would seem they're very closely aligned at least. So yeah, I think this is great. I think we're seeing more and more of this from the major AI companies, whether it's Microsoft, OpenAI, and Anthropic's been releasing some great stuff. And so I would say, as you're building out, and ironically, I was building the AI Academy course this morning about building an internal academy, so this is very, very top of mind for me: think about these things as you're building personalized learning journeys for your teams. It's like, okay, we're going to have our core curriculum, but what can we pull from Google? What can we pull? And obviously, this is more K to 12, but conceptually, like, what can we pull from these different resources that can really enhance our people and prepare them for the future of work? And as you're even starting to hire, like, looking at what kind of curriculum have people gone through with their AI education? Where are they already at with their understanding and competency in this stuff?
So, yeah, it's awesome to see this really large focus, not just from Google, but the White House and other major companies, that AI literacy is absolutely critical to the future of work and innovation, not just in the U.S. but beyond that.
[73:18]
Mike Kaput
Yeah, 100%. All right, Paul, we made it through GPT-5 week. Thanks for breaking the whole.
[73:26]
Paul Roetzer
I think my overall take is, like, it seems awesome. It's just like after a year and a half of, like, waiting, you thought the world was going to change the day GPT-5 came out. And it feels like they did more backtracking than, like, yeah, yeah, I don't know.
[73:41]
Mike Kaput
We'll see. I feel like it's, like, going to be much more impactful than I even realize now. That's going to be a lot more subtle.
[73:47]
Paul Roetzer
It's like when we look back in like 30 days, 90 days, it's like, oh, wait, that actually was a bigger deal than maybe it seemed in those first 48 hours.
[73:54]
Mike Kaput
Hey, what I just said could be out of date by this afternoon when Google releases something. But yeah, I do think that we're going to look back and be like, okay, that might have been a subtle turning point. But again, it just shows, like, the bubble, the hype is out of control.
[74:08]
Paul Roetzer
Yeah. And that we all live in that. Anyone listening to this show, at least, we generally live in a bubble. And most of your peers have no idea that GPT-5 came out or what it is. Like, it's funny, my dad, who listens to the podcast every week, he'll often, like, text me things. And he texted me, I think, the morning after, and he goes, "Nothing on the news today." So he was, like, watching the news to see if GPT-5 was even talked about in mainstream media. And he's like, nothing.
[74:35]
Mike Kaput
Wow.
[74:35]
Paul Roetzer
And so that, again, it tells you, like, we're not to the point. Like, we're waiting, waiting, waiting for a year and a half. The general public, like, couldn't care less. It's a non-event to them until.
[74:46]
Mike Kaput
Until the next Studio Ghibli filter goes viral or something. Right? All right, Paul, well, thanks. Thanks again.
[74:53]
Paul Roetzer
All right, thanks everyone. We'll talk to you next week. Thanks for listening to The Artificial Intelligence Show. Visit SmarterX.ai to continue on your AI learning journey and join more than 100,000 professionals and business leaders who have subscribed to our weekly newsletters, downloaded AI blueprints, attended virtual and in-person events, taken online AI courses and earned professional certificates from our AI Academy, and engaged in the Marketing AI Institute Slack community. Until next time, stay curious and explore AI.