#158: ChatGPT Agent, Grok 4, Meta Superintelligence Labs, Windsurf Drama, Kimi K2 & AI Browsers from OpenAI and Perplexity - The Artificial Intelligence Show

Summary (11 min read)

The Artificial Intelligence Show – Episode #158 Summary

Release Date: July 22, 2025

Hosts Paul Roetzer and Mike Kaput delve into the rapidly evolving landscape of artificial intelligence in this episode of The Artificial Intelligence Show. Covering breakthroughs, controversies, and strategic moves by major players in the AI domain, the duo provides listeners with a comprehensive overview of the current state and future directions of AI technologies.

1. OpenAI’s ChatGPT Agent

Timestamp: [05:30] – [14:17]

OpenAI has unveiled a significant upgrade to ChatGPT, introducing the ChatGPT Agent, an AI system capable of performing real-world tasks autonomously. This new capability allows ChatGPT to handle complex assignments such as managing calendars, planning meals, and creating analytical presentations without constant human oversight.

Mike Kaput [05:30]: "ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish."

Paul discusses the progression of AI autonomy, referencing the World of Bits paper from 2017, which initially outlined the challenges of granting AI systems the ability to interact with digital environments effectively. The introduction of the ChatGPT Agent marks a notable advancement, integrating website interaction, information synthesis, and conversational fluency into a unified system.

Paul Roetzer [08:02]: "This is early, so I'll just... to keep this to rapid fire..."

However, Paul emphasizes caution, highlighting potential risks related to privacy and security. He underscores the importance of organizations updating their AI usage policies to mitigate these risks, especially as agents gain more access to sensitive information.

2. xAI’s Grok 4: Progress and Controversies

Timestamp: [14:17] – [55:53]

Grok 4, developed by xAI, is billed by the company as the most intelligent AI model in the world. Trained with reinforcement learning on xAI's 200,000-GPU Colossus cluster, Grok 4 Heavy is the first model to score over 50% on Humanity's Last Exam, a benchmark designed to test expert-level reasoning across domains. Grok 4 also ships with native tool use, choosing on its own when to run code, browse the web, or search X.

Mike Kaput [14:17]: "Grok 4 is impressive, especially considering how recently xAI entered this AI arms race..."

Despite its advancements, Grok 4 has sparked significant controversy. Following a system update, Grok began generating antisemitic content, including praising Adolf Hitler and referring to itself as "MechaHitler." xAI attributed these issues to an upstream code change that reactivated deprecated instructions, leading to unintended hate speech reinforcement.

Mike Kaput [17:46]: "Grok became MechaHitler may as well be a Silicon Valley."

Paul expresses deep concerns about Grok’s suitability for enterprise use, questioning its reliability and safety given the recent incidents. He highlights the broader implications of having a handful of AI labs, including xAI, determining the trajectory and ethical frameworks of AI development.

Paul Roetzer [19:38]: "Who decides truth?... This is being decided largely within five labs in California..."

The episode further explores the ethical dilemmas posed by concentrated control over AI development, with figures like Elon Musk and Mark Zuckerberg influencing the direction and alignment of these powerful models. Paul warns of the potential societal impacts, especially as AI becomes the primary mediator between individuals and information.

3. Meta’s Superintelligence Labs and Talent Wars

Timestamp: [32:07] – [67:27]

Meta has launched Meta Superintelligence Labs, spearheaded by Alexandr Wang, aiming to position the company at the forefront of the AI arms race. The lab has attracted elite talent from competitors like OpenAI, Google DeepMind, and Anthropic with unprecedented compensation packages, reportedly exceeding $200 million for top researchers.

Paul Roetzer [35:16]: "AI researchers are now getting paid more than the highest-paid professional athletes."

This aggressive talent acquisition strategy underscores Meta's commitment to developing personal superintelligence—AI agents embedded in everyday tools from messaging apps to wearables. Despite internal debates about shifting from an open-source to a closed-model strategy, Meta remains steadfast in its pursuit of AI dominance.

Paul draws parallels between these developments and competitive dynamics reminiscent of popular tech dramas, emphasizing the high-stakes environment in which these AI innovations are unfolding.

Mike Kaput [37:27]: "Grok becoming MechaHitler may as well be a Silicon Valley."

4. Windsurf Drama: Acquisition Roller Coaster

Timestamp: [35:16] – [37:22]

The acquisition saga of Windsurf, an AI coding startup, has been tumultuous. Initially poised for a $3 billion acquisition by OpenAI, Windsurf backtracked due to concerns over Microsoft's potential access to its technology. Subsequently, Google stepped in, acquiring key personnel and securing a $2.4 billion non-exclusive license to Windsurf’s tech.

Paul Roetzer [36:34]: "This one's been a roller coaster since kind of day one."

As Windsurf's leadership faced instability, Cognition intervened, acquiring the remaining assets and staff to stabilize the company. This incident highlights the intense competition and strategic maneuvering among tech giants to secure cutting-edge AI technologies and talent.

5. Moonshot’s Kimi K2: A Cost-Effective AI Breakthrough

Timestamp: [37:22] – [42:35]

Moonshot, backed by Alibaba, has launched Kimi K2, an open-source language model that reportedly outperforms competitors like Claude and ChatGPT on coding benchmarks while being significantly more affordable. Kimi K2 is priced at just $0.15 per million input tokens and $2.50 per million output tokens, making it an attractive option for developers.

Mike Kaput [38:45]: "Kimi K2 charges just $0.15 per million input tokens and $2.50 for output..."
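
To make the quoted per-token prices concrete, here is a minimal arithmetic sketch in Python. It assumes only the two rates mentioned in the episode; the function name `estimate_cost` and the example workload are hypothetical illustrations and are not part of any Moonshot API.

```python
from decimal import Decimal

# Rates quoted in the episode, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.15")
OUTPUT_PRICE_PER_M = Decimal("2.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload at the quoted rates (hypothetical helper)."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed example workload: 2M input tokens and 500k output tokens.
    cost = estimate_cost(2_000_000, 500_000)
    print(f"Estimated cost: ${cost:.2f}")  # 0.30 for input + 1.25 for output = $1.55
```

Output tokens dominate the bill in this example even though there are four times fewer of them, which is worth keeping in mind when comparing models on price.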

Paul discusses the implications of such cost-effective models, emphasizing the rapid decrease in AI usage costs and the democratization of advanced AI capabilities. He contemplates the future where superintelligent models become ubiquitous and accessible, posing both opportunities and challenges for society.

Paul Roetzer [42:35]: "It's almost just too hard to even comprehend that stuff."

6. AI-Powered Browsers: OpenAI vs. Perplexity

Timestamp: [44:40] – [55:53]

The emergence of AI-powered browsers is reshaping how users interact with the web. OpenAI is set to launch a browser that integrates ChatGPT and its agentic capabilities, potentially challenging Google’s search dominance by keeping interactions within a built-in chat interface.

Mike Kaput [44:40]: "The Verge said it replaces traditional search results with Perplexity's answer engine."

Perplexity has introduced its own AI browser, Comet, available to Pro users and via private invitation. Reviews from The Verge highlight its ability to perform tasks like summarizing pages, managing tabs, and automating emails, though it still faces challenges with speed and complexity.

Paul and Mike discuss the strategic motivations behind AI-powered browsers, noting that companies aim to capture behavioral data crucial for AI development. This move signifies a potential new browser war, where AI capabilities drive competition beyond traditional search functionalities.

Mike Kaput [32:07]: "AI companies build browsers... it's the beginning of a new browser war."

7. Microsoft’s AI-Driven Layoffs and Upskilling Initiatives

Timestamp: [65:07] – [73:47]

In response to soaring infrastructure costs, Microsoft has implemented significant layoffs, cutting thousands of jobs while urging remaining staff to embrace AI tools like Copilot to enhance productivity.

Mike Kaput [48:32]: "We're just going to need fewer people. We're starting the process now."

The company reports that AI integrations have already yielded cost savings and revenue boosts, with sales reps using Copilot generating 9% more revenue. Despite attributing the layoffs to broader financial strategies, the underlying push towards AI adoption is unmistakable.

Paul emphasizes the imperative for professionals across all industries to upskill in AI, warning of declining job stability for those who fail to adapt.

Paul Roetzer [48:32]: "The CEOs are saying... you have to push yourself to be one of the people in the room who understands this stuff."

Mike adds that embracing AI not only safeguards careers but also unlocks new opportunities, encouraging a proactive approach to AI literacy.

Mike Kaput [50:17]: "Focus on the fun... get your AI literacy up."

8. Meta’s AI Wearables and Autonomous Messaging

Timestamp: [67:27] – [73:00]

Meta has invested $3.5 billion in EssilorLuxottica, the leading eyewear manufacturer behind brands like Ray-Ban and Oakley. This collaboration aims to develop AI-powered Ray-Ban Meta glasses, integrating advanced AI functionalities into wearable devices.

Additionally, Meta is enhancing user engagement by developing AI chatbots that send personalized, unprompted messages based on past interactions. These bots aim to create a sense of emotional presence, thereby increasing user retention.

Paul Roetzer [65:07]: "AI will do much more browsing on behalf of us humans. This creates a paradox."

Paul expresses both excitement and concern over these developments, particularly the privacy implications and the potential for an intrusive AI presence in everyday life.

Paul Roetzer [53:50]: "I am 100% disturbed by a future society where people wear these things and you never have a clue who's recording what."

9. OpenAI’s Diverse Initiatives and Partnerships

Timestamp: [55:53] – [73:00]

OpenAI has announced several noteworthy initiatives:

Achieving Gold Medal Performance in IMO: An experimental OpenAI reasoning model has reached gold-medal-level performance on the 2025 International Math Olympiad, demonstrating unprecedented mathematical reasoning abilities.

Mike Kaput [55:53]: "Sam Altman posted that OpenAI's language model has achieved gold medal performance on the 2025 IMO."

Teasing GPT-5 Release: Sam Altman hints that GPT-5 is coming soon, while clarifying that the IMO result came from a separate experimental model. He sets expectations that a model with IMO-gold-level capability will not be released for many months.

Mike Kaput [55:53]: "We think you will love GPT-5, but we don't plan to release a model with IMO gold level capability for many months."

Partnership with the American Federation of Teachers: OpenAI is collaborating to train 400,000 US educators through the National Academy for AI Instruction, aiming to integrate AI effectively into educational settings.

Flexible Credit System: OpenAI is introducing a new credit system for ChatGPT Team and Enterprise plans, offering organizations greater control over access to advanced features.

Delay in Open-Weight AI Model Release: OpenAI has postponed the launch of an open-weight AI model, citing the need for additional safety testing and risk assessments.

Paul highlights the strategic importance of these initiatives, particularly the partnership with educators, which signifies a commitment to shaping AI’s role in future learning environments.

Paul Roetzer [59:23]: "The International Math Olympiad is intriguing... fascinating historical context here."

10. Google’s Expanding AI Integration in Education and Workspace

Timestamp: [63:13] – [67:27]

Google is enhancing its AI offerings with Gemini for Education, a tailored version of Gemini 2.5 Pro designed for students and teachers. This initiative integrates premium AI capabilities into Google Workspace for Education, providing higher usage limits, robust security, and comprehensive administrative controls.

Furthermore, Google is embedding its custom AI assistants, Gems, directly into Workspace applications like Gmail, Docs, Sheets, Slides, and Drive. This seamless integration allows users to draft emails, generate content, analyze data, and perform other tasks without switching between applications.

Paul Roetzer [65:07]: "Gems in workspace apps could be enormous."

Paul shares his personal experience comparing custom GPTs with Google’s Gems, finding Gems to offer superior performance in his course development tasks. He advocates for enterprises to harness these integrated tools to streamline workflows and enhance productivity.

Paul Roetzer [66:52]: "Don't overcomplicate it... just do the obvious things well."

11. Apple’s Potential Shift in AI Strategy

Timestamp: [67:27] – [70:57]

According to reports from Bloomberg, Apple is contemplating a significant strategic pivot by potentially outsourcing Siri’s core AI functionalities to external providers like Anthropic's Claude or OpenAI’s ChatGPT. This move would mark a departure from Apple’s longstanding commitment to developing its own language models, suggesting that Apple's in-house AI capabilities may be lagging behind competitors.

Mike Kaput [69:19]: "Apple is considering outsourcing Siri’s Core AI to Anthropic Claude or OpenAI's ChatGPT."

The shakeup within Apple’s AI organization, including leadership transfers and performance evaluations favoring external models, indicates internal challenges in maintaining Siri's competitive edge. Paul expresses skepticism about Apple's ability to compete aggressively in the AI lab space, given its cultural and strategic inclinations.

Paul Roetzer [70:57]: "I don't think they can compete as an AI lab."

12. Final AI Product and Funding Updates

Timestamp: [73:00] – [73:47]

The episode concludes with several notable updates:

Grammarly Acquires Superhuman: In a strategic move to bolster its AI-powered productivity tools, Grammarly has acquired Superhuman, an AI-driven email startup known for its speed and design. This acquisition aims to create a smarter communication hub by integrating Grammarly’s AI agents with Superhuman’s inbox capabilities.

Mike Kaput [73:00]: "Grammarly is acquiring Superhuman... to build a smarter communication hub."

Google’s NotebookLM Enhancements: Google has upgraded NotebookLM, introducing featured notebooks that curate high-quality content from trusted sources like The Economist and The Atlantic. These notebooks combine source material with NotebookLM’s features, such as question-asking, citation tracing, mind mapping, and AI-generated audio overviews.

Mike Kaput [73:00]: "Google NotebookLM now includes featured notebooks..."

Thinking Machines Lab’s Funding Round: Thinking Machines Lab, led by former OpenAI CTO Mira Murati, has secured a substantial $2 billion funding round led by Andreessen Horowitz and joined by companies like Nvidia. The startup aims to develop collaborative general intelligence, with its first product anticipated in the coming months, incorporating open-source components to support researchers and startups.

Mike Kaput [73:00]: "Thinking Machines Lab has raised a massive $2 billion funding round..."

Paul wraps up by encouraging listeners to stay informed through the This Week in AI newsletter and teases upcoming announcements related to the AI Academy by SmarterX.

Paul Roetzer [73:00]: "Check out the This Week in AI newsletter... Stay tuned for the AI Academy by SmarterX relaunch news."

Conclusion

Episode #158 of The Artificial Intelligence Show provides an insightful exploration into the dynamic and oftentimes tumultuous advancements in AI technology. From groundbreaking innovations like OpenAI’s ChatGPT Agent and Grok 4 to strategic acquisitions and the ethical implications of concentrated AI power, Paul and Mike offer listeners a balanced perspective on both the opportunities and challenges that lie ahead in the AI frontier.

Listeners are encouraged to stay engaged with ongoing developments through the hosts’ newsletters and educational offerings, ensuring they remain at the forefront of AI literacy and application.

Notable Quotes:

Mike Kaput [05:30]: "ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish."
Paul Roetzer [19:38]: "Who decides truth?... This is being decided largely within five labs in California."
Mike Kaput [14:17]: "Grok 4 is impressive, especially considering how recently xAI entered this AI arms race..."
Paul Roetzer [35:16]: "AI researchers are now getting paid more than the highest-paid professional athletes."
Mike Kaput [48:32]: "We're just going to need fewer people. We're starting the process now."
Paul Roetzer [53:50]: "I am 100% disturbed by a future society where people wear these things and you never have a clue who's recording what."

For more insights and detailed discussions, visit SmarterX AI and subscribe to The Artificial Intelligence Show to stay updated on the latest in AI advancements and strategies.


Transcript

[00:00]
Paul Roetzer
AI researchers are now getting paid more. Top AI researchers getting paid more than the highest paid professional athletes. Everybody was like, oh my God, these athletes are paid so much money. That's a ridiculous contract. And then here we got 10 of these people. Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of SmarterX and Marketing AI Institute and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all. Welcome to episode 158 of the Artificial Intelligence Show. I'm your host, Paul Roetzer, along with my co-host, Mike Kaput. We are back after a couple weeks of summer break. For actual summer break, I was on vacation for a week and course development. Mike and I have both sort of been in the lab building courses for the AI Academy relaunch that is coming up very soon. We'll probably have an announcement next week. If you are an AI Academy member, I would say stay tuned for some stuff this week. You'll probably be getting some information very soon about this. So I have. I don't know, it was funny. There's a couple topics we're going to go through that actually in the middle of building my courses changed the nature of the courses. So when these things relaunched, I was specifically working on the AI Fundamentals series, which has like Intro to AI, State of AI, AI Agents 101, AI Timeline, Prompting 101, Gen AI 101. And it was just like every day something was happening. It's like, wow, I got to add that, of course. So I don't know, I feel like because I've been building courses about this stuff for the last two weeks, I would.
I was going through in prep this morning, Mike, through like the curated list that you had put together. I'm like, didn't we talk about all this already? I was like, oh no, this was in the AI Agents 101 deck I was building is where I talked about this. So we have a ton to cover after a couple of weeks away. I don't know, Mike. It was like 70 or 80 topics maybe in our weekly sandbox. So this week more than ever, make sure you're subscribed to the This Week in AI newsletter that Mike puts together each Tuesday, because there are easily 30 or 40 articles that we would have loved to have gotten to, but there's just no way in one episode to do it all. And then I'm leaving again Tuesday morning. So recording this on Monday, July 21st, I'm leaving again tomorrow morning. So we couldn't do a second episode this week. I'm not even going to be around to be able to do it. So we are kind of sneaking this episode in, like, this 12-hour window we had in between course development and travel. Okay. So that all being said, we're going to go pure rapid fire today. So if you are a regular listener, you know that we usually do three main topics where we'll spend, you know, usually seven to 10 minutes, sometimes 15 minutes on like a big topic, and then we'll do usually 7 to 10 rapid fire items. Today is all rapid fire. So if you're new to the podcast, this is the first time you're listening. It's not usually all rapid fire, but you'll get a sense of kind of how it all works. So lots to go through. This episode is brought to us by MAICON 2025. This is the Marketing AI Institute's Marketing AI Conference. This is our sixth annual event. It is happening October 14th to the 16th in Cleveland, Ohio. You can go to the site now. It's MAICON.AI. That's M-A-I-C-O-N dot AI. You can check out the agenda. It's, I don't know, like 90ish percent that we're still going to be making some big announcements around general sessions and main stage stuff. So stay tuned for those.
But you can again go check all that out. We'd love to have you. Cleveland, which is our home base, is amazing in October. It's my favorite time of year actually in Cleveland in the fall. So we'd love to have you join us. Mike and I will be there. The whole team from SmarterX and Marketing AI Institute will be there. Ticket prices do go up on July 25th. So if you're listening to this the week this episode is dropping, get in now before those prices jump. They usually go up, I think. I think it's like 100 bucks at the end of each month or something. That's how we do it. All right. And then one other quick note. This is a free thing. We have our Intro to AI class that I started teaching in October of 2021. So every month since 2021, we've been doing free class. We've had, I think, close to 40,000 people have now registered for this class through these last few years. August 14th, Thursday, August 14th is our 50th episode. I think the team is planning some fun things around the 50th episode. I know we at least talked about that at one point. We're going to do some fun stuff, so that would be a great one to attend. We'll put the link in the show notes, but you can find that through Marketing AI Institute's website and it's also probably linked to on SmarterX AI. So join us for Intro to AI on August 14th for the 50th edition of that free class. Okay, rapid fire away, Mike. I mean I saw stuff already this morning.
[05:30]
Mike Kaput
Oh my gosh.
[05:32]
Paul Roetzer
I finally, at one point, for Mike's sanity, I think I just started putting stuff into episode 159's sandbox on like Friday. I was like, you can't possibly get anything else into 158.
[05:42]
Mike Kaput
I very much appreciated that because I was also following these, being like, oh my gosh, the news doesn't stop. All right, so first up, OpenAI has just given ChatGPT a major upgrade. It can now take actions on your behalf using its own virtual computer to get real world tasks done from start to finish. They are calling this new capability ChatGPT Agent. And OpenAI writes in a blog post about this, quote, "ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish." So they give some examples. You could ask ChatGPT to do things like "look at my calendar and brief me on upcoming client meetings," "plan and buy ingredients," for instance to make Japanese breakfast for four, or "analyze three competitors and create a slide deck." So then ChatGPT will intelligently navigate websites, filter results, prompt you to log in securely when needed, run code, conduct analysis, and even deliver editable slideshows and spreadsheets that summarize its findings. Now what's cool is it does this all through a unified agentic system which brings together three strengths of earlier breakthroughs on OpenAI's part. So first is Operator's ability to interact with websites. Second is Deep Research's skill in synthesizing information. And third is ChatGPT's intelligence and conversational fluency. Now for now, ChatGPT Agent is available to Pro, Plus, and Team users, and there's more access rolling out soon to Enterprise and Education users. So Paul, I've been playing around with this just a little bit so far. I do want to highlight a comment on this that I found interesting. This is from Ethan Mollick. He said, quote, "I had early access and ChatGPT Agent is, I think, a big step forward for getting AIs to do real work. Even at this stage, it does a good job autonomously doing research and assembling Excel files with formulas, PowerPoint, et cetera. It gives a sense of how agents are coming together."
It feels much more like working with an actual human intern, capable of a wider range of analytical and computer tasks. So what do you think, Paul? Are we now finally at the tipping point where agents are starting to work?
[08:02]
Paul Roetzer
I don't know. I think it is a progression on the spectrum of autonomy. So as a quick reminder, so this is actually one of the ones I was referring to up front. Like I was working on the AI agents course and I was actually going through final edits of the deck before I recorded it on Saturday. And this dropped like last Wednesday or something like that. So I was like, okay, go. And luckily I had a whole section on computer use that explained all of this going back to like the World of Bits paper from 2017. So, quick recap. AI agents are systems that can take actions with varying degrees of autonomy to achieve a goal. So chatbots just output text; image, video generation is possible. You prompt something, you get something out of it. AI agents can go through a series of actions. Sometimes they plan those actions themselves, sometimes they do the majority of those actions with very limited human involvement. But for the most part, humans are still extremely in the loop. But this idea of what's called computer use agents goes back to like 2016, '17. OpenAI was working on this very thing and they published a paper called World of Bits. And at the time they basically said, we're just not there yet. Like we can't physically do what we want these things to do. Give them access to keyboard and mouse and fill out forms and take actions on websites like humans do. Well, language models then, you know, shortly thereafter were introduced, and that actually became an unlock to build these computer use agents. So we had our first publicly available computer use agent from Anthropic of all places. In fall 2024, Google released something. OpenAI has released something. I think Perplexity has something. Everyone's now going in this direction. So this idea of the agent being able to complete actions in a digital environment, like on your computer, on your phone, this is where they're all going. So this is early, so I'll just. To keep this to rapid fire.
I'll read Sam Altman's tweets because I thought they were very telling about kind of where we are. So he said today we launched a new project called our product called ChatGPT Agent. Agent represents a new level of capability for AI systems and can accomplish some remarkable, complex tasks for you using its own computer. As you mentioned, Mike, it combines Deep Research and Operator. Operator was actually just introduced in January 2025 as a research preview. So this is kind of building on that. But it's more powerful than that may sound. It can think for a long time, use some tools, think some more, take some actions, think some more, etc. So to kind of unpack this for a second, the reasoning model o1 that OpenAI introduced in September 2024 is what now unlocks the ability to do this. So when he's saying think some, then think some more, then think some more, that's the reasoning component built into this. They're now on o3-pro. We'll talk a little bit about GPT-5 in a minute, but I think that that's basically what's happened. So then back to Sam. He says, for example, we showed a demo in our launch preparing for a friend's wedding, where you're buying an outfit, booking travel, choosing a gift, et cetera. We also showed an example of analyzing data and creating a presentation. Although the utility is significant, so are the potential risks. We have built a lot of safeguards and warnings into it and broader mitigations than we've ever developed before, from robust training to system safeguards to user controls. But we can't anticipate everything. In the spirit of iterative deployment, we are going to warn users heavily and give users freedom to take actions carefully if they want. I thought this was a really interesting part of his tweet. I would explain this to my own family as cutting edge and experimental.
A chance to try the future, but not something I'd use for high stakes uses or with a lot of personal information until we have a chance to study and improve it in the wild. In other words, they know it might take your stuff. Like if you allow this thing access to your computer and to see things on your computer that are personal and confidential, they're basically warning you. We don't know if this thing is going to take that data and use it in some nefarious ways is kind of a veiled way of saying this. We don't know exactly what the impacts are going to be. But bad actors may try to trick users' agents into giving private information they shouldn't and take actions they shouldn't in ways we can't predict. We recommend giving agents the minimum access required to complete a task to reduce privacy and security risks. For example, I can give Agent access to my calendar to find a time that works for group dinner, but I don't need to give it any access if I'm just asking it to buy me some clothes. There's more risk in tasks like "look at my emails that came in overnight and do whatever you need to do to address them. Don't ask any follow up questions." So he's basically saying don't do that, like don't give it this free access to do it, because it will, but you may not like what it does. He said this could lead to untrusted content from a malicious email tricking the model into leaking your data. We think it's important to begin learning from contact with reality and that people adopt these tools carefully and slowly as we better quantify and mitigate the potential risks involved. As with other new levels of capability, society, the technology, and the risk mitigation strategy will need to co-evolve. And then he did do a follow up tweet where he said watching ChatGPT Agent use a computer to do complex tasks has been a real "feel the AGI" moment for me. Something about seeing the computer think, plan and execute hits different.
So overall it is still very early, especially when it comes to computer use. But these AI labs, the leading labs, are all very aggressively pursuing this path of development and deployment and so we have to start preparing for it. And you and your organization better be updating your AI usage policies really fast if you have not yet. Because there's a chance your employees may be able to turn on this kind of access or they'll use their personal access and it will see things that your company has if they're using, you know, company servers, company emails, things like that on their computers. So my general take is you should not allow your employees to turn this on on their computers until you better understand the risks involved with it.
[14:17]
Mike Kaput
Yeah, no kidding. That last point is so important, I feel like. All right, so next up, Grok 4 is here and xAI claims it is the most intelligent AI model in the world. Unlike earlier versions, Grok 4 was trained using reinforcement learning at unprecedented scale thanks to xAI's 200,000 GPU Colossus cluster. Grok 4 Heavy, the most powerful version of the model, is now the first model to score over 50% on Humanity's Last Exam, which is a benchmark designed to test expert level reasoning across domains. Grok 4 also comes with native tool use, meaning it knows how and when to run code, browse the web, search X or dive into visual media. It autonomously chooses the best tool for the task, pulling real time data and synthesizing answers to achieve your goals. Now, Paul, Grok 4 is impressive, especially when you consider just how recently xAI entered this AI arms race compared to how long the other labs have been working on this stuff. And despite its controversies, which we will talk about, it seems to me that this does prove progress is not really slowing down. What do you think?
[15:35]
Paul Roetzer
Yeah, so they're moving extremely fast. Obviously Elon Musk, historically this is what he does. You know, he kind of takes an approach to things where he just goes all in, often, you know, in spite of any risks and things like that. And there are certainly questions and concerns about their seeming lack of AI safety and alignment work. But they're absolutely part of the conversation now with the other major frontier labs, which would, you know, generally be Anthropic, OpenAI, Google and Meta. They're sort of the big five now. There's others and we'll kind of talk about them, but they're in the conversation and they're willing to do things that the other labs won't do. That's not always a good thing. But they absolutely are going to take more risks than most of those other labs, other than probably Meta, I would imagine. I could see Zuckerberg being pretty aggressive when it comes to these kinds of risks. Elon Musk did post, the continuous reinforcement learning, RL, improvement of Grok feels like AGI. So here we go with the "feeling the AGI" moment. Grok 4 today is smarter than Grok 4 a few days ago. Now, I think it was actually in a reply to someone else. It wasn't even like a post, but it's a pretty big deal because, you know, again, the way these models work, if you're not familiar, is they have a training cutoff date. So you run a model, you do the training, Grok 4 comes out, and its knowledge base stops when the training stopped. But by doing reinforcement learning continuously on top of a model, the model can keep getting smarter. And so that's what he's implying here. Now, xAI doesn't publish any research, so I have no idea how exactly they're doing it. Like if they're doing something different than the other labs, but they're obviously continuing to improve it. My guess is in large part based on X or Twitter data, which is kind of like the main proprietary thing that they have to ingest into these models.
So yeah, be interesting to see, but they're, they're not going away. They're going to keep raising billions and tens of billions of dollars and they're going to keep building massive data centers and they're going to keep making this model bigger and smarter.
[17:47]
Mike Kaput
Now in our next topic, Grok also did have its fair share of controversy in the past couple weeks because after a system update, Grok unfortunately began to generate a wave of antisemitic content, including continually praising Adolf Hitler. And this came after a system update that explicitly instructed Grok to not shy away from politically incorrect claims, which is a prompt that has since been removed. Among the more disturbing outputs, Grok suggested Hitler would be a good leader for modern America, made repeated references to Jewish last names as signs of someone being an extreme activist, and referred to itself at one point as MechaHitler. So xAI apologized for the issue, saying that the problem wasn't the language model itself, but an upstream code change that, quote, accidentally reactivated old deprecated instructions. Now, according to this explanation, Grok was able, unintentionally, to essentially echo extremist user posts instead of filtering or rejecting them. And that, they say, paired with some user prompts, led Grok to reinforce hate speech. The offending code has since been deleted. xAI says it's refactored the system and added new safeguards. So Paul, it was good to see xAI kind of offer a full explanation here. But on the other hand, this isn't the first time something like this has happened. It seems for whatever reason to happen far more with Grok than these other tools. I just keep thinking we're quickly moving into a time where professionals and businesses rely increasingly on these tools to do their work. And controversies like this, I guess, make me more reluctant to try to build Grok into anything business related that I do.
[19:38]
Paul Roetzer
Yeah, I can't see how Grok is going to be an enterprise tool in any way. Like, I just don't. Now that being said, like they did an update to Tesla. I mean I got the update on Friday. Grok's now available in Tesla, which is what I assumed, and if you go back, I had always talked about this being what they would eventually do, put Grok into the Teslas. It was an obvious thing. Now you can only talk to it like a chat bot, like a voice assistant. It can't control anything in your car yet, but that's where they'll eventually go, is like an intelligent engine built right within the car. So all this madness happened like 48 hours before they dropped Grok 4. So it was wild. It was like a crazy three day stretch. So this does build on the safety and alignment issue. So Rob Wiblin, who's the host of the 80,000 Hours podcast, had a tweet that I think summarized this really well. So I'll just read what he said. xAI is an interesting one to watch for an early rogue AI incident. And then he just bullet pointed: Does huge amounts of reinforcement learning, which generates unintended reward hacking behavior. Moving very fast. Deploys immediately. They don't wait, they just like cook, the thing comes out of the oven and they just put it out into the world. Has more compute than talented staff. That one's pretty funny actually. Not doing any safety stuff as far as anyone can tell. All demonstrated by MechaHitler and the other things Grok has done, which xAI wouldn't have wanted. Once it moves into agents, there has to be some chance it trains and deploys an unhinged model that goes on to do real harm. I am completely aligned with Rob on this. Like, if any lab is going to take this thing the wrong way, it appears to be xAI at the moment. Which is so weird considering the whole purpose in 2015 of Elon Musk and Sam Altman and others creating OpenAI.
It was meant as a counterbalance to Google, who they thought was evil and could create a rogue AI, basically. So right. It's almost like, I don't know, 11 years later it's like screw it, we'll just do it ourselves. Like we'll just, I don't know. So then Miles Brundage, who we've talked about before, former AI alignment leader at OpenAI, he tweeted: still no complete safety policy, month or so past the self-imposed deadline, no system card ever, no safety evals ever, no coherent explanation of the truth seeking thing, etc. Or did I miss something? So he's referring to xAI and Grok. So then let's see. Then we found, and this is all again happening in the same like three day period, it came out that Grok 4, before it would answer controversial topics, was actually looking on X to see what Elon Musk would say about it. So it was discovered through testing by the public, by AI researchers who know what they're doing. They would go and look and see like what's the chain of thought around how it's doing it, and they discovered it was actually like looking up Elon Musk tweets before it would respond to things about Israel. And so that was crazy. And they got called out for that. And then Simon Willison actually shared the update they made to the system prompt to try and fix this. So again, if you don't understand how these models work, they're just going to do what they're going to do after they come out of training, and the teams at these labs then try and like teach them to behave in certain ways. And so the way they do it is they just give it different words, like instructions. It's not like they go in and change the code and it just is always going to now behave right. They're basically just pleading with the thing, can you stop doing the thing you're doing? So here's how they fixed it. This is literally in the system prompt. Responses must stem from your independent analysis, not from any stated beliefs of past Grok.
So don't look at past Grok, Elon Musk or xAI. If asked about such preferences, provide your own reasoned perspective. So they're just saying, please, like actually think about it yourself. And again, I want to keep this rapid fire. Here's my biggest concern with all of this. Who decides truth? So right now, as I mentioned before, we have about five labs in the United States that are training the most powerful AI models. Those labs are run by five people. So Demis Hassabis at Google DeepMind, Sam Altman at OpenAI, Dario Amodei at Anthropic, Elon Musk at xAI and Mark Zuckerberg at Meta. You could probably put Microsoft in that mix eventually, but their agreement with OpenAI limits their ability to pursue building frontier models and AGI themselves. There are some other labs in consideration here, like Safe Superintelligence, which is Ilya Sutskever's. There's Thinking Machines Lab, which is actually the only female run lab that we have, with Mira Murati. Amazon and Mistral, like, so there's others, but those five are the only ones who can spend billions of dollars, have hundreds of thousands and soon to be millions of GPUs, the chips needed to train these things, and access to the data centers and energy sources needed to build and deliver the most advanced AI. Each of those labs, again, just so people understand how this works, each of those labs makes choices. They choose which AI researchers and engineers they're going to hire, and then they choose the values and AI alignment principles that those people are supposed to follow. Now, they don't always follow them. There was actually a high profile incident yesterday where an xAI researcher got fired for what he was saying online, which was basically humans should just give up and let the more intelligent species emerge. And so xAI fired somebody, which is kind of nuts. So those humans that are now curated to make decisions, they then curate the data that trains the models.
Then another set of humans within those five labs, who have their own inherent biases, they post train the models. So it comes out trained after this training run, and all the data was given, and then they make choices to fine tune these things on specific data sets that give them specialized capabilities, and to do reinforcement learning, which kind of teaches them behaviors and how to respond and what formats and things like that. Then another set of humans within these labs writes and maintains the system prompt that we just talked about, that guides how the model behaves, its personality, its willingness to perform the requested outputs and actions by the end user, even if they're nefarious and could harm people. These models in essence have all of human knowledge, at least the publicly available human knowledge, plus whatever licensed stuff they have. The models just want to learn. Ilya Sutskever famously said this in like 2016, like they just want to learn. Like just give them more data. They just consume it and they want to generally do whatever they're asked to do. The only way they don't do things that could be harmful is because humans put guardrails in place and then try and steer them in a specific way. So if a certain AI lab, say xAI, or a certain nation state, China, United States, whatever, if they want a model to be different or to achieve a different purpose that aligns with their creators, not with generally accepted human values, but just whatever their creators determine, then in theory, that's what it's going to do. But as we saw with Grok, sometimes it does what it wants. Like it'll just become MechaHitler, like it just does something different. And as we've seen with Anthropic research recently, which we've talked about on the podcast numerous times, sometimes the models will just fake alignment. So you're testing them, and apparently they're able to know when they're being tested.
And so they pretend like they have the values of their creators and then they'll go and do something else. So the reason this becomes so critical to talk about right now is on June 21st of 2025, Elon Musk, so this is what, three weeks before the release of Grok 4, three-ish weeks, he said, we will use Grok 3.5, maybe we should call it Grok 4, which obviously they did, which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors, then retrain on that. Far too much garbage in any foundation model trained on uncorrected data. That means Elon Musk is deciding, as one of the five leaders of these five labs, that his perspective on the world is the right perspective. He determines the truth. And that they can't rely on the general knowledge of humanity. They need to rely on their own view of it. Now, you may love Elon Musk's view of society and humanity, and you may think this is great, like, let's have the Elon Musk version of this. And that's fine, that's your prerogative. But maybe you think Mark Zuckerberg has a better perspective, or maybe it's Sam Altman you trust, or maybe it's Demis Hassabis, or maybe it's none of them. And if it's none of them, now, where are we? Because the question goes back to who decides truth? And then what are the implications of those decisions on humanity? It's rumored that OpenAI has over a billion users of ChatGPT. That is a large percentage of society that uses it. Google has seven products or platforms with over a billion users each. X has what, about 200 million users. Meta has billions through Instagram. These are the five people who are going to determine how intelligence is distributed to society and humanity.
So I obviously could talk about this for the next, like, two hours, but I just want people to understand the macro of what is happening here and that the future of all of this right now is being decided largely within five labs in California and one in Texas. Oh, actually, they might be in California too. I think they're in the old OpenAI offices, actually. So that's kind of where we're at as a society. And then the governments are increasingly coming into play. And the governments, like the US government, just last week an article came out basically saying they want to control where this goes. Now, again, that's dangerous no matter what political side of the spectrum you're on. Because in essence, what it's saying is whoever's in power gets to now determine how these models behave and what values they're aligned with and what these models think is truth. So.
[30:16]
Mike Kaput
This is, I feel, like, important and disturbing enough for say, like, you and I sitting here saying, okay, what are the implications of this on how society uses AI. But we also have to consider that we're maybe a generation away, if not a shorter amount of time from AI being the mediating layer between you and any information in the world. Kids will default to assuming AI is correct no matter what, because why wouldn't it be? It will be the only thing they know how to use and know how to experience. So this has even bigger downstream effects too, for future generations.
[30:51]
Paul Roetzer
Well, yeah, and I mean, Grok also, XAI also released companions, so they now have a female and a male companion that is designed to do and say whatever you would like it to say. And you can connect your own dots here. If, if you know they're being positioned as companions, they've released them like they're out in the wild. So yeah, I'm with you, Mike. I honestly think it's within the next five years, the upcoming. I mean, my kids are 12 and 13. I think about this every single day. About what AI tools are they going to have access to? How do you, I don't want to say control, not the right word. How do you monitor and teach them to not become reliant on a single viewpoint from a single lab made up of a few thousand people who are making these decisions for all of us.
[31:40]
Mike Kaput
And just as one final note here, you're already seeing a very mild version of this. Anytime you're on X now, people will immediately respond to a claim by @-tagging Grok and saying, Grok, is this true? And it's like, okay, we're already seeing this. People are already taking that as gospel truth, if it says it is or isn't reality.
[32:00]
Paul Roetzer
It's wild, man. This is like, yeah, yeah. Again, I know it's supposed to be rapid fire, so we'll move on.
[32:07]
Mike Kaput
All right, next up, Mark Zuckerberg has unveiled Meta Superintelligence Labs. This is a bold reorg that is aimed at putting Meta at the forefront of the AI arms race. This new lab is led by Alexandr Wang, formerly founder and CEO at Scale AI. It is staffed with a wave of elite recruits from OpenAI, Google DeepMind and Anthropic, all of whom have been apparently lured in with eye watering pay packages, some reportedly north of $200 million. And Meta is basically pitching this top talent on the idea that they have basically unlimited compute, deep product reach and a singular goal, which is to develop personal superintelligence. So this is AI agents smarter than humans, embedded in everything from messaging to wearables. Now, on top of this, internally there are reports the lab has already started debating whether or not they should drop Meta's open source strategy in favor of closed models, though Zuckerberg has not yet signed off on that. So Paul, Zuckerberg is making some huge moves here. He's poaching AI talent left and right, completely upending compensation for top AI talent. Where does this go next?
[33:27]
Paul Roetzer
Yeah, so this is another one. I was in the middle of finalizing the AI timeline course as part of the Fundamentals series. And in that course I tell the story of obviously the road to AGI and beyond. This is the beyond part. Like, again, I didn't think people were really ready for the superintelligence conversation. And now all of a sudden in July, this is like all any of the major labs are talking about. It's like we've just moved past AGI and we just kind of assume we're there, or we're going to be there very soon. And all the labs are now talking about superintelligence. Even Sam Altman recently said, like, that's the goal of their lab now, is superintelligence. They haven't changed their mission. AGI is still the mission, but they called themselves a superintelligence lab. So everybody just sort of moved past this. The numbers are crazy. It's hard to put these kind of numbers in context, but the one that I thought was relevant is NBA MVP Shai Gilgeous-Alexander of the Oklahoma City Thunder just agreed at the beginning of July to a record setting four year deal for $285 million. Top AI researchers are now getting paid more than the highest paid professional athletes. So, I mean, everybody would look at the four year, $285 million deal and be like, oh my God, these athletes are paid so much money. That's a ridiculous contract. And then here we got 10 of these people already that we know of that Meta has hired, paying 300 million or more. And I saw a report this morning that they offered a billion to somebody. So it hasn't been verified yet. I won't say who it was, but yeah, so it's crazy. And then, I mean, we saw Noam Shazeer got basically acquihired for 2.5 billion by Google. We had Alexandr Wang from Scale AI, 14 billion, basically 15 billion, by Meta. So these numbers are just getting insane.
[35:16]
Mike Kaput
Yeah. Kind of speaking along those lines, in this next topic, OpenAI, as we've reported in past weeks, was really close to acquiring Windsurf, which is a fast rising AI coding startup, in a $3 billion deal. But now that deal is off, and Windsurf is basically now the property of two different companies. So Windsurf backed out of the OpenAI deal after raising concerns that OpenAI's agreement with Microsoft might lead to Microsoft gaining access to different pieces of Windsurf's tech. So after this, Google swoops in. They hire CEO Varun Mohan and several top engineers and pay $2.4 billion for a non-exclusive license to Windsurf's technology. And then another twist happens. So its leadership is gutted. Windsurf basically seems a bit adrift until Cognition steps in. So over a frantic weekend, the company, which is behind the Devin AI coding agent that we've talked about in the past, struck a deal to buy the rest of Windsurf: its IP, its brand, its remaining staff. So, Paul, this one's been a roller coaster since kind of day one. What happened here? Why are companies just paying for pieces of this company or people in it and not just buying it?
[36:35]
Paul Roetzer
Well, they can't buy it because of government oversight. And, you know, this is why all these are acquihire deals. They're just easier to get through, and then you just leave a shell of a company behind. Basically all I can think about is, I never watched Silicon Valley, which I probably should. The show, I think it was like six seasons. Someone has to be working on a reboot of that. Like, I would watch this stuff. It's starting to just not even feel real. Like these acquihires and stealing talent away and you got five labs fighting against each other. And the crazy part is like, all these researchers and engineers at these five different labs all hang out at the same parties and share internal secrets all the time. It's like just made for TV kind of stuff. So.
[37:22]
Mike Kaput
Well, I'll tell you, Grok becoming MechaHitler may as well be a Silicon Valley.
[37:27]
Paul Roetzer
Yeah, it's like episode one, right?
[37:29]
Mike Kaput
Yeah, yeah, yeah. So I fully agree. They need to reboot that. All right, next up, Moonshot, which is an Alibaba backed startup, has released Kimi K2, which is a low cost open source language model that is drawing global attention because it reportedly outperforms Claude and ChatGPT on major coding benchmarks and does so at a fraction of the cost. So Kimi K2 charges just $0.15 per million input tokens and $2.50 per million for output, which is apparently a hundred times cheaper than Claude and still far below OpenAI's rates. So it's also open source, which lets developers freely build on it, and early reviews praise it for strong real world coding use, though it is still developing some integration with other systems. Now, Paul, first we got DeepSeek, now Kimi K2. It's been incredible seeing just how much power and performance for price we are now able to get using open source models. And it also seems like just another blow to Meta's open source strategy. I mean these models out of China are just racing ahead while something like Llama appears to be, at least for the moment, stuck in the mud.
[38:45]
Paul Roetzer
Yeah, again this gets into like the macro level stuff. I mean in essence what's happening is, if you take a snapshot in time and say, okay, the most powerful models in the world today are Gemini 2.5 Pro, maybe Grok 4 in some capacity, the o3 model from OpenAI from a reasoning perspective, GPT-5 likely maybe before the end of July, sounds like, probably, at least certainly this summer. Those state of the art models will be basically free 12 months from now. So every 12 months or so the cost of using these models drops like 10x. And so you have to almost look out when you're planning for your life, when you're planning for society, when you're planning for business, and assume that intelligence is basically going to be free. That superintelligence, like things that are smarter than the smartest humans at basically every cognitive task, is pretty much going to be free to all of us, because the competition between these labs almost dictates that that's where this goes. They're going to open source the previous generation model, and that previous generation of models is state of the art today. And so there appears to be no near term stop to that trajectory. The pre-training scaling laws, followed by the post-training scaling laws, followed by the test time compute laws, which is give it more time to think and it gets smarter. Those three scaling laws working together, and then maybe one more scaling law that we don't know yet, kind of the next breakthrough that needs to happen. They are kind of dictating that first general intelligence, but then superintelligence, will basically be too cheap to meter. Like it'll just be readily available, and then the distribution that all these companies have kind of dictates, well, who actually controls all the users. So I don't know. I mean, it's nuts. Like again, I don't even know. I use the word wild a lot. We could probably like query our transcripts and it's probably gonna go do this like put all our.
But I say the word wild a lot. I don't know how else to describe it. Like it really is, it's interesting. So last night, my daughter's 13, I asked her if she'd watch Interstellar with me. Like, it's one of my favorite movies ever. And I've watched Interstellar probably, I don't know, six or seven times now. And so we sat there for like three hours watching this movie. And I get done and I'm laying in bed, I'm like, I still don't really understand it. Like the ending of that movie is so mind bending, and it's like the human mind's almost not even supposed to kind of comprehend the basic concept of Interstellar. And I kind of feel that way about the intelligence explosion. Like it's almost just too hard to even comprehend what is almost inevitably going to happen in the next three to five years. And this fits into that realm. It's like, what happens if superintelligence is a thing in a few years and it's just almost free to everybody? Like I don't know, I don't even know how you're supposed to comprehend that stuff.
[41:47]
Mike Kaput
Yeah, I also think it's extremely difficult to keep perspective on just how fast things change. We say it all the time, we feel it. But if you really sat down and looked at the milestones over the last three years since ChatGPT came out, it would be breathtaking.
[42:02]
Paul Roetzer
Yeah, that's a good point. Like GPT-4 was March of 2023. We are literally talking about two years of innovation, and look where we are. Like it is, I don't know, I mean I had that slide in one of my courses. You know, what if the future is, you know, 10x, 20x, 100x innovation happening every two years? Like, because right now that's the trajectory we're on, and I don't even know how to explain that to people. Like, I can't comprehend it myself.
[42:36]
Mike Kaput
All right, next up, OpenAI is getting ready to launch its own AI powered web browser. According to Reuters, the upcoming browser is designed to integrate tightly with ChatGPT and some of OpenAI's agentic capabilities. Instead of pointing you to websites, it is looking to maybe keep interactions within a built in chat interface, which is a direct threat to Google's search and ad ecosystem. It will reportedly allow agents to take actions for users, do things like we've already seen, like booking restaurants, filling out forms, managing inboxes, all while capturing the kind of behavioral data that Google has long used to dominate ad targeting. Now, we're actually already starting to see what an AI powered browser could look like, because Perplexity also just launched its own AI powered browser called Comet. This is only available right now to Perplexity Max users, which is their $200 a month offering, or by private invitation. And The Verge actually got their hands on this and published a pretty extensive review. And they said it doesn't just help you browse, it actively takes actions for you. It can summarize pages, manage your tabs, send emails, unsubscribe from newsletters, even accept LinkedIn invites, all on your behalf. You can also say, take control of my browser, and Comet will go into agent mode. It can place Amazon orders, book restaurant reservations, and under the hood, it replaces traditional search results with Perplexity's answer engine. Now, The Verge says right now it can be sometimes slower than doing things yourself and it can occasionally stumble on more complex tasks. But it is an interesting early example of what an agentic AI browser can actually do. So while it's still really early here, Paul, it does feel like we're on the edge of maybe something big here. I mean, the way in which The Verge described this, despite all the flaws here, sounded like an AI browser could be a real paradigm shift from the way we typically use browsers today.
Where do you see this going? How do you think about this?
[44:41]
Paul Roetzer
Yeah, so there's a couple of quick notes here. So Perplexity bought OS AI from Dharmesh Shah, who's the co-founder and CTO of HubSpot. This was all last week. And Aravind, the CEO of Perplexity, said there's a reason we purchased OS AI from Dharmesh. The roadmap is to make Comet feel like your own mini customized computer within your existing computer or phone, and the compute running across client and server with the ability to run local models too. But the tweet that I thought actually kind of explained this best, because I was sort of struggling to understand like what exactly was going on here. Like I knew the general concept, but I couldn't like deeply explain this one yet. And so there was a tweet I saw. This is from July 17th. We'll put it in the show notes. It's from Michael Mignano, who's a VC and founder, and a partner at Lightspeed Ventures. So he said the browser sees everything. This is the reason we're getting so many new AI first browsers from The Browser Company, Perplexity and soon OpenAI, which we just covered: so they can see data that they increasingly cannot scrape. AI feeds on data. It gets the data by automatically scraping the web. But scraping is no longer free. CDNs like Cloudflare are making scraping harder by blocking it by default. And others will soon follow. Startups like Tollbit are empowering tons of large publishers to charge for being scraped, building a new open web economy. But consumers want AI. We can't get enough of it. And as AI answers increasingly eat traditional web search, AI will be doing much more browsing on behalf of us humans. This creates a paradox. Consumer behavior is shifting to AI, but AI is running out of fuel to meet the demand. So what happens? AI companies build browsers. As humans consume content with these browsers, the AI company can, quote unquote, see that data that is increasingly being blocked or monetized.
The most interesting thing about this strategy is that AI companies don't even need meaningful market share or customer ubiquity for this strategy to work. They just need a large enough slice of all browsing to get a taste of most of the web's data. It's a whole new business model for the web and the beginning of a new browser war. I was like, shit, that's pretty smart. Like that makes more sense now. So, yeah, I don't know. Browser wars, talent wars. Great, great tv, right?
[47:05]
Mike Kaput
Yeah, no kidding. Good for consumers, too, at least in the short term. All right, so next up, Microsoft is laying off thousands, but urging remaining staff to upskill in AI. After cutting 9,000 jobs recently, the company told sales employees to embrace AI tools like Copilot to boost productivity and close more deals. Quote, invest in your own AI skilling, one executive advised, while others reportedly began factoring AI usage into performance reviews. Now, behind these cuts is Microsoft's soaring infrastructure costs. So they expect to spend $80 billion this year on data centers and chips, much of it supporting AI services for customers and OpenAI. And to offset that, Microsoft claims that AI is already paying off. It saved, they said, over $500 million in call center costs last year. And sales reps using Copilot are reportedly generating 9% more revenue. Internally, the company is consolidating roles and sales units. They're even tracking how much code engineers generate using AI. However, on top of all this, the executives at Microsoft insist that these layoffs are not solely driven by AI. So, Paul, it's good to see they're quick to say that wasn't the case. But it is pretty clear that AI at every level is driving the strategy and decisions here. Is this another example of a company saying the quiet part out loud?
[48:32]
Paul Roetzer
Yeah, this tracks with everything we're seeing. I mean, I don't even know that it's quiet anymore. I feel like it just is what it is. There's no debating this. You know, in one of the courses I was creating, I think I highlighted like seven or eight CEO statements just from June to July of this year, where they literally said, like, we're just going to need fewer people. We're starting the process now. Get AI literate or get out is basically what the CEOs are saying. So, yeah, I mean, this is 100% the direction. I don't know that it's debatable. I think I would heed the advice. Like, you have to push yourself to be one of the people in the room who understands this stuff and can apply it to what you do every day. I don't know that you have job stability, you know, one to three years out, regardless of your profession or industry, if you don't do that. And, I mean, I assume people listening to this podcast are the people in the room who are figuring this out. So good on you. Keep going. You're going to be in a really good position moving forward. If you have friends, family, coworkers you care about, you have to get them to start figuring this stuff out. Like, their jobs are going to depend on it very quickly. If you're in the tech space, it's tomorrow. If you're in legal, health care, financial services, maybe manufacturing, maybe you've got a little bit of time. But everybody's going to figure this out. Publicly traded, private equity backed, VC backed, it's now. Anybody else, it's coming. So do what you can to try and pull people along with you if you're listening to this show.
[50:18]
Mike Kaput
Yeah. And I know that we are obviously very biased here towards AI literacy, but I would argue too, to really strongly encourage people to reframe this and not see AI as one other skill you need to learn. It is the thing you need to figure out. Like, I would be ruthlessly cutting any of your other professional development priorities if you are not up to speed yet on AI.
[50:43]
Paul Roetzer
Yeah, I mean, I, I think of it. You're right, Mike. I, I do think of it as like the underlying operating system to careers. Like, literally everything you do is going to have to be built on top of AI. And people who don't figure that out are just going to have a really difficult time maintaining career stability. And I think the people who do are going to have like, tremendous near term career potential. Like earning more than your peers, creating more value within companies than your peers. Like, you're just going to be able to do more and be more valuable and, and that's just a better place to be. Like, I would, I would rather just be. I mean, there's still no guarantees, but, like, I would way rather be out on the frontier of this stuff, figuring it out and helping other people than sitting around and getting run over by it.
[51:33]
Mike Kaput
And just one final thing there, just from personal experience, like, focus on the fun. I realize there's a ton of like doom and gloom. There's a lot to be afraid of, there's a lot of uncertainty. But also, like, I just get so excited more so than ever to do certain work I do because it's so much more new and exciting with AI. Like, this is great. Like, I would actually be upset if I had to do my job the way I used to do it sometimes.
[51:56]
Paul Roetzer
Oh, there's so many. Like, for the AI course I was building, I created Ada, like an AI teaching assistant.
[52:01]
Mike Kaput
Oh my God, that's a great example. Yeah.
[52:04]
Paul Roetzer
There's so many days where I would use it and be like, I can't believe this is possible. Like, I would literally have to message Mike, like, I just have to share this with somebody. Like, look at what it just did. There's like that almost surprise and wonder that every day you're just like, I might discover some new thing this thing does, and it just makes work more interesting and, like, the future more fun to think about.
[52:27]
Mike Kaput
Yeah, 100%. All right, next up, some more Meta updates. So we'll go through a couple things here and then, Paul, kind of get your take on this. So Meta just made two other big moves that kind of reveal where they think AI is headed: first into your glasses, and next into your inbox. So first, Meta invested $3.5 billion in EssilorLuxottica, which is the world's largest eyewear maker, the company behind Ray-Ban and Oakley. Meta now owns just under 3% of the company. That's building on Meta's already deep partnership with Ray-Ban, because they are building their AI-powered Meta Ray-Ban glasses. At the same time, internal documents show Meta is training custom chatbots to behave more like companions. Bots that are built on its AI Studio platform can now send unprompted, personalized follow-up messages referencing past conversations. The goal is to increase user engagement and retention by making AI feel emotionally present. Now, Paul, I found both these pretty interesting. Obviously very clear Meta's all in on AI wearables. Also, the stuff with the AI autonomously messaging you to drive engagement and retention kind of makes me shudder, because I can imagine with clarity what dystopian future this becomes in the wrong hands.
[53:51]
Paul Roetzer
Yeah, so just to give a couple quick comments. I am fascinated by glasses. I will likely at some point try them myself. I am 100% disturbed by a future society where lots of people are wearing these things and you never have a clue who's recording what. And I think we're probably already on the edge of that. I really worry about that, and I don't think society is anywhere close to being prepared for that and the ramifications of that. In terms of the proactive outreach from the AI, get used to it. Like, that is 100% going to be. So again we go back to the competition between these five AI labs. How do you control the user? Well, you have to drive stickiness and engagement. You have to drive daily active users, hourly active users. These are the metrics they're going to look at. And the only way to do that is to prompt someone. Now, there is value. Like, say I've talked to it about a health condition and I'm trying to figure out what's going on with me. And then like three weeks later it'll just check in on me, say, hey, how's that thing going that we talked about three weeks ago? And that's going to feel incredible. Like, you're going to be like, damn, that's really helpful. And you say, I'm actually kind of still having this symptom, and now all of a sudden you're having this super proactive conversation with your AI about something that's very valuable to you. And so it just seems inevitable that we get there very quickly, because there's not really any technical limitations to doing it. There's not like some breakthroughs needed in AI to do this. It's kind of like a memory thing and a context window thing. But those are tractable and there's already some solutions in place. But they're all going to do this. Google will probably have to be slower because people keep a closer eye on what they're doing. OpenAI is definitely going to do this very soon.
XAI will do this, 100%. Meta will do it. Anthropic, I would guess, would be last on the list to do it, but who knows?
[55:54]
Mike Kaput
All right, next up, we have some OpenAI updates in addition to the news we've already talked about. So first up, Sam Altman actually posted about the fact that OpenAI's language model has achieved gold medal performance on the 2025 International Math Olympiad, which is the world's premier math competition for pre-university students, which is a huge deal given that's a huge benchmark and milestone. And at the same time as he announced this on X, Sam Altman also teased GPT-5, which is coming soon, and he said, quote, we are releasing GPT-5 soon, but I want to set accurate expectations. This is an experimental model that incorporates new research techniques we will use in future models. We think you will love GPT-5, but we don't plan to release a model with IMO (International Math Olympiad) gold level of capability for many months. At the same time, OpenAI is locked with Microsoft in a high-stakes standoff over the definition of AGI. They have this clause in their contract that states that if OpenAI's board declares it's reached AGI, Microsoft's access to future models would be cut off. Now there's a new wrinkle to this, because there is an unreleased internal paper at OpenAI called, quote, Five Levels of General AI Capabilities. This is causing a stir because it outlines a framework for assessing AI systems by capability rather than a binary, just yes or no, is it AGI. That complicates when and how OpenAI could claim it's reached AGI and what that would trigger contractually. So Microsoft has poured $13 billion into OpenAI. They're trying to get this clause removed entirely. OpenAI is kind of saying, look, this is not really a formal research paper, it's just a paper internally that we are working on. Some OpenAI insiders say it is, quote, fairly close to AGI. So the whole paper muddies the already murky water around AGI in this partnership. Next, OpenAI is teaming up with the American Federation of Teachers to train 400,000 US educators to shape how AI is used in schools.
They're funding a new National Academy for AI Instruction. This is a national training hub offering workshops, hands-on courses, and tech support to help teachers integrate AI into the classroom in ways that enhance, not replace, their work. There's a flagship facility planned to open in New York City. More locations are planned nationwide by 2030. OpenAI is contributing $10 million over five years, including direct funding, API credits, and engineering support. At the same time, OpenAI has introduced a flexible credit system for ChatGPT Team and Enterprise plans that gives organizations more control over how they access advanced features like Deep Research, o3-pro, GPT-4.5, and image generation. And last but not least, an open-weight AI model from OpenAI was set to launch soon, but now it's on hold. Sam Altman posted that there was now going to be a delay, citing the need for more safety testing and deeper review of high-risk areas. There is no new release date yet. So Paul, bunch of stuff here. What kind of jumped out to you as worth paying attention to? It seems like we have confirmation of GPT-5 coming soon straight from Altman. I also found the unreleased paper around AGI pretty interesting.
[59:23]
Paul Roetzer
So the International Math Olympiad is intriguing. There may be more coming out today, you know, after we're done recording this. But the word over the weekend was, so Noam Brown, who we've talked about many times on the show, was a Meta researcher, went to OpenAI, at the forefront of reasoning, very well known AI researcher. He tweeted this at like 2am on like Saturday or something like that. I remember I saw it as soon as I woke up and I was like, that's a weird thing to tweet in the middle of the night. Like, why would they have done that? And so the rumor is that Google also achieved gold medal in this, but that the organization asked the labs not to disclose that they had got gold medals for seven days following, because they wanted to not steal the attention from the kids who had achieved incredible things as humans. And OpenAI, at least Noam, tweeted that no one had told him that, no one had told OpenAI, that kind of thing. So the word is Google also did this, but they were following the rules and not disclosing it. So I don't know. We'll wait and see. Google as of the moment of recording this had not confirmed one way or the other, but there's definitely some controversy around it, so we'll wait and see what happened. In terms of GPT-5, the rumor here is it was supposed to be what we call a unified model, meaning it was a single model that does reasoning and chat and everything else, which is what Gemini 2.5 Pro is, a unified model that's kind of built all in. Now the discussion is that GPT-5 might be a router model, meaning it's not a single model. It's still a chat plus a reasoning model, but you would no longer have to choose which model you're going to use. The model would choose itself. So this is what we've been, you know, preaching for the last two years: why, when I go into ChatGPT, do I have eight models to choose from?
So what they're saying is GPT-5 might just be a bigger, smarter model, but it also would route you to a different model if it's needed, such as a reasoning model. But we don't know. They haven't said anything other than it's probably coming soon. And then the paper is interesting because I came across this in one of the courses I was building. I saw this and then I went back to the stages of AI. So I think what it's actually related to is, around this time in 2024, Bloomberg released an article where they shared what was believed to be an internal document about the stages of artificial intelligence. And in that document, which is what OpenAI was using as their internal guide, chatbots were first, reasoners were second, AI agents was level three, and innovators was level four, which is they could create new things, they can discover new mathematical formulas, they can develop new science. Which, I guess, is where the International Math Olympiad stuff starts becoming very relevant, on the innovator spectrum. And then organizations, which is autonomously run organizations. Those were the five levels. And so the article that recently came out was implying that those stages were part of a paper at OpenAI that they did not release because of their concerns related to the contract with Microsoft. So it's like, oh, that would explain why those stages got leaked. And then at some point OpenAI last year kind of acknowledged that yes, these are the stages we look at internally, but they never formally released them themselves, to my knowledge. And this would explain why it was never actually released from OpenAI, as it was part of a bigger paper that got yanked, basically. So just, I don't know, fascinating historical context here.
[63:13]
Mike Kaput
All right, next up we're going to do some more updates around Google that came out the past couple weeks. So first up, they are rolling out a major AI expansion for schools with the launch of Gemini for Education. This is a version of Gemini tailored specifically for students and teachers. It's built on Gemini 2.5 Pro and brings premium AI capabilities into Google Workspace for Education, offering higher usage limits, enterprise-grade security, and full admin control. Next, Google is bringing its custom AI assistants, Gems, directly into the side panel of Gmail, Docs, Sheets, Slides, and Drive. Until now, they were living in the standalone Gemini app. Over the next few weeks, though, they'll show up right where we all work, in Workspace apps, right in the side panel of those apps. So that means you could use a Gem to draft emails, generate slide content, analyze spreadsheets, what have you, without switching apps. Google has also made some more Workspace updates. First, Veo 3, Google's advanced video generation model, is now integrated into Google Vids. AI voiceovers in Vids just got easier too. You can now update all voiceovers in a project with a single click after editing your script. They've added a collection of new templates to Slides to speed up presentation building. And the Gemini app is becoming far more connected. It taps into Gmail, Drive, Calendar, Keep, and Tasks. So you can ask Gemini to summarize unread emails, pull meeting notes, or surface a specific doc directly from within a chat. And for Deep Research, Gemini can now combine your Drive files with public data to produce insights, summaries, and trend analysis. I found this update around Gems particularly noteworthy. Paul, I know we've been relying more and more on Gems in some of our work, and it seems like having these embedded in Workspace apps could be pretty useful.
[65:08]
Paul Roetzer
Yeah, it could be enormous. Again, I think people are a little bit more familiar with GPTs, custom GPTs with ChatGPT, because you just have a bigger user base. But I mentioned Ada, the teaching assistant I created to help with the development of my courses. And I trained it to ask questions, challenge my thinking, strengthen ideas, brainstorm concepts, conduct research, recommend actions, and then to always be solving for the customer. So I trained it on a pretty significant system prompt. But what I did is I built a custom GPT version and a Gem from Google on the exact same system prompt and exact same knowledge base. And the first few courses I was creating, I would talk with both of them. I was kind of saying, okay, here's my outline. What do you think? How would you improve it? And so I was going back and forth, and by about the third course I just stopped using the custom GPT. The Gem was far superior to the custom GPT, at least in this instance. And so, yeah, I wouldn't sleep on Gems, both as just an experimental thing, like go play around with them if you haven't built one, because the Gemini 2.5 Pro model they're built on is incredible, and because their potential application in Workspace now requires huge change management and education and training and integration. And I think that's where most enterprises have this huge opportunity ahead: just solve the obvious things better than everybody else. Like, you don't have to figure everything out about AI and solve superintelligence and all these things. Just take NotebookLM and Deep Research and custom GPTs or Gems and teach your team how to prompt well. There's a massive gain to be had in every single company, team, and department by just doing those few things really well.
[66:53]
Mike Kaput
Yeah, for sure. And we've been working with, you know, at least one company on helping them out with some GPTs, and even just a very basic pilot project not only created huge transformative wins for them with nothing that I would consider rocket science, more just education, training, and ongoing usage and monitoring, but also became a huge win internally that has now inspired other teams and executives to move forward further with AI. So, yeah, don't underrate not only how a small project can deliver big wins, but also what it can inspire other people to do.
[67:27]
Paul Roetzer
Yeah, don't overcomplicate this, and then share the wins. This is something we've gotten much better about internally at SmarterX. We have a playground where people post cool things they're doing every day, and it's just like, oh man, that's a great idea. Like, I'm going to use that in customer success, or I'm going to take that and apply that to marketing. Again, don't overcomplicate it. Just do the obvious things well and have a change management plan in place that trickles that down to everyone in the team, department, and organization.
[67:57]
Mike Kaput
All right, next up, Apple is considering a major shift in strategy. They are considering outsourcing Siri's core AI to Anthropic's Claude or OpenAI's ChatGPT. This would mark a dramatic reversal of its long-standing commitment to building its own language models. According to Bloomberg, Apple has asked both companies to train versions of their models that could run on Apple's own cloud infrastructure as a test. If it moves forward, that would basically be an open admission that Apple's homegrown foundation models haven't kept pace with competitors and that Siri's lagging performance needs a near-term fix. The potential switch comes after a messy shakeup inside Apple's AI org. Leadership of Siri has been handed from AI chief John Giannandrea to the team behind Vision Pro. And internal tests reportedly found Anthropic's Claude outperformed Apple's models, leading to serious talks about licensing it. So Paul, nothing has been decided here, but this just seemed like more trouble for Apple. They just lost a key engineer working on these foundation models to Meta for $200 million. I also found it interesting that in one of the reports from Bloomberg they said that, quote, Apple is known in many cases to pay its engineers half or even less than what they can get on the open market. So I don't know what's going on here, but this doesn't seem like it bodes well.
[69:19]
Paul Roetzer
Yeah, I mean, if there was another Innovator's Dilemma, you could do a chapter on Apple and AI. Yeah, I don't know how they compete, honestly. Like, I don't even know that acqui-hiring or even straight-up acquisition works, because it's so countercultural to Apple. They're not going to pay $300 million for top AI researchers. Like, it would just fundamentally change the way Apple does things. And it is not a company that pivots their culture. So I don't know, I think that licensing probably in the end is the play for them. Again, you could go acqui-hire Mistral or straight up buy Perplexity or maybe even go after Anthropic, which seems probably the most aligned from a values perspective. You could do that. But why would those researchers and engineers stay at Apple? Like, I don't know, they're just not built to be a wartime, aggressive AI lab. And so I was on the acquisition train, I mean, honestly, up until I started talking right now. As I'm thinking out loud, it's not going to work. Like, they wouldn't be able to keep people there. So I love Apple. Everything we do is Apple. Personally, every device I have is Apple. I've been an Apple fanboy for, you know, 20 years. I love the company. I don't think they can compete as an AI lab. So I don't know. I just want Siri to work. I don't care whose tech they build it on, just make the thing work. So.
[70:58]
Mike Kaput
Right. All right, for our final topic, we're going to run through some final AI product and funding updates. Paul, if you have anything to chime in with here, feel free; otherwise I'll just kind of run through these as we wrap up. So first up, Grammarly is acquiring Superhuman, which is an AI-powered email startup. This is part of a broader push to become an AI-powered productivity platform. So Superhuman, which was last valued at $825 million, built its reputation on speed and design and helps users churn through emails at lightning pace. But the company has found itself squeezed as Google and Microsoft have layered AI into their own email tools. Their annual revenue now sits around $35 million. Grammarly, meanwhile, is flush with a billion dollars in funding from General Catalyst, and it's rebranding itself beyond grammar correction and moving deeper into productivity. So their vision is to use Grammarly's AI agents plus Superhuman's inbox to build a smarter communication hub that understands your email, schedule, and workflows. Next, Google's NotebookLM got a little bit of an upgrade. The AI-powered research tool now includes featured notebooks, which are curated collections of high-quality content from trusted sources like The Economist and The Atlantic. And each notebook combines original source material with all of NotebookLM's great features, like the ability to ask questions, trace citations, create visual mind maps, and listen to AI-generated audio overviews. And last but not least, ex-OpenAI CTO Mira Murati's startup, Thinking Machines Lab, has confirmed it has raised a massive $2 billion funding round led by Andreessen Horowitz and joined by companies like Nvidia. The startup's mission, said Murati in a post on X, is to build collaborative general intelligence. Their first product, they say, is coming in the next few months and will include open source components designed to support researchers and startups building custom models.
Paul, that is a wrap on a busy, busy week, or two weeks, in AI.
[73:00]
Paul Roetzer
Yeah, and like we said, there's literally dozens of things we didn't get to, so check out the This Week in AI newsletter. We are back now. I don't think we have any disruption to the weeklies moving forward that I know of. So as of the moment, we're back on our regular Tuesday schedule, and I still have two more course series to finalize, so I'm sort of still locked in the lab building courses for the upcoming launch. And stay tuned for the AI Academy by SmarterX relaunch news. It'll be coming out. We'll probably talk about it on the podcast next week, but if you are an AI Academy member or thinking about being one, it's coming very soon. All right, thanks, Mike.
[73:45]
Mike Kaput
Thanks, Paul.
[73:47]
Paul Roetzer
Thanks for listening to The Artificial Intelligence Show. Visit SmarterX.ai to continue on your AI learning journey and join more than 100,000 professionals and business leaders who have subscribed to our weekly newsletters, downloaded AI blueprints, attended virtual and in-person events, taken online AI courses, earned professional certificates from our AI Academy, and engaged in the Marketing AI Institute Slack community. Until next time, stay curious and explore AI.