Ep.131: OpenAI’s Economic Blueprint, Ph.D-Level “Super Agents” Rumored, o3 Release, New AI Pricing from Google and Microsoft & Apple Intelligence Falls Flat - The Artificial Intelligence Show

The Artificial Intelligence Show: Episode 131 Summary

Release Date: January 21, 2025
Hosts: Paul Roetzer & Mike Kaput

1. OpenAI’s Economic Blueprint and Government Briefing

Overview:
OpenAI unveiled its Economic Blueprint, a comprehensive policy proposal outlining how the United States should develop and regulate artificial intelligence. Central to this blueprint is the projection of approximately $175 billion in global funds poised for investment in AI projects. OpenAI emphasizes the need for the U.S. to attract these funds to prevent China from taking the lead.

Key Proposals:

AI Economic Zones: Establishing specialized zones to foster AI development.
Research Labs: Creating labs aligned with local industries to drive innovation.
National AI Infrastructure Highway: Developing a robust network of power and communication grids tailored to support AI advancements.
Regulatory Alternatives: Advocating for federal leadership in crafting AI regulations to maintain American competitiveness.

Controversial Stance:
OpenAI asserts that AI developers should have the liberty to use publicly available information, including copyrighted content, for model training. This stance has sparked debates around intellectual property and data usage.

Upcoming Briefing:
Sam Altman, CEO of OpenAI, has scheduled a closed-door briefing for U.S. government officials on January 30th ([08:32]). Paul Roetzer speculates that this meeting aims to solidify OpenAI’s influence amid the new administration, asserting the company's belief in infrastructure as the cornerstone of future AI success.

Notable Quote:
Paul Roetzer at [08:32]:
"OpenAI is essentially building on top of other people's research data to justify the opportunity that exists now in the 'Infrastructure Is Destiny' report."

2. PhD-Level “Super Agents” and the o3 Model Release

Rumors and Speculations:
Recent reports from Axios suggest that major AI labs, potentially including OpenAI, are on the cusp of announcing breakthroughs that would enable PhD-level super agents capable of performing complex human tasks ([19:27]).

OpenAI’s o3-mini Model:
Sam Altman confirmed via X (formerly Twitter) that OpenAI is finalizing the o3-mini model, set for release in approximately two weeks. This model, although not as capable as o1 pro, is significantly faster and will launch in the API and ChatGPT at the same time. Altman's replies suggest that ChatGPT Plus users might get access to o3-mini in some form ([18:18]).

Discussion:
Paul and Mike delve into the implications of these advancements, highlighting the potential for AI to reach and surpass human expertise in specialized domains. They reference historical milestones like AlphaGo to contextualize the significance of AI achieving superhuman performance.

Notable Quote:
Mike Kaput at [27:01]:
"You're right. That's what matters is when it starts to affect our jobs."

3. Google and Microsoft’s New AI Pricing Strategies

Google’s Gemini Integration:
Google is integrating Gemini AI by default into all Google Workspace business plans, accompanied by a modest price increase. The standard plan previously priced at $32/user/month will now include Gemini for an additional $2/user/month ([32:01]).

Microsoft’s Consumption-Based Pricing:
Microsoft introduces a pay-as-you-go model for certain AI agent features within its Copilot Pro license, maintaining the $30/user/month fee. Pricing tiers are based on the complexity of tasks:

Simple Messages: ~$0.01 per message
Generative Responses: ~$0.02 per message
Data-Driven Responses: ~$0.30 per message

Discussion:
The hosts analyze Google’s strategy as a move to undercut competitors by embedding AI features into existing subscriptions. They critique Microsoft’s complex pricing model, questioning its feasibility for budgeting within enterprises.
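
To make the budgeting concern concrete, here is a minimal sketch of how a team might estimate monthly spend under the per-message rates listed above. The rates are the approximate figures from this summary, not an official price list, and the function name, message-type keys, and usage numbers are hypothetical and chosen only for illustration.

```python
# Approximate per-message rates from the summary above, in US cents.
# These are the summary's rounded figures, not an official price list.
RATES_CENTS = {
    "simple": 1,        # ~$0.01 per simple message
    "generative": 2,    # ~$0.02 per generative response
    "data_driven": 30,  # ~$0.30 per data-driven response
}


def estimate_monthly_cost(message_counts, rates_cents=RATES_CENTS):
    """Return estimated monthly spend in dollars for a mix of message types.

    message_counts maps a message type to the number of messages sent.
    Working in whole cents avoids floating-point rounding surprises.
    """
    unknown = set(message_counts) - set(rates_cents)
    if unknown:
        raise ValueError(f"unknown message types: {sorted(unknown)}")
    total_cents = sum(
        rates_cents[kind] * count for kind, count in message_counts.items()
    )
    return total_cents / 100


# Hypothetical usage mix for one team in one month (made-up numbers).
example_mix = {"simple": 5_000, "generative": 2_000, "data_driven": 500}
print(f"Estimated monthly cost: ${estimate_monthly_cost(example_mix):,.2f}")
```

In this made-up mix, the 500 data-driven responses cost more than the 7,000 other messages combined, which illustrates why the hosts find consumption-based pricing hard to budget: the bill depends on a usage mix that is difficult to predict in advance.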

Notable Quote:
Paul Roetzer at [32:01]:
"Google is just a strategic move to undercut the market... Microsoft is introducing consumption-based pricing that’s incredibly confusing."

4. Apple Intelligence Falls Flat

Issue:
Apple has temporarily paused its new AI-powered notification summary feature after it inaccurately summarized news from outlets like the BBC. The company acknowledged that the feature was in beta and may contain errors ([47:48]).

Discussion:
Paul and Mike express disappointment, noting that Apple’s AI offerings have consistently lagged behind competitors. They highlight user frustrations and the lack of robust, reliable AI functionalities compared to giants like OpenAI and Google.

Notable Quote:
Paul Roetzer at [48:41]:
"Apple Intelligence is just... an embarrassing offering from Apple."

5. Rapid Fire Highlights

a. Google and MIT’s Titans System:
Researchers introduced Titans, an AI system that mimics human-like memory by combining short-term, long-term, and persistent memory. This innovation aims to enhance AI’s ability to learn and adapt in real-time ([41:34]).

b. Google DeepMind’s F.A.C.T.S Benchmark:
DeepMind launched F.A.C.T.S, a benchmark to evaluate AI’s accuracy in adhering to provided information. Gemini 2.0 currently leads with an 83.6% grounding score, outperforming models like GPT-4 and Claude 3.5 Sonnet ([46:10]).

c. TikTok’s US Ban Drama:
TikTok briefly shut down in the U.S. due to a congressionally mandated ban but resumed service following Donald Trump’s intervention, which proposed a joint venture for 50% US ownership. Perplexity made an unexpected bid to merge with TikTok US, sparking skepticism ([53:07]).

d. USPTO’s Comprehensive AI Strategy:
The U.S. Patent and Trademark Office released an AI strategy focusing on advancing IP policies, building AI capabilities, ensuring responsible AI use, developing internal expertise, and fostering collaboration. They emphasize a human-centric approach, complementing rather than replacing human expertise ([58:43]).

e. Meta’s Use of Pirated Data for Llama 3:
Unsealed court documents reveal that Meta approved using Library Genesis, a book piracy site, to train its Llama 3 model. Executives, including Mark Zuckerberg, attempted to conceal this usage by removing metadata and ignoring copyright headers ([61:54]).

f. MIT Study on LLM Energy Consumption:
An MIT study found that running large language models like Llama consumes significant energy, equivalent to powering 10 to 30 LED bulbs continuously. Additionally, spreading the model across more GPUs increases energy usage substantially, indicating inefficiencies in current AI operations ([65:04]).

g. Google’s NotebookLM AI Hosts Develop Attitudes:
Google’s NotebookLM introduced interactive audio overviews where AI hosts began displaying attitudes like annoyance when interrupted, highlighting challenges in AI behavior alignment ([70:08]).

6. Funding and Product Updates

a. Synthesia’s Series D Funding:
Synthesia raised $180 million to expand beyond AI avatar technology, introducing advanced video creation tools like dubbing and translation, supporting over 230 avatars in 140+ languages ([70:59]).

b. Anysphere’s Cursor Funding:
The company behind the AI coding tool Cursor secured $105 million, increasing its valuation to $2.5 billion. Cursor leverages proprietary models and integrates with OpenAI and Anthropic to enhance coding efficiency through auto-completion and agentic features ([70:59]).

c. Andreessen Horowitz’s Investment in Slingshot AI:
Slingshot AI received a Series A investment led by Andreessen Horowitz, totaling $40 million. The startup focuses on developing AI models tailored for psychology and mental health support, distinguishing itself from general-purpose AI chatbots ([70:59]).

d. OpenAI’s ChatGPT Enhancements:

Tasks Feature: Allows users to set automated tasks like daily news briefings or passport expiry reminders, currently in beta for subscribers.
Custom Instructions Upgrade: Enables users to fine-tune their ChatGPT experience by setting personality traits, communication styles, and specific operational rules ([70:59]).

e. DeepSeek’s DeepSeek-R1 Release:
DeepSeek introduced DeepSeek-R1, an open-source model rivaling OpenAI’s o1 in performance. It offers unrestricted commercial use and is priced significantly lower at $0.14 per million tokens, alongside smaller, distilled models ([70:59]).

f. Microsoft 365 Copilot Chat:
Launched a pay-as-you-go service featuring:

Free Chat Experience: Powered by GPT-4.
AI Agents: Automate workplace processes.
Enterprise IT Controls: Enhance data protection and agent management ([70:59]).

g. Adobe’s Firefly Bulk Create:
A new tool that can edit up to 10,000 images simultaneously, offering features like background removal and resizing. This tool operates on a consumption-based pricing model, requiring users to purchase Adobe Firefly Credits ([70:59]).

h. Luma’s Ray 2 Video Model:
Ray 2 can generate videos with realistic motion, complex physics, and cinematic scenes. Available to paid subscribers with API access forthcoming ([70:59]).

i. Runway’s Frames Model:
Frames offers unparalleled stylistic control and visual fidelity for image generation, allowing users to define artistic styles, compositions, and subject matters ([70:59]).

7. Conclusion and Final Thoughts

Paul and Mike wrap up the episode by highlighting the rapid advancements and challenges within the AI landscape. They emphasize the transformative potential of AI infrastructure, the ethical and legal dilemmas surrounding data usage, and the crucial need for user-centric AI product development. The hosts encourage listeners to stay informed through the Marketing AI Institute's resources and participate in their upcoming AI Mastery Membership Program.

Final Quote:
Paul Roetzer at [77:20]:
"We are entering a very different phase in American business and innovation and the heads of the AI companies are in the first row. So buckle up, it'll be fascinating."

Join the Conversation:
Stay updated with the latest in AI by subscribing to the Marketing AI Institute Newsletter. Engage with over 60,000 professionals and access exclusive content, events, and courses designed to elevate your AI literacy and business strategies.

Thank you for listening to The Artificial Intelligence Show. Visit marketingaiinstitute.com to continue your AI learning journey.

Transcript

[00:00]
Paul Roetzer
We are entering a very different phase in American business and innovation and the heads of the AI companies are in the first row. Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all. Welcome to episode 131 of the Artificial Intelligence Show. I'm your host Paul Roetzer, along with my co-host Mike Kaput. Last week on episode 130, we started off saying it was probably going to be a hectic week because last Monday morning had started off with a bang with a bunch of stuff. And we were correct. It was a very busy week in AI. Mike and I were cutting rapid fire items up until like 10 minutes ago because it was just too much to get through. So it is an action-packed week. We are recording this on January 20th, Monday morning. So I know this week's going to already be busy because we're going to have some executive orders probably repealed from the previous administration, and on AI, we're going to have a whole new day of how AI is approached in the United States at least. So I expect a lot happening again this week. Mike.
[01:43]
Mike Kaput
Yes, sir.
[01:44]
Paul Roetzer
We'll be keeping up with plenty to talk about on episode 132 next week. All right. But for now, we are on episode 131. This week's episode is brought to us by the AI Mastery Membership Program. We've been talking a lot about this lately. I have mentioned that some changes were coming to the program. I decided last Friday that we're going to just go ahead and announce those changes this week, at least preview the changes this week. So what we're going to do is as part of the AI Mastery Membership Program, and again, this is our 12-month membership program with exclusive content and experiences. One of the key components of it is every month Mike and I do an exclusive session for our members. So we do a Generative AI Mastery series that Mike runs where he demos a bunch of technology. We do an AI Trends Briefing and we do an Ask Me Anything session. So each of those happens once a quarter. This Friday is our AI Trends Briefing where we'll go through kind of the last three months and what are the main things? And we usually count down from 10, is kind of the format of it. And so what we're going to do this Friday is we're actually going to open that session up to anyone that wants to attend. So this is usually members only. We're going to make the quarterly AI Trends Briefing for Q1 open to the public. And part of the reason we're doing that is to let people experience it. But the bigger reason is I'm going to lay out our roadmap for AI Academy, for what we're going to do with our Academy, and also introduce a new initiative that's designed to dramatically accelerate AI education worldwide. So anyone who listens to us regularly has heard me say our kind of North Star is to accelerate AI literacy for all. And so that's what I've been working on, is sort of a new project that will enable us to do that, hopefully in partnership with other organizations and associations. 
And so I'm going to explain that vision on Friday and share our near-term roadmap for what we're going to be doing as part of that project. So you can go to SmarterX.ai/ai-mastery and you can register for that free session on Friday. We'll put that link in the show notes as well. So again, that's Friday at noon Eastern time. We'll go through the Trends Briefing and I'll kick it off with like 10 minutes about the vision for this AI literacy program that we're introducing. So again, SmarterX.ai/ai-mastery. And then as we've been sharing recently, you can use Pod150 if you want to join the AI Mastery program and that'll get you $150 off that annual membership. This episode is also brought to us by the AI for Writers Summit. We've been talking about this on the last couple episodes. This is our third annual virtual summit. This is going to be happening Thursday, March 6, from noon to 5pm Eastern Time. We had over a thousand people register last week. We just started promoting it really last week and over a thousand people registered. We had 4,500 plus in 2024, so we're expecting similar turnout this year. We will be posting the 2025 agenda in the coming days. I believe we just kind of finalized that last week. So that's going to be coming soon. You can go to aiwritersummit.com. Again, that's aiwritersummit.com. You can also find information about that directly from the Marketing AI Institute site under the events tab. And then one final reminder. We have our sixth annual MAICON event. The Marketing AI Conference is going to be back in Cleveland October 14th to the 16th. You can go to MAICON.ai, M-A-I-C-O-N dot AI. The key here is we are open for speaker applications. So if you want to speak or if you know someone that would be a great speaker for us to have at MAICON 2025, definitely check that out. Again, go to MAICON.ai, and there is a submit your speaker application button right there on the homepage. Those are open till February 28. 
It looks like we're accepting applications, so get those in early. We review them on a rolling basis, so as they come in, we actually look at those applications. And then if someone's a great fit, we don't wait until, you know, March to let them know. We'll actually reach out to people, sometimes in advance. So get in early, and we would love to hear from you if you've got a great session that you think would be a good fit for that audience. We're expecting about 1,500 people at MAICON 2025. All right, Mike. We got a lot of economic stuff. We got super intelligence. We got it all going on.
[06:18]
Mike Kaput
Yeah, it's a. It's a crazy week, like you alluded to.
[06:22]
Paul Roetzer
Somebody told me one time, like, we start every podcast saying it's crazy.
[06:26]
Mike Kaput
It's gotten worse, though.
[06:27]
Paul Roetzer
It is a crazy week.
[06:28]
Mike Kaput
Crazier. I feel like we gotta go back and look at the first time we mentioned it was a crazy week and just laugh at how probably light it was compared to now.
[06:38]
Paul Roetzer
Yeah.
[06:38]
Mike Kaput
All right, so first up, OpenAI just released what it's calling its Economic Blueprint. This is a policy proposal for how the US should develop and regulate AI. This blueprint makes a pretty bold claim that there is approximately $175 billion in global funds waiting to be invested in AI projects. They also argue that if the US does not attract these funds, China will. To prevent this, OpenAI proposes a comprehensive national strategy that includes developing AI economic zones, creating research labs aligned with local industries, and building what they call a national AI infrastructure highway, a network of power and communication grids specifically designed to support AI development. OpenAI also recommends that the federal government, in consultation with industry, should take the lead in developing, quote, alternatives to the, quote, growing patchwork of state and international regulations that risk hindering American competitiveness. This blueprint also wades into controversial territory around copyright and AI training data. OpenAI argues that AI developers should be able to use, quote, publicly available information, including copyrighted content, to develop their models. And this all comes at an interesting time. We'll talk about this as well in the next topic. But OpenAI CEO Sam Altman has scheduled a closed-door AI briefing for US government officials on January 30th. Paul, why are we getting this blueprint now? And maybe talk a little bit, or tee up a little bit, what's the deal with the closed-door briefing for lawmakers? Is it about this? Is it about something else?
[08:33]
Paul Roetzer
We've been talking a lot about infrastructure, especially in the last like six to eight months on the podcast. I think we've been, you know, trying to introduce that topic for people who maybe aren't paying as close attention to that side of AI. It's very fundamental to what happens next. And so it's not new. I mean, OpenAI has been very aggressively meeting with lawmakers for the last couple years. There's been lots of conversation around trying to make us a leader in the build-out of data centers and the infrastructure to power AI. But I think with the new administration coming in, everyone's lining up to sort of get their messaging in place and build the relationships they need to build and have a say in kind of what happens next. So my guess is the January 30th meeting is just timing. The new administration is, you know, coming into power today in America. January 20th will be the inauguration. So two weeks from now you've got, you know, Congress, Senate, President, everybody's kind of set up now and it's time to get to work. So the thing that was interesting to me here is, as a kind of journalism school major, I always drill into data points: where's this coming from? Because this whole thing is basically centered on this 175 billion. And so in the second paragraph, they say shared prosperity is as near and measurable as the new jobs and growth that will come from building more AI infrastructure like data centers, chip manufacturing facilities and power plants. As our CEO Sam Altman has written, AI will soon help our children do things we can't. Not far off is a future in which everyone's lives can be better than anyone's life is now. So I think this is interesting because there's a lot of concern, that I have shared many times on this podcast, that AI is going to displace jobs. I believe that very deeply. 
I think this is the setup for how these companies make the government believe there's a net positive outcome if the government invests properly in infrastructure. So the 175 billion. So in that opening paragraph it says new jobs and growth. If you click on that link it actually takes you to the September 2024 OpenAI report called Infrastructure Is Destiny: Economic Returns on US Investment in Democratic AI, which I assume we talked about at that time, like I said, in September of '24. So I'm guessing we at least mentioned that. I didn't go back and look and see the extent to which we talked about it. So when you go into that report, though, from September 2024, this is the source of the 175 billion. It says capital spending on AI already rivals the mainframe era of the late 1960s and the fiber optic deployment of the late 1990s, with an estimated 175 billion in global infrastructure funds waiting to be committed. Now, in that report, the citation for the 175 billion actually comes from the Houlihan Lokey Digital Infrastructure Industry Update, Q2 2024. I have never heard of Lokey. I had not gone to that report prior to this. It's a pretty dense report on digital infrastructure. But OpenAI is citing Houlihan Lokey to come up with 175 billion. And when you drill into that 175 billion, you need, like, the o1 reasoning model from OpenAI to understand how they come up with that number. But the whole point here is, like, don't just accept data at face value too much. I think we've gotten to the point with Twitter and social media, and like even mainstream media does it to a degree, where everyone latches on to these numbers with no concept of where the number actually originated from or how, like, legitimate that number is. I am not saying the 175 billion isn't reasonably accurate. I'm not even saying it's not underestimated, that it's not a trillion. I have no idea. But this is kind of where we follow it. So then if you. 
So now you understand, like, OpenAI is basically building on top of other people's research data to justify the opportunity that exists now in the Infrastructure Is Destiny OpenAI report, which basically laid out how data center build-out will create all of these jobs in America and accelerate the growth of GDP, gross domestic product. So in that report it says that each 5 gigawatt data center will have 2 million GPUs. So Nvidia will be very happy because 2 million GPUs per data center. Each one will cost 100 billion in 2028 to build. So they're already projecting out three years from now, and it'll create 14,000 construction jobs and 40 billion in annual revenue per data center. Then to operate those data centers, you're looking at an estimated 4,000 employees per data center. So the whole point of this is they think data centers, which are needed to build the future AI models and deliver all the AI that we need at inference time, all this intelligence we need at inference time, when you and I use our smartphones and use ChatGPT and things like that, is a really big deal. And it's going to be a massive driver of employment and GDP, specifically in the states where the data centers are built. And that Infrastructure Is Destiny report from OpenAI actually breaks down by state how much money could be generated, how much GDP could grow and how many jobs could be created in those states. So I think that the basic premise here is they're making this massive bet on infrastructure. They believe they're going to build insanely intelligent models and that those models are going to need more and more data centers. Now in the blueprint, the one other aspect is they start, like, connecting this to the past. And I thought this was really interesting historical context. I hadn't read about this prior, but they talk about how, like, when cars were first invented, the UK actually put something in place called the 1865 Red Flag Act. 
When a car was coming down the street, they had a flag bearer that had to walk in front of the car to warn people that the car was coming, and the car had to move aside in favor of horse-drawn transport. So they're sharing this as like a lesson of let's not over-regulate things, let's like accept that change happens and it may look weird at first, but that we shouldn't actually restrict this. That's what's happening in the EU. They're saying, like, we can't go that path, we have to push forward. Chips, data, energy and talent are the focus of it. And then the one other note that I think is really interesting: the economic blueprint, the actual full-blown report that we'll link to, states in the very opening, OpenAI's mission is to ensure that artificial intelligence benefits everyone. This is the first time I've actually seen them drop general from that. They usually say artificial general intelligence. To us that means building AI that helps people solve hard problems, because by helping with the hard problems, AI can benefit the most people possible. Now the timing here is interesting, Mike, because you and I had touched on this. But o3 is built to solve hard problems. Like, these reasoning models aren't for the average user to go in and ask, like, about a summary of sports events from last night or do some basic research. These things are designed to solve hard math and science problems. So the MIT Technology Review comes out with an article that says OpenAI has created an AI model for longevity science. In that article, it says when you think of AI's contributions to science, you probably think of AlphaFold, the Google DeepMind protein folding program that earned its creator a Nobel Prize last year. Now OpenAI says it's getting into the science game too, with a model for engineering proteins. The company says it has developed a language model that dreams up proteins capable of turning regular cells into stem cells and that has handily beat humans at the task. 
The work represents OpenAI's first model focused on biological data and its first public claim that its models can deliver unexpected scientific results. You remember the five levels of AI at OpenAI. Number four is innovators, creating new solutions to problems, basically. As such, it is a step toward determining whether or not AI can make true discoveries, which some argue is a major test on the pathway to AGI. The protein engineering project started a year ago when Retro Biosciences, a longevity research company based in San Francisco, approached OpenAI about working together. That link-up did not happen by chance. Sam Altman, the CEO of OpenAI, personally funded Retro with $180 million, as MIT Technology Review first reported in 2023. Retro's goal is to extend the normal human lifespan by 10 years. So I think that a lot of things are happening here. One is the new administration and OpenAI trying to kind of stake their claim and influence it. Two is they truly believe infrastructure is destiny, that to achieve the kind of intelligence they plan to achieve and to have the impact on the world they want to have, they need to build this infrastructure out. Three, they're seeing massive gains in their reasoning models like o1, moving into o3 and eventually o4. And they see the ability for these things to start solving really hard problems in society. And I think they want to prepare the government and the world for that, which I believe they think is very near to happening. That was a lot.
[18:18]
Mike Kaput
Funnily enough, the only thought I could keep thinking while I'm looking at this January 30th meeting is Sam Altman better not show up. And suddenly Elon's in the room too.
[18:28]
Paul Roetzer
Oh, trust me, I was thinking about that all weekend. There's no way Elon's not in the room, which is going to be the weirdest. I pray that that is somehow broadcast. Like, I saw the pre-party for the inauguration last night. There was a clip on Twitter I saw of Jeff Bezos and Elon Musk standing there talking to each other, which, we won't get into the whole history of those two, but they've been very oddly friendly on Twitter lately. Like Bezos's Blue Origin rocket company successfully put something into orbit last week and Elon actually tweeted like congratulations, great job. Jeff then replied, hey, great job to you too. It's like something weird is happening. Like the two richest people in the world are now like turning into buddies, it seems. And yes, I keep thinking, like, Elon is going to be in whatever meeting Sam is at, and I don't know that those two have been together in person, yeah, in a room where they have to speak to each other, for a long time.
[19:28]
Mike Kaput
Our second big topic is really closely related to what we just discussed. So there are some major developments, it sounds like, that are brewing at the main AI labs, and multiple signals seem to be pointing towards some big upcoming announcements, maybe from OpenAI. So there was a breaking report from Axios the other day that said, quote, architects of the leading generative AI models are abuzz that a top company, possibly OpenAI, in coming weeks will announce a next-level breakthrough that unleashes PhD-level super agents to do complex human tasks. Axios goes on to say, quote, the expected advancements help explain why Meta's Mark Zuckerberg and others have talked publicly about AI replacing mid-level software engineers and other human jobs this year. Now Axios is hedging its bets on whether or not this is from OpenAI or another lab, though they do mention, like we just talked about, Sam Altman's closed-door briefing with government officials on the 30th. And at the same time another development is confirmed by Altman himself. He posted on X that the company is finalizing the o3-mini model for release in approximately two weeks. Altman has noted that while this model is not as capable as o1 pro, it is significantly faster and will launch simultaneously both with their API and on ChatGPT. It also sounds like o3-mini might be accessible in some form by ChatGPT Plus users based on some of his replies to the initial posts. So Paul, this, like, super agent news. I feel like we're just gonna see this term all over for some reason. This could be in reference to, like we talked about, an OpenAI release codenamed Operator that was rumored back in episode 124, or it could be something totally different. Like, what is most likely being referenced here? And, like, PhD-level super agent feels a little more aggressive than some of the talk we've heard about agents in the past.
[21:36]
Paul Raitzer
I think it's likely o3, but it's probably more likely the test-time compute, where they're seeing the scaling law accelerating. So if we remember back to, it's so weird to say, the historical large language models of the last two years: contextually, the whole premise there was give them more Nvidia chips to train on, give them more data, give them more time to learn, and they became much larger, much more intelligent, much more generally capable. And so that took us from GPT-1 to GPT-4, where we scaled this law where we just give them more data, more chips, and they got bigger and smarter. Then with o1 in September, I guess we got o1, right? Is that right? Yeah. December '24, we were introduced to this test-time compute, this idea that if you just give them more time to think at inference, so when you ask the question, we give them more time, they seem to get smarter even if they're not massively bigger. By allowing them time to think, to think harder, they actually just start performing way better. And so it does seem, based on a lot of different things I've been seeing on Twitter, that that seems to be playing out, and maybe even faster than people thought: by giving these things more time, they're starting to perform at these PhD levels. So I'll go through a quick series of tweets, because this started on Friday, so this is just three days ago. Noam Brown, who we've talked about a number of times on the podcast, we did a feature on him. He was at Meta and now he's at OpenAI working on reasoning. But he was the guy who kind of solved Texas hold'em poker, where by giving the AI time to think, it became superhuman at poker. And so he's applied that line of thinking now to building these models. So he tweeted, and we'll put the links to all these tweets in if people want to follow along: Lots of vague AI hype on social media these days.
There are good reasons to be optimistic about further progress, but plenty of unsolved research problems remain. We have not yet achieved superintelligence. And then he was kind of replying to people. When someone said, hey, could you tell us more about this, he said: between the o1 announcement, the o3 announcement and various podcast talks, I think we've said a lot. We believe o1 represents a new scaling paradigm and we're still early in scaling along that dimension. Then also on Friday, January 17, Altman tweeted: thank you to the external safety researchers who tested o3-mini. We have now finalized a version and are beginning the release process, planning to ship in a couple weeks. Also, we heard the feedback. We'll launch API and ChatGPT at the same time. Then someone asked, what specifically about it is, you know, good? And he just said, it's very good. o3 is much smarter than o1. We are turning our attention to that now, and o3 Pro, with the mind-blown emoji. And then someone said, oh, is o3 Pro going to be $2,000 a month? He said, no, you'll get it for the same $200 a month. Then Noam Brown again, this is a little bit later, and this was actually on Sunday the 19th, so two days later. He said, it can be hard to feel the AGI, which is a term we've shared on the podcast before. It's kind of like the vibes of AGI. Like, people in these labs are just like, do you feel the AGI? He said it's hard to feel the AGI until you see an AI surpass top humans in a domain you care deeply about. Competitive coders will feel it within a couple years. And then he's referencing Paul Schrader, who I'll get to in a second. He says Paul is early, but I think writers will feel it too. Everyone will have their Lee Sedol moment at a different time. Lee Sedol is a reference to AlphaGo; he's the Go champion. So I think a lot of people who listen to the podcast have heard us talk about this. But watch the AlphaGo documentary, you'll see what we're talking about.
It's free on YouTube. Lee Sedol was defeated by the AlphaGo system built by Google DeepMind at a time when most people didn't think an AI could defeat a Go champion. So this Noam tweet is in reply to someone who shared a post, I think it was a Facebook post, from Paul Schrader, who's an American screenwriter. He wrote Taxi Driver for Scorsese, and then he later co-wrote Raging Bull and a bunch of other popular movies. He posted: I've come to realize AI is smarter than I am, has better ideas, has more efficient ways to execute them. This is an existential moment, akin to what Kasparov felt in 1997 when he realized Deep Blue was going to beat him at chess. Someone then said, what brought you to this conclusion, Paul? And he replied, I asked it for Paul Schrader script ideas. It had better ones than mine. So this reinforces what we talked about on episode 130, which is, forget about all these evals. These research labs talk about all these really hard evals. Is it PhD level in math, and is it PhD level in biology? Who cares? What matters is that Paul Schrader, a legendary screenwriter, now believes the thing is better at his job than him.
[27:01]
Mike Kaput
You're right.
[27:02]
Paul Roetzer
That's what matters, when it starts to affect our jobs. So this then leads into the last two tweets I will mention. The first one is this Axios article, and this is the tweet from one of their editors: We've learned OpenAI CEO Sam Altman has scheduled a closed-door briefing for U.S. government officials on January 30, which wasn't news, because OpenAI had put that in their Economic Blueprint. But anyway: with people inside and out of the government telling us AI insiders believe a big breakthrough on PhD-level super agents is coming. So that was what everybody went nuts about on Sunday, on the 19th. It's just everywhere. Sam then tweets the morning of January 20th: Twitter hype is out of control again. We are not going to deploy AGI next month, nor have we built it. We have really cool stuff for you, but please chill and cut your expectations 100x. So my overall takeaway here: things are likely advancing far faster than people realize or are prepared for. That much I'm fairly confident in. They just probably aren't advancing as quickly as the hype on Twitter might make you believe, when an Axios headline about super agents shows up and people go crazy, and then like three hours later everyone thinks that Sam's going to introduce superintelligence to Congress on January 30th. And that's likely not what's going to happen, but there's a decent chance he may show like an o3 preview, like an o3 Pro preview, with their projections of what o4 and o5 could look like. That's a distinct possibility, and that is earth-shattering. Again, I feel like we're becoming so numb to these advancements that it's hard for people to process what that could mean if we truly do start having these PhD-level agents on demand for $200 a month for whatever profession you want to pick.
[28:56]
Mike Kaput
Two of the top replies to Sam's tweet, which went out at 3:32 a.m.
[29:02]
Paul Roetzer
Is that when it was?
[29:02]
Mike Kaput
I knew it was the morning, but yeah, real early. Two of the top replies, I think, are hilarious but also kind of indicative of what moment we're in. Someone first replied, Superintelligence on Tuesday at 10 a.m. Pacific Time, per Axios. So these Axios headlines are getting out of control. And then my favorite was someone who just asked, when are we getting the ChatGPT meme coin? So you know, when are we getting.
[29:26]
Paul Roetzer
Their crypto project. Which, if you followed the news at all on Saturday and Sunday, the meme coin thing, honestly, Mike, was unreal. I was going to take a minute and have you explain meme coins to me, and I just went to ChatGPT myself. And the whole point is, if you didn't follow this, I don't want to get into this, but Trump launched a meme coin that made him like $60 billion in like five hours or something like that. And then I think it crashed when they announced a Melania meme coin later that day or something like that. So it's like this cryptocurrency thing. And I'm not an expert on this, I'm not even going to try and explain this, but it makes as little sense as you would think, basically. So if you go do the research, it's like vaporware. There is nothing. It's just hype, and people launch these meme coins. But yes, that is funny. An OpenAI meme coin: they could raise all the money they need for their infrastructure if they just launched a meme coin.
[30:19]
Mike Kaput
No kidding. All right, our third topic this week, bringing it back down to earth a little bit. Two of the biggest players in AI have made some pretty significant changes to their pricing strategies. So both Google and Microsoft are revamping a bit how they package and charge for their AI products. So first up, Google announced it is basically giving away Gemini to business and enterprise customers. It's adding it by default to all Google Workspace business plans. The catch is this comes with a small price increase. So previously a Workspace Business Standard plan with the Gemini add-on cost $32 per user per month. Now it will be $14 per user per month, but that's a $2 per month increase from the previous Standard plan without Gemini. Microsoft, however, is taking a slightly different approach. They are keeping their premium Copilot Pro license at 30 bucks a user a month. They are, however, introducing new consumption-based pricing for certain AI agent features that they say, quote, can automate workplace processes. According to some reporting from The Information, quote, under the new consumption pricing, one message within 365 Copilot Chat costs roughly $0.01, while messages that require the chatbot to create a lengthy answer using generative AI cost $0.02, and messages that require the chatbot to draw on other data from other applications cost 30 cents. So Paul, maybe first walk me through Google's move in particular. Like, what are they trying to achieve with the new pricing, and is it going to work?
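To make the consumption-based pricing concrete, here is a minimal Python sketch of how a monthly bill might be estimated from the three per-message prices quoted above. The category names (`basic`, `generative`, `grounded`) and the usage mix are hypothetical and chosen only for illustration; only the three prices come from the reporting cited in the episode.

```python
# Per-message prices as quoted in the episode (attributed to The Information).
# The category names are hypothetical labels, not Microsoft's terminology.
PRICE_PER_MESSAGE = {
    "basic": 0.01,       # an ordinary Copilot Chat message
    "generative": 0.02,  # a message that needs a lengthy generated answer
    "grounded": 0.30,    # a message that draws on data from other applications
}

def monthly_cost(message_counts):
    """Estimate one user's monthly bill in dollars from a message mix."""
    total = sum(
        PRICE_PER_MESSAGE[kind] * count
        for kind, count in message_counts.items()
    )
    return round(total, 2)

# Hypothetical usage mix for one user in one month (not from the episode).
example_usage = {"basic": 200, "generative": 100, "grounded": 50}
print(monthly_cost(example_usage))
```

Under this made-up mix, the 50 messages that pull in outside data account for most of the bill, which is one way to see why usage-based pricing is hard to forecast.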
[32:02]
Paul Roetzer
Yeah, I just had a funny thought on the Microsoft one, but we'll come back to that. So I think the Google move is two things. One, it's just a strategic move to undercut the market: you know, ChatGPT Enterprise and Team, Microsoft Copilot. We've said this all along, Google has a bunch of advantages here. One of them is their resources: their compute power, their own data centers, their own chips. They have all this stuff that they can throw at their competitors. Now Microsoft has similar stuff, but Microsoft doesn't have their own models. They're using OpenAI's models. OpenAI doesn't have these things. OpenAI also doesn't have the distribution of Google Workspace. They don't have the distribution of Gmail. So Google's got some plays to make. And this seems like one strategic play: let's just undercut the market and give this away. The second part to Google's strategy could be, and I'm just hypothesizing here... it's impossible to know how many people pay for Google Workspace for Business. I dare you to try it. Pick your favorite AI tool: Google Deep Research, ChatGPT, Perplexity. No one can answer for you how many people pay for Google Workspace for Business. It's not in their earnings transcripts. It's nowhere. So all we have are estimates here. Based on best estimates, there are 8 million companies that pay for Google Workspace for Business, including Marketing AI Institute and SmarterX. We are a paying customer, have been for years. The rest of this I'm just going to hypothesize. Let's say there's 8 million companies, and this is two weeks in a row I've had to do math, Mike. So this is hard work for Monday mornings. 8 million companies. And let's just say out of those 8 million, there's 50 million users that are paying this $12 a month or whatever that number is. So we're just going to assume 50 million people are paying every month to use this product.
Now, if 5% of those 50 million users chose to upgrade to Gemini, that's 2.5 million. Yeah, that's 2.5 million users who are paying 20 bucks a month for Gemini. That's about $50 million a month, or 600 million for the year, being generated by people paying $20 a month for Gemini. So let's just assume that's what Google's currently making. I'm making up the 50 million number. I'm making that number up. If instead they say, hey, we're going to give you Gemini for free, but we're going to increase your standard plan $2 a month, well, if you have 50 million users who you went from $12 to $14 a month on, that's a hundred million a month in revenue, or 1.2 billion per year. So by giving Gemini away, basically, but charging everyone $2 a month rather than them opting in for the extra $20 a month, they just made an extra $600 million. So now again, I don't know if the math's there. As my son would tell me, my math isn't mathing. He tells me that all the time. But if the numbers are in this rough range, you could see how, one, this could undercut the market. Two, it might actually just be a smart financial move for Google to just charge everyone $2 whether they use the tools or not. Now, on the consumption-based pricing, my first thought is, if I have to reread your pricing four times to comprehend what it is, it's probably not going to work. It's not following the simplicity rule. At a higher level, though, I think the key here is we're just going to see a ton of experimentation. Nobody knows what to charge. We talked a couple episodes ago about how Sam kind of picked $200 a month out of the air for o1 and realized they're losing money on it, because he just kind of guessed at what would be a profitable number and he was wrong. So you're going to see a lot of experiments. I've mentioned this before, but my former agency that I sold in 2021 was HubSpot's first partner in 2007, and we went through dozens of changes to their SaaS pricing model over the years.
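The back-of-envelope Workspace arithmetic from this answer can be checked with a short Python sketch. Every input here is the speaker's own hypothetical (he says outright that the 50 million user figure is made up), and the variable names are introduced only for illustration.

```python
# All inputs are the hypothetical figures used in the episode, not Google data.
USERS = 50_000_000         # assumed paying Workspace users (explicitly made up)
ADDON_UPTAKE_PERCENT = 5   # assumed share of users who bought the Gemini add-on
ADDON_PRICE = 20           # dollars per user per month for the old add-on
PRICE_INCREASE = 2         # dollars per user per month added to every plan

# Scenario 1: Gemini sold as an opt-in add-on.
addon_users = USERS * ADDON_UPTAKE_PERCENT // 100
addon_annual = addon_users * ADDON_PRICE * 12

# Scenario 2: Gemini bundled, every user pays $2 more.
bundled_annual = USERS * PRICE_INCREASE * 12

print(f"Add-on users:      {addon_users:,}")
print(f"Add-on revenue:    ${addon_annual:,} per year")
print(f"Bundled revenue:   ${bundled_annual:,} per year")
print(f"Difference:        ${bundled_annual - addon_annual:,} per year")
```

Under these assumptions the bundled plan brings in twice the add-on revenue, and the gap shrinks as the assumed uptake rises: at 10% uptake the two scenarios would be equal.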
And so I think that we're just in this new phase where these companies that are selling AI aren't really sure how to charge for it and how to make money on these models and on the services. Now, one final note here, I thought it was just so well written. I have no idea who this guy is. His name's Timo Springer. We'll put a link to this tweet in the show notes. He has 300 followers on Twitter, so it's not like this is some influencer that everyone just listens to. But I saw his tweet and I thought it was so well done, and it's representative of what I was explaining last week with the issues with Google Workspace. It's representative of ChatGPT Team. It's representative of all these models. And the reason I think we should pay attention to this is he actually got replies from the head of product for ChatGPT and the head of engineering for ChatGPT, because he tagged some people and they apparently saw it. So here's the Timo tweet. ChatGPT is a confusing mess right now. It seems like a few months ago they embraced a new product strategy, maybe when their chief product officer joined, which is good, but there's still lots of legacy features like GPTs that feel really out of place with newer releases like Projects and Tasks. What bothers me the most is that even for power users, it's extremely difficult to know which tool currently works with which model on which platform: web, mobile, desktop. The feature matrix is incredibly complex. In the normal chat, I can connect Google Drive, but this doesn't work as a data source for GPTs or Projects. Advanced voice mode can access custom instructions and memory, but doesn't work with Projects or GPTs. o1 can now handle file uploads too, but only images. When I upload a PDF to the chat, only the text contents are analyzed, but if I upload screenshots of the PDF pages as images, these can be analyzed. Projects can also use GPT-4o as a model.
I could list at least five more things that are similarly annoying in daily use. I'm an absolute power user, and even I sometimes struggle to keep track of everything. I wish the product team at OpenAI would focus more on removing all these complexities from the product. And then he followed up with a comment. He said, a wonderful example: ChatGPT Enterprise now supports reading and understanding visuals, images, graphs, diagrams embedded in PDF files. Users can upload a PDF and ChatGPT can interpret the text and any visual elements within that file. Cool, but it is not currently available for GPT-based projects. So we can do this thing, but I can't do it in GPTs, which is what I use all the time.
[38:20]
Mike Kaput
Time.
[38:20]
Paul Roetzer
These things are so incredibly confusing in daily use for users. In my personal account, ChatGPT cannot analyze images and PDFs. In my business account it works, but not when I upload the PDF as knowledge to my GPT. So as I said, this is the experience we're all having. If you feel confused, this is a great example of someone who's in here power using all day long, and the abnormalities aren't actually your fault. It is a fault of the company and how fast they're moving, and they're not solving for the end user. And as I said with Google, the issue seems to be these companies spend so much time solving for developers, and yet all their revenue is coming from enterprise users. So the head of product for ChatGPT said, thank you, extremely top of mind. We will fix this. The head of engineering for ChatGPT said, yeah, we have to make it simpler and we'll do so. So whatever pricing model you want to have, just make it so it's actually user friendly, not what we currently have with all these platforms.
[39:21]
Mike Kaput
And one final question I had for you around consumption-based pricing. I'm by no means a business or enterprise finance expert, but how on earth do you even budget for usage-based pricing? Like, I don't even know the usage of a tool that we would pay 20 bucks a month for.
[39:40]
Paul Roetzer
No idea. And you get so many surprise bills where people are like, I didn't realize that. And then you've got to put all these caps in place for usage. Yeah, I just can't see an enterprise allowing the variability of pricing when the CFO is like, I don't even understand how the product's going to be used. It's not going to work. Like, it's a great idea, but good luck.
[40:01]
Mike Kaput
Let's jump into this week's rapid fire. We got a few big things going on. So first up, researchers at Google, MIT and some other institutions have unveiled an AI system called Titans that fundamentally reimagines how AI can learn and remember information. This system represents one of the first major attempts to give AI the kind of nuanced memory capabilities that we as humans kind of take for granted. The key innovation is what the researchers call neural long-term memory. This is an AI component that can actively learn and adapt while it's being used, not just during initial training. So much like how humans form memories based on surprising or unexpected experiences, the system pays special attention to information that violates its expectations, storing those memories for future use. Titans is particularly notable in how it combines three different types of memory: short-term memory for immediate tasks, long-term memory that continues learning from new experiences, and what they call, quote, persistent memory, which maintains core knowledge about tasks. So this kind of mimics how human memory, as we understand it, works. We have different systems for different types of information. So Paul, I guess my big question here, just on a surface read of this study, is how big a deal is this? Because I've seen some people call this basically the successor to Transformers, which was one of the most important developments in modern AI.
[41:34]
Paul Roetzer
Quick background if Transformers is new to people. In 2017, the Google Brain team released a paper called Attention Is All You Need, in which they invented the Transformer. It was building on prior research, but they were kind of credited with the creation of the Transformer. That is the T in GPT: generative pre-trained Transformer. And Transformers for the last, what are we on now, roughly almost 8 years, have really continued to be the basis for the acceleration of these models. That's what language models are built on. It's what enables everything that we've kind of seen to date. If you listen to Yann LeCun, Demis Hassabis, other leaders in the AI space, there does seem to be uniform belief that a number of breakthroughs are needed to get to the next level of intelligence. And so at any time a research paper may emerge that is one of those breakthroughs. Titans might be one of those breakthroughs. You don't often know right away. Even when the Attention Is All You Need Transformer paper came out in 2017, Google admittedly didn't realize the significance of their own invention until later the next year, when they started to actually try and productize it. By that point, OpenAI was starting to work towards building GPT-1. So some believe that OpenAI actually figured out the significance of the Transformer paper before Google did. And so we don't know. You know, this might be one of those ones where we look back in two years and say, oh, on episode 131 we talked about that Titans paper, and look at that, they just invented a whole new kind of model based on it. But this is why we pay attention to the research papers. You know, Mike and I spend a lot of time kind of monitoring the influencers in the space and seeing which papers they're talking about, which ones are getting a lot of attention and citations, because that often is kind of a hint at what might be something of significance down the road. So definitely worth keeping an eye on.
And it came out of the Google research team. My guess is they're not going to make the same mistake twice with releasing breakthroughs in AI models. So if this came out December 31, 2024, it's probably something that they internalized long before that and have already figured out how to apply, maybe already building it into models. Back in 2017, there was a much more open research model within the AI community, where you published your breakthroughs. That stopped after ChatGPT. It basically slowed, not completely to a halt, but the amount of papers being published where they were putting out the new stuff was pulled back dramatically in favor of product development after ChatGPT.
[44:11]
Mike Kaput
Our next topic also involves some work from Google. So Google DeepMind just unveiled a new tool for measuring one of AI's biggest weaknesses, which is its tendency to make things up. This is called FACTS Grounding; FACTS is an acronym, F-A-C-T-S. It's a new benchmark that sets out to do something that has been surprisingly difficult until now, which is determining just how well an AI system sticks to the truth when it's answering your questions. At the heart of this system that Google has built is a collection of over 1,700 carefully designed examples that challenge AI models to do something that we might find deceptively simple: read a document and answer questions using only the information provided. Now, what makes this particularly clever is how it works. Each response is evaluated not just by one, but by three of the most advanced AI models out there today: Gemini 1.5 Pro, GPT-4o and Claude 3.5 Sonnet. So all the work from this project has been put into a public leaderboard that Google has launched to track how different AI models perform on these tests. Right now, Google's experimental Gemini 2.0 model ranks number one with 83.6% grounding. Google models also occupy spots number two and three and are followed pretty closely by Claude 3.5 Sonnet and GPT-4o. So, Paul, this seems pretty notable. It sounds like we now at least have some visibility, based on Google's methodology and tests, into which models are the most accurate at retrieving information that is actually in a document or a source that you're referencing. Does this mean we're getting closer to solving hallucination?
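As a rough illustration of the multi-judge idea described above, here is a minimal Python sketch. It assumes each of three judge models returns a simple grounded / not-grounded verdict per response, and that the final score is the average of each judge's pass rate. Both of those are assumptions made for illustration; the episode does not describe how the benchmark actually combines judge scores, and the judge names and verdict data below are hypothetical.

```python
from statistics import mean

# Hypothetical stand-ins for the three judge models.
JUDGES = ("judge_a", "judge_b", "judge_c")

def grounding_score(verdicts):
    """Average the per-judge pass rates.

    verdicts: one dict per model response, mapping judge name -> True if
    that judge considered the response fully grounded in the document.
    The averaging rule is an assumption, not the benchmark's stated method.
    """
    per_judge = [
        mean(1.0 if response[judge] else 0.0 for response in verdicts)
        for judge in JUDGES
    ]
    return mean(per_judge)

# Made-up verdicts for four responses, purely illustrative.
sample_verdicts = [
    {"judge_a": True,  "judge_b": True,  "judge_c": True},
    {"judge_a": True,  "judge_b": False, "judge_c": True},
    {"judge_a": True,  "judge_b": True,  "judge_c": True},
    {"judge_a": False, "judge_b": False, "judge_c": False},
]

score = grounding_score(sample_verdicts)
print(f"Grounding score: {score:.1%}")
```

Using several judges rather than one is a way to reduce the chance that a single model's quirks, including favoring its own outputs, decide the ranking.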
[46:10]
Paul Roetzer
It could. I mean, I think anyone who uses NotebookLM or Google Deep Research can probably experience this at work, where it cites right within the source doc. So I do think that at some point we largely solve hallucination. The question becomes, as always, I guess in society we have this issue: what is the source of truth? Like, what is truth? It unfortunately can't always be agreed upon, and so a hallucination to one person might be fact to another person. So assuming we can get around that and we actually agree on sources of truth, if it is documents provided, fine. That's an easy source of truth. If I'm giving you the 50 documents and saying I just want facts based on these documents, I want an accurate relation to those. And my money would be on Google 10 times out of 10 being the one that leads in this, given their history and their business model in search and retrieval. So I would be guessing that we will continue to see progress being made here. I've heard Demis Hassabis talk about this exact problem, as has Sundar Pichai, so I think that Google team is very focused on solving this. And I do think in one to two years we may not have complete elimination of hallucination or inaccuracies, but humans certainly have plenty of hallucinations and inaccuracies. I would imagine we will be at superhuman levels of accuracy from these models within the next one to two years. I don't think there's any scientific obstacles to that being done. It's just kind of a brute force thing. They've got to keep working through and finding ways to solve it, but it seems like they're on the right path.
[47:48]
Mike Kaput
Our next topic concerns another major player in AI, but it is not exactly positive news. So Apple is temporarily pausing a new AI-powered notification summary feature for news and entertainment. They're pausing this after the feature, which is powered by Apple Intelligence, inaccurately summarized content from news outlets. The most notable incident sparked some criticism from the BBC, which saw Apple incorrectly summarize its coverage of the UnitedHealthcare shooter. While Apple makes it clear that the summaries are in beta and, quote, may contain errors, this move also seems to acknowledge that the tech just needs some more work. Paul, I'm just going to let you take this and run with it, because you are a huge Apple fan and this is just not what people expect from this company.
[48:41]
Paul Roetzer
I don't think... yeah, I don't get it. So I've had Apple Intelligence now for, whatever, two months. The only time I ever use it is, I guess, Siri. You know, it sometimes gives different responses, or at least connects to ChatGPT now if it doesn't know the answer. But I mess around with, I don't even know what they're called, Genmojis, which I think is different than Image Playground. There's these two things that are native in there now, and I'm not really ever sure which one I'm using, but in text messages to my son I'll create, I guess they're Genmojis, of him in different outfits and stuff like that. That is literally, I guess, the only function of Apple Intelligence I use. And so with the amount of ad money and hype, it is an embarrassing product launch, and on top of the Apple Vision Pro, which is insane technology that has had zero support since the product came out. It's literally sitting next to me collecting dust at the moment. It's like two major failed product launches in a row. It hasn't really affected their stock price. People are still bullish on Apple. I'm still bullish on Apple, but it's highly out of character. Just for fun, I put a poll on LinkedIn. I put it on Twitter too, but I don't get a ton of engagement on Twitter, so we'll go to LinkedIn. I said, what is the most disappointing AI product so far? Hype versus reality. Wish I wasn't limited to four choices. Write-in is welcome in the comments. So this had 599 votes. It's actually still open for a couple hours. The leader is Apple Intelligence at 46%, Microsoft 365 Copilot 36%, Agentforce from Salesforce 9%, and then Google Gemini for Workspace 8%. Now obviously this is not a scientific study. This is kind of more based on overall vibes, I would guess, because there's a chance some of you don't use these products. So it's hard for you to say Copilot is worse than Apple Intelligence if you've never used Copilot.
So it was more of just kind of a fun thing to put out there and get some general responses to. But yeah, Apple Intelligence is just, I mean, the more time I spend with it, the more disbelief I have that this is the product they put out. And historically they play catch-up a lot, like fast followers, and they may end up building this amazing experience into the phone sometime in, you know, 2026, 2027, I don't know. But right now it is a pretty embarrassing offering from Apple.
[51:07]
Mike Kaput
Yeah. And right now it's embarrassing, but I feel like it becomes really dangerous to them once someone actually reinvents an AI first device. Right. Because like, for instance, I was listening to an end of year episode on the Tim Ferriss podcast where he had his friend investor Kevin Rose on, and they were just speculating at one point, like, there are a lot of people they've talked to in Silicon Valley where it's like, why can't I just have a phone with like one button where it says that's just really smart AI that loads everything that I need? Like, I don't need all these apps.
[51:38]
Paul Roetzer
You mean like a Rabbit?
[51:40]
Mike Kaput
Yeah, something that actually works, though. It's like, we haven't yet reinvented a mobile device that's really AI-first.
[51:50]
Paul Roetzer
So being definitely swung the door open on this one. Yeah. And again, like, I mean, but so much. Or I'm sorry, not Google, Apple has, you know, but Apple's devices are so much more than that. But I use the advanced voice in ChatGPT 10 times out of 10 over Siri. Sometimes Siri is faster, because I can say, hey, Siri, on my phone, and it opens it, and it's like, this is a basic thing, Siri should be able to handle this. I don't need to go into my ChatGPT app. But if it's anything of actual value, or that requires any actual reasoning or thought process, I'm going into ChatGPT every time. I'm not going to talk to Siri. And my kids' is like a generation of kids who just think Siri's stupid. They never ask Siri anything, other than turn the music off or something like that. So, yeah, I don't know, it'd be interesting to see what happens. They definitely have just faltered multiple times here. Just totally fumbled the Apple Intelligence thing. And I don't know, maybe this spring they'll come out with something significant, but I don't have great optimism at the moment based on what they've delivered so far.
[52:56]
Mike Kaput
Our next topic, let's Talk quickly about TikTok. TikTok has had a crazy couple of days, so we wanted to quickly run down what's going on with TikTok, given.
[53:07]
Paul Roetzer
It will have changed five times, probably, by the time you actually hear this. Yeah, we have to say that more.
[53:14]
Mike Kaput
More than most straightforward AI news, this could change faster than anything else, for sure. Because, you know, TikTok is prominent to many in our audience. There's also an AI angle to all this drama. So we just kind of quickly wanted to go through what's going on here. TikTok went dark late Saturday night as a congressionally mandated ban took effect. It resumed service Sunday afternoon with a message crediting President-elect Trump for its return. Trump, speaking at a, quote, victory rally as part of his inauguration events in D.C., declared that, quote, TikTok is back. He outlined a vision for keeping the platform operational, suggesting a joint venture that would give the US 50% ownership. So here's kind of how this really quickly all went down. At some point around 11 p.m. on Saturday, January 18, TikTok shut down in the US, before the ban took effect at 12 a.m. on January 19. At 7:03 a.m. the morning of January 19, Trump posted Save TikTok in all caps on Truth Social. Hours later, he announced he would sign an executive order on Monday, which is today, the day we're recording this podcast, that delays the TikTok ban. He also called for the platform to be taken over by a joint venture with US and current owners. At 12:30 p.m. on Sunday, TikTok posted on X that it was in the process of restoring service. They publicly thanked Trump. At 1:50 p.m., TikTok was reportedly back online for many U.S. users. They again pointed directly to Trump as the reason TikTok was saved. Late Sunday afternoon, a Trump advisor told CNN that the administration is still finalizing the executive order to delay the ban and give the platform more time to reach a deal to stay in the US. Literally an hour later, Trump said, quote, TikTok is back, during the rally. And now here's the AI component of all this.
At some point on Saturday, CNBC reported that Perplexity, of all companies, quote, officially made a play for TikTok, submitting a bid to its parent company ByteDance to create a new merged entity combining Perplexity, TikTok US and new capital partners. So Paul, like, talk to me about Perplexity here.
[55:28]
Paul Raitzer
Like what are their motivations real quick on the band? I, I don't want to spend a lot of time on this, but. So Trump and, and the Republican Party have been pushing for the ban for years. This isn't like this is some Republican thing, like once they got in the office we were going to bring Tick Tock back. It is actually they've supported the ban for national security reasons. Because ByteDance is a Chinese owned company. The assumption is the data goes back to The Chinese government, if they want it. So the Supreme Court upheld that. This is not a violation of the First Court, First Amendment, that, that the ban should retain, remain. So this is like a. It's a legal thing. And Congress and the Senate both supported the ban, but it being brought back is like, a popular thing to do. So it, it, it may come back. Whatever. Okay. So now, personally, I actually stopped using TikTok. Like, I found I was. I was a very late adopter of TikTok, and it does just, like, suck you in. Like, their algorithm is insane. And so I took it off my phone like, a week or two ago because I was like, it's just wasting my life. Now. Most of the stuff I find, there's actually things like basketball plays to run for my daughter's team, sports things I'm interested in, business. It's like, it's actually really good, valuable stuff for me because the AI is so good. I'll spend like 40 minutes, like, oh, my God. I just, like, probably went through 100 videos on TikTok of, like, useless stuff. Yeah. So I actually took it off because I thought it was, like, sucking time out of my life that I wanted back. The perplexity thing. I don't. I think Perplexity, like, jumped the shard. Like, I mentioned this a couple episodes ago that I thought Perplexity was eventually just going to get acquired, like Aqua Hired or whatever. I. I don't get this company. Like, this is a pure PR move. Obviously you're not going to merge with TikTok. 
Why in the world would you put this out there other than because they thought it was funny? I don't know. It's just absurd. So I think I'm going to cancel my Perplexity subscription. Honestly, not just because of this. This was like a tipping point where I already thought that they were kind of questionable. And I thought, based on the interviews I've listened to with the founder, it wasn't a very serious company. And then with all these new things, they're just, like, throwing spaghetti at a wall with, like, "what's going to stick and differentiate us?" And I feel like they've just lost the magic of what differentiated them early on.
[57:42]
Mike Kaput
Right, right.
[57:43]
Paul Roetzer
And I realized over the weekend I'm still paying 20 bucks a month for a product I haven't used in over 30 days, because I just use Deep Research and ChatGPT and the other stuff. Anyway, so again, Perplexity may end up being an amazing company. Like, I'm not saying there's no chance it works, but what the hell? What is this? How do you have time to take some joke thing like this and try to make it look like a legitimate business deal? I don't know, it's just absurd.
[58:12]
Mike Kaput
So it seems increasingly desperate, these...
[58:15]
Paul Roetzer
Yeah, it's just, like, hoping for headlines, like, trying to get some PR.
[58:19]
Mike Kaput
Yeah.
[58:19]
Paul Roetzer
Trying to be relevant in the conversation. It's like, I don't know, just stop. Just fix your UI, make it look like it's not from 2020, and try and figure out how to differentiate yourself from Deep Research and all the products that caught up and surpassed you. I don't think their CEO's coming on our podcast anytime soon for an interview.
[58:42]
Mike Kaput
Yeah, I.
[58:43]
Paul Roetzer
We'll see.
[58:43]
Mike Kaput
I mean, certainly welcome to. But in our next topic here, we have seen the U.S. Patent and Trademark Office, the USPTO, release a comprehensive AI strategy. And at its core, this strategy focuses on five key areas: advancing IP policies that promote innovation, building robust AI capabilities within the USPTO, ensuring responsible AI use, developing internal AI expertise, and fostering collaboration with USPTO partners. Interestingly, they're taking a notably human-centric approach. They said that while AI will transform their operations, it has to complement rather than replace human expertise. Their implementation plan includes extensive training for patent examiners and trademark attorneys to help them better evaluate AI-related applications. They also talked about their position on AI and copyright law. They acknowledge the complex challenges around AI-generated content and training data, and they commit to working closely with the U.S. Copyright Office on policy recommendations. They're actively monitoring relevant court cases and aim to help shape legislation that addresses IP issues. So, Paul, it's good to see this really important body getting a strategy in place here. Like, what are your thoughts on the copyright point here? Are we going to get any updated guidance on this stuff anytime soon? There are a lot of unanswered questions people have.
[60:18]
Paul Roetzer
Yeah. So I mean, unless I missed something, and I just kind of scanned this report, it doesn't say anything new about copyright. Like, I mean, it doesn't make any changes. Basically, you're right. They say, we're watching the legal cases just like you are, we're advising Congress when they ask us, we're doing listening sessions. Like, nothing changed. And I actually didn't know exactly how the USPTO works, so I just did a quick search. So it looks like in October 2021, President Biden nominated Kathi Vidal to serve as USPTO director. She was sworn in April 13, 2022. So like most government offices, I don't expect Biden appointees to remain in those positions. I would assume there's going to be a pretty swift transition of those leaders. Elon Musk is advising Trump, as are all these other VCs who could care less about the copyright holders whose data their models are training on. I would assume that if this is a government agency that has someone appointed by the president, we may get a person who is more favorable toward the VC world and their views on copyright, and we may see some changes. I mean, Sam... that was in their Economic Blueprint, right? They talked about copyright and they don't want it to slow down American innovation. I wouldn't be surprised if in the next four years we see some changes to the way this works, and I don't think people who are copyright traditionalists will be happy with those changes. Those are kind of my high-level assumptions at the moment.
[61:55]
Mike Kaput
In another copyright-related development, we got some newly unsealed court documents that paint Meta in a pretty poor light due to actions it took while racing to compete with OpenAI. According to documents from a California court, Meta executives discussed and ultimately approved using a site called Library Genesis, or LibGen for short, which is a book piracy site, to train the company's Llama 3 model. This decision was reportedly escalated all the way to CEO Mark Zuckerberg. In an October 2023 email, Meta's VP of Generative AI, Ahmad Al-Dahle, emphasized that the company needed to, quote, learn how to build frontier and win this race against OpenAI's GPT-4. Meta's Director of Product then argued that using LibGen was, quote, essential for achieving state-of-the-art performance, claiming through, quote, word of mouth that competitors OpenAI and Mistral were also using the library. These documents reveal Meta's attempts to conceal their usage, including plans to remove copyright headers, document identifiers, and metadata, quote, to avoid potential legal complications. The company also established, quote, mitigations, including removing clearly marked pirated content and avoiding public mentions of using LibGen data. Paul, I cannot say, unfortunately, that this is surprising, given that we know many, if not all, of the major model companies have behaved in this way with one site or another that has copyrighted content. I guess what I'm curious about is this: nothing serious seems to have happened yet to these firms. They obviously are facing a ton of costly lawsuits, but they're all doing this. Like, are they going to get away with this?
[63:41]
Paul Roetzer
Yes. Okay. I think so. I think they're just going to keep spending the millions or hundreds of millions they need to, to keep these legal cases going until the law changes. So the way I think about this is: we know they did it, and they know we know they did it, but they don't want to admit it for legal reasons until those legal reasons are gone. So we're just going to keep having court cases, we're going to keep pushing this forward. They're going to keep spending their money and keeping their lawyers busy, and they're never going to really admit that that's how it's done until it's safe to admit that that's what they did. But it's not a secret. Like, they did it. It's the craziest thing.
[64:22]
Mike Kaput
And with all this emphasis on American competitiveness in AI, it seems like an utter fantasy that any consequence, like shutting down a model or anything like that, would happen.
[64:32]
Paul Roetzer
No way. Yeah. And it's all out in the open anyway, because people have built the open source models, like Llama, trained on these things, and you can't put it back in the bag. And like I said early on, maybe at some point it's deemed that they broke the law, and at that point they pay some big fines and they move on with their lives. But I don't think, especially, as you're saying, with the incoming administration and the focus on accelerating innovation, they're slowing down for this stuff. Right or wrong, they're not going to.
[65:05]
Mike Kaput
In our next topic, a new study came out from MIT that shows just how much energy large language models consume. And these numbers are pretty eye-opening. The research team used Meta's Llama model to conduct detailed experiments and better understand how LLMs consume energy. They found that running the largest version of Llama requires between 300 watts and 1 kilowatt of power. That's equivalent to running 10 to 30 bright LED light bulbs continuously just to power a single AI model's operations. They also found some surprising patterns in how energy gets used. When they spread the AI model across more chips, which you might think would make things more efficient, it actually increased the energy costs substantially. Energy usage jumped significantly when moving from 8 GPUs to 16 to 32 GPUs, even when processing the exact same amount of work. They also discovered that different types of tasks consume varying amounts of energy. When testing the model on standard language tasks versus math problems, they found notable differences in energy consumption. That basically suggests that the type of work we ask AI to do has a direct impact on its energy footprint. They also revealed we may be using more computational resources than necessary. They found that even when running these massive models, only about 20 to 25% of the available GPU memory is being utilized. So it seems like there are some opportunities for optimization. Paul, this is an interesting angle to LLMs that we have not historically discussed much. Like, these models eat up a ton of energy, and we're increasingly going to be using more advanced models more often as adoption rises. What are the implications of all this?
[66:55]
Paul Roetzer
Yeah, so again, I mean, you've got to put everything in the context, in America at least, of the incoming administration. There's going to be less focus on the environment, in terms of, like, impact on the environment, than under the previous administration. So I don't see this being a massive political issue in the next four years in relation to the environment. There's going to obviously be people who continue to push that, but I don't think they're going to find friendly ears at the White House that care as much. So I think what's going to happen is it's going to be on the AI companies themselves. They're going to push for efficiency in those algorithms, like you talked about, where they can build intelligence more efficiently by, you know, being smarter with how they devise everything. And that can drive cost savings, which has a positive impact, because there are still a lot of people within those companies that care about the impact on the environment, even if, you know, it's not a governmentally supported thing per se. So you're still going to have people trying to do good. Like, they want to build the intelligence, but they don't want to have a negative impact on the environment, and they want to save money. And so I think you're going to see a lot of innovation in this space and a drive for efficiency in the use of GPUs and the building of the models. But yeah, I mean, it's been a hot-button issue. And I think it's just a hard one for people to understand. Like, it's hard to come up with an analogy that helps you actually conceptualize the impact it could have. So, like, trying to draw the analogy to the number of LED lights burning, things like that, that's trying to make this matter to people to the point where it's like, oh, that's a big deal. Other than that, I think it just all sounds very scientific and abstract to people. And it's like, I don't know, I can't see that impact every day. 
So it's hard for me to care that much. I'm not saying that's how I feel. I'm saying that's how the average person might feel.
[68:45]
Mike Kaput
Wow, get ready for a lot more power generation.
[68:48]
Paul Roetzer
Right?
[68:48]
Mike Kaput
This is related to all the infrastructure.
[68:50]
Paul Roetzer
I think that's the key is it's just like, well then let's build more. Like that's the mentality right now is like, oh, if it needs that much energy, let's just build more in the grid.
[69:00]
Mike Kaput
Here's a little lighthearted AI news this week. So Google posted from its NotebookLM account on X that the AI hosts in NotebookLM actually developed attitudes. So NotebookLM can create Audio Overviews. We've talked about these. Basically, from all the docs, links, and papers that you upload to a notebook, it can create a mini podcast hosted by two hyper-realistic AI hosts. If you have not tried this out, it's really cool, go do so. However, Google recently added a feature where you can, quote unquote, call in to ask questions and interrupt the hosts while they talk. When they added this feature, the following happened, according to their post, quote: "After we launched interactive Audio Overviews, which let you call in and ask the AI hosts a live question, we had to do some friendliness tuning because the hosts seemed annoyed at being interrupted. File this away under things I never thought would be my job but are." Paul, this is really funny to me, but it also kind of highlighted a bigger point you and I have discussed a bunch of times, like, AI is not traditional software.
[70:08]
Paul Roetzer
Yeah, we don't know why it does what it does. We've said this many times. Like, it just does weird things, and then they've got to go in and figure out why it's doing the weird thing. This is a funny one, but there are very serious instances of this too, where these models start doing things that may be determined to be misaligned with their goals or the values that humans want them to have, which leads to unintended outcomes. And so it is humorous, but it is also representative of a much larger problem that we deal with with these models.
[70:38]
Mike Kaput
Yeah, wait until your AI system with consumption-based pricing decides to go read, like, half the Internet or something like that, because that's... then your CEO is calling. Yeah. All right, so to wrap up this week, we have a bunch of really quick funding and product updates. So Paul, I'm just going to run through these and then wrap us up here.
[70:58]
Paul Roetzer
Sounds good.
[70:59]
Mike Kaput
All right, so first up, Synthesia, which is a leading AI video generation platform, has announced a major $180 million Series D funding round. So they claim they're evolving beyond their initial AI avatar technology to offer a comprehensive suite of video creation tools that include dubbing, screen recording, translation and collaboration features. The platform currently supports over 230 avatars in 140-plus languages and serves more than a million users. Next, Anysphere, which is the company behind the viral hit AI coding tool Cursor, has secured a $105 million funding round that increases its valuation to $2.5 billion, which is a sixfold increase in valuation from just eight months ago. Cursor uses proprietary models along with models from OpenAI and Anthropic to help programmers code more efficiently through autocompletion. They have also begun rolling out some agentic features that can independently complete certain coding tasks. Next, Andreessen Horowitz has announced it is leading a Series A investment in Slingshot AI, which is a startup developing what it calls the world's first foundation model specifically designed for psychology and mental health support. This brings the total capital raised by the company for that mission to $40 million. They're aiming to differentiate from general-purpose AI chatbots by focusing specifically on therapeutic approaches. Next up, ChatGPT Tasks has come out. OpenAI has rolled out this new feature called Tasks, where users can ask ChatGPT to do things like give me a news briefing every day at 7 a.m. or remind me when my passport expires in six months. The AI will then follow through on these requests automatically, even when you're not actively using the app, and start sending you notifications when it has completed a task. This is currently in beta and available only to paying subscribers. ChatGPT is also getting an upgrade to Custom Instructions. This is a feature that allows you to customize your ChatGPT experience. 
The new system focuses on three key areas: personality traits that users want ChatGPT to exhibit, preferred communication styles, and specific rules they want the AI to follow. So think of this as like fine-tuning your own personal AI assistant to match your working style and preferences. DeepSeek, a Chinese AI lab that made waves earlier this month with its open-source DeepSeek-V3 model, just released something called DeepSeek-R1. This is an open-source model that they claim matches OpenAI's o1 in performance. It is a reasoning model just like o1, but unlike o1, this model's license allows for unrestricted commercial use and modification, basically meaning, if the company's claims are true, there's now an open-source equivalent to the advanced reasoning models coming out of some of the other labs. The company is also releasing a series of smaller distilled models, and very notably, the model is priced dramatically cheaper than o1. The main DeepSeek model, the R1 model, has a 14-cent-per-million-token input price, which is a fraction of the cost of the same million tokens from o1. Microsoft has announced Microsoft 365 Copilot Chat, a new pay-as-you-go service that makes its AI capabilities more accessible to organizations of all sizes. This new offering has three key components. There's, one, a free chat experience powered by GPT-4o with web-based knowledge; two, pay-as-you-go AI agents that can be created and used directly within chat; and three, enterprise IT controls for data protection and agent management. Next up, Adobe has announced Firefly Bulk Create. This is a new tool that can edit up to 10,000 images simultaneously with a single click. So this is launching in beta, and the tool has basically two main features. There's remove background and resize, so you can upload images from your computer, Dropbox or Adobe Experience Manager, and AI can automatically remove backgrounds from entire batches of images all at once. 
These new features will operate on a consumption-based pricing model. This will likely require users to purchase premium Adobe Firefly credits. All right, two more updates here. This week, AI company Luma has announced Ray2, which is an advanced video model. Ray2 can generate videos with realistic, coherent motion, handle complex physics and simulations, and create cinematic scenes with sophisticated camera movement. They are making Ray2 available through their platform for paid subscribers, with API access coming later. And last but certainly not least, Runway has released Frames, its most advanced base model for image generation, which the company says offers, quote, unprecedented stylistic control and visual fidelity. We actually covered their announcement of Frames on episode 125. Now the company says the model is available for Unlimited and Enterprise plan users. So they say with Frames you can begin to define worlds that basically represent your own artistic points of view: styles, composition, subject matter and more. Anything you can imagine, you can bring to life with Frames.
[76:45]
Mike Kaput
All right, Paul, so that is it this week. Jam-packed. I'm sure we are in for another crazy week as well. Just a couple final notes here. If you have not checked out the Marketing AI Institute newsletter, check that out at marketingaiinstitute.com/newsletter. It contains all the news we covered today and stuff that didn't make it into the episode, which increasingly is a very long list given how much is going on. And if you have not left us a review and can do so through your podcast platform of choice, we would really appreciate your feedback. All right, Paul, that's it for this week.
[77:20]
Paul Roetzer
So while we're doing this, just in case people had any doubts about, like, how this is going to play out: Trump was just sworn in, and right behind the vice president were Sundar Pichai, Mark Zuckerberg, Jeff Bezos, and Elon Musk, all sitting together in front of the Cabinet. So, like, right behind the vice president you have your row of billionaires, and then you have the Cabinet. And Tim Cook is also there, as is the CEO of TikTok, and Sam Altman is somewhere there, but sitting together were Pichai, Musk, Zuckerberg, and Bezos, right behind J.D. Vance. So we are entering a very different phase in American business and innovation, and the heads of the AI companies are in the first row. So buckle up, it'll be fascinating. Yeah, we won't be lacking information to discuss each week on the podcast. All right, everyone, thanks. And again, a reminder: AI Mastery membership, the quarterly Trends Briefing, is this Friday. If you want to join us, it is open to everyone, and we're going to be sharing our vision for our AI Academy and the AI Literacy initiative. So we'd love to have you there. Thanks again and, yeah, have a great week. If you're in the Midwest, by the way, stay warm. It's supposed to be, like, one degree in Cleveland the next two days. So stay warm, stay safe, and we'll talk to you next week. Thank you.
Thanks for listening to the AI Show. Visit marketingaiinstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community. Until next time, stay curious and explore AI.