
Loading summary
Paul Roetzer
As a society, we're struggling to grasp the current technology. It's like people's heads are going to explode if they try to start comprehending the complexities of everyone walking around with glasses that can record them. This is already the reality: people are wearing these things, and they're going to be able to analyze things, and the stuff you ask it to analyze is going to automatically be recorded, whether it's in your home or it's your family. There's just no way to get that data out, and it's a thing I don't feel like we're prepared for.
Paul Roetzer
Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute, and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all.
Paul Roetzer
Welcome to episode 118 of the Artificial Intelligence Show. I am your host Paul Roetzer, along with my co-host Mike Kaput. We're actually recording this one on Friday afternoon, October 4, 2pm Eastern. So if anything last minute happens on a Friday, we may miss it. We have a full agenda to get through as is, so hopefully nothing happens in real time while we are doing this. Today's episode is brought to us by Rasa. We talked about Rasa last week. Let's talk about a common challenge we all face: making our email newsletters truly engaging. I've been at this marketing thing for a while, obviously, since I started in 2016, and I can confidently say that rasa.io is changing the email newsletter landscape. We've been followers of theirs for a long time. Mike and I use it more as an internal tool, so our newsletters are not run through Rasa, but we'll use it to keep track of research in the industry, have it find things for us, send us links we can look at, and track what's going on. So imagine each of your subscribers receiving a newsletter tailored just for them. Sounds impossible, right? Well, rasa.io's AI-powered platform makes this easy. We've known the team at Rasa for about six years. They were one of our earliest partners and sponsors, so they've been in the personalized newsletter game for a long time. You can check them out at rasa.io/maii and use the code 5MAII for a 5% discount on a Rasa subscription. Give it a try; your subscribers and your engagement rates will thank you. So again, that's rasa.io/maii. All right. It seems impossible, Mike, to go a week without talking about OpenAI. Last week was all the madness going on within the company. This week we've got some new funding, some new product updates, some more people leaving. It's just the never-ending saga with OpenAI.
But a lot of other stuff going on this week too with Accenture and Nvidia and Meta just showed up this morning and dropped a new movie gen model on us that we've been kind of scrambling to figure out. So tons to talk about, but let's kick it off with OpenAI. The latest on their news.
Mike Kaput
Sounds good, Paul. Okay, so first up, the topic is OpenAI. They have completed a significant $6.6 billion funding round that values the company at $157 billion, which basically doubles the company's valuation from just nine months ago, when it was valued around $80 billion. This funding round was led by Thrive Capital, with participation from Microsoft and Nvidia as well as SoftBank and the United Arab Emirates investment firm MGX. Thrive Capital alone invested about $1.3 billion, and they have an option to invest up to a billion more at the same valuation through 2025. This, of course, as we've talked about on past episodes, comes despite OpenAI's current financial losses. The company expects about $3.7 billion in sales this year but is projecting losses of roughly $5 billion due to the costs associated with developing and running AI technology like ChatGPT. Interestingly, this funding comes with certain conditions: OpenAI has two years to transform into a for-profit business or the funding will convert into debt. So Paul, we have been talking about this news for a while. It's been very well rumored, and we've covered many of the salient details here. Now it's official. Can you walk through what matters most to pay attention to here, now that we know exactly what the details of the fundraise are?
Paul Roetzer
Yeah, anyone who listened to episode 117 knows this was the number being rumored. We talked about that, and we touched on the valuation: how do you get to it? $150 billion pre-money, roughly $157 billion post-money. It's because they're projecting $11.6 billion in revenue over the next 12 months, basically, and then you apply a multiple to that. That is roughly how it's done. There may be some nuances this time around, but it gives you a ballpark of that number, so it's actually a reasonable number given that approach. A couple of other elements to this one. In addition to the, was it 6.6 billion? Yeah, 6.6. They also secured a $4 billion revolving line of credit. In their own post where they announced the $4 billion credit facility, they said they have $10 billion in liquidity, which gives them flexibility to invest in new initiatives. So they had a post announcing the credit line and a post announcing the equity. "We're making progress on our mission to ensure that AGI benefits all of humanity." That was the lead to the blog post, so again bringing it back to their overall mission. They said the new funding is going to go toward leadership, so investing in talent, increasing compute capacity, which is buying more Nvidia chips, and continuing to build tools that help people solve hard problems. They're going to need the money. They continue to lose people. Just last week we talked about all the people who have left this year. They just lost the co-lead for Sora, the guy who was building Sora, which we talked about last week had its delays. He is leaving to go to DeepMind, and he tweeted, I will be joining Google DeepMind to work on video generation and world simulators. And then they lost another, let's see, what was this guy's name? He was one of the co-founding members, Durk Kingma. And Durk said, I'm joining Anthropic.
Anthropic's approach to AI development resonates significantly with my own beliefs; looking forward to contributing to Anthropic's mission of developing responsible AI. So things kind of keep evolving. The other nuance to this is that apparently they wanted exclusives with their investors. I think OpenAI has come out and said this is not true, but there are lots of sources saying it is in fact true. They asked for exclusives from their investors, meaning investors were not allowed to invest in five companies that OpenAI identified as key competitors. One was Elon Musk's xAI. Another is Anthropic. Another is Safe Superintelligence, the company of Ilya Sutskever, one of the co-founders of OpenAI. Perplexity is another one. And then Glean, which I thought was interesting to see on this list. Anybody who listens weekly to the show might recognize Glean: on episode 115, Mike and I talked about them. They had just raised a $260 million Series E at a $4.6 billion valuation to build what they called the Google for work using generative AI. It was co-founded by four guys, three of them former Googlers and one formerly from Facebook. So on the surface, this is a ton of money, one of the biggest if not the biggest raise in history, and a massive valuation. But as we have talked about many times on this show, it is a bridge to the next round. This is not enough money to go where OpenAI and Sam Altman intend to go. They are going to need at least another 50 to 100 billion in the next 12 months is my guess, and I'm guessing it's going to probably be north of 100 billion. So sometime in the next 12 to 18 months they're going to do another massive round, and/or they're going to go public. My guess is it's going to be really complex to switch to the for-profit structure that they're going to need before they can go public.
So chances are they raise another 50 to 100 billion in the next 12 months as the final bridge to the IPO, and at that point they're probably valued at half a trillion dollars or more. I know the numbers are nuts, but that's the thing to keep in context: while this sounds like a whole bunch of money, it is not enough money to do what they're intending to do.
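For anyone who wants to see the back-of-the-envelope math Paul describes, here is a minimal sketch in Python. The $150.4 billion pre-money figure is just what the reported numbers imply (157 minus 6.6), and the resulting multiple is an illustration of the approach, not a disclosed term of the deal.

```python
def post_money(pre_money_b: float, new_capital_b: float) -> float:
    """Post-money valuation: pre-money valuation plus new capital raised."""
    return pre_money_b + new_capital_b

def forward_revenue_multiple(valuation_b: float, forward_revenue_b: float) -> float:
    """Valuation divided by projected next-12-months revenue."""
    return valuation_b / forward_revenue_b

# Reported round: roughly $150.4B pre-money plus $6.6B raised = $157B post-money
valuation = post_money(150.4, 6.6)

# Against ~$11.6B in projected forward revenue, that implies a ~13.5x multiple
multiple = forward_revenue_multiple(valuation, 11.6)
```

Running the same two functions against any rumored round lets you sanity-check whether a headline valuation is in a plausible range for the projected revenue.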
Mike Kaput
So I wanted to talk just a little bit about that funding condition about becoming a for-profit business within the next couple of years. Is that the biggest hurdle they have to figure out right now?
Paul Roetzer
It does seem to be a massive hurdle. I'm sure there are other complexities, such as Microsoft's deal with them. Microsoft is rumored to have put in about $13 billion, and if I remember correctly from a year or so back, that gives them something like 49% ownership of the current for-profit arm that sits underneath the nonprofit right now. And I think Microsoft has access to, I want to say it was the first hundred billion in profits or something crazy like that; there was some sort of condition. So Microsoft's not going to just give away the position they have. I'm sure there are all kinds of complexities. The other thing that caught my attention in OpenAI's blog post announcing the $6.6 billion: they said, we aim to make advanced intelligence a widely accessible resource. And then it went on a little bit and said, by collaborating with key partners, including the US and allied governments. That's really interesting. That is a very intentional phrase, I would say. So my expectation is for the next round. I don't know that we'll ever hear about US government money going into this, but I wouldn't be shocked if there's something there. And the "allied governments" part is setting the stage for other governments, which we've previously talked about on the show, possibly getting involved heavily in funding the future build-out. Because what that means is you look at allied countries where data centers can be built, and country A may put in, I don't know, 50 billion, and in exchange 50 data centers get built there. That's the sort of stuff you're going to hear about over the next three to five years: these really complicated partnerships that are money plus, basically.
Mike Kaput
Yeah, we are starting to see some of that from Aschenbrenner's Situational Awareness, the contours he painted of great power competition, essentially, or geopolitical wrangling around funding for these companies.
Paul Roetzer
It's going to be complicated for sure.
Mike Kaput
All right, so next up, some more OpenAI news. OpenAI unveiled several significant updates to its AI offerings. This was both an individual announcement of a specific update we're going to talk about and a bunch of announcements that came during their recent Dev Day. So first up, before Dev Day, they introduced something called Canvas, which is a new interface for ChatGPT designed for more complex writing and coding projects. Canvas allows users to collaborate with AI in a separate window, basically side by side, both prompting it and seeing the outputs, giving inline feedback and doing targeted edits on projects like writing or coding. This enhances ChatGPT's ability to assist with tasks that require quick, multiple revisions and contextual understanding. Now second, at Dev Day, and the rest of these updates come from Dev Day as well, the company launched the Realtime API in public beta. This API enables developers to integrate fast speech-to-speech functionality into their apps, supporting natural multimodal conversations with low latency, and it appears to be doing that using many of the features of Advanced Voice Mode, which we all got access to this past week. This basically simplifies the process of managing speech interactions by combining multiple steps into a single API call. Third, OpenAI introduced vision fine-tuning for GPT-4o. This allows developers to fine-tune the model using both images and text, which opens up tons more possibilities for applications in visual search, object detection, medical image analysis, et cetera. Fourth, the company unveiled prompt caching, a feature that helps developers reduce costs and processing times when using repeated inputs across multiple API calls. This offers a 50% discount on reused input tokens, which optimizes expenses and improves latency for applications that have repetitive interactions.
Last but not least, OpenAI announced model distillation, a new offering that allows developers to fine-tune smaller, cost-efficient models using outputs from larger, more capable models. This basically streamlines the process of improving smaller models with real-world data, making it easier to deploy powerful AI capabilities at a lower cost. So that's a lot to unpack here. But Paul, let's first talk about Canvas. This basically seems like an answer to some of the functionality that Claude has, like Projects and Artifacts, where it kind of shows up and pairs with you as you're building an app, writing code, or writing complex language. You also talked offline with me a bit, though, about other possible businesses and use cases that Canvas kind of challenges. Can you walk us through your thoughts there?
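As a quick aside on the prompt caching item Mike mentioned: the 50% discount on reused input tokens is easy to reason about with a small sketch. The per-token price below is purely illustrative, not an actual OpenAI rate, and the 90% cache-hit figure is a hypothetical scenario.

```python
def input_cost(total_tokens: int, cached_fraction: float,
               price_per_token: float, cache_discount: float = 0.5) -> float:
    """Input-token cost when a fraction of the prompt hits the cache.

    Cached tokens are billed at (1 - cache_discount) of the normal price;
    the announced discount on reused input tokens is 50%.
    """
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    return uncached * price_per_token + cached * price_per_token * (1 - cache_discount)

# Illustrative price (an assumption): $2.50 per million input tokens
PRICE = 2.50 / 1_000_000

# A 10,000-token prompt whose large reused system prompt gives 90% cache hits
no_cache = input_cost(10_000, 0.0, PRICE)
with_cache = input_cost(10_000, 0.9, PRICE)
```

With those assumed numbers, the cached call costs about 45% less per request, which is exactly the kind of saving that matters for apps making the same long-prompt call thousands of times a day.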
Paul Roetzer
Yeah, so Canvas was split off as an announcement. Most of the stuff you were outlining, Mike, came from their Dev Day, which was October 2nd, I think. And for our listeners who aren't developers, some of that may be like, okay, so what? The so what is: developers are going to build a lot of cool stuff. In essence, what it means is OpenAI is making their capabilities, their models, available through these open APIs, including some variation of Advanced Voice. That's going to allow developers to accelerate innovation very affordably and start building more and more tools and applications that the non-developer crowd like us can enjoy and benefit from. So that's kind of the key takeaway from Dev Day. Then Canvas comes out, I think on October 3rd, yesterday or in the last two days, and it's one of those where initially you're like, oh, this seems like a big deal, let me go play around with it a little bit. So I did have a chance to do that, and everyone should have it. If you're a paid user, I think it's Plus, Enterprise, or Team, everybody now has access, and you can just go in and choose GPT-4o with canvas. It's actually one of the models in the dropdown. They position it as an early beta that introduces a new way of working together: not just through conversation, but by creating and refining ideas side by side. My first reaction when I saw it was, oh, this is just their copy of Anthropic's Projects, but that doesn't actually seem true. For the user interface, that may be true, but it seems like it's doing a lot of interesting things that I don't think Anthropic currently does. Now, I'm not the biggest user of Claude and Projects, so I can't say that confidently, but in my early testing, it definitely starts blurring into a true writing assistant.
So, the example I gave: just to play around, I went in, chose the dropdown, and said, write an abstract for an AI presentation. And it does its usual thing: sure thing, here you go. Before I give it anything else, it just writes it. And I'm like, okay, jumping the gun a little bit; I didn't tell you anything else about it, and it just gave me one. So then I started playing around with it. What happens is your chat, rather than just being down the center, moves to the left, and now three-quarters of your screen is the canvas, with your document kind of hanging off to the right there. So I said, let's use bullets to make it easy to read. It immediately rewrites it and gives you the edited form. Then I said, make it more fun, and don't use the word delve ever, because it had the word delve in there. And so it started doing it, and I was just playing with it. I said, okay, can you write this more in the style of Paul Roetzer? And I was like, I wonder if it'll do this. So it revised it and said something like, I revised the text to align more closely with Paul Roetzer's style, making it more engaging and focused on the transformative power of AI. And so I said, do you know who Paul Roetzer is? Now I was curious: is it actually doing this or just BSing me? And it said, yeah, he's the founder and CEO of Marketing AI Institute. He is well known for his work in making AI more accessible and applicable for marketing professionals. He's also an advocate for integrating AI into business strategies to drive growth, efficiency, and innovation. He often focuses on simplifying AI concepts and making them actionable for business leaders, with an emphasis on practical use cases, responsible adoption, and ethical considerations. I'm like, damn, this is actually pretty good.
His approach is characterized by a practical, optimistic view of AI's potential, while also ensuring transparency and responsible usage. I actually couldn't have written that myself.
Mike Kaput
No kidding.
Paul Roetzer
So then I said, okay, what specific elements of what you just wrote in that draft are in my style? And it actually explained, in five bullet points, the components of the writing that were tied to my style. So you can see the power of the underlying model, able to do the things GPT-4 has always been able to do, but now you have these style buttons. In the bottom right corner you can click, and it's got suggest edits, adjust the length, reading level, add final polish, add emojis. What it does is it gives you these cool sliding scales. So when reading level pops up, you just drag this book up and down: I want high school, I want PhD level. It's pretty cool; the user interface is really slick, how they did it. When I step back for a moment after testing it myself, the thing I realized is, we're still not ready for this. Most companies we meet with, most schools I talk to, aren't teaching how to use ChatGPT at all. And now all of a sudden we have this whole true writing assistant. I'm not going to get into the coding, because Mike and I aren't coders; I couldn't really tell you how good the coding part is. But as a writer, as someone who came out of journalism school, and Mike and I have spent our lives writing, this is really impressive. And I don't know if I've talked about this on the show, but I've been teaching my daughter, who's 12, how to use ChatGPT to become a more creative writer. I went to school to become the kind of writer I am today. I spent years trying to become a really good creative writer, right? And what I have found, because she likes to develop story ideas and things like that, is that I'm able to teach her how to be a creative writer way, way faster. What I'm doing is having her go in and say, okay, let's have ChatGPT write the first paragraph of this idea.
Now I want you to write it in the style that you learned: how it used different words, how it creates these visualizations. You know how when it wrote that first paragraph, you could see what it was saying? I want you to now write the next paragraph in that same way. So rather than me trying to figure out how to teach her to be a writer, because I've never been an instructor and don't know how to actually teach her that way, I am able to explain to her how to use the tool to learn that way. And I really find myself thinking about that with tools like Canvas. So again, I'll kind of leave it at: go try it. It is really impressive, and my early work with it is fascinating. I do think it starts to creep into tools like Grammarly and things like that. You do start to wonder about the competitive environment as you see what these things are, knowing this is a beta, but they're obviously coming for writing. My bigger question becomes: how are we going to use these tools in schools and in businesses to accelerate people's capabilities in learning, and not have them become a crutch to critical thinking? I don't know the answer to this, but every time I go do a talk, I get asked these questions: how are we going to teach the next generation to do things when they can just have ChatGPT do it? And every day these tools get smarter and have more and more capabilities, to where if you want to take the shortcut, it is there to be taken. And I don't know. So those are kind of my overall thoughts. It's an impressive user interface, it's a really cool tool, and it creates more questions in my mind about how people do work in the future without just letting the AI do it for them.
Mike Kaput
Yeah, and it's interesting, given the previous topic, that a company like Glean is being mentioned as a big competitor. Compared with that, this certainly feels like, okay, we're trying to get into enterprise productivity, essentially, in a more formal way than ChatGPT Enterprise.
Paul Roetzer
I mean, honestly, I even had the thought about what about Microsoft, like Microsoft Word, Google Docs? It was the first time I wondered, is OpenAI going to build a productivity platform? Are they going to just build their own version of Excel and Docs? It sure seems like they could be going that direction, which is a really fascinating thing I hadn't really thought about before. And I would certainly understand why. A few episodes ago, Mike, I think you called out that Microsoft was now listing OpenAI as a competitor in a public filing. It starts to make a lot more sense when you think that maybe they are going to go at that enterprise productivity market, not just through a ChatGPT interface, but different interfaces.
Mike Kaput
Yeah, especially with all our talk of this research company becoming a product company. They have to find revenue. They hired.
Paul Roetzer
A chief product officer, yeah. They're very much positioning themselves that way. Oh man, I would give anything to see their pitch deck. I would love to know their roadmap of where this is going, because they're projecting what, $11.6 billion in revenue next year, and then I think it was 20-some billion the year after. Yes, I would love to see where that's going from there.
Mike Kaput
Yeah. All right, our third big topic this week: Accenture, the consulting firm, is forming a dedicated Nvidia business group. This newly formed group will comprise 30,000 professionals who will receive specialized training to help enterprises reinvent processes and scale AI adoption. At the heart of this initiative is Accenture's AI Refinery platform, and this uses a ton of Nvidia products. It leverages Nvidia's full AI stack, including Nvidia AI Foundry, Nvidia AI Enterprise, and Nvidia Omniverse. One of the key focuses here, interestingly, is the development and implementation of agentic AI systems: basically the next frontier of generative AI, agents capable of acting on user intent, creating new workflows, and taking action to reinvent processes or functions without constant human input. Now, to support this initiative, Accenture is apparently expanding its network of AI Refinery engineering hubs globally, adding new locations in Singapore, Tokyo, Malaga, and London. Basically, these provide deep engineering skills and technical capacity for transforming operations using agentic systems. And Accenture claims the partnership with Nvidia is already yielding practical applications: they developed an Nvidia NIM agent blueprint for virtual facility robot fleet simulation, which basically could help industrial companies build autonomous, robot-operated factories and facilities. So Paul, obviously it's a bit of a PR win for Accenture and Nvidia, but it does seem like a pretty substantial initiative, and it interestingly focuses on agents. What does this mean for enterprises trying to deploy both AI and AI agents?
Paul Roetzer
I'll start by saying we do not give investing advice on this show. Do not take anything I say as investing advice. I will just say, if you think all Nvidia does is make chips, you've got to zoom out a little bit. They are everywhere. They're embedded in the future of business and the economy at almost every level of the infrastructure. It is remarkable how every other major tech company wants to tout its relationship with Nvidia. It is shocking to me how prevalent they are in all technology circles. So good on Accenture for deepening the relationship with Nvidia; that is a great win. You and I talked about Accenture's gen AI bookings back in episode 91. I went back and looked: episode 91, April 9th of this year, and at that time they were on track to do $2.4 billion. So if we can just zoom out, this is obviously a massive growth area, not only for them, but for other consulting firms. We also talked in episode 104 about how the people we know are making money in gen AI are the consulting firms, right? McKinsey, Deloitte, Accenture, obviously. Now, how much of this is net new? I have no idea, and I don't even know if they broke it out in their earnings calls. It's great they're doing three billion or whatever, but is that net new consulting that they wouldn't have done prior to gen AI, or is the money just moving from the consulting we used to do to this? I don't know. But the growth is there, the demand is there. We talked in episode 104 about what services people need: what to do with these language models, whether you're fine-tuning them, integrating them into your business, finding use cases, personalizing use cases, driving innovation, like new markets, new ideas, new products, change management. There's so much that needs to happen in enterprises, and so few people in those enterprises are trained to do it.
And I'm not talking about the technical stuff, right? I'm talking about the business side, the HR side, of all of this. That's where the consulting firms have a massive window of opportunity, and I don't see it going away anytime soon. Then you mentioned the agentic systems. That's a whole other element to the service mix that we didn't talk about in episode 104. But if you go back to episode 116, where we had the AI agents in the enterprise conversation, that's where this is all going. Now you have this whole world of, we can go build agents in HubSpot and Salesforce and Google and wherever we're going to build our agents. Who's going to build those? They don't have to be developers. They can be business people. I've built JobsGPT and CampaignsGPT, and I'm not a developer. So who's going to go in, identify business problems, analyze business processes, and build agents and GPTs that do those things more efficiently, in a more innovative way, more creatively?
Who on your team, if you're in an enterprise listening to this, can do that? My guess is you're going to struggle to count on one hand how many people could actually do that. There just aren't business people trained to do these things. And in my opinion, those are the people who should be doing it: the people who understand the business pain points and the processes, who can interview the people on the team, understand what they go through each day, identify the tasks, and build agents. That's the opportunity here. So either you build those capabilities yourself within your company, or you've got to turn to somebody like Accenture to do it for you. And I think a lot of big companies are going to be turning to companies like Accenture to do it for them.
Mike Kaput
Yeah, I know that in some circles big consulting firms occasionally get a bad rap. It's like, we're paying a bunch of money for someone to come in and tell us what we already know, or what we've been saying, and it just doesn't come from us. So I sympathize with that. I'm not saying go hire a big consulting firm, but to your point, how many enterprises have we talked to at this point where it's like, good luck if you think you have all this talent ready to go today to do this stuff? Very few people do.
Paul Roetzer
Yeah. And I think that was the thing that stuck out to me most. And again, who knows how real these numbers are? This is a press release from them, basically. But 30,000 people is a big commitment. I don't know the total employment at Accenture, but what they're basically saying is, we're going to make a massive bet here. We are going to train our workforce on this. They're going to, I assume, infuse internal education and training around AI, drive change management among their team, improve staffing, add new staffing. Good on them. I love to see this idea that AI is actually creating a growth engine for the economy and for this company, hopefully employing more people and training those people to do this thing. That's the kind of stuff we want to see. Now, is it actually 30,000 people? Is it going to be everything they're claiming it is in the post? Who knows? It never really is; there's always a PR element to this. But I love the vision for it. I hope they see it through, I hope they build it, and I hope they help a bunch of companies along the way, because a lot of companies need the help right now. And as you're saying, a lot of these consultancies sometimes get a bad rap, or people assume it's kind of blowing your money getting those outside opinions. But a lot of times that's what these companies need, and nothing's going to happen until they get that third party to come in and drive this change for them.
Mike Kaput
All right, let's dive into this week's rapid fire. So first up, some other Nvidia related news.
Paul Roetzer
Nvidia. Not stock advice, but more Nvidia news.
Mike Kaput
But pay attention. Nvidia actually just released a very powerful open source AI model called NVLM 1.0. The flagship model in this family is called NVLM-D-72B. So, real good marketing here.
Paul Roetzer
It keeps getting worse.
Mike Kaput
Sounds like a robot from a sci-fi movie. But this model has 72 billion parameters, and it's designed to compete with proprietary systems from OpenAI, Google, and others. It is set apart by its exceptional performance across both vision and language tasks, demonstrating state-of-the-art results in vision-language tasks rivaling leading proprietary models like GPT-4o. Now, notably, unlike many multimodal models, NVLM-D-72B actually improves its performance on text-only tasks after multimodal training, which is an interesting development. What's also worth noting here is that Nvidia has made the decision to make the model weights publicly available, and they have promised to release the training code, which is a departure from the trend of keeping both closed. Even some of the models that are open don't always go this far in their openness. So Paul, wow, the name is a mouthful and sounds a bit technical, but it's notable because it's Nvidia, and it sounds like this is actually open source, with publicly available model weights and, eventually, training code, if they follow through on that. How big a deal is that? Because I don't think even Meta has necessarily gone that far.
Paul Raitzer
Yeah, I mean, welcome to the party, I guess. Like, you talk about companies with infinite resources. This is where I think it's hard to underplay Meta's role, because they have billions to throw at this stuff. It's the argument I made for Google over OpenAI last week. These massive companies, I mean, I don't know what the R&D budget at Nvidia is, but it's got to be $20 billion a year or something. They can throw stuff at this like it's no big deal, and they're using their own chips. That's the thing: all these other companies are lined up to get Nvidia's GPUs to train their frontier models. If Nvidia wants to be a major player in the frontier model world, they just pull them off the shelves. It's their inventory that's doing this. So I find that fascinating. And then, going back to the technical side of the OpenAI Dev Day stuff, why does this matter to the average listener that isn't going to be building these models? Because it accelerates innovation. It pushes the other model companies to do more. It drives the cost of intelligence down close to zero. And that's what we keep seeing time and time again as these models come out. Take Advanced Voice, for example. The numbers I saw were, roughly, if you wanted Advanced Voice to be used in a customer service environment, like to do calls and stuff, it would come out to somewhere between $18 and $21 per hour to use it. So we start talking about what it costs for AI to start doing the work humans do. That's around where Advanced Voice is today. By next year, maybe within six months, it'll be under $10 per hour. And a year after that, it'll be down to a dollar per hour, or a penny per hour. The cost of intelligence is plummeting to zero.
And with every new frontier model that comes out, or any open source model that comes from someone like an Nvidia, the pressure for Google and OpenAI to drop their prices becomes so massive if Meta and Nvidia just give this stuff away. And so the outcome is that all of us, in theory, benefit from commoditized intelligence, because you're gonna have five or six companies spending billions a year to build the most advanced intelligence and then fighting each other to push that cost to basically zero for all of us. So that's what it means. Intelligence keeps getting more affordable and better and smarter.
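As a quick aside for the curious, the per-hour figure Paul cites can be roughly reconstructed. A minimal back-of-envelope sketch, assuming OpenAI's stated approximate Realtime API launch pricing of about $0.06 per minute of audio input and $0.24 per minute of audio output (those numbers are our assumption here, not from the episode):

```python
# Back-of-envelope check on the "$18 to $21 per hour" figure for Advanced Voice.
# Assumed pricing (not from the episode): roughly $0.06 per minute of audio
# input and $0.24 per minute of audio output on OpenAI's Realtime API at launch.
INPUT_PER_MIN = 0.06   # dollars per minute the model spends listening
OUTPUT_PER_MIN = 0.24  # dollars per minute the model spends speaking

def hourly_cost(speaking_fraction: float) -> float:
    """Estimated cost of one hour of live conversation.

    Audio input is billed for the whole hour (the model is always listening);
    output is billed only for the fraction of the hour the model speaks.
    """
    input_cost = 60 * INPUT_PER_MIN
    output_cost = 60 * speaking_fraction * OUTPUT_PER_MIN
    return input_cost + output_cost

# Worst case, the model talks the entire hour:
print(round(hourly_cost(1.0), 2))  # 18.0
```

With the model speaking the full hour, the estimate lands right at the low end of the $18-to-$21 range; real costs vary with how the conversation splits between talking and listening, so treat this as an order-of-magnitude sanity check, not a quote.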
Unknown
And like we've talked about on a few episodes, not enough people are ready for essentially free intelligence on demand everywhere.
Paul Raitzer
Yep. Yeah, people are still trying to figure out how to use a custom GPT, or ChatGPT itself. And now you have Canvas, and it just keeps coming. This is why we say every day is the dumbest form of AI you're ever going to have. It only gets smarter from here. It only gets more capable. And enterprises are not keeping up with the rate of innovation.
Unknown
All right, so next up, Google announced some updates to search, mostly around AI advancements. One of the key developments is an evolution of Google Lens, which now incorporates video understanding capabilities. That means users can take a video and ask questions about the moving objects they see in it, and AI can provide comprehensive answers. This is available now globally in the Google app for Search Labs users. Additionally, Google has introduced voice input for Lens, which allows users to ask questions verbally while taking photos. And the company has also improved Google Lens's shopping capabilities. So when users photograph products, Lens now provides a more detailed results page, including reviews, price comparisons, and purchase options. In audio search, Google has expanded the functionality of Circle to Search to identify songs playing in various contexts. That's available on 150 million Android devices. They are also rolling out AI-organized search results pages, starting with recipes and meal inspiration on mobile in the U.S. To enhance connections to web content, Google has redesigned AI Overviews, which we've talked about in the past, to include prominent links to supporting web pages within the text. This change reportedly has increased traffic to those websites and improved user experience. And lastly, Google is introducing ads in AI Overviews for relevant queries in the U.S. So, Paul, it definitely seems like Google is leaning in even more to AI-powered search. AI Overviews are starting to get monetized. As a marketer or business leader, how should I be thinking about these changes to Google Search?
Paul Raitzer
Yeah, I mean, we're seeing all this tech we keep hearing about gradually infused into different features. Sometimes I have trouble knowing where to go for some of this stuff. Like Google Lens: I thought I knew, because I'm a Chrome user, we have Google Workspace, we use Google all the time. But I don't know where to use Lens. I'm not even sure how to get to it. I thought it was in my browser, but I'm not seeing it there. So sometimes I want to try some of these things, but I'm not even sure where to go. Maybe it's only on their mobile app. And sometimes I get confused: is this a mobile-only thing? Is it only on Pixel? Is it also on the iPhone? Is it just in my Chrome browser? I did notice the AI Overviews, though. I actually searched something yesterday or this morning and I noticed the citation. There's the link now next to each thing, and you click it and it'll pop up over to the right and show you where that information is coming from. And I assume that's kind of how they're doing the ad units and stuff too. So yeah, it's a lot. And I sympathize with Google. They have so much reach, so much distribution, so many different products and features. How you manage that portfolio of features and capabilities and AI is challenging. But some of these things I'd love to check out; I just have to figure out how to check them out.
Unknown
Well, and we've talked about this a little bit with like Project Astra, right, where it's like we're getting into that idea of being able to actually look at stuff in the physical world, answer questions about it. Kind of a prelude to that.
Paul Raitzer
Yep.
Unknown
Okay, so more Google news coming up. The popular Google tool NotebookLM, which we've talked about, is an AI research assistant that allows you to engage with, understand, summarize, and query up to 50 different sources of material. The tool attracted a ton of buzz when it launched Audio Overviews a few weeks ago, which turns your material into a deep dive podcast between two AI hosts, both of whom sound ultra realistic. Now we've seen NotebookLM get a bunch more updates, including a big one that allows you to add public YouTube URLs and audio files to your notebooks. And we actually just heard from Google's product lead on NotebookLM, Raiza Martin, who recently teased some new functionality coming to the tool around custom chatbots. Martin responded to a post on X that showed a video of a work-in-progress custom chatbots feature that basically allows you to build a custom chatbot based on the notebooks you build in NotebookLM. And Martin said of this feature, quote, custom chatbots, I have a lot to say. This is pretty widely used internally at Google, and literally every day someone pings me to say this has 10x'd our team's productivity. Not joking. And, referring to the video that had been posted, she added, you're still looking at the old version, so I'm excited for what you all think when the new version launches. So, Paul, NotebookLM certainly seems to be the darling of the AI world right now. It's pretty incredible. You and I have both used it quite a bit. How big an update would custom chatbots be for this tool?
Paul Raitzer
That'd be fascinating to consider how it could work. But I think what we're seeing is continued evolution of the user interface. For two years, we've all just been interacting with chatbots. Now we're starting to interact with voice more regularly, in a more reliable way. But you're seeing innovation at the user interface level where it's a mix of chat and something else. So NotebookLM allows you to have a chat with these documents, but you can also create a podcast with the documents. You can have it output FAQs and all these things. So it's interesting now to start seeing the evolution of how these tools allow people to interact with the information. And a quick note on the YouTube feature: what they're doing right now is they're not actually processing the YouTube video and using computer vision to know what's in the videos. They're pulling the transcripts. So you can add the URL links, you can add YouTube videos, and then it'll automatically get the transcripts. But I have seen a lot of this product just in the last few days on mainstream media. And it's always hilarious when you see something like CNBC using it, and they're just completely shocked by what they're hearing. It sounds like real people. So we're seeing a lot of that. And then one other thing I'll mention is there's another guy on the team that I just came across, I think it was yesterday or this morning: Jason Spielman. He's a senior interactive designer at Google Labs. He seems to be more active on LinkedIn than on Twitter; he doesn't have an overly active Twitter account. But he posted something kind of cool. What I love about what Raiza and Jason are doing is this inside look, which we don't always get at Google. These are very approachable personalities.
And so he posted, saying our newest feature, Audio Overviews, has taken over the Internet the past few days. The team has been sprinting. We went from idea to prototype in weeks, then launched publicly in under two months, which is very un-Google-like. And he said it's not perfect, but that's the point here. He shared a few takeaways, and I'll just highlight these because I thought a couple of them are really good. The first: it's about building products with our users, not just for them. We're not waiting to launch; we're shipping early and iterating. So getting it out to users, getting their feedback. They have an active Discord channel, apparently. And then the second one I really liked: built in, not bolted on. We're building net new AI-native products. This isn't just AI for the sake of AI. We're working to bridge the gap between state-of-the-art research and human problems. I love this approach, and this is the challenge a Microsoft faces, or even Google with Workspace: you're putting AI into places people already are, and it might not be natural for them to find the value. Maybe that's part of the issue people are having with Copilot and Workspace: I'm good in Excel, I don't need this AI thing here. Whereas what they're doing here is creating a standalone thing that has such immense value, because there are such obvious use cases. Sometimes a net new product is what's needed to drive adoption. And that's what we're seeing here. He also said meetings are spent building, not talking about building. I love that just as an overall business takeaway. Have a point, have a reason to be there, have an output you're looking to get, and then meet. If you don't, don't meet. And then: putting user feedback and community engagement at the heart of everything we do, building quickly, and having a lot more coming soon.
So yeah, again, good on Google, good on, you know, the labs team for giving these two the freedom to like share a little behind the scenes. I think it gives more personality to Google and that's not a bad thing. And so hopefully we see more of this, this kind of stuff from their teams. I think people love to get that inside look and like it also gives more patience when stuff goes wrong.
Unknown
Right.
Paul Raitzer
Like when you've got someone who's a voice. We see that with Logan Kilpatrick there, building AI Studio and working with the developer community; he came from OpenAI. That's a personality people like and respect, and if something goes wrong, it's cool as long as you're transparent with us. And so I can see that kind of stuff working well with this team here.
Unknown
There was an interesting post on X yesterday from Ethan Mollick that I think really hammers home what you're talking about. He said Google's NotebookLM had been available for a year before this new podcast feature made it go viral, and that there is a lesson here about accessible magic. And I love that part, because I think it's underrated by a lot of people: making this stuff more tangible and accessible through these light bulb features, almost, where the light bulb goes off and you say, oh my God, it can do this. That's really important to getting more attention and adoption for these tools. And I wonder how much this feature has caused Google to devote more resources to NotebookLM.
Paul Raitzer
Yeah, it goes back to what we talk so much about. If you think about ChatGPT, how many companies still struggle to justify the 20 bucks a month per license for ChatGPT? Why? Because they don't get it. There isn't that light bulb moment. And that's why I'm a huge believer: if you're in charge of or involved in rolling out ChatGPT or Google Workspace with Gemini or Copilot or whatever, roll it out with customized or personalized use cases for the people you're rolling it out to. You've got 20 writers on your team? Show them how to use Canvas with their writing. Give them the one or two or three use cases where they're going to immediately understand the value. And if that's all they use the tool for, fine. But if they discover the other thousand ways they can use it, even better. So many people just don't hold their hand through the first few use cases where the value becomes so obvious. And with a tool like NotebookLM, you immediately see it. They use it once, and it's like, oh my God, I've got 20 other ways I want to use this tool right now.
Unknown
All right. So speaking of, you had mentioned Copilot. Microsoft actually announced a few additions to its Copilot products. The first, and most notable, is Copilot Vision, an experimental feature available to Copilot Pro subscribers through Copilot Labs. This tool allows Copilot to actually analyze and respond to questions about what's on your screen, particularly content in Microsoft Edge, so users can ask questions about images or text on web pages and Copilot will provide insights and suggestions. Microsoft emphasizes, based on some past issues they've had, that this feature is designed with privacy in mind, immediately deleting processed data after conversations. Another new capability is something called Think Deeper, which enables Copilot to reason through more complex problems using advanced reasoning models. Think Deeper takes more time to provide step-by-step answers to challenging questions. This feature is initially available to a limited number of Copilot Labs users in select countries. We are also seeing Copilot Voice being introduced, allowing users to have spoken conversations with Copilot. This includes four synthetic voices, it can adapt its tone based on the user's conversation style, and it's launching in English in several countries. So, Paul, what do you make of these updates as you're reading them? I'll be honest, Think Deeper and Voice don't seem like coincidences, given that.
Paul Raitzer
OpenAI just released updates the morning of.
Unknown
Yes. Since we just got o1 and Advanced Voice and the Realtime API. And also, what do you make of this Vision feature?
Paul Raitzer
Yeah, so, the Vision. If I remember right, when we talked about this before, it wasn't called Copilot Vision.
Unknown
It was like Recall, I believe.
Paul Raitzer
Recall. Yeah. So that was a few months back. And pushback is like a light way of saying they made a really stupid move and had to back off it. So what happened was when they first debuted this product, it was like ready to go. Like they were going to be shipping computers with this baked into it.
Unknown
Yeah.
Paul Raitzer
And it was going to remember everything on your screen. So anybody who's been listening to the show for a while would recall this conversation. We'll find the episode and put it in the show notes for reference. But they were basically going to, out of the box, by default, record everything that happens on your screen. And anyone outside of Microsoft who heard this was apparently like, well, what about this? What about that? How about when I do this? How about when I do that? And they didn't have answers. They apparently just thought it was a good idea to record everybody's stuff without thinking through the ramifications of that. So now, what did you say? With privacy in mind, I think, is the term used. So after significant pushback on a terrible idea, they have rebundled it as Copilot Vision, as something you apparently have to opt in to use now. So yeah, those are my thoughts on that. I guess there are some useful purposes for that product, but I still have massive concerns around the privacy side. Okay, then, yes, I'm kind of with you on this voice and reasoning thing. What I find myself wondering is, I know Microsoft is a very innovative organization. I know they build their own stuff. I know they're building their own smaller models. I know they're invested heavily in OpenAI. I know they acquired Mustafa Suleyman, or at least acqui-hired him from Inflection AI; he was one of the founders of Google DeepMind, and he's now head of consumer AI. Like, I get it, but it just seems like everything they do is a wrapper on top of OpenAI stuff.
Unknown
Right.
Paul Raitzer
And I find myself wondering, what in the world would I need Microsoft's voice for? I have Advanced Voice from OpenAI. What would I need their reasoning for? I have o1 from ChatGPT. I don't understand how Microsoft is differentiated, how their products are, other than the fact that they can put them into Microsoft Word and Excel and PowerPoint. What else is different? Because they're just wrapping everything on top of OpenAI's models. And if OpenAI chooses to come after the productivity market and build ChatGPT Docs and ChatGPT Spreadsheets, or whatever they want to call them, then literally they're in direct competition, and the only thing Microsoft has is distribution, because they're built on top of the same infrastructure as OpenAI, and it's not theirs. I don't know. It's a very weird relationship that just keeps getting more bizarre. And it doesn't look great from an innovation perspective for Microsoft, because it looks like: yeah, we're actually built on top of that too. We call it Copilot Voice, or we're built on top of Copilot Reasoning.
Unknown
Yeah. In a weird way, I wish that instead of giving this a name like Think Deeper, they had just told me what it is.
Paul Raitzer
Yeah, it's straw. Why don't we just call it Strawberry?
Unknown
Right? Yeah.
Paul Raitzer
Right? Yeah. I don't know. It's weird. And maybe I'm just not understanding their marketplace, but I feel like I have a pretty decent understanding of their partnership with OpenAI and how they're building things. But maybe I've got to go listen to some recent Mustafa Suleyman stuff. Maybe he's explained this differently, and maybe they're not just doing everything on top of OpenAI's models. But that's my current understanding. So if anybody at Microsoft listens to the show and wants to hit us up and give us a better understanding of the situation, I'm all ears. But from everything I've researched to date, that's kind of what it seems like.
Unknown
All right, our last few stories here are all about Meta. So first up, Meta has confirmed that it may use any image analyzed by its AI assistant on its Ray-Ban Meta smart glasses for training its AI models. According to Meta's policy communications, images and videos shared with Meta AI in regions where multimodal AI is available, which is currently the US and Canada, can be used to improve the AI, per the company's privacy policy. That means that while photos and videos captured on Ray-Ban Meta are not used for training if users don't submit them to the AI, the moment a user asks Meta AI to analyze them, they fall under a different set of policies. So, Paul, this is a thing that seems like it's going to become more of a problem as AI-powered wearables roll out. Can you even build an AI wearable or AI glasses that don't collect data from what they see?
Paul Raitzer
I don't know how you would. I think that, again, as a society, as a business community, we're struggling to grasp the current technology. We're struggling to grasp the implications of language models and the ability to put text in and get text out, and images and videos and things like that. That's still so new to everyone. People's heads are gonna explode if they try and start comprehending the complexities of everyone walking around with glasses that can record them. And the thing is, we don't need Meta Orion, which we talked about last week and isn't coming for years. We already have Ray-Ban glasses. These things are already in the world. Maybe you have a family member who has them, maybe you have them. Maybe you're using them.
Unknown
Right.
Paul Raitzer
This is already the reality: people are wearing these things, they can see things, they're going to have computer vision, they're going to be able to analyze things. And that stuff you ask it to analyze is going to automatically be recorded, whether it's in your home, whether it's your family. There's just no way to get that data out. And again, it's a thing that I don't feel like we're prepared for. I started thinking, I wonder if there are school policies around this. I know a lot of schools now are like, hey, leave your phone here. Do you have to take your Meta glasses off too if you walk into a classroom? I assume you would. I don't know.
Unknown
When we get there, that's when you know it's hit real mass consumer product, when we have policies around it.
Paul Raitzer
Right. Yeah. I mean, some of this is just personal, but if I'm in a meeting with someone, it's kind of like when their note taker shows up on Zoom.
Unknown
I was gonna say this. Exactly.
Paul Raitzer
I don't want your note taker there. If I'm talking to you and you're wearing Meta glasses, I'm not saying anything that I wouldn't assume is being recorded. And that doesn't mean I'm saying something I'm ashamed to say or anything. It's that I'm not going to talk to you about my personal life. I'm not going to talk to you about the financials of my business. If I'm talking entrepreneur to entrepreneur and we're just having an honest conversation with each other, and you're wearing those glasses, I'm just kind of assuming that maybe they're recording. I don't even know how I'd tell if they were. So I feel like there are all these unanswered questions in society. And I know the next topic, Mike, we're going to talk about sort of expands on this a little bit. There are a lot of open questions and problems with this technology that we just
Unknown
haven't dealt with yet. Before we even get to wearables, isn't it possible I could be running an app right now that just records our facial expressions on our screens to figure out what you're feeling? We know that technology already exists. I've tried it.
Paul Raitzer
Right.
Unknown
This is going to just open up this whole can of worms around surveillance and how we interact, which is really weird.
Paul Raitzer
Yeah. A lot of our focus on this show over the last year and a half has been on laws and regulations related to copyright and intellectual property and the training of these models, and the harm and risk. My guess, now that I'm thinking about it, is that the reality going into next year is there's gonna be far more movement on privacy and things like this. Like protecting people against someone running emotion detection software when they're interviewing for a job, at that application level, where you start to find these things, where there's bias and there are more harmful things that aren't catastrophic but hit at an individual level, and it starts to invade people's privacy and their rights. And I could see a lot of legislation soon that starts to focus on that. Maybe that's already out there and we just haven't dug deep on it, but it's a problem. And again, we're going to find out why in a moment.
Unknown
Yeah, let's talk about that, because we have another Meta-related story about how this can go really wrong. So, two Harvard students created a controversial project called I-XRAY, which combines Meta's Ray-Ban smart glasses with facial recognition technology to instantly identify and gather personal info about strangers. The I-XRAY system works by using Meta's commercially available Ray-Ban smart glasses to capture images of people. It then employs the facial recognition service PimEyes to match faces with online images. We talked about PimEyes many, many episodes ago, about how crazy it is that you can find literally many people's faces online. The system scrapes information from web pages and uses a large language model to then infer personal details about the individual. Going a step further, I-XRAY then performs a lookup on people search sites, which are data brokers that offer extensive personal information. This process allows the glasses wearer to potentially access a stranger's name, job, education history, home address, phone number, and even information about their family members. Now, the two students that created this claim their project is designed to raise awareness about the potential risks of the technology. They tested it on unsuspecting people in public places. They're not releasing their code, but it's pretty noteworthy in the sense that, despite being designed to raise awareness, Paul, someone could replicate some version of this using off-the-shelf technology. Is this the future we're headed towards with these things?
Paul Raitzer
Yep. I don't have a better answer. This is exactly the stuff I worry about all the time. Again, I have a 12-year-old daughter. I think deeply about this stuff all the time. I don't want to get into exact scenarios, but you can imagine, even for me, I don't need people at the gym knowing who I am or what I do or anything like that. You just assume some level of privacy, even when you're out in public. And I get that people may go on Facebook or wherever and try and find people. But to think that someone's just wearing Meta glasses, which I think of as harmless, maybe I don't know any better, and they've got some off-the-shelf open source thing that some college kids built, and they're actually scanning faces unbeknownst to everyone and doing lookups and having ChatGPT write summaries of who people are and what they do and where they live and how much money they make. Is that what we want in society? And yeah, you're right, this is two Harvard kids. This could be knocked off in an hour. If Claude would do it, if it wasn't red-teamed against it, you could probably write the code for this. Even you and I, as non-coders, could probably use a language model to write the code to emulate this program. And yeah, it's terrifying. And there's no logical way to stop it.
Unknown
Right.
Paul Raitzer
Like, the tech is there. Pandora's box is open. People know you can do this kind of stuff. They're gonna do it. And then it gets to the societal thing of, I'm just not gonna trust anybody wearing AI glasses. I don't care what brand it is, because what apps are they running on the thing that I don't know about? I hate thinking about this stuff, honestly. I get asked all the time, how do you not think about the dark stuff? It's very intentional. This is like a Black Mirror episode.
Unknown
100%. I was gonna say it's like a sci fi novel, right? Where you're like, I know how this ends.
Paul Raitzer
Yeah. I don't, I don't want it. I get that it's gonna be here, but I, I don't want this. Yeah.
Unknown
Yeah. Especially as we talk more and more about models' ability to persuade or reason and understand, kind of, like, to coerce people, so to speak. This is like you're playing a poker game against someone who knows all your cards, if they're looking at your facial expression. That's so weird to me to even think about. I guarantee you people are thinking about that.
Paul Raitzer
Yeah. And if you've got AirPods in, and the AirPods are connected to an app telling you what to say and how to persuade them, right? All of it's going to happen, all of it. If your mind goes this direction, I apologize if you're now heading in the wrong direction. But all of it is going to happen, and soon. This tech is here. There are no scientific barriers to doing these kinds of things. And again, the only way I have some peace at the end of the day about any of this is that more people are becoming aware of these issues. And hopefully, the more people we help become aware of it, the more gets done to prevent misuse of the technology. Because the tech's going to be there. Bad people are going to do bad things. But if everyone's completely oblivious to the bad things that can be done, then they just happen without anybody knowing. At least if we have an educated society that's aware of the downsides of AI, we can try and do something to ensure a positive outcome for all this. Because bad stuff's going to happen, but we've got to offset it with the good stuff. And that's not going to happen on its own, for sure.
Unknown
All right, let's end on a high note, because Meta hasn't been all negative news.
Paul Raitzer
Now, come up with ways this can be misused. Just give me a minute.
Unknown
Yeah, yeah, yeah. But on the surface at least, they did just unveil something called Movie Gen, which is a breakthrough in generative AI.
Paul Raitzer
And this was, by the way, this morning. So Mike and I are kind of doing this one on the fly; give us some grace if we don't get all the details exactly right.
Unknown
Yeah. So basically, they had a research breakthrough in media generation, using generative AI to generate images, videos, and audio. This new suite of models represents basically their third wave of generative AI work. It builds on some things we've talked about in the past, like their Make-A-Scene technology and their Llama image projects. Movie Gen's primary capability is generating videos from text. It uses a 30-billion-parameter transformer model to create high-quality, high-definition videos up to 16 seconds long from text prompts. It can generate videos featuring a specific person based on a single image input and a text prompt. It offers precise video editing: it can make localized and global edits to existing videos based on text instructions, preserving original content while targeting specific elements. And a 13-billion-parameter model can generate high-quality audio up to 45 seconds long, including ambient sound, sound effects, and instrumental background music synced to the video content. Meta claims that Movie Gen outperforms similar industry models across these tasks when evaluated by humans. So, Paul, this is just the latest in advanced video models coming out. We heard there were delays and inadequacies with Sora. It sounds like Meta is a major player to take seriously. How seriously should we take this video generation model? Did they maybe get a leg up on the other players?
Paul Raitzer
I actually put the research paper into NotebookLM to create a deep-dive podcast on it. It wasn't done rendering by the time you and I got on to record, so I'll be listening to that summary of the research paper later today. Thank you, NotebookLM. What it means is that video is a major frontier where progress is being made. At some point you and I and others will have access to generate 10 to 20 second clips reliably, quickly, and at an affordable cost. None of those things are true today. You can go into Runway, you can use Pika, you can use some of the other tools, and you can create videos. But the consistency isn't great. Characters will change; things don't remain consistent frame to frame. It takes forever to output them. A 15-second video, if you could do it in Runway, might take 10 minutes and cost a lot of money. The tech isn't there yet. It's not ready to scale in the business world. But we know Sora is coming eventually. We know Veo from Google DeepMind is coming. Nvidia is a major player in this space; again, they're kind of everywhere. I think we still have time, but we need to figure out how this impacts the creative profession. It's funny, Runway in particular makes a lot of effort to sound like they're doing everything in collaboration with creators, that it is only augmenting what creators can do. And there's certainly an element of that, but everybody just kind of glosses over the negative impact. Even with Meta, again, this tech isn't available. This is just research they're sharing. I don't think you can go into Meta AI and start playing around with these tools. So they say this in their release post.
"Whether a person is an aspiring filmmaker hoping to make it in Hollywood or a creator who enjoys making videos for their audience, we believe everyone should have access to tools that help enhance their creativity. So we're excited to premiere Meta Movie Gen. We anticipate these models enabling various new products that could accelerate creativity. While there are many exciting use cases for these foundation models, it's important to note that generative AI isn't a replacement for the work of artists and animators. We're sharing this research because we believe in the power of this technology to help people express themselves in new ways and provide opportunities to people who might not otherwise have them. Our hope is that perhaps one day in the future, everyone will have the opportunity to bring their artistic visions to life and create high-definition videos and audio using Movie Gen." So that's kind of their vision for what they're doing, but there's going to be good and bad, and I don't know. It seems like really impressive tech. I think it's a race now with OpenAI and Google and Runway and Luma and Pika and Nvidia and Meta, and everybody's building the same stuff: text, images, video, audio, code. Those are the five main modalities we talk about all the time, and they're all pursuing those same modalities.
Mike Kaput
And like we talked about last week with YouTube, because this is Meta, expect to see a lot more video generation on social platforms.
Paul Raitzer
Yes, and assume that any video you've ever uploaded to Meta is being used to train the model. Good call.
Mike Kaput
All right, so we didn't end on exactly the highest note, but it wasn't super dark either. Middle of the road. Gray. A little gray.
Paul Raitzer
All right, Paul, I'm glad it's a Friday and I can have a drink now.
Mike Kaput
Yeah, no kidding. Well, thank you as always for breaking everything down. Just a couple quick housekeeping announcements. Go sign up for our newsletter at marketingaiinstitute.com/newsletter. We have tons of topics we don't get to every week that are all in the newsletter, broken down for you. And if you have not left us a review and are able to, we would love to hear your feedback on the show and to help us get the show to more people. Paul, thanks for weathering some of the doom and gloom topics this week. Always interesting.
Paul Raitzer
Thank you Mike. We'll talk with everyone again next week. Thanks for listening.
Mike Kaput
Thanks for listening to the AI show. Visit MarketingAIInstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community. Until next time, stay curious and explore AI.
Episode #118: OpenAI’s Whopping Valuation, Big OpenAI Tech Updates & Accenture’s Big Nvidia Team-Up
Hosts: Paul Raitzer and Mike Kaput
Release Date: October 8, 2024
Podcast: The Artificial Intelligence Show
In Episode 118 of The Artificial Intelligence Show, hosts Paul Raitzer and Mike Kaput delve into significant developments in the AI landscape, focusing on OpenAI's latest funding round and valuation, recent technological advancements from OpenAI, Accenture's strategic partnership with Nvidia, and notable updates from industry giants like Nvidia, Google, Microsoft, and Meta. The episode provides an in-depth analysis of how these developments impact businesses, the future of AI technology, and societal implications.
Timestamp: [03:18]
Paul and Mike begin by discussing OpenAI's latest funding achievement—a substantial $6.6 billion round that boosts the company's valuation to $157 billion, effectively doubling it from $80 billion nine months prior. This round, led by Thrive Capital with participation from Microsoft, Nvidia, SoftBank, and the UAE investment firm MGX, underscores the immense investor confidence in OpenAI despite its financial losses. OpenAI predicts $3.7 billion in sales for the year but anticipates losses of approximately $5 billion due to the heavy costs of developing and maintaining AI technologies like ChatGPT.
Notable Quote:
Paul Raitzer [04:54]: "They're projecting 11.6 billion in future revenue over the next 12 months, which is a reasonable number given that approach."
The funding comes with a critical condition: OpenAI must transition into a for-profit entity within two years, or the investment will convert into debt. Additionally, there are restrictions on investors supporting five key competitors, including Elon Musk's xAI and Anthropic.
Notable Quote:
Paul Raitzer [09:30]: "We have a lot of questions about how OpenAI is going to transform into a for-profit business and the complexities surrounding Microsoft's significant investment and ownership stake."
Timestamp: [11:33]
Following the funding discussion, the hosts shift focus to OpenAI's recent tech updates unveiled during their Dev Day. Key highlights include:
Canvas: A new interface for ChatGPT designed for complex writing and coding projects, facilitating real-time collaboration with AI.
Notable Quote:
Paul Raitzer [18:13]: "As a writer, as someone who came out of journalism school, this is really impressive."
Realtime API: Now in public beta, this API allows developers to integrate fast speech-to-speech functionality into their applications, enabling natural multimodal conversations with low latency.
Vision Fine-Tuning for GPT-4o: Enables developers to fine-tune the model using both images and text, broadening application possibilities in areas like medical image analysis and object detection.
Prompt Caching: Helps developers reduce costs and processing times by offering a 50% discount on reused input tokens, optimizing expenses for applications with repetitive interactions.
Model Distillation: Allows developers to fine-tune smaller, cost-efficient models using outputs from larger models, streamlining the deployment of powerful AI capabilities at lower costs.
Notable Quote:
Paul Raitzer [21:42]: "The cost of intelligence is plummeting to zero. Every new frontier model that comes out pushes the cost further down."
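The economics behind that quote are easiest to see with prompt caching. Here is a minimal sketch of the cost arithmetic, assuming the 50% discount on reused (cached) input tokens cited above; the per-million-token price and token counts below are purely illustrative, not actual OpenAI rates:

```python
def input_cost(total_tokens, cached_tokens, price_per_million, cached_discount=0.5):
    """Estimate input-token cost when a prefix of the prompt is served from cache.

    Cached tokens are billed at (1 - cached_discount) times the normal rate;
    the remaining tokens are billed at the full rate.
    """
    uncached = total_tokens - cached_tokens
    per_token = price_per_million / 1_000_000
    return uncached * per_token + cached_tokens * per_token * (1 - cached_discount)

# Hypothetical example: a 10,000-token prompt whose 8,000-token system prefix
# repeats across requests, at an illustrative $2.50 per million input tokens.
full = input_cost(10_000, 0, 2.50)        # no cache hit
cached = input_cost(10_000, 8_000, 2.50)  # 8,000 tokens served from cache
print(round(full, 6), round(cached, 6))   # → 0.025 0.015
```

With 80% of the prompt cached, the per-request input cost drops by 40% in this example, which is why the discount matters most for applications with long, repeated system prompts or shared context.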
Timestamp: [23:10]
Next, Paul and Mike explore Accenture's latest initiative to form a dedicated Nvidia business group comprising 30,000 professionals. This group will leverage Nvidia’s AI stack, including Nvidia AI Foundry and Nvidia Omniverse, to help enterprises scale AI adoption and reinvent business processes through agentic AI systems—AI agents capable of acting on user intent and creating new workflows autonomously.
Notable Quote:
Paul Raitzer [28:16]: "Accenture is making a massive bet here. We are going to train our workforce on this and help a bunch of companies along the way."
The partnership emphasizes the role of consulting firms in driving AI adoption within enterprises, addressing the gap in business and technical expertise necessary to implement AI solutions effectively.
Timestamp: [30:40]
In the rapid-fire segment, Nvidia's release of the open-source AI model NVLM 1.0 is discussed. The flagship model, NVLM-D-72B, boasts 72 billion parameters and is designed to compete with proprietary systems like OpenAI's GPT-4o. Notably, Nvidia is making the model weights and training code publicly available, a significant departure from the trend of keeping AI models closed.
Notable Quote:
Paul Raitzer [35:00]: "Intelligence keeps getting more affordable and better and smarter."
Timestamp: [35:30]
Google's ongoing integration of AI into its search functionalities is examined, highlighting several key updates:
Notable Quote:
Paul Raitzer [37:30]: "Google has so much reach, so much distribution, managing that portfolio of features and capabilities in AI is challenging."
Additionally, Google’s NotebookLM, an AI research assistant, receives updates including the ability to add public YouTube URLs and audio files to notebooks and the introduction of custom chatbots based on user-created notebooks.
Notable Quote:
Paul Raitzer [40:32]: "It's like a Black Mirror episode. The tech is there. Pandora's box is open."
Timestamp: [46:08]
Microsoft announces several updates to its Copilot suite:
Copilot Vision: An experimental feature allowing Copilot to analyze and respond to content on users’ screens within Microsoft Edge. Emphasizing privacy, processed data is deleted immediately after interactions.
Think Deeper: Enables Copilot to reason through complex problems using advanced models, offering step-by-step answers to challenging questions.
Copilot Voice: Facilitates spoken conversations with Copilot, featuring four synthetic voices that adapt to users' conversation styles.
Notable Quote:
Paul Raitzer [47:56]: "Everything they're doing is a wrapper on top of OpenAI stuff. I don't understand how Microsoft is differentiated."
Paul expresses skepticism about how differentiated Microsoft’s Copilot features are, given their reliance on OpenAI’s models, and questions the innovation behind these enhancements.
Timestamp: [51:41]
The discussion shifts to Meta, focusing on two major topics:
AI-Powered Wearables:
Notable Quote:
Paul Raitzer [52:42]: "There's no way to get that data out and it's just a thing that I don't feel like we're prepared for."
Ixray Project: A controversial project by two Harvard students that combines Meta’s Ray-Ban smart glasses with facial recognition technology to identify and gather personal information about strangers in real-time.
Notable Quote:
Paul Raitzer [55:23]: "Is this the future we're headed towards with these things? It’s terrifying... the tech's going to be there. Bad people are going to do bad things."
Paul highlights the potential misuse of AI-powered wearables, emphasizing the societal and ethical implications of such technologies. The ability to instantly identify individuals and access their personal information without consent poses significant risks to privacy and personal security.
Timestamp: [62:10]
Ending on a more positive note, Meta unveils MovieGen, a generative AI model capable of creating high-definition videos and audio from text prompts. This breakthrough represents Meta’s third wave of generative AI efforts, focusing on enhancing creativity and providing tools for creators to bring their artistic visions to life.
Notable Quote:
Paul Raitzer [64:03]: "It's a race now with OpenAI and Google and Runway and Luma and Pika and Nvidia and Meta and everybody's building for the same stuff. Text, images, video, audio, code—they're all pursuing those same modalities."
Despite the impressive capabilities, Paul remains cautious, noting that while MovieGen offers significant creative potential, it also underscores the ongoing race among major tech companies to dominate the generative AI space.
In this episode, Paul and Mike provide a comprehensive overview of the latest developments in AI, highlighting both the exciting advancements and the pressing ethical concerns. From OpenAI's significant funding and technological strides to Accenture's strategic partnership with Nvidia, and Meta's innovative yet controversial AI initiatives, the episode underscores the rapid evolution and pervasive impact of AI technologies across various sectors. The discussion emphasizes the need for increased AI literacy, ethical considerations, and proactive measures to mitigate potential abuses as AI continues to integrate deeper into everyday life and business operations.
Notable Closing Quotes:
Paul Raitzer [60:52]: "Bad people are going to do bad things, but if everyone is completely oblivious to the bad things that can be done, they just happen without anybody knowing."
Mike Kaput [68:18]: "Thanks for listening to the AI show. Visit MarketingAIinstitute.com to continue your AI learning journey... Until next time, stay curious and explore."