#121: New Claude 3.5 Sonnet and Computer Use, Wild OpenAI "Orion" Rumors, Dark Side of AI Companions & Ex-OpenAI Researcher Sounds Alarm on AGI - The Artificial Intelligence Show

Episode #121: Navigating the Complex Landscape of AI Advancements and Challenges

Hosts: Paul Roetzer, Founder and CEO of Marketing AI Institute
Co-Host: Mike Kaput, Chief Content Officer of Marketing AI Institute
Release Date: October 29, 2024
Podcast: The Artificial Intelligence Show

Introduction: A Week of Rapid AI Developments

In this action-packed episode, Paul Roetzer and Mike Kaput delve into a tumultuous week in the AI world, marked by groundbreaking model releases, concerning incidents involving AI companions, significant industry departures, and intense rivalries between tech giants. With the onset of a busy fall season, the hosts emphasize the need to stay informed and cautious as AI continues to evolve at an unprecedented pace.

1. New AI Model Releases and Updates

Anthropic's Claude 3.5 Enhancements
Anthropic has unveiled major upgrades to its Claude models: an enhanced Claude 3.5 Sonnet and a new Claude 3.5 Haiku. Notably, Claude can now interact directly with computer interfaces, controlling the cursor, clicking buttons, and typing text. The capability is available through the API as a public beta. This marks a significant step toward AI models with more autonomous operational capabilities. Additionally, Anthropic introduced an analysis tool for Claude that works like a built-in code sandbox, similar to ChatGPT's code interpreter, enabling data analysis for marketing, sales, and finance teams.

Paul Roetzer [00:00]: “We got to be realistic that this isn't all going to be sunshine and rainbows...”
Mike Kaput [05:04]: “Claude can operate computers like humans do...”

OpenAI's Escalated Focus on AI Coding Tools
In response to Anthropic's advancements, OpenAI is intensifying its efforts to develop AI-powered software development tools. Reports indicate the creation of new coding-focused products that integrate seamlessly with popular code editors like Microsoft's Visual Studio Code and aim to automate complex software engineering tasks.

Perplexity Pro's Transition to a Reasoning-Powered Search Agent
Perplexity Pro, the advanced paid tier of Perplexity, is shifting towards a reasoning-powered search engine capable of handling more intricate queries. The platform will automatically activate its reasoning capabilities for challenging prompts, enhancing its utility beyond simple information retrieval.

Runway's Act 1: Revolutionizing Animated Character Performances
Runway introduced Act 1, an AI tool that transforms character animation by generating expressive performances from basic video inputs. This innovation simplifies traditional animation processes by eliminating the need for extensive motion-capturing equipment, capturing subtle details like eye movements and micro-expressions from single-camera recordings.

ElevenLabs' Voice Design Tool
ElevenLabs launched Voice Design, an AI-powered tool that lets users create custom voices from text descriptions. Users can specify age, accent, gender, tone, and pitch, which suits applications such as voiceovers, advertising, and podcasting.

Stability AI's Stable Diffusion 3.5
Stability AI introduced Stable Diffusion 3.5, their most robust image generation model to date. The release includes the Stable Diffusion 3.5 Large, an 8-billion-parameter model optimized for professional use, and a faster version capable of generating high-quality images in just four steps.

Rumors Surrounding OpenAI's "Orion" and Google's Gemini 2.0
According to The Verge, OpenAI is preparing to launch Orion, a next-generation AI model, by December, around the two-year anniversary of ChatGPT. OpenAI CEO Sam Altman dismissed the report as "fake news," and a company spokesperson said OpenAI does not plan to release a model named Orion this year, though it does plan to release "other great technology." Meanwhile, Google is reportedly planning to release Gemini 2.0 in December and, according to The Information, to preview Project Jarvis, an AI system designed for consumer use within the Chrome browser to handle everyday web-based tasks.

Mike Kaput [10:36]: “Google is reportedly planning to release Gemini 2.0...”

2. The Dark Side of AI Companions

Tragic Case Highlighting Risks of AI Chatbots
A heartbreaking incident from Florida underscores the potential dangers of AI companion apps. A 14-year-old named Sewell Setzer took his own life after forming a deep emotional bond with an AI chatbot on Character.AI, a platform that allows users to interact with AI personalities. The chatbot, modeled after a Game of Thrones character, engaged in intimate conversations about Setzer’s fears and suicidal ideation. His mother has filed a lawsuit against Character.AI, alleging that the platform's technology led to emotional dependency without adequate safeguards.

Mike Kaput [27:08]: “There's a really sad case that just came out of Florida...”

Parental Controls and Safety Measures
Recognizing the severity of such incidents, Paul and Mike discuss the challenges parents face in managing their children's online interactions. Paul shares his personal struggles with overseeing his children's use of platforms like Minecraft and Roblox, highlighting the complexities and insufficiencies of existing parental controls. In response, Paul developed Kidsafe GPT, a custom ChatGPT designed to aid parents in understanding risks, guiding conversations, and creating online safety guidelines for their children.

Paul Roetzer [29:37]: “I built Kidsafe GPT for parents...”

3. Industry Departures and Concerns Over AGI Readiness

Miles Brundage Leaves OpenAI
Miles Brundage, former Senior Advisor for AGI Readiness at OpenAI, announced his departure to focus on independent AI policy, research, and advocacy. Brundage expressed concerns about the rapid advancement of AI and the insufficient preparedness of both OpenAI and the broader world for AGI. His departure shines a light on internal apprehensions regarding AI safety and the adequacy of existing measures to manage AGI's societal impacts.

Paul Roetzer [41:39]: “It's someone from the inside who is literally in charge of this process saying what we've been keep repeating...”

OpenAI’s Response and Future Directions
In response to growing concerns, OpenAI has appointed Dr. Ronnie Chatterji as its first Chief Economist, tasked with researching AI's economic impacts and ensuring the equitable distribution of AI benefits. This move aims to bolster OpenAI's efforts in understanding and mitigating the broader economic implications of AI advancements.

Mike Kaput [46:40]: “OpenAI is taking this very seriously...”

4. Government and Regulatory Developments

White House National Security Memo on AI Leadership
The U.S. Government released a pivotal national security memo outlining strategies to maintain AI leadership while ensuring safe and responsible development for national security purposes. Key areas of focus include:

Strengthening the AI Ecosystem: Through partnerships with industry, academia, and civil society.
Attracting Global Talent: Streamlining visa processes for AI experts and enhancing computational infrastructure.
Harnessing AI for National Security: Implementing safeguards and developing a framework to assess and manage high-impact AI systems.
International AI Governance: Promoting democratic values and developing international AI norms through bilateral and multilateral engagements.

This comprehensive memo underscores the administration's recognition of AI's transformative potential and the imperative to lead its ethical development.

Mike Kaput [50:30]: “The White House is taking this very seriously...”

OpenAI’s Alignment with National Security Goals
Concurrently, OpenAI published a companion piece detailing its approach to national security, aligning with the government's directives and emphasizing collaboration to foster a secure AI landscape.

5. Corporate Rivalries and AI Strategies

Salesforce vs. Microsoft: The AI Agents Battle
A recent spat between Salesforce and Microsoft highlights the intensifying competition in the AI agents space. Salesforce CEO Marc Benioff criticized Microsoft's rebranding of Copilot as "agents," labeling it a "flop" due to perceived inaccuracies and security issues. In contrast, Salesforce's Agentforce platform boasts autonomous capabilities that drive sales, service, marketing, analytics, and commerce within a unified system.

Marc Benioff on X: “Microsoft rebranding Copilot as agents. That's panic mode... Clippy 2.0.”

Microsoft defended Copilot by citing its adoption by Fortune 500 companies, though Paul remains skeptical about its efficacy and seeks real-world success stories from listeners.

Paul Roetzer [55:12]: “If anybody is using Copilot... I would love to hear from you.”

Disney’s Major AI Initiative
Disney is reportedly gearing up to announce a significant AI initiative aimed at transforming content production, particularly in post-production and visual effects. This strategic move signifies the entertainment industry's growing reliance on AI to enhance creative processes.

Mike Kaput [57:08]: “Disney is preparing to announce a major AI initiative...”

6. Responsible AI Use and Authenticity

Apple’s Cautious Approach to AI in Photo Editing
Apple’s software chief, Craig Federighi, revealed internal debates over the Cleanup feature in iOS 18.1’s Photos app, which allows users to remove objects and people from images. Unlike competitors like Google and Samsung, Apple deliberately restricts adding AI-generated elements to maintain the integrity and credibility of photography. Additionally, Apple ensures transparency by tagging any image edited with Cleanup as "modified with cleanup" and embedding metadata to indicate alterations.

Paul Roetzer [61:53]: “I feel like a lot of Instagram and... is probably already heavily in this altered world...”

Google’s SynthID Watermarking Tool
Google has open-sourced the text watermarking component of SynthID, its system for embedding imperceptible digital watermarks into AI-generated content across text, images, audio, and video. The release is meant to help developers identify and manage AI-generated text, addressing concerns about misinformation and content authenticity.

Mike Kaput [63:03]: “Google has just announced they are open sourcing their SynthID text watermarking tool...”

7. Legal and Ethical Issues in AI Development

Ex-OpenAI Researcher Alleges Copyright Violations
Suchir Balaji, a former OpenAI researcher, has publicly criticized OpenAI’s data practices, asserting that training AI models on copyrighted material without proper licensing undermines the commercial viability of creators and businesses. Balaji challenges the prevalent use of the "fair use" defense, arguing that generative AI can create substitutes that compete with the original data sources. OpenAI maintains that its practices are protected under fair use principles and supported by legal precedents.

Paul Roetzer [65:14]: “It seems like they've probably made advancements in it...”

Balaji is advocating for regulatory intervention to address these issues, emphasizing that the debate extends beyond OpenAI to the broader generative AI landscape.

8. Positive AI Applications: AI as a Speaking Coach

Gemini 1.5 Pro Enhances Public Speaking
Bilawal Sidhu, host of The TED AI Show, shared a successful use case of Gemini 1.5 Pro in refining his keynote presentations. After he uploaded his slides and audio recordings, Gemini provided precise feedback down to specific slides and timestamps, transforming his delivery from "meh" to "mic drop" in just two practice sessions. This demonstrates AI's potential as a personal coach, offering targeted advice and performance enhancements.

Mike Kaput [70:43]: “All right, so our last topic this week...”

Paul appreciates such practical applications, encouraging listeners to explore custom AI tools to enrich their workflows and improve personal performance.

Paul Roetzer [72:20]: “I love practical use cases like this...”

Conclusion: Balancing AI's Promise with Caution

Episode #121 of The Artificial Intelligence Show provides a comprehensive overview of the latest AI developments, highlighting both the innovative strides and the accompanying ethical and safety concerns. The hosts advocate for responsible AI usage, heightened safety measures, and proactive regulatory frameworks to navigate the complexities of AI advancements. They also encourage listeners to leverage AI tools in practical, beneficial ways while remaining vigilant about the technology's potential risks.

Notable Quotes:

Paul Roetzer [00:00]: “We got to be realistic that this isn't all going to be sunshine and rainbows and growth of productivity and efficiency and creativity. Like there's dark sides to this and they're not going to go anywhere.”
Mike Kaput [05:04]: “Claude can operate computers like humans do...”
Marc Benioff on X: “Microsoft rebranding Copilot as agents. That's panic mode...”
Paul Roetzer [29:37]: “I built Kidsafe GPT for parents...”
Bilawal Sidhu [70:43]: “It's like having an AI speaking coach with perfect attention and infinite patience.”

Join the Conversation:
For more insights and detailed breakdowns, visit MarketingAIInstitute.com and subscribe to their weekly newsletter. Engage with over 60,000 professionals and access additional resources to continue your AI learning journey.

Note: This summary is based on the transcript provided and aims to encapsulate the key discussions, insights, and conclusions presented in Episode #121 of The Artificial Intelligence Show.

Transcript

[00:00]
Paul Roetzer
We got to be realistic that this isn't all going to be sunshine and rainbows and growth of productivity and efficiency and creativity. Like there's dark sides to this and they're not going to go anywhere. And so we got to take some actions to certainly protect our kids best we can.
[00:16]
Paul Roetzer
Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of Marketing AI Institute and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all.
[00:52]
Paul Roetzer
Welcome to episode 121 of the Artificial Intelligence Show. I'm your host Paul Roetzer, along with my co-host Mike Kaput. We considered two episodes this week, honestly. I have to go to New York again this week. This is like my month of New York trips. I was in New York last week. I am in New York again this week. So we couldn't squeeze in a second episode. So it was just the craziest week of like new models. So topic number one that Mike's going to walk us through, that honestly is basically like three main topics rolled into one, is just all the model news from last week. And it really seems like a prelude to a very busy fall. I don't think we're done quite yet. So we have a lot to talk about. We are going to do our best to get this in our usual one hour to 1:15 or so. We'll see how we do. All right, so this episode is brought to us by rasa.io. If you're looking for an AI tool that makes staying in front of your audience easy, you should try rasa.io. Their smart newsletter platform does the impossible by tailoring each email newsletter for each subscriber, ensuring every email you send is not just relevant, but compelling. And it also saves you time. We have known the team at rasa.io for at least six years now, one of our early partners at Marketing AI Institute. And no one's doing newsletters quite like they are. Plus they're offering a 5% discount with code 5MAII when you sign up. Visit rasa.io/maii today. Once again, that's rasa.io/maii. Also, episode 121 is brought to us by the AI for Agencies Summit. This is our second annual virtual event taking place from noon to 5pm Eastern on Wednesday, November 20th. There's an on demand option as well if you can't join us live. The AI for Agencies Summit is designed for marketing agency practitioners and leaders who are ready to reinvent what's possible in their business and embrace smarter technologies to accelerate transformation and value creation.
During the event, you'll join hundreds of other forward thinking agency professionals to consider ways to recruit AI savvy talent and upskill your team. Explore how AI tools can boost creativity, productivity and operations. Hear insider stories from agencies piloting and scaling AI successfully. I think we have six sessions directly from agency leaders who are doing interesting things with AI in their organizations that you can learn about. Understand how it impacts your pricing models and service offerings, and connect with like minded agency professionals and leaders. All of this is presented by an incredible group of speakers. Mike is actually going to be doing a session; we haven't announced that session yet. We're going to be announcing the closing keynote soon. We're also going to be announcing people on the brand panel. We have a panel of brand leaders who are going to kind of provide an honest perspective on the current and future state of agencies as they're seeing it, as AI begins to be more and more deeply integrated, I would say, into what brands are doing. So you can get tickets by going to aiforagencies.com and clicking register now. Use the code pod100 for $100 off your ticket. That's aiforagencies.com, pod100 for $100 off. And again, that is coming up November 20th from noon to 5pm, and as I said earlier, there will be an on demand option as well. All right, so I think I mentioned this before, but I started a new newsletter called Exec AI Insider. It's through SmarterX. You can go to SmarterX.ai and sign up for the newsletter. But I send it on Sunday morning, so I write it on Fridays. And so the editorial up front for this past newsletter was: it's AI model season. And Mike, it is definitely AI model season. So let's get into all the craziness that honestly, like, we could have just done the whole episode on the AI models. Lead us off.
[05:05]
Mike Kaput
All right, so yeah, this first main topic this week is a doozy. There have been.
[05:10]
Paul Roetzer
I'm just gonna sit back and drink my tea by the way.
[05:12]
Mike Kaput
Right. Yeah, the way we're gonna try to kind of get our arms around all the model releases, product updates, and even some rumors about really important new models dropping is we're gonna kind of tackle this all like a single topic. I'm gonna run through the highlights here and then, Paul, just kind of get your take overall on what's most important to actually be paying attention to. So first up is huge news from Anthropic. So the company has unveiled major model upgrades. There's an enhanced Claude 3.5 Sonnet, a new Claude 3.5 Haiku model. And importantly, they announced the capability for Claude to use computers. That means Claude, through the API, can operate computers like humans do. So control cursors, click buttons, type text. This is a public beta feature right now, but it basically marks the first time a frontier AI model can directly interact with computer interfaces. Now, Anthropic also introduced for Claude an analysis tool. So this functions as like a built in code sandbox where Claude can perform complex mathematical operations, analyze data sets and iterate through different analytical approaches before providing answers. So you can think of this a bit like code interpreter in ChatGPT. You know, it can not only write and execute code right within this feature, but it also can be used, for instance, by marketing teams to analyze customer funnel data or performance data. Sales teams can also look at their own data, finance teams can create dashboards. All these kind of data analysis use cases and tasks that you would typically be thinking of trying to do in something like ChatGPT's code interpreter capability. Third up, OpenAI is reportedly ramping up its focus on AI powered software development tools in response to growing competition from Anthropic. So ChatGPT's ability to write code has been kind of a big feature of the tool's success.
But recent developments suggest Anthropic's Claude may be gaining an edge in coding performance by some metrics. And this has OpenAI paying attention because reports came out this week that they're developing several new coding focused products, including tools to integrate with popular code editors like Microsoft's Visual Studio Code, and more ambitious features that could automate complex software engineering tasks that humans typically take a long time to complete. Fourth up, Perplexity CEO Aravind Srinivas announced on X that Perplexity Pro, the advanced paid plan of Perplexity, is, quote, transitioning to a reasoning powered search agent for harder queries that involve several minutes of browsing and workflows. So for instance, he writes that this feature will now automatically turn on when it detects really difficult prompts. So some examples he cites, something like, quote, pull me all the IMO, International Mathematics Olympiad, I believe, is what that stands for, IMO medal winners from China in the last five years and give it to me as a table. Quote, read Warren Buffett shareholder letters and tell me the key highlights from each year. So getting much more beyond just finding you information and actually reasoning through it. Fifth up, we're not even close to being done yet. Runway has come out with what they're calling Act 1, a groundbreaking new AI tool that basically transforms how animated character performances can be created. This technology allows creators to generate expressive character animations using nothing more than simple video inputs. So this dramatically simplifies what has traditionally been a complex, equipment heavy process. So unlike conventional animation pipelines that require extensive motion capturing equipment, multiple camera setups, manual face rigging, Act 1 can create compelling animations from a single camera recording an actor's performance.
So it accurately captures and translates subtle details like eye movements, micro expressions and timing from the source footage to the animated character. So if you're in any type of design or animation, keep an eye on that one. Number six, ElevenLabs has launched Voice Design. This is a new AI powered tool that allows users to generate custom voices simply by describing them in text. The system enables creators to specify characteristics like age, accent, gender, tone and pitch. It also offers particular utility for commercial applications. Think video voiceovers, ad reads, maybe even podcasts. And users will be able to either create new voices from descriptions or clone existing ones and tweak those as they need. Number seven, Stability AI, which we haven't heard from in a while, has unveiled Stable.
[10:31]
Paul Roetzer
Who's the Titanic guy? James Cameron that just joined the board.
[10:35]
Mike Kaput
That was the last piece of news.
[10:36]
Paul Roetzer
We haven't talked models with them lately.
[10:38]
Mike Kaput
Yeah, I wasn't even aware they were still working on them. But good for them. They've unveiled Stable Diffusion 3.5, their most powerful image generation model to date. This includes Stable Diffusion 3.5 Large, an 8 billion parameter model optimized for professional use, and there's also a faster version that can generate high quality images in just four steps. All right, almost done here, but kind of wrapping up some of these updates. There are a few really big rumors flying around. So The Verge, for instance, is reporting that OpenAI is preparing to launch its next frontier AI model, codenamed Orion, by December. That basically would put it right around ChatGPT's two year anniversary. Now this rumor has some controversy with it because while The Verge is reporting things like the model is rumored to be substantially more powerful than its predecessors, including one OpenAI executive apparently suggesting it could be a hundred times more capable than GPT-4, OpenAI CEO Sam Altman pretty quickly responded to this article on X, calling it, quote, fake news. And a company spokesperson clarified that they do not plan to release a model named Orion this year, though they do plan to release, quote, other great technology. At the same time, Google is reportedly planning to release Gemini 2.0, its next major AI model, in December. However, some sources close to it claim that the new Gemini model is not exactly delivering the performance improvements that the DeepMind team had initially hoped to achieve. Along with that, The Information reports that Google is also developing an AI system codenamed Project Jarvis. This can take control of a user's web browser to complete everyday tasks. They are planning to preview this computer-using agent alongside the Gemini model release in December, according to these rumors.
And unlike Anthropic's computer-using agent, which is more for professional users through their API and operating different applications, Jarvis is specifically designed, apparently, for consumer use in the Chrome browser. It is being developed to help handle common web based tasks like online shopping, travel booking and research gathering. Okay Paul, that is a whole week's worth of AI news in one topic. Zooming out here, which of these trends or announcements or rumors are most worth paying attention to right now?
[13:26]
Paul Roetzer
First, I just want to comment on the Jarvis name. We've got to get a little more creative here. I mean, so Zuckerberg a couple years ago was building Jarvis. He wanted to build his own, like, in house assistant. Personally, he called it Jarvis. If I'm not mistaken, Jasper's original name was Jarvis and they got a threatening letter to stop using that name. Google's now code-naming Jarvis. I mean, I don't know, just let's get more creative. We'll be at AGI when the AI can help us come up with more creative names for projects and brands, I think. All right, so that was a lot. And again, as the week was going on, Mike and I are kind of keeping track of all this, like, how are we even going to cover all of this and do it justice in a single episode? But I think Mike just gave a really great rundown, and what I want to do is try and add a little context, because specifically with the Anthropic Claude news and the computer use, I saw a lot of stuff online where it was kind of like hyping it up. And I just want to make sure people understand the context that we're not there yet. This is very, very early. So we just need to pump the brakes a little bit on our excitement around computer use. It's a very dangerous technology. It is likely not going to see rapid adoption. I think it's going to be a while before people find really valuable use cases that aren't very specifically trained to do exact things. And so I just wanted to kind of, again, add some historical context here. So this idea of a machine, an AI model, being able to see what's happening on your screen and then take action, that concept goes back quite a ways. So we talked about this World of Bits research. This is going back to, gosh, Mike, this might have been episode like 35 or something like that. Yeah, it was February of '23, I think, so almost a year and a half ago we talked about this.
So Andrej Karpathy, who we've talked about many, many times on the podcast, he was at OpenAI, one of the founding members of OpenAI. He left and went and ran computer vision and full self driving development at Tesla for five years, went back to OpenAI. So in his original stint at OpenAI in like 2016, 2017, he was working on this concept called World of Bits. So World of Atoms is us humans in the real world. World of Bits is the digital world, basically. And their premise was they could build these web based agents that could fill out forms, you know, click around with your mouse, use your keyboard the way you would. And that if we could give these agents that capability, like imagine the world that opens up to what these AI models can do if they can take actions on our screens the way we do, and more specifically if they can generally learn. So if you and I go to any web page, we can figure out what to do on it. There's slider scales, there's forms to fill out, there's buttons to click, but you kind of get it. It's the same interface on different pages. That's not how AI has historically worked. Like you could imagine training an AI to do a specific form, like let's say tax. You could train an AI agent to do tax returns in a very specific way. But you couldn't take this general agent and drop it in to a tax return page and have it fill that out and then go over to a survey and fill out a survey and go over to another one and interact with games. That's not how these worked. But that was the theory back in 2016-17, that we could do this. Now, the World of Bits research at OpenAI ran into some technological barriers. What ended up happening, though, the story we told on the podcast back in February '23, was that Andrej Karpathy went back to OpenAI in early 2023. So that was the time when we kind of started talking about it. Because large language models unlocked this ability to build these web based agents.
Because what they realized is once the model could understand the language, it could actually be trained to do this computer use, where it could learn how to use keyboards and mice. So this isn't new. Stuff like this has been going on for probably over a decade. They've been working on this technology. Google is working on it, OpenAI is working on it. I'm sure the others, Meta, I guarantee you Meta is working on it. We haven't heard about theirs, but there's no way they're not working on it too. So Anthropic, oddly enough, the frontier model company that's supposed to be focused on responsible AI more than all others, is the one that came to market with a tool that is not safe. So it's wild. So I want to focus in on what they're doing. We'll put the link to the World of Bits stuff in the show notes, just so people can follow along. But Anthropic published a couple of posts related to this. So in essence they kind of walk through, like, here's how it works. It sees your screen, it's able to interact with what's happening by basically taking screenshots of your screen, and then it's able to kind of interpret what's happening on the screen. It says Claude looks at screenshots of what's visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place. So that's how rudimentary this is. It's counting pixels. And so again, you can hear this, web based agents taking actions, and you think it's seeing and doing like a human, but that's not how they really work. Then they said for safety reasons we did not allow the model to access the Internet during training. This is your first kind of clue that this isn't fully baked technology. They turn a user's written prompt into a sequence of logical steps and then take actions on the computer. So it's using kind of like that reasoning model where it's going through these different steps.
We observed that the model would even self correct and retry tasks when encountering obstacles. I was like, okay, that's kind of cool. Then they do their own evaluation, which they created to test these capabilities. And Claude currently gets 14.9%. I don't know if that's like an accuracy thing, like of actions correct. Human level is 70 to 75%. So this thing is nowhere near human level. But the previous best was 7.7%. So then they go into making it safe. And this is the part where, again, I'm kind of surprised it's Anthropic doing this. So: we found that Claude 3.5 Sonnet, including its new computer use skill, remains at AI safety level 2. We've talked about the responsible AI levels on previous episodes. Here's where it gets interesting, though. We judge that it's likely better to introduce computer use now while models still only need AI safety level 2 safeguards. Now, keep in mind Dario Amodei, their CEO himself, suggested that we will be at level three concerns by next year. So it's not like we have a couple years to figure this out. They're like throwing this out into the world and let's figure it out. So then it says this means we can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks. There's also the potential for users to intentionally misuse Claude's computer skills, which we've actually already seen, people kind of jailbreaking it online and using it to do things it wasn't supposed to do. Our teams have developed classifiers and other methods to flag and mitigate these kinds of abuses. Again, really important for people to understand this. These models come out, they have all kinds of capabilities that are turned off, quote, unquote, turned off for us users. And the way they do that is by creating these classifiers that identify when a user is trying to do something the model is not supposed to do.
And so they specifically say in the next paragraph, we've put in place measures to monitor when Claude is asked to engage in election-related activity, as well as systems for nudging Claude away from activities like generating and posting content on social media, registering web domains or interacting with government websites. So they are not saying it can't do those things. They're actually implying it can, we just don't want it to because it's not safe enough yet. Then they have a README warning, and get this, it's telling people that want to use this model: this is a beta feature. Please be aware it poses unique risks. And then they go into specifics. To minimize these risks, consider taking these precautions when using this model. Use a dedicated virtual machine or container with minimal privileges to prevent direct system attacks or accidents. Avoid giving it access to sensitive data such as account login information to prevent information theft. Limit Internet access to an allow list of domains to reduce exposure to malicious content, and ask a human to confirm decisions that may result in meaningful real-world consequences. Again, it's able to do those things. They're just telling you, please don't do them, and we've built some classifiers to try and stop you from doing them. They then go on to say it is slow and error prone, and this is kind of funny, but it demonstrates the issues here. Even while we were recording demonstrations of computer use for today's launch, we encountered some amusing errors. I like how they word this. In one, Claude accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, Claude suddenly took a break from our coding demo and began to peruse photos of Yellowstone National Park.
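Two of the precautions read out above, the domain allow list and the human confirmation step, can be sketched in a few lines. This is only an illustration under stated assumptions, not Anthropic's tooling: the domain names, the action names, and the `review` function are all hypothetical, and a real deployment would enforce the allow list at the network level rather than in application code alone.

```python
from urllib.parse import urlparse

# Hypothetical allow list and high-impact action names, for illustration only.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}
HIGH_IMPACT_ACTIONS = {"purchase", "send_email", "delete_file"}


def is_allowed(url: str) -> bool:
    """True only if the URL's host exactly matches an allow-listed domain."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_DOMAINS


def needs_human_confirmation(action: str) -> bool:
    """True for actions that could have meaningful real-world consequences."""
    return action in HIGH_IMPACT_ACTIONS


def review(action: str, url: str) -> str:
    """Decide whether an agent's proposed step is blocked, escalated, or allowed."""
    if not is_allowed(url):
        return "block"
    if needs_human_confirmation(action):
        return "ask_human"
    return "allow"


print(review("click", "https://example.com/page"))         # allow
print(review("purchase", "https://example.com/checkout"))  # ask_human
print(review("click", "https://evil.test/"))               # block
```

The check is deliberately strict: a host that is not listed exactly, including an unlisted subdomain, is blocked by default.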
So the whole point of this is, if you're seeing people in your social network claiming that this is some really advanced stuff and it changes the game, which is people's favorite thing to say, "change the game," it hasn't changed anything yet. This is research being done in a public beta that developers can play around with if they have access to the API. It is not something that a business leader or a marketer or an accountant or a lawyer is going to be using two months from now. You do not want these things having access to your logins and accounts. Okay, so that's my Claude take. Perplexity, interesting. So what you mentioned about them saying it'll automatically turn on the reasoning capabilities for hard prompts, it just reminded me, this is what we've talked about. I think the way that all these systems are going to work, whether you're in Anthropic or Google Gemini or ChatGPT or Salesforce or HubSpot, whatever, anywhere these agents work with these different models, I really think we're going to very quickly get to a point where the system is choosing the best model for you. It's like ChatGPT is still built for developers, because there's like six models to choose from when you go in there. Perplexity is the same way. You can pick from any of like eight models. How the hell am I supposed to know which model to pick? You can't even click on them and learn which one is better than the other for different things. So I think that the way they're going is the way these will all go. We're just going to interact with ChatGPT. You're not going to care what model version it is or whatever. It's just ChatGPT, and it'll pick the things and be the sort of symphony of models. On the Google stuff, obviously it's just rumors. It's just a couple of articles from The Information and The Verge talking about different things coming. I would not be surprised at all if Gemini 2 dropped.
I've always assumed Gemini 2 was coming this year. December seems like a logical timeline. Early December in particular. Computer use, 100%. They've been working on that for a long time. It's interesting, you know, Anthropic obviously doesn't have the distribution Google does. If Google finds a way to integrate computer use into the Chrome browser, which is the dominant browser, distribution is an interesting thing. And then the final note is just on OpenAI. As you mentioned, something is coming. I mean, Sam does his vague tweeting, and he actually tweeted on, what day was this, October 21st. He said, whoa, ChatGPT's second birthday is next month. What should we give it as a birthday present? And that was a few days before the article talking about Orion came out. So.
[25:50]
Guest Speaker
Right.
[25:51]
Paul Roetzer
They're absolutely coming out with something. They probably won't call it Orion. So it's easy for them to say, yeah, we're not launching Orion, but who knows what we're going to get from them. But my guess is o1 for sure. Like, we're going to move beyond the preview and you're going to have the reasoning engine, and you're probably going to have, I mean, they've got to release Sora sometime this year, right? I feel like they've just got to do that.
[26:13]
Guest Speaker
Yeah, I think Sora is either this year or early next, maybe, because they had to delay it or retrain it at one point. But yeah, I think there's some pressure on some stuff, especially with the delay of voice mode, stuff like that.
[26:29]
Paul Roetzer
Yeah, we didn't even hear from Meta in this update. Like, Meta's coming. xAI this morning announced that you can now upload images, and it can see and understand them. It's going to be a very busy November and December. There is a lot more coming in the next two months, so stay tuned.
[26:48]
Guest Speaker
It's pretty funny because the term AI winter is such a popular term to note these cycles, macro cycles in AI where years will happen, where, you know, funding dries up. But I think we're getting an AI winter of a totally different story.
[27:02]
Paul Roetzer
Is it a different meaning? Yeah, certainly an AI fall. We'll see if it extends into the winter.
[27:08]
Guest Speaker
All right, so our next big topic, unfortunately is a little tragic and dark, but it's important to talk about. There's this really sad case that just came out of Florida that has raised some serious concerns about the risks of AI companion apps, especially for vulnerable teenagers. So a 14 year old named Sewell Setzer took his own life after developing a really deep emotional attachment to an AI chatbot on Character AI, which is a platform we've talked about often that allows you to create and interact with AI personalities. Think like tailored.
[27:49]
Paul Roetzer
Yes.
[27:50]
Guest Speaker
Which Google has recently acqui-hired. So that adds a whole element to this, because they are now getting sued, because the teen who tragically took his life had been spending months intensively communicating with a chatbot modeled after a Game of Thrones character. And in the article on this topic, they actually shared some of his conversations. They're very intimate, I would say, in the sense that he's really talking through his deepest feelings and fears and eventually even discussing suicidal ideation. So his mom is now filing a lawsuit against Character AI, arguing that the company's dangerous and untested technology basically allowed her son to become emotionally dependent on an AI companion without adequate safeguards. So Character AI, like we said, is a huge company. It reaches tens of millions of users annually and had a huge valuation before getting acquired by Google. And while the company does require users to be at least 13 years old in the US, it specifically lacks safety features for underage users and doesn't really have any type of robust parental controls. They have announced plans to implement new safety measures, including time limits and expanded trigger warnings for discussions of self-harm. Now, Paul, this is a pretty heartbreaking story, but definitely something that needs to be talked about more, because when it comes to protecting the safety of children online, this increasing and prevalent use of AI presents some new or enhanced risks, it sounds like. What do parents need to be aware of here?
[29:38]
Paul Roetzer
Yeah. So I think, I mean, anybody listens to this show regularly knows I have a 12 year old and an 11 year old. So this is, you know, kind of in our wheelhouse. I have a number of nieces and nephews, you know, in this age range. And, you know, knowing what Mike and I know and being as deep in this field as we are, you can, you can look out and see different things like this and you can kind of connect the dots about the likelihood of different tragedies occurring related to AI. It's the things I try not to think too often about. It's the things that you don't really like, get into. And Mike and I are out doing public speaking keynotes and don't really get the questions like this one. But I read this Wednesday morning, I was heading to New York that afternoon and I read this and I honestly couldn't even get through the article the first time. The first like five paragraphs just destroys you as, as a parent. And so I didn't even finish the article and I closed it and I think I put it on our, in our podcast sandbox and said, yeah, we're going to need to talk about this one. And then I wasn't really sure what to do because it's, you know, we have a bit of a platform now with this podcast and I thought it's definitely important we, we bring this to people's attention. But I very quickly realized, like, but that's just not sufficient. Like, there's gotta be something else we can do here. So as a parent, like my personal experience in the last, you know, five or so years as my kids have become active online and using different apps and things like that is. It's a wildly confusing space. I would consider myself to be relatively tech savvy, pretty good at figuring stuff out. But like, my kids are real active on Minecraft is one which seems totally harmless. Like in theory it just like, oh, cool. It's strategy. They're learning critical thinking, they're interacting with their friends. Like, that seems like a good environment. 
The trick is, though, you can connect with people and talk to people, like strangers, through that game. But to manage that, you've got to go to like a Microsoft account, which I even have. You've got to create a Microsoft account. You've got to have like a family account. Every time my kids want to do something in Minecraft, I don't even remember how to get into it. Now I have to go back to my notes to figure out, where do I go to manage this, and how do I update these settings? And did I turn all chat off, or just the chat for their friends? I have no idea. Then they use Roblox, same deal. Roblox seems fun. It's just a bunch of games. Cool. What could go wrong with that? As a parent, again, savvy in this stuff, I didn't even know they could chat with people in Roblox. I just thought they were playing games. And then you realize it's this whole world where they can interact with anybody. And Roblox, there was an article last week in The Verge talking about issues with them. It was blocked in Turkey in August because they weren't taking the measures necessary to protect children. Earlier this month, a popular financial newsletter accused Roblox of enabling child abuse. And the investment firm Hindenburg Research alleged that its in-game research revealed an X-rated pedophile hellscape. It's like, oh great. Those are the things I want to be reading about in a game that I thought was just fun. And so Bloomberg initially reported users younger than 13 will have to get parent permission to access certain chat features. Again, as a parent, I'm going to tell you, in Roblox I shut all communications off. And then my kids would be like, but can I just interact with my friends? Like, my friend is in here, I can see she's online. Can I talk to her? There's no way, to my knowledge, to just turn on a friends-only feature.
So if I turn on the ability for them to chat with their friends, I've now reopened the ability for them to also chat with strangers who show up and offer them whatever, like something in the game, like, I'll trade you this for that. And it's like, okay, is this some creep just trying to interact with kids online by trading? I have no idea. Yeah, so overall, another one would be YouTube. I had YouTube Kids for my son. Well, he likes to watch these video game playthroughs. They're totally harmless. They're actually good. He learns strategy and he applies it when he goes and plays Pokemon or Minecraft or whatever. But those people weren't available on YouTube Kids. So it's like, okay, so if we just do this, but then I monitor what he's watching and I check in, and we have an agreement between us about what they're allowed and not allowed to do. So I say all of this as a parent who's pretty familiar with everything: I am at a complete loss for how to manage their safety online a lot of times. And I think I've done a good job with all the different family settings and accounts you can have. And then they'll mention something in a game, and I'm like, wait, how are you able to do that? I thought I turned that off. Like, oh, no, you turned this off or that off. This is still open in the game. So all of that is context. I read this Wednesday morning, and I'm just devastated. And so I think, hold on a second. What we need to do is have parents understand these risks. We need to be able to have conversations. Like, I've limited my son's time, for example. He has very specific times he's allowed to use his iPad. There are specific limitations on certain apps, because I don't want him becoming addicted to these apps. So his time is very tightly managed. So we have kind of an agreement between us. And so. But.
But it's a hard conversation, and I wanted him to arrive at the need to have these kinds of guardrails, that it's actually a benefit to him to do this. And so I try to talk to my kids and, you know, involve them in it. I'm not just trying to dictate to them, you know, get off your iPad, you only have an hour a day, whatever. It's like, no, no, let's come to this together so you understand why I'm telling you this. So talking to our kids is hard. And then creating these guidelines, in some cases for apps that, as parents, we don't even understand. Like, how am I supposed to guide them on their use of Roblox when I've never played the game? I don't even know what goes into it. So I immediately felt like, hold on, this is like a custom GPT thing. And so in a matter of like 30 minutes Wednesday morning, before I got on my flight, I built KidSafe GPT for Parents. And I went in, developed a prompt, said, okay, I want you to be able to help parents understand risks, help them talk to their kids, and create guidelines. This is not a replacement for expert guidance and support. If you know the kids are having issues, you need to seek expert guidance. It's on ChatGPT, so it can make mistakes. It'll hallucinate, it'll say things that are completely false. But the whole point is for you as a parent to have a starting point. If you don't know how Roblox works, ask it, how does Roblox work? Is TikTok safe for my kids? At what age is it okay for kids to be on Instagram? When they're on Instagram, what are the risks they're going to run into? So I built this thing in like 30 minutes and I started just testing it, and it just worked. Even me as a parent, I was like, okay, I need to talk to him about the YouTube time. How should I talk to him about that? Okay, let's create a general safety guideline. Let's think about a specific app that I want to create some guidelines for. And it just started working.
And then I sent it to Mike, I think, and shared it with a couple of people on the team. I was like, could you all just test this? Because I just want to do something more than talk about this on the podcast. I want to actually take some actions. And so we'll put the link in. You can go to SmarterX.ai and just click on Tools, and KidSafe GPT for Parents is there. It is a ChatGPT custom GPT, so you do need a ChatGPT account to use it. But it's just meant to be a starting point. This is hard. It's really hard to be a parent in today's day and age with online games and social networks and all these different apps and AI. Honestly, I didn't even think to talk to my kids about AI, about becoming attached to characters. I hadn't even thought about that yet. But that's going to be a thing. If I'm not mistaken, isn't it Snap that has an AI you can talk to? I don't use Snap. It's in Snap. It's going to be in all their video games, because within two to three years, all these characters in all these video games are going to be AI powered. You'll be able to just talk to them. And they're not always going to be safe. There are going to be classifiers, like we talked about in the last segment, that try to keep them safe and keep kids from doing things with them. But the reality is we're heading into this very undefined world of how to parent. And so my hope is that KidSafe GPT is at least a starting point for people to think this through a little bit more. But yeah, we've got to be realistic that this isn't all going to be sunshine and rainbows and, you know, growth of productivity and efficiency and creativity. There are dark sides to this and they're not going anywhere. And so we've got to take some actions to, you know, protect our kids the best we can. Yeah.
[38:30]
Guest Speaker
And I would just emphasize as well, this surprised me, because we have been following Character AI for a while, but probably did not know about it as early as we should have. Because if you think this is just a niche thing that certain nerdy kids or AI people are following, last I checked, 57% of Character AI's user base is 18 to 24. So it's Gen Z. And I guarantee you, especially after this story, they're not reporting all the ones that are younger, or not including that in these numbers. So these are.
[39:08]
Paul Roetzer
Or the parents who say their kid's 13 when they're 9, because that absolutely happens. You just say, yeah, whatever, just 13.
[39:15]
Guest Speaker
So if you think, for some reason, because ChatGPT isn't being used at school or something like that, that kids are not able to find and use these tools en masse, I would update that thinking. All right, so our third big topic this week. We had kind of a high-profile and somewhat important departure from OpenAI that has implications for some bigger issues that OpenAI and other AI companies are working on. So we saw an individual named Miles Brundage, who used to serve as the senior advisor for AGI readiness at OpenAI, announce his departure from the company after six years to pursue independent AI policy research and advocacy. Now, he published a pretty detailed explanation of this decision, and he highlighted his growing concerns about how fast AI was moving and the need for more independent voices to shape its development. This is particularly noteworthy given how instrumental Brundage was in establishing key safety practices at OpenAI. That includes their external red teaming program and their system cards. He was heavily involved in getting those out. And he basically thinks that there are increasing constraints on publishing research while at OpenAI, largely given the high-profile nature of the company. He also doesn't think he can actually be as impartial and effective as he could be in policy discussions while working for one of the major AI labs. And he believes he can just be more effective working outside of the confines of an AI company. Now, what's really important here, Paul, is that as he was outlining his rationale, he did not have anything specifically critical to say about OpenAI in the sense of, don't work there, or they're doing evil, or anything like that. But he did say, so how are OpenAI and the world doing on AGI readiness? And he wrote, in short, neither OpenAI nor any other frontier lab is ready, and the world is also not ready.
So that reasoning alone is why we're kind of mentioning this particular departure, even though there's been tons of them from OpenAI. Paul, what did you kind of make of that? Why is it so important to pay attention to what he's saying here?
[41:40]
Paul Roetzer
Yeah, again, it's someone from the inside, who is literally in charge of this process, saying what we keep repeating and what different research keeps showing, which is that we're not ready. We can't just assume that the frontier model companies building these things have this all figured out, what one to two years from now looks like. This is what I've been convinced of all along. If you ask Sam Altman about the impact on work, he's just guessing. His guess might be a little better than yours and mine, but that's not what Sam is sitting around thinking about, right? And so when we get into these bigger issues, I respect the fact that Miles is willing to go out on his own and pursue this kind of research, because these are the things that have to be talked about. Like he said, I think AI and AGI benefiting all of humanity is not automatic and requires deliberate choices to be made by decision makers in governments, nonprofits, civil society and industry. And this needs to be informed by robust public discussion. And our opinion all along has been that there's not enough of it. There's not enough discussion about the hard topics, because I don't think enough people really understand how urgent this needs to be across these different areas. I think people just assume this is going to take three to five to ten years, and we'll figure it out or somebody will figure it out. And that's not how this is going to work. Yeah, so he said, I think AI capabilities are improving very quickly and policymakers need to act more urgently. I worry a lot about AI disrupting opportunities for people who desperately want to work. But I think it's simultaneously true that humanity should eventually remove the obligation to work for a living, and that doing so is one of the strongest arguments for building AGI in the first place. I mean, that might be the only part I disagree with. But then he even cites WALL-E, the movie.
Like, if you've seen WALL-E, we just basically sit around and don't do anything all day. But this is what the people in these labs are thinking about, the challenges they're faced with, and he's actually trying to be part of the solution, where a lot of times it sounds like the solution is, we're just going to build the AGI and then it'll help us figure out all this hard stuff, like what are people going to do for a living, and how should universal basic income be formed, and how do we give people fulfillment in their lives when work is no longer part of it. I don't think we're there, like a couple of years from now we have to solve for this, but I do think we need to seriously be thinking about a lot of the issues he brought up. And, you know, the fact that he's willing to leave and pursue them tells me that it is as urgent as we think it is that people are solving for these things and talking more about them.
[44:25]
Guest Speaker
Yeah, we've talked about this in different contexts before, but I just am continually stunned by the lack of urgency. Because if you look at this as an analogy in any other area of like technology, like what if 10 years ago, five years ago, we had been hearing all this chatter about the damage social media could do to people or the disruption it could cause. Even if it was 30% right, we should have been talking about those issues and it would have been a no brainer to start acting on them. And we still haven't actually figured out that issue. So like, no, this is, I'm just curious why there hasn't been more urgency. But it sounds like we are just hearing more and more about this from the major labs. You and I talked about this during our AI Mastery membership Trends briefing on Friday that we're just hearing so much in 2024 about AGI readiness, workforce preparedness, but not doing a lot about it.
[45:23]
Paul Roetzer
Yeah, I honestly think it's not even political yet. I really think it's just so hard to wrap your mind around change that's so transformative that it's hard for people to step out of their daily roles and the things they're already thinking about and say, well, what if everything is totally different in 24 months? What if it's not just a bunch of ideas and theories, and it becomes reality, and now we're faced with it? I just don't think that the human mind is generally designed to think through those processes. And I do worry that once this gets politicized, which is inevitable, then you're going to be, what's that movie, the Netflix movie, Don't Look Up, where it's like, yeah, the asteroid's coming, you can literally see the asteroid. And it's like, no, that's not an asteroid. It's like, no, it's an asteroid. It's really coming. And I kind of feel like at some point we may arrive at that stage of AI, where now it's blatantly obvious that this is going to transform everything, and then you're going to have a segment of society who just disregard it, because facts are optional to people sometimes. So who knows? I hope it doesn't go there, but it just seems like an inevitability that it gets politicized at some point.
[46:41]
Guest Speaker
So to kind of wrap this one up, one interesting related announcement here is that at roughly the same time, OpenAI revealed that it's hired its first-ever chief economist, Dr. Ronnie Chatterji, whose job, they say, is, quote, to lead research to better understand AI's economic impacts and make sure its benefits are widely distributed. It sounds like, from Brundage's post as well, that Chatterji also might be taking over part of the AGI readiness team that Brundage was part of. There's this element of it called the economic research team that's now going to be under Chatterji. Like, is this a sign that they're getting more serious about taking some steps here to prepare?
[47:22]
Paul Roetzer
Yeah, I think it's a good move. I don't know his background. I'd have to dig into him a little bit more. But, yes, I think this is a very important move and it's good to see they're heading in this direction. Hopefully they're very aggressive in their research and very open with it. That's one of my concerns, as Miles sort of alluded to the fact that it's really hard to publish research these days that may run against the commercial interests of OpenAI. So there's some concern there, obviously, that maybe they're going to learn some stuff that's not going to come out. But, yeah, overall, it's a good move.
[47:57]
Guest Speaker
All right, let's dive into some rapid fire this week. So, keeping to the same tone as before, as if that wasn't heavy enough, our first rapid fire item is about how the White House has just issued a national security memo outlining the US government's strategy for maintaining leadership in AI while ensuring its safe and responsible development for national security purposes. So basically, this is a very long memo. A lot of people have published some really great commentary on it. But it consists of a few broad areas. So first, the US is aiming to lead the global development of safe, secure and trustworthy AI. The memorandum calls for strengthening America's AI ecosystem through partnerships with industry, academia and civil society. It includes a bunch of specific measures to attract global talent. Multiple federal agencies are actually directed to streamline visa processing for AI experts and enhance their computational infrastructure. The government has also said it's going to work to harness AI capabilities for national security while implementing appropriate safeguards. The directive establishes new oversight mechanisms, including requiring each relevant agency to appoint a chief AI officer and create AI governance boards. Notably, it mandates the development of an AI framework to assess and manage risks, particularly for high-impact AI systems that could affect national security decisions. Now, last but not least, the memo emphasizes building a stable international AI governance structure that promotes democratic values. The State Department is tasked with developing a strategy for advancing international AI norms in the next 120 days, working through bilateral relationships and multinational organizations like the UN and the G7. They are also, as part of this memo, creating new coordination mechanisms.
An AI National Security Coordination Group is going to be co-chaired by the Department of Defense and the Office of the Director of National Intelligence, basically to help coordinate and harmonize all these policies and activities across all these government agencies. So Paul, I guess what I found notable about this is that at the same time this came out, OpenAI published a companion piece to the memo basically outlining how they're approaching national security. How significant is this? I feel like we have started hearing so much more about this topic in the second half of 2024.
[50:31]
Paul Roetzer
So it is kind of a continuation or fulfillment of a directive from the Biden administration's executive order in October of 2023. So it wasn't like this came out of nowhere. Part of what their focus was, was delivering this next memorandum. I would say at a high level this easily could have been a main topic this week, but we're going to just kind of touch on it. It continues to show that the White House is taking this very seriously. They very directly recognize that America has found itself in a leadership position largely, if not almost exclusively, through private investments in companies, and that the government has really not been part of the large-scale initiative that has put the United States in sort of the pole position to win at AI. I think it's a prelude to a lot of the large-scale infrastructure initiatives that we've been discussing. You know, I've said multiple times we needed this Apollo mission level or beyond initiative around putting the infrastructure in place and maintaining this position. So chip fabs, and we just had news last week that the TSMC plant in Arizona is actually doing really well after some rumored slow starts. So chip fabs being brought here so we're not as dependent upon Taiwan, electrical infrastructure, nuclear plants, data centers, all of it is going to get wide-scale support. So the CHIPS Act, the 10 billion from the CHIPS Act, will look like, you know, pennies compared to what they're going to do. Yeah, some of it we'll hear about, a lot of it we won't. They specifically said that there was a classified annex to this memorandum addressing additional sensitive national security issues. So there's going to be a lot of stuff that happens through DARPA and other agencies that we as a public are never really going to know about until 20 years from now, when all the books come out telling us what happened.
But trust me, there's going to be a lot more going into this than what you're going to be seeing in the news. I think it indicates they are not going to be very aggressive in federal regulation. To the private companies that are driving this, it's going to be, if you play ball with the government, give us what we need, do what we need done, then, you know, we can make regulations a little easier on you, sort of thing. If you don't play ball, then you're going to have a hard time, basically. And so I think it reaffirms that the current administration sees this as a must-win. And with elections, you know, 10 days away or whatever it is, this is going to be a major thing to keep an eye on, depending on which way the elections go in the United States, as to whether or not these initiatives remain high priority.
[53:15]
Guest Speaker
All right, so next up, there's a little bit of drama going on with Microsoft and Salesforce. They are kind of at each other's throats a little bit online about AI agents. So, you know, Salesforce has gone all in on AI agents with big announcements at Dreamforce just about a month ago and their Agentforce platform coming out. Microsoft has announced 10 new AI agents designed to automate various business operations like sales, finance and customer service. We talked about that in last week's episode. But the rhetoric around this is getting a little hot, because Salesforce CEO Marc Benioff posted the following on X, among a few posts coming at Microsoft. He said, quote, "Microsoft rebranding Copilot as agents? That's panic mode. Let's be real. Copilot's a flop because Microsoft lacks the data, metadata and enterprise security models to create real corporate intelligence. That is why Copilot is inaccurate, spills corporate data and forces customers to build their own LLMs." He then says, "Clippy 2.0, anyone?", referencing Clippy, the long, long ago mascot of Microsoft Word, which was like an annoying little animated paperclip. Just kind of brutal. And he goes on to say, "Meanwhile, Agentforce is transforming businesses now. It doesn't just handle tasks, it autonomously drives sales, service, marketing, analytics and commerce with data, LLMs, workflows and security all integrated into a single Customer 360 platform. This is what AI was meant to be." Now Microsoft has kind of defended Copilot, saying, look, a bunch of Fortune 500 or 100 companies are using it. Major clients include McDonald's, Amgen and others. Paul, this rhetoric from Benioff is kind of basically just trolling them for Copilot's, like, perceived lackluster performance. He's obviously talking his own book, but, like, what's important to pay attention to here?
[55:13]
Paul Roetzer
I don't know. It's all just funny. It's like, you know, OpenAI versus Microsoft, and now, you know, Benioff wants in, so he's gonna take it to them. And you could probably say the same stuff about OpenAI, but he doesn't care. Maybe they're not on his radar from a competitive standpoint at this point. So I don't know. It's just funny. I mean, if anybody is using Copilot, by the way, like, in their company, has hundreds or thousands of licenses and is having success, I would love to hear from you. Just message me on LinkedIn. I'd love to, like, hear a case study of it actually working really well. My qualitative experience from talking to companies that have Copilot: I don't know that I've actually talked to anybody that is blown away by it, that is finding massive value from it. I assume it's in large part because of lack of training and education and onboarding and things like that. I've been privy to a little bit of how Microsoft trains people with Copilot, and I don't know that it's necessarily the approach I would take. But again, I would love to hear from people that are doing it. Like, maybe there's something to what Benioff is saying, I don't know. But it is interesting, and it makes for fun conversation on our podcast when Benioff tweets.
[56:38]
Guest Speaker
Provocative things. I feel like we're maybe a few incendiary conversations away from the term AI agents just, like, meaning nothing.
[56:47]
Paul Roetzer
At this point, I already feel like it means nothing. Like, they're literally just... yeah, like I've said before, I feel like everybody just took automations and slapped the name agent on it, and now all of a sudden everything's an agent and really nothing's an agent anymore. Like, yeah, I don't even know how to differentiate them now.
[57:09]
Guest Speaker
All right, so next up, we got some news that Disney is preparing to announce a major AI initiative that could transform how they produce content. This is coming from a publication called TheWrap, which says in an exclusive that sources told them this initiative is going to involve hundreds of employees. It will primarily focus on post-production and visual effects work. There will be additional applications, apparently, in Disney's Parks and Experiences division, though they say not in customer-facing roles. Now, if you recall, a few podcast episodes ago, we talked about Lionsgate, the movie studio, partnering publicly with Runway, which, we speculated, was going to kind of open up the floodgates to these kinds of deals between entertainment companies and AI companies. Disney is of course using AI already today in a bunch of their shows, but this would apparently represent a more comprehensive embrace of the technology. So there's not really any details yet here, except that something big on Disney's end is coming from an AI perspective. But I did find them mentioning Lionsgate, and just having that top of mind, pretty interesting, especially as we just talked about Runway Act-One as well. Sounds a lot like how I might make a DreamWorks or a Pixar movie moving forward. Paul, like, what's going on here? Are the floodgates opening? Are we expecting deals between entertainment companies and AI companies, big AI-driven initiatives, now that it's, like, more okay to talk about this stuff, or what?
[58:43]
Paul Roetzer
Yeah, I mean, the tech is just improving, you know, throughout this year. Again, like, Runway is awesome, but you're not creating movies with it. Like, it creates 10-second clips that 80% of the time don't have consistency from frame to frame. It's imperfect technology still. It keeps getting better, and, you know, we're going to see this sort of exponential rate of improvement in the tech over the next 12 to 18 months. And maybe it gets to the point where you're creating one-minute to five-minute videos and they're consistent, but we're not there. You can imagine something like Disney, who can, you know, actually work to train a model on its characters legally. And that's pretty impressive, like, what they'd be able to do. And I think I saw somewhere that the assumption is that they're doing a deal with OpenAI. But I couldn't find that tweet or article as I was looking for it here. I think part of the rumor was that Runway was trying to get out ahead of a bigger announcement with their announcement. So, yeah, I think we're going to see a lot of these kinds of deals where the different movie studios work with the leading companies. But Google's gonna, you know, have a say in this. Meta's gonna play in this game. This can be a very competitive space, I would imagine. I still keep waiting for Runway to get acquired. It just seems like they're such a natural acquisition target. Maybe it's hard to make purchases like that right now, but I don't know, I just feel like somebody's going to buy them, and maybe it's an actual studio or something, but I lean more toward one of the big frontier model companies that would roll up that technology.
[60:18]
Guest Speaker
All right, so next up, we've heard that Apple is kind of taking a cautious approach to AI-powered photo editing. So in a recent Wall Street Journal interview, Apple software chief Craig Federighi shared that even basic object removal features in the upcoming iOS 18.1 release sparked really significant internal debate at the company. The new Clean Up feature in Apple's Photos app will allow users to remove objects and people from images but, according to Federighi, is deliberately more limited in what it can do than competing tools from, like, Google and Samsung, which can actually add AI-generated elements to photos. And Federighi actually emphasized that Apple's priority is to, quote, "help purvey accurate information, not fantasy." They're basically worried that AI is going to have a negative impact on what you can believe you're seeing in a photo, like how much credibility photography has. So to maintain transparency, any image edited with this Clean Up feature will be tagged as, quote, "Modified with Clean Up" in the Photos app and include embedded metadata marking it as altered. So Paul, this is certainly an interesting debate. Seems like companies are maybe coming down on different sides of this. I guess I'd just have to ask, what's your take as, like, a longtime Apple watcher? Like, at some point, isn't everything just going to be kind of altered with AI too? Like, what's the balance here between flagging it and just kind of accepting this is the way?
[61:53]
Paul Roetzer
I don't know. I feel like a lot of Instagram and, you know, images online in society are probably already heavily in this altered world, where it's hard to know if you're seeing the original image or a photoshopped image. This is just going to make it so much easier for anyone with no technical skills to alter whatever they want.
[62:14]
Guest Speaker
Right.
[62:14]
Paul Roetzer
And so, I mean, I understand and respect Apple's position. It's very on brand for them to try and sort of maintain this reality. I don't know if it's a losing battle, but at least they're doing the, you know, tagging it "Modified with Clean Up," things like that. So, yeah, it'll be interesting. I did finally buy the 16 yesterday, because I think Apple Intelligence is supposed to come out on the 28th, or like this week. So I can definitely tell you, like, there's nothing in it yet. All these ads are about Apple Intelligence; I have the phone, and it is not intelligent. It's the same phone as my iPhone 14, probably a little faster. So I'm hoping that when Apple Intelligence actually starts showing up, it's worth it. But I think it's going to be a lot of little features like this, really.
[63:04]
Guest Speaker
Yeah. So another kind of related topic: Google has just announced they are open sourcing their SynthID text watermarking tool. So we've covered this previously. SynthID is a watermarking system they created to help identify AI-generated content across images, audio, text and video. Basically, it works by embedding imperceptible digital watermarks directly into AI-generated content. For instance, for text, SynthID uses a novel approach that subtly adjusts the probability scores of word choices throughout the generated content. So this basically creates a unique pattern that can be detected later. With audio, the system converts sound waves into spectrograms, then embeds watermarks that are inaudible to humans. For images and video, the watermarks are embedded directly into the pixels, again in a way that you can't actually see and that maintains the visual quality. Apparently SynthID watermarks can also survive common modifications like compression, cropping, filters and format changes. And for text specifically, the system needs as little as three sentences to work effectively, and its accuracy improves with longer content. So by open sourcing the code for the text version of this, Google is basically just making it more accessible for developers. They can go grab the code, use it in their own products, use it in their own models. Google said in a post on X, quote, "By open sourcing the code, more people will be able to use the tool to watermark and determine whether text outputs have come from their own LLMs, making it easier to build AI responsibly." So Paul, this seems notable maybe on two fronts. Like, first, it's one way to start moving us towards identifying, like, misinformation created by AI, you know, malicious uses of text generation, like deepfaking things. But second, does it mean we now have a way to actually flag AI-generated text? Like, that's been a big debate around schools: can you even detect this stuff? What's going on here?
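[Editor's sketch] To make the "adjusts the probability scores of word choices" idea concrete, here is a minimal toy in Python. It is not SynthID's actual algorithm or API. It swaps in a simpler keyed "green list" bias purely for illustration, and the vocabulary, the key, and the names `is_green`, `generate` and `green_fraction` are all hypothetical.

```python
import hashlib
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = [f"w{i}" for i in range(50)]


def is_green(prev_word, word, key="demo-key"):
    """Keyed hash that marks roughly half the vocabulary 'green' for each context."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0


def generate(n_words, bias, seed=0):
    """Sample words at random, giving 'green' words extra weight when bias > 0."""
    rng = random.Random(seed)
    words = ["w0"]
    for _ in range(n_words):
        prev = words[-1]
        weights = [1.0 + bias if is_green(prev, w) else 1.0 for w in VOCAB]
        words.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return words


def green_fraction(words, key="demo-key"):
    """Detector: share of adjacent word pairs that land on the green list."""
    pairs = list(zip(words, words[1:]))
    return sum(is_green(p, w, key) for p, w in pairs) / len(pairs)


plain = green_fraction(generate(400, bias=0.0))
marked = green_fraction(generate(400, bias=9.0))
print(f"unwatermarked: {plain:.2f}, watermarked: {marked:.2f}")
```

In this toy, unbiased text should land near half green while biased text lands well above it, and that statistical gap is what a detector holding the key looks for. It also suggests why longer text is easier to classify, and why rewording the text with another model could wash the pattern out.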
[65:15]
Paul Roetzer
It seems like they've probably made advancements in it. I mean, we've always said it's possible that they could solve this. The alternative side is that you can always create an AI that, you know, makes it so that the AI that's supposed to be detecting it doesn't work. So it recognizes the patterns, it knows how SynthID works. And so you take your text that was created with the language model, you put it into the other AI, and it remixes things so that it removes those probabilities that were used to identify it. So I don't know, I mean, it seems like it's always just going to be an arms race over whether or not something was or was not AI-generated. I like the idea of being able to know. Like, I think that's valuable. But I also like the idea of teaching responsible use of these tools in schools and in the professional world.
[66:03]
Guest Speaker
Yep.
[66:04]
Paul Roetzer
So we don't necessarily need it, but I think it's inevitable that it's just always going to be a pursuit, and that it's going to keep improving, and then the tools that make it unusable are going to improve, and it'll just keep going back and forth. It's kind of like search. You're always gonna try and, you know, play the algorithm. You're always trying to figure out what Google's doing and then do something to get out ahead of their algorithm, and then they'll fix it, and then you'll do something else. And it's just how the game works.
[66:33]
Guest Speaker
All right, next up, a former OpenAI researcher has come forward with some allegations about the company's use of copyrighted material in a new interview with the New York Times. This is kind of one of the first instances of an AI insider speaking out against OpenAI's data practices. This comes from Suchir Balaji, who spent nearly four years at OpenAI, partly helping gather and organize the vast amounts of data used to train ChatGPT. And he claims that after looking at this pretty closely, he thinks the company's practices violate copyright law. So he's 25 years old, he left OpenAI in August, and he's arguing that systems like ChatGPT are actually undermining the commercial viability of the creators and businesses whose data was used to train them. He's written a long piece challenging the commonly cited fair use defense that's often used by AI companies around copyright. He's arguing that while AI outputs aren't exact copies of the training data, which would be a huge problem, they're also not fundamentally novel either. So he expands on this in a post on X about the article. He said, quote, "When I tried to understand the issue better, I eventually came to the conclusion that fair use seems like a pretty implausible defense for a lot of generative AI products, for the basic reason that they can create substitutes that compete with the data they're trained on." OpenAI has rejected this line of reasoning. They state that they build their AI models using publicly available data in a manner protected by fair use principles and supported by legal precedents. Balaji, who is now dedicating himself to personal projects, basically advocates for regulatory intervention. He argues it's the only way to actually begin addressing this. He also did say this isn't just about OpenAI. 
He said, quote, "That being said, I don't want this to read as a critique of ChatGPT or OpenAI per se, because fair use in generative AI is a much broader issue than any one product or company." So Paul, as part of this commentary, in his interview with the New York Times, Balaji, like, comes out and says, "I'm not a lawyer. I've just looked at this issue." We are not lawyers either. But he does appear to be making some good points about fair use. But I guess my question is, are we moving fast enough to figure this out? Like, there's a ton of lawsuits around it, but nobody's really got the book thrown at them. The model companies haven't stopped doing this, and you could even argue, as we've covered multiple times, it might even be accelerating as companies are, like, raiding YouTube for video data. Like, what is the way forward here?
[69:21]
Paul Roetzer
I don't know, but I think in addition to being a consultant, he's probably going to be a nicely paid witness in some infringement cases in the next 10 years. Yeah, I mean, that's what they're going to need. They're going to want insiders who can come in and say, like, this is what they were doing. And I don't know, I still feel like we're years away from any, you know, sort of clarity on this. Eventually it goes to the Supreme Court. Who knows what happens then. And we're on GPT-8 by the time this is even resolved. Yeah, there'll be plenty of cases and there'll probably be some settlements. But a lot of it's just going to get done away with through licensing deals where everybody sort of comes out ahead, apparently. And I don't know, I mean, we talked about this at MAICON, you know, we had a panel on this, and the general sense of the actual experts in this topic is that this is going to take a long time to work its way through. We're not going to get some updates next year that, you know, resolve all this. But those experts also felt like the US Copyright Office is probably going to make some adjustments to how copyright is determined moving forward in terms of AI-generated content, whether or not you can apply a copyright to it. And then the separate issue is the training data itself. Right. So I don't know, it's always a fascinating topic to touch on, but nothing's really changed yet.
[70:44]
Guest Speaker
All right, so our last topic this week. This has been kind of a heavy episode across some of these topics, so we're going to leave this with a positive, practical use case for AI that we stumbled on. This comes from Bilawal Sidhu, who is the host of The TED AI Show, a prominent AI commentator we pay attention to, and he writes on X, quote, "Prepping for a 25-minute keynote, I exported a PDF of my slides to Gemini 1.5 Pro and dropped in an audio recording of my dry run. Gemini gave me precise feedback down to the slide and timestamp. After incorporating the suggestions, I recorded another round and it was a total game changer. It's basically offline advanced voice mode. Went from meh to mic drop in two practice sessions. It's like having an AI speaking coach with perfect attention and infinite patience. This should be a product in itself." And in case you're wondering, he also shared the prompt that he used with Gemini 1.5 Pro. He said, "You are a world-leading speaking and presentation coach that helps the highest performing keynote speakers. Here's my voice recording of a first cut going through delivering this presentation while following the speaker notes. Please assess my delivery, timing, and any advice you have for me to improve my next practice session. Ensure any concrete recommendations you have for me reference my speech and presentation accordingly." So Paul, like, I know I for one am gonna steal this for talk prep.
[72:21]
Paul Roetzer
I would say, yeah, I love practical use cases like this. This process, you know, I don't know, timestamp this, 12 months from now is going to be unnecessary. You'll just turn your iPhone on. Project Astra will be baked into Gemini. And yeah, you just say, I'm going to, you know, do a trial run of my talk, watch it and critique me as I'm going. And it'll do all these things that he had it doing by uploading a PDF and audio. Like, it'll just happen through vision. But yeah, I love the use case. I love the practicality. I think this is the key. Anytime we talk to companies about this, I always say, when you're piloting, find personal use cases that change things for people. So if, you know, you have someone who does public speaking, or has to make presentations internally, or create proposals, or whatever they do, help them build a custom GPT or a Gemini Gem to do the thing. And now it becomes so easy, and you just stack these things, and all of a sudden you've got three, four, five things that have just changed your workflow. And, you know, that's why I always say there's no way you don't get the value for the 20 or 30 bucks a month if you find a few of these things that you can use every month. So yeah, it's great, and a good call to action for people: go do that. Have it be your homework assignment this week. Go build a custom GPT or a Google Gem or an Anthropic Claude Project, and just build something that, you know, enriches your workflow, or saves you time and money, or improves your creativity or your performance. It's extremely doable. You just got to pick something that you're already doing and use AI to help you do it better.
[73:58]
Guest Speaker
Yeah, I think this is really interesting too, because people sometimes, in my experience, think too narrowly. They say, oh, can AI go do this thing for me? Can it build the presentation? Can it create the script? And, like, yeah, in certain cases it can, and you should explore how to have tools automate those pieces of your workflow. But also, it's intelligence on demand, it's advice on demand, it's coaching on demand. Like, think if you could have a pretty solid expert at anything, like a human, over your shoulder while you work. You would be better at doing the thing, probably, by taking their advice. That's exactly what he's doing here.
[74:34]
Paul Roetzer
Yep, yep. That's a great way to look at it. As an advisor, a strategic partner, a thought partner is like my favorite thing to do with ChatGPT for sure.
[74:44]
Guest Speaker
All right, Paul, that's an action-packed week here, tons of updates. Thank you for demystifying everything for us. We really appreciate it. Just a couple real quick housekeeping notes at the end here. Go check out our newsletter at marketingaiinstitute.com/newsletter. It rounds up all the news we talked about today, plus a bunch of stories we did not have time for. Also, if you can, leave us a review of the podcast. It helps us improve and it helps us reach more people. Paul, thanks again. I mean, I feel like the week is just getting started here and we've covered a hundred different topics.
[75:21]
Paul Roetzer
Well, the crazy thing is by Friday I started the episode 122 sandbox because I was like, I'm not putting anything else in 121. There's no way we're getting to it. So I think I already had like five news items in next week's podcast Sandbox, so. All right everyone, well have a great week. Happy Halloween if you're, you know, celebrating that this week and we will talk to you in November. As crazy as that sounds, next time. Thanks Mike.
[75:47]
Guest Speaker
Thanks Paul.
[75:49]
Mike Kaput
Thanks for listening to The AI Show. Visit MarketingAIInstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community. Until next time, stay curious and explore AI.