
Paul Raitzer
If you hear about AI agents and you think, oh my gosh, they're taking my job next year, that is not happening. If you realize all the things that have to go into making an agent work, goal setting, planning, building it, monitoring it, improving it, that is almost always the human's job right now.
Paul Raitzer
Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Raitzer. I'm the founder and CEO of Marketing AI Institute, and I'm your host. Each week I'm joined by my co-host and Marketing AI Institute Chief Content Officer, Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career. Join us as we accelerate AI literacy for all.
Paul Raitzer
Welcome to episode 124 of the Artificial Intelligence Show. I'm your host Paul Raitzer, along with my co-host Mike Kaput, who is our Chief Content Officer at Marketing AI Institute and co-author of our book, Marketing Artificial Intelligence. We've got a lot to cover this week. It's going to be a continuation in some ways of last week's conversation about, like, have these scaling laws just stopped working? Have we hit a wall? There was more stuff there. We've decided to do a deep dive into what is an AI agent, and I'll explain why in a minute. But this is a really important conversation and, I hope, very valuable to people. And then that unbelievable five-hour podcast from Lex Fridman, and shocking that I actually listened to the whole thing, with Dario Amodei and Amanda from Anthropic. Wow, what a marathon that was. Okay, so we've got a lot to talk about and a bunch of rapid fire items, and then some stuff we had to cut at the last minute because we're going to struggle to keep this one under an hour, 15 minutes. We're going to do our best.
All right, so this week's episode is brought to us by the AI for Agencies Summit. This is our virtual event that's taking place Wednesday, November 20th. So if you are listening to this the day it comes out, on November 19th, you still have time to get in for the live event. If you are listening to this after November 20th, you can get AI for Agencies on demand. So you haven't missed out if you listen to this late. But we are recording this on November 18th. I'm in the midst of finalizing my opening keynote, which we are going to talk about because it is related to the AI agent topic. But AI for Agencies Summit, again, is coming up on Wednesday, November 20th. It is a half-day virtual event from noon to 5:00 p.m. Eastern Time.
We're going to hear, I think there's about six case studies, from agency leaders talking about how they're using AI, how they're infusing it into their own transformation and building it into their client programs. I've got an opening keynote on AI agents and the future of agencies. We've got an incredible closing keynote. We've got a panel that provides a brand-side perspective on what's going on and how brands are thinking about working with agencies. So there's a ton of content packed into five hours of a virtual event. You can check all that out at ai4agencies.com. That's aiforagencies.com. You can use promo code AI forward 200. That'll get you $200 off your ticket. And again, it's ai4agencies.com. Be sure to use that promo code. And as I mentioned, there will be on-demand options. So if you can't make it live, different time zone or you're busy that day, don't worry about it. You can catch up on demand.
All right. And then a quick programming note. We are not going to have an episode next week. Tuesday, November 26th would normally be the drop day, and we will not have an episode. The next weekly will be Tuesday, December 3rd. I'm actually going to be on vacation at the end of this week and we can't record while I'm on vacation, so Mike and I are going to take a week off. Hopefully nothing too crazy happens. I realize now that that will be in the midst of the two-year anniversary of ChatGPT, so maybe some things will be happening. But yeah, we'll catch up with you on December 3rd, and we'll remind you again at the end here. Okay. Mike, has AI training hit a wall?
Mike Kaput
That is the question of the day, of the week, maybe of the month. This is a topic where we are basically just continuing a conversation that we started last week, and we've got some more news and some more sources on it, because more and more people in the AI community, at least some of them, appear to be asking: are we hitting a wall when it comes to scaling AI and improving AI models? Right now we're getting more and more reports that major AI model companies are hitting roadblocks in their race to build their next generation of models. According to recent reporting from Bloomberg and The Information, OpenAI, Google and Anthropic are all experiencing diminishing returns in their efforts to develop more advanced AI models, despite making massive investments in computing power and data. Last week we talked a little bit about how OpenAI's latest model, which is codenamed Orion, hasn't met the company's performance expectations. It has particularly struggled with coding tasks. Similarly, we're hearing that Google's upcoming Gemini update is falling short of internal goals, and Anthropic has actually delayed the release of its anticipated Claude 3.5 Opus model.
Now, the root of this problem appears to be threefold, according to all these sources. First, companies might be running out of high-quality training data, so the internet's freely available content, which powered the first wave of AI models, may no longer be enough to create significantly smarter systems. Second, even modest improvements now require enormous computing resources, which makes it harder to justify the costs. And third, this kind of long-held belief in Silicon Valley that simply scaling up models with more data and more compute would lead to better performance, which is known as scaling laws, is being challenged.
So this has all kind of come together in a narrative right now where prominent AI voices are claiming we are hitting a wall when it comes to AI development. And more importantly, we're not on as fast a path to artificial general intelligence, or AGI, as some AI leaders have previously led us to believe. For instance, Margaret Mitchell, chief ethics scientist at Hugging Face, put it to Bloomberg this way, quote, "The AGI bubble is bursting a little bit." So Paul, maybe start us off from the top here and walk us through what's going on. We've talked about this topic a bunch of times, not just last episode, but throughout the year. Why are these conversations about hitting a wall getting so loud and prominent right now?
Paul Raitzer
Yeah, so there's a lot to unpack here. And the one about Anthropic and Claude Opus, and why we haven't seen that one, we're actually going to talk about as the third main topic today, because Dario Amodei of Anthropic did that massive interview with Lex Fridman. So we'll get into Dario's thoughts on this. But in essence, media reports and some AI antagonists are claiming the scaling laws are slowing down or plateauing, while many voices inside the labs say there's no end in sight. So just this week, I guess last week, we got a tweet from Sam Altman, November 14th, that said, "There is no wall." We'll put the links to these in so you can go check them out for yourself. We had Oriol Vinyals, VP of Research and Deep Learning Lead at Google DeepMind. He replied "What wall?" in response to a new benchmark, which we'll talk about in a rapid fire item, that showed Google has a forthcoming model that is now number one on the benchmark leaderboard. And then Miles Brundage, who we talked about on episode 121, the former senior advisor for AGI readiness at OpenAI. So someone who certainly is aware of what OpenAI is doing, but also no longer has to toe the company line because he is independent now, and was very vocal on his way out, as we talked about in episode 121. So he doesn't really have a stake in continuing to push OpenAI messaging if it's not true. He tweeted, "Betting against AI scaling continuing to yield big gains is a bad idea. Would recommend that anyone staking their career, reputation, money, etc. on such a bet reconsider it."
So that being said, it does appear, based on reports, that there have been delays in some of the frontier models that we expected to see in 2024. You could think of a Gemini 2, a Claude Opus 4, a GPT-5 or Orion, a Llama 4. We kind of assumed we might see all those models. So a few thoughts here. First:
The year isn't over yet, so there's certainly still the possibility we're going to get smarter, bigger, more generally capable models. Second, the labs don't share their model release plans, so while we may have been anticipating these models by year end, they may not have been planning to release them then. And then the third, and maybe the most important, aspect of this is that these models are complex. They are not traditional software, where you just brute force a bunch of code and you release a model that does what you want it to do, and then you fix some flaws after you release it. These things don't work like that. They don't do what you want them to do all the time. And oftentimes it's not until you train the model that you find the flaws or deficiencies, or that maybe it doesn't do what you wanted it to do as well, and you have to go in and retrain it or fine-tune it after the fact. And so as these models get bigger, they get more complicated to train, to post-train, to red team. Red teaming, again, is the idea of testing and evaluating a model's vulnerabilities, limitations, risks. So maybe you train this massive thing and then you realize this is too dangerous. There are too many risks associated with this thing. It has too many emergent capabilities. We can't release this thing. We've got to go back and fine-tune this and do more post-training to make it safe enough to put out. This is, I think, kind of what we saw with the advanced voice mode from OpenAI. You have the thing ready, you've done the training, you've done all the testing, but then you realize it's got some capabilities that you cannot release, and so you have to make it safer. So as they get bigger, it's going to be harder to project what's going to happen. And as we'll hear from Dario, I'll kind of walk through what goes into training and preparing these models, and people will realize this isn't a simple process.
You don't do a training run and 30 days later just release the thing. That is not how these work. So my current bet would be that the labs will continue to push the scaling laws. They will continue to add more compute and more data, likely with new approaches to maximize performance and capabilities. So the labs are going to keep buying Nvidia chips to do the training. We're going to keep hearing about massive data centers being built. We're going to continue to hear about massive investment in energy infrastructure. That's going to be a major priority of the incoming administration in the United States. It's going to be a major priority of people like Sam Altman to push this. The labs and the governments will spend tens of billions of dollars next year on training and building these models. Within two to three years, they will be spending hundreds of billions of dollars to build bigger, more generally capable models. So whether the scaling laws as we have known them remain exactly true or not, I don't think it really matters. And all these headlines about the scaling laws plateauing, with the general antagonists of the AI models and the scaling laws kind of taking a victory lap, I think those victory laps will be seen as premature in the end.
So when we talked about this last week on episode 123, there were a few things I highlighted. One was that a lot of these leaders of the frontier labs, like Sam Altman and Demis Hassabis, have been very public that they see there need to be maybe two to three breakthroughs to unlock the true intelligence, powerful AI, AGI, whatever you want to call it. Sort of the model that takes a massive leap forward from what we have today, like a GPT-4o. And so that has been known. Now, the things I highlighted in that episode were reasoning, you know, the o1 model from OpenAI.
We're under the assumption we're going to get the full o1 model soon. Multimodal training, where they're not just trained on text, but on images and video and audio. The idea that there'll be a symphony of models working together, that the large models will be kind of like a conductor working with a bunch of smaller models. The concept of self-play or recursive self-improvement, where the models are able to identify their own flaws and kind of fix them as they're going. And then memory. There was actually an interview, I don't think we have it on the list to talk about today, but Mustafa Suleyman did an interview last week and he was talking about memory. Maybe we touched on this last week. I don't remember.
Mike Kaput
I think we did. Ironically, I don't remember.
Paul Raitzer
But memory is a huge one, and he thinks it'll be solved by next year. Now, the one thing we didn't talk about, Mike, is AI agents. And so that's going to kind of lead us into our second main topic. To get started here: we've talked a little bit about how this podcast works. The planning works sometimes, but it's a very dynamic process, sometimes up until literally the minute Mike and I get on to record. And this would be an example of that. I'm preparing to do my AI Agents and the Future of Agencies keynote on Wednesday and the deck isn't done yet. And so as of Sunday night, I was still going through all of my research around this idea of AI agents and what they are. We weren't sure how much of this we were going to weave into today's conversation. But as I kind of came to some personal peace of mind on the topic very late Sunday night, I decided that this probably needed to be a main topic.
So the issue here, I guess, is that earlier this year a lot of the AI companies, like Google and Microsoft and OpenAI and Salesforce and others, started talking a lot about AI agents. And it started creating a lot of confusion for me, as someone who obviously follows the space very closely, because I wasn't really clear what exactly they were talking about, like what they were considering agents to be. Historically for me, like when I did the episode 87 AI timeline, we talked about the explosion of AI agents starting next year, right? And I had a very clear picture in my mind of what I believed AI agents to be, based on how they have historically been talked about. The simple definition I have historically used is: it's a system that takes actions to achieve goals. By contrast, LLMs like those that power ChatGPT, Claude and Gemini answer questions and create outputs by predicting tokens or words.
They don't take an action. They just output something to answer a question or write an email or do an article or whatever, but it's just predictions of words and tokens. And we have other generative AI systems that create images and videos and audio, but again, they're just outputting something. Those systems don't take actions, they don't complete a workflow, they don't go through, like, 10 steps to do something. So when we talk about agents in a traditional sense, the concept was you give it a goal, and it plans and executes to achieve it with no human inputs or oversight. It's this idea of autonomy. An example here would be Google DeepMind's AlphaGo. I know a lot of our listeners and viewers have probably watched the AlphaGo documentary. If you haven't, while we're not here next week with you, go watch the AlphaGo documentary. It's a great example where the machine is provided training data to win at the game of Go. It then does all these simulations to learn how to play the game, but then it functions autonomously. It's just told, basically, to win the game. It does all the planning, it figures out how to do it, it analyzes its own moves, it thinks 10, 20, 100 steps ahead of what the human may do. And so that was kind of the traditional idea of an agent.
Now, the confusion comes in today because a lot of leading AI companies have been talking about their AI agents as autonomous, and that is largely not the case and can be extremely misleading. And so autonomy actually becomes sort of the sticking point here. The way I talk about this, and I mentioned this last week, is that in full self-driving, in autonomous cars, which Tesla and Waymo and others have been pursuing for well over a decade, the idea of full autonomy is that you don't need a steering wheel or pedals in the car. The human just gets in the car and says, "I would like to go to the office," and the car figures out everything else.
The human has no involvement in anything other than the goal setting, and then the machine executes the goal. The example I gave when we talked about this earlier is the idea of sending an email in HubSpot. If I want to send an email in HubSpot, it's a minimum of 21 clicks for me, as a human, to send an email. The idea of an AI agent that's autonomous would just be me, the human, saying, "Hey, go send this email. Here's what I want you to do," providing some parameters and a goal. It then goes and does the 21 steps with no human oversight. So the problem is that we've been seeing brands talking about their agents as autonomous when they are not even close to autonomous. And this is why I created the human-to-machine scale years ago. It's this idea that the technology and the tasks have levels of autonomy. There's zero, which is it's all us, we're telling it what to do. And then there's full autonomy at the other end, where the machine does everything. The human provides no real inputs or oversight. It's not dependent upon the human for anything.
And so just to give you a sense, take the Salesforce Agentforce page. Agentforce is all the rage. That's what Salesforce is pushing everything to. They define an Agentforce agent as "a proactive, autonomous application that provides specialized, always-on support to employees and customers. They're equipped with necessary business knowledge to execute tasks according to their specific role." Now, they're calling it an autonomous application. And yet on that same page it says the user defines the role, connects the trusted data sources, defines the actions, sets the guardrails, and determines the channels where they connect. That sounds like a lot of human involvement and oversight to me for something that's supposed to be autonomous. So you can understand where the confusion comes in. Then we go over to Microsoft.
October 21st of this year, less than a month ago, the headline of their own blog post: "New autonomous agents scale your team like never before." If I'm a marketer or a business person and I see that headline, I'm going to assume that we are at the age of autonomous agents. Right? That's pretty direct. In that post it says, "We're announcing new agentic capabilities that will accelerate these gains and bring AI-first business process to every organization. First, the ability to create autonomous agents with Copilot Studio will be in public preview next month." Great. I'm the CEO of a company. Autonomous agents are here in November of 2024. I don't need agencies. Maybe I don't even need employees. Autonomy has arrived. "Second, we're introducing 10 new autonomous agents in Dynamics 365 to build capacity for every sales, service, finance and supply chain team." They then go on to provide some context, which would actually be quite helpful if they hadn't already made all these promises in the headline. So they say, "Copilot is your AI assistant. It works for you, and Copilot Studio enables you," now again, remember, these are autonomous, "to easily create, manage and connect agents to Copilot. Think of agents as the new apps for an AI-powered world. Every organization will have a constellation of agents," now this is the real key that maybe should have been closer to the headline, "ranging from simple prompt-and-response to fully autonomous. They will work on behalf of an individual, team or function to execute and orchestrate business processes."
Then they have another blog post the same day, "Unlocking autonomous agent capabilities with Microsoft Copilot." And in that blog post, agents are "expert systems that operate autonomously on behalf of a process or a company." They also have another one, "Unveiling Copilot agents built with Microsoft Copilot to supercharge your business."
Now in this one they talk about how agents come in all shapes and sizes, and they actually don't get into the autonomy thing. So that's Microsoft. They're maybe the most guilty party here in terms of claiming autonomy. Google has actually done a pretty good job of not claiming autonomy per se. So Sundar Pichai, May 2024, this is right around the Google I/O conference. He defines them as "intelligent systems that show reasoning, planning and memory. They're able to 'think' multiple steps ahead and work across software and systems, all to get something done on your behalf and, most importantly, under your supervision." They're actually very directly not saying it's purely autonomous. Then Thomas Kurian, the CEO of Google Cloud, September 2024, so this is two months ago: "AI agents are intelligent systems that go beyond simple chat and predictions to proactively take actions." That's not bad. Again, Google's done a pretty nice job here of not overpromising autonomy.
Nvidia, in October 2024, in a blog post titled "What Is Agentic AI?", say, "AI chatbots use generative AI to provide responses based on a single interaction. A person makes a query and the chatbot uses natural language processing to reply. The next frontier of AI is agentic AI, which uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems." So they're kind of alluding to autonomy coming. Then Jensen Huang, the CEO, at a conference last week, the Nvidia AI Summit in Japan. This is what he said about AI agents: The first AI is basically a digital AI worker. These AI workers can understand, they can plan and they can take action. Sometimes the digital AI workers are being asked to execute a marketing campaign, support a customer, come up with a manufacturing supply chain plan, help write software, maybe be a research assistant, a lab assistant in the drug discovery industry. Maybe this agent is a tutor to the CEO.
These digital AI workers, we call them AI agents, are essentially like digital employees. Now, I actually really like the direction Jensen goes here, so I'm going to finish this excerpt because I think it's very representative of the reality. Just like digital employees, you have to train them. You have to create data to welcome them to your company, teach them about your company. You have to train them for particular skills, depending on what function you would like them to have. You evaluate them after you're done training them to make sure that they learned what they're supposed to learn. You guardrail them to make sure they perform the job they're asked to do and not the jobs they're not asked to do. And of course you operate them, you deploy them. That does not sound like autonomy to me. That sounds very clearly like the human is essential in this process. Okay, so then they interact with other agents. They have the ability to potentially interact with other agents, to work as a team to solve problems. Agentic AI is transforming every enterprise, using sophisticated reasoning and iterative planning to solve complex, multi-step problems. So let me go into that. Okay, so what makes agentic AI so powerful? This again is still Jensen talking. It's its ability to turn data into knowledge and knowledge into action. A digital agent in this example can educate individuals with insights from a set of informationally dense research papers. None of these agents can do 100% of anybody's job. However, all of the agents will be able to do 50% of your work. This is the great achievement. Instead of thinking about AI as replacing the work of 50% of people, you should think that AI will do 50% of the work for 100% of the people. By thinking that way, you realize that AI will boost your company's productivity. You know, people have asked me, is AI going to take your job? This again is Jensen still.
And I always say, because it's true (and I'm the one who's gotten ridiculed for saying this, but now Jensen is saying it again): AI will not take your job. AI used by somebody else will take your job. And so be sure to activate using AI as soon as you can. So the first is digital AI agents.
Then one other piece of context from Jensen. In spring of this year, in a CNBC interview following an earnings call, he said the world's enterprise software platforms represent approximately a trillion dollars. These application-oriented, tools-oriented platforms and data-oriented platforms are all going to be revolutionized by these AI agents that sit on top of them. And the way to think about it is very simple. Whereas these platforms used to be tools that experts could learn to use, in the future these tool companies will also offer AI agents that you can hire to help you use these tools, to help reduce the barrier.
Now, someone who may have already been working on this, or listened to this quote as he was building it, is Dharmesh Shah, our friend that I talked about on last week's episode. Because Dharmesh has built Agent.ai, where literally the call-to-action button is "hire" or, like, "add to team" for an agent, and there are over 100 agents you can go look at. So go look at Agent.ai if you want to kind of understand how this is going to work in the near term. Dharmesh did a future of AI agents keynote at INBOUND in September. And to Dharmesh's credit, they did a really good job here of not overpromising autonomy. He described it as software that uses AI and tools to accomplish a goal requiring multiple steps. And he specifically said some agents can have the ability to run autonomously, some have executive planning capabilities, but those are niceties, not necessities, to be an AI agent. So as I take a breath here, the key thing to understand about AI agents: forget all the kind of confusing, different messaging coming from these different brands.
At the end of the day, an AI agent takes actions to achieve goals. Now, there is a spectrum of autonomy. There is going to be no agent in the near future that the human just gives the goal to and it just goes and does everything, and that's it, the human has no involvement beyond that, no inputs, no oversight. So think of autonomy as, again, a spectrum. It is not binary. It's not that something is autonomous or not. It can have a level of autonomy. That's where the human-to-machine scale came in, with the different levels of autonomy. Because if you think about what needs to happen for an AI agent to work, someone has to set the goals. That is the human. Someone has to then do the planning of how this agent is going to function. Then there's the execution. The plan is in place, it knows what to do, then it executes. That's where the autonomy lives today. What they're calling autonomy is the execution step of the agent. Then there's the iterating or improving, like knowing it's doing something wrong and fixing it. Then there's analyzing the performance. If you think about those five steps, goals, planning, executing, improving, analyzing, the autonomy that Microsoft and others are talking about is basically the executing phase.
So with every AI agent there are varying levels of autonomy. There are varying levels of complexity, from doing a simple, like, five-step process to 200 steps with no error rate, which basically doesn't exist today. There are levels of its ability to understand, reason, plan and remember, like memory. They in theory can learn and adapt and improve and make decisions, but not all of them can. They can interact with tools like search (ChatGPT can now go and use search), calculators, Python code. There's the ability to interact with tools and create other content, and to interact with other agents. They have data sources, and they have guardrails and controls.
They can be multimodal or not. They can be interpretable or not, meaning I can look and see why it did what it did, the steps it took. And they can engage with humans through natural language. So every one of those characteristics I just outlined is not uniform across agents. Every one of them can vary from agent to agent. So, Mike, we're basically using this AI agent term to encompass every form of agent that can take an action. But there's like a dozen characteristics that will all vary depending on the kind of agent you're interacting with.
So my main takeaway for people here, to kind of summarize this, is that they are nowhere near autonomous. If you hear about AI agents and you think, oh my gosh, they're taking my job next year, that is not happening. If you realize all the things that have to go into making an agent work, goal setting, planning, building it, monitoring it, improving it, that is almost always the human's job right now. So I would actually be looking at this as the opposite of being threatened by them. I would look at it in my company and say, well, I'm going to go play with Agent.ai today and try to figure out how to build agents. It's a wait list right now, but once I can build agents on Agent.ai, I'm going to start building agents that are really valuable to people. If I have access to Copilot Studio, I can go build agents for my team to do things more efficiently. The ability to build these agents, which mostly won't require coding ability, is a massive superpower. So if you own an agency, if you are a brand marketer, if you are an accountant or lawyer, I don't care what you do, think about the things in your business that require multiple steps, that are repetitive, data-driven processes. You will have agents for all of those things. It may take years, but you can be the one that figures out how to build those things. The closest parallel right now is custom GPTs.
Mike Kaput
Right.
Paul Raitzer
In essence, that's what you're doing. You're kind of building AI agents, in a way, to do a thing. And so if you start to imagine the value of building a bunch of custom GPTs to take all of your processes, all these 10- or 20-step processes, and build something that can do those, yeah, it's going to save a ton of time and drive efficiency, productivity, creativity. But someone's got to envision them, give them goals, plan them, build them, improve them. That is humans for the foreseeable future. So okay, I'm going to stop. Hopefully that all makes sense, because I think people need to think of this as an opportunity, not a threat. That's kind of my main takeaway right now.
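Editor's sketch: the five steps described in this segment (goals, planning, executing, improving, analyzing) can be written down as a small data model to make the "spectrum, not binary" point concrete. This is only an illustration in Python, not anything shown in the episode or taken from any vendor's product. The names `Owner`, `AgentProfile`, `make_profile` and `machine_share` are hypothetical, and which steps are assigned to the human or the machine simply follows the speaker's description, not measured data.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    HUMAN = "human"
    MACHINE = "machine"


# The five steps named in the discussion, in the order they were listed.
STEPS = ("goals", "planning", "executing", "improving", "analyzing")


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical record of who owns each step for one agent."""
    name: str
    owners: dict  # maps every step name in STEPS to an Owner

    def machine_share(self) -> float:
        """Fraction of the five steps the machine handles (0.0 to 1.0)."""
        machine = sum(1 for step in STEPS if self.owners[step] is Owner.MACHINE)
        return machine / len(STEPS)

    def human_steps(self) -> list:
        """Steps that still depend on a person, in lifecycle order."""
        return [step for step in STEPS if self.owners[step] is Owner.HUMAN]


def make_profile(name: str, machine_steps: set) -> AgentProfile:
    """Build a profile where only the listed steps belong to the machine."""
    unknown = set(machine_steps) - set(STEPS)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    owners = {
        step: Owner.MACHINE if step in machine_steps else Owner.HUMAN
        for step in STEPS
    }
    return AgentProfile(name, owners)


# Assumption from the discussion: today's "autonomous" agents automate
# only the executing step.
typical_today = make_profile("typical agent today", {"executing"})

# Assumption from the discussion: in the traditional picture of full
# autonomy, the human still sets the goal and the machine does the rest.
traditional_full = make_profile(
    "traditional full autonomy", set(STEPS) - {"goals"}
)

for profile in (typical_today, traditional_full):
    print(profile.name, profile.machine_share(), profile.human_steps())
```

Under these assumptions, an agent that only executes on its own covers one of the five steps, and even the traditional picture of full autonomy leaves goal setting with the human. Changing the set passed to `make_profile` is a quick way to place a given tool somewhere along the scale.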
Mike Kaput
Hearing you outline that, it really does strike me with more clarity than I think I had in the past. I've been racking my brain about what skills are going to be really valuable moving forward, outside of just the nebulous "get good with AI." Right? And being a manager, creator and/or shepherd of AI agents immediately strikes me, like you said, as something super valuable that will be an obvious skill need in the next one to two years.
Paul Raitzer
Yeah, yeah, I think so. Like, I think you could start to see resumes or start to see job applications where building agents and custom GPTs is a desirable capability across any industry. Now obviously, like, marketing, sales, service may move faster, or product development, like, things like that, they're going to be ahead of the curve looking for people with those capabilities. But if you want to, like, bolster your resume, don't just take a class. Like, the way you used to boost your resume or your career opportunities was go take a class, go get a certification. That still matters. But build agents, build custom GPTs in your personal life, build them to help with your own job. And then when you go into those interviews, say, yeah, actually, I've managed to open up 50 hours per month because I built five agents that do these things I used to do, and it enabled me to go do these things. And I think the people who are proactive within their own companies are going to become more valuable there. But if you're like, I need to move, I need to go get a career opportunity with a company that's more AI forward than where I'm at, go build some agents and prove your ability to bring that to another organization. Like, that's the opportunity. Or if you have an entrepreneurial mindset, think of all the agents you can build. Like, they truly are going to be part of your team. Like, that Jensen interview, or the presentation, the Japan one, I would recommend people watch that. And then I would go look at Agent.ai and see how Dharmesh and the team are kind of positioning these things as team members. And that's it. Like, you basically add agents to your team to do things. And org charts, one to two years out, you're going to see that, where there's AI agents just baked right into the org chart.
Mike Kaput
And as you're saying that, it's top of mind for me because I'm doing a talk later this week at a graduate school. You know, everyone always asks, like, what's your advice for job interviews or skills or career advice in the age of AI? And what you just said, with even custom GPTs and/or agents, is huge. But it's so easy to do that I'd also be considering, for every interview I go into, creating one specifically for the company I'm interviewing with. It's not hard to figure out what a company's broad marketing strategy is, say, if you're in marketing, for instance, from their website. So with a little research you could pretty easily extrapolate what they are likely to be spending a ton of time on and create something valuable to show them.
Paul Raitzer
Yeah, and I like that idea a lot. And you can even, like, one of the agents that Dharmesh and the team built on Agent.ai is like a company profile thing, and it'll go through the profile. I think there's one for earnings calls. So I don't know. I mean, I really think that if people get through the abstract nature and uncertainty of what an agent is and just think of it as something that can basically take actions for you across multiple steps. And you start thinking about all those repetitive, data-driven things you do and start thinking, maybe I can build an agent for that. And again, it may not be today that you can go do it, but it might be first quarter of next year. And so if you can be the one on your team that just starts building agents internally that other people can use, again, it's going to be so valuable, and so many people are going to think it's harder than it is, because it's not going to require coding ability. And it's almost hard. Like, honestly, I've spent basically the last 10 days of my life immersed in what is an AI agent. And I've been thinking about it for years, but very intensely for like the last week and a half. And I'm having trouble, honestly, wrapping my mind around how big the opportunity is to be the one that learns how to build these things, whether it's in your team or if you're an agency or if you're an independent developer. Like, people are going to need help doing this. Like, this is a massive consulting opportunity. It's a huge opportunity internally to create a career path for yourself. Like, it's big. Like, it's real big.
Mike Kaput
What's also big is our third main topic. And by big, I mean literally, because Lex Fridman just dropped an insanely long interview, five hours long, with key leaders at Anthropic, including CEO Dario Amodei; Amanda Askell, who works on fine-tuning and AI alignment at Anthropic; and co-founder Chris Olah, who is working on mechanistic interpretability at the company. And as you can expect, they discussed a lot of different things in the time they had. Amodei talked a lot about the scaling law limitations we just discussed. He talked about the possibility that we may run out of data or hit a ceiling in terms of how AI models can learn about the world. He talked a lot about Anthropic's responsible scaling policy, which is designed to address the risks of AI systems. Askell talked about the importance of creating a good character and personality for Claude, their model, and how this is done through a process called character training. Olah discussed basically how the company aims to reverse engineer neural networks to figure out what's going on inside. That's what that term, mechanistic interpretability, means. And of course this is just a very small sample of what they covered in five hours. But like we've talked about before, these types of interviews are really important to stay on top of for a couple reasons. So one is that the best way to understand what is shaping the future of AI is to listen to the handful of people who are actually doing it, which is actually a relatively small number. So listen to what they tell you in interviews like this. What's also really interesting, number two, is these interviews are actually kind of increasingly fulfilling the role of formal company messaging. We're increasingly seeing AI founders and startup founders generally, quote, unquote, go direct to, say, popular podcasts to get their viewpoints and perspectives out there.
So these interviews may actually be kind of the source of truth you can reference on things like model release dates, product roadmaps, company viewpoints, et cetera. It's actually really funny. Instead of responding to Bloomberg's requests for interviews, in one of the stories we cited in the "are we hitting a wall" segment, Anthropic literally just pointed them to this podcast multiple times. That Bloomberg article that we were talking about literally says Anthropic, in response to our questions, pointed to the five-hour podcast with Lex Fridman. So, Paul, I know you found a lot to pay attention to in this interview. Could you maybe share with us some of the most important highlights?
Paul Raitzer
Yeah, luckily I had a flight to and from San Diego last week, so I had, you know, 12 hours of travel to consume this at 2x speed. So I got through the vast majority of it. I'm just going to call out a couple of things. So I referenced earlier the scaling laws. Dario does not see it as an issue. You know, he thinks synthetic data is going to be a big thing. He thinks the reasoning path that OpenAI and others are taking is going to be a thing. He said, I think most of the frontier companies, I would guess, are operating at roughly $1 billion scale, meaning a billion dollars for a training run, plus or minus a factor of three. Those are the models that exist now or are being trained now. I think next year we're going to a few billion, and then 2026 we may go to above 10 billion for an individual training run for a single model, and probably by 2027 there are ambitions to build $100 billion clusters. And I think that will actually happen. So he certainly is a believer that this is going to continue. The one section I found really interesting, and I'm going to read this excerpt because I think it's really helpful, is the complexity of training these big models that I referenced earlier. So Lex says, what is the reason for the span of time between, say, a Claude Opus 3 and a 3.5? What takes that time, if you can speak on that? Dario says, so there's different processes. There's pre-training, which is just kind of the normal language model training, and that takes a very long time. Again, that's where you take all the content, all the text, everything, and you train these models on that source data. That uses these days tens of thousands, sometimes many tens of thousands, of GPUs, Nvidia chips for training them, or we use different platforms, often training for months. So that initial training process, pre-training, can take months and tens of thousands of Nvidia chips.
Then he says there's then a kind of post-training phase where we do reinforcement learning from human feedback as well as other kinds of reinforcement learning. And again, that's humans telling the model this is a good output, that's a bad output. And they're trying to kind of tune it to do what they, the humans, think is good, basically. And so you hire people and they literally work with these models to fine-tune these outputs using reinforcement learning. He said that phase is getting larger and larger now, and often that's less of an exact science. It often takes effort to get it right. Models are then tested with some of our early partners to see how good they are. And they're then tested both internally and externally for their safety, particularly for catastrophic and autonomy risks. So we do internal testing according to our responsible scaling policy. And then he says, we have an agreement with the US and the UK AI Safety Institute, as well as other third-party testers in specific domains, to test our models for other risks, chemical, biological, radiological and nuclear. We don't think that models pose these risks seriously yet, but every new model we can evaluate to see if we're starting to get close to some of these more dangerous capabilities. So those are the phases, and then it just takes some time to get the model working in terms of inference and launching it in the API. So there's lots of steps of actually making the model work. So again, why don't we get GPT-5 on, like, December 1st, like we thought we might? Well, because at any one of these steps, they could have run into obstacles. Now, is it scaling laws they're running into? It may have nothing to do with the scaling laws. It may just be they're getting bigger and more complex and these different steps just take longer, and they're finding more and more kind of hiccups or weaknesses or threats or whatever it may be within the models. And they're not going to tell us that stuff.
So the media is going to write whatever they write. It may have nothing to do with the reality of what's going on. And then the other one I'll share from Dario was what he said about AGI, which he prefers to call powerful AI, but whatever. If you just eyeball the rate at which these capabilities are increasing, it does make you think that we'll get there by 2026 or 2027. Again, lots of things could derail it. We could run out of data. We might not be able to scale clusters as much as we want. But he doesn't really see any obstacles that aren't able to be overcome. And then I won't dive into Amanda's. I love Amanda's interview because she's the person that's basically building the character of Claude, the personality behind Claude. I would listen to that, even if you just want to jump ahead and listen. It's so intriguing how she thinks about prompting, character development, the system prompt that goes into Claude that kind of guides its behavior. It's really intriguing and very non-technical. It's kind of a very approachable overview. And then the mechanistic interpretability part is a more dense, technical topic. But the reason why this matters is because, we've said this before, but if you're kind of new to these models, we don't know why they do what they do. Like, if it starts misbehaving, or if it has some risk that's identified, or has some emergent capability that wasn't expected when it comes out of training, they can't just go look at the code and say, oh, there's the line that's causing this. That is not how these things work. They function much closer to the human brain, where you just have neurons and they're firing and doing all kinds of things. So if you ask, where are memories stored in the human brain? Or how are memories created, or what are dreams, or why did you have that thought? Why did you say that word?
You can't just go into the human brain and, and pick that thing out or like find the exact neuron that fired or neurons that fired together. That's how these things work. They just have all these parameters and they do all these things basically. Like the human brain has the neurons and so like the interpretability is trying to understand why they do what they do how they do what they do. And so it's a very important, like bigger picture topic that if you like the more scientific, technical side of this, that would be a great listen for you. If that's overwhelming to you, then just don't stick around for the last hour and a half.
Mike Kaput
Yeah, I think what's also notable here is it's proof positive of exactly what you were saying in the first segment, that despite everyone shouting their head off about us hitting a wall, there are many, many people, many of whom are deep within the actual AI labs, that do not appear to believe this.
Paul Raitzer
Yeah. And again, if they were selling something to us that was like a future sci-fi, 10-years-out thing, you could, like, question their motives. We'll know in like three to six months whether they're full of crap or not. And if you're Sam Altman or Dario Amodei and you're staking your entire reputation and career on these being right, I feel like you might hedge a little bit more if we were all going to know in three months you were lying about it all or you were just being misleading. Like, this is near-term stuff. We'll know when the next models come out if we hit scaling law walls or not. And they don't think we did. And so I just, I don't know. Like I said earlier, I just feel like there's probably some elements of truth to it, but I would not overreact to it. I wouldn't bet against these things continuing to get bigger and better.
Mike Kaput
All right, let's dive into this week's rapid fire topics. So first up, OpenAI is set to release an AI agent tool of their own in January. According to Bloomberg, this new tool, which is codenamed Operator, will be able to perform complex tasks on behalf of users, from things like writing code to booking travel arrangements, all by directly controlling a computer. The tool will be released as both a research preview and through OpenAI's developer API. In a recent AMA on Reddit, CEO Sam Altman said, quote, we will have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents. To that end, Operator is apparently just one of multiple agent-related research projects that OpenAI is working on, according to the sources interviewed by Bloomberg. So, Paul, we've known everyone's working on agents, we just talked a bunch about agents. But OpenAI formally getting into the game, and quite soon, seems like potentially a big deal.
Paul Raitzer
Yeah, and this is more like computer use, like we talk about with Anthropic's Claude. I think we talked last week about how Google was working on something like this. This is more in the realm of what traditional AI agents were considered within the labs. Like, I give you a goal, like book my trip to Florida, and you have the ability to use my computer and other tools, and you're able to go and do the thing, and I rely on you. I trust you to have my credit card, I trust you to have the login to the apps you're going to need, and you just go fulfill the goal. You take actions to fulfill a goal. So again, this is kind of why the confusion exists. Like, this is the traditional AI agent that's far more capable, more autonomous. What we're instead getting are AI agents where I determine the 25 steps you're going to take, I tell you what steps to take, and you just go do the thing. It's more like automation. But yes, they're all working on this. We've known since 2017 that they've been working on this. And I think next year we'll probably get a bunch of, like, cool demonstrations. I do not expect that in your consumer life or in your business life you're going to be using these kind of truly more autonomous AI agents that take over your screen and do things. Apple is working on this kind of stuff. So I think next year you'll start to experience it, but this will not, like, be life changing for you in 2025.
Mike Kaput
All right, some other OpenAI news. Co-founder Greg Brockman has returned to the company after a three-month sabbatical. So in an internal memo to staff last week, Brockman announced he was officially starting work again. He also said he'd been working with Sam Altman to create a new role for him, which is focused on tackling major technical challenges. Back in August, Brockman said he was taking his first break since helping start OpenAI nine years ago. Now, Paul, no doubt Greg deserves a break, but there's more to the story than just that, because in past episodes we had talked about some drama, some reports at OpenAI that some people kind of saw Brockman's leadership style as perhaps problematic or counterproductive. Is this all just kind of a quieter way of making sure Greg stays in the fold, unlike all these other executives, while shifting him away from managing teams? Like, what's going on here?
Paul Raitzer
I have no idea. I mean, the whole idea of a more technical role probably implies, like, hey man, you're not going to be the president anymore. And maybe that's what he wants, maybe he just wants to get back in. Like, you know, I think he liked being involved on the technical side, and maybe with all the stuff they've got coming, with their o1 release and whatever Orion is and Sora, and you've got all these technical things, maybe that's where he wants to be, or maybe it's just where they've decided is best. So yeah, I guess we'll just have to wait and see what he really ends up being involved in. But I could certainly see him taking a significant role as we move forward, because they've got a lot going on.
Mike Kaput
So we got some research from earlier this year on generative AI's impact on jobs, and this research is kind of getting some new life and some new buzz. So the research we're talking about is from February of 2024, but it was highlighted just this week in Harvard Business Review because the research is now going to be featured in the peer-reviewed journal Management Science. This research paper is called, quote, Who Is AI Replacing? The Impact of Gen AI on Online Freelancing Platforms. And it's notable because it's a comprehensive study that analyzed over 1.3 million job postings from a major freelancing platform before and after the introduction of ChatGPT. So July 2021 to July 2023 is the period they looked at in their data, and they actually found that the introduction of ChatGPT led to a 21% decline in demand for certain types of freelance work compared to jobs requiring manual skills. Now, this impact was not uniform across all the categories. Writing-related jobs were hit hardest. They experienced a 30% drop in demand. Software and web development saw a 21% decline. Engineering-related posts dropped by about 10%. And after the release of AI image generation tools, demand for graphic design and 3D modeling work fell by approximately 17%. Now, it's not all bad news here. The study did find that the remaining job postings in AI-impacted categories actually saw slight increases in budget and complexity, suggesting that while simple tasks might be automated, there was still demand for more sophisticated work that combines human creativity with AI tools. Now, Paul, obviously this is data from quite a while ago. It is a study that's probably going to be talked about quite a bit more, just given that it's going to appear in Management Science. But we have to keep in mind it was published in February 2024.
However, it does seem to highlight some interesting trends, which is there was a pretty immediate and material impact on what types of work people wanted to hire for once something like ChatGPT came out.
Paul Raitzer
Yeah, I'm always happy to see this kind of research. I don't know how meaningful it is, honestly, mainly because the time period that they pulled the data from ends in July 2023, which is four months after GPT-4 came out, when there was almost no enterprise adoption. So I mean, if anything, it might be an early sign, and it has gotten far worse. Like, I could imagine these numbers are much, much higher for those traditional roles, because honestly, by summer 2023, I don't really know too many enterprises that were using it to replace those roles. So I guess what I'm saying is I would love to see some updated data through summer of 2024 to see if those trends continued. The assumption is they would. My hypothesis would be that they grew significantly in terms of the impact that had on those jobs and postings. But the other thing is, I would be fascinated to see what are the other jobs that emerge, because I would guess that there's tons of postings for, like, AI agent building and.
Mike Kaput
Right.
Paul Raitzer
And AI training and all these other things. And again, the opportunity, or like the other viewpoint here, is if you're someone in these roles that is being impacted or may be impacted, as the trends show, you should be really thinking about the future and studying where the emerging roles are. Because AI agent training, you know, gen AI training, like, for all of those things, your skills are transferable. Like, it's not the end of the world. You've just got to look where the opportunities are going to be and, like, move in that direction. I'm not saying give up on your career path and what you went to school for, but the markets are going to shift and there's going to be new jobs that emerge that people didn't go to college for. And so maybe that's kind of what you're going to be doing. Like, again, think about that post-training example from Dario. The importance of reinforcement learning from human feedback. Where does that come from? It comes from experts in their fields. They need experts in writing, in medicine and biology and math and, you know, business consulting. Like, they need the experts to teach these things how to do what they do, and there's no end in sight for that. In fact, they're going to be paying more money for those experts. So I don't know, like, again, this study is, it's super reliable data. It's old data, that's for sure. But it's directionally worth paying attention to. And I think, you know, maybe an impetus for people to be a little bit more proactive in figuring out where their career moves might come from next.
Mike Kaput
All right, next up, Google's most recent version of their Gemini model is now at the top of a popular AI leaderboard. So this new model is an experimental model called gemini-exp-1114. And it now beats out every other model on the popular Chatbot Arena leaderboard, which we've talked about before. It uses Elo ratings and human rankings to rank over 150 of the most popular AI models. The organization behind the leaderboard made the announcement in a post on X on November 14th. And in that post they said the new Gemini model jumped from rank number three to number one overall, which puts it ahead of everyone, like GPT-4o, o1, etc. It also made leaps in specific categories and went from number three to number one in math, number two to number one in creative writing, two to one in vision, and five to three in coding. Now, you can test this new model out along with other models that aren't in commercial deployment yet if you go to Google AI Studio, which is aistudio.google.com. Now, there were a couple of nuances I saw here, Paul. They have something on the Chatbot Arena called a style control rating. And this is basically an evaluation method they developed to do what they call, quote, de-biasing of the user ratings. And they do that by kind of accounting for things like style elements that might influence how you or I would rate a model's performance. They say, for instance, quote, style indeed has a strong effect on models' performance in the leaderboard. This makes sense from the perspective of human preference. It's not just what you say, but how you say it. But now we have a way of separating the effect of writing style from the content so you can see both effects individually. And so when you look at the rating for style control, which they also provide, Gemini actually hasn't moved at all. It's still sitting at number four behind o1, GPT-4o and Claude Sonnet. So, Paul, this is just one leaderboard.
It's a very important one. Now, like, maybe take us a step back and just walk me through, like, why should we be tracking who's on top, who's not? How often is this changing? What do we have to pay attention to here?
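For anyone who wants to see how an Elo-style rating turns head-to-head human votes into a ranking, here is a minimal Python sketch. It assumes the textbook Elo update with a K-factor of 32 and a 400-point scale. Those constants, the function names, and the model names are illustrative assumptions, not Chatbot Arena's actual implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both new ratings after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k=32 is a conventional default and an assumption here.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical models that start level; one human vote prefers model_x.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [("model_x", "model_y", 1.0)]

for model_a, model_b, score_a in votes:
    ratings[model_a], ratings[model_b] = update_elo(
        ratings[model_a], ratings[model_b], score_a
    )

# Sort highest rating first to get the leaderboard order.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard, ratings)
```

The point of the sketch is that a win against an equally rated model moves the winner up by half of K, while a win against a much weaker model moves it very little, which is why a model can jump several places after a batch of votes.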
Paul Raitzer
I'm not really sure I understand the style control thing. But, you know, whatever. I mean, I get the premise of it, but I don't think I really understand how exactly that would work. Yeah, I mean, I think it's interesting for people like us to kind of keep an eye on it. I think it's increasingly intriguing because it's actually a pretty good indicator of when new models are about to get dropped. So because all the frontier model companies are putting their models in here under different names, in this case it's actually Gemini Experiments. You know, it's a Google model. Sometimes they don't put the name of the model in there, but when you see something jump like this, it's a really good indicator that we may be on the precipice of, like, a major new model coming out. So that's part of why we follow it. It's an indicator that things are coming. And obviously when you see a leap like this, it could be an indication that maybe there's something major, maybe it's actually like a whole nother leap up, like a Gemini 2. And I'm not saying that's what this is, but when you see big jumps, you might get indications of something much bigger coming. Sundar did tweet, more to come. Like, he replied to Logan Kilpatrick's tweet about Gemini. This experimental model is pretty good. And again, these CEOs aren't going to boast if they don't know something's on the frontier. Like, they don't want to get out there and say stuff like that. So yeah, definitely worth watching. I would think in the next couple weeks here you might see something. And then the other note is the Gemini app is now available for download on iPhone. So if you have an iPhone and haven't been able to get the Gemini app, you can now go grab that. I've been playing around with Gemini Live, which is their version of, like, advanced voice mode. Pretty slick. So yeah, it's an easier interface for people.
Mike Kaput
All right, next up, Microsoft Copilot is having a bit of a bad week. Business Insider just dropped an in-depth investigation into how Copilot is falling pretty short of customer expectations. Business Insider says it reviewed internal emails, spoke with customers and competitors, and interviewed 15 current and former Microsoft insiders for the report we're about to talk about. They then report that many customers appear dissatisfied with what Copilot can actually do, especially when compared to what was promised by Microsoft and how much the tool costs. They cite a number of third-party research reports showing that customers are struggling to see the value of the tool, including a Gartner report from October that says only four out of 123 IT leaders they surveyed believe it provides significant value to their companies. Customers also appear to be seriously concerned about Copilot security. The tool relies in part on browsing and indexing internal company information. Many have run into issues with what Copilot can access and what that means for employees. Business Insider writes, quote, as a result, many customers have deployed Copilot only to discover it can enable employees to read an executive's inbox or access sensitive HR documents. And this is a quote from an employee: now, when Joe Blow logs into an account and kicks off Copilot, they can see everything, said one Microsoft employee familiar with customer complaints. All of a sudden, Joe Blow can see the CEO's email. Another Microsoft employee said the tool, quote, works really darn well at sharing information that the customer doesn't want to share or didn't think it had made available to its employees, such as salary info. According to that Gartner survey, again, a full 40% of IT managers said that their company had delayed implementing the tool for at least three months due to these types of concerns. This is affecting how much value companies get out of the tool.
A customer they talked to said his company had to disable the meeting summary tool, which he found really, really valuable, because the legal team was wary of it saving transcripts. And last but not least, the worst criticism of Copilot in this article kind of came from Microsoft itself. One longtime employee told Business Insider, quote, I really feel like I'm living in a group delusion here at Microsoft, in reference to the gap between what the company was promising and what the tool can actually do. So, Paul, this is a rough picture of Microsoft Copilot. I mean, anecdotally, we've definitely heard rumblings from some people we talked to about gripes with Copilot. Like, how bad is this?
Paul Raitzer
Damn. Really bad. Like, I mean, I've been following Marc Benioff, the CEO of Salesforce, and he's, you know, living his best life, retweeting these, you know, negative things about Microsoft Copilot. He is like the chief antagonist at the moment for this stuff. But yeah, so, like, you know, I want to try and be as objective as possible here. I have yet to meet with an enterprise that loves Copilot. And Mike, you do these talks too. We've been in workshops. We've met with big enterprises who have Copilot. I have yet to talk to a single person that is like, it's life changing. It's, you know, it's amazing. I had assumed a lot of the lack of value creation or utility was coming from a lack of education and training and change management. Like, where people weren't being trained how to use it properly. More and more it does seem like it's just not ready for prime time, and maybe they're trying to sell, like, a whole massive thing when they should be focusing on, like, smaller use cases or features within it that are immediately valuable. Because what I'll say to people is, if your company has Copilot, or you're thinking about getting Copilot, or you're in a situation where the company has it but it hasn't been rolled out due to different concerns, if you're a leader of a company and you're sitting around doing nothing because you're hearing Copilot doesn't work, go get ChatGPT, build some custom GPTs for people that help them do their specific thing, that don't need to be connected to any systems or data, and get to work. Like, yeah, don't let these articles make you think that generative AI in an enterprise is not valuable. That is ridiculous. Generative AI, when it's not personalized to individuals for their workflows, or when there isn't a plan to prioritize use cases and roll those out across teams and departments, then yes, it doesn't work.
But there are hundreds of use cases, I promise you, in every company, across every department, where you can get value without all these headaches. Where it doesn't run into the issue of surfacing the CEO's emails or salary information for your. You just need to think this through differently and start coming at it from a different angle. Do not wait until the middle of 2025, when your IT and legal finally allow you to roll out Copilot, to do something about this. You are going to fall behind. So yeah, tough look for Microsoft. Hopefully they get some stuff fixed, or hopefully it's not as bad as it appears, but from a user perspective, don't wait around to get this. Go invest the money and just get some licenses to ChatGPT or something. Do something. Yeah.
Mike Kaput
Well, there is some good news from Microsoft. Maybe it's totally coincidental that they released this, given this.
Paul Raitzer
You and I spent some time in PR. Like, sometimes you've got to balance the negative.
Mike Kaput
Yeah, for sure. So Microsoft just dropped this awesome list of over 200 examples of real-life companies using Microsoft AI to get results. This includes Copilot and Azure. They mention a few companies. Like, BlackRock purchased more than 24,000 Copilot licenses to improve productivity. Finastra uses Copilot to save employees 20 to 50% of their time on content creation, personalization, etc. Honeywell employees are saving 92 minutes per week, which is 74 hours a year, using AI from Microsoft. McKinsey created an agent to reduce lead time during onboarding by 90% and admin work by 30%. And much, much, much more. Go check out the show notes. You'll see a link to the full list. They're really valuable to take a look at. Microsoft claims more than 85% of Fortune 500 companies are using its AI products. And they also mentioned an additional study they commissioned with IDC that says for every $1 that organizations invest in generative AI, they're realizing an average return of $3.70. So, Paul, obviously Microsoft's, like, talking their own book here, but this certainly seems to show that despite criticism of generative AI generally, and Copilot, as we just saw, some companies are getting a ton of value out of these tools.
Paul Raitzer
Yeah. And again, like, use this as education to help inspire adoption, you know, inspire ideas of how to use it. My guess is a lot of these are custom builds from Microsoft. So a lot of times they'll sell the Copilot licenses, then they'll go in and, for 3 million, build some, like, custom solution for something. And I have no doubt that you can see, like, massive value from those. But again, like, Microsoft Copilot issues aside, these are all real things, and I would, yeah, I would go check them out. It's like, you never know when you're going to see something that aligns with your business, where it's like, I hadn't thought about that. That's cool. So it's a quick read. They're like, you know, it's like a sentence or two for each one. You can scan all 200 in five minutes.
Mike Kaput
They all link to, I think, further case studies. Kind of skim very quickly and then pick and choose what you want to read about. All right, next up, Elon Musk's AI company, xAI, is reportedly seeking a massive new funding round of up to $6 billion at a $50 billion valuation. According to CNBC, the deal is set to actually close early next week, and the money is going to be used largely to buy thousand Nvidia chips. The company, if you recall, raised a $6 billion Series B in May at a $24 billion valuation. And the majority of this funding round apparently is expected to come from Middle Eastern sovereign wealth funds. So, Paul, we've talked about Elon Musk recently. He seems very well capitalized headed into 2025. Even more importantly, he seems more well connected than ever, given the recent Trump election win. Like, the new money, the election result, what does this all mean for xAI?
Paul Raitzer
I don't know if he's going to take it public in 2025, but this company can raise as much money as they want. Like, given his clout within the incoming government and his influence over everything that's going to happen. Yeah, I mean, I can't imagine they're not going to raise a Series C and a Series D at some point next year, if not, like, start looking at an IPO. Yeah, I mean, this company is going to just skyrocket, and whether they, you know, start delivering right away or not, it's just their own distribution. Like, they have Tesla cars, they have the X platform, they have Neuralink, they have SpaceX, they have all Elon's companies. And this company is going to be, like, the AI platform for all of those companies. And yeah, it's going to be wild to watch this. I don't think we talked about this, but there was this, I don't know, it was insider information, somebody had this story last week about how, like, I think it was OpenAI was rumored to have hired a plane to fly over the new data center that Musk built in Memphis, because they were so shocked that he was able to build the thing so fast. And they were basically doing reconnaissance missions trying to figure out, like, how they were doing this. That's wild. Yeah, it's going to be nuts to follow.
Mike Kaput
All right, some other fundraising news. Writer, which is a leading generative AI startup, has just raised $200 million in a Series C round that values the company at $1.9 billion. Writer, we've worked with them several times and talked about them a bunch. They have typically been a generative AI platform that helps enterprise teams generate content securely at scale. However, they've expanded beyond that initial valuable use case to now offer a, quote, "full stack generative AI platform for enterprises." In this funding announcement, it also sounds like some further evolution could be underway, because Writer says that, quote, "the new capital will help cement the company's leadership in the enterprise generative AI category and fuel Writer's development of enterprise-grade agentic AI." There's those agents again.
Paul Raitzer
I'm going to hear it in every press release, every funding round, every earnings report. You're going to see AI agent or agentic AI in all of them.
Mike Kaput
So Paul, we've known the folks at Writer for a long time. Like, what can we learn about their trajectory, or the overall trajectory of the startup market, based on this funding?
Paul Raitzer
Yeah, good people. A huge fan. I think May Habib, their CEO and co-founder, is awesome. I've had a chance to spend time with her and get to know them. They've made a bet that big frontier models aren't necessary to create enterprise value, and they've been building their own, in some cases domain-specific, models to start going after verticals. And it's been probably viewed as counterintuitive for a couple of years, as these scaling laws have kind of gone and the research has shown, like, the frontier models are just going to obsolete these smaller models, that they'll just be smarter than them all. So, you know, good on them for continuing to stick to that vision and that bet. And I do think there's going to be a place in the market for these vertical, domain-specific, smaller models where the post-training, like the reinforcement learning, all that stuff, is very fine-tuned to specific domains. I think there's a massive market for that, and so I think companies like Writer are well positioned to take advantage of that and keep growing.
Mike Kaput
All right, Paul, we've got one final topic here, then two quick announcements. I'm going to kind of roll these together as we wrap everything up here. But first up, in a new episode of the Big Technology Podcast, Spotify's chief technology officer, Gustav Soderstrom, shared some really interesting ideas about how AI is reshaping the music industry and how Spotify is reacting to all this. Rather than viewing AI-generated music as a threat, he says Spotify sees it as the latest evolution in music creation tools. He emphasized Spotify doesn't plan to generate music itself, but it will serve as a platform for creators who use AI tools, as long as they follow the right laws and licensing. And he said on the recommendation front, Spotify is evolving from simply algorithmic suggestions to become kind of a, quote, "ambient friend," an AI-powered presence that understands context and can engage in two-way conversations about music. He also said that as AI capabilities grow, the scarcity of genuine human connection might make it more valuable than ever. Will we care if our favorite new song was created by AI? Or will the human stories behind music become even more important? Paul, these are some pretty interesting points that kind of hint at a larger tension that creative industries are trying to figure out. Like, what exactly does human creativity mean when AI can generate great music or art? How much are we going to care if something was created by AI or not, as long as it resonates? Like, that seems like there are some really big questions at play here.
Paul Raitzer
I would go listen to this. I think Alex, you know, Kantrowitz does a great job. He asks very direct, challenging questions. I love listening to his podcast. It's the first time I've heard Gustav speak, and I found him to be incredibly thoughtful and very balanced, and both, like, insightful but also honest about his uncertainty about what comes next. And I thought he was being very transparent about Spotify's approach to this. And I'd like to believe he's right, because he talked about humans valuing human experiences and creativity even more in the age of AI content abundance and overload, which is what I've been betting everything on, that that's what AI could be: more intelligent, more human. Like, yeah, very bullish on in-person events and, you know, podcasts like this where perspectives and points of view are shared, not just, like, AI-generated stuff from a PDF. So, I don't know, I just thought it was a very vulnerable interview where I just felt like I really liked this guy. Like, I want to hear this guy talk more, because I feel like that's the kind of, like, deep thinker that I love to learn from. So I would just suggest going to listen to it. It was a pretty quick clip that I saw. It was like a 10-minute clip or something. So lots to be learned, and I think big areas that we all need to be exploring more as we go forward.
Mike Kaput
All right, Paul, so at the end here, two quick announcements you've got for us on a webinar and an upcoming special podcast episode.
Paul Raitzer
Yeah, so, you know, on recent episodes I've talked about this Co-CEO GPT that I've built for myself for internal purposes. And we've gotten lots of inquiries about this and how it works and things like that. And so what I've decided to do is we're going to host a webinar on December 17th, and I'm going to actually demo what I've built. We're going to show you how to build your own and share a prompt you can use for it. So whether you are a CEO, or you just want to be able to talk to the CEO and understand how they think and work and kind of approach things like a CEO would, we're going to give you the tools to do that. So stay tuned. We don't have the webinar page live yet, but you can go to www.smarterx.ai/newsletter and subscribe to the newsletter. And we will alert everyone that's a subscriber as soon as the page is live so that you can register for that. So that's the one big one. And then the second is we're going to do a special episode in December. I don't remember the date on this one. Mike, do you have the date?
Mike Kaput
I think we had said we're going to release a special episode on Thursday, December 20th. Whoops. Thursday, December 19th. So before Christmas break.
Paul Raitzer
Okay. So same week. And what we're going to do is 25 AI questions for 2025. We were looking at the data, and of the top 10 podcast episodes we've done, three of them are these Q&A episodes. We figured, all right, let's give the people what they want, I guess, and focus on things to think about going into next year. But we're going to do a twist on this one and let you contribute questions. So if you go to bit.ly/25-questions-episode, or go to the show notes, it's going to be in there. It's a Bitly link specific to a Google form where you're going to be able to submit questions, and then Mike and I will curate those and integrate a bunch of those into that special episode. So again, coming in December, we're going to have a Co-CEO webinar on how to build your own Co-CEO, and 25 AI questions for 2025. Check the show notes for both of those.
Mike Kaput
Great. Paul, as always, thanks so much for breaking everything down for us this week.
Paul Raitzer
Thanks everyone. And again, final reminder, no episode next week, November 26th. We'll be back on December 3rd. Thank you as always for listening.
Mike Kaput
Thanks for listening to the AI show. Visit marketingaiinstitute.com to continue your AI learning journey and join more than 60,000 professionals and business leaders who have subscribed to the weekly newsletter, downloaded the AI blueprints, attended virtual and in-person events, taken our online AI courses, and engaged in the Slack community. Until next time, stay curious and explore AI.
Podcast Summary: The Artificial Intelligence Show
Episode #124: Has AI Hit a Wall?, What Is An AI Agent?, Dario Amodei Interview, OpenAI’s New Agent, Greg Brockman Returns & Microsoft Copilot’s Woes
Release Date: November 19, 2024
Hosts: Paul Roetzer and Mike Kaput
In episode #124 of The Artificial Intelligence Show, hosts Paul Roetzer and Mike Kaput dive deep into several pressing topics in the AI landscape. They continue last week's discussion on whether AI scaling laws have plateaued and explore the nuanced concept of AI agents. The episode also features a comprehensive discussion on a recent five-hour interview with Dario Amodei of Anthropic, examines OpenAI’s upcoming AI agent tool, Greg Brockman's return to OpenAI, and analyzes Microsoft Copilot's recent challenges.
Timestamp: 04:40
Mike Kaput opens the conversation by addressing the prevalent debate on whether AI development has stagnated. Recent reports from Bloomberg indicate that major AI players like OpenAI, Google, and Anthropic are experiencing diminishing returns despite significant investments in computing power and data. Key issues cited include:
Notable Quote:
“There is no wall.” – Sam Altman ([0:57])
Paul Roetzer counters these claims by highlighting that internal voices within AI labs, including leaders like Oriol Vinyals from Google DeepMind and Miles Brundage, former OpenAI advisor, disagree with the notion of hitting a scaling wall. He emphasizes that AI models are complex systems requiring extensive human oversight in goal setting, planning, building, and monitoring, thereby ensuring that AI development continues to progress.
Timestamp: 14:23
The discussion transitions to the concept of AI agents, a term gaining traction among leading AI companies like Google, Microsoft, and Salesforce. Paul clarifies the traditional definition of an AI agent as a system that autonomously takes actions to achieve predefined goals, differentiating it from current AI models like ChatGPT that primarily generate responses based on input.
Key Points:
Notable Quote:
“If you hear about AI agents and you think, oh my gosh, they're taking my job next year, that is not happening.” – Paul Roetzer ([33:29])
Paul Roetzer underscores the necessity of human roles in managing AI agents, emphasizing that tasks like setting goals, planning, and monitoring remain human responsibilities. He advocates viewing AI agents as opportunities rather than threats, encouraging listeners to explore building and managing these agents to enhance productivity and creativity within their organizations.
Timestamp: 36:31
Paul and Mike delve into insights from a recent five-hour interview between Dario Amodei and Lex Fridman. Key takeaways include:
Notable Quote:
“Models are complex. They don't work like traditional software where you just brute force a bunch of code and release.” – Paul Roetzer ([14:23])
Amodei emphasizes that despite media sensationalism about hitting a wall, the AI field continues to progress, driven by substantial investments and continuous innovation. He predicts the scaling costs will escalate, with training clusters potentially reaching $100 billion within a few years, reinforcing the sustained commitment to advancing AI capabilities.
Timestamp: 50:31
OpenAI is set to release an AI agent tool named Operator in January, capable of performing complex tasks such as writing code and booking travel by directly controlling a user's computer. This tool will be available both as a research preview and through OpenAI's developer API.
Paul Roetzer notes that while this represents a significant advancement towards traditional AI agents, widespread consumer or business adoption may still be some time away. He anticipates seeing impressive demonstrations but remains cautious about immediate, life-changing impacts for end-users.
Timestamp: 52:59
Greg Brockman, co-founder of OpenAI, has returned to the company after a three-month sabbatical. In an internal memo, Brockman announced his new role focused on addressing major technical challenges alongside Sam Altman. While speculation remains about the reasons behind his sabbatical and the implications for OpenAI’s leadership dynamics, the hosts suggest that Brockman’s expertise will be crucial as OpenAI ventures into new technical territories like the Orion model.
Timestamp: 53:41
A study highlighted in the Harvard Business Review and soon to be featured in Management Science examines the impact of generative AI on freelance job postings. Analyzing over 1.3 million job postings from July 2021 to July 2023, the research reveals:
Notable Quote:
“AI used by somebody else will take your job.” – Jensen Huang ([34:47])
Paul reflects on the study's timeframe and anticipates even greater impacts as AI adoption accelerates. He encourages individuals to proactively adapt by embracing AI tools and developing skills in AI agent management to stay relevant in evolving job markets.
Timestamp: 58:42
Google's latest Gemini model (gemini-exp-1114) has ascended to the top of the Chatbot Arena leaderboard, surpassing models like GPT-4 and Claude. This model excels in several categories, including math (ranked #1), creative writing (#1), vision (#1), and coding (#3).
Paul Roetzer explains that such leaderboard rankings are indicators of upcoming model releases and advancements. He suggests that significant jumps in rankings may signal imminent major updates or new models from Google, urging listeners to stay tuned for further developments.
Timestamp: 62:52
Microsoft Copilot faces criticism following an in-depth investigation by Business Insider, which reveals:
Notable Quote:
“I really feel like I'm living in a group delusion here at Microsoft.” – Microsoft employee, referring to the gap between what the company was promising and what Copilot can actually do ([68:51])
Despite these setbacks, Microsoft counters with success stories, highlighting over 200 examples of companies benefiting from their AI tools, including Copilot. They report significant productivity gains and returns on investment, suggesting that while initial implementations face hurdles, tailored, custom solutions can deliver substantial value.
Paul Roetzer advises businesses not to be dissuaded by Copilot’s current issues. Instead, he recommends exploring alternative AI tools like ChatGPT or building custom GPTs tailored to specific workflows, emphasizing that generative AI remains a potent tool when properly implemented.
Timestamp: 70:25
Elon Musk's AI venture, xAI, is reportedly raising up to $6 billion at a $50 billion valuation, primarily from Middle Eastern sovereign wealth funds. This substantial funding round aims to acquire Nvidia chips essential for AI model training. Given Musk's influence and the strategic importance of AI, xAI is poised to become a significant player in integrating AI across Musk’s portfolio of companies, including Tesla, SpaceX, and Neuralink.
Timestamp: 72:01
Writer, a leading generative AI startup, has secured $200 million in a Series C funding round, valuing the company at $1.9 billion. Known for its secure, scalable generative AI platforms for enterprises, Writer is expanding into full-stack generative AI solutions, including developing enterprise-grade agentic AI. This funding underscores the growing demand for specialized AI models tailored to specific industry needs.
Paul Roetzer praises Writer's strategy of leveraging domain-specific models over relying solely on large frontier models. He anticipates a robust market for vertical, finely-tuned AI solutions, positioning Writer for continued growth and leadership in the enterprise AI sector.
Timestamp: 75:23
In a recent episode of Big Technology Podcast, Spotify's CTO, Gustav Soderstrom, discussed how AI is transforming the music industry. Spotify embraces AI-generated music as an evolution in creative tools rather than a threat. The platform plans to support creators using AI tools within legal and licensing frameworks while enhancing its recommendation system to become a more interactive, context-aware AI presence.
Key Insights:
Paul Roetzer reflects on the balance between AI-generated creativity and the enduring value of human-driven artistic expression, highlighting the broader implications for creative industries grappling with AI integration.
Webinar on Building Your Own Co-CEO GPT
Timestamp: 78:15
Paul announces an upcoming webinar on December 17th, where he will demonstrate how to build a Co-CEO GPT, a custom AI model designed to emulate CEO decision-making and thought processes. This session will provide participants with practical insights and a prompt to create their own AI assistants, beneficial for both executives and professionals looking to integrate AI into their workflows.
Special Podcast Episode: 25 AI Questions for 2025
Timestamp: 79:37
Scheduled for December 19th, the hosts will release a special episode featuring 25 AI questions looking ahead to 2025. Listeners are encouraged to submit their questions via a Bitly link provided in the show notes. This episode aims to address the most pressing AI inquiries, offering curated insights from Paul and Mike.
Episode #124 of The Artificial Intelligence Show offers a comprehensive exploration of current AI trends, challenges, and advancements. From dissecting the realities of AI scaling and agent autonomy to examining the latest developments from industry giants like OpenAI, Google, and Microsoft, Paul Roetzer and Mike Kaput provide valuable perspectives for business leaders and AI enthusiasts alike. The episode underscores the importance of human oversight in AI development, the evolving role of AI agents as tools rather than threats, and the ongoing transformation across various sectors driven by generative AI technologies.
Final Notable Quote:
“AI will not take your job. AI used by somebody else will take your job.” – Jensen Huang ([34:47])
For a deeper dive into these topics and more, listen to Episode #124.
Note: This summary excludes advertisements, introductory remarks, and concluding segments that do not pertain to the core content discussions.