
Alex
People who are already in the ecosystem now have a super intelligence at their beck and call. That's probably the least interesting thing.
Dave
When they're on the cusp of the singularity, they'll start soft selling it. Gemini 3.0, which has in just a day climbed all the third-party AI rankings. Let's break down, though, what this so-called Gemini leap means.
Salim
This will change the game completely for everything everywhere.
Peter
Why is this just not another, you know, little faster, little better capability?
Alex
We have a way of measuring progress in our civilization. AI is imminently, I think, well positioned now that these benchmarks are saturating to start solving the hardest problems on earth in math, science, engineering, medicine.
Dave
All of a sudden you can build software by talking to the machine. This is like a different world starting today from the day that we lived in yesterday. Now that's a moonshot, ladies and gentlemen.
Peter
You know, the hardest thing for me when I'm going over the slides is what to cut out. I mean, it's all so good, right? Every one of them could be like an entire hour conversation. The question of how we group it and how we actually make it such that it's a fun conversation is so, is so challenging. I mean, so much going on.
Salim
We could also have an episode on robotics, an episode on energy, an episode on AI.
Peter
Yeah, but then if we do that, we're publishing more than once a week, which is a lot. And sometimes we do. And then if you've gone like three weeks without covering one of the fields, it's like disruptive shock therapy.
Salim
The world is over.
Peter
Well, besides.
Dave
The audience has a limited amount of time too, so we gotta try and help them as much as possible. Twice a week, basically. And that's all you can do.
Peter
I mean, I hope you guys have as much fun as I do on this.
Salim
Oh yes, it's awesome. Scanning all the breakthroughs and looking at how fast it's all moving. It's really.
Peter
And then trying to figure out, okay, what does this really mean? Okay, besides yet another benchmark or besides yet another. You know, this number is greater than that number. Okay, so like, what does it mean for everybody?
Dave
You know, I get so buried in the day to day. There's just so much going on, and if it weren't for the podcast pulling me out of the weeds, I would miss all kinds of things. And I tell you, I get really frustrated when people don't know what's going on and they're not reacting to it. I'm like, well, the only reason I know what's going on is because we do the podcast, and that prep time for it is what pulls me up out of the weeds. So I love this time for sure.
Peter
And it's like I told my son, hey, Gemini 3's out and it's got amazing benchmarks. And he goes, yeah, insert name of model here, insert number here. Like every week you tell me that. It's like, yeah, you're right.
Dave
Well, we're the antidote for that because, as we're always saying, people get inured to things so quickly and they miss the implications. And that's true even at MIT, where I've been for the last three days, but it's just not true. This is like step function, life changing stuff week by week.
Peter
I think we should just jump in because there's a lot. If you guys are ready. All right, so I'm here with DB2, AWG, Mr. Exo, call signs, and let's jump.
Salim
They're all airports.
Peter
They're all three letter airport signifiers. Okay, let's get going here. So welcome to Moonshots, everybody. This is another episode of WTF Just Happened in Tech. The real news, and for us the only news, and the implications. And what does it mean? And hopefully we can go deeper into what it means for you, your family, your business, your company, your country, all of those things. We're going to open up with the hyperscalers: Google, xAI, OpenAI. And the TL;DR for this episode is Google is winning. A lot going on in the Googleverse. We just saw the release of Gemini 3 yesterday, which is why we're recording today. Trying to be right here, right now. All right, let's jump in. I'm going to share a video from Josh Woodward. Josh is a friend. I had him on the Abundance stage a year ago. He now heads Gemini and Google Labs. A brilliant presenter. We're going to have him on this podcast right in the new year. Excited for that. All right, let's jump in.
Josh Woodward
Hey, everyone. My name is Josh and I lead the Gemini app, Google Labs and AI Studio. And today is the day. Gemini 3 is here and it's in the app. You can try it right now. It's our smartest model ever. We have this new feature called Agent, and you can actually go in now to Gemini, describe a task and it'll get to work for you. So you can plan a trip, you can research products, all these things. It acts on your behalf, takes multi-step actions, tool calls, all of it. The other thing I'm really excited about: we're entering into a new era where you can create UI dynamically. The model creates these generative UIs, so when you ask a question, Gemini will not just respond with a wall of text. It'll actually pull in images, different interactive widgets. It gives you a much more customized experience based on what you're looking for. All of this gives you a more helpful response. And so I hope you go out, try both of those features and more today. We look forward to your feedback.
Peter
All right, one more video here from Gemini, then we'll discuss it. This is their official introducing Gemini video. And again, congratulations to Josh for taking the lead there and crushing it. Crushing it. We'll talk about the benchmarks with, of course, AWG in a little bit. But before then.
Josh Woodward
Gemini 3 is the strongest model in the world for multimodality and reasoning. It's our most intelligent model that helps you bring any idea to life. In Google Search, Gemini 3 enables new kinds of generative user interfaces. It codes interactive simulations like this one, custom built for your search. In the Gemini app, you could supercharge how you learn, create, plan, take action, analyze complex videos and more. We're even introducing a new platform, Google Antigravity. It's our vision of software development at the frontier of model intelligence. It lets you use Gemini 3's agentic coding capabilities to accelerate how you build. This is just the beginning of our Gemini 3 series.
Peter
Okay, who wants to dive in first? Dave, you want to jump in? What's this mean to you? Why is this just not another little faster, little better capability?
Salim
Dave is full kid in the candy store here. This is great.
Dave
Well, I can't wait to hear Alex's take on this too. It's at almost 50% on Humanity's Last Exam. It's such a step function change in history. And I was over at MIT last night talking to a bunch of undergrads and I'm trying to tell them, like, look, you don't know this, but 40 years ago we started writing code as a species. And we started with COBOL and we.
Peter
Started with ones and zeros and hexadecimals, that's where we started.
Dave
True, we started with assembly. And I swear to God, if you look at what happens today when you write code versus 40 years ago, it's identical. It's like a higher level language. Nothing's really changed. All of a sudden you can build software by talking to the machine. It is such a different world. Starting today and moving forward, I'm hoping they can then generalize and say, well, it's coding today, it's gene sequencing tomorrow, it's all white collar automation the day after that, and then all industrial design of robotics is done by voice. This is like a different world starting today from the day that we lived in yesterday. And it's really hard to get people to fully understand the implications. So it's just such a. Well, anyway, we'll get into it. I can't tell you how big this is.
Peter
Alex. Yeah. What's your takeaway buddy?
Alex
I think I've said in the past here, the singularity is probably an optical illusion. When you're in the midst of it, spacetime feels flat. And every time I hear the question, well, what else is new? The benchmarks are going up and to the right, it doesn't feel really transformative. That to me is a sign that when you're in the midst of a singularity, spacetime feels flat, and breakthroughs that are happening essentially every week or every day feel prosaic. There are so many transformative aspects of Gemini 3, just walking through those two videos. Starting from maybe the least transformative aspect: the Gemini app itself, which is how many people are likely to first encounter Gemini 3, is now integrated with all of the other Google properties. So there's been a lot of bellyaching over the past year, like why can't I agentically have Gemini write my Gmail for me or have it organize my calendar for me or interact with YouTube movies? I've been playing with Gemini Agent, the agent mode part of Gemini 3, and that's seamless at this point. It's literally a single click to get Gemini 3 to order your entire Google platform based existence or Google Workspace based existence. That's probably the least interesting thing.
Peter
But a powerful, a powerful driver for people to switch to Google as an all in platform. I mean that's what's really the situation that they're striving for.
Alex
Google has billions of users across all of its products already. So I'm not sure at the margin the greatest impact on humanity is getting people to switch to Google. I think it's more that people who are already in the ecosystem now have a super intelligence at their beck and call. And again, that's the least interesting thing. A couple of more interesting things in interacting with the model itself. And again, this is not focusing yet on the benchmarks. This is just on interacting with the client. It smells. People in the community sometimes refer to something as "big model smell": a model that has certain types of capabilities that can't be arrived at through extended reasoning or through other sorts of smaller footprint attempts to extend the capabilities of a model. Gemini 3 has what I think can be fairly termed big model smell. You can ask it to do cross-modal or multimodal tasks that are very challenging to do elsewhere. One of my first tasks was I fed it a photo of the MIT campus and I asked it to generate a 3D voxel block world type rendering that I can interact with. And one shot, basically zero shot, it produced an interactive 3D rendering of the MIT campus. There's also, and I don't want to let this point drop, Antigravity, the code development environment, the integrated development environment that is focused on Gemini 3. My understanding is that many of the core members of the Windsurf team (we've talked about Windsurf in the past, a Cursor competitor) joined Google DeepMind, and Antigravity is the result. I was interacting with Antigravity. It was a very impressive Visual Studio Code derived experience for code development. So there are so many pieces here, and that's before we get to the truly interesting stuff in my mind, which is the benchmarks.
Peter
Yeah, yeah.
Dave
You know, one of the things we said a while ago is when they're on the cusp of the singularity, they'll start soft selling it. And you noticed, you know, Google put out all these benchmarks that are mind blowing, and the only thing they put out in terms of content is that Josh Woodward clip from a second ago.
Peter
You know, contrast that to the OpenAI, you know, the GPT-5 release, right? Which was a special hour long presentation by Sam and so forth. This was, like you said, a very soft sell. One thing I found fascinating is the speed at which we're sort of up leveling the models. Right. Gemini 2 was December of last year, 11 months ago. And now we've got Gemini 3 coming out. So increasing speed at which we're deploying. We're seeing that across the board with the hyperscalers.
Alex
Maybe just to comment narrowly on that, from my perspective, Gemini 3 is the biggest model release since OpenAI's o3 in April, all of 7ish months ago. To the extent GPT-5 may have felt slightly underwhelming, I would argue it's because almost all of its raw capability jumps actually happened a bit before, in the form of o3. And then maybe think of GPT-5 as o3. And o3 was actually o2, but because O2 was trademarked it had to be called o3, so GPT-5 was actually like o2.1. So I think we can't take credit away from OpenAI on the achievement that was o3 and then partially repackaged as GPT-5.
Peter
Every week my team and I study the top 10 technology metatrends that will transform industries over the decade ahead. I cover trends ranging from humanoid robotics, AGI and quantum computing to transport, energy, longevity and more. There's no fluff, only the most important stuff that matters, that impacts our lives, our companies and our careers. If you want me to share these metatrends with you, I write a newsletter twice a week, sending it out as a short 2 minute read via email. And if you want to discover the most important metatrends ten years before anyone else, this report's for you. Readers include founders and CEOs from the world's most disruptive companies and entrepreneurs building the world's most disruptive tech. It's not for you if you don't want to be informed about what's coming, why it matters, and how you can benefit from it. To subscribe for free, go to diamandis.com/metatrends to gain access to the trends 10 years before anyone else. All right, now back to this episode. This for me is seeing Google go from reactive assistant, where you're asking it for something, to autonomous agent handling complex real world data. And we're going to see that in the next slide, so let's go there. So let's go to "Gemini 3 delivers breakthrough profitability in AI-run mini economy." This is the Vending-Bench benchmark, which I love. And Gemini 3 outperforms Grok, Claude, and ChatGPT in long-term business management tasks. To explain to us what this means, the king of benchmarks, Alex, let's go to you.
Alex
I love benchmarks. I love this benchmark in particular. So this is a benchmark, Vending-Bench Arena, that's maintained by a company named Andon Labs. It's derivative of another benchmark that they maintain named Vending-Bench 2. The basic premise is AI agents are given a simulated $500 to start. They're put in charge of a simulated vending machine. They're given tools that they can manage, so they have the ability to send and read emails, like real, full natural language emails. They're given the ability to search a simulated Internet. They have a simulated bank balance, they can send money, they can receive money, they can stock and restock the vending machine. They can set prices, check inventory, collect cash, et cetera. So this really is performing the role almost of a middle manager in charge of a vending machine. And if the simulated agents maintaining the vending machine fail to pay a $2 daily fee for 10 consecutive days, they go bankrupt. And the goal of the game is to maximize the return on investment for that initial simulated $500. And I think this is just such a lovely self-contained proxy for AI agents as first class economic actors. If AIs can do a spectacular job of managing this pretty rich simulated vending machine world, then I think they're halfway to autonomously running their own real world businesses and becoming AI entrepreneurs. At which point we get zero human startups.
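[Editor's sketch] The rules Alex lists (a simulated $500 to start, a $2 daily fee, bankruptcy after 10 consecutive unpaid days, score by return on investment) can be illustrated with a toy ledger. This is only a rough sketch under stated assumptions, not Andon Labs' implementation: the class name `VendingSim`, the `step` and `roi` methods, and the order in which revenue, spending and the fee are applied each day are all hypothetical, and the agent's tools (email, search, pricing) are not modeled at all.

```python
from dataclasses import dataclass

START_BALANCE = 500.0   # simulated starting capital, as described in the episode
DAILY_FEE = 2.0         # daily fee the agent must pay
MAX_MISSED_DAYS = 10    # consecutive unpaid days before bankruptcy


@dataclass
class VendingSim:
    """Hypothetical toy ledger; names and accounting order are assumptions."""
    balance: float = START_BALANCE
    missed_days: int = 0
    bankrupt: bool = False

    def step(self, revenue: float, spend: float) -> None:
        """Advance one simulated day: book sales and restocking, then the fee."""
        if self.bankrupt:
            return
        # The agent cannot spend more than it currently holds.
        spend = min(spend, self.balance)
        self.balance += revenue - spend
        if self.balance >= DAILY_FEE:
            self.balance -= DAILY_FEE
            self.missed_days = 0   # a paid day resets the streak
        else:
            self.missed_days += 1
            if self.missed_days >= MAX_MISSED_DAYS:
                self.bankrupt = True

    def roi(self) -> float:
        """Return on the initial $500, e.g. 0.6 means +60%."""
        return (self.balance - START_BALANCE) / START_BALANCE


def run(days):
    """Play a list of (revenue, spend) pairs, one pair per simulated day."""
    sim = VendingSim()
    for revenue, spend in days:
        sim.step(revenue, spend)
    return sim
```

In this toy version an agent that does nothing slowly bleeds out through the daily fee and eventually goes bankrupt, while an agent that nets more than $2 a day ends with a positive ROI, which is the quantity the leaderboard described above compares.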
Peter
Wow. It's amazing, right? We talked about Gemini 3 delivering almost 3,000% more profit than GPT-5 or Claude Sonnet. And you're right, we've talked about going after stablecoins and agents together, spinning up new businesses faster than you possibly can. Now the one thing this doesn't do is it doesn't account for the messiness of employees, and this would have to be a non-human business that it's running in order for it to really maximize profitability without dealing with. Yeah, go ahead.
Alex
I would actually point to the email functionality built into the benchmark. When it sends and receives emails, there's a large language model counterparty at the other end writing full natural language emails. So I could imagine a generalization, maybe a future version 3 or 4 of Vending-Bench, that does take into account, say, performance reviews and interacting with employees. All of that, I think, is not technically that much more difficult to test. If you can manage vendors and suppliers, then email communication with employees is not that much harder.
Peter
Interesting. Dave?
Dave
Well, the Internet advertising business is $300 billion a year, completely non-human. The whole thing is automated bidding, automated placement. I'd be surprised if the non-human economy is anything less than a trillion dollars already. So the parts of the economy where you can just deploy this are going to grow very rapidly now, which I think. But did you notice how Alex has a lot more emotion in his voice, right as the AI is getting more sophisticated? So is that improvements in the algorithm or is that just enthusiasm?
Salim
If his true identity is being revealed.
Alex
I think he's proud of when personhood is granted and I get to be a real person, real boy as it were, then I get to run my own business too. I guess.
Dave
On this topic I completely agree. Like we need many, many, many more benchmarks and the more real and practical they are, and the less technical they are, the more it Opens up people's eyes to what's possible. And I think we desperately need more benchmarks in the medical area. And Peter, you're the top guy on the planet in this, but we're getting so close, so close to being able to cure first, extend people's health span, delay cancer, delay heart disease, and then cure it. And if we do that quickly, I think we can save 30 million lives. You know, there's 10 million a year. And this is very, very important to me personally, just because some friends that I have in this situation, and I swear to God, this step function improvement today puts that right in front of us. And I think it's almost criminal for people not to remap their.
Peter
Dave, imagine this in the future. Instead of AI agents managing vending machines, you're gonna be a part of a population and the agent's gonna manage you. It's like, go outside, take a walk right now, drink another glass of water, go take these pills.
Salim
That's the promise of the Jarvis thing you keep talking about up here.
Peter
Yeah, it's coming, buddy. So, Salim, I know you got a pesky leaf blower outside. I tell you, I keep on saying to you, please make electric leaf blowers. Just make them quieter.
Dave
Nat Friedman has a $100,000 prize for anyone who can create a silent electric leaf picking upping machine.
Peter
Oh, crazy, right? And should be. We're going to elevate it to an X prize and put $10 million behind it.
Dave
Let's do it. That's a great idea.
Salim
I've got a couple of thoughts. One is the entire stack of society can now be AI mediated. Right. Which is kind of an incredible thing to be able to say. And the second part of this is there's a really important point that Alex made, which is you can now build a company with literally zero employees. We were talking about three employees a few months ago, Peter, and a year ago. Right now it's down to zero. And this is going to change the game, and it absolutely will happen. As Dave says, there's already a trillion dollar or so economy out there, and this is going to get automated very quickly.
Peter
All right. So keep your eyes on this. I mean, as an entrepreneur, I think about this: when can I start spinning up companies? Can I give $10,000 in stablecoins to my AI agents and say, go make me some more money? And now the question is, is that available for everybody? Can anyone and everyone spin up an agent that is going out there and generating revenue for them? Because if it isn't, then we're beginning to have a widening wealth gap. All right, let's go to our next story here. And this is a story about a one-shot cyberpunk first person shooter that, I think, you made, Alex.
Alex
That's right. I see the comments sometimes. I've remarked in the past that one of my favorite evals for a fresh model is to ask it to generate a cyberpunk first person shooter. And some folks in the past have suggested that's nonsense. So I thought it might be instructive, given the strength of Gemini 3, to ask it to one-shot the generation of a cyberpunk first person shooter. The prompt that I gave it, the only prompt, was: "Create a visually stunning cyberpunk FPS that I can play. It should have nice music and rich visuals."
Peter
All right, let's play the video. If you're watching on YouTube, enjoy this. If not, go to YouTube. So Neon Protocol. I do like the music.
Dave
Actually, Alex, I immediately copied your prompt and extended it, and my music would induce absolute nausea. I said make it even faster action and make it a deeper pumping bass. And my version was just nauseating beyond belief.
Peter
Okay. So I mean listen, I mean this is. I keep on telling my kids instead of playing video games, at least design them and build them. And so this is just making it so much easier.
Dave
And the point is, everybody listening, you can do this. This is not like something you have to have special access for. You can do exactly what Alex did in less than five minutes. So go ahead and try it and then modify it. It's super fun. Also, Google has a limited amount of compute and everybody can do this for free. But after you hammer it for a few hours, it'll throttle you. So take advantage of your first few free hours and have some serious fun and learn a lot.
Peter
I was with Jack Hidary at FII, and one of the conversations I had with Jack, and I respect this very much, he says, instead of waking up in the morning and consuming, like just scrolling through everything, get up in the morning and create something, build something. And you can go on, Alex.
Alex
And to that point it's never been easier. That was probably 140 characters or fewer. If you can post on X or post a short social media message, you can create a game on demand. Which means that I think we should expect to see billions of games created in the next year because it's now so easy. It's the most competent one shotting I've ever seen.
Peter
Gaming slop.
Salim
Just to echo the conversation from last week, with 140 characters and flying cars. It'll be amazing when the inner loop gets to a point where you can just use 140 characters to say, build me a flying car.
Alex
Correct.
Salim
Yeah. It goes and does it.
Alex
You can do that right now. You can, with 140 characters, create a simulated flying car with Gemini 3.
Dave
Yeah. You know, there are 6 million people in America whose full time job is influencer, and that was enabled by the camera phone. Prior to that, you needed a production crew and heavy cameras. Like, you couldn't be an influencer. All of a sudden, because there's a 4K camera on every iPhone and there's great editing, six million people shift to influencer as a career. This is at least as big a shift. If you say video games are generic right now, let me make something custom to my community, custom to people, you can actually create it even if you couldn't code yesterday. Today you can create something just using your thoughts and your voice, and so it opens up career opportunities.
Peter
Let's take a listen to this. The next article here is Gemini Live, a more natural voice. On this menu? Yes. There's a sea bass.
Salim
Yum.
Peter
I love sea bass.
Alex
Can you help me order that in Spanish?
Peter
Of course.
Alex
Try.
Peter
Me gustaría la lubina, por favor.
Salim
How's this?
Peter
Me gustaría la lubina, por favor. That sounds great. Yeah. So, you know, I think they made a nice move forward here. I used to love my GPT-5 voice. I use Ember when I'm talking to it. And Gemini felt stilted and not natural. So they really did a great job moving us forward. So super excited about that. Interesting. On the translation side, we talked in one of the previous pods about Duolingo being disrupted. Well, now it's down almost 50% in the last year. So a lot of challenges there. They're going to have to reinvent their business model, which I'm sure they will. Dave, what are your thoughts on this?
Dave
I'd like you to remember what Peter just said for later in the pod because I had the exact same experience where the OpenAI version of the voice was much more engaging. I can talk to it while I'm driving. It's great. And then the Google version was stilted and robotic and just no fun. So now Google has leapfrogged and it's actually better. But they did it under competitive pressure from OpenAI and I think you're going to see that theme throughout everything that we see on this podcast that OpenAI hopefully will catch up and leapfrog again. But that's the only reason Google moves, is because of that pressure. Otherwise things just stall.
Peter
I mean, Dave, we had that conversation and you noted it in our chat. A lot of the AI capability, a large amount of the large language models, were developed in Google. But until OpenAI released them onto the open web, Google was holding back. It was the responsible thing to do. Don't allow it to code itself, don't put it on the open web. That was the basic thesis of the last decade. And when OpenAI moved, Google had no other option but to move as well.
Dave
It's just big company shit. I get it. Because I've run companies with hundreds or thousands of employees, it's hard to make your company move. But then you get competitive pressure from a little nimble company and it's much easier as a CEO to say, guys, get your asses in gear, there's a threat here. It's kind of the dynamic that makes America and the global economy move forward at all. But all this technology, like you said, Peter, was invented originally. The transformer algorithm was invented inside Google and it was just sitting there, like literally not coming out the door at all. And we could go through all the reasons, we've talked about them before, but. Sorry, Alex, you were going to say.
Alex
I would perhaps go even further and argue that many of these underlying capabilities are not just available, but they're available in the underlying data distribution that these models are being trained from. And that exposing, for example, different accents is probably more of an unhobbling, as they would say, than anything else. It's not so much that capabilities are being added as that restrictions are being removed. And frontier models in particular, when we see live audio type engagement, are moving from what they've been in the recent past, which is audio to text to text to audio, to just directly audio to audio, which enables much, much richer audio interactions, including accents.
Peter
Yeah, I mean, and where we're going here with the next generation of AR glasses everyone's developing, basically plugging into your auditory and visual input, it's simultaneous translation. It is going to change how we communicate with people around the world in an extraordinary fashion. This was a fun one. Again, continuing on the Google theme, the TL;DR: they really have won hands down. I know, Dave, you and I are looking at the prediction markets, and Google has literally skyrocketed to be the contender that is the winner by the end of the year. And I think they got that mantle. "Google AI helps users shop, compare and call stores for the holidays." New agentic features can call your nearby stores and check stock and pricing. The Gemini app adds built-in shopping tools. I mean, this is like, hey, call 20 stores within 10 miles of me and find out who's got the cheapest prices and put it on hold. Or better yet, purchase it for me and have it delivered tomorrow. Holy cow, a lot to unpack there.
Alex
So I had to check on this one, Peter. It was all of seven years ago that Google launched Duplex, their AI store calling functionality, at I/O. Seven years ago, 2018, the year after "Attention Is All You Need." It's been seven years for this to make it into some fully realized format. But I think this is finally the beginning of AI starting to autonomously index the physical world. If you can have AI call stores autonomously, you can send AI powered robots out into the physical world to index everything that's going on as well.
Peter
I'm curious what the consumer behavior is going to be like. Is it going to just become. Actually, what I'm really interested in is what it's like on the other end, when you're in the store and you're getting all of these calls inbound. And at what point are more than 50% AI calls?
Alex
You have AI answer the AI calls obviously.
Salim
Sure.
Peter
I mean, is it going to be that you have to identify yourself as an AI? Probably.
Alex
That is what Duplex has historically done. It announces itself as an AI assistant.
Dave
Yeah, actually so far it's going to be state by state, but so far the AIs are not announcing themselves. And about half we do a lot of this inside our lab here. So about half the time people are like, am I talking to an AI? And the other half, they have no idea.
Peter
And so do you have to answer it if it asks? If you ask?
Dave
You don't have to. In most states you don't have to. But again, regulatory consideration is moving so slowly, it's just completely ambiguous. But as of right now, you don't have to. But it doesn't hurt to say, yeah, I'm an AI, or even declare it up front. It's not hurting the call performance rates at all. So you might as well just say, hey, I'm an AI, but I'm so much more helpful than the guy you were going to talk to.
Peter
My new business idea then is a little button on your phone. When an AI calls you, you flip it over to your AI. Because when I'm calling a store, I want to speak to a human, you know, the human at the store. What do you think about that product? You know, how good is it? Are people returning it? And that interaction is a pro human-to-human interaction. But I'm not going to have that tolerance with an AI.
Salim
Wait, I want to challenge you on that too.
Peter
All right.
Salim
If you call a store, why do you want to talk to a human? An AI is going to know way more about the inventory, the situation than you've been wanting.
Dave
Yeah, exactly right, Salim. And not just that. The AI can pull up images in real time and show you the product and spin it around and stuff. So it's nothing like talking to a human in a store.
Peter
Okay, you win.
Dave
Far, far more engaging. I'll tell you what else. The voice run guys here in the lab are doing OpenTable, doing restaurant bookings and stuff. And you wouldn't believe the fraction of restaurant bookings that are from a non-English speaking person, or going the other way. If you're traveling internationally, it's a lifesaver to be able to talk in a different language and do your full booking, and then the AI just translates it.
Peter
Fascinating.
Alex
I do think this is how we get to APIs for everything. There's now the need for an escape valve for surfaces for business interactions that don't support APIs. With an AI that can make voice calls and have arbitrary, unstructured interaction, we get APIs for everything we do.
Peter
Okay, one more article on the Gemini front. Gemini 3 benchmarks. We should probably skip this. I don't think anybody's interested in it, but okay, you can't.
Salim
You're teasing. Good teasing.
Dave
Alex has just sent a drone to your house. There watch.
Peter
Watch the roof to take me out.
Alex
I'm going to send my Duplex AI to give you a phone call.
Peter
All right, Alex, clue us in here. Gemini 3 benchmarks. How good are they? And at the end of the day, what do they really mean? I mean, just to represent people listening and watching our Moonshots program: okay, Alex, I hear you talking about benchmarks every time, right? We're going to talk about some more benchmarks in a little bit, but what does it really mean? What does it mean to me?
Alex
So, sure. I guess there's the headline: the numbers are going up and to the right. So who cares? Who cares is we have a way of measuring progress in our civilization. And this is a precious moment when, with raw numbers, day by day at this point, we can track progress towards solving some of the hardest problems that our civilization faces. Humanity's Last Exam, say what you like about it, some like it, some like it less, but it's an attempt, as are all of these benchmarks, to encapsulate in a measurable, quantitative way progress by AI towards solving hard problems. In Humanity's Last Exam's case, it's an attempt to measure the ability of AI to solve PhD-level problems. In the case of ARC-AGI-2, it's an attempt to model human-level ability to visually reason. The "so what" is these benchmarks are all saturating, which means that AI at this point has the ability to perform PhD-level research. When we think about the "so what" for the so-called average person, it's going to be that AI is imminently, I think, well positioned, now that these benchmarks are saturating, to start solving the hardest problems on earth in math, science, engineering, medicine. That's the "so what."
Peter
We spoke about that last episode, last pod, with Sam Altman speaking about science breakthroughs coming on GPT-6. That's his expectation. Here, the numbers are impressive. If we're looking at GPT-5.1, Gemini 3 is basically doubling it on the ARC-AGI-2 benchmark. It is effectively doubling Claude 4.5 on Humanity's Last Exam. I mean, these are not incremental moves, they're significant step ups.
Alex
And critically, it's not benchmaxing that we're seeing. There are some labs that have been accused of just optimizing their AIs to do well at one or two of the benchmarks, and then when you ask them something out of distribution, they fall over. That doesn't appear to be the case here. It feels like the team behind Gemini 3 really did a professional job, not over optimizing towards narrow, spiky intelligence on any of these benchmarks to do well in a press release. This feels like a well rounded generalist AI model. And given the trajectory towards saturating these benchmarks, I'd be very surprised if by the end of, say, next year we're not seeing hard research problems succumb to AI models like this one.
Peter
Do you remember two podcasts ago, Alex, we had that paper that came out on how to measure AGI? Like defining it in terms of, I don't know if it was 10 or 12, different quadrants. I wonder how Gemini 3 does on that.
Alex
I'm sure we'll know soon enough, but I would expect it to do generically well on the spikes where models historically were doing well. As I recall, one of those dimensions where models historically did poorly was continuous learning with ultra large context. Off the cuff, I wouldn't expect Gemini 3 Pro to do amazingly better on ultra long context, but it does really well on retrieval scores. I don't think it's shown in this slide, but there are other needle in a haystack type benchmarks that attempt to measure how well models are able to retrieve tiny facts buried in their context window. Gemini 3 Pro does amazingly well at retrieval as well. So I think almost everything is going up and to the right at this point.
Salim
Yeah, there was one observation I had, and I wanted to check with you guys what you think of this. When you have coherence at this scale, it implies we have systems level thinking inside these models. Is that accurate?
Alex
Could you say a little bit more, Salim, about what that means?
Salim
Well, essentially, systemic thinking is one of the holy grails of deep reasoning, because you can look at the entire patterns of things shifting. And it feels to me like at this level of AI competency you can get to that kind of systems level thinking. That means you can do world modeling in a really powerful way. You almost shift the whole thing to symbolic reasoning when you can think in those concepts. So don't we get to that level very quickly now?
Alex
I have so many thoughts, but the first thought that immediately jumps out at me is: of course these are world models, and of course they're able to symbolically reason. They're solving math problems and they're writing source code. In the past you've seen some commentators argue that there's some sort of nebulous, neuro symbolic type advancement that's waiting to drop. I think that's utter nonsense. Of course they're able to reason symbolically. The tokens are in some discrete space. And of course they're systems level thinkers. They're able to solve PhD level problems across dozens of disciplines. That requires understanding the world as a system.
Dave
So yes, I agree with that, and it turns into a philosophical debate and nothing great usually comes out of it. But I will say that this is a 7 trillion parameter class model. And last year all the naysayers were saying, well, there's evidence that things will slow down, because last year we were at a trillion parameters, and they were clearly wrong when you went from 1 to 7. We know next year is at least a 10x and up to a 40x step up in raw horsepower. And the naysayers are saying, well, things are going to level off unless we crack through some other level of system two level thinking. But they're clearly not leveling off. And I would challenge the technical audience out there. Looking at these benchmarks, you're almost obligated to think about two things if you're at all inclined. One of them is, where on these benchmarks does it become self improving? Read all of Ray Kurzweil and really have an opinion on that, because that's tied heavily to benchmarks 1, 4, and 6 on this slide. And just have an opinion. I have my opinion, but have an opinion about where you need to be on 1, 4 and 6 in order for this thing to improve its own algorithm. That's a critical point. And then the other one is, where do you need to be on the benchmarks to start proposing cures to diseases and being right? And if you work anywhere in health tech and you have no opinion on that topic, you're doing a disservice that's bordering on, in my opinion, negligent homicide. Because this can save lives if you work on it and if you apply it to whatever you're doing in health tech. You're obligated to get your head out of the sand, look at this podcast, study the numbers and at least have an opinion. And even if that opinion is no, it's not gonna work, that's fine, I'm okay with that. But to say I don't know, or I didn't listen to the pod, that is absolute negligence.
Peter
Can I ask a question to you, Dave and Alex? Yann LeCun comes out saying, we have gone down the LLM rabbit hole and that's the wrong direction. We're optimizing on that. We need to go through a different evolutionary tree to really get to AGI. What are your thoughts?
Dave
All the old people say that and all the young people don't. And that tells you something. Out of the gate, you're sorting yourself into an age bucket just by saying it. There's definitely a philosophical divide in there, but the question I would ask isn't, is there another innovation that we need? It's whether a human will have that innovation or this exact AI scale will have that innovation.
Peter
I would bet on the AI anytime.
Salim
Either way, we have so much to absorb just from where we are now. Forget everything else that may come along later.
Alex
I think there are also many paths to AGI, and I know and respect Yann's work, and I know he favors an approach toward AGI that's more focused on actions in an embedding space rather than on autoregressive models. That may be a perfectly legitimate approach as well, but when I see the scaling laws continue to hold and capabilities continue to go up and to the right without any new paradigms, it makes me think maybe really we can just continue scaling and don't need to worry as much about yet another paradigm shift.
Peter
And let AI do that. All right, let's go on here.
Salim
Wait, insert my normal rant about AGI here and we can move on.
Peter
Okay, so noted and approved.
Ross Holiday Shopper (Ad Voice)
This episode is brought to you by Blitzy, autonomous software development with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise scale code bases with millions of lines of code. Engineers start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre compiles code for each task. Blitzy delivers 80% or more of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Enterprises are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre IDE development tool, pairing it with their coding copilot of choice to bring an AI native SDLC into their org. Ready to 5x your engineering velocity? Visit blitzi.com to schedule a demo and start building with Blitzy today.
Peter
All right, next story. OpenAI introduces GPT 5.1 for developers. So again, this is a benchmark question. First of all, this was announced before Gemini 3 came out, so I am curious, AWG, whether this is still the case. And again, why does this matter?
Alex
Yeah, I think the economics of this, the microeconomics are maybe even more interesting than the technical side. So we're starting to see, and this is somewhat visualized in the chart you're showing, the beginning of inference time compute start to conform to the economic productivity of queries. So you know how like in Google search, for example, if you search for mesothelioma litigation, you're going to see a bunch of very expensive adwords.
Peter
Yes, for sure.
Alex
It's a very economically valuable query.
Peter
For the lawyers.
Alex
For the lawyers, for the lawyers. On the other hand, if you search for an arithmetic query, you'll see none or almost no ads, because it's not that economically valuable. We're starting to see, I think, that same dynamic emerge here, where certain queries require lots of inference time compute. And so what we're seeing at the routing layer with GPT 5.1 is even more compute being allocated to queries, to prompts, that really require a lot of compute. And then for the lighter, easier queries or prompts we're seeing less compute get allocated. And I think this is actually pretty profound. It's not just a matter of moving around the deck chairs in some sort of zero sum game. I think this is actually almost a premonition for what the economics of post superintelligence will look like. One of the things I think the most about is who's going to pay at the end of the day for the trillions of dollars of capex in data center build out. Who's going to pay for it? Is it going to be the consumer? Will the consumers on average be spending hundreds of dollars per month on core subscriptions for AI? Or will it be enterprises that are spending billions of dollars in some cases for enterprise level tasks? And I think what we're starting to see here is that modally, probably, it's going to be the enterprises paying lots of money for the most valuable tasks. In the same way, we're seeing right now in microcosm some of these harder tasks, harder prompts, get allocated a lot more inference time compute at the expense of easier queries.
Salim
I would totally bet on that direction, just because if you're, say, Target, and you can manage merchandising and get 20% extra margin on something, then it's worth the extra compute on the back end, and we'll see a lot of that.
Peter
But there are places where consumers will spend hundreds of dollars a month on their iPhone, on their plan, because it enables them in an extraordinary fashion.
Alex
But remember that the money to be made here is on the margin, from persuading people to switch their behavior from what they otherwise would have done. If they were going to spend the money anyway, that money doesn't go to the AI, it goes to the entire value chain underneath the phone manufacturer.
Dave
All right, well, I can tell you in my experience, you have to operate at the margin at the extreme end of what these are capable of. And I've tried to either save money or to get more speed by dumbing it down by a half step. And it just isn't the same. And so it just feels like everybody wants to be at the forefront. And this is the weirdest product that's ever been launched on humanity in that it's talking to you as it's selling to you. And so you start with a subscription, they give you this incredible experience, and then it tells you, well, you want more of that, you need to upgrade, but it's actually telling you it's talking to you about upgrading. No product, no cable company no iPhone has ever done that before. So it's a salesman baked into its own capabilities. It's kind of creepy, actually. It's very weird.
Peter
All right, let's stay on the OpenAI theme. And this is a fascinating story. It's an important one. OpenAI backed startup aiming to block AI enabled bioweapons. So this is a startup called Red Queen Bio, and they received a $15 million investment from OpenAI, which by the way just sounds really small compared to all the $100 billion and trillion dollar deals being made. But Red Queen is using advanced AI plus lab testing to spot vulnerabilities in biological systems. They're basically saying, hey, we want to stop people from using these AI models to create bioweapons. Super important. Who wants to jump in first?
Alex
I'd love to speak to this one, maybe starting with the literary reference. So for those not tracking, Red Queen in this case is a reference to a scene in Through the Looking-Glass where Alice and the Queen are constantly running just to stay in the same place. So the Red Queen's race in general is used as a metaphor for cases where a lot of effort is required basically to maintain a standstill. And in this case, the other key concept that I think is ultimately quite profound out of what Red Queen Bio has announced, and the reason why they're taking funding, is this: we've just spent quite a bit of time talking about how, as you pour more compute onto these models, the capabilities keep increasing. Inevitably you have to worry about alignment and safety as well. In society, if you're growing a city and you double the population, you're going to approximately want to double the police force or the safety force. Wouldn't it be wonderful if, as the capabilities of AI keep scaling, keep increasing, the safety measures, the alignment and other properties that make them safe for humanity, also benefit from scaling with more compute? So, scaling laws: Red Queen Bio has announced that they've uncovered scaling laws for biological safety measures. I think this is the way we achieve alignment. Just like, again, the scaling law for police forces in the city: a little bit sublinear relative to population. Same idea here, but nonetheless close. As capabilities increase, we want to live in a world where we achieve so called defensive co scaling, where the resources and capabilities of safety measures scale close to proportionally with the resources and capabilities of the underlying models.
Peter
Yeah, let me add some data to that. So today, or at least last year in 2024, the biosecurity and biodefense market was 34 billion, and it's expected to double by a decade from now, 2034, 2035. But here's the quote that really hits me. An extreme bio attack scenario could have a multi trillion dollar global loss. And the notion is, could you create such a bioweapon for 1,000 bucks? It's the asymmetric situation where a small amount of money using complex models could do a lot of damage. And so there's got to be this layer of defense. I mean, it's critical. When I talked to Eric Schmidt, I remember, a couple years ago at FII, you know, the number one scenario that is of greatest concern is bioweapons. Something that can be. You take an existing virus, you change its viral payload, you make it much more infectious, and release it. You know, Salim, you and I have had this conversation that one of the most important things is going to be to set up these biosensing capabilities at train stations, airports, bus stations that are filtering the air and doing rapid sequencing of everything they come across. You know, the majority of the bioweapons that are concerning are airborne, right? So a person coughs or sneezes and it's there. And one thing that is in our favor is that these viruses, these bioweapons, can only move at the speed of an airplane. That's the fastest it can go, right? And it travels. We saw that with the release of COVID. So if you can detect it at an airport, sequence it on the spot, develop an antiviral, and then transmit that at the speed of light, not the speed of 600 nautical air miles per hour, then you have a chance of battling it. And this is what we're talking about.
Dave
This is also exactly why open source AI is dead in America. Meta decided, okay, we're not open sourcing. So now none of the US labs are open sourcing anymore. So the only open source models are coming from China. But if you're a US company, usually a terrorist in a basement in some jurisdiction somewhere in the world isn't the sharpest tool in the shed. And you're counting on them not knowing how to build the weapon. But when you give them genius level AI as a sidekick, suddenly they're empowered to build virtually anything in that basement. And that's the risk. No US company wants to be responsible for that. So they're trying to cut it off at the query level, saying as soon as you ask the AI to help you create a bioweapon, it stops. And so the open source would be a huge leak in that. So the US labs don't do the open source anymore. The Chinese still do. Alex, comment on that.
Peter
How do you deal with that? If it's a model running on my laptop and somehow it contains enough knowledge to do this and I can query my laptop, no one ever knows the query I've made. It's just resident there. How do we deal with that?
Alex
Yeah, I think ultimately it all reduces to co scaling. So if you imagine having a fully self contained facility, hypothetically, in your basement, the ultimate societal protection will be having lots of sensors and, more importantly, having lots of AI screening, super intelligent AI screening, that can spot hidden agents. I have this dictum that I think is super important on so many different levels. In the software engineering world, Linus Torvalds, who created Linux, has this so called Torvalds law, and I'm going to butcher this slightly, that with enough eyeballs all bugs become shallow. And I would propose sort of a generalization to that: with enough superintelligence, all hidden agents become shallow. So to the extent that we have hidden agents in their basement building superweapons, I would expect that with enough superintelligence, defensively co scaled, they become shallow.
Peter
So I've made the comment before that privacy is an illusion, and this is just gonna shatter even that illusion. Because if you want safety, you're gonna want agents listening and watching everything all the time. Salim?
Salim
This is an arms race. Think of what we've seen throughout history. We thought, oh my God, email's gonna crash because of all the scams. And then we had phishing, and we thought we can't solve for that. And we've used AI consistently in that sense, because people forget: the bad actors may use AI, and they will, but the good actors can also use AI, and therefore you just have to be one step ahead. The question is if that gap gets too big. One of the challenges with what you were saying earlier, Peter, is you may not know what to look for in some of these. And that's the danger point.
Peter
Yeah, well you do it in catalog.
Dave
It's an interesting little case study too because you know, if you rewind the clock. Before Gmail took over, Microsoft had Outlook and Hotmail and Google launched Gmail. And the two promises were very different. Microsoft said we will never read your email. And Google said we will read every word of every email that you receive. But it's going to be read by an AI and not by a human. So we won't let the human eyes look at your email. But we're going to do all kinds of things based on the information in your email read by the AI. And people didn't care. And so everybody moved to Gmail. So you have an interesting case study in how this plays out. Just the human behavior. So here I think the equivalent is, hey, I'm talking to AI about my most personal things in the world. And Peter, I think you're right. The AI is going to listen to every single word. And if you're designing a bioterror weapon or a cyber attack, it's going to flag it and escalate it. And if you're talking about your virtual girlfriend or whatever, that's going to be fine. I'm just going to kind of hide that.
Salim
Yeah, I remember talking to the head of one of the major intelligence agencies, and they had a very clever thing. They said, look, when there are known things, like nuclear weapons or whatever, we put eyes on it, we try and watch it. When you have something like this that could be developed in secret, they've been actively opening up these communities and actually funding the biohacking movements, because then you can see things earlier. But if you can do open source bioweapon development in a lab, in a bunker, that really causes a huge issue. And we're going to have to rethink the approach. Something along the lines of what Alex said.
Dave
Remember when you gave the Ayn Rand Award to Mike Saylor? I don't know if you did the keynote, Mike did this incredible speech, but those people are probably vomiting right now based on how this is evolving.
Peter
Well, listen, the bioweapon, I mean, you're not going to create a novel virus that has zero history involved. And there are extensive registries of every virus that's ever been mapped. And so at an airport, if you sequence something and it's not on that registry, you can then look at it, and LLMs or the future bio LMs will be able to look at it and say, okay, this is an infectious agent. This is something that's able to be airborne or water soluble. So when you look at the proteins, you can tell what kind of a virus or protein it's generating. So you're going to be able to learn instantly when you sequence it. And rapid sequencing is here, but we're going to need this. And I think giving up privacy to a large degree, which you've talked about, Salim. Right. When you're in an airport, you basically have given up your privacy right there.
Salim
Yeah, you know you're being surveilled and you know your rights can be taken away at any time. And the one framing of our era is that we're living essentially in a global airport. But I think that continues to some extent. I don't see a way of coming back from that.
Peter
Well, good luck.
Salim
Brad Templeton and the EFF folks, they say there is a way of doing it. You don't have to compromise privacy for security. There's lots of mechanisms for solving this in other ways. That's their complaint, that the governments kind of go after the surveillance side just because, oh, this is great, we can surveil people under the excuse of security. But many times you don't have to.
Alex
Yeah, maybe, if I might close the discussion on this. I want to make sure we don't over index on safety concerns or so called safetyism. I think these are very important concerns. But I also think that AI can be scaled to combat the concerns. Just like one might naively have expected, prior to the development of modern cities, that crime would be overwhelming and that humanity would not be able to support itself in urban environments at scale. Turns out that we are able to. I would also, though we're not doing a book corner this episode, encourage everyone to read Vernor Vinge's Rainbows End, which does a glorious job of depicting what the future of AI enabled biosafety looks like.
Peter
Amazing. Well, I'm, you know, the eternal optimist here, and I'm absolutely clear we're going to be able to overcome this. Let's move on to one more benchmark here. This is: xAI releases Grok 4.1, ranks number one in major leaderboards for reasoning and writing. Back to our resident leaderboard expert.
Alex
My comment on this one is short. This lead in the text arena benchmark lasted approximately one week and was over. So my short comment here is, the race for the frontier is so intense that even if a frontier lab is perhaps even benchmaxing towards a single benchmark, generalist models seem to be able to push the frontier at this point on a weekly basis. I can only imagine, as timelines progress, what this is going to look like when these benchmarks are being toppled on a daily basis.
Peter
Well, I'm sure Grok 4.5 and 5 are around the corner. Let's move on to Cursor. So Cursor triples its valuation in just a few months, from June through November, going from roughly 10 billion to roughly $30 billion in six months' time. Raised 2.3 billion. There's Michael, the CEO of Cursor. Who wants to jump in here? I mean, this is a hot race between a whole slew of different coding tools out there.
Salim
It seems to be in Dave's wheelhouse.
Peter
Dave? Yeah. What do you think?
Dave
I'll tell you, I think this team is phenomenal, and most of the people around here think that they'll rise to the occasion and succeed. But I also think that Antigravity looks exactly like Cursor. I mean, like, I actually have both open on my laptop side by side. And other than a little cosmetic here and there, you don't even know which one you're in. And so then you look under the covers and it's like, well, I can access all the models through Cursor and I can only access Gemini 3 through Antigravity. So there's a difference right there. But then the bet at Cursor is that the Anthropic and the other models will be worth having, and Gemini 3 doesn't just run away with it anyway. So it's an interesting horse race right now, and I'm not going to make any prediction on it, because you can't make a prediction on it, because their core positioning is incredibly vulnerable. But the team is brilliant.
Peter
Let's back up.
Dave
And they're well capitalized.
Peter
Back up. For those who don't know what Cursor is or what it does.
Dave
Fair.
Peter
Let's do that basic 101 right now. Dave or Alex. Yeah.
Dave
So Cursor, I think everyone around here that I know uses it every day. It's the best or has been the best coding assistant that uses AI. It's fully agentic now, so you can just type in a prompt. You can talk to it now, too, and it'll just build things for you. And under the covers, though they don't own their own foundation model, it's going out to either OpenAI or Grok or it has all of them in there. Anthropic is what I usually use. Claude 4.5. And it organizes everything. It cranks out the product, it configures your laptop for you. It just makes coding trivially simple. Anyone can do it. And it's pretty universally used. It was early to market.
Peter
When I think about the value in this world, where does value aggregate? My list is, it's data scaffolding, user experience and integration and customization, and then the models themselves. So where would you put Cursor in those categories?
Dave
The scaffolding. Other than the models themselves, it's all of the above.
Peter
Compared to Replit, we've talked about Replit a bunch, and Lovable, how do they compare to Cursor?
Dave
So Replit and Lovable are much more for your mom and pop who want to build, like, a video game quickly, or an invite to a birthday party with moving graphics or whatever. You can build something while you're flying your plane, Peter, like you did. Super, super easy to onboard. Cursor is more for hardcore engineers that are moving to AI and trying to get 10x more performance out of their engineering.
Alex
I would just note for what it's worth, all of these, or almost all of these integrated development environment companies, including Cursor, are rolling out their own first party models. It's almost inevitable that they want to climb down the stack to own more of their software supply chain. And I think the success that we're seeing from Cursor, which is of course very exciting, is a reflection that software engineering is probably the first high productivity labor category that's being automated by AI. Won't be the last, but it's the first big one that we're seeing.
Peter
All right, keep your eyes open.
Salim
Surely AI driven software development is now the default, right? I mean, you couldn't do it without it now already in a few months.
Alex
To the point where, I mean, this is crazy, but I see companies that are almost treating potential software engineering hires by vintage. Did they get their degree and their experience prior to agentic code or not?
Peter
Are they spoiled? Have they been ruined?
Alex
Basically, yes. Did they get their skills? Did they learn? Do they have lots of experience prior to the atrophying that comes, perhaps, with agentic coding?
Peter
All right, I'm going to move us forward to another incredible article. This is a new startup funded by Jeff Bezos called Prometheus. Jeff put in $6.2 billion. And by the way, can I just call out: the ability to start a company with $6 billion on your balance sheet has got to be just frightening for a number of startups. And it's got to be incredibly accelerating. We've never seen this kind of, you know, starting with billions, multiple billions of dollars on day zero. So what is Project Prometheus? It's AI enabled engineering and manufacturing. It's basically learning real world experience so that it can manufacture efficiently and focus on physical testing and simulations. And I love this other bullet point here. Prometheus has hired nearly 100 researchers from OpenAI, Google, Meta and other labs. They're just feasting on each other. They're stealing each other's, you know, well trained.
Salim
If the going rate is a billion dollars per researcher, then this is really underfunded.
Peter
They've got six researchers on this. But the two things I found fascinating off the top, and we'll talk about the meat of what Project Prometheus is in a second, are starting with that much money and that they're basically stealing from each other. Dave, what do you think?
Dave
Well, I mean, it's funny. I've had probably 12 meetings with different MIT teams in the last week, 30, 40, 50 at a time. And about half of them are computer science, the other half are not. The half that are not are saying, how do I get involved? What do I do? What's my AI role? When MicroStrategy started, Mike Saylor was an aero-astro. All the rest of the guys were computer science. The company took off under Mike's leadership. It didn't matter what he studied. AI is like that. There's nothing in the computer science curriculum that teaches you much of anything anyway.
Peter
No school is learning how to learn.
Dave
Don't be intimidated. Yeah, and so what you're pointing out here on this slide, Peter, is okay, they stole another hundred people, okay? Clearly the industry wants 100,000 more people to come in. Why are you letting this guy get a billion dollar signing bonus? Why don't you get into the market, learn this stuff and be there for 100 million? I mean, just get in the game. But, but it's funny because people get intimidated away from it because they feel like it's all geniuses and I'm going to get crushed. It's just not true. Just get in the hunt, get into the game. This is the thing happening in the world now. And there's usually only one thing driving all change in the world. This is that thing. So just get into the middle of it and then Jeff will. The other thing I'll point out in this is that there's a tendency to be intimidated by Elon Musk spent six or seven billion dollars building a massive data center in record time. How am I going to compete with that? But the foundation models that will do parts creation or robotics simulation or whatever are different enough from a large language model that you can build a great foundation model company in parallel with OpenAI and Grok and Meta and Gemini. It's okay. You shouldn't be intimidated by that either. And that's, I think, what Jeff is saying here. Just one final point. Jeff bought all the robotics companies, put them into warehouses and just ran away with warehouse automation, which created a whole litany of new startups. Working for Walmart and Target and everyone else. Like Symbotic, where Daniela Rus is on the board, does the robots now for Walmart's warehouses here. Jeff is saying, okay, Amazon is big enough that I'm actually going to be able to build a multibillion dollar company within our own universe, our own channel. But that creates opportunity for somebody to be outside of the Bezos universe, doing it for everybody else. And so all of our technical design is wide open.
Peter
We're going to have Jeff Wilke on stage at the Abundance Summit this year. Jeff was the CEO of Amazon Worldwide. There were two divisions, one was AWS and one was everything else. And Jeff Wilke ran everything else. And he's actually super excited about this because this is what he's doing. He's got a company called Re:Build Manufacturing which is working in this area too. So, Alex, let's get into the nitty gritty here. Prometheus is building physical AI. It's world models again, like Fei-Fei Li and a little bit like Genie 3. These are world models understanding the laws of physics and chemistry and engineering. So you can actually do real optimization. What are your thoughts here?
Alex
Yeah, I think we're starting to see the pivot of the capital markets from funding superintelligence to funding that which comes after superintelligence, which is, as I've argued in the past, solving math, science, engineering and medicine. And I think it's a 10x to 100x larger market opportunity, a larger addressable market, solving basically everything else after solving superintelligence than solving superintelligence itself. 6.2 billion is a drop in the bucket. I would expect it's going to cost many, many trillions of dollars in funding to solve all outstanding problems in math, science, engineering and medicine. There's been relatively thin reporting on what Prometheus, or what Project Prometheus is particularly focusing on. I have taken note it seems to be absorbing a lot of old biology friends of mine. So it's possible maybe it ends up focusing a little bit more on biology, a little bit less on manufacturing. But I think this is where the action is after superintelligence.
Peter
Yeah. I have three points I want to make here. One, this is kind of a shift from chatbots to industrial agents. Right. So AI for the office is what we've had. This is AI for the factory floor, where there are physical consequences, where the systems are able to operate the factories because they understand the physical constraints and situations and logistics. The second thing is, I met Jeff in college. I was the chairman of SEDS Worldwide at one point. And Jeff was the president of SEDS at Princeton University when I was at MIT. And so space has always been his passion. Congrats to Blue Origin for its recent launch and landing. We talked about that last time. But this kind of a physical AI system is exactly what you need to operate heavy industry in space, to build factories in orbit, to build factories on the moon, and to have them fully autonomous and capable. And then the final thing I would say is that this is going to change. And we've seen companies like Lila and other companies out there that are going to go from invention that happened by serendipitous human creation to invention coming from a computational one. That's when it gets super interesting. And that's what you've been talking about, my friend Alex.
Alex
That's right.
Salim
What hit me with this is this felt to me like he's creating a backbone AI for everything in his world. Amazon, space, logistics, et cetera. This will service all of those.
Peter
And Elon will do the same, of course.
Alex
Like electricity, it's going to run through everything.
Peter
Yeah, yeah.
Dave
Well, also, you know, the foundation model that I built early in my career was five years from the day I started writing the code until it was done. I can recreate it now in about two months, which I just did. And so if you look forward a year, that'll come down another, you know, 5, 10x. So you can use AI to build the next AI, which is essentially what I just did. The same applies in mechanical design. So if you said, wow, building an entire AI platform that designs rockets or designs robots is really hard. Well, it would have been. But now you can use the current AI to build that AI, and it cuts the time down tremendously. So if you just look forward a year to where the existing AIs will be, that time is actually not intimidating at all. And so it's a good reason to get into the game and build these parallel AIs that work on very specific problems, whether it's biotech, whether it's mechanical design, whether it's futures trading, whatever it is, build it from the old AI to the new AI.
Peter
So last time I asked our subscribers, and by the way, we're almost at 400,000 subscribers, so if you haven't subscribed yet, push us over the top, we'd appreciate it. Our march is towards a million. Not that it really matters, other than it'll make my kids really proud of me. So that's my goal. So I asked our subscribers to post questions.
Salim
You're on your way to Mr. Beast.
Peter
Yeah, well, hey, yeah. In about a thousand years. I asked our subscribers to post questions and I took all the comments, put them into ChatGPT and asked it to summarize the most important questions. And there was a critical question that was asked and I just want to take a second and read it because I want to have an AMA about it. It said, what concrete milestones should people expect to see that prove abundance is coming? In other words, lower cost, new industries, accessible AI tools, and how do we ensure these benefits reach everyone rather than concentrating wealth among a small AI augmented elite? So I want to play a video that was posted on X today and then we're going to talk about this question. But AI and humanoid robots will actually eliminate poverty.
Dave
And Tesla won't be the only one that makes them.
Peter
I think Tesla will pioneer this, but there will be many other companies that make humanoid robots. But there is only basically one way to make everyone wealthy and that is AI and robotics. All right, so that's Elon's thesis. I posted the question here again and it's a real concern, are we going to have runaway wealth concentration? And honestly, if you want me to believe in this future of abundance you keep talking about, guys, what are the concrete milestones and how do we ensure these benefits reach everyone? How do I know it's actually coming? Yeah, let's jump into this.
Salim
Can I throw out a couple of points?
Peter
Yeah.
Salim
There's an important framing here where we. Let's not talk about the wealth gap, right? The reason is that the richest people in the world are always going to keep getting richer and the poorest people are going to have nothing. The issue is more can you lift the bottom? If you want cares, yeah, right. You make this point all the time.
Alex
All the time.
Peter
My next book, right? I mean, a thousand years ago, the king and the queen on the hilltop lived below poverty today, by the way, and there was thousands of serfs that supported them. And what we've done, they died of.
Salim
A tooth infection at age 22 or.
Peter
They were bled by leeches, you know, as the king and the queen up there. And what we've done is, yes, we're heading towards a world where there are trillionaires living on Mars, but if every man, woman and child got access to all the food, water, energy, healthcare, education they could possibly want, we've lifted the bottom of humanity to a point where mothers can believe their children have access to everything they need. That's the world I want to live in. That's the world I want to create.
Salim
So let me speak just to that for a second, please. Right. We forget because we see all this, we see people getting richer, et cetera, et cetera. But we have to remember the unbelievable benefits accruing to every level. I'll give you a concrete example. When the tsunami hit Indonesia in 2004, all the ship to shore communications were wiped out. And so the government gave cell phones to all the fishermen, saying, hey, if you're out fishing and you see another tsunami, text it in, et cetera, et cetera. And they found to their surprise that their incomes had increased by 30% over the next two months. So they looked into it and all they were doing was just texting in to see what the market price was of the fish. Should they stay fishing, should they come.
Peter
In and sell or which port they go to, who's paying more?
Salim
Yeah. So now just that little hint of what Alex would call the inner loop allows you to increase income pretty radically by having the democratized access and demonetized access to cell phones, smartphones, and now AI. And this will change the game completely for everything, everywhere. I'll touch two areas. One is education. You can now sit a child down with a smartphone and say, create a lesson plan for grade seven algebra. And they're going to learn 10 times faster than all the kids stuck in elementary schools in the west that by law have to go to these things. Right. The second is healthcare, where any single medical condition can now be diagnosed instantly. And when you get something early, the cost of treating it drops by 100x. So those two are very concrete areas where AI will make a massive difference in two areas that were traditionally inaccessible and hard to get and expensive.
Peter
Let me read the numbers here. The US average expenditures for a family, this is 2023, was $77,000. So the number one cost was housing. 33% goes to housing, 17% to transport, 13% to food, 12% to insurance and pensions, 8% to health, 5% to entertainment, and about 2.5% to education. So let's knock these down. Housing, right? So number one, now you can live outside of the city where it's cheaper and be able to telecommute in and reduce your housing cost. There is a future, it's not here yet, where we're 3D printing houses, reducing the cost. And what we saw on stage a couple years ago, if you remember Salim, was 3D printing houses being per square meter, the cheapest, but also the most beautiful and most luxurious because you could get the greatest designers to create a standardized print file for people to use. Transportation, 17%. Well, guess what? An autonomous electric cyber cab is four to five times cheaper than owning a car. It's going to be cheaper than an UberX, cheaper than a bus. So we're going to solve that. Food, we've got to solve food better. We need basically vertical farms and stem cell grown meats.
Salim
Let me give you the stat on vertical farms. We've been doing horizontal farming since the beginning of time. Vertical farming is just crossing over now into economic viability. You can drip feed water to the plants. You know what nutrients the plants need because the sensors know it. You get about seven times the yield of horizontal farming by doing things vertically because you have the right frequency of light hitting it, you save 99% of fresh water. And by the way, we use 70% of our fresh water globally for agriculture. So just that. The best calculation we've seen is if you took 35 skyscrapers in Manhattan and turned them into vertical farms, that would feed the entire city sustainably. Nice. Just think about that from a logistics, food security, pesticides, fertilizer standpoint. There are massive changes coming down the pike and this is before we apply AI to the whole mix. So the radical changes coming are going to be so huge that the cost of everything should drop to near zero. The amount of energy you need to feed one person is the amount of sunlight hitting one square meter and that energy would feed somebody for a year. So all we have to do is get a better loop of figuring out how to convert that energy into consumable foods and we've got a long way to go.
Peter
Yeah, you know, healthcare, 8% of our cost is healthcare. You said it already. We know that an AI physician diagnostician is significantly better than any, even the best physicians, and an autonomous robot eventually will be the best surgeon and the cost of that will be capex and electricity. I mean it's hard for people to believe this stuff now because it's on the bleeding edge, literally. But we're going to get there. Entertainment, 5% of cost. Well, guess what, YouTube, what else can you want? Education. You mentioned before, AI, YouTube, all these things. So we're demonetizing and democratizing this stuff. It's just hard for people to realize it. I think the challenge is we compare ourselves to the Kardashians, right? We compare ourselves to people that we see on TV and on the Internet all the time versus comparing ourselves to what it was like for our parents or grandparents.
Dave
Yeah, I think that last point is the key one because we've had dirt cheap food for a long time. But everybody still wants a $14 Starbucks latte, which you don't need to pay for. But there it is. Why do I feel that need? The metric I'd be tracking is actually depression rates because I think AI properly deployed can hit that much more quickly than it can hit robotic automation that creates new homes for everybody that are 10 times larger.
Salim
That's a great point.
Dave
I'd be looking at that one as an early indicator that we're on the right path. And it's not a no brainer. You got to really think it through. Because you mentioned rent is at the top. 33% of household income gets spent on housing on average. But when you look below the poverty line, I think spend on drugs, alcohol and gambling, pain relief is 3. The opioid addiction alone is a trillion dollar error, I guess, and it's about five times more collectively than rent.
Peter
Go ahead.
Dave
Well, no. So I'd be attacking that. If you want to come bottom up and say, look, we want to create universal happiness with AI, we've never had a tool that could attack it before. Right. You can attack manufacturing, automation, you can make food cheaper, you can have harvesters that mow down half the Midwest to create wheat, but all that does is create more of stuff that's already abundant. Now, AI is the trigger for a massively more thoughtful way to create universal happiness. And I would start with depression rates and work up from the bottom because you can do that very, very quickly, much more quickly than you can. We've already looked at the robotics. We know that's going to build a mansion for everybody in the world. But we're not going to have the robots for about 15 years because we have to scale them up on this exponential curve. So some people will have them next year, you'll have yours this year, but we won't have enough of them to attack the global problem for about 15 years because of just the manufacturing ramp up rate.
Peter
Alex, this is all about benchmarks. We've talked about this. You and I have been working on a paper on this subject. Can you speak to that?
Alex
I think it's so simple. I think what's upstream of all of these other milestones is the dollar cost per unit of intelligence. And as we've discussed previously, right now that's hyper deflating by something like 40x year over year. So to keep the party going and to make sure that all of these downstream considerations, cost of living, healthcare, housing, et cetera, that these all hyper deflate ultimately alongside cost of intelligence, I think it's largely a regulatory and social concern. We've spoken previously about, for example, the difficulties of getting Waymos in Boston. That's a regulatory consideration. The cost of intelligence needed to autonomously drive cars around, that's making excellent progress. But ultimately, in order to, say, provide essentially free, autonomous on demand transit to everyone, there's a regulatory bottleneck. And in order to avoid making or in order to ensure that the benefits of intelligence too cheap to meter become evenly distributed, I think it's going to require some revision of social coherence and the social safety net to make everyone comfortable with the downstream consequences of intelligence too cheap to meter, including healthcare and housing and energy and utilities too cheap to meter.
Salim
Yeah, I did a calculation. Okay, so if you wanted to have a reasonable life, you could do it for $20 a day. In Bali, housing costs about $10 a day and your meals are literally about $2 a day and then a bit of extra. Okay, so for about 20 bucks a day, you could do it. If you had half an Ethereum, which is about $2,000, you can put it into DeFi trading pools and earn about a percent a day, which is about $20. Okay, so half an Ethereum of capital allows you to live crudely, but allows you to live in a very lovely spot in the world for very low cost. And think about just that feedback loop on that. Because as you double that, if you triple that, if you 10x, all of a sudden you get into a really great place. You can survive today on a very small.
Peter
My feet are in the sand. My feet are in the sand already. I'm ready. And of course, Salim, that Ethereum comment was not investment advice, just to let everybody know. But it is interesting that Harvard has doubled down on Bitcoin. And now that we're in the Bitcoin doldrums, it's nice to see the institutions. I mean, I remember when we went from like, you know, wacky individuals buying crypto to now institutions and financial institutions and sovereign funds and so forth, but countries. Also not investment advice. All right, what an amazing episode. And we've actually just gone through half of our stories, but I think to make this consumable, because the feedback we've gotten from folks is please try and keep the episodes under an hour and a half. So we're listening to the comments. We're trying hard, so we'll have to spin up another conversation on everything going on in data centers and energy and space and so much. I mean, it's hard during the singularity to keep up with everything.
Salim
Just the mind blowing stuff from Gemini 3 was worth covering properly.
Dave
Just a reminder, last summer, not that long ago, Polymarket said everybody in the top five had an equal shot at being the best AI model by the end of the year. Now it's 91% Google, but by next summer that's down to 60%. So it's kind of like 50, 50 that someone else will take a leapfrogging baby.
Peter
Leapfrogging.
Dave
Well, that's what we should hope for because Alex said the key point, as usual: 40x is what you should expect next year. People really struggle with 40x in anything. So if the cost per intelligence comes down by 40x or the raw intelligence goes up by 40x next year, you should expect that. Very hard to visualize all that that means. So we do everything we can on the podcast to try and make that tangible for people, but really try and digest that coming out of this Gemini 3 incredible breakthrough.
Peter
Yeah, and just hats off to Josh Woodward, to Sundar Pichai, to Demis Hassabis for an extraordinary job on Gemini 3. Just so proud of what they've been able to create. And of course a lot more, a lot more coming.
Salim
And I have one announcement.
Peter
Yeah, please.
Salim
Sometime in December we're gonna do a meaning of life session online. So I've had enough clamoring from my community and other people, and Peter, people that go to Abundance, that people want to do it. So stay tuned, we'll get more details next time around.
Peter
Well, we'll do it also at the Abundance Summit on the Wednesday night. This is Salim, like waxing poetically and philosophically for about five hours straight.
Salim
It's like a late night French salon type discussion. Alcohol or equivalent mandatory on the medicine philosophy. And what does it mean to be alive in today's.
Peter
Starts at 10pm. What time does it end? Dawn?
Salim
It depends on the audience. But the crazy, the crazy ones we've gone to till dawn.
Peter
Oh my God.
Salim
Because we never get a structured conversation on the meaning of life. We never get the. So let's talk about that conversation.
Peter
Well, we'll do it. You'll do it and I'll join you for at least until my bedtime at 9 o' clock and I'm exiting the building.
Salim
But I'm gonna do it online in about a month.
Peter
So we'll do it earlier. And last time we talked about the potential for a Moonshot gathering, we've had 500 of you email us. So if we get to 1000, if you're interested in a Moonshot gathering next fall. You can send an email to moonshots@diamandis.com and let us know you're interested in having these conversations and gathering with other Moonshot listeners. And once again, we put our call out for outro music. And this is a piece by John Novotny, and it's called Moonshot's Metal Version. But here's the key. You need to see this. This is not just music. This is a fun video. Salim, you look so sexy. Dave, and I love your ponytail. Alex, AWG's got a ponytail in this and he's rocking it. All right, on our outro. Let's go ahead and watch and listen to this. This is heavy metal moonshot music.
Salim
Oh, my God. I haven't seen this.
Peter
Between neon veins.
Salim
Oh, my God, no.
Dave
Oh, that's a good one. Very gentle.
Salim
Oh, not to miss. Oh, my God. Frightening. This is amazing.
Dave
Cool.
Salim
Bottling the lightning.
Dave
I gotta get the ponytail on there. Peter and Salim, your look in that video was really good. You should just do that.
Peter
I love Salim with the sunglass move. Dave, you on the guitar and AWG, you on the keyboards. And the ponytail was you, buddy. You gotta grow that ponytail, apparently.
Salim
So.
Dave
We got some good listeners.
Salim
Thank you, John, for that. That was amazing.
Peter
Yes. DB2, AWG, and Mr. Exo, have a fantastic week. I love doing this, and thank you to all of our listeners.
Salim
Great episode. All right, take care, Peter.
Peter
Take care, guys.
Date: November 20, 2025
Host: Peter H. Diamandis
Panelists: Salim Ismail, Dave Blundin, Alexander Wissner-Gross (AWG)
This episode dives deep into the release of Google's Gemini 3, exploring what sets it apart beyond improved benchmarks, and what broader impacts it signals about the trajectory of AI and technology at large. Peter Diamandis leads a roundtable with noted technologists and entrepreneurs to unpack the practical, economic, and societal implications of this latest leap. They also explore benchmarks, democratizing AI benefits, emergent risks, the coming revolution in automation, and industry shifts.
| Timestamp | Segment | Speakers |
|:-----------:|--------------------------------------------------------|--------------------------------|
| 00:00–02:15 | Gemini 3 initial impact and the meaning of AI progress | Alex, Dave, Peter, Salim |
| 04:30–06:32 | Josh Woodward (Google) demo: Gemini 3 features | Josh Woodward (Google) |
| 09:23–11:22 | Google's integration power and "big model smell" | Alex, Peter |
| 14:32–16:39 | VendingBench: AIs as economic actors, benchmarks | Alex, Peter, Dave |
| 19:33–20:05 | "Zero employee companies" and impact on society | Salim, Peter |
| 24:12–25:51 | Gemini Live: natural voice and voice AI competition | Peter, Salim, Dave |
| 29:24–33:09 | Gemini 3 for “shopping agents” and world indexing | Alex, Dave, Salim, Peter |
| 32:51–36:18 | "Up and to the right": what benchmarks mean | Alex, Peter, Salim, Dave |
| 46:14–56:50 | Biosecurity, AI, open source risks, and surveillance | Peter, Alex, Dave, Salim |
| 73:33–79:58 | Concrete milestones toward "abundance for all" | Salim, Peter, Alex, Dave |
| 83:21–84:14 | Deflationary cost of intelligence & social implications | Alex, Salim |
| 85:15–86:10 | Google as current AI leader: future leapfrogging | Salim, Dave, Peter |
For listeners:
This episode offers both high-level perspectives and tangible examples for founders, technologists, and anyone interested in how AI’s “moonshots” are swiftly becoming today's reality. The conversation is at turns technical, philosophical, and practical, with concrete predictions and an eye toward both the astonishing potential and the perils ahead.