Jason Calacanis
I heard Gemini 3 is out. I heard Grok 4.1 is out. Everything is happening so quickly in our industry.
Alex Wilhelm
Look at what xAI did. Grok 4.1 is really good across a number of metrics. Compared to Grok 4 Fast, it's a dramatic improvement. When 4.1 was released onto the charts, it went straight to the top. And I think it just shows that there's several different AI labs here in the US that are able to successively create state of the art models. And I think it also pushes back, Jason, against some of the doomerism that's been kind of building in the last three months.
Jason Calacanis
One of the great things is I guess that now that we have these leaderboards, it's motivated all these companies and gotten these teams super excited about claiming the top spot. That's where I'm just getting a little concerned of, you know, like, are we actually making progress at solving the real world problems people have, or are we getting good at acing the SATs? And that is a question for me.
Alex Wilhelm
This Week in Startups is brought to you by Zite. Zite is the fastest way to build business software with AI. Build apps, forms, websites and portals that connect to the tools you already use. Go to zite.com/twist and get 50% off your first project. Every. Running a startup is hard enough. Every takes care of incorporation, banking, payroll, benefits, accounting, taxes, and more so that you focus on building, not back office admin. Visit every.io. And Goldbelly. Goldbelly ships America's most delicious iconic foods nationwide. Get 20% off your first order by using the promo code TWIST at checkout.
Jason Calacanis
All right, everybody, welcome back to This Week in Startups. There's tons going on in the news. I heard Gemini 3 is out. I heard Grok 4.1 is out. Everything is happening so quickly in our industry. I'm having a hard time keeping up.
Alex Wilhelm
No, it's been really, really busy. I want to start, though, Jason, with the Cloudflare outage. I don't know if you were awake or asleep. I know you've been going through some jet lag. The Internet broke this morning. It actually made getting the docket ready kind of hard. It's funny when a news story really impacts us, because services like ChatGPT and X and other major planks, you might say, of the podcast economy broke, and for several hours Cloudflare had this really embarrassing downtime. The good news is that now that we're here, it has been resolved. They've come out and apologized for it. If you're curious, everyone, what drove this, they said that, quote, "a latent bug in a service underpinning our bot mitigation capability started to crash after a routine configuration change we made." And Jason, you'll know this: "That cascaded into a broad degradation of our network and other services." Sure, whatever that means. It was bad, though.
Jason Calacanis
Yeah. Cloudflare is this great CDN, a content delivery network. If you have a website and you get these denial-of-service attacks, DoS or DDoS, where people send tons of pings to your server to try to slow things down and cause a disruption in service, they figured out ways to just drop those attacks when they come in. And because it's a network, you know, if some hacking group has 10,000 servers, or maybe even individual machines on the Internet that they've hijacked, to send in these denial-of-service attacks... just to think about it simply, they'll hijack 10,000 Windows machines, send a bunch of attacks to a website, and then all of a sudden you can't read the Drudge Report or the New York Times or whoever their target is. Cloudflare knows those IP addresses, they know what it looks like when those kinds of attacks happen, they know the signature, and they just drop those requests. It's pretty sophisticated, even if the idea is pretty basic. But Cloudflare has become kind of a standard for this kind of CDN. I remember when we were running Weblogs, Inc. back in the day, and this is 20-year-old knowledge, we started using these different CDNs, different caching servers. We put up our own caching servers. We were handling these issues ourselves. Now everybody abstracts it to providers. What's the value of being able to abstract it to a provider like this? Well, you can focus on other things in your business. What's the downside? Everybody's got their eggs in one basket, so whether it's Amazon Web Services or Google or Azure or Cloudflare, everybody gets affected at the same time.
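To make the "know the IP addresses, know the signature, drop the requests" idea concrete, here is a minimal Python sketch. It is not how Cloudflare actually works; the class name `RequestFilter`, the thresholds, and the example IP addresses are all assumptions chosen for illustration. It combines a blocklist of known-bad addresses with a simple sliding-window request count standing in for an attack signature.

```python
import time
from collections import defaultdict, deque


class RequestFilter:
    """Toy filter (hypothetical): drop blocklisted IPs and IPs that flood us."""

    def __init__(self, blocklist, max_requests, window_seconds):
        self.blocklist = set(blocklist)        # addresses already known to be bad
        self.max_requests = max_requests       # allowed requests per window
        self.window_seconds = window_seconds   # length of the sliding window
        self.recent = defaultdict(deque)       # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        """Return True to serve the request, False to drop it."""
        now = time.monotonic() if now is None else now
        if ip in self.blocklist:
            return False
        timestamps = self.recent[ip]
        # Forget requests that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            # Flood pattern: remember this address and drop the request.
            self.blocklist.add(ip)
            return False
        timestamps.append(now)
        return True


# Example with made-up documentation-range addresses.
request_filter = RequestFilter(
    blocklist={"203.0.113.7"}, max_requests=3, window_seconds=1.0
)
print(request_filter.allow("203.0.113.7", now=0.0))   # known-bad address
print(request_filter.allow("198.51.100.1", now=0.0))  # ordinary visitor
```

A real mitigation network shares this kind of knowledge across many sites at once, which is the advantage Jason describes, and also why one bad change can affect everyone at the same time.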
Alex Wilhelm
That's kind of what I want to talk about, because normally this is a little bit outside of our wheelhouse, Jason, but there was a Cloudflare outage also in June. There was an AWS outage in October, there was a Google Cloud outage in June, and there was an Azure outage in October. So you're a startup founder and you are running a software business, as most of them are. You are using cloud services, as most of them are. Communication strategies, downtime mitigation: I'm just curious what your advice is to founders as they go through this apparently recurring issue.
Jason Calacanis
It's a recurring and inconsequential issue at this point. Like, the downtime happens for an hour or two, and frankly we're all permanently online, so it's actually good. Everybody gets a little break. I would be concerned if your service is down for more than a couple of hours. If it gets into, okay, it was down this morning, but it's still down this afternoon, that's when people start to go, wait a second, maybe I should look at another option. And you just don't want to get into the "okay, it's time for me to look for another option." If you're a startup founder and your service is down in the morning for an hour, no big deal. If it goes into the afternoon, now you've got a big deal. People start looking for other solutions.
Alex Wilhelm
When AWS and its us-east-1 region went down, everyone kind of lost service. If you lose service with everyone else, Jason, is the penalty just not that high for your individual company? Because that's good news and better than I expected.
Jason Calacanis
If it was AWS, everybody gives you a mulligan. No big deal, no harm, no foul. Now, all right, if you're a financial services company, you know, if you're a Robinhood or you're Stripe and you go down and it costs people money, that's a whole different ball of wax. You might need to give a credit, you know, and things can get, yeah, a little bit ugly if people lose money. If you are selling pay-per-view and it's Netflix and it's that boxing night, yeah, that is a big problem. So I think, you know, it's service dependent. The financial companies have a different order of duty. But you know, if you look at the video game industry, they'll be like, hey, we're having downtime Saturday afternoon. You can't play your game for three hours. Expect downtime. And they communicate it to you ahead of time.
Alex Wilhelm
I don't mind that. But, you know, Tuesday morning... I mean, I think Cloudflare likes to say they power or protect about 20% of the websites online. That's a pretty big chunk of the Internet to bring down yourself. Going back, though, to what you said about their footprint and how they help protect websites from DDoS attacks and do content delivery around the world: Cloudflare just bought a company called Replicate, Jason. This is a startup that I hadn't spent enough time looking at. It raised a couple of rounds, including a Series A from Andreessen and a Series B from Andreessen. And what's interesting is they're kind of offering serverless AI access. So if you want to just basically not deal with an individual provider of AI services, you could go to Replicate and, via one API hook, access a bunch of different AI models. And they handled all the technical backend. So it kind of makes sense to plug that into Cloudflare's global infra footprint, because then you can kind of have AI at the place of your consumers with limited lag and so forth. But, like, is it kind of crazy that I didn't really know anything about Replicate before this? And it raised several rounds. How many companies are there doing this sort of thing?
Jason Calacanis
This is a company I'm not aware of either. But yeah, being able to use an API to just send something to a model and get a result back is kind of how people are running these systems now. You can run your whole application on AWS or Azure or Google Cloud, Oracle Cloud, and then just be calling APIs from Grok, Claude, you know, Anthropic, or ChatGPT. That does make it a lot easier, doesn't it? You don't have to stand up a bunch of hardware, you don't have to stand up a bunch of instances. So it has become quite modular. I do think at a certain point in time people are going to want to stand up their own language models with their own data, to just not educate other LLMs and to keep their data private. So I think the future is going to be, you know, a lot of people running their own DeepSeek instance over at Google Cloud or Azure, et cetera. Yeah, I do think I'm just seeing it with startups. Let's say you have a startup that does some obscure thing like, I don't know, building codes, right? And you're going to fine-tune this model to do building codes. And it's going to be for architects and builders and developers to understand all these different codes on a global basis. You know, you're Starbucks, you're opening stores everywhere, you need access to all these codes. And you're taking all the time to go collect that data and to normalize it and train the model. Do you want to train ChatGPT's model or Anthropic's model and have them get the benefit of it, or do you want to just create an island, put all your data into it, and not have them see the prompts, not have them learn from your prompts, not have them learn from the queries your customers are sending to them? I think people are going to become very, very cautious about the large language models becoming competitors. And I said it before on the show. Everybody listening to me knows it's going to be a big end of the year.
November and December, you're trying to get a lot of things done at work, but Thanksgiving is coming, Christmas is coming, and you're going to need some help gifting and, hey, getting ready for your Thanksgiving table. And we've got the perfect partner. My favorite, Goldbelly. Yum, yum. Get yourself some amazing gifts for your partners. Get some amazing food for your table, like Martha Stewart's salted caramel chocolate cake. How about this? John's of Bleecker Street New York style pizza. Amazing. Lombardi's pizza. These are some of the best pizza makers in the world. It's a great way to reward your team members. I do it. I order great stuff from around the country and just surprise my team with it when we have team meetings. So if you're looking to reward your team for doing a great job in 2025, or maybe you want to surprise your loved ones with a special holiday treat, here's how you do it. You go to goldbelly.com, use the promo code TWIST, and you get 20 percent off your first order. That's goldbelly.com, and use the offer code TWIST. Some of them are positioning themselves as, like, neutral third parties. They're not going to be in your business. Other ones, you know, have to spend a trillion dollars. Quite possible. They need to make $2 trillion. How are they going to make that money? I think they're going to compete with you. So for this startup that was working, hypothetically, in building codes, and they took the time to go find all these building codes, let's say they hired a hundred researchers full time for a year and they spent $10 million on these hundred researchers to go collect all this information, normalize it, whatever, and then they feed it into somebody else's language model and now that model learns this information. Not good.
Alex Wilhelm
So in that case, Jason, the corollary or the follow-up question is just: the startups that you're seeing that are kind of rolling their own models, are they using, I presume, all models from China then, the open source stuff from your Moonshots, your DeepSeeks, etc.?
Jason Calacanis
I think this is the year this is happening. So, TBD. Well, you know, let me ask a couple of them if they're willing to talk. So, producer Marcus, maybe we should see if there's somebody in our portfolio who's doing this. We can just ask in Founder University, etc.: anybody working on open source AI models and training them themselves and running them themselves? And maybe they can come on the show and talk a little bit about it. But it is definitely a trend that's going on. And I saw Andreessen Horowitz was publicly saying, hey, a lot of our startups are using DeepSeek.
Alex Wilhelm
So that's what I was going to say. Martin Casado, I think said that it's like 80% of startups they see are using open source models from China.
Jason Calacanis
They're probably using other models as well, by the way.
Alex Wilhelm
It's just not exclusively, but I mean like, you know, I think that's how.
Jason Calacanis
People took that quote though. People took it as like, oh, they're picking the open source Chinese model over OpenAI's model. I think it's in addition, or maybe they're experimenting with them. So in other words, they're aware of them, they've got an instance running and they're sending some jobs to it. And why wouldn't you? I mean, I think ultimately what's going to happen is we're going to have a large language model on our Mac mini M17, if it's an M. I think I have an M4 right now, and I guess someday I'll have an M5. You know, somewhere around M10, I'm going to have four M10s, or an M10 will have four microprocessors in it and some amount of bandwidth and RAM that will give it the ability to run my language model locally. Then when I do that search and I say, hey, show me all the photos of us skiing with, you know, a bulldog in the background, because I'm looking for those pictures of the bulldogs playing on the mountains while we're skiing, it's just going to do that locally, not in the cloud. It's going to index my photos locally. It's going to have all my Gmail local. It's going to, you know...
Alex Wilhelm
So do I then have, like, a server in my basement, Jason, that has my family collection of GPUs, or am I just running a single Mac mini Pro, 2015 edition or 2030 edition, whatever?
Jason Calacanis
I think that the ultimate manifestation of all this is going to be that Apple will have local language models and they'll be pitching us on: they're encrypted, they're private, your data's safe, just like they do for iCloud. We can't get your iCloud stuff. If the government comes and asks us to crack it open, we can't get there. Whereas, you know, I guess Sam Altman gave a warning, like, hey, if somebody sends us a letter, we're sending them everything you've talked about. And when this kid tragically committed suicide recently, they had all the logs, right?
Alex Wilhelm
Like, yeah, they're going to dump the log.
Jason Calacanis
So if you're in a relationship with ChatGPT, you know, they have all that. Whatever you said to your ChatGPT, they've got it stored.
Alex Wilhelm
But the Apple point's great though, Jason, because if you wanted to have that model succeed, you would need to have a smartphone, a desktop computing experience, a slate computing experience, a headset computing experience, a future glasses computing experience. And you need to be in cars, perhaps do something like CarPlay. They literally have all the surfaces already that they would need.
Jason Calacanis
Whoever has the operating system, I think, is going to have a real... operating system plus privacy is going to be a really interesting combination. I've talked ad nauseam about my love of the Comet browser, and watching my browser, you know, go to United Airlines and find a flight for me and put it into the shopping cart was like, yeah, that's interesting. Or I used it the other day. I said, go to Bluesky, because I'm not following anyone on Bluesky. And I think I said, follow anybody on the Explore page who's, like, featured. Just follow like 200 people. And it went and it followed like 20 people. It's like, 20 is the limit. And I was like, why is 20 the limit? Okay, do it five times. And it did it five times. The idea that Apple is watching what you do on your phone, and there's this "Give Apple Intelligence"... I don't know if you've seen this setting. You can pull it up.
Alex Wilhelm
We've talked about this.
Jason Calacanis
Yeah, give Apple Intelligence access to what you're doing on this app. So if you go to the settings for any app on your phone, there's a thing like "give Apple Intelligence access to it." What does that mean? What that means is Apple Intelligence is watching how you use your phone, and they're watching how you use, I don't know, the Contacts app, your Phone app, your Superhuman app. And then over time you're going to be able to say, go do this in the StubHub app. Go find Knicks tickets. And it will have recorded and studied it over and over again.
Alex Wilhelm
People asked online quite a lot about how to turn this off. But to me, if you're going to use any sort of intelligence on your iOS device, you're going to want it to connect into other applications, Jason.
Jason Calacanis
So it's just under "Apple Intelligence & Siri." Yeah. "Learn from this App."
Alex Wilhelm
Next on the docket we have Grok 4.1. We'll talk about Gemini 3 in a second, but I think it's worth stopping and saying look at what xAI did, because this is a company, Jason, that started later than OpenAI and has really done, I'm just going to say, impressive things. I know a lot of my friend group aren't the biggest Elon fans, but I do think it's important to give credit where it's due. And Grok 4.1 is really good across a number of metrics. One, they did a lot of work on reducing hallucinations, which I think is a very important step. Compared to Grok 4 Fast, it's a dramatic improvement. And I think most importantly, based on the usual panoply of things that LMArena tracks in its battleground for different AI models, when 4.1 was released onto the charts, observe, it went straight to the top. Grok 4.1 and Grok 4.1 Thinking took the absolute pole positions. This is a real accomplishment, and I think it just shows that there's several different AI labs here in the US that are able to successively create state of the art models. I think that's very encouraging for startups, and I think it's encouraging for the AI economy writ large. And I think it also pushes back, Jason, against some of the doomerism that's been kind of building in the last three months. It feels like AI needs a win, and I think this is an example of such a win.
Jason Calacanis
Less hallucinations, obviously better. And one of the great things is, I guess, that now that we have these leaderboards, it's motivated all these companies and gotten these teams super excited about claiming the top spot. So yeah, kind of cool that somebody came up with this concept of the arena and all these different tests. I do wonder if building for the tests at a certain point is kind of like kids mastering the SATs, and at a certain point maybe that's not what's important. Maybe there are other things. So I wonder if these tests are changing. I don't have enough information on that, like, how are these tests changing? I know early on, one of the tricks that was occurring is that when you give people an incentive like a leaderboard, people then try to game the leaderboard. What's a way to game the leaderboard? Doing unnatural acts. Like, if you know it's going to ask certain types of questions, then studying for those types of questions, or feeding those exact questions, or, you know, multiple versions of those questions, into your training set. In other words, kind of gaming the system.
Alex Wilhelm
Precisely.
Jason Calacanis
And I think that that's where I'm just getting a little concerned of, you know, like are we actually making progress at solving the real world problems people have or are we getting good at acing the SATs? And that is a question for me.
Alex Wilhelm
There's something people are worried about. I'm trying to remember the exact phrase for this, but it's like diffusion, or when the model gets so suffused with... people know how the test asks questions, so it kind of loses meaning because you can tune for it. And if you think back to Meta's launch of Llama 4, Jason, if you recall, they actually made a bunch of different versions of Llama 4 and then ran them through the same test to get the best results. And they got mocked for this pretty mercilessly, because it was clear they were trying to cook the numbers to appear impressive. The reason why I shared LMArena information versus just a raw rundown of individual benchmarks is that this is their users in a head-to-head battle, choosing which result is better with the model names masked. And so it shows what people that are actually using these models think. And I'm hoping that provides a better perspective on actual progress here.
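As a rough illustration of how blind head-to-head votes can become a leaderboard, here is a minimal Python sketch using Elo-style rating updates. This is an assumption made for illustration, not a description of LMArena's actual scoring method; the model names, the starting rating of 1000, and the K-factor of 32 are all hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, winner, loser, k=32.0):
    """Shift rating points from loser to winner; upsets move more points."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_models(votes, initial=1000.0):
    """votes: iterable of (winner, loser) pairs from blind comparisons."""
    ratings = {}
    for winner, loser in votes:
        ratings.setdefault(winner, initial)
        ratings.setdefault(loser, initial)
        update_ratings(ratings, winner, loser)
    # Highest-rated model first.
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical votes: each pair is (model the user preferred, model it beat).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
print(rank_models(votes))
```

Because voters never see which model wrote which answer, a lab cannot win votes on brand alone, though the point about tuning a model toward what voters tend to prefer still applies.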
Jason Calacanis
I'm a big enough deal now that I can afford to hire my own admin team. Look at me. They handle all the details of running my company. But if you're a startup, you need to spend your time obsessing about your product, not filling out paperwork and doing all these books. That's why I want to tell you about Every. They've worked with over 1,000 startups, from first-time founders to VC-funded teams and more. And they've got the experience to help your company navigate whatever's coming around the bend. Maybe it's time to incorporate your startup so investors take you seriously. Every is going to take care of those filings for free, without any unnecessary delays or legal fees. Hey, maybe you just got some new funding and it's time for you to scale up. With Every, you'll get 3% cash back on every dollar you spend on your company's card. Hey, maybe it's time to say goodbye to a member of your team. It happens. Well, they've got an employee offboarding checklist, and you know I love a checklist. It's going to help you ensure a smooth transition while protecting your business both legally and financially. Plus you'll get a thousand-dollar bonus when you move 250k or more into your Every account. So for your incorporation, banking, payroll, benefits, accounting, taxes and other back office administrative needs, visit every.io. That's E-V-E-R-Y dot I.
Alex Wilhelm
But anyways, let's move on to this morning's big news. Similar idea, new AI model, this time from Google. Gemini 3 Pro is out. Limited early access. It's rolling out as we speak. Earlier, Jason, you mentioned how the benchmarks are a little bit dodgy, so I don't want to over-index on this particular data set, but I think it is worth just sharing with people what the numbers show. So here are the Gemini 3 Pro benchmark results. I just wanted to talk about how things have changed. If you look at the Humanity's Last Exam results, Claude Sonnet 4.5 gets about 14% right versus about 38% for Gemini 3 Pro. There are big improvements in math and screen understanding, and these numbers really are materially better than other models that are currently state of the art. And I just really think this pushes back against the idea that AI improvement has slowed dramatically. So I viewed this all as very, very bullish. And there's a little bit of a teaser, though. One last thing here. If you take a look at this chart right here, there's an upcoming variation of Gemini 3 Pro called Gemini 3 Deep Think that does even better and is crushing the ARC-AGI benchmark, which I think is very important. So I believe the kids say that Google cooked here, and that a lot of folks are really going to dive into this pretty much right away because...
Jason Calacanis
This looks incrementally better. You know, to be honest, we need to have an expert on here to tell us if it's actually better or not, or, like, the actual extent. Because this Humanity's Last Exam, I do know, having talked to some folks, those are some pretty hard questions. I would like to have an expert on here. Who created that? I wonder who created Humanity's Last Exam. Somebody created this. Somebody creates the new questions on it. I want to know who that person is.
Alex Wilhelm
In a more real-world application then, Jason, to put it into more of a functional context: Browser use HMS 500 company gave it very strong marks and said it was much better than previous models from the company. Box's Aaron Levie compared it to Gemini 2.5 Pro and found that, in the Box context, it was much better across a number of metrics. And Stripe's Patrick Collison gave it a pretty strong review, you might say. He gave it a pretty hard task and was very impressed with what it put out when it put together a compendium of recent research into genetics. So.
Jason Calacanis
So some people in the field are actually reporting. That's good to know. Got rave reviews from some leaders who actually use it.
Alex Wilhelm
Yeah.
Jason Calacanis
At their major company. So Aaron Levie likes it, the Stripe boys like it. That's a pretty good endorsement that they're seeing a difference.
Alex Wilhelm
It's at least as good as people hoped, unlike GPT-5, which was not as good as people hoped. So at a minimum the vibes are persisting. So that's pretty good. And then just for fun, Jason, I thought we'd talk about the Polymarket perspective here. There's a funny little wiggle here in this Polymarket market. If you're on the audio version, we're looking at which company has the best AI model by the end of 2025. The end date, of course, is December 31st. And there was this really interesting wobble. Jason, we're zoomed into a one-day context here. But look, Grok 4.1 came out and people got really excited about it. Maybe xAI is going to win. And then Gemini 3 came out and Google crushed it again. So the market is once again betting that Google is going to run the AI game through the end of 2025.
Jason Calacanis
How does this one resolve? I always bring the same thing up, which is we need to understand how the bet resolves. So how do they define which one has the best AI model?
Alex Wilhelm
Again, the market will resolve to "Yes" if any model owned by Google has the highest arena score based off the Chatbot Arena LLM leaderboard. So it's the chatbot subset of the LMArena leaderboard that we discussed earlier. Again, head to head, no models listed. People just pick which one they think is best. So it's kind of a blind taste test, and Google has to have the highest score in that area. And it didn't when Grok 4.1 came out. And now it does again, now that Gemini 3 has come out.
Jason Calacanis
And surpassed it. That means Google is going to have the best model, period. Full stop. Like, if the sharps are saying it, there's somebody on the inside who's making this bet. I bet there's people inside... this is my hypothesis. There's some group of developers who run LMArena, and there's people inside of Google who are betting on themselves. It'd be like Michael Jordan betting on himself to win a game or cover the spread. It's in his control. So when I see that sort of level of consensus, I kind of think it's the developers themselves working on the language model, or the person or people who actually run the leaderboard, who are watching the leaderboard activity and have some inside information. This is why, you know, Polymarket is really changing the world, because there's an opportunity to make money from your knowledge. People have some proprietary knowledge, some inside line, or they're in control of it. But this is what makes these things so interesting.
Alex Wilhelm
Do you want to see a funny little addition then, Jason, while we're talking about this? Sure. If you look at this, this is the same chart but zoomed out to the entire year for this kind of end-of-year bet. And you can quite literally see people's enthusiasm for GPT-5, and then it came out and its percent chance of being correct just collapsed. So not everyone trading on here has inside information, because some people are still reacting to news as it comes out to the public. But you can make a lot of money if you know what's going to happen before everyone else. You can see it right there.
Jason Calacanis
I think I'm with them. It does seem like Google is going to run away with it for this year and they also have this massive advantage with their profit machine and the number of searches growing. They can build out their capex. Yep. Off of profits and so that also gives them a massive advantage in all of this. And it makes sense. They bought DeepMind back in the day and they can just keep using all the data they have from browsers, Gmail, YouTube. I mean think about their data advantage. Gosh, what is. And have they come out with an agentic Chrome browser yet? Has there been any rumor about them releasing an agentic Chrome browser?
Alex Wilhelm
I think they're taking the same approach to adding AI to Chrome as Microsoft is with Windows, which is trying to slap it on the top so they don't disrupt what currently works. And I think that's why they're ripe for disruption. I know there's plugins and extensions. I don't think there's a hardcore agentic Chrome version, Jason. Or at least I. No, I don't think there is.
Jason Calacanis
Gemini is in Chrome like in the top. There is like a Gemini Button. Yeah, that will like, I don't know, summarize the page you're on and do basic stuff like that.
Alex Wilhelm
I'm not playing with this. Okay, this is better. I never actually clicked that button until now, so sorry if I'm behind but like, I don't think I ask. Okay, you see you can chat to Gemini inside of Chrome about your current page and you can change the tab, but this seems kind of bolted on to me and not as deep of an integration as we've seen with Comet, Atlas and other browsers.
Jason Calacanis
Yeah, I think they'll do some basic things like create a calendar event for you, but I don't think you can have it go do a task for you like Comet does, where you can say, like, hey, put 10 camping supplies I should have for my camping trip this weekend into my, you know, shopping cart, and give me two versions of each flashlight, two versions of each sleeping bag, and I'll delete the one I don't want. Give me the highest rated and the average price. I think those are the kind of features people are going to be looking for. But integrating with, like, summarizing your YouTube video or making a calendar event, I guess those are nice. But if they are studying everything you do inside your browser... we were talking about Apple Intelligence earlier, studying how you use an app. Now think about that with your Chrome browser plus your Gmail. They're going to know everything you shop for on Amazon. This would be super aggressive and probably trigger some antitrust stuff, but imagine if they were studying your purchasing and then just taking that into account with an AI agent, like, hey, I can build your shopping cart for your groceries next week based on, you know, what I've seen you do in the past, right? I think you're going to be out of milk. I think you're probably needing more eggs. It'll be very interesting. Your startup needs custom software, but building your own secure, production-ready apps is hard, right? And not every founder has the expertise or time to get started. Well, now you don't have to. Just use Zite, Z-I-T-E. Make whatever you need, from idea to a working app, in just minutes, easily, all on your own. Plus you can fill it in with forms, apps, databases and other automations that you need as part of your product.
So many of the no-code and vibe coding tools are just flashy demos, or they work on legacy systems that take so much time and effort to learn, let alone master. But Zeit is both easy to use and powerful enough to build out the app you want. And it quickly connects all the other tools you're already using. For example, earlier this month they added the ability to bulk create and update records. So it's easier and faster than ever to design data-heavy apps that are constantly syncing between systems. That's the kind of attention to detail that sets Zeit apart. Start using the number one AI-powered business software generator on the market. I'm going to give you 50% off your first project. Go to zeit.com twist to get started. That's zeit.com twist for 50% off your first project.
Alex Wilhelm
We just did confirm, Jason, that the current instantiation of Gemini in Chrome is non-agentic. I bet you if you count to like 35 they're going to update that, because even Edge, Microsoft's current browser, has the same kind of right-rail agentic interface that's very similar to, again, Atlas and Comet from Perplexity. Speaking about data and which companies have it, Microsoft has a lot of data because they run, of course, the Office suite and have for a thousand years, and they are going to invest in Anthropic. This is the other enormous AI news story from today. Both Nvidia and Microsoft are going to invest in Anthropic. Up to 10 billion from Nvidia, up to 5 billion from Microsoft.
Jason Calacanis
Wow.
Alex Wilhelm
Also, Microsoft is going to become a compute provider for Anthropic. So you're going to be able to use Anthropic's Claude models inside of the Azure ecosystem. And they're going to buy compute from Azure as well. So it's kind of one of these classic AI multi-part deals in which Microsoft's going to work with Anthropic and invest in it. Anthropic is going to work with Nvidia to ensure that its models can run efficiently on Nvidia GPUs. And everyone's very excited about this. This is going to be fun. Now, no matter what cloud you're on, Google Cloud, Azure or AWS, you can get access to Claude models. I think it's the first AI company to be available on all three major clouds, which I think speaks to their business focus.
Jason Calacanis
They very much don't want to compete on an application level with their customers. They want to be a neutral third party. Although they do have a coding product, I think they're not going to do what OpenAI is going to do, which is try to invest in every single space. But Microsoft investing 5 billion after just settling their OpenAI deal, I think that's probably why this is happening now.
Alex Wilhelm
Yes.
Jason Calacanis
The OpenAI deal is closed, and they seem to have worked out that they're going to go public and they have certain ownership, and now this drops. Microsoft investing 5 billion. Anthropic will use 30 billion worth of Azure compute, potentially. So that's pretty amazing. And these contracts are now starting to say "we'll invest up to." I think people, Wall Street that is, want to understand the contours of these deals a little bit more. And so now you're starting to see those qualifiers come in: up to 10 billion, up to 5 billion. Although here in the notes it says they're going to purchase 30 billion worth of Azure. I wonder if that is actually locked in or if that's based upon certain targets being hit. But that's kind of big news.
Alex Wilhelm
Anthropic has committed to purchase 30 billion of Azure compute capacity and to contract additional compute capacity up to 1 gigawatt. That's pretty sturdy language.
Jason Calacanis
To my point, I think you're starting to see more precise language in these deals because people are wondering, is this going to actually show up or not? So when they put "up to" and they put "committed", there could still be an out in that. By the way, they committed to this amount, and there could be caveats. Maybe they can cancel this, or they could push out this spend. It could occur over 10 years, it could occur over 20 years. There's all kinds of contours and qualifiers being put on these things.
Alex Wilhelm
These companies want each other to succeed. So I don't think they're going to be so hard-nosed about individual contractual things that they're going to cause a ruckus. Microsoft doesn't want Anthropic to look weak, nor does Nvidia want Microsoft to look like it made a bad deal. They want everyone to look brilliant and profitable, and then there's just revenues raining from the sky. So I kind of presume there's some goodwill built into these. Jason, maybe I'm being naive.
Jason Calacanis
You know, if they are making commitments to buy a certain number of Nvidia's chips, well, Nvidia needs to actually make those. So there could be outs. They're time-based, obviously. The nature of these is a rising tide lifts all boats, and we need more power, more GPUs. But Dario was also on 60 Minutes this weekend, and there was an interesting clip of him talking about white collar jobs that was exactly in line with some of the reporting I've had. I'll summarize it: he's concerned that for rote work, in the short term, the next couple of years, it's going to be really cataclysmic. And he just kind of comes out and says it, which is refreshing. Let's hear the clip.
60 Minutes Interviewer
You've said AI could wipe out half of all entry level white collar jobs and spike unemployment to 10 to 20% in the next one to five years.
Dario Amodei
Yes.
60 Minutes Interviewer
That's shocking.
Dario Amodei
That is the future we could see if we don't become aware of this problem now.
60 Minutes Interviewer
Half of all entry level white collar jobs?
Dario Amodei
Well, if we look at entry level consultants, lawyers, financial professionals, you know, many of kind of the white collar service industries, a lot of what they do, you know, AI models are already quite good at and without intervention it's hard to imagine that there won't be some significant job impact there. And my worry is that it'll be broad and it'll be faster than what we've seen with previous technology.
Jason Calacanis
This is interesting because he's pretty concise there. Again, back to contours and qualifiers. He thinks this could be 10% unemployment. Now, we've had 10% unemployment quite recently amongst young people. It's 8% right now coming out of college. So saying 10 to 20% is basically not a jump from that 4% number you'll hear me quote all the time, which is the sort of national number. Amongst young people, unemployment is like 8, 9% already, so going to 10 to 20% is not that big of a jump. This is the unemployment rate among 16 to 24 year olds, and you can see what's happened.
Alex Wilhelm
It's climbed from a bottom of 6.6% in April of 2023 to a high of 10.5% as of August. Recall the government shutdown and data delays. That's the most recent data point.
Jason Calacanis
Yeah. So it's already at 10% amongst this 16 to 24 group. The higher group, which is, I think, 20 to 27, I think it's 8% or so. That group is an important one to look at because it includes the college grads who've been in the market for a couple of years. The point is, what he's saying is not far-fetched and it's not doomerism. It's actually really well thought out. You will hear people dismiss it as doomerism, and the people you hear dismiss it maybe have horses in the race, maybe they have political agendas, et cetera, and they don't want people to be as matter-of-fact about this. But he is exactly right. If you were to look at an entry level PR person or an HR person, or an entry level researcher, or an apprenticeship level accounting person or legal person, a lot of what they do is a drag on the senior people. And the senior people are now able to do that young person's grunt work with AI. So then you have to ask yourself, if you're a senior level person, do you want to mentor these two or three annoying kids who are just learning how to show up at work, who can't even get in on time, or they're goofy and they fool around, or they do some stupid stuff at work? Everybody's had this experience. Mentoring people is hard. It takes time out of your day. Some people find it rewarding, most people find it annoying. Sorry, it's just the reality of it. And then, you know, you could just run a query with your LLM or use your copilot from a legal provider, a tax provider, a coding provider. Or if it's an HR person and they're like, I'm going to put people on recruiting for these new positions and writing the job descriptions, and then I've got to go edit the job descriptions, it's just easier for me to do it myself. You take a senior person like yourself, Alex, or somebody like Lon. You've got to prepare the docket. You have a young person prepare the docket, you edit them.
It's like, I could have just done this myself with a large language model and it would have been done already.
Alex Wilhelm
I think we're going to have to make a civilizational decision here because everything you're saying, Jason, is right. Often mentoring people slows you down. It burns time and it makes the senior person who's more expensive, less productive net because they have to go back and do a lot of busy work. But if we want people to have kids, they're going to need employment and stable employment is better.
Jason Calacanis
And yeah, so you're thinking on the societal basis, but that's not how people work. That's not how business leaders work day to day.
Alex Wilhelm
But maybe that's what I'm saying. We need to change our mindset because if we do automate these jobs away and we take a lot of careers, that should start right after college and begin to accrete value and income and therefore wealth and therefore the ability to buy a house and have children. If we short circuit that process at the beginning, we're going to have a lot of people that are just 45 with two roommates and a shared dog.
Jason Calacanis
Yeah, it's already happening. Yeah.
Alex Wilhelm
So, I mean, these conversations that people don't want us to have because they don't want us to, you know, tell people the truth about what's coming. I mean, who's offering. Is anyone even trying to find a solution to this other than just throwing their hands up? Because I agree with you and it scares me.
Jason Calacanis
It doesn't need to be scary. It should be concerning that smart people building in this space, Dario, Elon, myself, not that I'm on their level, and Bernie Sanders and other folks are saying, hey, this is a little bit concerning and we should be thinking about it. Individual businesses are not going to think collectively. The government is never going to be able to get in here and solve these problems. So what I'll say is, you're on your own, young people. The message to young people: you're on your own. Nobody's coming to help you, and you got to figure it out for yourself. That's my best advice. Because businesses are just going to do what businesses do, which is lower costs and be more efficient. They're going to adopt the technology. They are adopting the technology.
Alex Wilhelm
Oh, quickly.
Jason Calacanis
Yeah. And so they are not going to think about these overall societal issues. Our government has never been able to manage any of this stuff. They never will. They're not going to do it. It'll be 10 years in the rearview mirror by the time they get involved. I mean, they're just thinking about breaking up Google's search monopoly in year 30. So they're out to lunch.
Alex Wilhelm
Government moves slowly. But what about making conscious decisions, like putting together a consortium of business leaders and saying, hey, we are not going to stop hiring young people. We are going to keep investing in the future generations because we would like to not only have a society in 50 years, we'd like to have senior people in 50 years. This is a thing that businesses could get together and choose to do if they wanted to.
Jason Calacanis
I'm trying to figure out if there's any precedent for what you're saying. It's idealistic, but these are competitive businesses. So it would be like, I don't know, during the banking era, saying, oh, we're going to put in teller machines, so let's have all the banks get together and coordinate having tellers do more things. Which is eventually exactly what happened. You go inside the building and they had more services they offered other than just counting your money and taking your deposit. They moved those people to work on other offerings. I think in some cases they often have.
Alex Wilhelm
But it's different this time. I don't think this is just another introduction of a specific technology that impacts one slice of one part of the labor force. ATMs are automated teller machines. They automated tellers in a machine. Tellers were not 10 to 20% of the workforce, right? So when Dario says this and you say he's right and he's being upfront, we are implying that we are going into a Great Depression level of unemployment because we're going to automate away tens of millions of jobs.
Jason Calacanis
I think it'll be low millions of jobs. I don't think it'll be tens of millions, but I think low millions per year will be automated away. So then the question is, do we catch up with new work? Just start a company, go to Founder University? I think nobody's solving these problems. I think it's going to be a tragedy of the commons kind of situation. Nobody's going to be looking out for the interns and the apprentices. They're just going to adopt the technology, go faster, lower their costs. Amazon's just going to be a machine. They're not going to think collectively about anything other than faster deliveries at a cheaper price.
Alex Wilhelm
And you better get accustomed to Mamdani winning in New York City, because what you're saying here is that there's no way for businesses to look out for their own interests. And that's surprising to me, because you're saying essentially it has to be such short-term thinking that they can't think long term. And that's an indictment of the capital formation process.
Jason Calacanis
They can't think collectively. They can think midterm, long term. But it's not like Amazon's going to go, you know what, we need to keep adding staff to do this instead of robots so that our staff can buy stuff on Amazon or can go to Starbucks on their way to work. That's just not how businesses think. They're just going to be radically pursuing efficiency. Now, the question is, do people want to create more companies and products and services using these new tools? If you know these tools, you're going to be infinitely employable. And that's the best advice I can give any young person who is going to be faced with 15% unemployment amongst their peer group. I think it'll be 15%. When I graduated college in '93, I think I remember 14 or 15% unemployment amongst that 20 to 27 year old group, if you look at that.
Alex Wilhelm
Okay, here's the 20 to 24 year old unemployment rate historically.
Jason Calacanis
Okay, that's interesting. That's really the graduates. If you go to '93, it was like 12%.
Alex Wilhelm
It was just starting to decline from about 10 and then it fell consistently down to 6.
Jason Calacanis
Right when I was graduating it was over 10%. Now it's at under 10%. It could be faster, as he's saying, than people anticipate.
Alex Wilhelm
If it's not faster than he's anticipating, then we're spending too much on AI. And if it's going to go as fast as he says it is, we're going to end up with socialists running every single major city in this country. So something for business people to think about as they approach productivity.
Jason Calacanis
So we're going to host our dim sum demo day in San Francisco. We'll have our latest accelerator class. Maybe I'll do a fireside chat with one of my friends or besties, and we'll eat some dim sum. And it's just going to be great networking for investors. I'm going to invite 150 investors, everybody from my high net worth LP base, the syndicate members who like to write 10 to 50k checks into startups, as well as all my seed fund and venture friends in Silicon Valley. You can apply to come if you're an active investor. You have to be an active investor. This is only going to be like 150 seats, I think, or even 100. There's no room for friends to just hang out or other founders to hang out. You have to be an investor. You have to be able to prove you're an investor. You can apply at launch.co/dimsum, D-I-M-S-U-M. That's launch.co/dimsum.
Alex Wilhelm
So jealous. That's gonna be so much fun. I love dim sum. I love dim sum in San Francisco and I love me some nerds. So that's like my three favorite things.
Jason Calacanis
It's gonna be in the San Francisco area. With these demo days, I think 70% of the value is getting to catch up with other investors, candidly. And then 30% is getting to see the companies and meet them. So there's the reason to come. If you want to fly in to San Francisco, you've got a reason to do it December 5th. And I think December 6th is the All-In holiday party, if you happen to be coming in for that. Awesome. And that'll do it for us today.
Host: Jason Calacanis
Guest: Alex Wilhelm
Date: November 19, 2025
In this episode, Jason Calacanis and Alex Wilhelm break down the latest developments in the AI model space, focusing on the recent releases of Grok 4.1 from xAI and Gemini 3 from Google. They cover model performance, real-world implications, industry trends, the reliability of cloud infrastructure, recent major investment and M&A moves in the AI sector, and the looming societal impacts of rapid AI adoption. Their conversation combines technical analysis with candid debate about long-term consequences for startups and entry-level workers.
Grok 4.1 and Gemini 3 Pro:
Both models were just released, and each leapfrogged state-of-the-art benchmarks immediately.
Leaderboards & Real-World Impact:
Having public leaderboards (like Chatbot Arena LLM) has incentivized labs to optimize their models, but also raises questions about "teaching to the test".
Cloudflare Acquires Replicate:
Replicate offers serverless AI access — startups can access multiple models via a single API, rather than standing up their own infrastructure.
Future of Data Privacy & Model Competition:
Jason predicts more startups will train proprietary models to protect domain-specific data and competitive advantage.
Open Source Model Adoption:
Many startups are experimenting with open source models, including those from China, for niche/in-house applications.
Prediction: Localized, Private LLMs:
Jason imagines a near future with high-power Macs running local, private LLMs, ensuring privacy and offline capabilities.
Anthropic’s Massive New Investment:
Microsoft and Nvidia are investing billions in Anthropic, a top AI model developer.
Deal Structure Evolution:
New large deals feature language like “up to $X billion,” reflecting uncertainty and future milestones.
Dario Amodei’s 60 Minutes Warning:
Dario Amodei warns about rapid white-collar job loss due to AI: half of all entry-level white-collar jobs could be wiped out, with unemployment spiking to 10–20% within one to five years.
Mentoring, Automation, and Societal Consequences:
The hosts debate the likelihood that businesses will act collectively to mitigate societal harm. Jason is skeptical about coordinated, altruistic solutions.
Historical Context & Future Outlook:
While the impact of AI is likened to past automation waves (like ATMs or robotics), the scale and scope of possible job loss in entry-level, white-collar roles is unprecedented.
Founders Should Prepare for AI Integration:
Encouragement to Adapt:
The episode ends with announcements and encouragement to attend upcoming networking opportunities for investors and founders, reinforcing the need to stay plugged in and adaptable.
On rapid AI evolution:
"Grok 4.1 is really good across a number of metrics compared to Grok 4. Fast. It's dramatic improvement... When 4.1 was released onto the charts, it went straight to the top." — Alex Wilhelm ([00:07])
On the risk of leaderboard optimization:
"Are we actually making progress at solving the real world problems people have or are we getting good at acing the SATs? And that is a question for me." — Jason Calacanis ([00:32], [19:01])
On PolyMarket & insider knowledge:
"It's like Michael Jordan betting on himself to win a game or cover the spread. It's in his control." — Jason Calacanis ([25:03])
On structural unemployment:
"AI could wipe out half of all entry level white collar jobs and spike unemployment to 10 to 20% in the next one to five years." — Dario Amodei ([35:38])
On the need for self-sufficiency:
"You're on your own, young people. Nobody's coming to help you, and you got to figure it out for yourself. That's my best advice." — Jason Calacanis ([40:18])
AI Model Race and Benchmarks:
Cloud Infrastructure Discussion:
Serverless AI & Proprietary Data:
Agentic Browsers, App Ecosystems:
PolyMarket & The AI "Horse Race":
Major Investments in AI Companies:
AI, Automation, and Youth Employment Crisis:
Societal Adaptation & Founder Advice:
This episode provides a comprehensive, sometimes sobering overview of the fast-moving AI landscape, from cutting-edge model releases and their real-world efficacy to the looming challenges for startups and young professionals. While showcasing new heights of technological advancement, Jason and Alex press listeners, especially founders, to balance optimism about AI’s potential with practical awareness of its societal impacts.
As always, the message for startups: stay agile, keep learning, and be prepared to build with — and around — rapidly changing AI tools.