
A
All right everybody, welcome to TWiST. It's April 8, 2026. My co-host Alex is with me. We've got a bunch of guests today and we've got a major breaking news story, which is that Anthropic released a promo video and a thread saying that their new model, Alex, is so powerful that they cannot release it. We knew this day would come. The day is here. Why can't they release it? They believe when they tested it, or they found out when they tested it, that it would try to escape. That was one issue. But a bigger issue was it could find exploits in 10, 20, 30 year old software projects and it could thread together multiple security vulnerabilities. Catch everybody up in the audience on this incredibly important story.
B
This Week in Startups is brought to you by LinkedIn Jobs. Hire right the first time. Post your job and get $100 off towards your job post at LinkedIn.com/twist. Grasshopper Bank. Time is money. Don't waste either. Go to grasshopper.bank/twist and get an exclusive $500 cash bonus just for opening an account. And Render. Find out why 5 million developers are already using the all-in-one cloud platform Render. Go to render.com/twist and apply for the Render program to get $500 to $100,000 in free credits depending on your stage and backers.
Big model race going on between the major AI labs. Anthropic is now very far out in front with its new model called Mythos. It is a general purpose LLM, so it's not tuned for one specific task. It is currently in preview. You cannot use it. A consortium of companies, Jason, are working with Anthropic to basically use it in a defensive capacity because, as you said, Mythos is incredible at finding, exploiting, and patching security vulnerabilities in software that humans have often missed. This goes back decades, as you said, to things like OpenBSD, a famously secure piece of software. It found something there. It found something in FFmpeg, which is an important part of the open source infrastructure of online video, for example. Basically the gist is with this model anyone can go to any piece of software and find zero-day exploits quickly and then basically go to war with them. So Anthropic cannot let this out of the bag, because if they did, then North Korea and China and everyone else could use it to essentially break the modern digital infrastructure that we depend on. So today, Jason, Project Glasswing is the goal. A bunch of companies, your Nvidias, your AWSes, your Azures, are all going to work with Anthropic to basically take the model, Mythos Preview, and harden everything. 
Anthropic has also put together a $100 million credit fund, essentially saying, here, use the model, up to $100 million of compute, to essentially harden these systems. So I think that Anthropic is doing the right thing here by saying, hey, we're not going to release something this dangerous, you know, just off the cuff. But it does create a situation in which we now have a very much two-tier economy. There are the companies that are sufficiently important that Anthropic is letting them have access to Mythos, and that means they can be ahead on both defense and offense. And also we're now seeing a world in which smaller companies are just stuck outside the glass looking in. And I think it's a little bit of a change, because previously every AI lab was so focused on having their newest and best model out in the world that it was very democratic.
A
I'm going to play the video that they shot and we can get into with our guest today whether this is cynical and this is showmanship. And they want everybody to understand, hey, it's This Week in Anthropic. Basically, we're going to rename the show because they're dropping so much incredible content. But before we do, I have to take a moment to applaud Plaud. You see my Plaud pin? Here they are, our sponsor of the show today. You have the Plaud wristband. I hold and press. I get a nice little haptic boom, the red light goes on. And now when I put it into my charging tray, here's my little charging tray. And here's my backup Plaud. You should buy two. That's my best advice. Have two. One in your bag and then one at your desk. Any meeting you do, you're just one click away from having the note taker on. And this note taker is independent of all other platforms. It does beautiful summaries. I am addicted to this. You know the big test for me, Alex? It's called the key and the wallet test.
B
The key. Oh, I know where you're going with this.
A
If I leave my house without my keys or my wallet, now in today's day it's your smartphone and your pouches, your nicotine pouches. Those are the two things I gotta turn back for. No, I don't turn back for nicotine pouches, but I do turn back for my Plaud. I literally put the car in park, walked back to the house, went back into the house, and I got my Plaud pin.
B
I think it's fantastic. If you want to have a good recording of all your conversations and automatically take notes, then you should go to plaud.ai/twist. P-L-A-U-D dot AI, slash twist. Use the code TWISTJASON. Save 10% on your purchase. We're big Plaud fans. Shout out to them.
A
Let's get started with the show. I want to go deeper on this, and you know our pillars here on the show. Show, don't tell. We like to have great demos and we like experts. Experts only, no offense to our journalist friends. The top 20% do a great job. The other 80%, we all know, you know, maybe they're trying their best, but we like to have experts on. We do experts, directly from the horse's mouth. Here we go.
B
We got Rob May from Neurometric. Neurometric builds small language models, in contrast, Jason, to large language models or LLMs, things we talk about so very much. We're going to talk about SLMs in a little bit of time. But Rob, I presume you've had a chance to read through the Anthropic Mythos card, read the red team report, and chew on it. So what do you think?
A
Rob has opinions, so people know. Rob was one of my first angel investments, one of my first five, for a company called Backupify. This was a genius idea he had. I cold emailed him. Is that correct, Rob? I cold emailed you and said, hey, I love your product.
C
Yeah, you did. This was back before you were famous, Jason. So you had to explain to me who you were.
A
Yeah, I was like, hey, my name is Jason Calacanis. I use Backupify. And this was the best idea ever. If you lost your Gmail account, Alex, in the days before, this is before Google Suite, I think, existed or was just coming out. If you lost your Gmail account, Google would say, okay, you got hacked, and you'd lose your entire archive. Rob figured out a way to use the API to back up your Gmail box and your G Drive and every other service you can think of. It was an amazing vision and it was a great company and we had a nice little exit. I backed every company Rob's done and he's got a new company. So I'm excited to hear all about it, catch up with Rob. Rob also ran the Open Angel Forum for me in Boston. Yeah. For a little.
C
Yeah. And actually, I think I was one of the first people to talk to you about AI, Jason, back like 10 years ago. I was talking to the incubator and telling the story, and you and I had dinner with one of your friends, Jeff, with the G. And we stayed up late talking about AI.
A
Yeah. Good memories. And, yeah, I'm going to be in an old age home, Alex, and I'm going to be like, welcome to episode 20122 of This Week in Startups. We're here on TWiST, and here's Rob May. He's going to be, like, in a wheelchair, like Professor X, but it's going to be a levitating, floating one. And it'd be my brain in a vat, a bacta tank, and I'm just going to be 120 years old, still doing this and loving it. Let's get into it. Mythos is the preview of the new Anthropic model. Here is Dario in this video. I don't know if you've seen this video yet.
C
I don't think so, no.
A
Okay, we'll get you to react to it for the first time here. Here is the Anthropic team talking about why they're withholding the model.
D
There's a kind of accelerating exponential, but along that exponential, there are points of significance. Claude Mythos Preview is a particularly big
A
jump along that point.
D
We haven't trained it specifically to be good at cyber.
A
We trained it to be good at code. But as a side effect of being good at code, it's also good at cyber.
B
The model that we're experimenting with is, by and large as good as a professional human identifying bugs.
C
It's good for us because we can
B
find more vulnerabilities sooner and we can fix them.
C
It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently. But this model is able to create exploits out of three, four, sometimes five vulnerabilities that in sequence, give you some kind of very sophisticated end outcome.
D
And we think that this model can do this really well because we notice that this model is very autonomous. It's just generally better at pursuing really long range tasks that are kind of like the tasks that a human security researcher would do throughout the course of an entire day. Obviously, capabilities in a model like this could do harm if in the wrong hands. And so we won't be releasing this model widely.
B
More powerful models are gonna come from us and from others.
D
And so we do need a plan
A
to respond to this.
D
That's why we're launching what we're calling Project Glasswing, where we partner with a number of the organizations that power some of the world's most critical code to put the model into their hands, to allow them to look at how they can use models like this to bring down risk and protect everyone.
B
And by giving these software developers advanced tools before anyone else, it gives all
A
of us a collective head start. It allows us to find things that we couldn't find before, and it helps us fix these things much more quickly.
D
Working with our partners, we've been finding vulnerabilities across essentially every major platform.
C
I found more bugs in the last couple of weeks than I found in the rest of my life combined.
A
Dario said he's got very big concerns, and the reason he left OpenAI was he felt Sam Altman was not trustworthy. I'm not piling on Sam here. There was a big New Yorker story, but the truth is Sam drove a lot of people out of OpenAI and there were trust issues, according to those people. Again, I'm not editorializing here. Everybody kind of thinks I'm Team Elon, which is fair enough, but I don't have investments in any of these companies, so I'm not talking my book or anything. But the truth is he drove Dario out. Now Dario is passing OpenAI in models, profile, PR, influence, and dare I say, the revenue here. So, your take, Rob.
If you've got an engineering team at your company, I'm betting there's a solid chance they're spending far too much time on infrastructure. You need your team building your product to delight your customers, not configuring your virtual network. Render is the all-in-one cloud platform for developers that allows you to deploy, scale, and secure your apps and agents with zero ops. Most cloud platforms ask you to split your focus between product and infrastructure, or they force you into platform constraints that you know you'll outgrow in six months. But just connect your GitHub repo to Render and you are live. L-I-V-E. Web services, cron jobs, managed Postgres, the whole stack in one platform. It's time to find out why 5 million developers are already using Render. Go to render.com/twist and apply for the Render startup program. You'll get anywhere from $500 to $100,000 in free credits depending on your stage and who your backers are. That's render.com/twist.
C
There's a lot of stuff in there.
A
Take it wherever you want to go.
C
Yeah, I think Anthropic has significantly passed OpenAI on a lot of things. And I think the reason is that they've been more focused. Like, what's happened to OpenAI, to focus on them for a second, is that because they were first and they've had to raise so much money to build these models, Sam had to go out and sell that they were going to enter... they needed a $10 trillion TAM, which means you got to be in every market. Right. And so I think that's their problem. They've been trying to do a little bit too much. With respect to this model, I do think it's interesting and I like their approach. I think we were talking a little bit at the beginning of the show about how they're going to IPO this year. So this was a very well put together video, and I'm sure they had the IPO in mind as they started to do this. But that said, I actually like the approach. And what I like about it is, I think the Silicon Valley vibe on AI for the last couple of years has been like, whoever hits AGI first wins. Right? Because that machine extrapolates and gets better. And I think what we've seen over the last 18 months that surprised everybody is the parity amongst the top labs, including Google and everybody. What it means is when we hit AGI, we're all going to have access to superintelligence for free and open source three to five months later. And so I like what they're doing, because they're preparing us for that day. And I think that's super important.
B
Rob, do you think that the open source.
A
No, no. Alex, I want your opinion. Actually, having watched this and handicapping it, you've been deep in this space, and Cautious Optimism is the place to go. Look at that, cautiousoptimism.news.
B
Yes, sir.
A
Yeah. Okay. So I read the newsletter. Everybody should subscribe. You've been obsessing over this. Your take on Dario being right, Dario being driven out of OpenAI, Dario focusing on code first as opposed to side quests and consumer, and them surging ahead of OpenAI.
B
I'm surprised at the speed at which we went from "Anthropic is catching OpenAI" to "they seem to be tied" to "Anthropic is winning." October, January, and now April. It's pretty crazy how fast things have changed. Anthropic has gone from probably around 10 billion ARR last October to like 30 now, which is just incredible.
A
Unprecedented.
B
Unprecedented to the point at which I don't even really understand the numbers. What I will say, though, about their decisions in product terms is that I think product market fit is the ultimate arbiter of entrepreneurs' success. And clearly no one has more PMF than Anthropic today. Regarding the future, and Rob mentioned that the video is aimed kind of at investors in the IPO, talking about future capabilities, future capacities, there was an interesting interview between Greg Brockman and, I think it was, Alex Kantrowitz over at Big Technology, talking about models. This is right before Mythos came out, and they were discussing takeoff and how these models are now getting a little bit better at self-improvement, working on themselves, writing their own code. And so I think we're seeing the tailwinds of that. Though the thing that I'm unsure of, back to Rob's point, is how quickly the open source world will catch up to Mythos, because today Meta dropped their latest model in their new family. Shout out to them for pulling that off, and it looks pretty good compared to everything that came before Mythos, but it's not now state of the art. So if the proprietary labs are here and then the second tier players are here and open source is here, I'm curious if it's more than three to five months until they catch up, and if so, we have more time to fix the world, Jason, to find all these vulnerabilities and patch them. Because to me this is at once the best tool in the world for cyber defense, find the bugs, patch them, and also the best tool for cyber offense, find the flaws and then abuse them. And so we're going to be in an arms race, I think, until we're all dead.
A
Let's take a look at the Anthropic Polymarkets. This is a way for us to really understand how Polymarket's actually doing. There's so many to choose from. Here's the URL of all Anthropic Polymarkets. So let's pull this up. The first Polymarket that comes up under the Anthropic tag over at Polymarket, and you can basically find a keyword there, and it's polymarket.com prediction anthropic. Right. The first one: Anthropic $500 billion valuation in 2026, 95% chance. Everybody realizes that's going to happen. Anthropic Claude score on FrontierMath benchmark by June 30th, 72%. And then there's the Anthropic IPO closing market. Just tons to go through here. I think the one that we want to know is when are they going to release this model. And this model is called Mythos. And so walk us through this one here, because I don't see a chart on this one.
B
Yeah, it's not charted. Some of them that have fewer kind of, like, endpoints aren't charted per se. But in this case, we can see, Jason, that there were people betting that it might come out before the end of March. That has now, of course, lost, because there was reporting about this model via Fortune and a leaked blog post a couple of weeks ago, if people remember. Now, April 30, I would say 0% chance. Polymarket sharps say 7%, Jason. But the thing that really caught my attention is this June 30. They're only handicapping a 28% chance, which means three out of four times we won't have Mythos out in the market by the start of July. That's a couple months from now.
A
Or two thirds. Yeah, two thirds, because you got to put the first two numbers together, I think.
C
Oh, right, of course.
A
Two thirds chance. You know, it's not out until. Yeah. Sometime in the summer or the fall,
B
which in AI terms is. Is years. That is. That is so much time in this present moment. And the thing that I'm not sure about, and I'm really curious to know, is all these companies that have Mythos now and can play with it and can use it, can put it to work hardening their software, how much progress can we make in that interval? And will it be half the work we need to get done? Will it be all of it? I don't know. Rob, do you have an idea of how fast we could put a model like this to work in fixing all the code? We depend on day to day.
C
The problem is we're generating code faster now. Right. So with all the vibe coding, I think I read that more than 25% of GitHub commits now are vibe coded. And so that's going to go up. I know you love poker, Jason, and so, like, the world is going to boil down to poker. Running a business in this AI future is going to be about estimating probabilities of things happening, knowing the cost to go run those probabilities to ground and figure out what they really are. And so this is going to boil down to using your compute to figure out, like, you know, code generation versus code checking. I mean, Anthropic has the main code generation platform, sort of the number one right now. So maybe they combine these together in ways that write better code. But, you know, if you're talking about it not coming out till June 30, I would not be surprised to see an open source model like Qwen or Kimi, GLM maybe, come out with something that's similar before that date, from somebody who doesn't care as much about.
A
Okay, so this is a key point that you're making, Rob. What if a bad actor has already achieved this? Like, China may have already accomplished this, and they have no incentive, since every company in China is owned by the CCP, which would be the equivalent of, like, the CIA having a board seat in every single company. And there's like four CIA people and FBI agents and Department of Justice, whatever, inside of Anthropic and OpenAI saying, okay, not only are you not making that promotional video, we're taking this and we're going to hack North Korea, Iran, whatever bad actor we want. China might already have this. Everything could be completely compromised at this point. And I think that's when Americans who were debating David Sacks and the administration, and are we hand-wringing too much here? America has to win the AI race, folks. It is existential who wins this. It's 100% existential if these tools give you the ability to hack everything.
Here's a hot take. If your bank moves slower than a startup, that's a problem. I see this process behind the scenes every day, and I know that bad banking can kill your company's momentum. That's why I'm so glad to introduce our newest partner, Grasshopper. Grasshopper is a real federally chartered digital bank, not some fintech wrapper sitting atop some mystery institution. Nope. It was built just for founders like you. You want fast? You want easy? Open an account in just minutes and start earning yields that can top 5%. Wow, that's a big number. Plus, you'll get unlimited 1% cash back on purchases, free ACH, free domestic wires, and no monthly fees. Plus, if you're sitting on some real runway, Grasshopper's treasury product hits 5% plus with same-day liquidity. As a TWiST listener, Grasshopper wants to give you a $500 cash bonus just for opening an account. And you can open an account really quick. So go right now to grasshopper.bank/twist and use the promo code twist to get started.
B
I think we should consider this Mythos model, this Mythos preview, to be essentially a cyber weapon and perhaps a cyber weapon of mass destruction. I mean, maybe we need A new term for this, because I can't recall, apart from certain hacks in history, of a point in which people were worried about all of software having potential vulnerabilities. And I think, Jason, it's a little bit weird that we're talking about a three to five month gap here of time that we might have ahead of China. But I would so much rather have us have that time to work to get things secure than to be catching up for three to five months until our companies were of a sufficient quality to match them.
A
So I thought this sheds a new light, Alex, on Emil Michael's appearance on All-In three weeks ago, maybe, where he was talking about the Anthropic conflict and are they going to ban it or whatever. These guys must have made peace right now. And I wonder if Dario and Emil, when they broke bread and were trying to work this out, I wonder if Dario said, by the way, our new model would give the CIA the ability to hack North Korea, to hack China. And we are patriots at Anthropic. And instead of giving the tool to a bunch of security researchers, we're going to give it to the US government. Because if you made this the equivalent, Alex, of the race for the atomic bomb, sure, no American citizen would be like, you know what we need to do? We need to keep the atomic bomb to ourselves, and we're just gonna delay making it. Oppenheimer would say we need to have this for ourselves before the Nazis get it, period. Stop. I don't think I'm out of line here to say that this is becoming the equivalent. This might not seem as much, because a nuclear bomb can cause such a mass destruction of life, but this could cause a massive financial devastation across the economy.
B
We talk about the GPT-3 moment a lot. I think that's like nuclear fission in your analogy. And then Mythos would be the introduction of the hydrogen bomb. But, Rob, I think I may have cut you off there.
C
Yeah. Like, can this thing figure out how to hack bank software, right? Like SWIFT and international transfers. Like, who knows, Bitcoin? Like, you know, it's going to be crazy to see how, because we do know, like, if this capability exists, I think you're right, Jason. I think probably some other places have it. Like, my guess is Google has something like this that they haven't announced or released. Like, I still think they're in front of Anthropic, from what I can tell from the people I know there and the tools and technologies that they have. And we also know that last year, I think, for the first time, China passed the US in research papers on AI accepted into top tier conferences and journals. And so there's a really, really good chance, Jason, that I think your point of view might... this might be going on behind the scenes. We don't even know. And it's interesting to think about the run-in that they had with the DoD a couple of weeks ago, five, six weeks ago, whatever. With Anthropic it was sort of like, why don't they just use GPT-5 or whatever? Like, you don't need Claude's model. But it might have had to do with this, and they knew it was coming. And we're discussing it.
A
And this was one month ago. So now we start doing game theory. I think the CIA and the government are inside of Gemini, Anthropic, OpenAI, xAI, talking to each of these model folks. I think they've probably been in there for a year or two, and they are saying, hey, are you a patriot or not? These tools, are you gonna hold them close to the vest, or are you gonna tell us exactly what's going down here? And then you turn over another card. Alex, if this is in fact true, what they're saying, and if they present it as such, if Dario is presenting this as, it's cataclysmic, the entire economy could go down, and we believe him, we take him at his word, there's an argument you have to nationalize this technology. There's an argument it's too powerful for a private company to own this. It would be like if a private company were to stumble on a bioweapon of such. Or this weapon that we were using, the disorientating weapon. If a private company gets to that level, hey, we've got a weapon that makes you come in like a superhero, and you could just Professor X the entire other army and just make them all grab their ears and blood starts coming out of their nose. I hate to be graphic. You have an obligation to go to the president and say, Mr. President, this private company has this. He's, okay, great. Let's go to Venezuela. Let's handle that problem. There's some equivalent here. I don't mean to be hyperbolic. I'm just telling you game theory. I don't think Dario's lying. I think Dario is being sincere. Now, I know everybody hates Dario. The right hates Dario. Dario wouldn't bend the knee to Trump. Dario did not donate to Trump, like the $25 million that one of the OpenAI CEOs or co-founders did. He's not loved by this administration, but they love his tech. And then is he a patriot? Is he like Alex Karp and Palantir, or is he a hippie dippy who says, I don't want my tool to be used for this? This sheds it in a totally different light. 
And there's some conversation going on here that we're not privy to between the President of the United States, Emil, Dario, the CIA, the Department of War. This is cataclysmic in its severity.
C
Yes.
A
And this is a super weapon.
B
I want to point out just one thing to back up what you're saying, Jason. Thomas Friedman, not someone we usually quote here on the show, but he wrote a post for the Times and he says this is not a publicity stunt, referring to Anthropic's position here. Quote, in the run up to this announcement, reps of leading tech companies have been in private conversation with the Trump administration about the implications for the security of the US and other countries. So not only is Anthropic saying this, people they've shared the model with are going to the White House and saying, holy crap, ring the alarm. Because suddenly everything is potentially insecure.
A
Now, Thomas Friedman then is confirming, and I didn't see that story. So great pull. He's confirming what I believe to be occurring right now. And by the way, this is not. I'm not in the Trump administration. I probably could have been. I could have been on one of these projects. I didn't. I mirrored. Thank you. So, yeah, well, it's just a choice. I'm an independent. Well, how do you think this and how do you.
C
But how do you think about the idea that the government needs to take this over at a time when the public's trust in government is probably the lowest it's ever been on both sides of the aisle, Right?
A
Yeah.
C
Republican and Democrat. I mean, doesn't that present a really interesting sort of wrench in the game theory, how you think about it?
A
Nobody trusts anybody because we're in, literally, what was the show with David Duchovny? The X-Files. Trust no one. We've caught up to The X-Files. Those people believed that the truth is out there and you should trust no one. Those were like the two signature concepts of that show. That show was 30 years ago. You know what? Here we are, folks. This could be the thing that galvanizes three different groups of people who have the least trust in the world. Journalists, AI, and the government. These three groups are not trusted anymore. And for good reason. You can debate it, but The X-Files in 1993, they nailed this. The truth is out there. And the truth is this maybe could galvanize these groups of people. Thomas Friedman and the New York Times should be saying, hey, if we have any information about this, do we put this out as a public story or do we go to the White House, do we go to President Trump and Emil Michael and say, hey, by the way, we think this is real. Dario, Elon, Sam Altman, Sergey Brin, they all must be having some sort of Zoom or private conference here to unpack this. And how could this be used to stop North Korea and their intercontinental ballistic missiles, which they have, and their nukes, which they already have, as opposed to Iran, which has none of that, they're attempting. But North Korea is far ahead. There is a nuclear threat out there from a bad actor who's insane. Kim Jong Un, he's insane, and he has a ballistic missile that could reach California and he has 30 or 40 nuclear warheads, they estimate.
B
I mean, yeah, but what's more effective now? The threat of nuclear deterrence or the ability to take a model and break everyone's entire nation? So like, to me, like, it's funny that I think I'm actually more worried about the second category than the first. Cause the first seems to be more about saber rattling and keeping your country from being attacked. But this gives them more useful offensive capacity. Jason.
A
That's the way we need to take 20, 30% of the cycles of these companies. The government needs to say, hey, we'll pay you for your time in not releasing these models. And we're going to create an Oppenheimer Manhattan Project. We're going to create a Manhattan Project to sturdy the infrastructure of the United States.
B
Okay, now let's take the other side of this coin. We're all kind of in broad agreement here. But here's the downside to all this caution and I'm going to pull it up right now.
A
Hiring can be its own full-time job. And hey, guess what? I already have a full-time job. I make podcasts and I invest. But when you're running a small company, we both know every hire matters. You don't want to waste any of the seats you have at your company. And the best partner you can have is LinkedIn Hiring Pro. Why? There's a billion people using LinkedIn. All the great talent are there. If you're proud of your work, you build a LinkedIn page and you update it. LinkedIn Hiring Pro is gonna streamline and simplify the entire process for you. Nearly 60% of companies using LinkedIn Hiring Pro. You're gonna get an incredible candidate to interview in the first week. And, you know, we're looking for a new producer for the pod. We did shout-outs here on the show. We posted it on my social media. We asked friends. You know where we found our next great hire? LinkedIn. And it was competitive. We had like three or four really good choices. So hire right the first time. Post your first job and get a hundred dollars off towards your post at LinkedIn.com hiringpro offer. That's LinkedIn.com hiring pro offer. Terms and conditions apply.
B
This is an excerpt from the Mythos Preview system card, I think. And essentially what it shows, if you're on the audio version, is a dramatic increase in score for Mythos across a number of very important benchmarks. These in particular, Jason, are software coding AI benchmarks. And as you can tell, you know, on SWE-bench Multimodal, we went from 27% with Claude Opus 4.6 to 59% with Mythos Preview. So what we're doing is we're also saying we're going to slow down the pace of the AI models that we use day to day for day-to-day tasks. We're going to retard that function dramatically. And I don't think anyone else in the world is going to do that.
A
Well, I don't know that you have to slow down the model creation to just make a SWAT team from each company and say to each company, we want your five best brains on cybersecurity available. We're taking five from each company. There's going to be 25 of these. Maybe the smaller models can contribute one or two people, and they're going to be the brain trust that then builds a system. This is my proposal. If this is all true, which I'm 98% sure that Dario is not being hyperbolic, to raise the stock price, I'm going to take him on his word. If it's true, we could create a piece of software. We could create a new unit of government that then goes and says we're going to do our own red teams to try to turn off or overpower this nuclear power plant and have it melt down. And we're going to use these tools to see if it's possible. And then we're going to fix it. And we're not going to talk about it. We're not going to talk about it. You know that this group exists.
C
Yeah.
A
I mean, just a group of people who goes and says, what are the biggest vulnerabilities? And this is where immigration of the most talented hackers in the world matters. We should also be trying to get every single AI researcher in China to defect, and we should pay them a million dollars to defect with their families. We should figure out when they're on vacation in Singapore or if they can get out of the country to Australia to go on a holiday. We should figure out how to pick them up and let them defect, which we did during the Cold War with Russia. We need to get them to defect and bring them to our team and then put them in an air-gapped kind of space so we know they're not double agents, whatever, monitor them like crazy. We should be in a talent war to try to recruit these people out of these countries and then put them on our cybersecurity team. I bet you this is happening covertly. I bet you this is happening right now.
B
If it's only a million dollars a pop, that's the cheapest thing we can ever spend as a nation. But, you know, look, I don't like to talk about it much, but, I mean, Jason, there is a lot of rich people in tech, right? So why doesn't one of them just say, the first hundred million is on me?
A
Well, sure. I mean, this could come any number of ways. If the government's adding 1.5 trillion in spending for the military, you know, you got to think, like, we could put 100 billion from that spend into this very delicate area. It might mean that we need to have the government building this tool. We need to have a fork of it given to. And this is why that discussion where Emil said, anything legal we should be able to do with the tool, we'll buy the tool from you. And kind of Alex Karp's position is, hey, we make a tool. You decide how to do it. I think Anduril has the same position. Hey, we make these tools. It's up to the government to deploy them. It's not for us to tell them they can or cannot use it. And, you know, then we have to trust the government to not use this in some kind of crazy, abusive way. This is gonna be the story of the next year. I'm gonna predict it right now, and it's gonna get political, but this should be a galvanizing moment for America.
B
I'm getting a call, though. We're getting a call to the show from a dear friend of ours. I can pull. Ah, there he is.
A
Nick. Nick, are you okay? Nick, are you okay? It's my guy, our correspondent Nick. He's in Brickell. I think he's in South Beach.
E
Edgewater.
A
What is going on? I see you have something on your forehead here. What's going on?
E
Yesterday, I heard a mention that I was sponsored by somebody and that I wasn't disclosing it. And I just wanted to say, I don't know what you're talking about. It was brought up about Higgsfield paying for these things. I genuinely don't know what that is. I am in Edgewater, though, by the way, not Brickell.
A
Okay, so you're in Edgewater. And you can confirm for us that as much as Higgsfield would love to partner with an influencer of your international fame, dare I say, in crypto circles, technology circles, finance, I mean, even in the artistic community, with your adjacency to NFTs in the Miami art scene, you are confirming Higgsfield is not sponsoring you. Nick, you can confirm right now for us.
E
Definitely not sponsoring me. Despite the release of their new model, which integrates with Seedance 2.0, they are definitely not partnering with me. By the way, that doesn't work in the US. But they have not partnered with me, not paid me anything, nor has, actually... And I appreciate you asking me about what neighborhood I was in, because that brings up to me the LinkedIn jobs, which has nothing to do with me. And they're at LinkedIn.com/twist, for example. I've never worked with them, and I've never been paid for any of these things.
A
Okay, great. And just to be clear, the promo code Nicko getting 25% off: not a true story.
E
Not true. Never worked with them. Ever. Never worked with them. Never worked with them. Yeah, I just happen to be a big fan of these sort of products and services, but I just happen to be independently wealthy and don't need any cash flow from any of these brands.
B
Right, right, right. Nick, I gotta ask, though. I'm thinking about getting a new tattoo. I only have one, and I see you have a tattoo on your forehead that looks quite sharp. Where'd you get it done?
A
I think it's a little dirt. You might have a little smudge over here.
E
Oh, I didn't even notice that. That's crazy. I had no idea.
A
What?
E
Oh, I had no idea.
A
Nick, we appreciate you and thank you for clearing this up. Everybody wanted to know. Everybody use the promo code, Nicko, for 25% off your Roman Sparks, if you need Roman Sparks. Nick doesn't need it. He's all man. You need Roman Sparks fully functioning. Fully functioning. You can actually get that Roman Sparks 25 free Roman Sparks if you use the Nicko code. Well done, Nick. Appreciate you. There he is, our roving correspondent.
B
Yeah, coming in hot from Miami. Also, I just googled Roman Sparks on my work computer. Am I going to get fired? I didn't know what that was.
A
No, it's totally fine. Roman Sparks is a delightful little lozenge that... Rob will fill us in. Rob, you're a man of a certain age.
C
I don't know what that is either, but I'm making some guesses.
B
It turns out it's a. It's a hype. Yeah, I googled it. Anyways, I'm going to grab this by the tail and drag it back on topic here. Rob, very glad to have you here. Neurometric is the company and you made me go out and learn stuff. I'd heard of SLMs, small language models, but I wasn't sure what the parameter cap was, what they're good for. So first of all, what is the parameter cap, the differentiation point between a small language model and a large language model? And to put it politely, why do we care?
C
Yeah, so the cap keeps sliding, right? Just like the LLM cap keeps sliding. So Mythos, which we were just talking about, is a 10 trillion parameter model. So when that goes up, what's small in comparison is, you know, changes. I would say in general, people think about small language models as something you could probably run on a high end laptop. And that's sort of a rough.
B
Is that 10 billion parameters? Is that 20 billion parameters?
C
20 billion is normally sort of the cutoff these days. My guess is pretty soon people say anything under 100 billion parameters is small. But here's why you should care. The intelligence density, which is the amount of things the model can know for the parameter size it is, keeps going up. For LLMs, part of the way that these models learn more is they develop synthetic data training techniques and reinforcement learning techniques and new architecture tweaks, and those filter down to the small models. So like an 8 billion parameter model can do more this year than it could last year. And so if you're not trying to hack the world's entire code base, right, if you're trying to build a model that reconciles a bank statement with your QuickBooks or, you know, predicts customer churn in your customer success funnel or things like that, these small models, as they climb up in their capabilities, we predict that by 2030, 90% of common work tasks will be able to be done by like a 10 billion parameter model or smaller.
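A rough way to sanity-check the "runs on a high-end laptop" rule of thumb is to estimate how much memory the weights alone need: parameter count times bytes per parameter. The sketch below is an illustration, not something from the conversation. The function name is made up, and it ignores activation and KV-cache memory, so real requirements are somewhat higher.

```python
def approx_weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory to hold the weights only, in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 1e9


if __name__ == "__main__":
    # Sizes mentioned in the discussion: 8B, 20B (today's rough cutoff), 100B.
    for params in (8, 20, 100):
        for bits in (16, 8, 4):
            gb = approx_weight_memory_gb(params, bits)
            print(f"{params}B parameters at {bits}-bit: ~{gb:.0f} GB of weights")
```

Under these assumptions a 20 billion parameter model needs about 40 GB at 16-bit precision and about 10 GB when quantized to 4-bit, which is why the cutoff lands near what a well-specced laptop can hold.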
A
So what impact does everybody having the equivalent of today's Mac Studio have? I just upgraded to a Mac Pro 14 inch with 448 gigs of RAM. I could clearly run an SLM on a Mac Studio. When you get to 256 gig, 512 gig, you can start running Kimi and other things, I guess DeepSeek. Those DeepSeeks and the Kimis, those are not SLMs. Those are straight up open source LLMs. They require server level memory, server level CPUs. But people are starting to do SLMs on their latest iPhone, they're doing it on their latest laptop. So this will be embedded into every device eventually?
C
Yeah, yeah, yeah, it definitely will. And what's interesting about them is so if you look at what's happening with large enterprises that are AI forward, which is not very many people right now. But if you look at companies that have deployed AI for a couple of years now, what they typically do is they start with a frontier model, right? Anthropic or OpenAI or maybe one of the open source ones, and they run all their tasks through it and as their inference charges climb up, they start to go, huh. Well, which of these tasks can we put to smaller, cheaper models? Because the bigger models are more expensive to run. Because when you run a layer of a neural network, you have to shuffle it in from memory into the compute, calculate it, write it back out to memory and save it. So bigger models take more memory, they take longer to run and everything else. These smaller models you can run on lower hardware. You know, hardware that's six, seven, eight years old, that's refurbished. I mean we even joked at Neurometric about like, can we just buy a bunch of phones and like line them up, connect them to the Internet and run small models on them and serve those for people. Right? But, but these small models, there are things you can do to tweak them per task. So if you need to do a finance task or a sales task or whatever, you can, you can make them really, really...
A
How do you tweak it? So if I wanted to say, take my photo archive and have all of that managed locally. I didn't trust Apple, Google Photos, whatever, but I just wanted all these photos on my desktop, on my laptop, and I wanted to tag them all and I wanted to say, hey, this is this person's face without having that go up to a frontier model. So now they know, oh, that's Lon in your photos. That seems to me to be like a great way to tweak this. Or if you just wanted it for writing or you were an Excel jockey, you would want to just have one tweaked for Excel. Are those models already pre-tweaked? Can you get a flavor of that model on GitHub or on Hugging Face?
C
Yeah, for some of them you can get a flavor of it, but the flavors tend not to be work task related. They tend to be things like Qwen has a series of SLMs that are called Instruct. And so they do instruction-following tasks better than say generative writing tasks or something like that. But there's two main ways to sort of make these specialized. One is to augment your prompts with a whole bunch of stuff, and that can get hard with the context window limits and everything else. But you add this stuff like imagine you're a CPA and you have all this knowledge and you can throw the whole GAAP accounting standards in there into your prompt or whatever and pass it to the model. The other way is to fine tune it. The best way to do this is what they call distillation. And distillation is when a bigger model teaches a smaller model something. So you can think of us as setting up a system where let's say you have a task that you're having, you know, Anthropic or OpenAI do. And maybe that task is like, hey, I have a thousand pages here from industry reports on the energy industry and I'm an investor. So GPT-5, take this report and pull out all the stock symbols in here because I want to go research, right, and see what they should do. So if you see that information go in with the prompt and you see GPT-5's response, and you get that back enough times, right? That prompt and response is a data set you can use to distill a small language model just to do that given task. Now depending on how much you do, if you only do that task once in a while, it doesn't make any sense. But if you're like we do this every day, a thousand times a day for all these pages of reports, you could probably save 90% by doing that. And in fact there's a story in VentureBeat in February about how AT&T got to the point where they were spending 8 billion tokens a day on their AI infrastructure. So probably a couple hundred thousand dollars a day.
They re-architected everything, using frontier models for 10% of the tasks and SLMs for 90% of the tasks. And like dramatic, dramatic improvement in both speed and cost.
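The data-collection half of the distillation idea described above can be sketched in a few lines: log each prompt sent to the big model together with its response, then write the pairs out as training records for a small model. This is a minimal sketch under stated assumptions, not Neurometric's pipeline. The function name, the `prompt`/`completion` field names, and the sample data are all hypothetical, and the actual fine-tuning step is out of scope.

```python
import json


def build_distillation_records(logged_calls, min_response_chars=1):
    """Convert logged (prompt, response) pairs from a larger "teacher" model
    into JSONL lines that a fine-tuning job could consume.

    Pairs with an empty response and exact duplicates are dropped.
    """
    seen = set()
    lines = []
    for prompt, response in logged_calls:
        prompt, response = prompt.strip(), response.strip()
        if len(response) < min_response_chars:
            continue  # nothing for the small model to learn from
        key = (prompt, response)
        if key in seen:
            continue  # exact duplicate of an earlier pair
        seen.add(key)
        lines.append(json.dumps({"prompt": prompt, "completion": response}))
    return lines


if __name__ == "__main__":
    # Hypothetical log entries for the "pull out the stock symbols" task.
    log = [
        ("List the stock symbols in: Exxon (XOM) and Chevron (CVX) rose.", "XOM, CVX"),
        ("List the stock symbols in: Exxon (XOM) and Chevron (CVX) rose.", "XOM, CVX"),
        ("List the stock symbols in: no companies were named.", ""),
    ]
    for line in build_distillation_records(log):
        print(line)
```

The point about volume carries over directly: a dataset like this only becomes useful once the same task has been logged many times, which is why the approach pays off for work done a thousand times a day and not for one-offs.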
F
Yes.
B
The line from the VentureBeat story is "8 billion tokens a day forced AT&T to rethink AI orchestration, cut costs by 90%." Rob, can I just narrow down and focus on one of your SLMs really quick? Because I think it's a good example. So here is one you guys built called Deal Sieve. And it's a task specific model that compares target company profiles against a firm's investment thesis. Pretty pertinent to what Jason does here at Launch. And it's based off of Qwen3-4B-Instruct-2507. Okay, so is this one that you made via distillation? How did this one in particular come together?
C
So it's a good question. We have, we have an automated process behind the scenes that does a lot of this. So I don't know on this specific one. But look, our, our bigger goal as a company is just make intelligence free, right? Why would we want to do that? Because it's, it's the Jevons paradox thing, right? The more the cost comes down, the more, the more people are going to use it for more tasks. I think the better the world will be if, you know, good people control AI. And then, and so people are like, well then what do people pay for? Well, you're going to pay for an SLA. You're going to pay for testing, you're going to pay for analytics, you're going to, like, you're going to need stuff to manage all this intelligence. So yeah, so like you can go and you can download these models. You can, you know, we give them to you for free if you, if you want.
A
What's the website again? So people who are listening can go there now and try it.
C
It's just marketplace.neurometric.ai. And then today... Do you want me to talk, Alex, about the Claw Pack?
B
We're going to talk about the Claw Pack. One second before we get to that, Jason, because you said something very important, which is you want to make intelligence free. And what's incredible, Jason, about that comment from Rob is that he's not that far off from it already. They offer 100 million free tokens a month if you want to have Neurometric handle your inference. And it's two bucks a month per model. Per month? Per month. No token limits. So, Rob, how the hell can you afford to do that? Are these so cheap that effectively they're already free?
C
Well, you can run, you can run these on old hardware. And then the second thing is, you know, we're a seed stage company, so we're learning the real usage distribution. But, you know, at the beginning of the show, Jason talked about Backupify. Backupify offered unlimited storage for, you know, Google Apps, Salesforce, Office 365. And people would be like, how can you offer unlimited storage for $3 a month? Because we knew the distribution of what people actually used. And I think what companies want here, and particularly prosumers and OpenClaw users and Claude Code users, but, but also enterprises, is like, they want to not have to think about all these innovations that keep coming to drive down the cost and how they apply them and how they roll them out. So I think if we can manage all that and we can bear the token risk, I think it's a great business model.
A
But you're not doing the hosting, or you are doing the hosting?
C
We will do the hosting. We don't have to. You can download it, we can deploy it wherever you want, manage it for you, but if you want us to host it, we'll do it.
A
I think this is going to become the key. Anybody who goes deep down the OpenClaw or even Perplexity Computer, which is awesome, or Claude Cowork, at some point you're like, am I getting enough value from spending $1,000 a day, $365,000 a year? Now, if your business is printing money and you don't have to hire the fourth developer, you know, okay, fine, but in other circumstances, you're going to be like, you know what, this task, I have tasks I want to do, which I'll talk to you offline, Rob, that are not necessary for my business. But we have to judge the startups that are coming in. But I would love to be running back testing on every startup I've ever met with in my life. The founders, where they wound up and be saying, okay, let's examine every 2009 startup, every 2010 startup, every 2011 startup, and tell me, look for some patterns, look for the talent there. Which talent created diasporas of Googlers or people who worked at Uber who went on to do other companies. There's all kinds of intelligence that I could see myself using, but it's not a priority. And I would look at the $100,000 in token costs and say, not worth it. But at $1,000 or $10,000, yeah, it might be worth it. I think a lot of people who are the tip of the spear here, playing with this technology, they're starting to come to that realization: there's things I want to do that don't make economic sense today. But if, if the tokens were cheaper, I would do it. Why not?
B
You know, Rob, if only there was a brand new product out there called Claw Pack that for only $8 a month, got you unlimited inference. You want to tell us about it?
C
So as soon as we put this marketplace out, one of the things people started using it for was obviously OpenClaw, because that is, man, you go on Reddit and people are complaining like crazy about their OpenClaw prices. So we said, okay, let's do some research on what are the top OpenClaw use cases. And let's take 39 small language models that we package together under one API. So we host 39 models for you. They do common tasks like social media posting. You don't need Claude Mythos to write an email headline, right, a subject line. So it does all this stuff. You get 100 million tokens for free on it, and then after that you pay $8 a month, unlimited tokens. We manage the token risk and there's a lot of ways to do this. The other thing, Alex, that we haven't talked about and you didn't bring up, but there's this new emerging concept in tech called harness engineering. And harness engineering is the thing you wrap around the model to make it do things. And one of the things we noticed in our research on small language models is one of the reasons you can't use them for complex tasks is they kind of tend to go off and lose track of what they're doing where the big models don't. But if you put a nice harness around it that helps it stay on, that says stuff like, hey, every time you do a step of a task, check back in with me and make sure you're on the right task. Like, those kind of harnesses are easy to write. Claude Code can write you one for a specific task and you plug in a small model and run it. And that harness keeps the model on task. So we have a bunch of innovations like that that enable us to. And we'll just keep driving down the cost. And other people are doing stuff to drive down the cost.
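The "check back in with me after every step" idea can be sketched as a small loop around any model callable. This is only an illustration of the pattern as described, not Neurometric's harness. Every name here (`run_with_harness`, `is_on_task`, the stub model) is hypothetical, and the stub stands in for a real small model that drifts on its first answer.

```python
from typing import Callable, List


def run_with_harness(
    model: Callable[[str], str],
    task: str,
    steps: List[str],
    is_on_task: Callable[[str, str], bool],
    max_retries: int = 2,
) -> List[str]:
    """Run a multi-step task one step at a time, checking each answer and
    re-prompting when the model drifts off the current step."""
    outputs = []
    for step in steps:
        prompt = f"Overall task: {task}\nCurrent step: {step}"
        for _attempt in range(max_retries + 1):
            result = model(prompt)
            if is_on_task(step, result):
                outputs.append(result)
                break
            # Check-in failed: restate the task and step, then try again.
            prompt = (
                f"Overall task: {task}\nCurrent step: {step}\n"
                "Your last answer drifted off task. Answer only this step."
            )
        else:
            raise RuntimeError(f"Model stayed off task on step: {step!r}")
    return outputs


def make_flaky_model() -> Callable[[str], str]:
    """Stub model: wanders off on its first call, then follows the step."""
    calls = {"n": 0}

    def model(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return "Here is a poem about llamas."
        step = prompt.split("Current step: ")[1].split("\n")[0]
        return f"DONE: {step}"

    return model


def answers_the_step(step: str, result: str) -> bool:
    """Toy check: the stub marks on-task answers with a DONE prefix."""
    return result == f"DONE: {step}"


if __name__ == "__main__":
    results = run_with_harness(
        make_flaky_model(),
        task="write a launch email",
        steps=["draft subject line", "draft body"],
        is_on_task=answers_the_step,
    )
    print(results)
```

In a real setup the check would itself be a cheap classifier or a rule specific to the task, which is what makes a harness task-specific.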
B
So, Rob, on the Harness point, regarding SLMs, is this like a dog sled? Do I have like one chain of harness that has multiple dogs, aka SLMs that it can work with or is it like one harness per SLM?
C
It's both. Today it's mostly one harness per SLM because they're task specific. But you can already see with the Claw Pack, right?
B
Yeah.
C
They're starting to work together and what you're going to start to see is swarms, I think, of SLMs that'll do common work tasks and you'll always need 20% of your workloads to fall back over to the frontier models because they're one-offs or you get lost, you don't know what it's doing. So you're always going to need a combo ensemble system. But I think we can take people's core workloads and drive that cost way, way down.
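The combo system described here, task-specific SLMs for the core workload with a frontier model as the fallback, amounts to a small router. The sketch below is one hypothetical way to write it. The `route` function, the handler registry, and the convention that a handler returns `None` when it cannot do the job are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Tuple


def route(
    task_type: str,
    payload: str,
    slm_handlers: Dict[str, Callable[[str], Optional[str]]],
    frontier: Callable[[str], str],
) -> Tuple[str, str]:
    """Send a task to its specialized small model if one exists and succeeds;
    otherwise fall back to the frontier model.

    Returns (which_model_answered, answer).
    """
    handler = slm_handlers.get(task_type)
    if handler is not None:
        answer = handler(payload)
        if answer is not None:  # None means the small model gave up
            return "slm", answer
    return "frontier", frontier(payload)


if __name__ == "__main__":
    # Stub models standing in for real endpoints.
    handlers = {
        "subject_line": lambda text: f"Subject: {text[:30]}",
        "summarize": lambda text: None,  # pretend this one got lost
    }
    frontier_model = lambda text: f"[frontier] handled: {text}"

    print(route("subject_line", "Quarterly results are in", handlers, frontier_model))
    print(route("summarize", "A very long report", handlers, frontier_model))
    print(route("one_off_research", "Something unusual", handlers, frontier_model))
```

One-off task types never get a handler, so they land on the frontier model by construction, matching the "20% falls back" picture above.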
B
Yeah. Jason, the power of branding. Remember that moment in time in which we all said, oh, if you're an AI wrapper, you're doomed. But now if you're an AI harness, you're the tip of the spear.
A
Well, there's two interesting observations here for startups. One is every Reddit bitch thread where people are bitching about something they hate is a potential startup. This is like a super important thing. Somebody needs to create for me a skill for my OpenClaw to just look for these startup opportunities from people saying, this product sucks and I would like, you know, why can't they solve this very simple product problem? I would pay a lot of money for a tool that did just that. Because it would be like the request for startups that we do, YC does, other people do. It'd be like, these aren't my requests for a startup. And my intuition as a VC, a 55 year old guy in Austin, Texas, what I want is irrelevant compared to what the world wants, as demonstrated in a subreddit about music where somebody wants a specific tool to do a specific task. And 1,000 people participated in that thread. That's really an interesting, and it's an interesting go to market movement because you can go to that thread and say, hey, I built something. Would you test it for me? Would you try it? And then people are like, wait, you made me a custom piece of software? Okay, great. And then startups now are not about who can build the product, it's about who doesn't stop building the product. Startups aren't about who has the resources to build the product. It's about who will not stop building that product and refining it. In other words, it's now a test of your resiliency, your passion for the vertical. Will you just keep working? Because if anybody can build anything, then, well, who's going to build the next three or four features of this meditation app or this fitness app, whatever it happens to be, this enterprise piece of software? It's really just who's willing to go on that product march for 10 years and not give up.
B
I think Aaron Levie's the ur-example of that from the SaaS era, but I think we're going to need to find out who that is for the AI era. We're talking a lot, though, Jason, kind of around this Marc Andreessen tweet. He agrees you can't spend $1,000 a day on OpenClaw, and he says it's actually heading to $10,000 a day if you really want to have a magical experience. And then he says the future shape of the entire tech industry will be how to drive that to 20 bucks a month. So, Rob.
C
Yeah?
B
How far can SLMs get us? Can they reduce our OpenClaw spend by 80% in time, 90% in time? And how quickly can you cut my bill down? Because my wife's not happy.
C
Depends on what you use it for, obviously. Right. Because there's some tasks SLMs can do and there's some that they can't. But the important thing is people are going to use more tokens every year and SLMs' capabilities are going to increase and there's a bunch of costs that are also making it more efficient to run those every year. So I would say today, for an average OpenClaw user, probably 70% reduction, you'll still have to use Claude for some things or OpenAI or whatever, but you're going to, you're going to have this weird paradox, right, which is you're going to do so much more with AI that like, your per unit costs are going to come down, but you're going to spend more because you're going to turn over more parts of your life to it and you're going to do more things.
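The arithmetic behind a figure like "70% reduction", and behind the paradox that follows it, is simple enough to write down. The sketch below is illustrative only: the 80% share of spend moved to SLMs and the SLM price of one eighth of the frontier price are made-up inputs chosen to land on 70%, not numbers from the conversation.

```python
def blended_savings(fraction_to_slm: float, slm_relative_price: float) -> float:
    """Fraction of the original bill saved when `fraction_to_slm` of spend is
    moved to a model priced at `slm_relative_price` times the frontier price."""
    if not 0.0 <= fraction_to_slm <= 1.0:
        raise ValueError("fraction_to_slm must be between 0 and 1")
    if slm_relative_price < 0.0:
        raise ValueError("slm_relative_price must be non-negative")
    return fraction_to_slm * (1.0 - slm_relative_price)


def usage_growth_to_break_even(savings: float) -> float:
    """How many times usage can grow before the new bill equals the old one."""
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in [0, 1)")
    return 1.0 / (1.0 - savings)


if __name__ == "__main__":
    savings = blended_savings(fraction_to_slm=0.8, slm_relative_price=0.125)
    print(f"Bill reduction: {savings:.0%}")
    print(f"Usage can grow {usage_growth_to_break_even(savings):.2f}x before the bill is back where it started")
```

The second function is the paradox in one line: at a 70% reduction, total spend is back to the original level once usage grows a little over threefold, so falling unit costs and a rising bill are compatible.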
A
And, you know, I have a prediction.
C
Let's hear it.
A
I have a prediction there and I'll unpack it. There is a possibility that if we are reaching AGI right now, which I've said, hey, we've reached AGI, it's just not distributed yet. It's just not implemented yet. And you believe in superintelligence and recursive learning, which everybody here believes, then there is no doubt in my mind that LLMs will get so small and there'll be so many verticalized ones, because building a legal, accounting, design, typography, I mean, I don't know how many steps down you can double click. There is a possibility that these SLMs being fractured, being essentially like skills, instead of building a skill in OpenClaw, there's an SLM that does that skill. It's trained specifically and somebody makes it better and makes it so good that this could collapse the value of the frontier models. Because how does a frontier model sell into a law firm or an accounting firm? Like hey, we've got this new amazing thing to help you solve these legal or accounting or design issues. When there's an SLM that the person goes good enough, that's a good enough logo, that's a good enough font, that's a good enough non disclosure agreement, good enough happens and then the ability to sell a note taking app, which we saw there were dozens of note taking apps. This was like a great business to be in, dozens of...
C
Now, Evernote was a unicorn.
A
Evernote was a unicorn. It's a perfect example. And now it's like, well, notes. I can make one notion. It's just these things eventually become deflationary. This could be the deflationary moment for the frontier models. They may not realize it, but they might have created their own demise. Yeah, they may have just created their own demise.
C
Dario's going to have to buy Neurometric. I mean, I don't know.
A
But who's going to pay for these things? If an SLM exists that does it, or a TAO subnet solves that problem for you in a distributed computing way. This is so deflationary that we need a new word for deflationary. There's deflationary. But what is hyper deflation like? It's not just deflation where it's like it's going to get 10% cheaper a year. What if it's, it's going to get 90% cheaper every month? Like then the compounding deflation, the hyper deflation could be so acute that just things like you're saying get to free, Rob, and that's just kind of a mind blowing exercise to do in your mind.
C
Well, you're going to, I mean the price of compute is collapsing pretty fast as well as people on a per petaflop basis or however you want to measure it. But I will say you will always come up at least against the cost of buying the. I wouldn't even say GPU because there's other chips coming, but that plus the energy, so. So there will be some floor to a unit of intelligence that'll start to advance, decline slower because of the laws of physics.
B
Right? Yes. But I think as the floor rises in the quality of open source models and cheap SLMs for specific tasks, as Jason points out, yes, there's less room for frontier models to charge. But I think also that the companies that are desperate to have an edge will, because everyone else is going to be defaulting to the cheap stuff or cheaper stuff. And so I think there's still going to be some market there, Jason. I just don't think that everyone's going to be paying Opus 4.6 level pricing for tokens in five years. But the question then just becomes, will Jevons paradox drive token usage up enough in that same time period to have these still be growth businesses? But I'm still pretty bullish on improving the frontier of intelligence because as Mythos proves, there's still so much left to come. I don't want to start thinking about ending that run, Jason. I want to go up the damn mountain to the top.
A
Compute performance per dollar has improved roughly 40% per year across 20 AI accelerators released between 2012 and 2025. The GB300 costs nearly 9x the P100's release price, but delivers 24 times the performance per dollar. So that's all that matters is performance per dollar. 24 times. That goes to this hyper deflation concept.
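The two figures quoted here can be checked against each other with compound growth. The sketch below assumes a nine-year gap between the two chips (P100 in 2016, GB300 in 2025); that gap is an assumption added for the calculation and was not stated in the quote.

```python
def compound_multiple(annual_rate: float, years: float) -> float:
    """Total improvement multiple after compounding `annual_rate` for `years`."""
    return (1.0 + annual_rate) ** years


def implied_annual_rate(multiple: float, years: float) -> float:
    """Annual rate that compounds to `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


if __name__ == "__main__":
    years = 9  # assumed gap between the P100 and GB300 releases

    # What does 40% per year compound to over that span?
    print(f"40%/yr over {years} years: {compound_multiple(0.40, years):.1f}x")

    # What yearly rate does a 24x gain in performance per dollar imply?
    print(f"24x over {years} years: {implied_annual_rate(24, years):.1%} per year")

    # 9x the price at 24x the performance per dollar implies this much raw performance.
    print(f"Implied raw performance gain: {9 * 24}x")
```

Under that assumption, 40% a year compounds to roughly 21x, and 24x works out to a bit over 42% a year, so the two numbers in the quote are broadly consistent with each other.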
C
If you think about it as an investor, Jason, it's going to change the way that you think about defensibility. And I think one of the possible outcomes of this is it starts to enable a type of business that I don't know if it'll get venture funded, but it enables a type of business run by 1, 2, 3 people. Sure can do. They can, they can work in small TAMs, make $30 million a year, drop 9 million to the bottom line and.
A
Or 29 million to the bottom line.
C
Yeah, exactly. It's going to be super, super interesting to see like how you adapt your angel into.
A
I mean, the way I've been through this before, when storage became free or was trending towards free, YouTube and Netflix became viable. Mark Cuban was like, Netflix will never work. The infrastructure's not there. If everybody had Netflix and was streaming HD, it just wouldn't work today. And he was right. But the compounding effects of deflationary technology, in that case it was the rollout of fiber. In the case of hard drives, it was what those disks could store and at what price. And hardware scales differently than software. But you have two compounding effects here. The AI is so good that it's making the models more dense. To your point earlier, the density of the model and the specificity of the model, that could have a more dramatic effect than the hardware curve. But then you add the hardware curve to it. This is where I think this 40% cheaper or tokens are 90% cheaper, we might be greatly underestimating what superintelligence does. It might be that superintelligence just rams this down 99% a year maybe.
B
But the humans are doing quite well as well. So Meta, their new model they dropped today, Muse Spark, in their post talking about this, Jason, they talked about how they're getting more efficient with their compute. They said they rebuilt their pre-training stack and they said these advancements increase the capability we can extract from every unit of compute. So it feels like every possible vector.
A
But why does meta still suck at doing anything with AI? Like their AI search sucks on like Instagram. Nothing works. It's a disaster.
B
Their new model is not hot, hot garbage. I have the, I have the chart here. This came out right before the show, so no one's seen it, really. Artificial intelligence. Sorry. Artificial Analysis, on their Intelligence Index, points out that the last Meta model we got, which was Llama 4 Maverick, which is now complete garbage compared to the state of the market, has now been replaced by Muse Spark, which is now the fourth best model out there. Now we don't.
A
According to who? What is that?
B
ArtificialAnalysis.ai. They run a series of very good benchmarks and this is kind of a meta benchmark that I pay a lot of attention to. Now we don't know how it's going to perform in the market. I'm not trying to say this is the best model since sliced bread, but I'm saying that it's a really pleasant surprise.
A
But what is their strategy like? Okay, they leapfrog their last model, they're trying to catch up. But what is the business case here? Like, what is the user case here? I don't see a user case for what they're doing right now. Like what is it? Actually, I understand you could serve better ads, but. Go ahead, Rob. I think it's Zuckerberg.
C
Yeah, I think for them, I think initially Zuckerberg thought he would hurt his competitors. He would hurt, you know, OpenAI, he would hurt Google by open sourcing stuff. Now I think they're looking at COGS improvement because I also know they're building a special chip. Right. They announced this in a conference late last year. So it's just an AI recommender chip. Probably Meta, Netflix and Amazon might be the only three companies in the world that could spend the money to do a recommendation specific chip and have it make economic sense. So maybe that's how they're thinking about it.
A
Just make people that much more addicted to a product that's already so addictive that it's being regulated and banned for Turkey. He's under 16. Okay, great. What a stupid idea. Like, the worst possible thing they could do is use this technology to make an already smoking level addiction, a heroin level addiction, worse for children and adults. That's what I'm saying is where's the vision here? What is he trying to accomplish? I haven't heard from Zuck.
C
There's no change the world thing that I see.
A
There's nothing. And you know what? This proves my point about him. He's great at copying other people's ideas. Snapchat, LinkedIn, whatever, Facebook Marketplace, eBay, and Craigslist. But where is the original idea from that corporation of how to deploy something unique in the world? It's pathetic on that vector.
B
Nothing to add, Jason, from my opinion, but I asked Alexandr Wang, formerly of Scale AI, now at the Meta Superintelligence Lab, is it gonna come out so that I can use it in my OpenClaw setup? And he told me that they will release the API soon and it will power some claws. So at a minimum, we can put it to the test soon enough and see if it's any good. See if it's.
A
You know, if Zuckerberg were to put his entire energy on copying OpenClaw, I would be very nervous because he's so good at stealing other people's ideas and doing them better. Yeah, like, he would create OpenClaw times 10. So, yeah.
B
Oh, that's a challenge.
A
Let. No, don't give him any ideas.
B
All right, Mark, you heard it here first. We need. We need Meta Claw. All right, let's bring up Gyani. Gyani is the man behind probably my favorite and least favorite tool online today, Jason. It's called Death by Claude, and, well, you know what? I'll let the man himself explain it. Gyani, what is Death by Claude and why? Why did you build a tool that insults me?
F
Oh, hey, thank you for having me on the podcast. So what is Death by Claude? You put the URL in there. It can be a person, it can be a company, and it'll critique whether it's an AI wrapper or something that can be replaced. Why did I build it? I'm running a company myself, and we have been mogged by Anthropic a few times. We were building something like Cowork, and then Cowork came out, and their margins are bad. If you try selling Cowork at Anthropic's price, you cannot do it. So we've been killed by Claude a few times. We read the Citrini piece that was making the rounds. I read the tweet by Ryan Peterson where he said, okay, show us this.
A
Show it to us. Here you are going to savage somebody personally, like a roast comic, or you're going to savage a startup or a business idea. What's the best example of this at work?
B
Gyani, should I bring up the. The one I did about my own newsletter?
A
Yes.
B
Okay. I roasted myself for everyone's enjoyment. So here is the. Here's what Death by Claude kicked out for Cautious Optimism. I have an 89 out of 100 "already dead" score. Gyani, how do you calculate the scores that you're giving to each company or product?
F
So if you were like a hardware company or a science company, you basically have a moat. And if you're a blog, then you're probably dead. So if you can get replaced, then you score high, and if you cannot, then you score low.
B
Yeah. So I'm pretty much doomed. And Jason, what this service does is not only does it find different ways to tell you that you're replaceable, it creates an entire skill.md file to replace the thing that you're doing. And then also it gives you a death certificate. And it says that my newsletter was cautiously optimistic about its own survival, but it taught us that the real disruption was the Substack fees we paid along the way before it died.
A
Brutal.
B
But the question is really, like, how much of the stock market do you think, Gyani, is really at risk of being replaced? Because you get.
A
Yeah, do Uber. Do Uber Door. What would be one that would be.
B
Hit me.
A
No, let's do. I'm trying to think of. Well, I gotta be careful here because I don't want to sling mud at anybody, but.
B
Oh, okay, let's do Anduril.
A
No, no, no, no. Don't start that up.
F
I think those sound like safe companies to me.
B
Yeah. Because hardware. Right. I mean, basically companies that rank well have a low score.
A
Yeah, but let's not do Anduril. Let's do. Let's do. Do Slack. Slack.
C
Okay.
A
Or Chegg is a good one. The textbook company.
B
Well, Chegg's already dead, but okay, we'll do Chegg.
A
Well, Peloton or Chegg, like, these are things that. There's a group of people who are trying to pump Peloton right now. It's an interesting one. Chegg: 92, already dead.
B
Yes. Let's see what it has to say. It's just CRUD. It's an AI wrapper. It has no moat depth. It's Markdown-replaceable and it's expensive. And the skill file to replace it: 31 lines. And yeah, brutal. Just absolutely mogged. I feel like every founder should be trying this tool out.
A
Do up. Do Peloton. Peloton one.
B
I'm on it.
A
What else you got, Rob? Who else do we want to figure out how dead they are? Because Peloton to me is not dead. They. They basically had all the suckage taken out. And Peloton is actually defensible now because people love their brand. They love the hardware and they love the people who are the teachers. They're addicted to those teachers and that brand and those teachers is defensible.
C
Yes.
B
And in this case, 32 out of 100. Much better than my blog, much better than other services that we looked at. It's not just CRUD. It's got reasonable moat depth. Yes, it is expensive, but you can't replace a bicycle with code, I suppose. Let's do one more for fun.
A
Lovable.
B
Lovable.
A
Everybody's saying that vibe coding. Yeah, people say vibe coding is dead. And like, it's. But Lovable keeps having revenue grow. But then there was some folks saying maybe revenue was stalled or a lot of people canceling their subs.
B
All right, 78 out of 100 score on Death by Claude for Lovable.
A
Why? Why? Why? I thought it would be 60. Okay, go ahead.
B
Lovable built an AI that writes code for you, which is adorable because Claude already writes code for you without the $20 per month middleman tax. Brutal. And then a 31-line prompt for a full stack app generator. Yeah, brutal. Cause of death: terminal AI wrapper syndrome. The underlying model got too good and ate the wrapper alive.
A
All right, G man, G Man. Why did you build this? Who are you? Where are you located right now? And why did you build this?
F
I live in London. I am building an AI startup myself. I was building something like Claude Cowork, and then Anthropic released before us. It's very expensive to serve inference. So Rob is a friend, I guess, in the future. So I was reading the Citrini piece. I read this piece by Ryan Peterson where he said Harvey is replaceable by Claude for legal. And that's like a real feeling. Like we have been mogged by Anthropic a few times. So we decided to run our startup and YC's entire portfolio, and the results are very funny. So we put it on the Internet.
A
This is becoming a recurring thing, Rob: using AI to give a defensibility ranking. That's basically what you're giving and I think it's genius. A defensibility ranking is a great idea. We do this with founders organically, Rob, in Founder University, you know, or the LAUNCH Accelerator. And we tell them, hey, here are your competitors. How are you different than these competitors? Or hey, if Claude releases this, what would you do? G Man, if I may call you G Man. G Man, you got your ass kicked. And you said, I want to create a tool that prevents me from getting my ass kicked again. What are the two or three things that prevent an ass kicking of a startup, based on what you've learned?
F
A few things come to mind. So if you're doing hardware, we don't have physical models yet. So when Travis succeeds with atoms, I think that's a space that goes away.
A
But hardware is number one. Got it. And that, by the way, is becoming. Now we went from hardware is hard to hardware is a great way to be defensible. And hard means defensible. Now engage. It means moat. Hard means moat today. Whereas hardware meant death and not fundable, it now means moat, highly fundable. Very interesting. What's the number two?
F
Network effects are like very, very helpful still. Like, if you're WhatsApp, you will not be replaced by Claude. I don't think any AI company at the moment has a network effect, so I would be on the lookout for it.
A
So network effect, number two, network effect being, hey, how many people are participating in this? Like a marketplace, like Uber? Hey, they've got 20 different AV companies putting cars into the system. They're in, you know, 10,000 cities, they've got 7 million drivers, they've got, you know, 10 million restaurants. Okay. That makes it more defensible. Network effect, great. What's number three?
F
If you're doing something deeply scientific or in a regulated industry which requires talking to people, I think you're pretty safe as well. Yeah, that's the thing.
B
G, You're. You work.
A
G Man, please. G Man.
B
Sorry. G Man.
A
He's the G man.
B
You're behind tabtab. Tab AI, right? That's your main company.
F
That's the thing that we were working on before we started pivoting, because our score was 89 on the hitcheck website.
B
Oh, no, I was. It's actually 92 now. Your score has gone up. I just thought it was very funny that your own product told you that you need to pivot and off you go. So there you have it.
A
Rob, you got a question for the G Man, or do you have an observation here about this as it relates to being a serial entrepreneur yourself?
C
Well, defensibility has always been hard, and I do think it's particularly hard now. But, I mean, I think he's on the right path. Right. I think this is making everybody a little more like a CEO, because as a CEO, you realize, like, you have to review stuff, you have to coordinate stuff and there's some limitations to what you can do. Like at some point you have to decide, like, what are you doing? Right. And these tools help you do stuff. But yeah, I think it's. I don't know if we talked about this, Jason. I stopped. You were one of my mentors for angel investing 11 years ago when I sold my first company. I've quit because of this. I think it's too hard now.
A
There is something to it. It is very hard to be an early stage investor. Even if you look at people doing angel investing and getting into a big round, you got to get in at a higher valuation. So even the idea of hitting 6000 or 7000x, which I think is where Uber might have peaked for me, or hitting a 500x, which is, I think, what Robinhood did, or 100x, whatever it is, Calm, 100x, it's going to get harder and harder to do that because the entry-level valuation is an order of magnitude higher. Dilution might be higher. It's just hard. It's never been easy, I'll say that.
C
Yeah, yeah.
B
Tricky.
A
Oh, no, you're not doing. You. Oh, I don't know. Should we do Neurometric? Let's do it.
B
Oh, you wanna, you can do it live.
C
I don't, I don't care.
B
All right.
A
Maybe there's something you learn here.
C
We'll figure out if we have to, we'll figure out if we have to pivot.
B
All right. This is the meanest thing we've done to a guest on the show in at least a week. Here we go.
A
I, I, I know the score already and I know why it's giving that score. But I can, I can explain why this score will get better over time.
B
Still calculating the exact level of doom.
A
Here it goes. Let's forecast that. Alex, what is it saying?
B
It's saying, "counting the lines of Markdown needed to replace," "cross-referencing with the obituary database." Gyani built one of the funniest tools I've ever seen. It's just consistently hilarious.
C
Not a great score, but not terrible.
B
100.
A
I think you can get that number down, because a network effect would be an interesting one, if you could make your platform, Rob, into a place where anybody can put up. This is my idea. If you became a marketplace of SLMs where anybody could post an SLM and put a price on it and you would be the reseller and hoster of it, then you become more like eBay or a marketplace of these ideas. Like Hugging Face. Right. So if you were the Hugging Face of SLMs, like, put Hugging Face in there. I wonder how it thinks about Hugging. Hugging Face. It might also think that Hugging Face is not replicable.
C
Yeah. The thing we've been working on. So there's an open source project called Harbor. And Harbor is a collection of thousands of tasks with their evals. And currently people in the world think about benchmarks, not tasks. Right. But a benchmark is a series of tasks. And what we've shown, and a lot of the thesis behind Neurometric, is that you want to choose the right model for the task. And even amongst frontier models, they don't all perform the same on a given task. So even the model that wins the benchmark doesn't win all the tasks within the benchmark. So ensembles of models will 100% of the time beat single models. Like, you just. You can't make one model the best at every single thing. And so to your point, we're kind of going in that direction that you're talking about, Jason. But it hasn't all been released yet. But it's a really great insight, which is why you're such a good early stage investor, man.
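Rob's point about picking the right model per task can be illustrated with a small sketch. Everything here is an assumption made up for illustration: the model names, the task names, and the scores are hypothetical and do not come from Harbor or Neurometric. The sketch also uses simple per-task routing (send each task to its best-scoring model) rather than any particular ensembling method, and it assumes per-task scores are already known.

```python
from statistics import mean

# Hypothetical per-task scores (0 to 1) for three made-up models
# on one made-up benchmark of three tasks.
SCORES = {
    "model_a": {"summarize": 0.90, "sql": 0.60, "extract": 0.80},
    "model_b": {"summarize": 0.70, "sql": 0.80, "extract": 0.75},
    "model_c": {"summarize": 0.60, "sql": 0.70, "extract": 0.95},
}


def benchmark_score(model: str) -> float:
    """Average score of one model across every task in the benchmark."""
    return mean(SCORES[model].values())


def best_single_model() -> str:
    """The model that 'wins the benchmark' on average."""
    return max(SCORES, key=benchmark_score)


def route_per_task() -> dict[str, str]:
    """For each task, pick whichever model scores highest on that task."""
    tasks = next(iter(SCORES.values())).keys()
    return {task: max(SCORES, key=lambda m: SCORES[m][task]) for task in tasks}


def routed_score() -> float:
    """Average score when each task goes to its best model."""
    route = route_per_task()
    return mean(SCORES[model][task] for task, model in route.items())


if __name__ == "__main__":
    winner = best_single_model()
    print("benchmark winner:", winner, round(benchmark_score(winner), 3))
    print("per-task routing:", route_per_task())
    print("routed score:", round(routed_score(), 3))
```

With known per-task scores, routing can never do worse than the best single model, since it could always pick that model for every task. In practice the hard part is predicting the best model for a task without having the scores in advance.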
A
Yeah, I provide a little value on the margins. All right. This has been another amazing episode of TWiST. G Man, thanks for coming on the show and sharing this with us. Thank you, Alex. Great job. Lon Harris, they say he's 72%. He's a 72 score. AI replaceable. I disagree. I disagree. Personality counts for a lot.
B
Yeah, exactly. That's why. That's why he's a 69.
A
I put him at 69.4. 420. Do me, do me, do my LinkedIn.
B
If my. If my newsletter doesn't make enough money by the end of the year, I'll go work for a vc. I'll make you a deal.
A
I mean, it's hard. Oh, boy, this is going to be scary. I think it would probably say venture if. If it says venture capital and podcaster, man, what am I? Maybe I'm like a 65. 64.
B
Okay, so I have. I have the image here from Lon.
A
Okay, here we go. Replaced by Claude.
B
Oh, actually, no, I have the full URL.
A
All right, 58.
B
Yeah, you are a 58. And we can cut this.
A
No, leave it in. I'm a 58. I like it. Timmy does. Yeah. Look at this. "Brooklyn startup rooster in a Fordham blazer, pacing the cap table at 6am and yelling 'ship it' into three microphones while stuffing another SAFE into his jacket." I mean, it's got a sense of humor.
B
That's the thing. These are hilarious.
A
I mean, I like it. I like it. Do. Okay. I could do this all day.
B
That's fun.
A
Yeah, we'll see you all on Friday. Bye. Bye.
Host: Jason Calacanis
Date: April 9, 2026
Main Theme:
The hosts and guests react to Anthropic’s bombshell announcement: their new AI model, Mythos, is so powerful—and so adept at finding and exploiting software vulnerabilities—that they consider it a potential "cyber-weapon." Anthropic is not publicly releasing it, and is partnering only with the largest and most critical infrastructure players to use Mythos defensively and harden digital infrastructure. This kicks off a wide ranging and at times urgent conversation about AI security, AGI trajectory, open source vs proprietary models, deflationary tech, startup opportunities, and the existential stakes of the global AI arms race.
[00:00–08:53]
“Basically the gist is with this model anyone can go to any piece of software and find zero day exploits quickly and then basically go to war with them.” —Alex, [01:33]
“Obviously, capabilities in a model like this could do harm if in the wrong hands. And so we won't be releasing this model widely.” —Anthropic team in video, [08:13]
[09:59–14:22]
“I think Anthropic has significantly passed OpenAI on a lot of things. The reason is they’ve been more focused.” —Rob May, [12:00]
“Product market fit is the ultimate arbiter... Clearly no one has more PMF than Anthropic today.” —Alex, [14:22]
[17:35–29:59]
“This is a super weapon.” —Jason, [26:43]
“I think we should consider this Mythos model to be essentially a cyber weapon and perhaps a cyber weapon of mass destruction.” —Alex, [21:11]
“There is an argument you have to nationalize this technology. There’s an argument it’s too powerful for a private company to own.” —Jason, [24:21]
[27:37–29:40]
“Nobody trusts anybody because we’re in literally ‘The X Files’... The truth is out there.” —Jason, [27:52]
[38:28–58:30]
“Building a business in this AI future is going to be about estimating probabilities…using your compute to figure out code generation versus code checking.” —Rob, [18:03]
“I think this could collapse the value of frontier models…they may not realize it, but they might have just created their own demise.” —Jason, [56:22]
[64:15–71:13]
“If you’re doing hardware, we don't have physical models yet.” —Gyani; “Now, hard means moat, highly fundable. Very interesting.” —Jason, [70:03]
“If you can get replaced, then you score high; if you cannot, then you score low.” —Gyani, [65:19]
[71:46–76:44]
On the existential race for AI security:
“This is becoming the equivalent [of the atomic bomb]. It might not seem as much because a nuclear bomb can cause such a mass destruction of life, but this could cause a massive financial devastation across the economy.” —Jason, [22:04]
On AI model release strategy:
“More powerful models are gonna come from us and from others. And so we do need a plan to respond to this.” —Anthropic team in video, [09:21]
On the impact of SLMs & startup costs:
“If only there was a product called Claw Pack that for $8 a month, got you unlimited inference…” —Alex, [48:50]
On software and company defensibility:
“This is making everybody a little more like a CEO, because as a CEO, you realize you have to review stuff, coordinate stuff...these tools help you do stuff.” —Rob, [71:46]
On the deflationary future:
“This is so deflationary that we need a new word for deflationary. There’s deflationary…what is hyper deflation?” —Jason, [56:39]
The episode is fast-paced, irreverent, and veers between urgent concern (about the real security threat posed by Mythos and the global AI arms race) and optimistic fascination with the business and technical innovation happening at the frontier. The hosts and guests balance serious analysis with typical “TWIST” banter, dense with memorable, sometimes biting quips and a running thread of startup/VC humor.
In this landmark TWIST episode, the discussion zeroes in on AI capabilities outpacing our security infrastructure, the arrival of “cyber-weapons” through LLMs, how C-suites, governments, and startups may react, and the impending deflationary shockwave poised to hit the tech sector as SLMs and open source models threaten to commoditize AI everywhere. For founders, operators, and investors, it’s a call to rethink everything—especially what it means for software to be defensible in a world that’s on the verge of abundant, cheap, and sometimes dangerously powerful intelligence.