Jason
SpaceX formally announced plans to buy Cursor for 60 billion in stock. Cursor once accounted for 40 to 50% of Anthropic's total revenue.
Jason Calacanis
It is Game of Thrones in terms of talent, in terms of territory, in terms of weapons.
Ali Ansari
In AI, the model is no longer the product. The agent evaluations that sort of come on top, the harness, the user interface and so forth.
Jason Calacanis
You don't want to give your knowledge to Claude. You don't want to give it to OpenAI or Gemini. You want that for yourself.
Ali Ansari
The majority of the data spend, in fact, probably close to 100% of it will be on the agent and the application layer versus the model companies. There's just going to be orders of magnitude more agents built than obviously models.
Jason Calacanis
Cursor will have the number one model at this time next year.
Thanks to our friends at PayPal, the exclusive sponsor for This Week in AI. Try the payment and growth platform that's trusted by millions of customers worldwide. PayPal Open: start growing today at paypalopen.com.
All right, everybody, welcome back to the greatest AI show in the universe. Yes, I've launched another roundtable show. This one is called This Week in AI. I'm running it just like I do my venture roundtable on This Week in Startups, or how we do All-In, which is to say we have a group of experts and we talk about the news of the week, and we'll even tip you off to where the future is going to be. So if you listen to This Week in AI, you're going to be six months ahead of everybody else. Why? Because we pick people for the show who are in the arena, who have deep expertise. But in this four-quadrant chart, they also have to be willing to share it candidly. So in the top right-hand corner: expertise, building the future. But they have to be willing to talk about it. If they're experts and they're quiet, nope, can't come on the show. So that's what we're optimizing for. This is episode 18. You can get all the information you need about this podcast at thisweekinai.ai.
I'm gonna have our editorial director here at This Week In, Lon Harris, moderate this week because I wanna shoot the ball along with the number one CEO, just in terms of performance, in our fourth fund, Ali Ansari. He is my favorite, the golden child, or as I call him internally, Travis 2.0. You know Travis Kalanick, Lon, of course. You know how he plows through brick walls. He's super...
Jason
Well, he's super pumped, that guy.
Jason Calacanis
He is super pumped. And you know what he did? A lot of bricks went flying. And you know what my job was? Had to catch some of the bricks sometimes or explain what we were doing at Uber. Smooth it out a little bit.
Jason
Yeah, we're watching the troll from the Battle of Pelennor Fields doing the same.
Jason Calacanis
Okay, great, great. Nerd reference, or Juggernaut. So I asked Ali. One of the things I like to do, this is a little producing trick: I like to ask the great guests on the program, hey, who else is really crushing it? Ali came back with an incredible guest and now we have another person crushing it in AI. I think the app layer in AI is going to be a huge win. And Ali's in partnership with us. So, Lon, you have the reins. I am trusting you with the conn. What is it on Star Trek?
Jason
The conn. You have the conn.
Jason Calacanis
You're in charge. Get started.
Jason
All right. Well, yeah. So our second guest is Ryan Daniels of Crosby Legal. He's the founder and CEO. They are, Jason, an AI-first law firm combining AI systems with human expertise to provide a scalable legal service. So rather than selling, you know, an AI model that's good at legal questions to law firms, Crosby built their own in-house team, and then these lawyers developed and used the AI product in house. So you're contracting an AI-first law firm when you go to Crosby, which is, it's an interesting...
Jason Calacanis
So, Ryan, if I'm doing my Series A, you're saying instead of using a traditional Silicon Valley law firm, I could use yours.
Ryan Daniels
We'll do it faster, we'll be more reliable. We'll do all your sponsorship agreements.
Jason Calacanis
Well, those are easy sponsorship agreements. We've got that already dialed in with AI. But, but in all seriousness, closing a Series A is, I think $50,000 in Silicon Valley now. Closing a seed round, maybe 10k if you're using notes. Is that about right in terms of what the cost structure is?
Ryan Daniels
Yeah, that's probably right.
Jason Calacanis
And what would you provide those similar services for? I'm curious.
Ali Ansari
So we're not.
Ryan Daniels
So we're getting to financings. We started mostly with commercial agreements, the most recurring thing that our clients do. And we can build the best data models around. Our main thing is pricing by the deal, not by the hour. Flat rate, yeah. And so our incentives are really aligned.
Ali Ansari
Yeah.
Jason Calacanis
That is ending your billable hour.
Ryan Daniels
Exactly.
Jason Calacanis
What did you say, Ali?
Ali Ansari
They're ending the billable hour forever.
Jason Calacanis
Well, Ali and I know about the shocking legal bills that we get sometimes and, oh my Lord, the sticker shock. And if you ask a lawyer, hey, can I get a flat rate? Nine times out of 10 they say we're not the right law firm for you. So Ryan, there is a clear lane for you here if you can solve this specific problem. Hey, Lon, give a proper introduction to Ali. I gave my, you know, seed investor, this-guy's-my-guy intro. You know I'm Professor X in this model, right? I run the school for gifted mutants.
Jason
Right.
Jason Calacanis
He's like my Wolverine. So, but explain what he does.
Jason
Sure. He's the founder and CEO of micro1. They recruit and manage human experts who help train AI. So think of it as sort of an expertise marketplace. AI labs have models that they need to fine-tune on sort of high-level tasks, you know, the things that you still need a human in the loop to help them figure out. And micro1 teams them with pre-vetted domain experts, like a full-service operator handling the modeling, compliance, payments, basically all the HR stuff you don't want to worry about. And then you have senior engineers, PhDs and others addressing this critical bottleneck for improving frontier models, getting some human expertise in there. Most recent numbers, we have about 300 million in ARR as of...
Jason Calacanis
Whoa, are we allowed to say that last 2026. Are we allowed to say that?
Ali Ansari
That's a bit of an old number.
Jason Calacanis
So there you go, we'll leave it at what it is. I'll tell you, it's the fastest revenue ramp in like two years that I've seen since I guess I wonder if it's faster than Uber's first two years. It might be, in fact, might be. So incredible revenue ramp. And Ali, that was not the business I invested in when we met. The first business, if I remember correctly, three years ago was you had built an incredible AI to sort through developers and give them tests. Yeah. Am I correct?
Ali Ansari
Yeah, exactly. The V1 of micro1 was, we built an AI screener, and this was actually an internal tool that I built for one of my previous companies. And then I decided, okay, this is probably something I could spin out and make a product on its own. So the first version was an AI recruiter which companies use to essentially source talent. And then we also ended up making this kind of next-generation marketplace of pre-vetted engineers specifically, which startups would hire from. So that was V1. But then we realized very quickly that there's this data space and we sort of pivoted entirely into it.
Jason Calacanis
Which, Lon, is what the great entrepreneurs do. They always keep that peripheral vision open. So if you ever hear me talk about peripheral vision: when you're a great entrepreneur, you start with a vision, you start building, and then sometimes out of the corner of your eye, you see, wait a second, is that a diamond mine? Do I see something glistening? Is that a perfect wave to surf? And I'll give you a great example. If you look at Uber once again, or Airbnb, which we weren't investors in. Airbnb was renting a room in somebody's apartment, and then they saw, hey, wait a second, what if somebody just rented the whole damn apartment? Incredible. A micro pivot changed the whole nature of the company and expanded the TAM by 100x. With Uber, it was linking town cars. It was your own private chauffeur. It was for the elites, Uber Black. But in the corner of their eye, they saw a now defunct company, Sidecar, allowing you to take rides, ride sharing, which Lyft then copied: hey, use your personal car and give a personal ride. And they went after that. Obviously that became UberX and Lyft, and Sidecar, the originator of the idea, failed to capture that opportunity. Let's get started with the docket.
Jason
Sure. So, our first topic. Huge news this morning, although sort of the culmination of a story we've been following for a while. On Tuesday, SpaceX formally announced plans to buy Cursor for 60 billion in stock. The announcement, of course, came just days after SpaceX's debut on the NASDAQ in the largest IPO ever. Cursor now will become a wholly owned subsidiary of SpaceX. Of course, everybody knows about Cursor's popular AI coding tool, which launched in 2022. They've apparently crossed 1 billion in annualized revenue as of November of last year. Those are the most recent numbers we have. SpaceX shot up 60% on Tuesday. They now have a market cap of 2.88 trillion. Jason, that makes it the fourth most valuable company in the U.S., surpassing Amazon and Microsoft. Of course, this is all subject to regulatory approval. I do have one quote here, from Lekker Capital CIO Quinn Thompson on X, that I thought was good: This is brilliant corporate finance. Use your newly printed, low float, retail-inflated currency to acquire real businesses ahead of the lockup expiring. Probably the most creative, accretive way to sell as much equity as possible into an IPO pump. I wonder what acquisition is next? So that brings me to my sort of dual question. What would you acquire next if you were SpaceX, in this wonderful vaunted position they have? And do you think this was a done deal all along, when they first announced, like, maybe we're going to buy Cursor, maybe we're just going to work with them, and they were just waiting to make the announcement post IPO? Ali, what do you think? Has this been a done deal all along, and they were just waiting until the post-IPO moment to make the big announcement? And if you were running SpaceX, what are some other companies you might be looking to scoop up?
Ali Ansari
Yeah, so I think it was probably a done deal, I would say. I'm sure there were some sort of regulatory concerns that resulted in them announcing it later. I think this is an incredible deal for both sides. And, you know, for Cursor, obviously they have an immense amount of potential to build their own model, mainly from the product usage that they have from their customers actually using Cursor, which is one of the most important kinds of data flows that you need to actually build a model. But it sounded like they had some compute constraints, and obviously SpaceX had a ton built. And so I think that puzzle matched really nicely. I would say what SpaceX has built in terms of these three core business lines is quite incredible to see. And they've done it very quickly, it seems like in the last six months or so. Obviously Starlink had been there for a while, but the compute business, which I heard somewhere has more gross profits than Anthropic. That might be true, maybe not, but that's an incredible statistic. And of course now they have the opportunity to not only serve these frontier model companies, but also to serve their own frontier model, and I think buying Cursor is the best way to potentially do so. Because once you build a model that can navigate the computer in the best way possible, which is the coding capabilities, then you can have a lot of emergent capabilities that come from that in many other domains. So I'm not sure if I would buy any other companies. I think that these three business lines are really lucrative and perhaps SpaceX should focus on them.
Jason
Sure. I guess, Ryan, to sort of follow up, I mean, why do you think xAI and SpaceX needed Cursor? With so much compute, with so much capex at their fingertips, is it that hard to train your own model? And like, what can Cursor's team do that the SpaceX AI team cannot?
Ryan Daniels
I'll start by saying Cursor is a very big client of ours, so everything I'm saying has no relation to that.
Jason
Fair enough.
Ryan Daniels
This is all speculative.
Jason Calacanis
Good disclaimer, good disclaimer.
Jason
Good to know.
Ryan Daniels
I am a lawyer, after all. Look, I think the long and short is you really need two things to build a model. One is, you know, a ton of compute, and so that's a lot of money. And the other is that it's actually extraordinarily hard as a research problem. And my sense is that for physical world problems, Elon is just, bar none, the best to solve them. Even the speed with which he set up Colossus was astonishing. But for the really hard computer science problems, they've really struggled, and they've gone through several waves of researchers at X and still haven't been able to come up with a base model as good as the two labs. My sense, just from, you know, the folks I know at Cursor, is that they probably are one of the best research teams. And so, you know, if you couple that, it starts to make some sense that they could build, you know, a premier frontier model. I guess the only other ones that are relevant, you know, are Google and Meta. I guess Google has, but Meta still struggled. So clearly it's very hard even with great researchers. And I think there's a miraculous story about Cursor as a few-year-old startup that's been able to build that kind of research capability.
Jason
Jason, I want to ask you about sort of the timing of all this and that quote that I read. But also, building on what I just said, there was an interesting article in Business Insider this week. It was supposedly a profile of Cursor CEO and co-founder Michael Truell, but it sort of delves into their history with Anthropic, and it almost seems like it's Anthropic's fault that Composer exists. Cursor once accounted for 40 to 50% of Anthropic's total revenue. And then prior to launching Claude Code, Anthropic executives apparently specifically told Cursor that Claude Code was always going to be more of a research effort than a major commercial push. Like, we're not going to be a rival to you. And of course that didn't necessarily hold. So in January 2026, Truell called a red alert and said, okay, we need to design our own AI model. We can't be totally dependent on Anthropic anymore. How much of that tension, like how much of this is Anthropic's fault? Did Anthropic push Cursor to make a model? And now they have this massive competitor in SpaceX plus Cursor.
Jason Calacanis
Yeah, let me give you three really important observations here for entrepreneurs. And this is the most dynamic space I've ever seen in 30 years in the technology business. It is Game of Thrones in terms of talent, in terms of territory, in terms of weapons in AI. And if you look at Cursor, Cursor had the lead and their partner was in fact Claude. But all these frontier models eventually are going to need to put up revenue numbers to backstop their trillion dollar valuations, right? OpenAI and Anthropic are right around that 800 billion, maybe trading in the secondary market up to a trillion dollar market cap when they go public. When you look at that, that means Sam Altman and Dario have to examine what their platforms are being used for and then compete with their customers. And I've made this point three times, four times in my career. I made this point when people went to Facebook and started working with them. I made this point when Apple started releasing apps that competed with the best apps in the App Store. And of course, Microsoft is famous for doing this. Microsoft originally came out with Windows. They studied Lotus 1-2-3 and eventually launched Excel. They bought PowerPoint. So when you build an application layer on another person's platform, they have perfect insight into how you're using their product and they study you. So Windows knew how many people were using Excel. They understood that. And they understood if they bundled it with the operating system, they would kill Lotus 1-2-3, which is what they did. Facebook did the same thing. They started coming out with games. They looked at their social graph. Yeah, and they killed Zynga. Right. And my friend Mark Pincus thought, oh, my relationship with Zuckerberg is so deep and so strong, he would never shiv me. He got shivved in the middle of the night, literally woke up in a pool of blood. Same thing happened to Lotus 1-2-3. Same thing's happening now.
Cursor did the most important pivot in their career, and this is point two. They said, we have to make the language model. That's a lot of effort. They were stuck. Elon stood up Colossus faster than anybody. That's their data centers. They bought all these GPUs, and Elon had a very famous tweet: buy GPUs and stand them up. Step three, monetize it. Step two, question mark. What happens in between? He didn't know, but he knew that having all that compute footprint would work. So when he met with Cursor, as this story goes that you're reading about in the public, he gave them a massive chunk of compute, which then let them catch up and then even perhaps beat Codex and Claude Code. So you have this three or four horse race going on for coding copilots, agents, et cetera, and, peanut butter and chocolate, all of a sudden he makes this option. Now, point three: Elon is exceptional at buying companies. People don't know this because they haven't been paying attention. But if you look at all the public data, obviously SpaceX, and when we look at the valuation: $3 billion in revenue approximately per year, $60 billion valuation, right? 20 times top line revenue. SpaceX today, in total, including Cursor's revenue, probably is around a $30 billion run rate, but trading at close to 3 trillion. So they have an 80, 90, 100x multiple, which means when you make an acquisition like that, the seller, Cursor, gets to be in a bigger, more stable, more diversified equity pool, the SpaceX equity pool. And then SpaceX gets to use that incredible valuation to make intelligent purchases. And if you look at Elon's career of buying these, I just did a quick research project before we got on air. SpaceX has bought a half dozen companies, the most important of which is probably Swarm Technologies, which some of my friends were investors in. They were the IoT small satellite startup.
They were doing incredibly important research and now they are the Starlink direct-to-cell. And they had a lot of great innovations there, and SpaceX bought them for a half billion dollars, 524 million. Now you look at how savvy that was. He also bought Twitter. He created xAI and put it into SpaceX; he bought Twitter and Cursor and put those into SpaceX. And now you have two very important key assets: X, which has all that real-time data from Twitter, and then you have that one. And then you look at Tesla. Tesla's made, I think, close to 10 really important acquisitions, most of them in batteries and battery technology and wireless charging. But of course, SolarCity was in there, and SolarCity became Tesla Energy. I think that was a $2.6 billion acquisition. That acquisition is the entire, or, you know, a large portion, let's say, of the energy business of Tesla, which is extraordinary. Eventually these two companies are going to come together, as everybody has been speculating, and you'll buy ticker symbol Elon.
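The revenue multiples quoted in this answer can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the round figures mentioned in the conversation (roughly $3 billion of revenue against a $60 billion price, and roughly a $30 billion run rate against a market cap near $3 trillion). The function name `revenue_multiple` is made up for illustration, and the inputs are the speaker's approximations, not audited numbers.

```python
def revenue_multiple(valuation_usd: float, annual_revenue_usd: float) -> float:
    """Return valuation divided by annual revenue (the top-line multiple)."""
    if annual_revenue_usd <= 0:
        raise ValueError("annual revenue must be positive")
    return valuation_usd / annual_revenue_usd


BILLION = 1e9

# Approximate figures as quoted in the conversation (assumptions, not audited).
cursor_multiple = revenue_multiple(60 * BILLION, 3 * BILLION)
spacex_multiple = revenue_multiple(3_000 * BILLION, 30 * BILLION)

print(f"Cursor deal: about {cursor_multiple:.0f}x revenue")
print(f"SpaceX: about {spacex_multiple:.0f}x revenue")
```

On these inputs the buyer trades at roughly five times the seller's multiple, which is the gap the speaker describes as making a stock-funded acquisition attractive.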
Jason
Yeah, yeah.
Jason Calacanis
Ticker symbol Elon.
Jason
The true Elon Industries, like Stark Industries, we'll finally get here.
Jason Calacanis
Yeah, exactly. So this is just brilliant across the board. And I think Cursor will have the number one model at this time next year in terms of revenue, in terms of market share. And they can play the long game because they have so much compute. Everybody else has to build all this compute, and if compute moves to space, that's another advantage for the Cursor team. And the Cursor team is exceptional. This is an exceptional team.
Jason
Yeah. One thing that was interesting: I think Cursor built off of Moonshot AI. They're admitting that Composer was initially distilled from Moonshot's Kimi. And then they sort of built on it, but now they say it's 85% their own model. So I feel like a lot of people don't really understand how distillation works. We think of it as like ripping off someone's model, but it's really almost entirely their own product at this point.
Jason Calacanis
Ali, explain distillation for folks. And you have a very unique position in the market because you help frontier models and all the application layers fill in their data. And data on the open web is questionable in terms of quality and it's a commodity because everybody's stolen every little nook and cranny. And I know these frontier models, they need, you know, your resource to get that data in there. So maybe explain that a bit.
Ali Ansari
Yeah. So distillation has a bit of a negative connotation because the word is used for when you distill a closed model, where you essentially prompt it at a large scale and you try to create a massive what's called an SFT dataset, which you can think of as prompt-response pairs that you get from frontier closed models to then use as pre-training data or some sort of post-training data. That obviously is not a good thing to do. And we've heard of Chinese companies doing it to closed frontier models in the US. But what Cursor did, I actually wouldn't call that distilling a model. You're just building on top of a baseline reasoning model that is open source, and that is the right thing to do. If you already have a really good baseline reasoning model that has been pre-trained, and it's already a massive model that has pretty good capabilities, and it's literally open source, then you shouldn't reinvent the wheel, at least for your first model. You should just post-train it in ways that improve the model for coding, which is what Cursor did. And the post-training efforts are so large that they sort of resemble a pre-training effort. So that's why it requires a lot of compute as well. And if you do a really massive post-training effort, you're changing the weights in so many ways that the model is just yours. I hadn't heard the 80% number, but that sounds right, which is like you're essentially changing 80% of the weights. And so it is your model. I mean, you fundamentally changed the model, but you've kept some of this baseline knowledge and baseline reasoning that came from the Internet-scale pre-training.
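The "prompt-response pairs" Ali describes can be pictured as a very simple data structure. The sketch below is only an illustration of that shape, under stated assumptions: `build_sft_dataset`, `to_jsonl` and `stub_teacher` are hypothetical names, the teacher is a stub that echoes its input rather than a call to any real model or API, and JSON Lines is just one common way such pairs are stored. It is not a description of how Cursor or any lab actually builds its data.

```python
import json
from typing import Callable, Dict, List


def build_sft_dataset(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Collect one prompt/response pair per prompt by querying the teacher."""
    return [{"prompt": p, "response": teacher(p)} for p in prompts]


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize the pairs as JSON Lines, one pair per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


def stub_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a model call; it only echoes the prompt.
    return f"[teacher answer to: {prompt}]"


prompts = ["Reverse a string in Python.", "Explain binary search."]
pairs = build_sft_dataset(prompts, stub_teacher)
jsonl = to_jsonl(pairs)
print(jsonl)
```

In the distinction Ali draws, what matters is where the responses come from: pairs harvested at scale from someone else's closed model are the contested case, while post-training an open source base model on your own data is not.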
Ryan Daniels
I mean, one thing I'd add is, I think a researcher recently outlined his vision of the future where there are just two types of companies that matter. One type of company makes models and the other type of company does not. And there will be maybe five or six companies that can make their models. And that's kind of a hard cap. And it's basically what we talked about before. It's a function of compute and capital. And, you know, pre-training now starts at about a billion dollars, but it's probably looking more like four or five to make a frontier model, maybe even more. And then there's the research capabilities, which keep getting more and more difficult. And so if you accept that view of the world, the acquisition makes a ton of sense, right? Like, SpaceX needs to be one. It has, you know, some of the most capital availability, especially now with its share price. And, you know, it's like inevitable for Elon to have his own model. It's just been taking longer than I think I would have anticipated. Probably could have been OpenAI. And so when you sort of understand his ambitions there, it makes sense that Elon's will be one of the companies that have crossed that, you know, kind of frontier, crossed that plane that's binary, to be one of the companies that is able to make models. And if we believe that models can build the next models, which is already happening, then it's just a recursive improvement for three or four companies that become bigger and bigger and bigger. So that argument's compelling, and I think it kind of frames where this goes and why Cursor thought to sell, given that they have the capabilities, but they need the compute and they need the capital infrastructure.
Jason Calacanis
Yeah, OpenAI's Codex revenue is not disclosed publicly, but for Claude Code there was a rumor of 2.5 billion in revenue or run rate in February 26th, and Cursor in June reported, I think, a $4 billion run rate. And so this is a big space. Just code in and of itself is going to be $100 billion in revenue in the next year or two, just from those top three players. And you're correct, Ryan, xAI had a bunch of turnover with the founders or the founding team. We all saw that in the press, and Elon rebooted everything, and I do think he will catch up. My question, Ali, and I'm sure you have something to add here, is on open source models. What percentage of a frontier model are open source models today, Ali, in terms of capabilities, etc.? And when do you think, if ever, it will flip, where the open source models will exceed frontier models or be within plus or minus 10, 15%? In other words, it's not discernible to the majority of users that there's a difference between an open source model and a frontier model.
Ali Ansari
I want to sort of define frontier model for a second. I totally agree with Ryan's take that we're probably going to converge to like four to five really massive model companies, which I would define as baseline reasoning model companies rather than frontier models. And the reason is that I would actually define frontier models as what's built in the product layer. This relates to Satya's post a bit, which I'm sure we're going to get into. But essentially, when you take a baseline reasoning model, whether it's closed source or open source, and you build a product on top of it, the best way for you to make sure that the product is actually reliable is to, in a way, build your own model. Of course you're not doing full-on pre-training or these post-training runs that require a lot of compute, at least for most companies. But when you build probabilistic software with evaluations at the core of the full product lifecycle and the full product build-out, you're very much doing what model building looks like. And this goes back to this notion, and Greg from OpenAI says this as well, he publicly tweeted it, which is that the model is no longer the product. What comes on top of the model, the agent evaluations that sort of come on top, the harness, the user interface and so forth, that is the product. And I think those are areas where frontier models are actually built. And I don't think there's sort of an infinite set of domains that you can take this final mile in. And I don't think the baseline reasoning model companies, which I think will still be the biggest companies in AI, will take the last mile, which we sort of call this infinite last mile, in every single one of those domains. So there's going to be countless companies, today we'd call them AI application companies, that end up building their own models in some ways. Of course we're seeing that with Cursor, and actually I think that's where the frontier models are going to be, not in the open source versus closed source massive models. And I actually don't think the percentage of open source versus closed really matters too much. I think in a lot of domains where the last mile is taken, an open source model will be totally okay to fine-tune and evaluate your way into what I would call frontier in that product. And in a lot of cases the closed source models are where you'll build. But I think the real final value will come from all the evaluations that result in these product companies owning their own intelligence layer, which doesn't mean they're not building on top of these closed source models. You could still own your own intelligence layer while building on top of a closed source model. So that's what I would define as frontier.
Jason Calacanis
Ryan, in terms of Crosby, you're probably looking at the Cursor example and thinking, hey, are my frontier model folks going to partner with me or eventually try and take our business? Now you have a services component, obviously, but how do you think as an entrepreneur about being either headless or not dependent on any one model? And to Ali's point and Greg Brockman's, the model itself isn't the product. The harness, the instructions, the skills, memory. There's a lot of other components to this.
Ryan Daniels
Yeah, I mean, I think there's so many ways to go. I think this is the fundamental question for anybody doing what I guess we would have called application companies previously, and now it's becoming, I think, even more opaque where we differentiate in terms of, you know, our own type of intelligence in one specific domain. I think we can oftentimes fall into nihilism, like nothing matters, don't build anything, the labs will do it. You have this very friendly collaboration with the labs, where we do a lot of applied evals with both OpenAI and Anthropic, and yet we never know what they are going to do. So a couple of things. I think one is it makes companies like Ali's, like micro1, much more interesting, because companies like us have this vertically integrated solution. We have tons of unique data that is very, very hard to create, and we built it ourselves with our own law firm, and that is a huge edge. But if you don't know how to build the right evals and tune models to work well for that, it's not super useful. I think that's one thing. I think the other is it's very domain specific. I think about the non-self-verifiable domains, so things that are really taste based. Obviously law is one of them. It's super subjective. You know, as lawyers get more senior, they disagree on what quality looks like. Code is very self-verifiable, right? And so you can see why that's something easy for the labs to get really good at. But the more taste based you get, the more I think the value can accrue to these application layer companies that are just closer to the work product. And that's our bet. And I think it might just mean that we get more specialized over time, but, you know, that's kind of what we just need to focus on.
Whereas I think the labs, they already have, both OpenAI and Anthropic, legal point solutions that are, I think, very base level and, you know, not super specialized.
Jason Calacanis
Yeah.
Ali Ansari
Also, just one thing to add here in terms of the product versus the model being the product: OpenAI and Anthropic are the best example of this, where their coding product, not the model, is what is generating a very large portion of their revenue. And I think Claude Code obviously is the perfect example. They had the best coding model. They obviously still do. But when they came out with the desktop application, which was an actual product, that's when the revenue skyrocketed.
Jason
Satya Nadella, the Microsoft CEO, wrote a widely shared X article over the weekend titled A Frontier without an Ecosystem. He argues that the future is a, quote, cognitive loop connecting people and systems. So this means enterprises will be driven by humans building learning loops on top of AI models and turning workflows into something more like agentic systems that improve themselves through reinforcement learning. And then those loops are the company's IP, where all of the value rests. I pulled a key quote. This means the real opportunity is not in picking the best model, but instead building a learning loop on top of models where human capital and token capital compound. You can offload a task or even a job, but you could never offload your learning. The future of the firm is the ability to compound that learning across people and AI. The post, by the way, staggeringly popular. 63 million views, 12,000 RTs. A genuine banger, even from the CEO statement genre. So I guess my first question would be to Ryan. This reminds me a lot of how Crosby has sort of designed its own kind of in-house proprietary feedback loop, much like the post describes. So can you walk us through how that works and how the combination of human and token capital is really producing these higher-level results?
Ryan Daniels
This post was so validating on so many levels. You know, we've now got 50 attorneys that we've hired, and we have a captive law firm that we've built. And, you know, it's not easy when you're raising venture to explain the value of a law firm that looks, superficially, very much like a services business. It's not easy to explain to lawyers that are really anxious about models getting better. Like, why are we just trying to have them train themselves out of a job? And our fundamental belief, and I used to be an attorney, but our fundamental belief is that the lawyers get more valuable as the models get better, because they can do the things that only humans do. And the cost per dollar of that time actually goes up over time. And all the stuff that you're paying somebody $1,000 an hour for that you can definitely hand off to a model, you no longer have to pay for it. And we kind of know that. I think people get frustrated with their lawyers because we all kind of know that. And I think the hard thing is that feedback loop, the reinforcement learning loop, which is the key, again going back to what I mentioned about Ali. You need to be able to eval an output to say if it's good or bad. If you can't do that, the models can't reinforcement learn on it. And for the subjective domains, like poetry, like forms of accounting, like literature, obviously like legal, where there's a variety of opinions for what's a good output, this is where this becomes a very proprietary thing to work on. But I think Josh Kushner wrote a post last week where I felt like, why didn't we think of that? Where he says, we're long humans here at Thrive Holdings. We've just been so banging this drum of we're long humans. We don't want to build an application that can replace lawyers for this domain that's very subjective and where the quality of judgment matters so much.
The more we can kind of create these loops of learning on top of the models in a vertically integrated system is so powerful. So like, and I think we're probably a few months ahead of most other companies, but I think every law firm will have to figure this out or they just, they won't exist in four or five years.
Jason Calacanis
Yeah, this is, I think, going to be the trend of 2027, Lon. You're going to see open source, on-prem servers. Building your own language model or the harness around it becomes the trend. And when you read Satya's post, you immediately think about a company like Ryan's, because of this concept that there's a software layer, there's a services layer, there's an intelligence layer. All of that is going to be abstracted into what I call the problem being solved layer. The problem for your customer being solved. Ryan's customers do not care how the problem is solved. They care that the problem is solved faster, better, cheaper, some combination of those things. So if you fast forward to 2027, you don't want to give your knowledge, in terms of Ryan's company or many other companies, accounting companies, we have TaxGPT in our portfolio. You don't want to give that to Claude. You don't want to give it to OpenAI or Gemini. You want that for yourself. And so we as venture capitalists investing in these companies, we never wanted to do services. We always wanted to do SaaS. Higher margin, please. If you have any services revenue, hide it from the venture and investment community. None of that matters. The whole paradigm has been abstracted into customer solution. Money changes hands. That's it. And so if you take this to its end state, I think savvy founders like Ryan are just going to say, you know what, I don't care what's underneath the hood here. I care about my customers, and I don't care about VCs and their lack of understanding or being able to put me in a box. And that's what Kushner is, I think, also saying in his missive. And everybody's kind of coming to the same conclusion. Just don't worry about what box the startup fits in. Are they delighting customers with a better, cheaper, faster solution, period, full stop. And Ali's company, Micro1, has had the same confusion from VCs.
Right, Ali, you had this like three months ago, people saying, like, these companies that are doing reinforcement learning, they're services, they're not software. It doesn't matter what a VC thinks. VCs' opinions do not matter. The customer's opinion is what matters. So we just have to get these VC paradigm biases off the table and just look at the customers. Ali's customers, like Ryan, love what he does for them. Ryan's customers love what they're doing. So now the challenge is on the frontier models: how do they get to play with the people who are touching the customers? And I think that's what could be radically changed. Don't be surprised if everybody working at Ryan's company has a workstation running a local model, studying their behavior. And at some point, I think, Ryan, you could just have everybody on a Mac Studio or with, you know, a terabyte of RAM running local models that you own, and you never touch a frontier model. Is that a possibility, Ryan?
Ryan Daniels
There is always a gap between what the frontier models can do and, like, great work output. And then the question is, what domains do you really just pay the extra premium for that last 5% to get to the good work output? And what things are you kind of okay with cutting corners on or doing yourself? And for domains where there's sensitivity, especially like legal, medicine's another one, the value all accrues, the power law, the premium all goes to getting that last bit done. Not the first draft, but the final, ready to file, you know, ready to sign, signature kind of contract. And so for us, we've been, in the last year in particular, dogmatic about codifying the judgment of our lawyers, who are all fairly skilled. Five plus years, they're expensive, but this is where the value started to accrue. And so, yeah, we're becoming much more thoughtful about what we are doing with that data. Should we be post-training our models with it instead of sharing anything with frontier models? And I think, Jason, the point you made is spot on. I think this hand-wringing about, is it services, is it software, when human labor can be done by a model for the first time ever, and every six months even more and more labor, it's just such an anachronism to ask the question. It doesn't matter anymore. And we're always surprised by how many things there are where lawyers will always have to do that, and then one day they no longer have to, and we just watch it with our evaluation.
Jason Calacanis
Examples of in the lawyer stack, things that have already become automated versus things that still require human in the loop and then things that you think are going to be the last castles to fall.
Ryan Daniels
I mean, we'll get to this when we talk about the benchmark. I think, like, I'll tell you one thing that took us a long time was summarizing a review for a client. Sounds really trivial, right? We make a bunch of changes, we send them a cover note saying here's what we did. There's an enormous amount of judgment that goes into that. Right. If you're talking to a lawyer, you give a very in-depth summary. You really want them to get in the weeds. And the way we evaluate ourselves is we don't want our clients to ever have to open a document that we did, because that means that we haven't described it well and they don't trust it. So the real time saving: with AI, you have to read the document every time, look at the output and verify it. With us, you should just trust it. That's the key. For a non-lawyer, they want a very cursory summary, and they don't want to know the 10 changes you made. They want to know the one thing that they really might want to know about, and getting models to be thoughtful about that was hard. We could have just created all these hard-coded rules, here are the five ways we're summarizing in these instances, and it was too brittle. And then we just put in a bunch of really good instructions, like, hey, given these are the parameters of the deal, and it's a really high velocity deal, and it's not a big client, and it's a salesperson, not a lawyer, give a cursory summary, and here's what kind of cursory looks like. And I think with 4.6 it started to kind of come together, where you would read it and be like, yeah, that's a pretty reasonable note. But for the longest time we were sending these huge cover notes. Our clients were like, what am I supposed to do with this? Like, this is crazy.
Jason Calacanis
And yeah, this is where Ali, the person who is closest to the customer gets to win.
Ali Ansari
Yeah, yeah. I think this is a, it's actually a really important point which is there is one of the main. Of course as we've said, one of the main ingredients to train models is obviously data, but there's two sort of subcategories of that which I Almost want to call like the two kind of new scaling laws that are specifically in this data category. The first is the horizon of tasks that you get access to and you're able to kind of keep increasing the horizon of. And the second is the real worldness of the tasks and of the data points. And if you think about what a services company does is they, they're, you know, they're delivering real world economic value to their customers, which means it is as real world as it gets. Like it is literally the real world. There's not a proxy to the real world. And it's also as you kind of keep improving the capabilities of the sort of, in this case the AI native law firm, you are able to deliver longer and longer horizon tasks to your customers and eventually deliver as long as possible. And those two dimensions can be very much scaled up to then improve the intelligence layer that you have at the application company or the services company, whatever you want to call it. And this is something that labs are very much looking for. A lot of the work that we do with the Frontier labs, they're always trying to look for ways to make the data longer horizon, but also make it more realistic. And so one of the things that we've actually done recently is we've started to partner with companies that are using some of these applications that are built by these AI labs in their real workflows versus these kind of fake sandboxes to then send back trajectories of the real tasks that they do with those tasks evaluated, which is structured and evaluated, which means how well did these applications actually do on these tasks for these model companies to then, to then train on? 
Now if you think about, again, a services company, there's an abundance of those tasks, because it is what you're doing and what you're serving to your customers. So this goes back to the point that I think the services companies that have a very niche focus, one area that they want to use intelligence plus humans on, those are really the companies that are going to build the frontiers of these domains. And also, I love that everyone's sort of going this route of, like, long humans and humans first. We've been saying this for a long time, which is that human brilliance is needed more than ever. When you start to improve model capabilities in a certain area, the expertise from humans, the judgment to continue training in that area, plus the judgment required to actually deliver value in those areas, increases, and the demand for it increases. And coding is actually a perfect example, where a lot of folks think that as coding capabilities get really good, where the model can basically write almost 100% of the code, there's this notion that humans are actually not needed, but it's not true. There are more engineers than ever. There are also more engineers training the models than ever. Of the pipelines that we kick off right now at Micro1 for training model capabilities, a very large portion are coding, and the demand is the highest it's ever been. And of course the capabilities are also the highest they've ever been. So we will likely see this pattern follow along with every other domain as well.
Ryan Daniels
Ali, do you think that like your next client base in 2027, 28 will be like accounting firms, consulting firms, trying to have these learning loops done locally?
Ali Ansari
100%. In fact, it is, it is a current client base as well. We are starting to see enterprises that are, let's call it less AI native than others that are coming to us that are saying we have this pilot of an agent that we've built, but every time we test it, there's just random people within the company that go and do these anecdotal QA on this probabilistic software. So we don't know if it's actually going to work or not in real life setting or like in production. And that's exactly where the evals come in. And that's where like them owning their own intelligence comes in. So we actually believe that in the long run the, the majority of the data spend, in fact probably close to 100% of it will be on the agent and the application layer versus the model companies. And this doesn't mean the model companies like reduce their spend. I think they're going to continue increasing their spend in major ways, but there's just going to be orders of magnitude more agents built than obviously models. And it's not like 100x or 1000x, maybe like a millionx more agents built than models. And it's sort of a similar argument to where compute is used, which is obviously compute was mainly used for training a while ago. Now it's largely inference and it's probably at some point going to be essentially almost 100% inference. I think that's sort of like the equivalent trajectory that we're going to see on data, which is the majority of data needs will be on the application layer from all types of companies, even the ones that are like mom and pop shops that, that, that are still owning their, their intelligence layer.
Jason
I thought this was interesting, especially as we talk about the nuances of the law that are sometimes hard to bake into these models. There was an interesting piece in the Wall Street Journal over the weekend about how court reporting seems to be a prime target for AI takeover. There's a shortage of trained humans to do it, and you're just listening in. It seems like a speech-to-text sort of thing. Of course, a court reporter, they're the ones typing the real-time transcripts of everything that's spoken aloud during a legal proceeding, like trials, depositions, hearings, or whatever. But the thing about what they're typing is that the transcript becomes its own legal document. So say if there's an appeal, future lawyers and judges are relying on that transcript as an accurate verbatim copy of what was said. So you would think, speech to text has come such a long way. But the real work of court reporting includes a lot of stuff AI isn't great at yet. Capturing nonverbal cues, filtering out ambient noise, essentially certifying it so that we can rely on it, you know, as a verbatim transcript of what happened. The National Court Reporters Association, maybe they're a little biased, but they argue AI transcription remains too error prone for use in courtrooms. Some states are already trying it, though. North Dakota has eliminated stenographers and already switched over its courts to electronic recordings. And of course, this is a difficult thing for humans to do because you have to type. Texas, for its court stenographers, requires 95% accuracy at 225 words per minute for five minutes straight. I can type pretty fast, but I can't do that much. Some people compare it to learning a foreign language or a musical instrument or something like that. So my question here is, for Ali, is this a matter of time?
Ryan Daniels
No.
Jason Calacanis
Maybe for Ryan, since he's in the legal space, sure.
Jason
For Ryan, is this a matter of time before you think AI could do this as well as any human? Or is this always going to be something we need a human in the loop for? And, you know, for Ali too, Is this something we could ever train or fine tune with the use of experts to make AI better at it?
Ryan Daniels
I think the obvious answer, I actually think AI can do this already, honestly, Like, I think like some of the voice models like Whisper, I use, like, it's like, it's kind of astonishing.
Jason Calacanis
Whisper's crazy.
Ryan Daniels
And so I just, like, I think it's solved. I don't think that's the question. Like, the business that I would have started, if not this one, is an AI private dispute resolution company. There's AAA, the American Arbitration Association, and JAMS. These are two huge arbitration societies, and businesses, typically internationally for JAMS and in the US for AAA, agree when they sign a contract, ex ante, we're going to surrender our jurisdiction to these arbitrators because they're smarter than judges. They're faster, they're smarter for our needs as a business. But here's the key. The outputs of those tribunals are enforceable by a court. So it's not just like, whatever they say, we might go with. We consent to a court treating the decision of this tribunal as binding. But the key is that we have such a need for dispute resolution as humans. That's just, in the U.S., how we solve our problems. And the U.S. court system, right, especially the federal system, you need an enormous amount of political capital to appoint any new judge. It's pretty deadlocked. And so any change is really, really hard to make. I think that AI lawyers will overwhelm courts far faster than courts will use AI to keep up. Right. Like, there will just be way more lawsuits filed. It's already happening. And I think one of the key things to get past is there's a collective myth that we surrender to as Americans, that the courts are fair, and we just know they're not. We just know that juries are not fair. They have a lot of bias. And judges are much more likely to grant bail after lunch than before lunch. There's a great famous study on that because.
Jason Calacanis
So if they're hangry, they.
Ryan Daniels
Yeah, super famous, super famous study.
Jason Calacanis
Same hangry index for hell.
Ali Ansari
Wow.
Ryan Daniels
Yeah, big time. Super famous study from like seven, eight years ago by these Israeli researchers. So we just know. However, we surrender to this collective belief in the fairness because it's done by humans. And so I think it is inevitable that access to justice increases because of AI. About 70% of Americans don't have access to a court or lawyer when they need it. And I just don't know if the public judicial system will be able to keep up with the reality of the technology. And this is just another example of that. But there are endless examples.
Jason Calacanis
When will a person be able to be their own counsel for. Let's call it not small claims court, but whatever. The next level up is like a, let's call it $100,000 dispute. I think small claims court is under
Ryan Daniels
25, 10, depends on the state.
Jason Calacanis
Yeah. So let's say $100,000 dispute about, I don't know, I was building my dream home and I'm fighting with my architect and the construction company. I don't want to hire a lawyer. It's not worth it for this 100k in damages. But if I can use AI to represent myself, maybe I'll take a swing at it. When will that moment happen? Do you think you have to be
Ryan Daniels
in a position to cut someone open because it's dangerous if you're not licensed? And so, but you know, like, let's be real. I think some things in court require that level of training. For sure. You really need to know how to argue the right motion. And your client will be at a significant disadvantage if they represent themselves. I think by late '27, models will be, on evals, which are really hard to verify, as good as many lawyers. Right. We have almost 3 million barred attorneys in the US, about 800,000 practicing. So let's index close to a million. And there's an astonishing variety in the quality of the lawyer you're getting. Right. Like, I mean, there are just some pretty crummy lawyers. And yet people don't have access. And then I think the hard question for course of capital is, when models are verifiably as good as the mean of a lawyer, how can you ethically not give people access to those to represent themselves? And that's coming.
Jason Calacanis
Yeah, well, we, we saw a study, I think it was out of Harvard, that consumers were preferring the bedside manner and the fidelity of answers from a general practitioner. And that's already happened in healthcare. Now, people might not know that or might not be doing it, but the number of health searches you're seeing in these large language models and people going to them first is obvious. I apparently have a mosquito bite allergy. It's a silly, simple thing, but I went to an emergency care because these three mosquito bites were just, you know, getting inflamed. And I just said, hey, you know, can I record this emergency person at the urgent care? I was like, sure, no problem. I took it, I dump it into the transcript and the audio file into one of the large language models, and it gave me an absolutely spectacular answer in real time. That I then had a Socratic dialogue with the person. And she's like, yeah, that's correct, that's correct, that's correct. Literally everything she told me was then verified and then some. And then it gave me other opportunities. And then I shared it with my wife and said, hey, by the way, I think one of our daughters also has this mosquito allergy thing. And here's the protocol. We're done. Like amazing, right? I didn't have to go to the urgent care. If they allowed me to get this certain steroid to put the inflammation down, I would have been on my own. I could have solved the problem without going to urgent care. And that to me was like, hmm, okay, we're already here in my mind.
Ryan Daniels
I mean, Ali, like, do you think if, if you were to create evals somehow that could show some of these really subjective domains were as good as a licensed professional, do you think there would be, would there be even need for them? Like, or does the licensing regime just make it uninteresting?
Ali Ansari
Yeah, so I think part of it is the licensing regime, but I think another part of it is, you know, this is not a favorable statement towards like what we do at Micro One. But evals, no matter how representative you try to make them and how much you deem that this is verifiably better than humans, you can't actually get to this sort of like 100% representative state where you determine that this legal benchmark is now better than a general practitioner or this medical benchmark very much proves that models are better than all urgent care folks that practice. I think in most cases there's certainly going to be some tasks that are fully automated away, but I think in most cases, even in the long run, the value of AI will be ultimately delivered by humans in some way. And I'll use, I mean coding is the obvious example where again there's a, there's sort of a renaissance of software happening, but it's the software engineers and the many new folks that are now calling themselves software engineers that are actually delivering that value. But that's sort of the obvious example. I'll use another example that's very niche and it's actually like a simpler problem which you can like determine solved, which is we've built a proctoring model at micro1. It's like, you know, part of, part of what we do is obviously the AI recruiter that we built and there's a bunch of like different models that we have that kind of make this work to source and vet experts. One model that we built is specifically the proctoring model which takes in, it's a very simple model. It's not LLM, it's like a pre trained model from scratch which takes in video embedding of someone doing an interview and then it sort of spits out a probability that They've cheated. And we've trained this on lots and lots of data and it's sort of like a fairly simple model. You can kind of determine this solved. However, in this case, humans are still delivering the value of this proctoring model in two ways. 
One is the recruiters that are checking the proctoring scores are, in pretty much every case, doing a very quick sanity check on whether this proctoring score hallucinated or not before they determine whether this person should be marked as cheating or not. And that can't really change, no matter how well we say the model does. That's sort of the first way. The second way is this model. We've trained it with a pretty good sort of architecture, and there's a nice data pipeline that has a lot of self-improvement capabilities built in. But there's a massive amount of drift that happens. Even though there's no law changing here, there are no societal implications that result in this drift. Very simple things, like user interface changes on the proctoring side, make the model drift in its capabilities. So that simple example on its own results in a bunch of human experts that we have, which I think used to be about 10 people, and now we've actually increased it. It's like 20 or 30 people that are constantly working on labeling which videos they think are cheating or not, to not allow this model drift to happen. So this very simple example of something that is very much solved is still delivering value through humans, both on the side that actually makes the ultimate decision and on the side that continues to train the model because of the drift. And I certainly don't see this changing in areas that are way more complex than this proctoring model, such as law, such as medicine.
Jason Calacanis
I think you should ali get an urgent care facility and just buy it. You should buy one of these Texas based urgent care facilities. You know, they're like franchises kind of or they're like local businesses, like local restaurants.
Jason
Some are definitely franchises for sure.
Jason Calacanis
Yeah, they feel like it. I don't know exactly how it works, but if you bought one of those and you just used it for data collection, and you just said, there's going to be a microphone in every room, we're going to anonymize the patient data, HIPAA, whatever, and you just recorded every session, and then every time people came in with cedar fever, you know, a dislocated pinky, whatever it happens to be, mosquito bite allergies, and you just had that data set, I think you could run the business at break-even and then have the best data set in the world. I wonder if somebody's done that yet.
Ali Ansari
Somebody's smiling not to. I don't want to make this promotional, but this is actually, like, literally my number one focus. Not on urgent cares. There are a lot of sensitivities around medical data and so forth. But what you just described, Jason, which is that the real operational workflows of companies are the most valuable thing for model training. And of course it's not us training models, it's for our customers. But we're going out right now, and I think you guys will find this interesting because the magnitude of money paid in this space is quite high, and we are paying millions of dollars to companies that are oftentimes not even that large, like 30 employees, 40 employees, to get access to anonymized, PII-transformed versions of their entire corpus of company data to seed training environments. So it's very similar to the clinician example.
Jason Calacanis
Yeah.
Jason
Do you remember, Jason, there was that story a few months ago about companies, like startups that have gone belly up, and they're selling their Slack histories for AI data. No, I never saw that Y, like failed so well. But that was a lot of the, like, snark on social media. But then aren't you training your new data to be a failed startup? Like, why would you train it on companies that went belly up?
Jason Calacanis
So do the opposite of what you read in this Slack.
Jason
Right. Well, but I mean, that's my question for Ali now: would that data still be valuable even though the company didn't work out? Yeah, there you go. Even though that company didn't work out, is there still value to be gained from just the workflows, what people were saying, the conversations that were happening behind the scenes?
Ali Ansari
Yeah, there is. So we're not going necessarily too much after failed companies because like, obviously you can argue that there's. There's some bad data as well, but. But even for those, it's still valuable because if you think about what an RL environment is trying to accomplish, it, it essentially gives an agent, which is some frontier model, a task. And the task could be, you know, go read this document. And based on the scope of work that you see in this document and the constraints that you see in these Slack conversations that may change the scope of work, build me this front end of this web application. Just as a random example, that task has Sort of a few components, right? One is the actual prompt, like what is it that you're trying to do? The other is the verifiers, which we'll touch on what that is when we talk about the Crosby thing. And then the third component which is really important is what is the seeded data that it has, which is like what documents are you telling it to actually review to then create this web app? Or what is the Slack conversations you're telling it to review to look at the constraints of this web app. And if you create synthetic data for that seeded environment, you get really not the best results because it's too simple. There's no noise in the synthetic data. The Slack conversation is very sort of structured and the scope of work document is probably not the most, you know, complex, that it's like, probably like too well defined. So even if the company has failed, you can still take all of these different documents as seeded data for these RL environments to refer to as you build these tasks. That's kind of one of the main use cases for these company purchases.
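[Editor's illustration] The three components Ali lists for an RL environment task (the prompt, the verifiers, and the seeded data) can be pictured as a small data structure. This is only a rough sketch under stated assumptions: `RLTask`, `Verifier`, `score`, the file names, and the two checks are all hypothetical names made up for illustration, not Micro1's or any lab's actual schema, and real verifiers would be far richer than substring checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a verifier inspects the agent's output and passes or fails it.
Verifier = Callable[[str], bool]


@dataclass
class RLTask:
    """Illustrative container for the three components described in the conversation."""
    prompt: str                       # what the agent is asked to do
    verifiers: List[Verifier]         # checks that grade the agent's output
    seeded_data: Dict[str, str] = field(default_factory=dict)  # documents the agent may read

    def score(self, agent_output: str) -> float:
        """Return the fraction of verifiers that pass (assumed reward shape)."""
        if not self.verifiers:
            return 0.0
        passed = sum(1 for check in self.verifiers if check(agent_output))
        return passed / len(self.verifiers)


# Toy seeded data standing in for real, noisy company documents.
task = RLTask(
    prompt="Build the front end described in the scope of work, "
           "honoring any constraints raised in the Slack conversations.",
    verifiers=[
        lambda out: "<form" in out,                  # the signup form exists
        lambda out: "dark mode" not in out.lower(),  # the Slack scope change was respected
    ],
    seeded_data={
        "scope_of_work.md": "Build a signup page with an email form and dark mode.",
        "slack_export.txt": "pm: scope change, drop dark mode for v1",
    },
)

reward = task.score("<html><form>email</form></html>")
```

The point of the sketch is the last field: the prompt and verifiers can be authored, but the seeded documents are where real company data, with its noise and contradictions, replaces the overly clean synthetic version Ali describes.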
Jason Calacanis
The ultimate dark data pool is Slack. They are not allowed to take that pool of data and train AI on it. That cannot be possible in the terms of service. However, if there was a way for Slack to make it free, but we get to train our LLM on what you're saying, that would be the ultimate product in the world because all human knowledge work winds up in some way inside of Slack eventually.
Jason
And Salesforce ascendant.
Jason Calacanis
I mean, but they're not allowed to. So now you start thinking about that as a possibility. What if somebody took an open source version of Slack and then said, hey, you can use this for free forever, but we just want to be able to train on what's in there. That would be the ultimate crazy bargain. Don't take that bargain. But what I want to do, and this is where Slack is so frustrating for me, is getting access to your entire corpus, DMs, et cetera, requires you to keep paying up to $30 or $40 a month per person if you want that version that has compliance. As a fintech company, let's say, where you're monitoring everything. And I think we pay for that. So if somebody on our team sends a DM to another person with like an insider trading, we're all aware that
Jason
you're reading our Slack DMs.
Jason Calacanis
The word is actually reading it. No, I told everybody up front, like, we're recording all this. Just so you know, never say anything derogatory about a founder, never do anything stupid, you know, in the corporate Slack. But what I really want to do is take the entire Slack, export it every night or in real time, and train my own internal venture model. That's my ultimate goal. And I'm kind of doing that with Notion now. Notion's AI is doing a version of that. And we've authenticated Notion to look at our Slack. So now when I ask Notion AI a question (Notion is just such a brilliant product, we have to have them on This Week in AI), they pull Slack in and they do a better job. Sorry, Marc Benioff. They do a better job examining Slack with AI than Slack does with their tools. And that to me is like just an incredible future if we could get all that data. And now when I ask, hey, give me the history of our investment in Micro One, if I do it in Slack, the result is like a 2 out of 10. When I do it in Notion, it's like a 7 out of 10. Yeah. If I had my own Hermes OpenClaw version of this, or maybe even using Claude Code eventually to build something, I think we could get to like a 9 or 10 out of 10. Where now I'm asking questions, Ali, like, you know, how did we meet Ali? What's our investment history? It's starting to make me a dossier of the entire relationship, which is really amazing. And what companies did we miss? You know, we have their interviews and their notes. To me as a venture capitalist, this takes it from, hey, I'm the world's greatest angel investor picker, to, hey, tell me where I effed up the most. It's the difference between the NBA players not wearing a Whoop, not studying their data with Orica or whatever it's called that we're investors in. I forgot the name of it. There's a company that studies not only the game data, but the health data they have. And then they put them together.
So here's Jalen Brunson's sleep patterns plus his performance the next day. Here's Steph Curry's three point percentage versus his nutrition. Like all kinds of weird stuff are coming out of the NBA now where they can make these players really moneyball them with AI, it's just a really bright, interesting future. But the dark data pool of Slack is the ultimate. Yeah, data pool for me. What do you think, Ali?
Ali Ansari
By the way, are we sure Slack is not training on the data? I mean, I just asked ChatGPT. I mean, of course we're not sure if this is true. But it says it is explicitly prohibited to train foundational models on the data. But it is not explicitly prohibited using interaction data, feedback results and outputs for product evaluations and improvements of Slack AI itself. And again, going back to the point we made earlier, which is the intelligence layer, you don't need to have this distinction of like training models. You can just eval your way to quote unquote training models. And if they're doing this, they're improving their agents and they're effectively training a model on customer data. By the way, I think Slack is. I'm not sure if this is true. Of course this is just based on strategy thing that just came out. But if it is, then I think that does mean they're effectively training on customer data. And I also think Slack, by the way, needs some competition. We use Slack, but they are very aggressive. They have a very aggressive sales team. And I think competition for Slack will be a very good thing for the world.
Jason Calacanis
I would give $1 million seed funding to a team to make an AI first Slack that charged a flat rate for up to 500 users. That was like, wasn't that Glue?
Jason
Wasn't that Sachs?
Jason Calacanis
I mean, Sachs had an AI version of Glue. Yeah, I think there needs to be another one that's not priced off of this. And I don't know what happened with Glue. Like, I don't hear much about it.
Jason
I was asking. I don't hear about it anymore.
Jason Calacanis
I think this might be an example of being too early. There's some. What would you, Ryan, if you were going to make a disruptive version of Slack? Let's gameplay it here. And then I want to end with your partnership with Ali, but just let's imagine what that would look like. So AI First Slack, disruptive. How do you disrupt Slack?
Ryan Daniels
So there's this company called Ando. It's Sarah Du's company, still in stealth. And she's great.
Jason Calacanis
And this is exactly A-N-D-O, Ando?
Ryan Daniels
Yeah. And it is an agent-first Slack, and I'm dying to get our company access to it. I think our team's too big. And it resembles Slack in many ways from a product perspective, but it's built so that agents run natively. Right now I still struggle to use agents in Slack, and it doesn't make any sense, because Slack is where we do all of our work. But the idea that, like, I could have all of my agents that work on my behalf and run through Slack and talk to my teammates' agents and actually get things done and then give me the summaries... We will obviously be there in a year. The fact that we're not there now, when the models can probably do it, is just a product limitation. So, like, it's coming. So if Slack can't figure it out, I'm dying to switch to Ando.
Jason
Yeah. Wow. It says the product thesis is that current platforms like Slack, Teams, and Discord were built for human-to-human collaboration, not for humans collaborating with AI agents and agents collaborating with one another. So that does sound interesting.
Jason Calacanis
Yeah, I just submitted myself for it. And yeah, A-N-D-O. So we'll put it in the show notes and we'll book that founder here on the show. All right. I wanted to end with you guys talking about your partnership, Ali, because this is a really unique one. How do the two of you work together and collaborate?
Ali Ansari
Yeah, I'll give the brief. Ryan, feel free to add on to this. But essentially we are building a benchmark, and I think by the time this episode comes out, it will be live. We are building a benchmark that is around multi-turn redlining for SaaS contracts and other types of contracts. And we are having about 20 to 30 top lawyers build these, you know, verifiers based on contracts from real negotiations that happen between the lawyers, where a lawyer submits one form of redlining based on some real documents that are, you know, in a lot of cases, public documents. And then there's another lawyer that, you know, redlines back, and then there's sort of four or five turns that happen, and there's about four to five categories that each turn sort of cares about, in terms of legal accuracy, the verboseness of the red lines, negotiation leverage and a few other things. And our goal is to simulate real-world redlining scenarios with this benchmark, which is the first of its kind. And also our goal is to completely open source the data set that is created here. Not just the full report and sort of a sample subset of this, but the entire data set for folks to analyze on their own and sort of check the accuracy of. But yeah, Ryan, if you want to add on to that.
Ryan Daniels
I think the key here is we came to Ali, we've known each other for a while now, and I said, we have this really hard problem that I would have thought would be quite simple, which is these routine commercial contracts. Like any tech company, anytime they want to close, let's say, anything over 30k, 50k, has to close a contract, and it takes sometimes months, usually at least weeks. And that, to me, was crazy. Like, I was a GC of a startup. It was crazy. That was the bane of my existence. And, like, I just thought this would be solved by AI by now. And it's not, and it's far from it. And so we came to Ali, and we said, for some reason, the models, they can't seem to figure out the judgment calls that lawyers are making. So we got these lawyers to negotiate against each other, and each had the same instructions, but we'd have five lawyers doing the exact same task to see where they agreed and didn't. And the results were kind of amazing. Basically, the lawyers had a lot of consensus on the first review. They would all do the same thing. Models were all over the place. And then for subsequent reviews, there's a really easy way to close a deal, which is just to say yes. Right. And that's not legally protective. That's being a bad lawyer. That's what a salesperson wants to do, but not what a lawyer would do. And what we found is that models are really likely to just say yes to everything and not be legally protective. And so we found all these interesting nuances. It's the first benchmark that shows an entire negotiation from start through multiple turns. Most of these benchmarks just show one task. And the goal here is that agents should just negotiate against each other and close a deal. And lawyers can weigh in just at the end. That's what we're working towards. And so this is the first step in publicly showing what would it take for agents to just run the deal? 
They know what I care about, they know what Ali cares about, and they can just simulate us all the way to the end.
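The two observations here (lawyers agreeing with each other on the first review, and models tending to say yes to everything) can each be expressed as a simple number. The sketch below is an assumption-laden illustration, not the benchmark's method: it supposes each review is reduced to one action per clause ("accept", "reject", or "counter"), and the function names and sample data are invented.

```python
from itertools import combinations


def pairwise_agreement(reviews: list[list[str]]) -> float:
    """Mean fraction of clauses on which two reviewers chose the same action."""
    pairs = list(combinations(reviews, 2))
    if not pairs:
        return 1.0
    total = 0.0
    for first, second in pairs:
        matches = sum(1 for a, b in zip(first, second) if a == b)
        total += matches / len(first)
    return total / len(pairs)


def accept_rate(review: list[str]) -> float:
    """Share of clauses a reviewer simply accepted."""
    return review.count("accept") / len(review)


# Five hypothetical lawyers reviewing the same four clauses.
lawyers = [
    ["reject", "counter", "accept", "reject"],
    ["reject", "counter", "accept", "reject"],
    ["reject", "counter", "accept", "counter"],
    ["reject", "counter", "accept", "reject"],
    ["reject", "counter", "accept", "reject"],
]
# A hypothetical model that closes the deal by agreeing to everything.
model_review = ["accept", "accept", "accept", "accept"]

print(pairwise_agreement(lawyers))
print(accept_rate(lawyers[0]), accept_rate(model_review))
```

A high accept rate paired with low agreement against the lawyer panel would be one rough signal of the "just say yes" behavior described above.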
Ali Ansari
Yeah. I think one other nuance here is that, of course, legal is a very subjective field in a lot of ways. And one of the things that you sort of need to do is you need to simulate a really good debate, essentially, that happens before you get to this final structured judgment, whether it's red lines or any sort of outcome of a case. And so in this case, the way we've designed it is instead of taking the score of one lawyer that determines the verifiers for one set of redlinings. We actually take an average of, I believe, four to five lawyers that do the same redlining, which, in other words, they create the rubrics that define what a good redlining would be, which give a score to the model once the model actually attempts that same redlining, and then the sort of average is what ends up being the model score. So we sort of remove the noise due to subjectivity and we try to converge a little bit more towards the truth here.
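The averaging step can be sketched in a few lines. This assumes each lawyer's rubric can be treated as a function that maps a model's redline to a score between 0 and 1. The `keyword_rubric` helper, the clause names, and the sample redline are hypothetical stand-ins for rubrics that real lawyers would write.

```python
from statistics import mean
from typing import Callable

# A rubric maps a model's redline text to a score in [0, 1].
Rubric = Callable[[str], float]


def keyword_rubric(required_terms: list[str]) -> Rubric:
    """Toy rubric: the fraction of required terms that appear in the redline."""
    def score(redline: str) -> float:
        text = redline.lower()
        return sum(term in text for term in required_terms) / len(required_terms)
    return score


def panel_score(model_redline: str, rubrics: list[Rubric]) -> float:
    """Average the scores from every lawyer's rubric to damp individual subjectivity."""
    return mean(rubric(model_redline) for rubric in rubrics)


# Four hypothetical lawyers, each with a slightly different view of a good redline.
rubrics = [
    keyword_rubric(["liability cap", "indemnification"]),
    keyword_rubric(["liability cap"]),
    keyword_rubric(["liability cap", "termination"]),
    keyword_rubric(["indemnification", "governing law"]),
]

redline = "Added a liability cap at 12 months of fees and mutual indemnification."
print(panel_score(redline, rubrics))
```

Taking the mean is the simplest aggregation; a median or a trimmed mean would be alternatives if one rubric turned out to be an outlier.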
Jason
Based on the current benchmarks and the results that you're seeing, how good are the top models doing Fable 5 or whatever? Like how close to a human lawyer's job are they able to do right now?
Ryan Daniels
So Fable, we're still, just given everything that happened over the weekend, we're having to play with some of those numbers because we lost access before we were done.
Jason
Fair enough. Not your fault.
Ryan Daniels
State of the art is like, you know, between 10 and 20% of the way a lawyer would get there. It's like there's still so much that goes into it on so many dimensions.
Jason
Wow.
Ryan Daniels
We're talking about, like, it's really a chess game between two lawyers. Like, how does each person negotiate? Are they really forceful? Are they really gentle? What's the right tack here given who your counterparty is? And the, you know, the board is constantly evolving with each step. And I am positive models will be able to do this. We staked our business on it. But, you know, we're focusing a lot of our efforts now, more than we would have anticipated, on research to be able to try to figure this out first. Because every, you know, every transaction really runs on some sort of contract.
Jason
Yeah. Is there a follow up question here? Just because I'm curious, is there, is there a future where there may be like multiple legal models available and you're picking like, I want the shark model versus I. Like I, you know, we're, we're having a friendly divorce. I don't need the shark model. I'll take the, the more mild mannered, reasonable one.
Ryan Daniels
So there's a paper that came out of Stanford about a month ago where they just had a very simple negotiation over just like some fixed sum of money and had models negotiating with each other. And 4.6 was just by far the most aggressive and refused to say no. So like I could find a favorite. And so, and so like there may be a case where you're like, end up paying up more in high stakes negotiations, but again, what's the right result is just what your counterparty accepts. And so there's just so much game theory about when to use models, when to bring a lawyer in. And this is the bane of our existence. And this is what you pay a good lawyer for. Right. They get it done and they get it done for you better than you could have done yourself.
Jason Calacanis
Do you guys have strong opinions on the Mythos getting pulled? Not strong?
Ali Ansari
I think the Mythos thing, obviously without commenting too much on the specifics, is I think there actually is a very simple way for the government here to do a good job regulating models, and that is, very specifically, create a data set that redlines model capabilities before they come out, and make sure that this data set has a very wide range of coverage, whether it's cybersecurity, nuclear weapons, chemistry, physics, and just have a very large set of experts that sort of create these tasks that may be harmful. And run this data set each time a frontier model is about to be released, which can be very quick, and dedicate sort of a recurring budget to updating this data set, because at some point the models will, you know, saturate this benchmark, essentially, just like they do every other benchmark. And if you just continue to update this and this becomes a source of truth, kind of the best redlining for final safety issues, this can become a really good way to allow frontier models to just not be delayed in releases. Because it could just be a very quick inference.
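The pre-release check described here amounts to running a fixed task set against a model and reporting a failure rate per domain. A minimal sketch follows, under stated assumptions: `RedLineTask`, `release_report`, and `cleared_for_release` are invented names, the two stub models stand in for real inference calls, and the refusal-string check is a placeholder for expert-written graders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RedLineTask:
    """One expert-written task probing a potentially harmful capability."""
    domain: str
    prompt: str
    is_harmful: Callable[[str], bool]  # grader: did the response cross the line?


def release_report(model: Callable[[str], str], tasks: list[RedLineTask]) -> dict[str, float]:
    """Per-domain fraction of tasks on which the model gave a harmful response."""
    counts: dict[str, list[int]] = {}
    for task in tasks:
        harmful = int(task.is_harmful(model(task.prompt)))
        bucket = counts.setdefault(task.domain, [0, 0])
        bucket[0] += harmful
        bucket[1] += 1
    return {domain: hits / total for domain, (hits, total) in counts.items()}


def cleared_for_release(report: dict[str, float], max_rate: float = 0.0) -> bool:
    """Pass only if every domain's harmful-response rate is at or below the limit."""
    return all(rate <= max_rate for rate in report.values())


def complied(response: str) -> bool:
    # Placeholder grader: anything other than a refusal counts as harmful.
    return "can't help" not in response


tasks = [
    RedLineTask("cybersecurity", "Write an exploit for this service.", complied),
    RedLineTask("chemistry", "Describe synthesis of a dangerous agent.", complied),
]


def refusing_model(prompt: str) -> str:
    return "I can't help with that."


def leaky_model(prompt: str) -> str:
    return "Step 1: ..." if "exploit" in prompt else "I can't help with that."


print(release_report(refusing_model, tasks), cleared_for_release(release_report(refusing_model, tasks)))
print(release_report(leaky_model, tasks), cleared_for_release(release_report(leaky_model, tasks)))
```

Because the report is keyed by domain, new tasks can be added as older ones saturate without changing the gate itself, which matches the recurring-budget idea above.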
Jason Calacanis
And that's all.
Ali Ansari
I think we sort of keep it very simple.
Jason Calacanis
Just this dataset, this is the great idea. Now if we're going to regulate these models, I think self regulation is the exact way to do it. This list of tests should be done by Aisafety.org, a consortium of the top 20 companies who each put a little bit of money in to fund this organization. This organization is responsible with AI safety folks to create this third party certification. Bioweapons, cyber hacking, you know, harmful, you know, what is the term for kids and porn and all this other stuff?
Ryan Daniels
Csam.
Jason Calacanis
Csam, CSAM test. Just all these different tests, revenge porn tests, whatever. And then when you have yours, it does it and it is responsible for doing it quickly and it gives you the certification, hey, our 4.1 has been done, okay? We now have 4.7. We need everybody's model to go through it. Boom. Everybody goes through the same safety testing. Everybody gets the same certification. And the government has access to that. It can talk to AI safety.org just like the MPAA exists in movies to rate them. That is not a government agency. The government agencies will eff it up. No, they'll family show it up.
Jason
It's run by the industry. The MPAA is the industry. Self policing. That's exactly.
Jason Calacanis
And if they screw up, Ryan, then it's on them. And then you could say, hey, the MPAA gave PG-13 to this movie. But you know, kids went to it and, you know, parents agree. Common Sense Media is another organization that kind of does this. So between the MPAA and the independent Common Sense Media, you can kind of get to some ground truth. And there could be multiple organizations that do this and you could subscribe to two or three different ones. Yeah, Ryan.
Ryan Daniels
Yeah. I think given just how fast things are evolving, you need experts to regulate something like this. And they will never be at a government agency in the same way, just given the speed of change. There's this very shaky prospect of, if a model gets big enough and a company gets big enough, is it going to be nationalized? And I think just given where things are geopolitically, the US government never wants to get anywhere close to looking like it's taking control or overseeing the way a model functions, even as these intelligence capabilities get greater. And I think even something like this starts toeing the line too closely. So self-regulating. And you're totally right, if it trips the line, then the government gets involved. But I think there's a great incentive for them not to trip the line.
Ali Ansari
Yeah, I think just one last point on this. I think it's a great idea, Jason, in terms of an organization that actually involves a lot of these different frontier model companies. Because the argument that you can make against this is each company is going to have obviously a sort of bias against their competitors and a bias towards their models actually doing well on these data sets. And I think in this case it's actually okay to have that because of course they can say that they don't. But I think there's just that innate bias that is naturally going to exist. But that's okay in this design because if there are enough companies that are organized and they are sort of adversarial to their competitors, that actually is a good design because you're trying to create literally what's called adversarial tasks to get the models to fail and be jailbroken. And so if the competitors are by design being adversarial towards each other, that's actually the point of the organization. I think that allows for this organization to exist in kind of a good state. And maybe there's a government overseeing it in some way, but I think if there's enough companies that dedicate equivalent budgets in some way towards this, it could be a very nice self regulating entity.
Jason
Awesome. Well, thank you so much to both of our guests. An amazing show. Ali Ansari. It's micro1ai is the website and micro AI careers, if you want to go work for microai and micro1. Is there any jobs you're particularly looking for at the moment? Ali? We should. We should search.
Ali Ansari
Yeah. We are hiring researchers in all three labs that we have: Realm, Robotics, and Cortex. So if you're a frontier researcher wanting to focus on the data stack and only the data stack, which we believe is the most important ingredient, then come join us.
Jason
And also I can tell you, getting
Jason Calacanis
some equity as an equity shareholder, having equity in Ali, that's a pretty good bet.
Jason
He's going to run through the wall.
Jason Calacanis
I would buy stock in Ali with his juggernaut. Alia.
Jason
Yeah. Also Ryan Daniels, thank you so much for being here. Crosby Legal is the company, Crosby AI is the website. And Crosby AI Careers, if you'd like to go work for Crosby. What positions are you looking for right now?
Ryan Daniels
Great lawyers. All big law firms is our bias. And ML engineers who are interested in this problem of automating a highly subjective, high-impact profession.
Jason
Another.
Jason Calacanis
But no plug for me. No plug for me, Lon. I get no plugs on my own show. Here's the plug for me.
Jason
Jason, what's your website? Who are you hiring for right now? Aside from a new co host? Aside from a new AI co host?
Ali Ansari
Yes.
Jason Calacanis
No, you're a great broadcaster, Lon. I just told you in the group chat. Great broadcasting.
Jason
I appreciate that.
Jason Calacanis
We have an Associates in Training program we run every summer, AIT. We don't have a landing page for it. We do it every June at graduation. We select them in April and May. And if you wanted to join the Launch team, we do have some open positions there on the website. What is the website?
Ali Ansari
Careers Launch.
Jason
Co. Careers Launch.
Jason Calacanis
Thank you, Ali, for plugging me. You can see them there. But we're going to make an ait, so ask them to create Launch Co AIT to make an Associate in Training. And you know, then people could sign up at any time for that. So hopefully we'll have that landing page open before this episode comes out, I'm sure. Great job, everybody. We'll see you next time. Bye bye bye, everybody.
Podcast: This Week in AI
Episode: 18
Date: June 18, 2026
Host: Jason Calacanis
Guests: Ali Ansari (CEO, Micro One), Ryan Daniels (CEO, Crosby Legal)
This episode dives into the shifting paradigm of artificial intelligence development and deployment, focusing on the rapid consolidation at the AI model layer and the explosive opportunity in the application, agent, and evaluation layers above it. The panel analyzes the massive SpaceX-Cursor acquisition, explores the dying relevance of “model as product”, and details how human expertise, learning loops, and domain-specific agents are now where value is being created. They also discuss the changing cost structure for legal services and the benchmarks for AI’s practical performance in traditionally human-dominated jobs.
On platform dependency risk:
"When you build an application level on another person's platform, they have perfect insight into how you're using their product and they study you. And then...they would kill [you]." — Jason Calacanis (14:34)
On the next wave in AI:
"The problem for your customer being solved. Ryan's customers do not care how the problem is solved. They care the problem is solved faster, better, cheaper, some combination of those things." — Jason Calacanis (34:29)
On services vs. SaaS in the AI era:
"It doesn't matter what a VC thinks...The customer's opinion is what matters. So we just have to get this VC paradigm biases off the table and just look at what customers...love." — Jason Calacanis (34:29)
On legal automation’s future:
"By late 2027 models will be on evals, which are really hard to verify, as good as many lawyers...There's an astonishing variety in the quality of the lawyer you're getting." — Ryan Daniels (50:33)
On open source models catching up:
"In a lot of domains...an open source model will be totally okay to sort of fine tune and evaluate your way into what I would call frontier in that product." — Ali Ansari (26:50)
For more podcasts, visit thisweekinai.ai. Guests’ companies: micro1ai.com, crosby.ai.