
A
Today, my partner Jordan and I have a special episode of Unsupervised Learning, a crossover with one of our favorite AI podcasts, Latent Space. If you're not already a listener, Latent Space is a technical newsletter and podcast by and for AI engineers. It hit over 2 million downloads in 2024 and it's become a go-to resource for anyone who wants to understand the cutting edge of AI infrastructure, tooling and product. If you like this show, it's definitely worth checking out. Given we've all spent a lot of time talking to some of the sharpest minds in AI, we thought it'd be fun to interview each other. In this episode, we dig into the questions we're constantly thinking about: what surprised us most last year, what we're paying most attention to right now, how we think about defensibility at the app layer, and which public companies we're long or short on. It's a different kind of episode and I think you'll really enjoy it. Now here's my conversation with swyx and Alessio from Latent Space. Well, thanks so much for doing this, guys. I feel like we've been excited to do a collab for a while.
B
I love crossovers. Yeah, this, this is great. Like the ultimate meta about just podcasters talking to other podcasters.
A
Yeah, it's still a lot of podcasts all the way up. I figured we'd have a pretty free-ranging conversation today, but I brought a few conversation starters to kick us off. So I figured one interesting place to start is, you know, obviously it feels that this world is changing like every few months. Wondering, as you guys reflect back on the past year, what surprised you the most?
C
I think definitely reasoning models. On the ride here, we were kind of like, oh, that. Well, I think there's two: what surprised us in a good way and maybe in a bad way. I would say in a good way, reasoning models. And I think the release of them right after the NeurIPS "scaling is dead" talk by Ilya. I think there was maybe like a little "it's so over" and then "we're so back" in like such a short period.
A
It was really fortuitous timing that like right as pre training died. I mean obviously I'm sure within the labs they knew pre training was dying and had to find something. But you know, from the outside it felt like one right into the other.
C
Yeah, yeah, exactly. So that was a good surprise, I would say.
B
If you want to make that comment about timing, I think it's suspiciously neat that we know that Strawberry was being worked on for two years-ish. And we know exactly when Noam joined OpenAI, and that was obviously a big strategic bet by OpenAI. So for it to transition so nicely, when pre-training is kind of tapped out, into like, oh, now inference time is the new scaling law, is very convenient. If there were an Illuminati, this would be what they planned, or a simulation or something.
A
Yeah. Then you said open source as well.
C
Yeah, well, no, I think like open source. Yeah, we were discussing this on the negative side: I would say the relevance of open source, specifically open models. Yeah, I was surprised by the lack of adoption. And I mean people use it, obviously, but I would say nobody's really like a huge fanboy. You know, I think the LocalLLaMA community and some of the more obvious use cases really like it. But when we talk to like enterprise folks it's like, it's cool. And I think people love to argue about licenses and all of that, but the reality is that this didn't really change the adoption path of AI.
B
Yeah, the specific stat that I got from Ankur from Braintrust in one of the episodes that we did was, I think he estimated that open source model usage at work in enterprises is at like 5% and going down.
A
It feels like basically all these enterprises are in use case discovery mode, where it's like, let's just take what we think is the most powerful model and figure out if we can find anything that works. And so much of it feels like discovery of that. And then right as you've discovered something, a new generation of models is out and so you have to go do discovery with those. I think obviously we're probably optimistic that the open source models increase in uptake. It's funny, I was going to say my biggest surprise in the last year was open source related, but it was just how fast open source caught up on the recent models. It was kind of unclear to me over time whether there would be compounding advantage for some of the closed source models, where in the early days of scaling there was a tight time loop. But over time, you know, would the gap increase? And if anything it feels like it shrunk, you know. And I think DeepSeek specifically was just really surprising in how, you know, in many ways if the value of these model companies is like you have a model for a period of time and you're the only one that can build products on top of that model while you have it, like, God, that time period is much shorter than I thought it was going to be a year ago.
B
Yeah. I mean, again, I don't like this label of how fast open source caught up, because it's really how fast DeepSeek caught up.
C
Right.
B
And now we have, I think, some evidence that DeepSeek is basically going to stop open sourcing models. So, like, there's no team open source, there's just different companies and they choose to open source or not. And we got lucky with DeepSeek releasing something, and then everyone else is basically distilling from DeepSeek, and those are distillations. Catching up that way is such an easier, lower bar than actually catching up, which is like, from scratch, you're training something that is competitive on that front. I don't know if that's happening. Like, basically the only player right now is, we're waiting for Llama 4.
D
I mean, it's always an order of magnitude cheaper to replicate what's already been done than to create something fundamentally new. And so that's why I think DeepSeek overall was overhyped. Right. I mean, obviously it's a good open source new entrant, but at the same time there's nothing new fundamentally there other than sort of executing what's already been done really well.
A
Right.
C
Well. But I think the traces are like maybe the biggest thing. I think most previous open models are like the same model, just a little worse and cheaper. R1 is like the first model that had the full traces. So I think that's like a net unique thing in open source. But yeah, I think we talked about DeepSeek in our end of year 2023 recap. And we were mostly focused on cheaper inference. Like, we didn't really have DeepSeek.
B
DeepSeek V3 was out then, and we were already talking about fine-grained mixture of experts and all that.
A
Like, that's a great receipt tab to be like, yeah, end of year 23. Yeah, that's an impressive one.
B
If you follow the right whale believers on Twitter, it's pretty obvious. So, you know, I used to be in finance, and a lot of my hedge fund and PE friends called me up. They were like, why didn't you tip us off on DeepSeek? And I'm like, well, I mean, it's been there. It's actually kind of surprising that Nvidia fell like, what, 15% in one day because of DeepSeek. And I think it's just that whatever the public market narrative decides is a story becomes the story. But really the technical movements are usually one or two years in the making before that.
A
Basically, these people were telling on themselves, and they didn't listen to your end-of-year podcast?
B
No, no, no. Like, yeah, we weren't banging the drum. So it's also on us to be like, no, this is an actual tipping point. And I think our function as podcasters and industry analysts is to raise the bar of focused attention on things that we think matter. And sometimes we're too passive about it. And I think I was too passive there. I'd be happy to own up on that.
A
No, I feel like over time you guys have moved into this more interesting role of taking stances on what is or isn't important. And we feel like you've done that with MCP of late and a bunch of things.
B
Yeah. So the general push is AI engineering. I've got to rep the shirt. And MCP is part of that. But the general movement is: what can engineers do above the model layer to augment model capabilities? And it turns out it's a lot. And it turns out we went from making fun of GPT wrappers to now, I think, the overwhelming consensus being that GPT wrappers are the only thing that's interesting.
A
I remember Aravind from Perplexity came on our podcast and he was like, I'm proudly a wrapper. It's like, anyone talking about, you know, differentiation pre product-market fit, that's a ridiculous thing to say. Like, build something people want and then over time you can kind of worry about that.
B
Yeah. I interviewed him in 2023 and I think he may have been the first person on our podcast to, like, proudly be a GPT wrapper.
A
Yeah.
B
And yeah, obviously he's built a huge.
A
Business on that now. We all can't get enough of it.
B
I have another one for. So that was Alessio's one and we prepped individual answers just to be interesting.
A
In the same Uber on the way up. Yeah, you're just like in different.
C
Oh, I was driving too. Oh, you were actually.
B
I mean, it was mostly a joke. Mine was actually: it's interesting that low-code builders did not capture the AI builder market. Right. AI builders being Bolt and Lovable. Low-code builders being Zapier, Airtable, Retool, Notion. Any of those. Like, when you're not technical, you can build software.
A
Yeah.
B
Somehow all of them missed it. Why? Yeah, it's bizarre. Like, they should have the DNA. I don't know. They already have the reach, they already have the distribution. Like, why? I have no idea.
A
The ability to fast follow too. Like, I'm surprised.
D
Yeah.
B
There's just nothing.
A
Yeah. What do you make of that?
B
It seems, and not to come back to the AI engineering pitch, it takes a certain kind of founder mindset or AI engineer mindset to be like, we will build this from whole cloth and not be tied to existing paradigms. I think, because if I'm, you know, Wade or... who's the Zapier person that you know? Mike. Mike, who has left Zapier. Yeah, like, you know, Zapier, when they decided to do Zapier AI, they were like, oh, you can use natural language to make Zapier actions.
D
Right.
B
When Notion decided to do Notion AI, they were like, oh, you can, you know, write documents or fill in tables with AI. Like, they didn't do the next step because they already had their base. And they were like, let's improve our baseline. And the other people who actually tried to create from whole cloth were like, we've got no preconceptions. Let's see what kind of software people can build from scratch, basically. I don't know. That's my explanation. I don't know if you guys have any retros on the AI builders.
A
Yeah. Or did they kind of get lucky starting that product journey right as the models were reaching the inflection point?
B
There's the timing issue. Yeah, yeah, yeah, yeah, yeah. I don't know. Like, to some extent, I think the only reason you and I are talking about it is that they, both of them have reported like ridiculous numbers, like 0 to 20 million in three months, basically, both of them.
A
Jordan, did you have a big surprise?
D
Yeah. I mean, some of what's already been discussed. I guess the only other thing would be on the Apple side in particular, I think, I think, you know, for.
A
The last text message summary is like.
B
But they're funny.
A
They're funny. And how often they're viral.
C
Yeah.
D
I mean, so like for the last couple years, we've seen so many companies that are trying to do personal assistants, like all these various consumer things. And one of the things we've always asked is, well, Apple is in prime position to do all this. And then with Apple Intelligence, they just totally messed up in so many different ways. And then the whole BBC thing, saying that the guy shot himself when he didn't. And there's just so many things at this point that I would have thought that they would have ironed out their AI products better, but it just didn't really catch on.
A
Second on this list of generally overly broad opening questions would be anything that you guys think is kind of like overhyped or underhyped in the AI world right now.
C
Overhyped: agent frameworks.
B
Not naming any particular ones.
C
I'm sorry. Yeah, exactly. I would say there's just overall a chase to try and be the framework when the workloads are in such flux that I just think it's so hard to reconcile the two. I think what Harrison and LangChain have done so amazingly is product velocity. Like, you know, the initial abstractions were maybe not the ending abstractions, but they were just releasing stuff every day, trying to be on top of it. But I think now we're past that. What people are looking for now is something that they can actually build on and stay on for the next couple of years. And we talked about this with Bret Taylor on our episode, and it feels like it's the jQuery era of agents and LLMs. It's kind of like, you know, single-file big frameworks, a lot of helpers. But maybe we need React. And I think people are just trying to build still jQuery. Like, I don't really see a lot of people doing React.
B
Like, yeah, maybe the only modification I made about that is maybe it's too early even for frameworks at all.
A
Do you think there's enough stability in the underlying model layer and patterns to have this?
B
The thing is the protocol, not the framework, because frameworks inherently embed protocols, but if you just focus on the protocol, maybe that works. And obviously MCP is the current leading area, and I think the comparison there would be, instead of just jQuery, it is XMLHttpRequest, which is the thing that enabled Ajax. And that was the inciting incident for JavaScript being popular as a language.
D
I would largely agree with that. I mean, I think on the React side of things, I think we're starting to see more frameworks sort of go after more of that. I guess like Mastra is sort of on the TypeScript side and more of like a sort of... Mastra. Yeah, yeah. The traction is really impressive there and so I think we're starting to see more surface there, but I think there's still a big opportunity.
A
What do you have for an over- or underhyped?
D
On the underhyped side. You know, actually, I know I mentioned Apple already, but I think the Private Cloud Compute side, with PCC, I actually think that could be really big. It's under the radar right now, but in terms of basically bringing the on-device sort of security to the cloud, they've done a lot of architecturally interesting things there.
B
Who's they?
D
Apple. Oh, on the PCC side and so I actually think that see a negative.
B
On Apple Intelligence, but on the Apple cloud, on the.
D
More of the local device sort of. I think there will be a lot of workloads still on device, but when you need to speak to the cloud for larger LLMs, I think that Apple has done a really interesting thing on the privacy side.
C
Yeah, we did the seed of a company that does that.
D
Yeah. Especially as things become more purpose.
A
So that felt like a perfect.
C
I was like let's go Jordan.
A
Before this episode.
D
Tell me about that company after.
C
But, but yes, I think that's like the unique. The thing about LLM workflows is like you just cannot have everything be single tenant because you just cannot get enough GPUs. Like even like large enterprises are used to having VPCs and like everything runs privately but now you just cannot get enough GPUs to run in a VPC. So I think you're gonna need to be in a multi tenant architecture and you need like you said like single tenant guarantees in multi tenant environments. So yeah, it's an interesting space.
A
Yeah. What about you, swyx?
B
Underhyped, I want to say memory, just like stateful AI. As part of my keynote, every conference I do, I do a keynote, and I tried to do the task of defining an agent, obviously evergreen.
A
Content for a keynote.
B
But I did it in a way that I think a researcher would. You survey what people say and then you sort of categorize and go, okay, this is what everyone calls agents and here are the groups of different definitions. Pick and choose. And then it was very interesting that the week after that, OpenAI launched their Agents SDK and kind of formalized what they think agents are. Cloudflare also did the same with us, and none of them had memory. Very strange. Pretty much the only big lab. Obviously there's conversation memory, but there's not memory memory, as in: let's store a knowledge graph of facts about you and exceed the context length. If you look closely enough, there's a really good implementation of memory inside of MCP. When they launched with the initial set of servers, they had a memory server in there, which I would recommend as where you start with memory. But I think if there was a better memory abstraction, then a lot of our agents would be smarter and could learn on the job, which is something that we all want. And for some reason we've all just ignored that because it's just convenient to.
A
But do you feel like it's being ignored or it's just a really hard problem and I feel like lots of people are working on it just feels like it's proven more challenging.
B
Yeah. So Harrison has LangMem, which I think now he's relaunched again. And then we had Letta come speak at our conference. I don't know, Zep, I think there's a bunch of other memory guys, but something like this I think should be normal in the stack. And basically I think anything stateful should be interesting to VCs, because it's databases and we know how those things make money.
A
I think on the overhyped side, the only thing I'd add is I'm still surprised how many net new companies there are training models. I thought we were past that.
B
I would say they died end of last year. Now they've resurfaced. I mean, that's one of the questions that you had down there of like, is there an opportunity for net new model players? I wouldn't say no. I don't know what you guys think.
C
I don't have a reason to say no. But I also don't have a reason to say this is what is missing and you should have a new model company do it.
B
But again, all these guys want to pursue AGI. They all want to be like, oh, we'll hit SOTA on all the benchmarks, and they can't all do it.
A
Yeah. I mean, look, I don't know if Ilya has the secret approach up his sleeve of something beyond test-time compute, but it was funny. We had Noam Shazeer on the podcast last week and I was asking him, is there some sort of other algorithmic breakthrough? What do you make of Ilya? And he's like, look, I think what he implicitly said was test-time compute will get us to the point where these models are doing AI engineering for us. And so at that point, they'll figure out the next algorithmic breakthrough. Which I thought was pretty interesting.
D
I agree with you folks. I think that we're most interested, at least from our side, in foundation models for specific use cases and more specialized use cases. I guess the broader point is, if there is something like that that these companies can latch onto and be sort of known for being the best at, maybe there's a case for that. Largely though I do agree with you that I don't think there should be at this point more model companies.
A
I think that's like these unique data sets.
D
Right.
A
I mean, obviously robotics has been an area we've been really interested in. This is an entirely different set of data that's required on top of a good VLM, and then, you know, biology, material science, more of the specific use cases. Yeah. But also specific markets. A lot of these models are super generalizable, but, you know, for a lot of these bio companies, they have wet labs, they're running a ton of experiments, and the same on the material sciences side. And so I still feel like there's some opportunities there, but in the core LLM agent space it's tough to compete with the big ones.
B
Yeah, agree.
C
Yeah. But they're moving more into product. So I think that's the question is like if they could do better vertical models, why not do that instead of trying to do deep research and operator and these different things. I think that's what I'm in my.
B
Mind'S coming out too.
C
Well. Yeah, in my mind it's like financial pressure, like they need to monetize in a much shorter time frame because the costs are so high. But maybe it's like, it's not that easy to do.
A
Do you think it would be a better business model to do it?
C
It's more like why wouldn't they? You know like you make less enemies if you're like a model builder. Right?
A
Yeah.
C
Like now with deep research and search, now Perplexity is like an enemy, and, you know, Gemini Deep Research is like more of an enemy, versus if they were doing a finance model, you know, or whatever, they would just enable so many more companies. And they always have, like, they had ABIA as one of the customer case studies for GPT search, but they're not building a finance-based model for them. So is it because it's super hard and somebody should do it, or is it because the new models are going to be so much better that the vertical models are useless anyway? It's like, this is the bitter lesson. Exactly.
A
It still seems to be a somewhat outstanding question. I'd say all the signs of the last few years seem to be that a general purpose model is the way to go. And training a hyper-specific model in a domain, maybe it's cheaper and faster, but it's not going to be higher quality. But also I think it's still... I mean, we were talking to Noam and Jack Rae from Google last week and they were like, yeah, this is still an outstanding question. We check this every time we have a new model, like whether that still seems to be holding. I remember a few years ago it felt like all the rage was, like, the BloombergGPT model came out and everyone was like, oh, you got to like, I had CPF.
B
AI Bloomberg present on that.
A
That must be a really interesting episode to go back on because I feel like very shortly thereafter the next OpenAI model came out and just beat it on all sorts of.
B
No, it was a talk, we haven't released it yet. But yeah, basically they concluded that the closed models were better so they just stopped.
A
Interesting. I feel like that's been the.
B
But he's very insistent that the work that he did, the team he assembled, the data that he collected is actually useful for more than just the model. So basically everything but the model survived.
A
What are the other things?
B
The data pipeline, the team that they assembled for fine-tuning and implementing whatever models they ended up picking up. Yeah, it seems like they are happy with that and they're running with that. He runs like 12, 13 teams at Bloomberg just working on gen AI across the company.
A
I mean, I guess we've all kind of been alluding to it right now, but I guess because it's a natural transition: the other broad opening question I have is just what we're paying most attention to right now. And I think back on this, you know, the model companies coming into the product area. I mean, I'm fascinated to see how that plays out over the next year, and kind of these frenemy dynamics, and it feels like it's going to first boil up on Cursor/Anthropic, and the way that plays out over the next six months I think will be...
B
What is Cursor/Anthropic? You mean Cursor versus Anthropic?
A
Yeah, I assume over time Anthropic wants to get more into the application side of coding, and I assume over time Cursor will want to diversify off of just using the Anthropic model.
B
It's interesting that now Cursor is worth like 10 billion. 9, 10 billion.
A
Yeah.
B
And they've made themselves hard to acquire. I would have said you should just get yourself to 5, 6 billion and join OpenAI, and all the training data goes to OpenAI and that's how they train their coding model. Now it's complicated. Now they need to be an independent company.
A
Increasingly it seems that model companies want to get into the product layer. And so seeing over the next 6 to 12 months: does having the best model let you kind of start from a cold start on the product side and get something in market, or do the companies with the best products win, even if they eventually have to switch to a somewhat worse, tiny bit worse model? You know, where do the developers ultimately choose to go? I think that'll be super interesting.
C
Yeah, but don't you think that Devin is more in trouble than Cursor? I feel like Anthropic, if anything, wants to move more towards... I don't think they want to build the IDE. Like, if I think about coding, it's kind of like, you know, you look at it like a cube. The IDE is like one way to get the code and then the agent is like the other side. Yeah, I feel like Anthropic wants more to be on the agent side and then hand you off to Cursor when you want to go in depth, versus trying to build the Claude IDE. I think that's not, I would say, I don't know how you think about it.
B
The existence of Claude Code doesn't support what you say. Like maybe they would, but I assume.
A
All of that yet, like, I assume both just converge eventually where you want to have, where you'll be able to do both.
B
So in order to be. So we're talking about coding agents, whether it's sort of, what is it? Inner loop versus outer loop.
C
Right.
B
Like inner loop is inside Cursor, inside your IDE, inside of a git commit, and outer loop is between git commits, on the cloud. And I think to be an outer loop coding agent, you have to be more of a, like, we will integrate with your code base, we'll sign whatever security thing you need to sign, that kind of schlep. I don't think the model labs want to do that schlep. They just want to provide models. So that would be my argument for why Cognition should still have some moat against Anthropic, simply because Cognition would do the schlep and the bizdev and the infra that Anthropic doesn't really care about.
A
I don't know. The schlep is pretty sticky though. Once you do it.
B
It's very sticky. Yeah, it's interesting. I think the natural winner of that should be Sourcegraph, but there's another unprompted portfolio. I mean they're big support. I'm very friendly with both Quinn and Beyang and they've done a lot of work with Cody, but not much work on the outer loop stuff yet. But any company where they have already had... we've been around for 10 years. We have all the enterprise contracts. You already trust us with your code base. Why would you go trust Factory or Cognition as two-year-old startups who just came out of MIT?
A
I don't know, I guess switching gears to the application side, I'm curious for both of you, like how do you kind of characterize what has genuine product market fit in AI today? And I guess Alessio more on your sort of the investing side, like more interesting to invest in that category of the stuff that works today or kind of where the capabilities are going long term.
C
That's hard.
A
Ask me to do my job.
C
You were like, man, that's easy. That's a layup.
B
That's all your investment pieces.
C
I would say we are. Well, we only really do mostly seed investing, so it's hard to invest in things that already work, because it means they're already late. But we try to be at the cusp of, like, you know, usually the investments we like to make, there's really not that much market risk. It's like, if this works, obviously people are going to use it, but it's unclear whether or not it's going to work. So that's kind of more what we skew towards. We try not to chase as many trends and, I don't know, I was a founder myself and sometimes I feel like it's easy to just jump in and do the thing that is hot. But becoming a founder to do something that's underappreciated, that doesn't yet work, shows some level of grit and self... like you actually really believe in the thing. So that alone for me kind of makes me skew more towards that. And you do a lot of angel investing too. So I'm curious how.
B
Yeah, but I don't put that in my mental framework of things. I come at this much more as a content creator or market analyst. Like, yeah, it really does matter to me what has product-market fit, because I have to answer the question of what is working now when people ask me.
A
Do you feel like, relative to the hype and discourse out there, there's a lot of things that have product-market fit, or like a few things? Like a few things, yeah.
B
So I have a list of, like... 2 years ago I wrote the Anatomy of Autonomy post, which was like the first look at what's going on in agents and what is actually making money. Because I think there's a lot of GenAI skeptics out there that are like, these things are toys, they're unreliable. Why are you dedicating your life to these things? And I think for me, the product-market fit bar at the time was $100 million. What use cases can reasonably hit $100 million? And at the time it was Copilot, it was Jasper, no longer, but in that category of help-you-write, which I think was helpful, and Cursor I think was on there as a coding agent. I think that list will just grow over time of the form factors that we know to work, and then we can just adapt the form factors to a bunch of other things. So the one that's the most recently added to this is deep research, where anything that looks like a deep research, whether it's a Grok version, Gemini version, Perplexity version, whatever. He has an investment that he likes called Brightwave that is basically deep research for finance. And anything where, all right, it's like long-term agentic reporting that is starting to take more and more of the job away from you and just give you a much more reasoned report, I think it's going to work. And that has some PMF, I think obviously has PMF. I would say I went through this exercise of trying to handicap how much money OpenAI made from launching OpenAI Deep Research. I think it's billions. Like the sheer upgrade from like $20 to $200. It has to be billions in ARR. Maybe not all of them will stick around, but that is some amount of PMF that is.
A
Didn't they have to immediately drop it down to the $20 tier?
B
They expanded access, I would say.
A
Which I thought was really telling of the market. Right. I think it's going to be so interesting to see what they're actually able to get in that $200 or $2,000 tier, which we all think has a ton of potential. But I thought it was fascinating. I don't know whether it was just to get more people exposure to it, or the fact that Google had a similar product, obviously, and other folks did too, but it was really interesting how quickly they dropped it down.
B
I think that's just a more general policy of no matter what they have at the top tier, they always want to have smaller versions of that in the lower tiers.
A
Yeah, just get people exposure to it.
B
Just get exposure. The brand of being first to market and like the default choice is paramount to OpenAI though.
A
I thought that whole thing was fascinating because Google had the first product, right?
B
Yeah.
A
And no, like, you know, we interviewed them.
B
I said it straight up to their faces. I was like, OpenAI mogged you. And they were like, yeah.
A
Well, actually, I'm curious. This is totally off topic, but whatever. Like, what is it going to take for... Google just released some great models like a few weeks ago. Like, I feel like it's happening. The stuff they're shipping is really cool. Cool, it's happening. But I feel like at least in the broader discourse, it's still a drop in the bucket relative to.
B
Yeah, I can riff on this, but I think it's happening. I think it takes some time. But my Gemini usage is up. I use it a lot more, for anything from summarizing YouTube videos to the native image generation that they just launched to Flash Thinking.
A
Multimodal stuff's great. Yeah.
B
And I run a daily news recap called AI News that is 99% generated by models, and I do a bake-off between all the frontier models every day. Does it switch? Yes, it does switch, and I manually do it, and Flash wins most days. So I think it's happening. I was thinking about tracking the number of times I open ChatGPT versus Gemini, and at some point it will cross. I think that Gemini will be my main, and that will slowly happen for a bunch of people, and then that'll shift. I think that's really interesting. For developers, that's a different question. It's Google getting over itself of having Google Cloud versus Vertex versus AI Studio, all these five different brands, and slowly consolidating it. It'll happen. Just slowly, I guess.
C
Yeah, yeah. I mean, another good example is that you cannot use the thinking models in Cursor. And I know Logan Kilpatrick said they're working on it, but I think there are all these small things where, if I cannot easily use it, I'm really not going to go out of my way to do it. But I do agree that when you do use them, their models are great. So yeah, they just need better bridges.
B
You had one of the questions in the prep: what public company are you long and short? And mine is Google versus Apple. So long Google, short Apple.
A
I feel like. Yeah, I mean it does feel like Google's really cooking right now. Yeah.
B
So, okay, coming back to what has...
A
Product-market fit. Now that we've come back from my complete, total sidetrack, there's also customer support.
B
We were talking in the car about Decagon and Sierra. Obviously Bret Taylor is founder of Sierra, and yeah, it seems like there's just these layers of agents. I think you just look at the income statement or the org chart of any large, scaled company and you start picking them off one by one: what is interesting knowledge work? And they would just kind of eat things slowly from the outside in. That makes sense.
C
I mean, the episode with Bret, he's so passionate about developer tools, and yet he did not do a developer tools company. We spent like two hours talking about developer tools and all of that stuff, and he's like, I did a customer support company. I'm like, man, that says something. You know what I mean? It's like when you have somebody like him who can raise any amount of money from anybody to do anything.
A
Yeah.
C
To pick customer support as the market to go after while also being the chairman of OpenAI, like that shows you that like these things have moats and have long standing. Like they're going to stick around, you know, otherwise he's smarter than that. So yeah, that's a, that's a space where maybe initially, you know, I would have said I don't know if it's like the most exciting thing to jump into. But then if you really look at the shape of how the workforce are structured and how the cost centers of the business really end up, especially for more consumer facing businesses, a lot of it goes into customer support. All the AI story of the last two years has been cost cutting. I think now we're going to switch more towards growth revenue. Like you've seen, Jensen last year at GTC was saying the more you buy, the more you save. This year is that the more you buy, the more you make. So we're hot off the press. We were there.
A
I do think that's one of the most interesting things about this first wave of apps, where almost the easiest thing that you could get real traction with was, for lack of a better way to frame it, stuff that people had already been comfortable outsourcing to BPOs or something, and had kind of implicitly said, hey, this is a cost center, we are willing to take some performance cut for cost. The irony of that, or what I'm really curious to see play out, is that you could imagine that is the area where price competition is going to be most fierce, because it's already stuff where people have said, hey, we don't need the 100% best version of that. I wonder whether this next wave of apps may prove even more defensible, as you get these capabilities that actually increase top line or whatnot. Take AI go-to-market, for example: you'd pay twice as much for something that brought... because there's just a very clean ROI story to it. And so I wonder ultimately whether this next set of apps actually ends up being more interesting than the first wave.
D
Yeah, I think a lot of the voice AI ones are interesting too, because you don't need 100% precision recall to actually have a great product. And so, for example, we looked into a bunch of scheduling intake companies, for example, like home services for electricians and stuff like that. Today they miss 50% of their calls. So even if the AI is only effective, say 75% of the time. Yeah, it's crazy, right? So if it's effective 75% of the time, that's totally fine because that's still a ton of increased revenue for the customer. Right. And so you don't need that 100% accuracy. And so as the models and the reliability of these agents are getting better, it's totally fine because you're still getting a ton of value in the meantime.
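As a rough illustration of the arithmetic in that point, here is a minimal Python sketch. The 50% missed-call rate and the 75% effectiveness figure come from the conversation; the call volume of 100 and the function name `calls_handled` are made up for illustration, and the sketch assumes the AI only picks up calls that would otherwise have been missed.

```python
def calls_handled(total_calls: int, missed_rate: float, ai_effectiveness: float) -> dict:
    """Estimate handled calls when an AI agent covers the calls staff would miss.

    Assumption (not from the conversation): staff keep answering the same
    calls as before, and the AI is only offered the missed ones.
    """
    answered_by_staff = total_calls * (1 - missed_rate)
    missed = total_calls * missed_rate
    recovered_by_ai = missed * ai_effectiveness
    total_handled = answered_by_staff + recovered_by_ai
    return {
        "answered_by_staff": answered_by_staff,
        "recovered_by_ai": recovered_by_ai,
        "total_handled": total_handled,
        # Relative increase in handled calls versus the no-AI baseline.
        "uplift": total_handled / answered_by_staff - 1,
    }


# Hypothetical volume of 100 inbound calls, with the rates quoted above.
result = calls_handled(total_calls=100, missed_rate=0.5, ai_effectiveness=0.75)
print(result)
```

Under those assumptions, an agent that is far from perfect still lifts handled calls from 50 to 87.5 out of 100, which is the sense in which 100% accuracy is not required to deliver value.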
B
Yeah. I don't know how related this is, but... it is related. One of my favorite meetings at AI Engineer Summit, because I do these... This is our first one in New York, and I just met a different crew than you meet here. Like everyone here loves developer tools, loves infra. Over there, they're actually more interested in applications. It's kind of cool. I met this bootstrapped team, and they're only doing appointment scheduling for vets. And they're like, this is an anomaly, we don't usually come to engineering summits because we usually go to vet summits and talk to the... They're like, you know, they're literally...
D
I'm sure it's a massive pain point. They're willing to pay a lot of money.
C
But this is like my point about saving versus making more. It's like if an electrician takes 2x more calls, do they have the bandwidth to actually do 2x more in house? Well, yeah, exactly. That's the Thing is like, I don't think today most businesses are like structured to just like overnight 2, 3x demand, you know, I think that's like a startup thing, like most businesses.
B
Do you make an electrician agent?
C
Well, no, totally. So how do you do a recruiting agent for electricians? Like electrician training. How do you do Lambda School for electricians?
A
I don't know. It's like whack-a-mole for the bottlenecks in these businesses.
C
Yeah, exactly.
A
Now we have a ton of demand. Like, cool. Where do we go?
C
Yeah.
B
So just to round out this PMF thing, I think this is relevant in a sense of like, it's pretty obvious that the killer agents are coding agents, support agents, deep research.
C
Right.
B
Roughly. Right. We've covered all those three already. Then you have to sort of turn to offense and go like, okay, what's next? And like, what about.
A
I mean, also just like summarization of voice and conversation.
B
Yeah, we actually had that on there. I didn't put it as agent because seems less agentic, you know. But yes.
A
So still a good AI use case.
B
That one I've seen. I would mention Granola. And what's the other one?
A
Monterey, I think. Abridge, I was going to say.
D
Abridge.
C
Yeah, Abridge. Okay.
B
So I'll just. I'll call out what I had on my slides for the agent engineering thing. So it was screen sharing, which I think is actually kind of underrated. Like people like an AI watching you as you do your work and just like offering assistance. Outbound sales. So instead of support just being more outbound hiring.
A
You say outbound sales has product market fit?
B
No, it. It will.
A
It's coming along the company. I totally agree with that.
B
Yeah. Hiring like the recruiting side education, like the. The sort of like personalized teaching.
A
I think I'm kind of shocked we haven't seen more there.
B
Yeah. I don't know if that's like, Duolingo is the thing. Khanmigo.
A
Yeah. I mean, Speak and some of these practice...
B
Yeah, interesting. And then finance, there's a ton of finance cases that we can talk about that. And then personal AI, which we also had a little bit of that. But I think personal AI is harder to monetize. But I think those would be what I would say is up and coming in terms of. That's what I'm currently focusing on.
A
I feel like this question's been asked a few different ways, but I'm curious what you guys think. If we just froze model capabilities today, are there trillions of dollars of application value to be unlocked? Take AI education: if we just stopped all model development today, with this current generation of models we could probably build some pretty amazing education apps. Or how much of all this is contingent upon... Okay, people have had two years with GPT-4 and, I don't know, six months with the reasoning models. How much is contingent upon it just being more time with these things, versus the models actually having to get better? I don't know, it's a hard question, so I'm going to just throw it to you.
C
Yeah, well I think the societal thing is maybe harder, especially in education. You know like can you basically like doge the education system? Probably you should but like can you. I think it's more of a human.
A
But people pay for all sorts of like get ahead things outside of class and you know, certainly in other countries there's a ton of consumer spend and it feels like the market opportunity is there.
B
Yeah. And in private education I think, yeah, public, public is very different. One of my most interesting quests from last year was kind of reforming Singapore's education system to be more sort of AI native.
A
Just what you were doing on the side while you're.
B
Yes.
A
That's a great side quest.
B
My stated goal is for Singapore to be the first country that has Python as a first language, as a national language. Anyway, the pushback I got from the Ministry of Education was that the teachers would be unprepared to do it. It was really interesting that the immediate pushback was the de facto teachers' union being resistant to change, and like, okay, that's par for the course. Anyway, not to dwell too much on that, but yeah, I think education is one of these things that everyone has strong opinions on, because they all have kids or have all been through the education system. But I think it's going to be domain specific. Speak is such an amazing example of top down: we will go through the idea maze and we'll go to Korea and teach them English. It's like, what the hell? And I would love to see more examples of that. Just really focus. Don't try to solve everything, just do your thing really, really well.
A
On this trend of difficult questions that come up, I'm going to just ask you the one that my partners like to ask me every single Monday which is how do you think about defensibility at the app layer?
B
Oh yeah, that's great.
A
Just give me an answer. I can Copy paste and just like, you know, network effects, auto response.
B
Honestly, network effects. I think people don't prioritize those enough because they're trying to make the single-player experience good, but then they neglect the multiplayer experience. I always think about load-bearing episodes. You know, as pods, you do one a week, and some of those you don't really talk about ever again, and others you keep mentioning on every single podcast. And this is...
A
Obviously going to be the last one.
B
I think the recap episodes for us are pretty load bearing. We refer to them every three months or so. And one of them, I think, for us is Chai. For me it's Chai Research, even though that wasn't a super popular one among the broader community outside of the Chai community. For those who don't know, Chai Research is basically a Character AI competitor. They were bootstrapped, they were founded at the same time, and they have outlasted Character, de facto.
C
Right.
B
It's funny, I would love to ask Noam Shazeer a bit more about the...
A
Whole Character thing. But good luck getting past the Google cops.
B
But. So he doesn't have his own models, basically he has his own network of people submitting models to be run. And I think that is short term going to be hurting him because he doesn't have proprietary ip, but long term he has the network effect to make him robust to any changes in the future. And I think I want to see more of that where he's basically looking at himself as kind of a marketplace and he's identified the choke point which is build the app or the sort of protocol layer that interfaces between the users and the model providers and then make sure that the money kind of flows through and that works. I wish that more AI builders or AI founders emphasized network effects because that's the only thing that you're going to have at the end of the day. And like brand leads into network effects.
A
Yeah, I guess it's harder in the enterprise context, right? But it's funny, we do this exercise and I feel like we talk a lot about, you know, obviously there's the velocity and the breadth of product surface area you're able to build. And there's just the ability to become a brand in a space. I'm shocked how, even in six to nine months, an individual company can become synonymous with an entire category. And then they're in every room for customers, and all the other startups are clawing their way to try and get into one twentieth of those rooms.
D
There's a bunch of categories where we talk about it in IC and it's like, oh, pricing compression is going to happen, it's not as defensible, and so ACVs are going to go down over time. In actuality, in some of these, the ACVs have doubled, we've seen. And the reason for that is just, you know, people go to them and pay that premium for being that brand.
A
Yeah. I mean, what I'm struck by is there was such a head fake in the early days of AI apps, where people were like, we want this amazing defensibility story. And then what's the easiest defensibility story? It's like, oh, a totally unique data set, or train your own model or something. And I feel like that was just a total head fake; I don't think that's actually useful at all. You sound much less articulate when you're like, well, the defensibility here is the thousand small things that this company does to make the user experience, the design, everything just delightful, and the speed at which they move to both create a really broad product and then also, every three to six months when a new model comes out... It's kind of an existential event for any company, because if you're not the first to figure out how to use it, someone else will. And so velocity really matters there. And it's funny, in our internal discussions we've been like, man, that sounds pretty similar to how we thought about application SaaS companies. There isn't some revolutionary reason. You don't sound like a genius when you're like, here's why application SaaS company A is so much better than B, but it's a lot of little things that compound over time. What about the infrastructure space, guys? I'm curious, how do you guys think about where the interesting categories are here today? And where do you want to see more startups? Or where do you think there are too many?
C
Yeah, we call it kind of the LLM OS, but I would say not we.
B
I mean, Andrej calls it LLM OS.
C
Well, but yeah, well, we have Andrej.
A
The three of you call it the LLM OS.
C
Well, we have this Four Wars of AI framework that we use, and LLM OS is one of them. But yeah, I mean, code execution is one. We've been banging the drum; everybody now knows we're investors in E2B. Memory, you know, is one that we kind of touched on before. Super interesting. Search, we talked about. I think those are more... not traditional infra, not like the bare-metal infra. It's more like the infra around the model, which I think is where a lot of the value is going to be. The app security ones, yeah. And cybersecurity. I mean, there's so much to be done there. Basically, in any area where AI is being used by the offense, AI needs to be applied on the defense side: email security, identity, all these different things. So we've been doing a lot there, as well as how do you rethink things that used to be costly, like red teaming, and maybe used to be a checkbox in the past. Today they can actually be helpful in securing your app. And there's this whole idea of semantics that now the models can be good at. In the past, everything was about syntax, kind of very basic constraint rules. I think now you can start to infer semantics from things, going beyond simple recognition to understanding why certain things are happening a certain way. So in the security space we're seeing that with binary inspection, for example: there's the syntax, but then there are the semantics of understanding what this code overall is really trying to do, even though this individual syntax is saying something specific. Not to get too technical. But yeah, I think infra overall is a super interesting place if you're making use of the model. If you're just... I'm less bullish, not that it's not a great business, but I think it's a very capital-intensive business, which is serving the models. I think that infra is great, people will make money. But yeah, I don't think there's as much interest from us.
D
How do you guys think about what OpenAI and the big Research Labs will encompass as part of the developer and infra category?
C
Yeah, that's why I would say search is the first example, one of the things we used to mention. We had Exa on the podcast, and Perplexity obviously has an API.
B
The basic idea is, if you go into the ChatGPT custom GPT builder, what are the checkboxes? Each of them is a startup.
C
Yeah. And now they're also APIs. So now search is also an API. We'll see what the adoption is. You know, in traditional infra everybody wants to be multi-cloud, so maybe we'll see the same, where ChatGPT Search or the OpenAI Search API is great with the OpenAI models because you get it all bundled in, but their price is very high. If you compare it to, you know, Exa, I think it's like five times the price for the same amount of searches. Which makes sense if you have a big OpenAI contract. But maybe if you're just picking best of breed, you want to compare different ones. They don't have a code execution one. I'm sure they'll release one soon, so they want to own that too. But yeah, same question we were talking about before, right? Do they want to be an API company or a product company? Do you make more money building ChatGPT Search or selling a search API?
B
Yeah. The broader lesson, instead of going, we did applications just now, and then, what do you think is interesting infrastructure? It's not 50/50, it's not equal weighted. It's just very clear the application layer has been way more interesting. Yes, there are interesting infrastructure plays, and I even want to push back on the whole GPU serving thing, because Together AI is doing well, Fireworks...
A
It's like data centers and inference providers.
C
I think it's all about the capital. Again, capital efficiency, much larger funds. So I'm sure you have GPU clouds.
B
Yeah. So that is one thing I have been learning in that I think I historically had devtools and infra bias and so has he and we've had to learn that applications actually are very interesting and also maybe kind of the killer application of models in the sense that you can charge for utility and not for cost. Right. Which most infrastructure reduces to cost plus and that's not where you want to be for AI. So that's interesting for me. I thought it would be interesting for me to be the only non VC in a room to be saying what is not investable because then I won't be canceled for saying your whole categories.
A
This thing's not investable. And then like three months later we're desperately chasing.
B
Exactly. So you don't want to be on.
A
The record changes so fast. Every opinion you hold, you have to hold it quite loosely.
B
I'm happy to be wrong in public. I think that's how you learn the most.
A
Right.
B
So fine-tuning companies is something I struggled with, and still I don't see how this becomes a big thing. You kind of have to wrap it up in a broader enterprise AI company, like a services company, like a Writer AI, where they will fine-tune and it's part of the overall offering. But that's not where you spike. Yeah, it's kind of interesting. And then I'll just say AI DevOps; there's a lot of AI SRE out there. It seems like there's a lot of data out there that should be able to be plugged into your code base or your app to self-heal or whatever. I just don't know if that's been a thing yet. And you guys can correct me if I'm wrong. And then the last thing I'll mention is voice real-time infrastructure. Again, very interesting, very hot. But again, how big is it? Those are the main three that I'm thinking about for things I'm struggling with.
D
Yeah, I guess a couple comments on the AI SRE side. I actually disagree with that one. I think that the reason they haven't taken off yet is because the tech is just not quite there yet. And so it goes back to the earlier question: do we think about investing towards where the companies will be when the models improve, versus now? I think in the short term we'll get there, but it's just not there yet. But I think it's an interesting opportunity overall.
B
Yeah, my pushback to you is. Well, it's monitoring a lot of logs. Right. And it's basically anomaly detection rather than like there's a whole bunch of like stuff that can happen after you detect the anomaly, but it's really just anomaly detection and we've always had that. You know, like it's. This is like not a Transformers LLM use case. This is just regular anomaly detection.
D
It's more that it's not going to be an autonomous SRE for a while. And so the question is, how much can the latest AI advancements increase the efficacy of bringing your MTTR down?
B
Yeah.
D
And Even if it's 10% improvement on beforehand, that's still potentially a lot of revenue.
B
Okay.
D
That's the way, at least, I would think about it now. And then, you know, a few years from now, if it's actually an autonomous SRE just replacing it altogether, then that's a totally different thing.
B
Cool, I'll go for it.
A
Yeah. I guess switching back to overly broad questions, what do you feel like is the biggest unanswered question in AI today that has large implications for the ecosystem?
C
Yeah, I've been banging the drum on RL and I think it's clear that you can do RL Successfully on verifiable domains. I would say whether or not we can figure out how to do that in non verifiable ones. So law is a great example. Like can you do RL on contracts and documents? Marketing, sales, going back to outbound sales. Like can you do RL to simulate what an outbound and kind of like the conversation leads to? Yeah, it's unclear. If not then I think we'll be stuck with like you're going to have agents in the more verifiable domains and then you'll just kind of have copilots and the nonverifiable ones because you'll still need a person to be the tastemaker.
A
I have the exact same thing. I'm trying to think of the implications, where if it doesn't work, the world could be weird: you have fully autonomous AI coders, and, you know, no one does any software or math or even some areas of science, but then to write the most basic sales email is still... It's always so hard to predict out the world. That is such a weird one. Of all the sci-fi that was written, you know, 50 years ago, I don't think anybody foresaw that future. That is a really weird future. Did either of you have a different one for that?
B
We'll go back and forth.
D
Biggest unanswered question, I guess. I don't know if this is a good answer, but we had Bob McGrew on the podcast and he was talking about the rule of nines they have at OpenAI, where to go from 90% reliability to 99% it's an order of magnitude increase in compute, and then 99% to 99.9% is another order of magnitude increase. And that happens every two to three years. And so I think, how are we going to scale accordingly for this next part? I think there are a lot of unanswered questions just from a hardware perspective, and then, as part of that, from the availability perspective. Like, is Nvidia just going to continue to be dominant? Obviously AWS is going hard into, what's their... Trainium chips. I was blanking on that, thank you. And so I think there's a big ecosystem around CUDA that's obviously allowed Nvidia to remain dominant. But what's going to happen, and is there anyone that's going to come combat that to increase the availability of GPUs, or are we just going to be constrained going forward, when we actually need way more compute?
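The rule of nines as described here can be written down in a few lines. This is a minimal Python sketch of the stated heuristic only (each additional nine of reliability costs roughly 10x the compute); the function names are hypothetical, and real scaling behavior is almost certainly less clean than this.

```python
import math


def nines(reliability: float) -> float:
    """Number of nines in a reliability figure, e.g. 0.99 -> about 2."""
    return -math.log10(1.0 - reliability)


def compute_multiplier(start: float, target: float) -> float:
    """Compute multiplier implied by the heuristic: 10x per extra nine."""
    return 10 ** (nines(target) - nines(start))


# Going from 90% to 99.9% reliability adds two nines.
print(compute_multiplier(0.90, 0.999))
```

Under this heuristic, moving from 90% to 99.9% reliability implies roughly 100x the compute, which is why the hardware and availability questions that follow matter so much.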
B
Yeah, my quick thoughts. I'm the only individual named as an investor in MatX, which is kind of really funny because everyone else was funds and then there's just me. And there are all these Nvidia startups, sorry, dedicated silicon startups, that are coming up and trying to challenge that. And the simple answer is these GPUs are the most general things possible, by design. That's why they do gaming and crypto and AI. And I think as long as the architecture seems stable, it seems like there's a case to be made for that. The only question is who will win that. And obviously there's a whole bunch of competitors, including, I think, AMD, which is trying to make a play for it. But so will AWS, and so will every other one. Like, Microsoft has a chip, Facebook has a chip. So who knows who will win that. It's very interesting that this seems to be such a valuable prize. It's freaking Nvidia that you're competing with, and no one has really made a real dent there yet. I kind of agree with you, but I think that basically it's all about stability of workload, and it's a bet on Transformers, basically. And if you're fine with that, and I think even the state space model people would agree that it wouldn't really change that much. And I think the overall consensus is that you don't even use state space models individually; you would use them in a mixture with Transformers anyway. So then, yeah, just go bet on Transformers, bake it into the chip, and you'll basically have way more ASICs for Transformers, and that's fine. So prima facie, there should be a company that wins that. I don't know who will win that.
D
I wish we knew.
B
I think you have to have started basically after 2019 or 2020, because anyone who started before that will still be too general, because Transformers hadn't won yet at the time. I have one more. I think that the most emergent one that came out of the New York conference that I did was agent authentication. I think literally The Information just published this, it's something that they're worried about, which is: when Operator or whoever accesses a website on your behalf, how does it indicate that it's not you, but an agent of you? And my general philosophy on agent experience, or any of the reinvention of every part of the stack for agents, is that it's all not necessary, except for this agent auth thing. We really need a new SSO, effectively, for agents.
A
Is it going to be crypto? But crypto people are really amped about the.
B
It's really frustrating when Sam Altman is right, but, like, maybe you have to scan your eyeballs. Like, maybe you just have to. Maybe he saw this like five years ago and he was like, you gotta scan your eyeballs. And the rest of us are just behind him as usual.
A
I love it. Well, okay, now we'll move to the quickfire round, where we'll go around the horn and get quick takes on things. So the first is going to be dream podcast guest. John Carmack.
B
Yeah, John is like six steps away from solving AGI, apparently. So we just ask him how long he is. For us, it's Andrej. For me, it's Andrej. He's a listener and supporter of the pod. And basically when I launched the whole AI engineer push that we have, he was basically the first one to legitimize it. He was like, there will be more AI engineers than ML engineers. And I think that made everyone else pay attention. So Latent Space only exists because, you know, he and other people helped to promote it.
A
Yeah.
D
I also had Andrej, though. I guess we were thinking the same thing there.
A
Mine's a little bit cheating, but I think at some point there will clearly... Like, they're writing a book about OpenAI now, and at some point somebody, probably Acquired, will get to do the Acquired OpenAI episode. But if Unsupervised Learning could... There are so many amazing stories from the last five, six years.
B
So. Do you know about Doomers? It's a play. I'm actually going to it this Saturday. Yeah, someone made a play about the board drama of last year.
A
Really?
B
Yeah.
A
Wow, that's cool. Let us know how it is.
D
Yeah, let us know.
B
There will be.
D
There will be.
B
No, I. I think it's a lot of fan fiction, basically. But, like, someone will write the accounts and. And it will be interesting and fascinating and a lot of. A lot of it will be fake because it's a complex beast.
A
Right.
B
You're just getting an oral history of what happened.
A
Yeah.
D
Yeah.
A
All right. For the next one, I figured you could shout out either a news source you use to stay up to date, or a startup that you're not invested in that you're excited about. Or you can do both.
C
My news source is Sean.
D
That's what I was going to say.
A
I literally wrote Swix's Twitter in our discord.
C
We have a Latent Space Discord. Any link that ever matters on the Internet, Sean is going to post it in the Discord. So all I do is open Discord, and we have, you know, 40 or 50 different channels by topic. I open Discord and I'm like, okay, AI. Then I go to developer tools. Then I go to creator economy. Then I go to stocks and macro. And they're all there. So thank you.
B
We actually met because of the Discord. It was like a Covid thing, because everyone was at home and we just started a Discord. And yeah, that was the origin of Latent Space. Just chatting on the Discord.
C
It used to be called Dev Invest. Yeah. So it was all about developer tools, investing. And then we were OpenAI in October of 2022. We're like, maybe we should do a podcast. And then OpenAI was the first.
B
Yeah. I was not prepared about the news sources thing. I think maybe it's hard. It's really shitty to say, but, like, just in person conversations.
A
Yeah.
B
And I think the reason I have to be here in SF is because I make friends with people who know things and are smarter than me, and we go for chats and they're nice enough to share some stuff. And so sometimes I wish. I worry that I am being used in order to put things out there that are maybe not. Not true, but, you know, so I have to exercise my own judgment as to what I think.
A
One of the cool things about the podcast in general is just, like, the opportunity to take these conversations that happen in, like, closed rooms and try and bring them on to the airwaves. I'm curious, how much do you feel like the private discourse is similar to the public discourse?
B
In many ways, it is surprisingly similar. As in, people at OpenAI learn things about OpenAI from us, which is interesting. And then there are some ways in which it is drastically dissimilar. And those are the things I just cannot repeat until it's public.
A
This has been super fun. I feel like I lived up to it. We were looking forward to this for a while. We want to make sure everyone around the horn gets an opportunity to plug whatever they want to plug. So we'll leave the last word to all of us, I guess. Where can folks go to learn more about Latent Space and all the exciting things you do? We want to make sure our listeners have a good sense of everything.
C
Yes. So we have a Substack; latent.space is the website. And then please subscribe on YouTube. We're doing a lot on YouTube. We're trying to do better video and all that stuff.
B
He said our OKRs. It's basically all YouTube.
C
Come watch us on YouTube. It's very important for me personally, even if you don't care.
A
Just OKRs.
B
Well, we have to increase our production value. Look at this.
C
I know. We only have three cameras. Yeah. And then Sean does a lot of the writing outside of the podcast on the newsletter. So.
B
Yeah, so it's like trying to be newsletter and community and podcast and whatever else that we do. Yeah. So I guess for me, I guess there's Latent Space, but then there's also the other big piece, which is the conference that I run. And the idea is that I think sometimes you just get the good stuff from people if you just put them in front of a lot of people. And that's really like I'm mining people for content, and sometimes you put a mic in front of them and they yap for an hour. Other times you have to put them in front of, like, a prestigious conference and then they drop some alpha. And so the next one for us is going to be June. It's the AI Engineer World's Fair and it should be the largest technical conference for AI.
A
And ours is simple: we just run a humble podcast. So subscribe to Unsupervised Learning on YouTube. Thanks so much. This was awesome.
B
Thanks for having us.
D
Good to see you guys. Thanks for coming on.
A
Hey guys, this is Jacob. Just one more thing before you take off. If you enjoyed that conversation, please consider leaving a five-star rating on the show. Doing so helps the podcast reach more listeners and helps us bring on the best guests. This has been an episode of Unsupervised Learning, an AI podcast by Redpoint Ventures, where we probe the sharpest minds in AI about what's real today, what's going to be real in the future, and what it means for businesses and the world. With the fast-moving pace of AI, we aim to help you deconstruct and understand the most important breakthroughs and see a clearer picture of reality. Thank you for listening and see you next episode.
Date: March 29, 2025
Podcast: Latent Space: The AI Engineer Podcast
Guests: Swix and Alessio (Latent Space), Jordan, Jacob & others from Unsupervised Learning
This special crossover brings together hosts and minds behind two of the most influential AI podcasts: Latent Space and Unsupervised Learning. The panel embarks on a candid, free-ranging discussion about the turbulent, ever-evolving world of AI engineering, focusing on major surprises of the past year, what’s currently under- or overhyped, the shifting product and infrastructure landscape, defensibility at the app layer, and the big outstanding questions for the industry. Both shows’ hosts reflect on the nuances of open source, model company strategies, emerging agents, infra challenges, and more—offering listeners a frank look at where software 3.0 is heading.
Reasoning Models Surpass Expectations
“It’s so over and then we’re so back, in such a short period.” — Swix (01:36)
Open Source Progress and Stagnation
Panelists are divided: open source models like DeepSeek advanced rapidly (sometimes to the surprise of the market), yet enterprise uptake lags, with adoption at roughly 5% and declining.
“Open source model usage in enterprises is at like 5% and going down.” — Swix (02:59)
The speed at which open source caught up isn’t systemic; it is really a matter of “how fast DeepSeek caught up.” Other teams mostly distill from leaders, and whether DeepSeek will keep open-sourcing its models is uncertain.
“There’s no team open source, there’s just different companies and they choose to open source or not.” — Swix (04:16)
Market & Product Shocks
Low-Code Builders Miss the AI Boat
“Not all of them missed it. Why?” — Swix (08:05)
Apple’s Disappointing AI Play
Overhyped: Agent Frameworks
“It feels like the jQuery era of agents and LLMs... They’re just building single-file big frameworks.” — Alessio (11:31)
Underhyped:
Apple's Private Cloud Compute: Apple’s privacy-forward cloud compute is seen as a sleeper foundational tech.
Memory & Stateful AI: The lack of robust, persistent memory for agents is a bottleneck—true innovation awaits better abstractions for knowledge retention and learning.
“If there was a better memory abstraction, then a lot of our agents would be smarter and could learn on the job…” — Swix (15:36)
Inference and Verticalization: Net-new “model builders” flood the scene, most lacking clear product differentiation and struggling to compete with generalist incumbents.
Model Companies Are Encroaching on Product Territory
Dynamics are changing as foundation model builders (Anthropic, OpenAI) move up the stack, sometimes alienating their biggest model clients (e.g., Cursor in coding tooling).
“Model companies want to get into the product layer... that’s going to boil up on Cursor vs. Anthropic.” — Jacob (21:34)
Agent builders that handle the “schlep” (deep integration, enterprise sales, support) may retain their edge even if foundation models commoditize.
Current & Emerging AI PMF (Product Market Fit)
Clear winners so far: Code copilots, customer support agents, deep research (long-form question answering/reporting).
Copilot-style tools unlock $100M+ markets, and OpenAI’s Deep Research product launch reportedly generated “billions in ARR”—a sign of immediate, tangible PMF.
“For me, the Product Market Fit bar at the time was $100M. What use cases can reasonably fit $100 million?” — Swix (25:45)
Voice AI & Summarization are on deck as the next big things, with applications in scheduling, intake, and appointment handling where partial automation yields outsize value.
Second Wave Opportunities: Tools that drive revenue (vs. just cost savings) may prove more defensible longer-term, supplanting first-wave BPO-style automation.
Defensibility: It's All About Network Effects, Brand, Velocity
The panel recognized that classic defensibility tropes (unique data, proprietary models) have largely been red herrings.
The real moats: Being the default, staying top-of-mind with buyers, moving fast, and compounding small product advantages.
“It’s the thousand small things that make the user experience delightful and the speed at which they move...” — Jacob (41:48)
Example: Chai Research builds defensibility by being a marketplace with network effects, not by model IP alone.
Infra: What’s Hot and What’s Not
Biggest Unanswered Questions:
Reinforcement Learning Beyond Code/Math? Can RL be extended from verifiable domains (like software) into fuzzier, judgment-heavy areas (law, marketing, sales)?
“If not, we’ll be stuck with agents in verifiable domains and copilots everywhere else.” — Alessio (50:55)
Ratcheting Up Reliability: OpenAI’s “rule of nines”—each order of magnitude increase in reliability is exponentially more expensive. Will infra (hardware, e.g., Nvidia’s dominance) keep pace?
“How are we going to scale this next part? Is Nvidia just going to continue to be dominant?” — Jordan (51:27)
Agent Authentication: How will products securely authenticate agents acting on users’ behalf? Will solutions involve crypto or something like biometric auth (Worldcoin, etc.)?
Quickfire Round Highlights:
The episode is lively, highly technical but unvarnished—with the speakers regularly poking fun at themselves, each other, and the AI echo chamber. There's a spirit both of mutual respect and conversational informality (“Schlep is sticky,” “maybe we need React,” “I didn’t think we’d still have new model companies!”), making complex topics feel accessible and engaging for both technical insiders and curious onlookers.
This special serves both as a deep pulse-check on AI’s current state and a forward-looking guidepost for anyone involved in building, investing, or simply understanding the next era of software powered by generative AI and LLMs.