
Harry Stebbings
Welcome to 20 Product with me, Harry Stebbings. Now, 20 Product is the monthly show where we sit down with the best product leaders to reveal their tips, tactics and strategies for scaling the best products and product teams. Now the real question is, who's going to win? Is it Codex? Is it Claude Code? Or is it Cursor? Well, today, joining us in the hot seat, we have Alexander Berrikos, product lead for Codex at OpenAI. This is an incredible discussion. Time to get the notebook out.
Harry Stebbings
I want your feedback. Let me know what you think harry0vc.com but before we dive into the show today, the early story of Atlassian is probably very similar to your own. Atlassian knows first hand the challenges that startups face every day and that the right tools are essential to go from MVP to IPO. That's why Atlassian for Startups gives eligible companies up to 50 seats free on the premium edition for products like Jira, Confluence, Loom, Jira Product Discovery, Compass and Bitbucket. So your team can use the best-in-class tools to plan, track and collaborate on work, whatever that work may be. Many of today's most successful startups like Cloudflare, Canva and Rivian relied on Atlassian for their growth trajectory, and Atlassian wants to give that same opportunity to the next generation of builders and investors. We know how important it is to focus on building the right things early. Whether you're in the sticky note stage or well on your journey, teams at any stage can work smarter together. It's never too early to start with Atlassian. Head on over to atlassian.com/startups/harry for more details and eligibility.
After Atlassian helps your team build and ship great products, Intercom helps you support the customers using them. If you're looking for a way to transform your customer service, let me introduce you to Fin, baby. Fin is the number one AI agent for customer service, resolving up to 93% of customer queries automatically. There is no other agent that can do that. Not 93% of customer queries. Okay? No other agent can do that. So why choose? Fin is the best performing AI agent for CS. Fin doesn't just answer questions, it takes actions. It automates the most complex customer queries like refunds, transaction disputes and technical troubleshooting with speed and reliability. I wish my team was speedy and reliable. It beats every competitor in every head-to-head bake-off, with completely configurable and code-optional setup.
Interviewer (likely Harry Stebbings or a co-host)
My word.
Harry Stebbings
The benefits just go on and on. It's easy and efficient implementation. It works on any help desk with no tedious migration needs. It's trusted by over 6,000 customer service leaders, including top AI companies like Anthropic, Lovable, Synthesia, Clay and Vanta. So if you're ready to transform your customer service team, scale your support and give team members time to focus on the really high level strategic work, learn more about Fin at fin.ai/20vc.
While Fin scales your support without losing speed, Reforge shows you how to translate that scale into durable product-led growth. Everyone's shipping faster than ever. Cursor, Claude Code, Codex: AI is making writing code faster than ever. But here's the problem. Speed means nothing if nobody uses what you ship. That's where Reforge comes in. Reforge is building the product discovery engine that sits upstream of your coding agents. Not another prototyping tool, research repo or AI interviewer, but a product that ingests your customer data, generates variations of product solutions, validates the solutions before code is written, and hands off winning directions to your team. Reforge kills product debt before it starts, because every unused feature you ship isn't just wasted engineering time. It's a maintenance burden, a complexity tax and surface area that you cannot shrink. Used by product teams at companies like Toast, Vimeo, Klaviyo and many more, Reforge helps teams ship more features that actually get used. Try Reforge at reforge.com/build and use the code 20VC. That's 20VC for one month free of Pro.
Interviewer (likely Harry Stebbings or a co-host)
You have now arrived at your destination. Alex, I'm so excited for this. Dude, I told you. I've been at a PE conference and all I could think was thank God I've got Alex next because this is going to be a great one. So thank you so much for joining me man.
Alexander Berrikos
So excited to be here. Thank you.
Harry Stebbings
Now this is a weird first start, but roll with it. You'll understand my British intricacies. I'm fascinated by people's motivations. Are you motivated more by the fear of losing or, like, the thrill and excitement of winning?
Alexander Berrikos
I'm a maximalist. I'm definitely much more motivated by the idea of winning than the fear of losing. But I'll admit something to you. When I was running a startup before joining OpenAI, one of my darkest moments, and there were many dark moments while I was running the startup, was recognizing that I had spent the past few months trying to avoid losing. All of a sudden I was like, oh my God, that is why I'm so unhappy. And that's probably why the startup isn't going well. Basically, every now and then I have to catch myself and flip back into this idea of winning. But really what motivates me even more than that is I think I just love building things and building things for people. And man, I am so excited for this year because many amazing things that don't exist yet are going to be built and given to a lot of people.
Interviewer (likely Harry Stebbings or a co-host)
I'm diving right in. Elon said that coding is one of the first professions to be largely automated. Do you agree, given your position and what you see day to day?
Alexander Berrikos
For sure, I would agree that coding is one of the first domains where LLMs are really good. But you know, what does it mean for coding to be automated? It's kind of a heavy statement. Right. For example, now that we no longer write assembly, like when that change happened and we moved to higher level languages, did we say coding is automated? Not really. Right. We were just able to write much more code. And then as a result, actually there was much more demand for code and there were many more software engineers required. But yeah, part of what they used to do is automated, in the same way that... Do you know the origin of the word computer?
Interviewer (likely Harry Stebbings or a co-host)
No.
Alexander Berrikos
I might pronounce the location wrong, but I think it was at Bletchley Park. There were all these machines for decoding German Enigma and there were humans who would punch out punch cards and put them into the machine and do a bunch of tabulated math. I'm probably butchering this, but basically there was an intensely manual part of work. And even the first spreadsheet software was loosely based off this idea that you would have an office full of desks arranged in a grid and people doing tabulations and then passing their sheets to the next person. And so all these things, those specific tasks have become automated. But every time that's happened, there's been an explosion in demand for the output. And so you need many more people actually to do that kind of work, even if the specific task has changed.
Interviewer (likely Harry Stebbings or a co-host)
So you think we will have more engineers in five years, not less?
Alexander Berrikos
Yeah, and sometimes we change what terms mean. Right. Like the term computer now refers to something else, but now we have the term software engineer. And so I definitely think we'll have many more builders. And you know, something interesting that I'm observing now is that there's this compression of the talent stack. You know, you still need software engineers today, you still need designers. I'm a PM. Do you need PMs? You know, you can have some fun jokes about that. I don't think you need them. But maybe when you say engineer, you might be thinking of someone who's much more full stack than has been true before. Like, even if you go back a few years, you had many more places where there was the backend engineer and the front end engineer. Whereas now, at least if I think about the Codex team, that's much less the case and things are much more full stack. Right. And this talent stack will compress, but we'll still have people building.
Interviewer (likely Harry Stebbings or a co-host)
Why do you think we don't need PMs in this world? You dangled the carrot.
Alexander Berrikos
Yeah, it's my fun joke, I think. Well, first of all, I think it's incredibly hard to define what a PM is, what a product manager is. I kind of think of the role as actually explicitly undefined, and your goal is just to adapt to whatever the team or business needs. Often if you have a bunch of people trying to build as quickly as possible, then what a product manager can do is spend time taking a few steps back and trying to look around corners and figure out what to do, you know, collaborate with the folks in go-to-market and maybe be the team's greatest cheerleader and quality raiser. But all of those things I just described, which are maybe my current role, could be done by a really strong eng lead or a designer who thinks a lot about product. And so I think it's often useful to have product managers, but you probably don't want many of them until the team is really large.
Interviewer (likely Harry Stebbings or a co-host)
I was stalking the shit out of you for the last few days, which was a very fun expedition into your writing, into your tweets, into your prior interviews.
Harry Stebbings
And you said that human typing speed and validation work is the key bottleneck to AGI, not model compute architecture. And it kind of left it there, and I was like, help me understand why human typing speed and validation work is the key bottleneck, and what you really meant by that.
Alexander Berrikos
For sure. Okay, that's a fun one. I think there are multiple bottlenecks, but that's maybe the most sort of clickbaity one. So if you don't mind, we'll do this slightly Socratically. How many times would you say you use AI today?
Interviewer (likely Harry Stebbings or a co-host)
30 plus times a day.
Alexander Berrikos
Okay, cool. Assuming it was like zero energy expenditure from you, how many times do you think AI could help you per day?
Interviewer (likely Harry Stebbings or a co-host)
I mean, in everything. I think we'll have inference running 24 hours a day across every single thing.
Alexander Berrikos
Exactly. And I hear things now from engineers at OpenAI and also outside, who are telling me: I constantly have Codex running. I never close my laptop, and if it's not running while I'm in a meeting, I'm wasting my time. I need to make sure Codex always has work that it's doing for me. And that's super cool and super exciting. But that's a lot of work, right? To manage these agents and make sure they're always working. And going back to the 30 times per day thing: yeah, when we look at how often Codex users are using Codex, it's kind of in this tens of times range. And I think AI should be helping us tens of thousands of times per day, you know, compute budget permitting, and we'll get there over time. But the problem is, at least if I think of myself, I work on this stuff, I know I should be using AI for everything, but I'm too lazy to type out that many prompts and I am too uncreative to figure out all the ways that AI can help me. And so I end up at kind of a similar number as you. You know, I am still at the point where when I use AI to do something cool, like prep for this conversation with you, I'm kind of proud of myself. I'm like, oh, cool, I managed to use AI in this new way. That's fine for people like you and me who are really interested in this topic. Right. But I don't think we should expect most people, in order to benefit from AGI, to put so much effort into how to use this tool. It should just be effortless for them. I think the world we want to get to is one where, to use AI, you don't really need to figure out the right way to prompt. It's just super easy for you. And you don't even need to recognize that AI could help you. It just knows you, is connected to your context, and chimes in helpfully.
Interviewer (likely Harry Stebbings or a co-host)
That's where I think Claude has done well in terms of the packaging they've done, like Claude for legal, Claude for Excel, where you can implement it and have a DCF model. I'm not into models, but like, better than one could do before. Do you think it is your job then to productize the prompts and the human actions to remove that bottleneck?
Alexander Berrikos
Yeah, totally. So I think that it is our job to make sure that we have the models with amazing capabilities and then eventually to get to a world where this is highly productized, and so you just have this magic text box or audio input or whatever, or you can just add AI to your group chat and it just starts to help. But I think there's quite an interesting in-between stage, and I think that that is actually where the most value lies right now. So here's what I mean. You could try to productize a specific feature of AI for a specific market. And you know, many companies are doing this, but I think it's a little bit hard to know what exactly will work, what is the right form factor. And someone was on your podcast earlier and they said something that I thought was quite interesting, about how you cannot adopt AI at enterprise without FDEs.
Interviewer (likely Harry Stebbings or a co-host)
Yeah, it was Matt Fitzpatrick from Invisible AI.
Alexander Berrikos
Yeah. So even though I am literally hiring FDEs, and if you're an FDE, please apply for a job with me, I actually disagree with that entirely. So what I think we need to do is build tools for people. Like, you can use FDEs, as Fitzpatrick said on the podcast, to automate workflows. Right. But then you're limited by what you, from your top-down perspective, can do and what you, from your FDE staffing, can staff to be built. Right. But for me the most exciting future with AI is one where everyone just feels like a superhuman or a god, just empowered by AI. And for that we need tools that are for people, for individual users, and that everyone feels fluent with. I think the phase that's most interesting that we're at now is building for the kind of people who are interested in figuring out how to use AI. So that's what we need to ship. And I think this was the genius of when Claude Code first shipped: what they really got right was they had this tool that was super easy to use in whatever context you want, just in your terminal, and people started experimenting with where to use it. And so I think as we think about AI being used outside of coding work, one of the most important things we can do is not overly build it, like, okay, this is AI capabilities, but only specifically for finance, only specifically for this workflow, but actually build a much more open-ended tool that someone can just use for any given task creatively.
Interviewer (likely Harry Stebbings or a co-host)
But does that not put the onus or the effort back on the user? Back to the point of your bottleneck of human action and lack of activity on them? If you don't define the task, you put the responsibility on them for defining the task, which humans lack the ability or inclination to do.
Alexander Berrikos
Yeah. So that's why I think it's the bottleneck. So basically here are the three phases in my mind. First, let's have agents work really well for software engineering and coding, because LLMs happen to be good at that. Next, let's realize that for an agent to be useful more generally, using a computer is super valuable. And also we'll realize that all agents are actually coding agents, because coding is just the best way for an agent to use a computer. So let's take that same super flexible idea, but make it available to anyone who's excited to explore and tinker. And we're already seeing people start to do this with the Codex app. The Codex app is built for builders, but we're seeing builders use it for all sorts of non-coding tasks. Then finally, once we see what's working, let's build that productization that you were talking about, where you have highly specific features that just work immediately out of the box for people. And I think we're going to speedrun this entire 1, 2, 3 journey in the next months.
Harry Stebbings
My challenge with what you said about FDEs and implementation within enterprise is data security sensitivity. Permissioning and access provisioning is really fricking hard. And people are much less intelligent and confident than we give them credit for, I think, especially in large enterprise. Sorry. And I think you actually need an FDE to go in and custom fit a lot of the different horizontal solutions to make it work. Am I wrong?
Alexander Berrikos
I think you're right. If you're trying to go all the way from 0 to 1 and you have, and I don't mean grand negatively here, a grand vision for some ultimate workflow automation system, then yeah, you're going to have to clear through all of these security hurdles, all these compliance hurdles that are really real. Right. Build connections to all these data systems and systems of record and action. Yeah. So you're going to need an FDE to do that. What I've seen is that when we do these things top down, we end up massively underleveraging the potential of AI in helping that company. Whereas you can maybe do that in parallel. Right. But if you can just give AI to the people actually doing the work, they can start to get a mental model for how AI can help, and then they can start pulling AI into their workflows at the same time. Here's an analogy. Imagine you work in a customer support role and AI is being brought into your role and starting to automate meaningful chunks of your work, but you've never heard of ChatGPT, nor are you allowed to use it. So in that scenario you have no intuition for what this thing is. Whereas in a world where you've actually been using ChatGPT for work at the same time as parts of your work are getting automated by an LLM, you have much more intuition for how this works. And, you know, I would argue you feel much more empowered about this idea that it's being accelerated, and you have some degree of control to steer where these automations are built, as opposed to it being this complete ex machina kind of thing that is quite disempowering. So bringing this back, I think there is a way to do this, because the data control issues you mentioned are real. But at the end of the day, every tool, every feature, every workflow is for a human who is an employee somewhere.
And that employee is accessing that tooling via their browser or via their file system. And so at the end of the day, everything comes to an interface that an agent running locally on your computer can work with. And I think it's quite unusual. Like, at OpenAI we're building a browser, Atlas, and you might wonder why. And there are many reasons why, but I think one of the key reasons is that by building a browser and by controlling it tightly end to end, we can build safe agentic browsing for enterprise. That is a way to access things agentically that are otherwise not yet built out by FDEs.
Interviewer (likely Harry Stebbings or a co-host)
There are so many questions that I have to ask you. I want to go back before I lose the thread. You mentioned engineers not closing their laptops because they don't want to lose productivity and time building with Codex. You partner with Cerebras, and Cerebras is obviously the fastest provider of inference out there. Amazing win, I think, for both. Bluntly, how important is speed for developers when using Codex, and in the future of AI code?
Alexander Berrikos
I mean, the simple answer is it's super important.
Harry Stebbings
And so is it like an inference monopoly, like you have it now and competitors don't?
Alexander Berrikos
This is just my opinion, but I don't think we're going to end up in this kind of monopolistic world. I think there is so much competitive pressure that there'll be multiple answers to this. But I will say that we have news coming out about that partnership soon and I'm very excited for these kinds of things to ship. It's going to be awesome. But even so, with GPT 5.3 Codex, that model is significantly more efficient than prior models. And so the feedback we've heard is that people actually feel like this is now a very competitively fast model compared to before. So there's a lot of things you can do just in terms of the model. There are also things you can do like improving how you do inference. So we recently rolled out a change where in the API those models are served 40% faster and in Codex they're served 25% faster. So I think speed matters a lot and we're kind of approaching it from all angles: the hardware, how you do inference, and the model level.
Interviewer (likely Harry Stebbings or a co-host)
You mentioned earlier about putting it in the hands of users and we talked about inference there. One of my dear friends is Jason Lemkin from SaaStr, and he says that actually inference is the new sales and marketing. Instead of sales and marketing teams, you're paying for inference so users can onboard quickly, easily see value, and you will actually see the removal of sales and marketing teams. It's kind of like the next gen of PLG.
Alexander Berrikos
I don't know. I think I struggle with that. I think, you know, fundamentally, in this new world where anyone can build and it is increasingly easy to build things, what is hard, right? I think having a good relationship with the customer and knowing what they need is as hard as ever, maybe even harder, as there's just more stuff in the market to choose from. You know, the other things that are hard are building the right thing and having a really high quality thing. But going back to the sales and marketing thing, I don't think that goes away because, like I said, I think that's just gotten harder as any given market gets more competitive with more software out there.
Interviewer (likely Harry Stebbings or a co-host)
How much of internal code for you today is produced by Codex? I remember for Claude, what Boris said was like 100% or nearly 100%. How much is Codex used internally?
Alexander Berrikos
So I'll speak for myself and then for the team. I would say most people that I know are basically not opening editors anymore. And this was a step function change. It's been happening gradually, but I'd say the key external market touchpoint for this was GPT 5.2 Codex, where all of a sudden the model was way better at running for longer, handling tasks end to end, managing its context and following instructions. And so we kind of saw this inflection point. And that's actually part of why we built the app. So I think before GPT 5.2 Codex, the kinds of AI features we were using to write code were like tab completion, or maybe you were pair programming with the model. And in my mind, you still needed to be at your laptop with your hands on the keyboard, ish. It might go off and do a little bit of work, but you kind of still needed to be there and drive. It's just handling these small things for you. And then at the time of GPT 5.2 Codex in December, we kind of switched to, actually, I'm just going to fully delegate this task. I'm going to do a plan with it, make sure we like the spec that it's going to do, and then I'm just going to go let it cook. And this is quite a different way of working. So it's changing literally as we speak. And so part of why we built this Codex app that we released last week is because we wanted to build a form factor or user experience where it felt very ergonomic to be delegating to an agent instead of pairing with it. And so you're delegating to multiple agents at once. And so even at OpenAI, this is changing massively. I don't have a percentage stat for you, but I would say the vast majority of code is written by AI. And I would say that now probably most people are not even opening IDEs. If they are opening IDEs, maybe it's because you want to own the interface, so you'll help flesh out the interface between two modules and then AI fills it out.
Or maybe you want to collaborate on a plan but then have AI fill it out. The code itself is not being written by humans anymore.
Interviewer (likely Harry Stebbings or a co-host)
Will we have IDEs as a part of the stack in 24 months time?
Alexander Berrikos
Depends how you define IDE. So the formal definition, right, Integrated Development Environment. I mean, that phrase is so squishy that literally anything could be an IDE. Right? So I don't think that's very useful. If that's the definition, then yes, you could even argue the Codex app is an IDE. I don't think it is. For me, I think of an IDE as a really powerful editor. And we explicitly didn't build editing into the Codex app because we wanted it to be really clear how you're meant to use it. So, you know, it has a lot of affordances for managing multiple agents, for delegating, for reviewing changes. It has really prominent skills, which are an open standard that are really useful for doing non-coding work. Stuff like, you know, triaging tasks or monitoring deploys or something. But it doesn't have text editing.
Interviewer (likely Harry Stebbings or a co-host)
If we assume a large percentage of the code produced is done by Codex, how do you do code reviews, and is AI responsible for internal code reviews?
Alexander Berrikos
There are a few things here. First off, the spec for what you want to do, or the plan, becomes more important than ever. Think architecturally: how should this code work? We recently shipped a very prominent plan mode that works a little differently than others, where you have the agent go off and propose how it's going to do something. It's quite a long plan, and then it asks you questions about whether you agree on how it wants to do it or if you want to have input. And this is very similar to if you had a new hire who was new to your code base and had to present a request for comments to the rest of the team before they started doing the work. So even though that's not formally code review, I would say review of the plan is actually something that's becoming more important because we're entering more of this delegation phase of working with agents. So that's an underrated thing. Then, okay, there's actual code review. I think a problem that I hear a lot of people talking about, especially in the open source world, is a lot of AI slop. People will just be submitting PRs to these open source repos and they're trash. And maybe the person submitting the PR hasn't even tested them, or definitely hasn't reviewed the code. I think this is a problem. And so a common practice with Codex is to have Codex review its own PR or its own change. And Codex is actually incredibly good at this. We've explicitly trained the model to be good at code review, and that included things like making sure it's really good at creating high signal feedback. So it'll basically have few false positives of criticism, which means you can really trust when it has feedback. And so not only do we encourage people on the team and elsewhere to just ask Codex to review, you can then also set it up to just automatically review. So nearly all code at OpenAI is reviewed by Codex automatically whenever you push it to a git repo.
Actually, one fun thing for people who haven't tried Codex yet or didn't try it recently: sometimes the way that people see how good our models are is by asking Codex to review a different model's code, and basically they're like, oh shoot, I should probably just be using Codex to write my code.
Interviewer (likely Harry Stebbings or a co-host)
You said something really interesting there, about those that maybe haven't tried it yet or are coming back to it. How do you think about retention with this category? I remember Tom Blomfield, who's a YC partner, tweeted months and months ago, but it stuck with me, weird brain, about the ease of transition between different providers. Whether it was Cursor or Claude Code or Codex, I can't remember which one it was, to be honest. But how sticky are users and how do you think about retention?
Alexander Berrikos
We've taken this kind of counterintuitive approach with Codex to just build it super openly. So the Codex core harness is open source and we're always trying to make it easier for people to switch. So for instance, when we first launched Codex last year, we created, well, created is even a heavy word, we just established a convention which is called AGENTS.md. This is basically a file that you can put instructions for the agent in. And we didn't call it CODEX.md. We just wanted it to be something that all agents can use. And pretty much every agent except Claude uses AGENTS.md, which is awesome. And then just last week actually we helped push for putting skills, which are our standard for giving the agent instructions and scripts, in a neutral place. We pushed for those to be stored in sort of a neutrally named folder called agents instead of in, like, codex or something. And again, everyone has jumped on it except the usual suspect. I think it's really great for developers to have a lot of choice and we're trying to make it even easier for people to try different things. Now that said, these coding tasks, right, where you're asking an agent to write some code, they're quite hermetic. And what I mean by this is, maybe an analogy in TV would be episodic, right? You can come in and you've got this open-ended agents file that any agent can read from. You've got these skills that any agent can use, and you can ask the agent to write some code and it produces a patch and that patch goes into git. So both ends of this are pretty vendor neutral, so it's very easy to move between, for now. As agents start to do work that is not writing code, but more general work, again for software engineers or beyond, for any builder, they're going to need to start interfacing with other systems. So maybe your agent is talking to Sentry, right?
Or it's talking to your Google Docs or something. Then I think these agents become much stickier, because actually deciding to connect an agent to that system is a sticky decision. And if you're an enterprise, really trusting that the agent is going to have access to these tools, but that there are really good secure guardrails and sandboxing and controls over how the agent works with these systems, I think is critically important. And that's not something that you're going to want to do multiple times. And so we've been kind of building Codex knowing that this is coming. And so we have the most conservative sandboxing approach. Sandboxing is kind of like a set of controls, OS level controls, over what the agent can do.
Interviewer (likely Harry Stebbings or a co-host)
But I'm a fan of Seven Powers, this brilliant book, which talks about seven ways that businesses accrue value and sustainability. And your stickiness or your retention is one. If we're on the same team with Codex, how do we create retentive patterns, behaviors, programs to ensure that people stay with Codex and they don't flip to Cursor when there's a better model, or Claude Code when there's a better model?
Alexander Berrikos
Yeah, I mean, it's interesting because I think on the one hand, we think about this. Obviously we're running a business, but our mission here is to ensure that we safely deliver the benefits of AGI to all humanity. And so something that's unintuitive to people about the Codex team, Alice, you actually.
Harry Stebbings
But your job is the success of Codex. I get.
Alexander Berrikos
Actually, our job is the distribution of intelligence. And so we're obviously building out Codex. And this is really unintuitive to a lot of listeners, but like, we put all this effort into training these models and then we serve these models to our competitors. And from our perspective, this is so
Interviewer (likely Harry Stebbings or a co-host)
difficult for me as a venture capitalist to understand. You are aware of this?
Alexander Berrikos
Yeah, I'm totally aware of it. Like, OpenAI is like a really interesting and unusual place to work. But basically because we're playing such a long game for us, if the competition gets better, we learn. It's actually helpful for us. And so we're pushing really hard at growing Codex.
Interviewer (likely Harry Stebbings or a co-host)
Do you learn? Because if they're closed and they improve, you don't learn.
Alexander Berrikos
I don't think so. For example, there are a bunch of recent launches, like even today. I literally just quote-tweeted a thing this morning about a launch from Warp. No particular affiliation. Right. And there are a bunch of cool ideas in there about how they framed up the way that their agent can work in the cloud at the same time as working locally. And for me, that's inspiring. And I think I see all these things from various companies and one of the coolest things about the space is we're all kind of inevitably reaching the same conclusions together and then building things out. And so on the Codex team, I think we have some massive advantages. We have the massive distribution advantage with ChatGPT, and we have the massive capability advantage of training our own models to be good in our harness and building our harness to be good at the new models, and no one else has early access to those. And so I think we're playing to win and we have a number of really big advantages, but we're also playing this long game where again, we serve our models to everyone, and where we push for open standards so that everyone can use all the things that we're pushing for as well.
Interviewer (likely Harry Stebbings or a co-host)
Can I ask you what will be the defining factor of winning? And I know I'm using venture language and you're brilliant and kind of much more free and open, but what will be the defining factor of winning? If I push you, is it GTM, in that the biggest enterprises in the world do want to work with OpenAI? I have many friends on your sales team, and the inbound that you get from the largest brands is incredible. So is it GTM because of the incredible brand, product execution and just Codex being a freaking awesome product, or compute, inference speed, actual compute advantage? Which one is the defining winner?
Alexander Berrikos
Okay, so I think if we're going to talk about it more from an OpenAI perspective, obviously this is way above my pay grade, but I would say it's compute advantage and having the best models. And in order to achieve that, we then need to build businesses that generate revenue. And something really interesting that we noticed with having the Codex team, which is a sort of combined team of research and product, is that by building these successful products, we also create a lot of pressure to improve the model faster. That's maybe the company perspective, right? If we come to the product perspective, I think the single most important thing we can do is build a really good product that people want to use. And like I was saying earlier, I think we really want to build products for individuals and then allow people to become fluent in those products and then pull in automation. And I think that may be counterintuitive but will result in way more impact than anyone purely approaching it from the enterprise workflow perspective. You know, I think that's mostly a question of product execution. And then that works for, say, prosumer. When it comes to enterprise, the go-to-market side is really important. Something that I've learned the hard way is if we go to an enterprise and we're just like, hey, we're here, feel free to use the stuff, that doesn't work. There's actually quite a lot of education that needs to be done and there's a lot of configuration that we need to support and sort of education of the broader team. So that motion looks much more like coming in, pitching, meeting the head of developer experience or whatever, understanding how they want their team to operate, and then giving them tools to propagate that mechanism of operating to the rest of the team.
Interviewer (likely Harry Stebbings or a co-host)
You said the word revenue there, which is one metric to measure a business against. When you think about your metric of success, when you sit down with Sam or Brad or whoever it is and say, hey, this is what we're optimizing for, what is the metric that you use as the defining North Star for your progression?
Alexander Berrikos
It's actually not revenue as the primary. The primary is active users.
Interviewer (likely Harry Stebbings or a co-host)
How do you measure active users? Like daily active users?
Alexander Berrikos
Yeah, so we measure weekly active users and it's, you know, did this person like actually do a turn in our product? You know, did they send a prompt?
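A minimal sketch of the weekly-active definition described here, assuming a hypothetical event log of (user_id, timestamp) pairs, one per prompt sent. The function name, the event shape, and the trailing seven-day window are illustrative assumptions, not OpenAI's actual pipeline.

```python
from datetime import datetime, timedelta


def weekly_active_users(prompt_events, as_of):
    """Count distinct users who sent at least one prompt in the 7 days up to as_of.

    prompt_events: iterable of (user_id, timestamp) pairs (hypothetical shape).
    as_of: datetime marking the end of the trailing window (inclusive).
    """
    window_start = as_of - timedelta(days=7)
    # A set keeps each user once, however many prompts they sent.
    active = {user for user, ts in prompt_events if window_start < ts <= as_of}
    return len(active)
```

Swapping `timedelta(days=7)` for `timedelta(days=1)` would give the daily-active variant of the same count.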
Interviewer (likely Harry Stebbings or a co-host)
Is weekly active a frequent enough metric, do you think? Sounds nice, but if this is actually replacing the ide, is daily active not better?
Alexander Berrikos
I think daily active will be better soon. We just happen to use weekly active. It's like a standard here. And I think as we were getting started it made sense. But I actually agree with the criticism there. It's like, we should probably just be a daily. Like, I think we, we need to be getting to a world where for any given task that you have, your first instinct is to ask an agent to help. Right. It's kind of like, you know how like with Google Search, it's just like, okay, anything I need to do, I just like go into this text box and I can get navigated to the right location. Then you had ChatGPT. It's like for any information I need, I can go into this text box, type it out, and get information that helps me. And I think the next phase that we'll see this year is for any task I need to do, as opposed to just get information, I go to this text box or this input and something happens that helps me. Even if it's not the full task, even if it's only a small part of it.
Interviewer (likely Harry Stebbings or a co-host)
You mentioned chat there. Again, I jump around, sorry, my brain. My mother has to walk with me around London and she deals with this manic, episodic brain. But you mentioned chat and the interface there. I'm really fascinated by this because it is a seemingly incredibly efficient input function for busy humans. But I spoke to Anish Acharya, who's a GP at Andreessen, and it came out the other day and he's like, no, no, no. This was created by Sam and Elon and it works for very efficient people. But most of the planet want browser-based discovery interactions, UIs. Do you think that chat will be the enduring UI in the next wave of AI interaction with humanity?
Alexander Berrikos
The simple answer is yes. But actually I think there are two components here. If we just imagine the future, just like, let's think of some sci-fi movie, right? Like, what does AI look like? I believe that sci-fi is a really good predictor of what the future should look like. And usually it's pretty simple because it's a story and I think simple is usually right. It's going to be just some entity that I can talk to however I want about whatever I want, right? I shouldn't have to navigate to a place where I work with my coding AI and then I have this different place for my sales AI and I have to be like, hey, I'm now talking to the sales thing and do that. It's just like I'm just going to talk to a thing and it's just going to help. So I think what we're going to have is chat or voice, basically. A conversational interface will be sort of the pillar of everything, something that you can talk to about anything and that you can add into any group chat or whatever so it can discover how to help you. But then if you're a power user and you're very good at a specific thing, you probably don't want to be disintermediated by having to talk to another person. It'd be like if you had an executive assistant, but you can only work by talking to them. That's super annoying, right? So at some point you want to get to the show notes and look at them yourself and edit them yourself, right? You want to edit the thing yourself. So I think we'll pair chat with functional, graphical interfaces that are bespoke to what someone needs. So in my case I will probably chat to do my, you know, podcast prep. But when it comes to actually looking at product and code, I probably want the Codex app that I can go into and get deep in. Right? Whereas maybe if we're talking to a marketer, maybe that marketer will chat to ask questions about the product. 
They're not going to download the Codex app just to ask questions about the product, but maybe they'll have like a super custom GUI for like ad analytics or something that they go into.
Interviewer (likely Harry Stebbings or a co-host)
Totally get that. And it kind of wrongly assumes, on my behalf, a consumer interaction at some point in that journey. And I want to ask you, how do you think about like agent to agent experiences and designing experiences for agents, like we spoke about, for example, going to large enterprises and how you can be helpful. I'm just using the most boring thing ever, expense approval. You could have agent submission of expenses on my behalf for my trip to San Francisco and then the agent on the flip side doing approvals for that from OpenAI's compliance department. How do you think about that and that paradigm shift?
Alexander Berrikos
That's interesting. To be honest, I'm not sure what that's going to look like. My quickest answer to this is that we've noticed as we build Codex that the best interfaces for Codex to do work also tend to be the best interfaces for humans. So when people ask, oh, how can I make my code base more efficient for the agent to work with? The answer is often, well, have you looked at it yourself? And is it easy for a human to work with? So a very specific example would be running tests in the code base. Naively, if you just set up most test runners, they just emit all the outputs of all the tests. And so as a human, it's really annoying because you have to go in and find the one that failed. And it's like you've got to read hundreds of thousands of lines. Turns out that's terrible for AI as well. But if you filter it down to only emit the failed tests, that's better for humans and also better for agents. So probably the agent-to-agent interaction points will be very similar to what they would be if there was a human in the loop. And that's nice because it means you can kind of atomically replace individual systems.
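The test-output point can be sketched in a few lines. This is an illustrative example only, assuming a hypothetical `CaseResult` record per test; it is not how Codex or any particular test runner formats results. It drops passing tests and emits one line per failure plus a summary.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical record for one test outcome."""
    name: str
    passed: bool
    message: str = ""


def failures_only(results):
    """Return a compact report: one line per failed test, then a summary line."""
    failed = [r for r in results if not r.passed]
    lines = [f"FAIL {r.name}: {r.message}" for r in failed]
    lines.append(f"{len(failed)} failed, {len(results) - len(failed)} passed")
    return lines


if __name__ == "__main__":
    demo = [
        CaseResult("test_login", True),
        CaseResult("test_checkout", False, "expected 200, got 500"),
        CaseResult("test_search", True),
    ]
    print("\n".join(failures_only(demo)))
```

The same short report serves a human scanning a terminal and an agent with a limited context window, which is the design point being made.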
Interviewer (likely Harry Stebbings or a co-host)
I mentioned our show on LinkedIn and a wonderful investor from a different company asked a question. It's like Harry Potter, Voldemort, it's like he who shall not be named. I don't want Sam to kill me, but from another company. He asked: how do you think about a coding data moat? And does Anthropic have all the data
Alexander Berrikos
Now, I definitely don't think they have a significant advantage in terms of data on coding. I think that from what we've seen, and I would defer to my research team on this, we feel like we have plenty of data to build really good coding models. I actually think the place that's more interesting for getting data now is knowledge work tasks, because that's data that's not really available in most places on the Internet. And so you start to have really interesting brainstorms for how to help a model be good at it. Maybe you have to pay people to simulate doing tasks so that you can learn these trajectories for the model. Maybe you should acquire startups that are no longer in business but have a lot of data, say their Slack or something. Yeah, I think that kind of knowledge work task distribution is much harder than coding.
Interviewer (likely Harry Stebbings or a co-host)
That's so interesting. You mentioned the data that doesn't exist, so to speak. How do you think about your interactions with the data providers, your Mercors, your Turings, your Invisibles of the world? Will you spend 10x there or will you go, we are spending too much on data, we should do it ourselves and do data acquisition?
Alexander Berrikos
Yeah, I mean I think the way that we think about these things is just like how do we move as quickly as possible? And so becoming able to set these things up in house is very expensive in time and we're a small team. So what I have observed so far is that if we need to run a data campaign at scale, we're usually going to enlist help from one of these companies.
Interviewer (likely Harry Stebbings or a co-host)
On the consumer side for Codex, we've spoken about enterprises and going into them, how to engage in terms of developer experience, developer relations. Do you compete with a Lovable and a Replit on a low-end consumer basis in a year or two's time? Or is that a business where you're like, you know what? Codex is not for every person to create an about me or a small business to create their own site. How do you think about consumer in that way?
Alexander Berrikos
Yeah, I would say that right now it doesn't feel like we're competing super directly, but I don't know if you saw our Super Bowl ad, the tagline of which is just: you can just build things with the app. We noticed that many people who are less technical are starting to build things and so the kinds of things they're building are much more hello-world-y. And so I think that we will see some overlap in use cases where you have people just pulling up Codex because they have it as part of their ChatGPT. Actually, a big announcement last week was that we're now offering some Codex to people even on free ChatGPT plans or on the Go ChatGPT plan. So this is massive just in terms of bringing availability to everyone. And so I think we're definitely going to see people with a free ChatGPT plan coming in and just building simple things where they otherwise might have gone to a specialized tool.
Interviewer (likely Harry Stebbings or a co-host)
What would you most like to do differently? But for whatever reason, you can't.
Alexander Berrikos
I feel like it's been a very good few weeks for us, so we're very. I'm pretty jazzed about everything that's happening. Feeling that I have the most.
Interviewer (likely Harry Stebbings or a co-host)
Yeah, that's really interesting.
Harry Stebbings
You said it's been a very good
Interviewer (likely Harry Stebbings or a co-host)
few weeks for us and I feel that. Does the team feel the changing winds of momentum, both in positive and negative cycles?
Alexander Berrikos
Absolutely. We are very attuned to it. Right. Like, if you look at the history of Codex, the first thing we launched last year was like this amazing idea that people were super excited about. It's like, hey, we're going to give the agent its own computer in the cloud. You can have as many of them as you want, working for you in parallel on tasks. Super great idea. To be honest, it didn't work as well as what we shipped later. It was not the best. And then since August, with GPT-5, we started pushing really hard on interactive coding, which is where most of the competition in the market is. You know, we went on an absolute tear. I feel like the public metric we had was like, since August, we grew by like 20x, and then even late in the year, we doubled from December to now. I forget the exact number there, but that was competing neck and neck. But the shift that we felt last week is, you know, we felt like we had the most intelligent model, and that was cemented with 5.3 Codex. We had feedback around our model being slower and maybe less fun to work with and being less good at communicating with you while it was working. We addressed that feedback. And that's true even compared to the other competitor model that launched like 20 minutes before us and was, maybe this is spicy, SOTA for 20 minutes. SOTA, I mean, state of the art. And then we'd always been getting a lot of feedback on the quality of the user experience in Codex. Our most popular surface was the IDE extension, and our CLI, which is a command line interface, was less polished. But with the app, the feedback from the market has been resounding that this is a really high quality experience that's simple, like unintuitively simple, and people are just loving using it. Even our biggest critics are converted. So yeah, and then we had the Super Bowl ad and then we went to free. 
And so going back to your question of like, what do I most want to do differently, I have two things for you. The first is I actually want to get back to cloud. When we pivoted our strategy from like focusing on the cloud agent last year to working interactively, the thinking was very simple. It was just. And it's kind of like what I was telling you about FDEs, actually. If you go too far ahead to workflow automation before your end user is fluent with the tooling and can get it to work simply, then there's this disconnect and you just have this pipe dream idea that's not effective except for the most power users. But once you have this base where people are using your tool every day and they're configuring it and every time they use it it gets better, then the step up to letting it run independently in the cloud is a much smaller step up. So I think it's time for us to get back to building out the cloud product and making it super tightly integrated with the local product. It already is somewhat integrated. And the other thing I want to do differently is start thinking more about the bottlenecks, like CodeGen. Writing code has become basically trivial now. But the hard part is what you were talking about with code review. How do we know the code quality is good? How do we know we're doing the right things? And those bottlenecks I think are underappreciated still and under invested in. So I think we want to get to a world where you can have an agent that is unbottlenecked, that you trust to own an entire microsystem or internal tool or whatever, and can do the full iterative loop, including feedback from users, without having to go through human review. And that is a really hard problem to solve, both from an intelligence perspective, but also from a safety perspective and a controls perspective.
Interviewer (likely Harry Stebbings or a co-host)
How much weight should we place on benchmarks and evals?
Alexander Berrikos
Probably this is an annoying answer for you. It's like, some. In my mind, they do give you a good measure of intelligence and so you can put weight on those for intelligence. And especially before evals are saturated, I think when you see meaningful progress in those benchmarks, it's very, very helpful. And then I think you have to pair that, though, with what it feels like to use the model. And that's a vibes thing. Whenever I talk to anyone, even internally or even talking to customers of our models, I'm always surprised by how vibes-based the evaluation of how it feels to work with the model is.
Interviewer (likely Harry Stebbings or a co-host)
How vibes-based life is. People want to work with people they like is the lesson that I give to kids.
Alexander Berrikos
People want to work with models they like. Yeah.
Interviewer (likely Harry Stebbings or a co-host)
Relationships matter. Can I ask you. I think that Cursor will lose half of their revenue this year. I think we'll go from a billion to 500 million. As a bold statement. Agree or disagree?
Alexander Berrikos
Can I just like, no comment.
Interviewer (likely Harry Stebbings or a co-host)
Yeah, you totally can.
Alexander Berrikos
I don't know, I think it's really hard to say. Like, like, more serious answer here is just like, I think they've built a really successful business. We see them a lot when we're in enterprise.
Harry Stebbings
Do you?
Alexander Berrikos
Yeah.
Interviewer (likely Harry Stebbings or a co-host)
Or is it just Claude Code? Because I don't know anyone that has no.
Alexander Berrikos
I see Cursor a lot more than Claude Code and it makes sense to me. My sort of narrative for this is that you have to meet people where they're at. Most people are used to using an IDE. They've been used to using tab completion even before there was AI, right? Like tab completion existed pre-AI and then AI just made it better. And so I think what's coolest about Cursor from my perspective is that it meets developers exactly where they are. And it's an easy switch. It's like you used to be using VS Code or something. Switch to Cursor. Almost nothing is broken about your workflow. Everything works. Just certain aspects got better. And obviously VS Code, I still use VS Code. There are reasons you might like it more and they're improving rapidly as well. But I think that pitch from Cursor lands well with a lot of people. And so the bet on Cursor, I think, is that they can continue meeting people where they are and then ladder into these more advanced agentic features. You know, that relationship with the customer is valuable and I don't think that goes away.
Interviewer (likely Harry Stebbings or a co-host)
Do you think it was the right strategic decision to start building their own models?
Alexander Berrikos
It's hard to say, but I feel like there is a bit of a gap in the market right now for that kind of model. Again, if we think about what is the thesis, at least my thesis, you know, I'm not super close to working with Cursor or anything, but my thesis for how they win is that they meet everyone where they are and they make it really easy to step up into using more advanced agentic workflows. Maybe they noticed that the models that, for example, we were putting out or some of the competition were putting out were kind of slow relative to what their customers wanted. My first magic moment in Cursor was when I hit Command K the first time. That's a feature that lets you select some code and just edit it inline. And I was like, this is incredible. And so if they noticed a lot of their customers want to be able to kind of pair with the AI, and then maybe after pairing with the AI for a while, they start doing more delegation and then they move it to the cloud, then there is a gap for that fast model that they trained. So I think that makes sense in that context.
Interviewer (likely Harry Stebbings or a co-host)
In terms of market composition, as an investor, I have to think through the eventual state of this given market. Kind of a terminal state. How do you think about that? Is it like Uber and Lyft and the majority of the market will be on Codex or Claude Code? Or is it like AWS, Azure, Google Cloud and a 33, 33, 33 split?
Alexander Berrikos
Okay. So I think this might end up with fewer providers that are capturing a lot of value in the long run, and here's why. And maybe this is a bit spicy, but I think that we are kind of in this temporary phase where we have agents that are really good at coding. And if you look back at last year, maybe more people thought we would have agents that are good at other domains too, but that didn't happen last year. So we only have PMF for coding agents in the industry overall, I would say. And then there are some very narrow other use cases like customer support, et cetera. But I think that's probably temporary. And then over time we're going to end up with agents that kind of can do anything for you. This is kind of what I was saying earlier. Like, there's just a super assistant. You talk to it about anything and then there is specific UI that you can go look at if you happen to be deep in a specific function. So in that world, I don't think you want like 12 agents at the company where your employees have to go figure out the right one to talk to, because then they won't achieve fluency. And if they don't achieve fluency, then they also won't pull automation into their roles. But if you have this one thing that you can talk to about anything, right? So your onboarding is just, go talk to this thing about anything you need, then people will develop muscle memory to go to it. It'll become the center of gravity of work and people will pull in automation. So I think that future makes much more sense. And I think as the people building ChatGPT, we're really well set up to deliver that. This is kind of a stretch, but an analogy here is I used to work at Dropbox, and this is before Slack was big. And for a while we wondered if people should go comment on documents in Dropbox or if they should go talk about the documents in Slack. 
And it was obvious that it was more optimal for people to put comments on the right timestamp in the video in Dropbox or comment on the document in Dropbox. So it was more optimal. However, what we saw is that Slack is just such a center of gravity, of people just talking to each other. Nobody wants to comment on the document. I just want to Slack you. And so we saw that there was this really big pull towards things happening in Slack, even if it was less efficient. And I think we're going to see something similar at work where if there is a single agent you can use for nearly anything, there will just be this giant pull and everyone will talk about how they use that one agent for things. Teams will share best practices with each other. There'll be hackathons around how to use that best thing. Yeah, and you'll end up with just a handful of these.
Interviewer (likely Harry Stebbings or a co-host)
You mentioned agents not really proliferating in terms of usage other than coding, and actually maybe this being the time, with customer support as one of the examples. My question to you is, I'm an investor today. I'm looking for companies which will accrue value over time and provide incredible products to customers. There is a belief that the durability of revenue of large SaaS companies today is zero and that SaaS is dead because the model providers, you, Anthropic, others, are going to come for our lunch, so to speak. What would you advise me?
Alexander Berrikos
Things are built for humans, otherwise what's the point? Even SaaS tools are built for humans. So for me, I think my question is, does this SaaS company own a relationship with a human on the other end of things? And if it does, then I suspect it's not going away. Or does the SaaS company own some really important system of record? Then it's probably not going away. Maybe both of those two things, the interaction with the human and the system of record, are more important than ever actually. On the other hand, is the SaaS company kind of a glue layer that doesn't own either of those two things? I'm not the expert here, but I'm more nervous about that kind of company.
Interviewer (likely Harry Stebbings or a co-host)
if we take that stance. Salesforce and ServiceNow, they're down 20, 30, 40%. They shouldn't be.
Alexander Berrikos
I don't think they should be. I don't know. What do you think? I would love to hear your take on this.
Interviewer (likely Harry Stebbings or a co-host)
I think it's massively exaggerated. I think there are some companies that legitimately should be. Respectfully, I think Dropbox is in a very difficult position and I think, I think your Monday.coms of the world though for the majority of SMBs and consumers who use it, which is the large majority of their market actually. Could they vibe code a to do list? Yes. Would it be cost efficient to do so? Not really. Actually by the time you customize it and perfect it and to be honest,
Harry Stebbings
this to do list is generally pretty
Interviewer (likely Harry Stebbings or a co-host)
bland in terms of what you need to do. Add task, complete task, show historical tasks, assign to new members.
Harry Stebbings
It's not very difficult.
Interviewer (likely Harry Stebbings or a co-host)
And so actually I think you just keep it. And so I think it's massively overblown and I think that's the classic knee-jerk reaction from markets.
Alexander Berrikos
I completely agree. I mean if anything like now that it's so much easier to build.
Interviewer (likely Harry Stebbings or a co-host)
But I do think, Sorry, I do think, I think you're going to come for customer support and I wouldn't want to be in that category.
Alexander Berrikos
I think this maybe changes what kind of founder you invest in. Right? Like I think there was this maybe temporary phase that I liked personally as a product builder. There was this phase where you would invest in the person who can just build good product and you could kind of ignore whether they had a good thesis around a customer or go-to-market or distribution or anything like that, because it was so hard to build good product, and I think that was an anomaly. If we look at where we are now, maybe that kind of founder is not the founder you should invest in, because it's relatively easier to build good product and you need to go back to investing in the founder who's thought through distribution, who has good domain expertise on what to build for a specific customer, et cetera.
Interviewer (likely Harry Stebbings or a co-host)
So again, if you were on my team as an investor, how would you think about interesting areas for us to invest in, in companies that will accrue value and not be threatened by model providers? Because again, you're going into health, you go into code, obviously, Codex is very clear. You go into customer support. Where are you not going and where is Claude Code not going?
Alexander Berrikos
I'm tempted to just say, I don't know, I think it's a hard time to be an investor. The market is so dynamic, it's hard to say.
Interviewer (likely Harry Stebbings or a co-host)
It's a really tough time to be investing today. My answer is kind of twofold, actually, which is like, number one, I look for things with physical infrastructure. I don't think you're going into energy supply. And then two is like fintech and banking integrations, gnarly financial products. I don't think OpenAI is going to go into building 500 relationships with banks in Southeast Asia.
Alexander Berrikos
I tend to agree. Again, comes back to, are you going into a gnarly, complicated market where customer relationships and knowledge of the market are everything? That still seems great.
Interviewer (likely Harry Stebbings or a co-host)
How bad is the war for talent? From the UK, we look at SF and I say to companies it's better to build in Europe, because in SF it's impossible to acquire talent and it's impossible to retain it. Am I wrong?
Alexander Berrikos
I think that the war for talent is incredibly fierce right now. Obviously, at OpenAI, we have an incredibly strong brand and so we're able to attract a lot of talent. But even so, we put a ton of effort into closing candidates that we're really excited about. Even we feel it. It's not like you just get whoever you want for free.
Interviewer (likely Harry Stebbings or a co-host)
Can I ask, at the entry price that you get stock at, is it still attractive for the best talent?
Alexander Berrikos
I haven't had anyone tell me anything to the contrary.
Interviewer (likely Harry Stebbings or a co-host)
To what extent do you think about finding the perfect fit versus finding someone who's good enough?
Alexander Berrikos
Earlier, I made my joke about PMs being optional. Yeah, I think that's not actually true. You still need product people. But I do think that they have to be the perfect fit. And if you have someone who's not the perfect fit, they might just do more harm than good. It kind of means that we're way more selective than I might have been in other roles.
Interviewer (likely Harry Stebbings or a co-host)
I'm a CS student. Okay. I'm at Stanford, I'm at Imperial, I'm at Cambridge, I'm at ETH, wherever, a great institution. What would you advise me, knowing all that you know now, that would help me navigate the next five years of my career? I want to be valuable to the AI ecosystem as an engineer entering the workforce in the next year.
Alexander Berrikos
Basically there's actually never been a better time to be an engineer, because you have incredible tooling available to you to get an incredible amount done. And your ability to ramp into a complex code base that you might be hired into has never been faster, because you can go ask AI a ton of questions about the code base and you can ask it to plan out changes that would otherwise take you days to research, maybe. So first off, I would say you should be very optimistic about your abilities once you're at the job. But then of course the question is, how do you get the job? Because it's never been easier to build things, the thing that becomes scarcer is agency, taste and quality. I would urge you to just build things and demonstrate your agency and your taste around what you build, and build things that are of high quality, and then share those things. We get a lot of inbound from folks, both applying for jobs through the careers page and also on social. And this is just me, but when someone writes to me with some interesting thoughts and a link to an interesting project, that gets my attention much more than a normal resume does.
Interviewer (likely Harry Stebbings or a co-host)
Final question before we do a quick fire. What has Claude Code done well that you sit back and learn from?
Alexander Berrikos
A number of things. Like I was saying, I think way back last year they made something that was really easy to use and just worked with all your tools with zero setup by running it locally in your terminal. When we started investing much more in the Codex CLI and shipped great models for it like GPT-5, our growth exploded. And so I think that idea of just meet people where they're at, give them something easy to use, let them ramp from there and figure out how to use it, has been awesome. So that's probably the biggest learning we've had from them.
Interviewer (likely Harry Stebbings or a co-host)
What mistake do you think they made that you've also learned from having had the benefit of seeing them make it?
Alexander Berrikos
They over-indexed on their initial success with their command line interface tool. I think at the end of the day it's not the friendliest UI, and it makes it hard to extend beyond pure builders, and it makes it difficult to truly delegate to agents, because effectively to delegate through that kind of interface, you have to be a power user of your terminal or tmux or something. And so that's why we built the app. To me it was kind of a risk when we started, but the market reception around the app makes me feel really good about that decision, because the Codex app is a much more intuitive, simple interface to get started with. It's less scary, but then it naturally leads you to this idea of, I'm going to take my hands off the keyboard and delegate to the agent.
Interviewer (likely Harry Stebbings or a co-host)
You mentioned Dropbox earlier. The alumni from Dropbox is incredible. I mean, really amazing to see the talent that's come out of Dropbox. What's your single biggest lesson from Dropbox that has shaped some of your thinking now with OpenAI?
Alexander Berrikos
Oh, I don't need to think about that one. That's kind of the thing I was telling you about earlier, right? When you're building tooling for people, for end users, you have to think about that tooling as a system of engagement. If people don't want to use your tool, if it doesn't naturally feel like the easiest way to get something done, then people just won't use it again. I learned that from watching how Slack just absolutely took off. And so I think about that a lot now. When we're building these agents, I'm like, if we build our agent purely as workflow automation, then it's always going to be like pulling teeth to get that thing started, right? You're going to need to hire Accenture or someone to come in. They're going to need to deploy FTEs. It's going to be tough. But if you can build a system that people just love using, even if they only use it for partial tasks, over time they'll get better and better at using it, and then you'll get connected to the tools you want over time, and then you can start laddering in automation. Obviously these aren't mutually exclusive.
Interviewer (likely Harry Stebbings or a co-host)
How on earth do you reinvigorate growth at Dropbox today?
Alexander Berrikos
At least from when I was at Dropbox, the thing we were uniquely good at was desktop software. And desktop software, it's funny, it was never not back, but anyways, it's so back. Basically because if you're solving for productivity and knowledge work, yes, there are systems of record everywhere that you need to connect with, but everything at the end of the day happens on the user's computer, either in their browser or just locally in apps on their computer. I do think that the fastest way we're going to see productivity gains from agents at work is going to be at first meeting users on their computer, working with the stuff that they have available to them, without having deployed FDEs to set anything up. And then over time you'll connect in these various systems. And so if I was Dropbox, I'd be thinking about how do we leverage our unique domain expertise in building really good desktop software and this sort of collaborative layer on top of your computer. How do we leverage that to enable productivity agents? It's a bit broad, but I think that's the angle you go for.
Interviewer (likely Harry Stebbings or a co-host)
No, I love it and I really appreciate the response. Final one before we do a quick fire, promise. I've been brought up in a world where margin matters. Software margins are wonderful, and it's what makes software a brilliant category to invest in. We're seeing margin profiles that are very different, in inference-heavy plays in particular. To what extent should I put that out of mind and appreciate that costs will come down, cost of tokens will come down, and actually it's about usage and customer love, and margins will come? Or no, margins are actually freaking important, keep that focus?
Alexander Berrikos
I think both. Costs are going to come down significantly. And I also think that if this is the year of agents being deployed broadly at work, then this is also the year where they're going to have to be connected to all these various systems. And I think that's going to be very sticky. And so I view this year as a race. And so I think you want to win that race and you should be okay taking some hit to margin in the meantime.
Interviewer (likely Harry Stebbings or a co-host)
Dude, quick fire rounds. I say a short statement, you give me your immediate thoughts. Does that sound okay?
Alexander Berrikos
Yeah.
Interviewer (likely Harry Stebbings or a co-host)
What have you changed your mind on most in the last 12 months?
Alexander Berrikos
When I joined OpenAI, and this was a little longer than 12 months ago, I thought that we would all just be hanging out with our computers, screen sharing, within a year from then. We'd have this agent that we're just talking to. That was completely wrong. I think the rate of progress in multimodal models was slower than I expected. Multimodal means models that work with video and audio. So instead what happened was that we saw that agents that work with your computer through code are the way. And so for me, that's been a complete rethink in terms of how we bring the benefits of AI to just people generally. It's not through video and audio primarily.
Interviewer (likely Harry Stebbings or a co-host)
Which lesser known competitor do you respect most and why?
Alexander Berrikos
The first one that came to mind was Amp. Yeah, Amp, it's out of the folks at Sourcegraph. Their product has a great reputation of just punching way above its weight. But I think the other thing that I really respect is that they helped initiate this whole standardization around AGENTS.md and agent skills, which are what I was saying earlier about making it easier for users to manage all these different agents that they're trying. We obviously put out AGENTS.md, but they put out AGENT.md. And basically Quinn started this all by putting out a tweet that said, hey, if you guys buy the domain agents.md, we'll standardize to your spelling. And as small as that was, that initiated this whole standardization that I think has been awesome in the community.
Interviewer (likely Harry Stebbings or a co-host)
Do you think the response to Anthropic's ads was the right response?
Alexander Berrikos
I mean, there were so many different responses. The one that I heard, obviously, I think was right. The one that I heard was, well, one company is being pretty negative about the future, and the other company, us, OpenAI, is being really positive and just telling people they can build things and dream. I thought that response was brilliant.
Interviewer (likely Harry Stebbings or a co-host)
I mean, Sam wrote an essay. Do you think it was a good response?
Alexander Berrikos
I think so. I mean, I think one of the cool things that I love about OpenAI is like, people are like very unapologetically and authentically themselves. And so for me, that was just like a very authentic response. And I like that we do that.
Interviewer (likely Harry Stebbings or a co-host)
What's the hardest product decision you've had to make since being at Codex?
Alexander Berrikos
Well, I can tell you the most painful product decision we had to make. For a while, Codex Cloud was effectively unlimited. Not free, like you needed to pay for ChatGPT, but then you had unlimited usage. Every day that we left it that way, we knew it would be harder to wind back it being unlimited. But we were just so focused on competing on our other things that had more PMF that we kind of punted that decision out. When we wound back that unlimited use to some more reasonable limit, there was a lot of blowback from users, and it was a very small minority of users who thought everything should be kind of pseudo-free forever. But that blowback affected us everywhere, because the social chatter doesn't really distinguish between these things. I think the lesson I learned the hard way there is you can't make things unlimited for too long.
Interviewer (likely Harry Stebbings or a co-host)
Dude, it's like pricing, grandfathering. Pricing is just such a hard thing. What do we do today in engineering or product that in five years' time you'll look back on and go, oh my God, can you believe that we did that?
Alexander Berrikos
Well, one is just editing code by hand. I think probably another one, this is maybe spicier, might even be actually managing the deployment and monitoring of systems by hand. I basically think that big companies will probably take a long time to deploy this, but many startups might actually start building on a completely new stack that's fully AI managed. To be clear, the stack doesn't exist yet, but a fully managed AI stack where basically it's been built to give you really strong deterministic guardrails over what the agent can do, and control to roll back deploys and everything like that. And so we'll get to a world where the way you start a company is you start by getting an agent and just asking it to build things, and then you get more agents in that, and then maybe eventually you add your co-founders to this service that you use to work with agents. And so you end up with maybe your main communication tool actually being your agent communication tool, and then maybe you're not actually handholding this very painful CI and deploy process, but you're just having agents do things.
Interviewer (likely Harry Stebbings or a co-host)
Weird question, but I am intrigued. Are you the one providing agent guardrails? And what I mean by that is agents can go anywhere within an enterprise. Are you responsible for providing those guardrails, or is there a third-party provider who is saying, hey Alex, you can't go into that, that's human resources, or you can't go into that, that's marketing? How do you think about guardrail provisioning? And is that the role of the agent provider or a third-party provider?
Alexander Berrikos
I think we'll probably see both. Like we are putting a lot of effort into agent guardrails. Like I said, we're basically the only company that cares about OS level sandboxing for coding agents. For instance, there's none that exists on Windows. We're the ones building that and we're doing it in open source so hopefully other people can use it. We think about that a lot. ChatGPT supports connectors, so you know, you can talk to your like Google Docs or something and we put a lot of effort into guardrails around what the agent can do with your Google Docs. Those are just two examples but we think a lot about this and I think probably though the way that we'll do it will not be sufficient. Like there'll be third parties who provide like various bespoke things for various bespoke company needs and there'll probably be a mix of both.
Interviewer (likely Harry Stebbings or a co-host)
Final one for you, my friend. What are you most excited about when you look forward 10 years?
Alexander Berrikos
This is probably going to happen in much less than 10 years. But my mission, sort of personally, when I joined the company was I just felt like even with the models we had a year and a half ago, there was so much capability overhang, or just ability for these things to be useful, but we hadn't built the right products around that. And so people like me were getting more benefit than people like my grandma. What I'm most excited for is to get to a form factor for AI that means that they're just helping everyone, regardless of whether they're in tech, and especially if they're not in tech or especially if they're older. And so the concrete vision I have is at some point we'll add an agent to our family WhatsApp or something, and it'll just start being useful to the family without anyone having to think harder about it than that. There are many other ways that that could happen, but I think concretely that's the most obvious thing we could do with, like, my grandma.
Interviewer (likely Harry Stebbings or a co-host)
Dude, I so appreciate you. I so appreciate you putting up with my wandering questions and my very episodic mind. You've been fantastic, man.
Alexander Berrikos
Thanks so much. I mean, I appreciate you putting up with my wandering answers. So all good.
Guest: Alex Embiricos, Head of Codex at OpenAI
Host: Harry Stebbings
Date: February 21, 2026
This episode brings Alexander Embiricos, Head of Codex at OpenAI, to the hot seat for a candid, deeply technical, and strategic conversation. The discussion delves into the competitive landscape of automated coding agents (Codex, Claude Code, Cursor), the coming phases of code automation, what it means for developers and product managers, where the real bottlenecks for AGI lie, and how product experience interplays with technological advances. The conversation is open, tactical, and often philosophical about the future of work, product, and AI.
“I'm a maximalist. I'm definitely much more motivated by the idea of winning than the fear of losing. ... What motivates me even more than that is I just love building things and building things for people.” (Alexander, 04:36)
“What does it mean for coding to be automated? ... Every time that's happened, there's been an explosion in demand for the output. And so you need many more people actually to do that kind of work...” (Alexander, 05:23)
“It's incredibly hard to define what a PM is ... All those things I just described ... could be done by a really strong ENG lead or a designer ... You probably don't want many of them until the team is really large.” (Alexander, 07:21)
“I think AI should be helping us tens of thousands of times per day ... But I'm too lazy to type out that many prompts and too uncreative to figure out all the ways that AI can help me.” (Alexander, 08:58)
“If you can just give AI to the people ... they can start to get a mental model for how AI can help ... you have much more intuition for how this works.” (Alexander, 14:18)
“I would say that now probably most people are not even opening IDEs ... the code itself is not being written by humans anymore.” (Alexander, 19:01)
“We’ve explicitly trained the model to be good at code review and that included things like making sure it’s really good at creating high signal feedback ... So nearly all code at OpenAI is reviewed by Codex automatically.” (Alexander, 21:43)
“It’s going to be some entity that I can talk to however I want about whatever I want, right?” (Alexander, 32:32)
“Slack is just such a center of gravity ... I think we’re going to see something similar at work...” (Alexander, 45:26)
“If people don’t want to use your tool, if it doesn’t feel like the easiest way to get something done, then people just won’t use it again.” (Alexander reflecting on Dropbox, 55:28)
This episode delivers a comprehensive look at the present and (soon-to-come) future of AI-powered coding, work, and product development, through the lens of Codex’s evolution and the broader agent ecosystem. Alex Embiricos shares first-principles perspectives, lessons from competitors and past roles, and a strong sense of optimism about AI’s enabling power — for engineers, enterprises, and everyday people.
Perfect for anyone seeking a nuanced understanding of the new “builder” landscape, agent architectures, and the evolving role of humans in the loop.