
A
We write very, very few specs on the Codex team. We're talking like 10 bullets or something and that's it. The designers on the Codex team write more code now than was written by an engineer like six months ago.
B
And I made a quick prompt to create a little 2D game.
C
Maybe add some more decorations, houses, trees and stuff.
B
Could you add some more decoration like trees? And there we go. We have already new trees appearing for a small change.
A
It's often faster to send a PR than it is to communicate to someone and get them to prioritize that task when they have 10,000 other things to do. I don't actually view PM as a leadership position. I view it as a fill-in-the-gaps position. I think the fewer people you need in a room to do anything, the better that thing goes and the more pure every decision is.
C
Okay, welcome everyone. I'm really excited to host Alex and Roman from OpenAI's Codex team today. They're going to demo how they build new features of Codex, what Codex is capable of, and also talk about how the Codex team ships nonstop. So welcome guys.
B
Thank you. Thank you for having us.
A
Yeah, excited to chat.
C
So do you guys want to just quickly show what kind of things Codex can actually build in one shot?
B
Yeah, for sure. Let me share my screen to give you a sense. There's so much I could show, but here's a quick glimpse. For instance, here is an iOS app I've been building. If I want to create a new feature for this app, I can simply dictate something that says, hey, can you add a new screen for NASA's Artemis mission return to the moon? I can send that prompt with GPT 5.4, and sure enough, the model will create a new screen for this particular iPhone app. So here we have this app. It's pretty cool. It's currently building this new feature, so we should see that in a moment. But we also have the Codex Spark model, which can really help you ideate and iterate in just a few seconds on anything. In fact, let me show you the difference it makes to have a Spark model responding so quickly. On the left side you have GPT 5.4, right? I'm going to give it a head start. And on the right side you have Codex Spark and, boom, you have like 1,200 a second on average. This is insane speed. So when you want to build something, let's say a game: right before we started this conversation, I went to the Codex app and made a quick prompt to create behind the Crossing, a little 2D game where I can start building. What I also love doing with the Codex app when I'm in the flow is popping the conversation out like this, on top of the screen. This way, if I'm actually working on this game, I can keep iterating and have more ideas. I don't know what we want to do. Do you have an idea, Peter, for what you would like to change in this game?
C
Maybe add some more decorations, houses, trees and stuff, make it more lively.
B
Could you add some more decoration like trees? I'm going to send this task, and in basically a handful of seconds Codex Spark will be able to edit and we'll see the changes live. And there we go. We already have new trees appearing and I can keep on playing. This is insane speed, right? That's why I'm so excited about Codex. You have frontier models like GPT 5.4 that can take on very complex tasks, like millions of lines of code to analyze or migrate. But if you're in the flow and you're really feeling inspired, you can reach for the fast mode or even Codex Spark, and you have this insane speed of thought where you can really build anything. So this is just a quick tour of how to build with Codex.
C
This episode is brought to you by Granola. If you're in back-to-back meetings, you know how much work it is to take notes live and clean them up afterwards. That's why I love Granola, the best AI meeting notes app on the market. Here's how I use it. Granola automatically takes notes during a meeting and I can add my own notes too. After the meeting ends, I use a Granola recipe to extract clear takeaways and next steps in the exact format that I want. Then I can just share notes directly in Slack with my colleagues or even get Granola to share the notes automatically. Honestly, of all the AI apps that I use, Granola is the one that saves me the most time. Try it now at Granola AI Peter and use the code Peter to sign up and get three months free. That's Granola AI Peter. Now back to our episode. I'm really curious how you guys actually build products with Codex on the team, right? Alex, do you even write specs anymore, or do you get GPT to write a spec? Which model do you use, and when, to make this stuff work?
A
Yeah, I think we write very, very few specs on the Codex team, actually. A lot of the work is, let's have the people closest to the metal making as many decisions as possible. So the only time that we'll write a spec is if it ends up being a problem that's kind of hard to fit in one person's brain, right? And by the way, one person can fit a lot in their brain now because they can do a lot. They're delegating most of the coding. So one person can do a lot, but if it ends up being a thing where we're coordinating across a few people, or maybe it's a really thorny decision that we have to make, then maybe we'll write a spec. But the docs that we do write in these cases tend to be incredibly short. We're talking, like, 10 bullets or something, and that's it.
C
Okay, can you guys, like, show me how this works? You give Codex a few bullets, and then maybe it writes the actual requirements first, like an MVP file.
B
Yeah, we could do that. One thing I also want to show you, very simply: go back to this iOS app I was mentioning, which is currently finishing a task over here. Imagine you have ideas for new features for that project, but you're not exactly sure where to go. What is very exciting now with Codex is that if I start talking about, let's plan the next steps, you can see that Codex has automatically understood that I'm trying to make a plan for what we should build next. So if I simply press Shift+Tab, it will enter plan mode. And if I say, what should we build, I can use Codex as a brainstorming partner to plan. In this plan mode, it's going to look at the code, look at where we are so far in the project, and come up with ideas. Then I can also bring my own ideas and start steering the model into a good plan. You can see, for instance, as we speak, that Codex has some ideas based on what it's finding in the files. And here it's going to prompt me for some guidance: what should we do? Should we work more on the Artemis idea that we were just doing? Should we just do a pass on reliability, or a dashboard? Maybe we'll say, yeah, a reliability pass is good. Then: who should we optimize for? So I can use Codex this way. Of course, here I gave no inputs. In the case of Alex, as a product lead, I'm sure you would give a lot more guidance up front, but here I'm kind of letting Codex take some ideas.
A
It's funny, I do this a lot. Okay, so there are various kinds of changes, right? There's the super simple change where you just go straight in and prompt it. Then there's a medium-complexity change where maybe you'll reason about how to do it or ask for a specific plan. But something that I actually do is similar to this: if I have a vague idea, I might just go into Codex and ask it to start thinking about how it might solve a problem. I don't even have a feature in mind. Then it'll go explore and ask me some questions. In my case, I often don't end up even using that thing, because maybe it's quite a complex change. There's a digression here, and what code PMs write is an interesting thing to get back to, but with a complex change, I don't actually want to be on the hook for landing and maintaining it. I'll still go through the motions of plan mode and exploring it, and then I just have a better mental model of what we need to do. And then not the plan itself, but the thinking, becomes something that I share with an engineer. So to take that digression briefly: the designers on the Codex team write more code now than was written by an engineer six months ago. They're absolutely goated. But obviously the tool is a massive part of this. And the team was making fun of me for not landing that many PRs in the last year. I'm not going to give you the number, but I'm like, yeah, it should be more, especially when you consider how many of those were very small tweaks. But I feel like we're at a point now where it's not about whether you can generate the code. The agent is amazing, and you can delegate tasks to it. It starts to be a question of what you are deciding to do. That's actually super important. Are we aligned on what this thing is becoming?
And then on the other side of it, it's like, how are we making sure the thing is really high quality? Some folks will proudly say, the entire app is vibe coded. In the case of Codex, the vast majority of the code was generated by an agent, but we still spend a lot of care and attention thinking about the system and making sure it's really high quality. That's why, for instance, if there's a really complicated feature, I often will make sure there's a more robust, stable owner to own it. Part of the value of a PM is that they can be super distracted and go around. So you don't necessarily want PMs owning these systems.
C
Yeah, you don't want a PM to maintain feature code. That doesn't sound like a good idea. I think we'll screw it up.
A
Yeah. Yeah.
C
Okay. And yeah, some of the other products out there, I love the other products too, but you have to spend so much time learning them. I almost feel like if I weren't on Twitter, I would have no idea how to use the other products. One thing I really love about Codex is how simple the app is to use. It's just very intuitive and very simple. But there are some pretty advanced features, like skills and automations, right? Do you guys use that stuff internally?
B
Yeah, a ton. In fact, I think skills are the most interesting things that the Codex app surface enables you to use. For instance, imagine you're pairing with designers who use Figma. It's amazing now to turn on the Figma skill to pull details directly from the Figma files, all of the React components and the variables, and then Codex will implement the code accordingly. Or imagine you're building an app, and maybe you want to share it and deploy it to Vercel or Cloudflare or Render. Those skills are right there. So you can simply tell Codex what to do and it will basically connect to this ecosystem of tasks. It's funny, I was talking to a friend a couple of nights ago. He was telling me that he had a ton of ideas to improve his product. He told Codex, just write all of these tasks in Linear, so I keep track of everything as we go, using that skill. And at the end of it, he's like, well, now I'm going to bed. Go ahead and implement all of these tasks that we just discussed and cross them off. And sure enough, he woke up and everything was actually complete.
C
Yeah, that's amazing.
A
Coming back to your point about the simplicity of the app, I think it could be interesting to share a little bit about how we think about designing it. I don't know if that's interesting.
C
Yes,
A
There's something that's really interesting about building in this space, which is that developers love automating tools for themselves, building tools themselves, automating parts of the work. So I feel like a really important part of the product is that it's super configurable, right? For us, the Codex harness is open source. You can go deep in. Whenever we're building a feature, we start getting complaints on Twitter that the feature, which by the way is not enabled in prod, is broken, because people are going in and changing the code themselves or forking it to get these new features working. But for me, that's an awesome part of the product, right? What that means is that the cutting edge of your users are just absolutely living in the future with us and pulling us into that future. On the other hand, though, if you only build for that, you end up with this thing that's nearly impossible to understand, and you'd have to spend all day on Twitter, like you were saying. So we have this view that we are really careful about what the core primitives of what we're building are. That's the place where stuff will be written down. It's not just a vibe-coded thing. We're really thoughtful: okay, how do we mostly let the product be almost invisible, get out of the way of the model, and every time the model gets better, just let it do more and more? Then from there, how do we package this in as configurable a way as possible for power users so they can figure out what it is? For instance, there's an implementation of subagents that's out in the wild right now, and people are using it and experimenting, and we're learning a ton from them, even though we don't actually trigger that proactively in the product. It's just something that users can learn about and go use.
And then we learn from how people are using it, and from there we think about, okay, now how do we make that super simple for everyone else? The Codex app is actually an example of this. Around the time of, I would say, GPT 5.2 Codex in December, it had been incremental, steady model progress, but all of a sudden we cleared this point where you could start delegating way longer tasks to the model and it would just one-shot them anyway. People were already tmuxing, or, for anyone who doesn't know what tmuxing is, people were already running many sessions in parallel in the terminal. But we started seeing crazy things on social media. There's this one picture of Peter Steinberger, you know, the creator of OpenClaw, with, I don't know, like 18 terminals across, like, three monitors. So we started seeing people using Codex in this very advanced way. We were very excited. We kept making sure the delegation worked well in the basic product, like the CLI. But then we were like, okay, maybe the top 1% of engineers are going to work that way. How do we make this feel really intuitive? And so then we got to the Codex app, which you launch, and it just feels super simple. It's just like a chat. It'll do work. But then you start discovering, oh, there's a sidebar. Oh, I can run multiple tasks. Oh, it's really easy for me to click between them. Okay, now I'm being really effective myself. And then it's like, ah, there's a skills tab. Let me go into here. So we try to make it so it's almost like playing a game. You're just discovering what's next.
B
Totally. And I think we've always had this vision, right from the get-go, that coding will take place in this agentic delegation fashion. Even when we started Codex almost a year ago, we were always thinking about this future where, as an engineer, you're working on multiple things in parallel. But candidly, the models were not quite there yet, right? I think we needed to see the inflection point with GPT 5.2 Codex and beyond, to see the model be super thorough and work reliably for hours on end, if not days. And by that time you're like, well, now this is a weird interface, to have multiple tabs open in a terminal and just let them run for hours. So then we needed to have this new surface. And I think the timing for something like the app became perfect.
A
Yeah. There are two vibe shifts in Codex history. The first is, like, August. We had launched this Codex cloud product. It was a great idea. People are still super excited about it, but it was a little early. So around August, we shipped GPT 5, a great interactive coding model. We were like, all right, let's go solve the problem that the models can solve right now. We shipped the Codex CLI and IDE extension. Growth started exploding then, and I remember we grew like 20 or 30x in a few months, which is awesome. And then the second shift was around December, January, where we actually could get back to this vision of delegating to models.
C
Let's go a little bit deeper into the development of the Codex app. Did you have some sort of annual roadmap a year ago, like, hey, at this time we're going to launch the Codex app? Or is it more that you saw the market doing this stuff and then you prototyped a bunch of stuff? How was this thing built?
A
Okay, so it's neither. I got some really good advice from a researcher here called Andre, and his advice for me was that at OpenAI, you either plan near term or long term, but you never plan medium term. It's just too difficult. Near term is up to eight weeks from now, eight weeks being the absolute maximum. What is a concrete thing that you can motivate a team to rally together around and get done? This is something that we're really good at at OpenAI, rallying a team around a thing that we want to do. The other thing you can do is have a vibe that's like, a year from now, we're going to have models that are way smarter. I'm rewinding back a little bit in time now, because what I'm about to say is obvious now, and it's obviously less than a year away, but you might be like, yeah, we're going to have models and we're not going to want to lend them our computer to do work, because then we can only do one thing at a time. We're going to want infinitely many models, and they're just going to be doing work independently, validating their own work, maybe even deploying the code themselves and monitoring it themselves. And we shouldn't even have to prompt them, necessarily. So you think ahead to this kind of vibe, right? And the in-between thing is just kind of awkward. The in-between thing is like a product roadmap. We basically don't really have those. We have the combination of a sort of long-term direction and things that we think bring us in that direction. So, for instance, in the case of the Codex app, one of the strategic goals that we had was to dissociate ourselves from a specific workspace. Okay, so that's a bit abstract.
What I mean is that if you're using an IDE like VS Code, which is my favorite IDE, you open VS Code to a specific workspace.
C
Like a folder.
A
A specific checkout of the code. Yeah, a specific folder. Even if you're using git worktrees, you can only open it to one git worktree at a time. You basically can only work on one thing at a time. And the same is true for a CLI as well. Because we have this vision that we want people to be working with agents that they've delegated to in the cloud and that are just working independently, we know we need to get to a point where it feels really natural to be talking to multiple agents at a time, or even just one agent that's orchestrating multiple agents for you. However, we've also learned that if you start in cloud, it can be quite hard for the developer to get value, because your tools aren't there. You've got to do environment setup. It's a little bit hard to get partial credit for a task, because maybe if the model goes halfway, you need to jump in and course correct or just poke at things. So we're like, okay, we need a local experience that is separated from a specific folder, yet feels super intuitive for working with folders on your computer. So when we started the app, we had a bunch of this vibes thinking up here, like esoteric vibes thinking. And then we had a bunch of prototypes that random engineers had built that were just like, I wish we had an app, and it was like this or that or the other. There was actually a hack week where multiple independent people built different versions of apps. You might have even built one, I don't remember. So when the project got started, the only thing that really needed to be written down was why we thought it was a good idea to build an app. There was no specific spec for the app. Eventually we generated one through building. But really it was quite contentious, actually. Should we even be building an app? The IDE extension is super popular. Should we just focus on that and improve the quality there? What about the CLI?
It feels like CLIs are a thing. And then if we are building an app, what is the point of building an app and where should we go? So that's kind of how these things start.
B
Yeah. And luckily we had a great solution with the IDE extension at the time, which we had polished quite heavily so that you can use it in like VS Code, Cursor, Windsurf and others. And so we took a lot of the learnings in the code base from that IDE extension to have a great starting point that was already robust.
A
Yeah, yeah. Actually, like, the app shares a bunch of code with. Well, it just shares code with the IDE extension. And under the hood, both the app and the IDE extension talk to the same core Codex harness in Rust that is open source and that the CLI also uses. So there's a lot of sharing and very intentional layering of these primitives.
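The layering described in this answer, several surfaces talking to one shared core, can be pictured with a minimal sketch. This is an illustration only, not the Codex code: the real harness is written in Rust, and every name below (`CoreHarness`, `Surface`, `run_task`, `submit`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CoreHarness:
    """Hypothetical stand-in for the shared core that does the real work."""
    log: list[str] = field(default_factory=list)

    def run_task(self, surface: str, prompt: str) -> str:
        # Every surface ends up here, so behavior is defined in one place.
        entry = f"[{surface}] {prompt}"
        self.log.append(entry)
        return entry


class Surface:
    """Hypothetical thin front end: it only forwards prompts to the core."""

    def __init__(self, name: str, core: CoreHarness) -> None:
        self.name = name
        self.core = core

    def submit(self, prompt: str) -> str:
        return self.core.run_task(self.name, prompt)


# Three surfaces, one core instance shared between them.
core = CoreHarness()
app = Surface("app", core)
ide = Surface("ide-extension", core)
cli = Surface("cli", core)

app.submit("add trees to the game")
ide.submit("fix the failing test")
cli.submit("summarize recent changes")
print(core.log)
```

The point of the sketch is that an improvement made once in the core reaches all three surfaces, which is one plausible reading of why the layering is called intentional here.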
C
Now it's kind of obvious, because just using the Codex app is way easier than having a bunch of terminal windows open. But was the decision to build the app basically about being beginner friendly, so you can just play with it, as the best interface to manage multiple agents at the same time?
A
Yeah, I would say, the way that we think, we're very AGI, though. So we're thinking about what the future is that we're skating towards.
C
I see.
A
So, yeah, I would actually flip the order. It was more like, we know that we need to build an interface where it feels really natural to delegate to multiple agents, because we know we're going to have models that are ready for it. In fact, we're already seeing people delegating across agents. So we need an interface where that feels really natural and that will scale really well to cloud. And we want all of that to feel ergonomic. It shouldn't feel like you're crazily figuring something out to delegate to multiple agents at a time. You should just feel like, obviously, that's how you want to work.
B
It appeals, by the way, not just to junior developers at all. It's quite the opposite. Even the most prolific, most senior engineers, even at OpenAI, from Peter of OpenClaw to Greg Brockman, are now using the app as the primary way to build. So this was very much this agentic delegation vision coming to life. And it's not like, oh, the most sophisticated engineers will stay in the terminal. They're actually moving to the app as well.
A
Yeah. So, okay. We keep talking about Peter because he just joined OpenAI and we're super excited. Again, he's the creator of OpenClaw. I don't know if I told you this, but I went for a walk with him in, like, October at Fort Mason, which is a place in San Francisco. I didn't outright tell him that we were thinking about building an app, but I was starting to poke at this idea of some kind of new interface that made delegation feel natural. He basically told me he would never use such a thing. And then last weekend he was tweeting, actually, the app is pretty good. Hell has frozen over. I now like it.
C
Yeah, I talked to Peter too. If you got him to use the app, that's a major accomplishment, because he has like 20 terminal windows open.
A
Exactly. I mean, I need to follow up with him. He probably uses both, but who knows? Yeah.
C
Yeah. So, Alex, you were like, the only PM on Codex for a while, right? And how many people does Codex have? Like, you know, 50, 100 people or.
A
That sounds about right. Somewhere in that range. Yeah. Like, we were. We were like eight people back in May, so. Right. Yeah, yeah. Or something like that. I, you know, I don't remember exactly, but we've just like, grown really quickly since then. Yeah, somewhere in the 50 to 100 range is interesting.
C
So how do you, like, spend your time, dude? Like, like, what's like a typical day, like, or is there no typical day?
A
Okay, so in my case, I was thinking about this recently because I realized that I don't know how to answer that question. What I realized is that I have these different modes that I operate in. This is not advice, this is just me. I have a mode, for example before we were shipping the app, which is just straight-up execution: obsessing over quality, making sure we're looking around all the corners and landing every little thing. That mode is spending a lot of time in Codex, actually. We tend to use Codex a lot to understand what's happening. I use Codex a ton to understand what is happening in Slack, what feedback we're getting. I'll have Codex just go and summarize that and follow up in Linear. So there's a lot of just understanding the state of quality using Codex. Then there's a lot of using Codex to understand things about the code, and then using Codex to make changes. Because nowadays, for a small change that's not building a new system, which again I try to avoid, but taking care of existing systems, it's often faster to send a PR that is good and that you've tested than it is to communicate to someone and get them to prioritize that task when they have 10,000 other things to do, because we're aiming to launch an app in like two weeks. So there is that. And then obviously there's a lot of the human side, just cheerleading and rallying, but also being a critic of what we're building. So that is one mode that I've noticed, and you can actually tell I'm in that mode if I'm on Twitter a lot. As we approach a launch, I tend to get on Twitter more.
And then there is this other mode. For example, right now it's quite top of mind for me that we are at a stage where we have these amazing models. GPT 5.4 is incredible. We also have this app experience that is even more popular than we anticipated, and we now have it on all platforms, including Windows. So now in my mind I'm like, okay, it's time to really get back to cloud and invest more in that. When we enter these kinds of phases, I spend much more time thinking about what to do and understanding the state of things. That's kind of a coordination-y mode where I'm actually spending less time in Codex. I tend to be using Codex more for communication and less for writing code. So I have at least those two modes. There are probably more.
C
Like how much cross functional alignment do you have to do?
A
So the Codex team is awesome. We do very little cross-functional alignment within the Codex team. We view ourselves as intentionally a bit of a pirate-ship team. Even within the Codex team, there's me and now two PMs, as of very recently. There are a few eng leads, although until very recently everyone just reported to Tibo. We kind of all just fuzz around together, so there's not too much alignment going on there. But a large part of building Codex is building this coding agent, and it's probably obvious to everyone now that a coding agent is a really generally useful thing for other work that's not just coding. We see people using the Codex app for more than just writing code. They're using it for tasks across the software development life cycle. And now the vast majority of OpenAI uses the Codex app. Even outside of technical orgs, I just see the app everywhere. That kind of realization, where it's like, okay, how do we help Codex be useful beyond just people writing code, requires more cross-functional alignment. Because as OpenAI we have ChatGPT, which many, many, many people use. So we have to be thoughtful about how we do that.
B
And on my side, developer experience, we're kind of an extension of the Codex team now. We're spending most of our effort on Codex, for a few different reasons. One, of course, is that it's an exciting product, and developers love using Codex and we want to make that better. To Alex's point, we have a few modes too. We are in the trenches with the Codex team to prepare the launches and prepare the assets on how to make the most of Codex, and then post-launch we try to educate developers on how to use Codex in this variety of ways. But the other lens through which it's very interesting to us is that when you look at the broader OpenAI platform, we have millions of developers today also building on the API, using the models across different modalities, from image gen to Sora to speech-to-speech. And guess what, the best way to build with AI has become Codex as the entry point, right? If you rewind just a year, or even to last summer when we introduced GPT 5, we had to write a lot of guides on how you prompt GPT 5. Yes, it's a reasoning model, and it's quite different from a GPT 4 model. Well, now, even for those use cases, we try to teach developers to use Codex and skills. For instance, if you have an integration you want to update, you should probably use Codex and a skill, and guess what, Codex will definitely take care of that for you. So we are also very, very cross-functional, and we are seeing Codex as the cornerstone of everything for the developer platform.
C
Got it.
A
One interesting thing about how we work together: effectively, I think the best part of working on Codex is the community online, on the Internet, and sometimes in real life at events, right? And we kind of just anchor everything around that. So it's like, okay, launches. We're very launch oriented: when are we shipping something? We're very feedback oriented: when is there feedback from the community? Let's fix that and communicate that. So we're all quite online. And even, for example, thinking back to the Codex app launch, we were working super closely with Roman and Roman's team on DevEx. He was basically helping us coordinate actually quite a wide alpha with a bunch of users, and building with those users to get feedback, to build skills for the app to use at the same time, and documentation, everything. I think this is a unique strength we have as a Codex team. Because we're open source, we kind of just found ourselves being incredibly open about everything we do. And the community really rewards it.
C
Yeah, dude. Like, building with the users and the community is like the best part about being a pm, just, like, talking to them every day.
B
Now we have Codex ambassadors even in many, many cities and countries who are kind of spinning up their own events to kind of teach their own communities locally, because I would love to be in every city, but we cannot be. But it's amazing to see the energy and the enthusiasm from the community to set up these events, these hackathons, and build together. This is awesome.
C
Yeah, make me an ambassador. I'll throw some events. Yeah, that sounds good.
A
All right, we signed up.
C
So let's talk about Peter a little bit. So I'm like an early adopter of OpenClaw, and it's a little bit janky, but it's done so many things for me. The other day, because it has memory of our conversations, it gave me a very vulgar pep talk for three minutes, and it was the most insightful thing that I've ever heard from AI. So how are you guys integrating Peter into the team and this personal agent vision, is that part of what Codex is doing? How do you think about that?
A
Two things there. I mean, I can only share so much here, but the first thing is that he is an ultra, ultra power user of Codex. And OpenClaw was very much built with Codex, and so he is just energizing the team with feedback and basically work to help improve Codex. That's his side job, but he is doing it and we're really excited about it. And the other stuff that I can't share super much about yet, but he's like, really just like helping us build the, like the next generation of personal agents, but like into ChatGPT.
C
Yeah, that makes sense too. Yeah.
B
What I find fascinating about what Peter has done is like, obviously like, I've known Peter for a while and everybody has like seen this glimpse of the future when they start to play with OpenClaw. But what I find amazing is like, Peter had seen this vision for quite some time. And if you rewind all of 2025, he has built more than 40 open source projects last year. But all of them kind of align with one vision which was, I need a command line interface to access my calendar. I need a command line interface to access my tweets and my Gmail. And by building all of these projects, he effectively made this vision manifest around this idea of skills and command line tools that we use coding agents for today. And it's not going to be just coding agents, it's going to be like any kind of personal agent. And so Peter is going to be fantastic at giving us feedback along the way for having built all of these tools that are now part of the OpenClaw ecosystem.
C
Yeah, I'm like, Ray, like, he's just one person. He built this awesome open source community and yeah, it's made me not want to open any other app anymore. I just talked to my little bot, so it's a huge difference.
A
What do you have it connected to? Do you have it connected to everything?
C
I have, dude, I have it connected to many things, man. It has my bank information, it has my YouTube information, it has voices that I've activated, my calendar, my Google stuff. And yeah, I'm in bed and my wife is like, who are you talking to? I'm like, I'm talking to my OpenClaw bot. It is true that there's a lot of like grifters out there charging $5,000 to set up OpenClaw. So like, yeah, if you guys can make it like mass market friendly, that, that's a huge, that's going to be huge, you know.
A
So we'll report back. Yeah, yeah, yeah.
C
Okay, well, let's wrap up and talk about like some of your hot takes. Alex. Like, and maybe I'm making this shit up, but like, I believe you said something about like how most people, most teams don't need that many PMs anymore or something like that.
A
Like, well, let's make it spicier. What do you think?
B
Do we need PMs? I think what I find amazing with these tools is that it's not even just PM or no PM in my view. It's almost like every career ladder is starting to blur. It was like, you have a designer here, you have an engineer here, you have a PM here, and maybe you have a ratio of sorts. It's like the golden ratio of all. But now, if you're an engineer, sure, you're more productive, but if you're a designer, you have some superpowers to become more technical. If you're a PM and all you did before was like writing strategy docs, well, now you can just prototype. It doesn't mean that you have to be responsible for that feature for billions of users. But sure enough, you can show your team a glimpse of that vision by actually building it. And so I think that's what I find fascinating. To me, it's like all of the lines between career ladders are blurring, and we're all builders all together.
A
Yeah, this resonates.
C
Yeah.
A
Okay, so I think. I don't think I've. I'm trying to remember, like, what have I said? I feel like it's somewhere on the Internet. I said that. I think it's a red flag if a startup has a PM when it's like less than like 20 engineers or something. Maybe. Maybe I said that. I think, like, kind of like what you said, like, all these roles are blurring together. Right? Like, a designer can do more engineering, an engineer can do more design, a PM can do more building. But, you know, also engineers, often they need to be focused, right? So a lot of why they aren't, like, I don't know, triaging tasks or doing some other kind of like the project management side of PMing might be because they just need to spend time coding. But now that that's really easy. And you can just ask an agent like Codex to like, go, like, analyze the feedback and prioritize, you have more time. And so I think everyone's able to do everyone else's jobs. And like, Scott Belsky has this idea of like collapsing the talent stack. Yeah, I like that idea. I think it is happening. I think the fewer people you need in a room to do anything, just the better that thing goes, the more pure every decision is. So then the question is like, well, what is. What is left for PMs? And I think that there are many PMs who should actually convert roles, right? Like if you're a PM who kind of just like always wanted to be an engineer, but maybe you just like you were very good at managing people, but you were like not that good at engineering. Like maybe now you should become an engineering manager, you know, and with a coding agent, like that's fine and maybe that's just a cleaner role for you. I think there's an analogous version where like a different PM might just want to be a designer now, you know, just be closer to building. But I think ultimately what it comes down to is interest. I think interest and agency are like the most fundamental qualities that remain important for humans in a world with AGI.
And so for me that's kind of what I end up thinking about. Like if you fundamentally are more interested in writing code and like you just did PM work because there was like someone needed to do it, now you should delete yourself and become an engineer and just do the same thing from an engineering standpoint. Same for design. But if you are like fundamentally like most interested in like spending a lot of time with users, even if it takes you away from building, right. Or like trying to look around corners and understand where the market is going, etc. And, and if you are on a large enough team where there's already enough engineers, then I think maybe there's still room for a PM there. But yeah, I think it really comes down to like, what are you most interested in? Okay. And maybe I'll add one thing which is like, I still think every problem needs a human that's accountable for the problem area, but I just don't think that that human has to be a PM.
B
Yeah. And I think it depends a ton like what you said on the nature of the product. Right? Because we're lucky enough here to work on Codex, which is very much like a builder developer product. And we are the best users ourselves and we pair with the community thanks to this open source thing. But even if you rewind like a decade when I was at Stripe, like Stripe reached 250 employees with zero PMs, even without any AI tool. Why? Well, Stripe was just an API and we were all engineers and we knew what a great API would look like. So we were building the API we always dreamed about. We wanted Stripe to be so elegant, just a few lines of code. If you're working in a different field and you want to carry that customer obsession, you know, you may need more PM time to spend time with customers when, when the vertical or the industry, the space, the problem space is different. We're lucky here to work on Codex and kind of like building the tool we've always wanted to have.
A
Yeah, but like, let's say, you know, in this example where it's like, maybe, you know, it's a, it's a field or the product is serving users that engineers have less intuition for. Like, PM is just a label for like someone who can design and code but is most interested in user. Is very interested in users. You know what I mean? You could equally well have an engineer who's very interested in users. So I think these labels are kind of losing their meaning a little bit, but that's okay for now. It's like, it's just a bit messy.
C
Yep, that's what I found on my team too. I feel like the best engineers don't ask me like, hey, Peter, what should we build next? They go off and talk to users and figure out what to build and then we have a conversation about it. It's like, it's kind of like everyone's on the same page around the stuff, you know.
B
That's how the Codex team works a lot. Right. Like, so many features that you're using today with the Codex app came from great ideas from engineers. Completely bottoms up. Because they wanted the feature for themselves.
A
Yeah, but I mean, I would say, I don't know, like, I think, okay, there's a very strong profile of engineer that I love working with who just like, loves, like, hanging out with users and thinking about what to build. There are equally, well, incredibly strong profiles of engineers who just like, are insanely fast, insanely good at building systems and thinking them through and like, have zero interest in hanging out with users. And I think there's plenty of room for those people too. Right. Like, again, that's like kind of my view of like this world with AI is we can all just become more opinionatedly ourselves. You know what I mean? Like, yeah, like, be yourself. Like, AI and the team around you maybe will like, just like fill in your, you know, what you don't want to do.
B
That's a great way to put it.
C
Dude, but I do think, like, the builder label is very important. Like, I feel like every PM wants to be a leader and like, dude, the traditional career ladder is just like, you become like this VP or something and you don't have time to build anymore, man. You're just like in product reviews all day, like, just like giving some feedback here and there. And I feel like a lot of PMs don't want that, dude. I don't know if you want it but like I want to stay close to the users as you actually ship.
A
Yeah, totally. Yeah. I mean I don't actually view PM as a good leadership position. I view it as a fill in the gaps position. Occasionally that might require leadership. Although even then the leadership is probably just like helping people get aligned. Less like being some genius who came up with the right strategy. Yeah, I will say one thing for sure. Like PM, like the best PMs at OpenAI are incredibly in the weeds. And I think joining OpenAI in a senior leadership position is like very challenging because it's actually still important to be in the weeds. So you somehow need to find time for like you know, a senior leadership, but also how to get in the weeds at the same time. So always better I think here to join directly in the weeds.
C
Yeah. Okay, cool. Last question, man. So you finally hired another PM. I think his name is Rohan or something. Right. And what kind of qualities do you guys look for when recruiting people to the Codex team other than to be Codex power users? What kind of qualities?
A
Yeah, let's both take this, I think. I mean, look, I said it before, I'm going to go back to agency. Like people who do things is like literally the most important thing. Like OpenAI and also especially Codex team. Like we're intentionally not a team where like you're going to join and it's going to be like hey, here's like 12 tasks to do in increasing order of difficulty and like go for it. It's going to be more like welcome. Okay, that's just it, it's like welcome, you know. So I think people who are self starters and do things and have like energy and ideas for what to be done and don't mind disagreeing with the existing ideas because they're probably wrong because we probably made those decisions by accident. And who will like, now I'm just describing like perfect teammate, right? Who will like absorb any incremental like scope or accountability for things that are unowned, I think is ideal. So like that is, that is sort of the broad meta things. And then I think if you're trying to think about what role makes the most sense, like anything technical. Engineering.
B
Yeah, yeah, agreed. On my side, like on developer experience, what I'm looking for is usually like people who have high agency, obviously very technical, obviously mastering the tool like Codex, but also have this passion for spending time with developers and builders and sharing their knowledge. We just announced this week for instance that Thomas, who built the open source Codex Monitor, is going to join my team this month. And it's great because someone is like, very creative, very prolific with Codex, but also loves sharing how he's building with it. You know, it's kind of like we need to bring millions of developers to this bright future of Codex. I think, like, agentic coding is changing everything in terms of, like, how we've been always reflecting on how we build software and build apps and products, and there is so much potential to show the world like, that anyone can build anything and, like, teach them along the way. So that's kind of like what I'm looking for.
A
Let me know if this is wrong in my head. The role description for Devex is like, very good engineer who is also very good at Twitter.
C
Yeah, I got half of that. I got half of that.
B
With the little asterisk I would add, which is that, like, Twitter is like, outstanding for our communities here. And when you travel in some parts of the world, some developers are not as much on there, like in Europe and some other places they use LinkedIn or they use some other places. So we just have to have this little asterisk of like, thinking about the worldwide.
A
Good on socials. Okay, but good on socials for sure.
B
Yeah. And love spending time teaching and educating.
A
Yeah.
C
And I feel like agency, you can kind of tell even before they go through the interview process. Right? Like, are they shipping stuff online? Do they have, like, side projects?
A
So, like, when, when someone DMs me, like, you know, interest in, like, working together, for me, it's like, is there a link? If there's a link, I always click it. Like, you know, I guess I. Maybe I look to see if it's like a. A bad link, but no, I pretty much always click it and I'm always curious. And then if there's like some spiel with ideas, I usually always read that. And then I don't know how bad this is or toxic this is going to sound, but if it's like some explanation of like, why they're interested in the role and like, you know, their CV and stuff, like, I'm much less likely to read that than, like, their ideas and what they built, you know, and then I, I never. Someone asked me this the other day and I realized, like, I had no idea where, like, people went to college, you know?
C
Who cares, man? Yeah, who cares?
A
Yeah.
C
Like, I'm so glad we live in a world where all these stupid credentials don't matter anymore. Who cares, man? College? Just, like, show me what you built.
A
Yeah. Yeah.
C
Cool, guys. Well, thanks so much, man. I. I love Codex. And, yeah, I'm going to vibe code a bunch of stuff this weekend. It should be good.
A
Have fun.
B
Can't wait to see your feedback. Thank you so much, Peter, for having us.
C
All right, guys, thanks.
Behind the Craft Podcast: "How OpenAI's Codex Team Builds with Codex"
April 5, 2026
Guests: Alex (Product Lead, OpenAI Codex), Romain (Developer Experience Lead, OpenAI Codex)
Host: Peter Yang
Duration: 43 minutes
In this fast-paced, insight-packed episode, Peter Yang sits down with Alex and Romain from OpenAI's Codex team to explore how the team builds, ships, and leverages AI to redefine product creation. The conversation covers live demos, internal product philosophies, the future of product roles, community-driven development, and the inner workings of Codex, the leading coding agent.
On Codex’s Impact:
"The designers on the Codex team write more code now than was written by an engineer like six months ago. They're absolutely goated."
— Alex (07:55)
On Simplicity vs. Power:
"We are really careful about, like, what the core primitives of what we're building are... It's not just a vibe coded thing, we're really thoughtful."
— Alex (11:10)
On PM Role:
"I don't actually view PM as a good leadership position. I view it as a fill in the gaps position."
— Alex (38:30)
On Team Structure:
"We view ourselves as like intentionally a bit of a pirate ship like team...there's not too much alignment going on there."
— Alex (24:40)
On Hiring:
"People who do things is like literally the most important thing."
— Alex (39:28)
This episode offers an authentic, behind-the-scenes look at how the OpenAI Codex team challenges conventional software development with high agency, minimal process, and deep integration of AI tools. The guests' candid discussion sheds light on how forward-looking teams can work faster and smarter, with traditional roles melting away in favor of an empowered, multidisciplinary builder culture.