Transcript
A (0:00)
Welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel, and I'm joined by my co-host Swyx, founder of Smol AI.
B (0:15)
Hello.
C (0:15)
Hello.
B (0:16)
Calling in from Singapore here, but we are in the remote studio because the OpenAI team keeps shipping, and today they just livestreamed and released ChatGPT Codex. Welcome to Josh, who I think we've talked about. We met while you were at Airplane, right?
D (0:34)
Yeah.
C (0:35)
I've been building devtools for a bit now, and you're kind of the person I have to talk to when I'm building devtools.
B (0:42)
I mean, you have now seen me complain a lot when things happen. So I don't know if it's a good or bad thing.
D (0:51)
It's a gift, man. Feedback is a gift.
B (0:54)
Thank you, Alexander. We're new to each other, but you've been leading a lot of the Codex tech testing and demos and stuff.
D (1:01)
Yeah. Hey, I'm Alexander. I'm on the product team here.
B (1:04)
Awesome. So yeah, we're going to just assume that everyone's watched the livestream. You also released a blog post with a bunch of demo videos. It's very interesting. I noticed in the demo videos it was individual engineers sitting by themselves, very lonely, and then they're just talking to their AI friends coding with them. I don't know if that's the vibe you wanted to give off, but that's how it came across.
D (1:31)
Yeah, man, with those videos we were going for maximum authenticity, just engineers talking about how it helps them. Yeah, I'll take the feedback.
B (1:41)
But no, I mean, it's true. Sometimes on-call is a lonely job. Mobile engineer is a lonely job; there aren't that many of those. Yeah, totally. But anyway, what did you guys individually do? Maybe we can start there. How did you get pulled into the project? We'll start from there.
D (1:59)
Yeah, maybe I can go first, because then we have a fun story about how we started working together. So actually, before working at OpenAI, I was working on a native macOS app called Multi, which was kind of like a pair programming tool, but we thought of ourselves as working on human-to-human collaboration. And basically, as ChatGPT and stuff came around, we started thinking, oh, what if instead of a human pair programming with a human, it was a human pair programming with an AI? I'll skip the whole journey, but that was this whole journey, and we all ended up joining OpenAI. I was mostly working on desktop software, and then we shipped reasoning models. I'm sure you guys were ahead of the curve in terms of understanding the value of reasoning models, but for me, it starts off as better chat, but then when you give it tools, you can actually make it an agent. An agent is a reasoning model with tools, environment guardrails, and then maybe training on specific tasks. So anyways, we got super interested in that, and we were just starting to think about, okay, how do we bring reasoning models into desktop? And at the same time, here at OpenAI, there were a lot of experiments going on with giving these reasoning models access to terminals. I wasn't working on those first experiments, to be clear, but that was the first true "wow, I really feel the AGI" moment that I had. It was actually while I was talking to David K, a designer who was working on this thing called Scientist, and he showed me this demo of it updating itself. Nowadays I don't know if any one of us would be that impressed: it changed its own background color by modifying its own code. And they had hot reloading set up, so I was just mind-blown at the time. And it's still a super cool demo.
And so we were experimenting with a bunch of these, and I joined one of the teams that was tinkering with this. And we realized, hey, it's just super valuable to figure out how to give a reasoning model access to a terminal. And then you have to figure out how to make that a useful product and how to make it safe. You can't just let it go loose on your local file system, but that's where people were initially trying to use it. So a lot of those learnings ended up becoming the Codex CLI, which shipped recently. A lot of the work there, the thinking that I'm most proud of, is enabling things like full-auto mode, and when you do that, we actually increase the amount of sandboxing so that it's still safe for you. So we were working on these types of things, and then we started realizing we want to let the model think for longer, we want to have a bigger model, we want to let the model do more things safely without having to do any approvals. And so we thought maybe we should give the agent its own computer. At the same time, we were also experimenting with putting the CLI in our CI so it could automatically fix tests. We did this crazy hack to get it to automatically fix Linear tickets in our issue tracker. And so we ended up creating this project that is Codex, which is really the concept of giving the agent access to a computer. Actually, I realize I don't know if you were asking what I personally did, but anyways, I told the story. I hope that's okay.
