Transcript
Jakub Pachocki (0:00)
The big thing that we are targeting is producing an automated researcher. So automating the discovery of new ideas. The next set of evals and milestones that we're looking at will involve actual movement on things that are economically relevant.
Mark Chen (0:13)
I was talking to some high schoolers and they were saying, oh, you know, actually the default way to code is vibe coding. I do think, you know, the future hopefully will be vibe researching.
Podcast Host (Narrator) (0:22)
What does it take to build an automated researcher? And can AI discover new ideas on its own? OpenAI's Chief Scientist Jakub Pachocki and Chief Research Officer Mark Chen joined a16z general partners Anjney Midha and Sarah Wang to unpack GPT5's reasoning push, why evals must shift to economically meaningful benchmarks, and the march towards an automated researcher. We get into long-horizon agency, why RL keeps working, the new Codex for real-world coding, research culture versus product, and why, for now, compute is destiny. Let's get into it.
Anjney Midha (1:00)
Thanks for coming, Jakub and Mark. Jakub, you're the Chief Scientist at OpenAI. Mark, you are the Chief Research Officer at OpenAI, and you guys have both the privilege and the stress of running probably one of the most high-profile research teams in AI. And so we're just really stoked to talk with you about a whole bunch of things we've been curious about, including GPT5, which was one of the most exciting updates to come out of OpenAI in recent times. And then, stepping back, how you build a research team that can do not just GPT5, but Codex and ChatGPT and an API business, and can weave all of the many different bets you guys have across modalities, across product form factors, into one coherent research culture and story. And so to kick things off, why don't we start with GPT5. Just tell us a little bit about the GPT5 launch from your perspective. How did it go?
Mark Chen (1:51)
So I think GPT5 was really our attempt to bring reasoning into the mainstream. And prior to GPT5, right, we had two different series of models. You had the GPT kind of 2, 3, 4 series, which were kind of these instant-response models. And then we had an O series, which essentially thought for a very long time and then gave you the best answer that it could give. So, tactically, we don't want our users to be puzzled by "Which mode should I use?" And it involved a lot of research in kind of identifying what the right amount of thinking for any particular prompt looks like and taking that pain away from the user. So we think the future is more and more about reasoning, more and more about agents. And we think GPT5 is this step towards delivering reasoning and more agentic behavior by default.
