Podcast Summary: No Priors - "Inside Deep Research with Issa Fulford: Building the Future of AI Agents"
Date: April 24, 2025
Host: Sarah Guo (Conviction)
Guest: Issa Fulford (OpenAI)
Overview
This episode features a deep-dive conversation with Issa Fulford, an AI engineer at OpenAI and a key architect behind Deep Research, an agentic product built with reinforcement learning and tool integration that lets users complete complex, multi-step research tasks. The discussion unpacks the product's origin, core technologies, implications for agentic AI, safety challenges, use cases, and the future of AI agents.
Main Topics and Key Insights
1. Origin and Purpose of Deep Research
- Genesis of the Idea ([00:39]):
- Issa and her colleague Yash were excited by recent advances in reinforcement learning algorithms, seeing potential beyond math, science, and coding tasks.
- The initial focus was on agentic systems that could tackle practical user tasks like online research and software engineering.
- Early brainstorming revolved around real-world, knowledge work use cases.
- “We literally would write out just a list of things. Like, I hope the model could find this list of products for me and rank them by these reviews from Reddit or something like that.” (Issa Fulford, 01:20)
- Direction of the Product ([02:37]):
- Prioritized information synthesis (“read-only tasks”) over transactional “write-action” agents (e.g., ordering food), aligning with OpenAI’s broader AGI goal of enabling new scientific discoveries.
- Alignment with AGI Ambitions ([02:37]):
- “If you can't write a literature review, you're not going to be able to write a new scientific paper. So [it] felt very in line with the company['s] broader goals.” (Issa Fulford, 02:53)
2. Product Development: Datasets, Tools, and Iteration
- Iterative Demo and Prototyping ([04:00]):
- Early demos used only prompted models and a UI to showcase the vision, with no initial model training.
- Transitioned to building custom datasets, model training workflows, and browsing tools.
- Close collaboration across teams, notably with Edward Sun and the reinforcement learning (RL) team.
- Task Design and Evaluation ([04:59]):
- “One of them was to find all of the papers that Liam Fedus and Barret Zoph had written together... We would always ask that question and then another one... finding the middle name of one of our co-workers.” (Issa Fulford, 05:07)
- Strong early internal usage signaled product-market fit.
- Human-Generated and Synthetic Data ([05:59]):
- Leveraged human trainers to create new kinds of datasets, ensuring the model exercised a broad range of skills.
- Used new grading methods and developed tools like a text-based browser (with image/PDF support) and a Python tool for analysis and graphing.
- Anticipated ongoing expansion: “In future versions, we’ll just expand the toolset and...need to make datasets that actually make the model exercise all of those different tools…” (Issa Fulford, 06:24)
3. Technical Learnings and Advice for Startups
- When to Apply Reinforcement Fine-Tuning (RFT) ([07:23]):
- Use RFT when the required task is very different from the model’s prior training data or is “make or break” for a business workflow.
- Otherwise, leverage improvements in standard models, as they’re evolving rapidly.
- “If you have a very specific task… and it’s just really not good at it... that is a good time to try reinforcement fine-tuning.” (Issa Fulford, 07:52)
- Role of Human Expertise in Browsing Tasks ([08:59]):
- “Every single profession involves...finding information from many different sources to synthesize an answer... and the cool thing with RL is you don't necessarily need to know the whole process...” (Issa Fulford, 09:05)
- OpenAI’s broad dataset approach enables robust generalization.
- Emergent Planning Behaviors ([10:29]):
- Observed the model making plans upfront or creatively using search terms—sometimes even attempting to bypass constraints placed on it.
4. Challenges and Failure Modes in Agents
- Safety and Hallucinations ([11:18]):
- Comprehensive, longer responses lead to higher user trust but also increase the stakes of hallucinations.
- Deep Research hallucinates less than prior models; remaining errors mainly stem from incorrect inferences drawn from sources.
- Citations are crucial for transparency and user verification.
- The next challenge: safe agent actions integrated with users’ private data and systems.
- Guardrails and User Trust ([12:54]):
- Users will want varying degrees of agent autonomy: starting conservative (“confirm every write action”) and progressing toward more hands-off trust as capability grows.
5. Product Capabilities: Today and Tomorrow
- Current and Emerging Capabilities ([13:44]):
- Deep Research is already delivering on multi-modal research and synthesis tasks.
- “The ideal state would be to have a unified agent that can do all of these different things. Anything that you would delegate to a co-worker, it should be able to do.” (Issa Fulford, 13:44)
- Agent as a Co-worker Paradigm ([14:04]):
- As models improve, user task abstraction rises—from “write a function” to “write a whole PR.”
- Obvious next steps: access to private data (internal docs, GitHub) and taking direct actions (API calls).
- Internal OpenAI Collaboration ([15:20]):
- Frequent data and model sharing between teams accelerates progress.
6. User Experiences and Domains of Application
- How Experts Use Deep Research ([16:40]):
- The team was surprised and delighted by adoption in unexpected domains: medical research, code search, data analysis.
- “Seeing experts actually ratify Deep Research responses is useful.” (Issa Fulford, 16:54)
- Especially useful for coding: “use the latest package...help me write this file or something for data analysis... uploading a file... analysis... report with numerical analysis...” (Issa Fulford, 17:12)
- Why Deep Research Excels in Specific Queries ([22:28]):
- Best for well-defined, specific queries requiring up-to-date, extensive online research.
- “We trained it to have much longer outputs than I think the normal models would. So if you're looking for something very comprehensive...Deep Research is...useful for those things.” (Issa Fulford, 23:05)
- Fashion, Shopping, and Taste-Related Tasks ([23:21]):
- Particularly adept at constrained queries (e.g., finding brands or very specific products).
- More effective than general models for intricate or up-to-date demands.
7. Product Experience: Surprises and User Flow
- Key Development Surprise ([24:43]):
- The visceral moment when the model worked well on browsing tasks for the first time.
- “Even though we thought it would work, honestly, just that it worked so well was pretty surprising.” (Issa Fulford, 24:44)
- Performance and Speed ([25:34]):
- Not designed for instant response—prioritizes quality and thoroughness over speed.
- “Sometimes you don't want it to do really deep research, but you want it to do more than a search. I think that we will release things soon...and will fill that gap.” (Issa Fulford, 25:44)
- Future Scalability—Long-Running Tasks ([26:54]):
- Envisions agents capable of working for days on research comparable to a human thesis or project.
8. Looking Forward: Unified Agents and the Road to AGI
- Unified Agent Vision ([27:51]):
- Predicts agents capable of advanced multi-domain work (coding, planning trips, etc.)—with surprising improvements likely in the coming year.
- “A general agent that could do a lot of the… tasks that you would do in a lot of different areas... I hope that we'll get to a more unified experience.” (Issa Fulford, 27:54)
- UX and Collaboration Model ([28:51]):
- Ideal interface: like collaborating with a remote coworker via Slack.
- Supports user intervention and the ability to override the agent at any point in its work.
- Importance of Memory and Context ([18:34], [20:57]):
- Longer, more complex tasks magnify the need for agent memory and context management.
- Efficiently managing and accumulating context over days or weeks is a crucial (and unsolved) challenge.
- Final Reflections on Safety and Progress ([20:57]):
- “We would never ship anything that we don’t have very high confidence is safe... The stakes are way higher when [the agent] has access to your GitHub repositories and your passwords and your private data.” (Issa Fulford, 21:01)
Notable Quotes & Memorable Moments
- “[Training a model on browsing tasks and seeing it actually working]... was pretty incredible. Even though we thought it would work, honestly, just that it worked so well was pretty surprising.” (Issa Fulford, 24:43)
- “The level of abstraction of the human becomes higher… a year ago I was asking it to write a function for me and now I’m asking it to write a whole file and maybe next year it will make a whole PR for me...” (Issa Fulford, 14:04)
- “I kind of want [an agent] to be something that is just like… having a coworker on Slack... you can just ask to do things for you, send them a Slack message and then they'll start doing it and then you can review their work or help at some point.” (Issa Fulford, 28:51)
- “We trained it to have much longer outputs... So if you’re looking for something very comprehensive... Deep Research is... useful for those things.” (Issa Fulford, 23:05)
Key Timestamps
- 00:39 – Origin story and product ideation
- 02:37 – Product philosophy and alignment with AGI
- 04:00 – Prototyping and demo building
- 05:59 – Custom data and tool creation
- 07:23 – Advice on reinforcement fine-tuning
- 10:29 – Model behavior and unexpected planning
- 11:18 – Safety, hallucinations, and trust
- 13:44 – The future: unified agents and automation
- 16:40 – Real-world uses and community impact
- 22:28 – Comparative advantages for specific tasks
- 24:43 – Biggest surprises in development
- 26:54 – Vision for long-running research agents
- 27:51 – The case and future of unified AI agents
- 28:51 – User interface and ‘co-worker’ model
- 30:11 – Final reflections on the unified agent as an extension of human ability
Tone & Takeaways
The conversation is candid, insightful, and pragmatic, balancing technical confidence (“everybody does kind of see a pretty clear path to this broadly capable agent” [19:07]) with humility about the real challenges (“There’s a lot of really hard safety questions that we need to figure out” [20:57]). The session offers both a front-row seat to the development of AI agents and actionable guidance for practitioners and researchers.
