
Podcast Summary: AI & I – Inside OpenAI’s Agentic Browser, Atlas

Host: Dan Shipper
Guests: Ben (Head of Engineering, Atlas), Darren (Technical Staff, Atlas)
Date: February 11, 2026
Episode Theme:
A deep dive into the making of OpenAI’s agentic browser, Atlas. This episode explores the evolution of the web browser from a basic tool to an intelligent and proactive companion powered by AI, the design and technical choices behind Atlas, and the paradigm shift brought by agentic interfaces. The conversation also covers broader implications for the future of browsing, user experience, and coding with AI.

Main Topic Overview

Atlas is pioneering the idea of an “agentic browser,” letting users bring ChatGPT contextually wherever they go online. This episode discusses how Atlas changes the user’s interaction with the web—shifting from copying/pasting content into ChatGPT, to having AI proactively assist or even act directly for the user. The show uncovers both the technical challenges and the user experience dilemmas in bringing this vision to life.

Key Discussion Points & Insights

The Vision for Agentic Browsers

Moving Beyond Tabs:
- "The big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again." (B, 00:13 & 08:21)
- Atlas makes fiddling with complex web forms, settings pages, or rarely-used interfaces obsolete, by leveraging AI to perform actions for you.
Embedded Intelligence:
- With Atlas, ChatGPT is natively available throughout your web journey; users ask questions in context, brainstorm ideas, and get personalized recommendations in real time.
- “If you’re on a webpage and you’re scratching your head about something… ChatGPT is right there. You can ask it. It has the context.” (C, 07:01)

Concrete Use Cases & User Journeys

Complex Web Apps and Reducing Friction:
- The host and both guests share frustration with navigating interfaces like AWS, Google Forms, or Workday. Atlas allows users to delegate these tasks directly to the AI, reducing the "activation energy" for infrequent or convoluted tasks.
- “It’s that time of year to go into Workday and figure out how to get my year-end-of-date pay stub… and I don’t go there often enough.” (C, 09:58)
Progressive Disclosure of Power:
- Features like “Cursor Chat” let users get assistance in specific form fields, but balancing discoverability without overwhelming users is an ongoing challenge.
- “We struggle with how in your face to make this… it’s actually really powerful. The people who use it rave about this…” (C, 12:27)
Personalization and Web Memories:
- Atlas provides features for web memories, enabling users and the AI to reference past web interactions for richer, more contextual experiences.

Addressing the UX Challenge

Power vs. Simplicity:
- The design philosophy aims to keep Atlas familiar and streamlined, but provide on-demand AI assistance—letting users choose when and how to let the browser take initiative.
- “The browser UI is fairly streamlined and minimal so that you can focus on the thing that you’re looking at. But then ChatGPT is at the heart of the experience…” (A, 30:08)
Minimizing Annoyance:
- Drawing from their Chrome days, Ben and Darren emphasize not being a "super annoying tour guide" but instead offering an invisible helping hand.

Technical Architecture & Development

Chromium + Swift:
- Atlas is built atop Chromium’s web rendering, but all UI is rewritten in Swift, allowing faster iteration and more tailored experience. UI components (like tab handling) are reimagined for better extensibility and performance.
- “If you’re familiar with the blog post… We run Chrome completely out of process. So our app, the Atlas app, is a pure Swift app…” (C, 36:14)
AI-Assisted Coding Velocity:
- The guests estimate that the majority of Atlas’s net new code, possibly more than 75%, is generated or prototyped by AI (using Codex), radically reducing time to prototype and iterate.
- “Majority of it, I would say… It wouldn’t surprise me if it was north of 75%.” (A, 38:59)
- AI-augmented coding enables smaller teams to compete and experiment rapidly, especially when dealing with the heavy complexity of browser projects.
Unit Testing with AI:
- Codex not only writes code, but accelerates creation of comprehensive unit tests, leading to more robust software.
- "Thanks to Codex, actually we have a lot more unit tests, because... the overhead of creating a unit test is greatly reduced…” (C, 47:53)

The Future of the Web & Browsing

Evolving Expectations & The Human vs. Bot Divide:
- Atlas blurs the line between human and automated interaction, acting as a “user agent agent”—a personalized, intent-driven operator for the web.
- Discussion around standards and whether new signals or changes to user agent strings (or even HTML standards) are needed as more “agents” use the web.
- “At some level, stuff doesn’t need to evolve because we have computer use models that can just go off and read the screen… but the evolution here will come from making that more seamless…” (A, 15:55)
Will Agentic Browsers Replace the Web?
- Guests doubt a total replacement:
  - “I still think there’s a lot of stuff that people want to do themselves… there’s aspects of, you know, shopping or trip planning that I want to be deeply involved with.” (A, 25:05)
  - Instead, they see a blended world—delegation for repetitive or tedious tasks, and human exploration for serendipity or creativity.
Future Browsers as Tour Guides:
- Browsers may shift from being invisible “taxis” to proactive “tour guides”—but care must be taken to avoid crossing into intrusive territory.

Challenges with AI Agents Today

Agent “Laziness” on Complex Tasks:
- Users see situations where AI agents give up on long or complex tasks (e.g., copy editing in Google Docs), partly due to web app complexity and current AI workflow limits.
- “There’s been a known issue with our agent. We call it ‘laziness’… it will say things like: ‘Oh, this task is too time-consuming, I give up.’” (A, 50:18)
- Improving agent stamina and dexterity on complex, interactive UIs is an ongoing priority.

The Joy (and Craft) of Browsers

Why the Guests Are Passionate about Browsers:
- The web offers an egalitarian, global platform; building browsers means shaping a core tool for human advancement and creativity.
- “The web was amazing because it felt egalitarian… anyone, anywhere could get involved in it… I just love it. I wouldn’t work on anything else.” (A, 52:46)
- For both, the work remains unfinished—there’s always more to improve, more creative possibilities to unlock.

Notable Quotes & Memorable Moments

On bringing ChatGPT everywhere:
- “This browser puts that at the center of it. That’s what the URL bar will guide you towards for your queries.” – Darren (C, 06:29)
On complex settings panels:
- "If I had to like click through another fucking form or settings page, I would like blow my head off." – Dan (B, 03:10)
On future paradigms:
- "Do you think agentic browsers will make the web unnecessary?" – Dan (B, 24:30)
  "I don’t think so myself... I still think that there’s a lot of stuff that people want to do themselves.” – Ben (A, 25:05)
On developing with AI:
- “Our code is able to be created by Codex because there’s a lot of straightforward aspects to what we’re doing… but these tools can be tremendous companions…” – Darren (C, 40:48)
On coding craft and AI:
- “I like the crafting aspect… therapeutic about it… but I still feel like there are a lot of elements of that [even with AI].” – Darren (C, 44:46)
On the browser’s role:
- “I think the big thing most people struggle with in their day to day life is ambiguity… and that’s where ChatGPT is just incredibly amazing…” – Ben (A, 30:49)

Timestamps for Important Segments

Atlas’s Origin & Team Background (02:00–03:00)
Defining “Agentic Browsing” & Real-World Uses (04:22–09:58)
Progressive Disclosure & Discoverability (Cursor Chat Example) (12:17–14:17)
How Atlas Changes the Human-Bot Web Divide (15:01–22:17)
Browser as Tour Guide vs. Taxi Analogy (28:36–32:51)
Technical Architecture (Swift + Chromium) (33:34–38:12)
AI-Assisted Coding & Team Velocity (38:43–44:09)
Crafting Code vs. Letting AI Write It (44:22–47:53)
Agent Laziness on Complex Web Apps (50:13–52:06)
Team’s Personal Passion for Browsers (52:42–54:30)

Conclusion

This candid, technical, and philosophical conversation maps where browsers—and the web—are heading in an era of intelligent agents. Atlas, and agentic browsers like it, signal a shift: not a replacement of human agency, but a deep collaboration between user and AI, making digital life less tedious and more creative. The edge will be in design—making these powers accessible, helpful, and not overwhelming—while retaining trust, simplicity, and joy.

If you want to learn more or try Atlas, check out: every.to/chain-of-thought


Transcript

[00:00]
A
I think one of the things that has excited me about this world is it's not just the pace of development, because I think to get a feature to work right, it's always going to take a few iterations. It's how quickly you can decide that something is worth pursuing.
[00:14]
B
The big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again.
[00:21]
C
You're not alone.
[00:36]
B
If you work in large code bases, you know what it means to hold too much in your head. Which file imports what, what service depends on what database schema, and what will break if you change this one line. The bottleneck isn't writing code, it's holding the entire system in working memory long enough to make a decision. Augment Code is an AI coding assistant that offloads context. Its context engine understands your whole code base, including what's in your current file and the architectural shape of your entire system. It works with multi-language interactions, legacy code and the dependencies that aren't documented anywhere. The system is what makes it work, and Augment documented it in their AI-Powered Engineering at Scale Playbook. Inside it includes how to assess your current state, the four-phase framework for moving from individual experiments to team-wide deployment, ready-to-use checklists, and the specific workflows that produce 30% faster PR velocity and 40% shorter merge times at companies working on code where mistakes are expensive. This is designed for enterprise teams, teams working on high-stakes production systems. It's built for compliance, correctness and maintainability. It's built for the moments you're not just prototyping, but shipping code that millions of people depend on. And teams are seeing measurable results: 30% faster PR velocity and 40% shorter merge times. Download the AI-Powered Engineering at Scale Playbook at augmentcode.com/resources/aipoweredengineeringatscale. That's augmentcode.com/resources/aipoweredengineeringatscale. And now back to the episode. Ben and Darren, welcome to the show.
[02:00]
A
Hey, thank you. Great to be here.
[02:02]
C
Yeah, likewise. It's awesome.
[02:05]
B
So for people who don't know you, you are both building ChatGPT Atlas, which is an agentic browser. Ben, you are the head of engineering. Darren, you're a member of the technical staff. I believe you both worked on Chrome originally, is that true?
[02:22]
A
That's right.
[02:23]
C
That's right. We've worked on a number of browsers together and for a long while.
[02:29]
B
Oh, that's really cool. So I didn't realize that this is like a... it's an evolving partnership through many different products and companies. That's really interesting.
[02:38]
A
We worked together first at Netscape, then on Firefox together for a few years, and then with Chrome and now Atlas, which is super exciting.
[02:48]
B
Absolute OGs. Okay, this is really cool. So I'm using, I'm a daily Atlas user and I switched from Dia, which I know Darren used to work at The Browser Company. I'm good friends with Josh and Hursh, so if they're listening, maybe there's a way you can get me back. But Atlas is pretty good. What's really interesting to me about using Atlas and using just really agentic browsers is for the first couple days I was like, I have no idea what to do with this. Like, I know it has this power, but I don't, I, I can't think of a, a time when I want, I might want to use it. And now I'm just like, every single day there's like 50 different things that if I had to like click through another fucking form or settings page, I would like blow my head off.
[03:33]
C
But isn't that kind of the journey that people have with AI tools in general like ChatGPT or these coding tools? You kind of don't really understand the power until you get into it.
[03:44]
B
I think that is true. I didn't quite have that experience. Like the first time I just saw it, like, GPT-3 writing stuff, I was like, whoa, this is crazy. But yeah, I guess that is true. Well, I guess I'm curious from both of your perspectives, if someone is listening and they're like, I know that agentic browsers are a thing and maybe I've tried it, but I actually don't even know why I would use this or what it's useful for. What is the sort of vision for agentic browsers? And let's be, let's try to be more specific than like, yeah, it just does everything for you, you know, like, what is the, like, what are the real day-to-day things that agentic browsers change about how you might use the web?
[04:23]
A
Yeah. So I think that, you know, maybe the future will get to a place where like more and more of your workload can be, can be automated. And I think we're making progress in that direction. But, but today we wanted to design Atlas with this idea that you could bring ChatGPT with you wherever you go on the web. And so, yeah, I mean, I think the thing that you note of like, what do I do with this? This is something that we hear A lot from people, but then also we hear some aha moments as they go on the same journey that you have and begin to figure out some use cases for it. This is something that we actually want to take some of that learning that we have from how people are using it and help offer more proactive advice to people like in product to help them figure out how to optimize use of the tool. But I think today, like one of the things that I notice when I use Atlas versus when I go back and use a sort of pre AI browsing environment, I find myself just able to ask just a lot more questions and just be more knowledgeable about a topic. If I'm doing online shopping, I can feel confident that I'm getting the best deal or I have the right coupon code, or I have all that sort of stuff. If I'm like researching a topic that's of interest to me, I can sort of brainstorm different viewpoints on it. I can just sort of have this sort of thing, friend or advisor that sort of comes with me and I can just like have this conversation with it. And that, that's just made the web a lot richer and more dynamic.
[05:48]
B
Can you make that more concrete for me? Because I think some of those things, someone might be listening and being like, well, yeah, I could do that with ChatGPT now. That's what ChatGPT does for me. So what does it, what does it mean to have that in the context of your browser?
[06:00]
A
It just means that you don't need to go, you know, I think if, you know, for anyone that's had ChatGPT in a tab, you probably have the experience of going and taking some content from another tab and pasting it in and ask a question about it, perhaps. Whereas when you have a browser that's built with this at the core of it, you know, that context is provided directly to the model. So you kind of don't need to keep repeating yourself. It will just. ChatGPT will just see what you're looking at and be able to offer, you know, you know, its thoughts on that.
[06:29]
C
I think that's really the big... the big unlock. The power of this whole thing is, like, I think as people use ChatGPT for more things in their life, they realize that maybe they should start more of their queries with ChatGPT, right? You start to learn that for yourself. At a certain point you're like, why am I doing things the old way? That was very manual. But instead I should ask this AI model, it will help me save some steps. And this browser puts that at the center of it. That's what the URL bar will guide you towards for your queries.
[07:01]
A
Right.
[07:02]
C
It helps you get into ChatGPT with a lot lower friction. And as Ben was saying, you know, if you're on a webpage and you're scratching your head about something, ask. ChatGPT is right there. You can ask it. It has the context. You don't have to copy, paste and say, can you now answer this question. So it's just a lot more streamlined. That's kind of the core value proposition of this whole thing. And on top of that, we build, you know, features that people can opt into around web memories. So if the agent or the model is there and on your journey, you can also query it later about things that it knows. And that can be very powerful to you as you're trying to get back to things. You're trying to make sense of just all the things in your world. And, you know, whatever kind of journey you're on, whatever research project you're on, whatever work you're trying to do, having it there sort of passively, it can be very powerful too.
[07:54]
B
I got to tell you. Like, and hopefully maybe this can be like a little bit of a user research session too, because, like, I feel like I'm, I'm doing something with this that I'm very excited about. And I'm curious if you guys are doing it, if you're seeing other people doing it, how you're building for this. So the big unlock that I had with Atlas is I realized I never need to look at a settings panel ever again.
[08:20]
C
You're not alone.
[08:21]
B
Yeah. And that is such a refreshing feeling. I think it's both refreshing for users and for software developers. Um, I think it's refreshing for software developers because you don't have to worry about adding another knob, because someone, like the agent's going to do that. So you can make software more customizable more easily. But for users, like, I think the canonical example for me is looking at the AWS dashboard. I don't know if you have, like, I assume you guys have both logged into that and it's like 50 different services and then, like, the settings, the, like, the permissioning system is like, it's like launching a nuclear, like, missile in order to, like, do anything. And I run a company and we have like 20 people. And so I'm, I'm sort of constantly being asked, hey, can we like add a seat to this or like can you change the permission on this thing? And it's like some account that we set up five years ago that I don't even remember.
[09:15]
C
You don't do these things so frequently. And so, yeah, it's like not top of mind how to do it again.
[09:20]
A
Yeah. My example of this that I've been using was I used it to help me create Google forms to do user research. And you know, Google Form Builder I think is maybe less complicated than the AWS control panel, but still it's not something I use every day. And so I think for me to be able to ask the agent to go off and do that and have it do that in a few minutes and come back and I can just submit, you know, certainly allowed me to get to the meat of the problem much quicker.
[09:50]
B
Yeah, it's, it's one of those tasks where like there's a certain amount of activation energy and you don't have to spend the activation energy anymore. Darren, what were you gonna say?
[09:59]
C
I was just saying it's that time of year to go into Workday and figure out how to get my year end of date pay stub so I can share that with my tax advisor. And I'm like, where do I go again? You know, they moved it again, you know, and I don't go there often enough. So I think it's super powerful for navigating, like, web apps, especially complex ones, like you said with AWS. And that's one of the definite superpowers of these things.
[10:30]
B
So how are you seeing that evolve with your user base? Like what percentage, if you can share, what percentage of people have actually figured that out? Because I, it's super powerful, but I also imagine it's not necessarily a daily use case. It's like a couple times a week. It is a lifesaver. But other than that, I may not use it for this. Like I'm, I'm only going into settings like a couple times a week. So I'm curious, is that one of the, one of the use cases you can hang your hat on and are, are people really discovering it or is it still sort of nascent?
[11:02]
A
I don't know if we have the exact stats on, on the sort of agent like browser drives kind of thing, but we do know that just in general people interacting with that side chat is a main use case for the browser and I think probably most people are using that on a regular basis. Just because it is the main value add sort of the main surface. In terms of what tools or capabilities people use from that, I don't think we've got that broken down quite the same way.
[11:35]
C
What you see is, what you'd imagine is that people are, you know, when you first come to these tools, you don't know all the things that it can do. And that's definitely a topic for us. How do we introduce people to things but not also overwhelm them at the same time? You know, you want to have, you want to balance something that's familiar, simple, seems approachable, but also it's powerful under the hood. So you get rewarded as you discover further. You know, I think that's, that's kind of the nature of UX development, right? You, you can have a very powerful, complex tool, a browser really is, but you want it to also be approachable and easy. And you got to think about like, what are the patterns people do and how can we meet them in those moments, right?
[12:18]
B
What are some of the decisions that you've made to, like, do that, like, to enable this sort of progressive disclosure of complexity, so that Atlas is really intuitive?
[12:27]
C
But yeah, one of the features that is pretty powerful but, or relatively, or I should say we struggled with how to expose it, is this feature called Cursor Chat. If you are interacting with a form field in the browser, you'll see a little icon, a little ChatGPT icon, and you can hover over and then interact with the model in the context of that specific form field. We struggle with how in-your-face to make this, right? We want people to be aware of this power. It's actually really powerful. The people who use it... there are people who rave about this, helping them compose and that sort of thing. But actually a lot of people don't discover it, even though we have this little hint. And so it's always a question, how big do you make that hint? How do you introduce this to people? Certainly during onboarding, we already have a lot of things we try to tell people about, because this is an AI browser. There's new things to learn, fundamental things like web memories and capabilities like side chat and so on. But we can only tell you about so many things at once. So that's been a challenge for us from a design perspective, for sure.
[13:34]
B
That makes sense. I've seen that little icon. I have not clicked it, so now I feel like I need to click it.
[13:40]
A
It's one of those things where it's like, it's another advantage of having this sort of fully integrated with your browsing environment, as opposed to just having ChatGPT in a tab, is that you can kind of summon it into the specific text field. So this is a feature that my wife uses quite often. She has to write emails, she's involved in a number of different things and it just helps like speed up her workflow quite a lot, having it there. And the thing is, it's not just... it is like your ChatGPT there. So it has your personalization, your custom instructions, all that kind of stuff behind it. So it writes the way you want it to write and all that.
[14:18]
C
So it's pretty cool. Broadly speaking, we're really interested in the whole idea of, like, how the model can interact with the web and the ways that we interact, and how can it dovetail with what you're already doing. So, you know, the agent that you invoke with /agent inside chat is like a very all-in sort of manifestation of that, where you're asking it to take a task and go interact directly with the page and push the buttons and do everything for you. And that's sort of like maybe the grandest representation of this kind of idea. But there's all these sort of smaller in-the-moment kind of versions of that, you know, like we said with Cursor Chat, or just the fact that side chat has the context and when you ask a question, it can understand what you're doing.
[15:01]
B
Yeah, that's... I think that's one of the most valuable parts of it, is because it's in my browser, it's logged into all my websites and it can act as me on any number of websites. And so even though it's not me, it's like it has all the same affordances and all that kind of stuff. And I'm curious about your opinion on how the web will evolve for that, because right now it's really designed for this bifurcation between bots and humans. And there's a human experience and there's a bot experience where you're presumed to be crawling, and there's robots.txt and all that kind of stuff. And this is sort of this in-between thing where it's personal and it's driven by you, but it is not you. And yeah, how do you think the web should evolve for that kind of thing?
[15:56]
A
Yeah, so I mean, this is a super interesting one and I do think over time there'll have to be some notion of maybe a non-human operator that is acting nonetheless on behalf of a human for a specific request. Because I see these things as quite different. For example, web crawlers. Web crawlers are out there traversing websites and synthesizing across that for the benefit of many. Whereas you could do the same thing, admittedly much more painfully, if you were to write a local shell script that would go off and obtain the content of a website, maybe issue the direct HTTP request to the resources that you wanted and so on. And this is much closer to that, where there is like your own personalized intent behind it. So I think just from how we think about these things conceptually, that's how I look at it in terms of how things evolve. I think one of the most interesting things about it is at some level, stuff doesn't need to evolve because we have computer use models that can just go off and read the screen and click and do all that sort of thing. I think a lot of the evolution here will come from are there ways to make that more seamless? Are there ways to make that higher performance so that we can do many things at once, just basically support sort of scaling this up? Because I think what we really want to do is have something that can do many things on your behalf simultaneously over the course of time. And that will just require a lot more interesting evolution of the platform. And I think there are probably a variety of different ways to do that. One of the wonderful things about the web is that it's a very declarative medium. This is something that we've begun to tap into. But I don't think we've fully realized the potential of that interesting property of the web yet.
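The crawler-versus-personal-request distinction drawn here can be illustrated with the one signal sites widely use today, robots.txt, which keys its rules on a user agent name. The minimal Python sketch below uses the standard library's `urllib.robotparser`; the rules, the crawler name `ExampleCrawler`, and the client name `SomeBrowser` are made up for illustration and are not taken from Atlas or any real site. It shows that the file can say "this named crawler is not welcome" but has no vocabulary for "a non-human operator acting on behalf of one person."

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

article = "https://example.com/articles/1"
private = "https://example.com/private/notes"

# Rules are matched purely on the user agent name the client reports.
crawler_can_read_article = parser.can_fetch("ExampleCrawler", article)
browser_can_read_article = parser.can_fetch("SomeBrowser", article)
browser_can_read_private = parser.can_fetch("SomeBrowser", private)

print(crawler_can_read_article, browser_can_read_article, browser_can_read_private)
```

Nothing in this format distinguishes a bulk crawler from a single user-directed request, which is the gap the "non-human operator" idea would have to fill.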
[17:45]
B
Can you explain for people who are listening what declarative means and then why that is an interesting and important property?
[17:52]
A
Yeah. So the web, powering the web is this technology called HTML, HyperText Markup Language. And it's the way that all of the web pages are built. All of this UI that you interact with on the web today is a combination of just text formatted in this specific manner. There are these things called tags. So a button might be a button tag that encloses the text that is rendered on the button. And so what the browser does is it reads all of this and it knows that if it sees a tag that says button or input or something like that, that there is a specific meaning to it. Then what's interesting, for example, with forms, is that a form is the way that you do effectively, like, a call to a remote function with some data that the user provides. So when I fill out a form, for example, to run a search, I take values that I... there's text that I type into this field and then I call some remote function with that text and then I get another page. And so this is sort of inherent to the way the web is designed. And, you know, the browser itself is referred to as a user agent specifically for this reason, in that the browser is designed to go and read all of these tags and figure out how to present it to the user in a way that is satisfactory to them.
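To make the "form as a remote function call" idea concrete, here is a minimal Python sketch using only the standard library. The page markup, the field names `q` and `lang`, and the `example.com` URL are invented for illustration; this is not how Atlas or any particular browser is implemented. It reads the tags declaratively, much as a user agent would, and derives the request the form describes.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical search page: the form declares where the data goes (action),
# how it is sent (method), and which named values it carries (inputs).
PAGE = """
<form action="/search" method="get">
  <input type="text" name="q">
  <input type="hidden" name="lang" value="en">
  <button type="submit">Search</button>
</form>
"""

class FormReader(HTMLParser):
    """Collects the first form's action, method, named inputs, and button text."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.fields = {}
        self.button_labels = []
        self._in_button = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
            self.method = attrs.get("method", "get").lower()
        elif tag == "input" and "name" in attrs:
            # Default value declared in the markup, or empty if none.
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "button":
            self._in_button = True

    def handle_endtag(self, tag):
        if tag == "button":
            self._in_button = False

    def handle_data(self, data):
        if self._in_button and data.strip():
            self.button_labels.append(data.strip())

def build_get_url(base_url, reader, user_values):
    """Merge user-supplied values over the declared defaults and build the URL."""
    values = {**reader.fields, **user_values}
    return urljoin(base_url, reader.action) + "?" + urlencode(values)

reader = FormReader()
reader.feed(PAGE)
url = build_get_url("https://example.com/", reader, {"q": "agentic browsers"})
print(reader.button_labels, url)
```

Because the markup states what the form means rather than how to draw it, the same description can drive a visual browser, a screen reader, or an automated agent.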
[19:08]
B
And so Atlas is sort of a user agent agent.
[19:11]
A
That's right.
[19:13]
B
Do you think we need different user agent strings for. Is that like a. Is that one potential solution or extension of the HTML standard?
[19:23]
A
I'm not sure. I think just like looking at the way the web works, there's a lot of... just thinking back to various browsers that we've worked on, there's a lot of subtlety to user agent strings. And also the situation I don't want to get into is where websites don't work because we've changed something about it. Sometimes there are sites that will check for very specific parts of that string and they'll trigger behavior based off of that. I know early in the Chrome days, for example, we would see behavior like that where it would cause sites not to render properly. From that sense, with Atlas being predominantly Chromium, we feel like, just from a developer perspective, they should perceive it, they should build for it the same way that they build for any Chromium-based browser. But there's probably other signals or stuff like that that we will need to come up with over the course of time. It's just, it's very early for us to figure out what that looks like.
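The fragility described here is easy to reproduce. The Python sketch below contrasts a brittle server-side check with a more tolerant one. The first string follows the general shape of a desktop Chrome user agent; the appended `ExampleAgent/1.0` token is purely hypothetical and does not represent what Atlas or any other browser actually sends. Both check functions are illustrative, not taken from any real site.

```python
import re

# A string in the general shape of a desktop Chrome user agent.
CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)

# Hypothetical: the same string with a made-up product token appended.
MODIFIED_UA = CHROME_UA + " ExampleAgent/1.0"

def brittle_is_supported(ua: str) -> bool:
    """Assumes the string ends exactly the way Chrome's does."""
    return re.search(r"Chrome/\d+(\.\d+)* Safari/537\.36$", ua) is not None

def tolerant_is_supported(ua: str) -> bool:
    """Looks only for a Chrome major version, wherever it appears."""
    match = re.search(r"Chrome/(\d+)", ua)
    return match is not None and int(match.group(1)) >= 100

results = {
    "brittle_chrome": brittle_is_supported(CHROME_UA),
    "brittle_modified": brittle_is_supported(MODIFIED_UA),
    "tolerant_chrome": tolerant_is_supported(CHROME_UA),
    "tolerant_modified": tolerant_is_supported(MODIFIED_UA),
}
print(results)
```

A site using the brittle check would treat the modified string as an unsupported browser even though the rendering engine is unchanged, which is one reason a new browser might prefer to leave the string alone and look for other signals.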
[20:21]
C
But your original question about how the web might change is a really interesting one. I think as more and more of the user agents are perhaps driven by agents or models, might that not end up having some bearing or impact on how developers create their content? I think at some point, maybe there is an inflection point there. It'll be interesting to see how the ecosystem evolves, right? You know, people create content for human consumption. In the past we've always, we've had moments when we were pushing heavily on the semantic web: make a web that's more understandable. Look at all the benefits that come from that. Screen readers will work better. Websites will be more machine understandable. What's happened now with these AI models is they're able to make sense of the websites that aren't ordinarily very machine understandable. Because these models are interacting with it in the way humans do, they're able to glean the information just as humans do. And that's kind of a big unlock for the computer to help you because it can understand these websites, right? But as that unlocks more and more models driving these systems, these websites, maybe, who knows, maybe the websites start changing as well. I mean, it reminds me of discussions about what happens when all the code is being created by coding agents and the coding agents are directing the coding agents and where does everything go and what programming language ought they use and all these kinds of things. You start to wonder, maybe there's some sci-fi stuff there to kind of dream and imagine how things might evolve. I would be lying if I told you I know, but I can imagine things changing totally.
[22:17]
B
That's the kind of, the interesting thing is like right now, browser use is a really good way to bootstrap this because you don't have to change anything for it. But once, like once you've bootstrapped and everyone's using agents, I'm sort of curious if that is actually the most efficient way. For example, Ben, you were talking about, you know, having it do multiple things for you at once. You know, watching Atlas scroll through websites is kind of slow and there may be a more agent-native way to allow agents to interact with websites, like MCP, for example. Are you guys thinking along those lines or are you still really just focused on the core stuff?
[22:59]
A
We're thinking through a whole host of different technologies to help us drive web browsing. I think as well, beyond the Atlas team, just to think broadly about what ChatGPT is doing, we've also launched this app ecosystem around the product. And I think that's a very direct way in which we're encouraging developers to build for a more dynamically composed world. But that's of course in the browser, but it's, you know, maybe not part of Atlas in particular. So I think with some of these things, we're going to try a few things and see how it works out.
[23:40]
C
Plus, you know, a lot of the technology that's powering the ChatGPT Atlas agent now has its roots in the original Operator tech preview that OpenAI put out. If you rewind the clock back to then and compare the performance then to now, you start to see the rate of improvement. There have been leaps-and-bounds improvements in the quality and the performance. And we're kind of on that curve of figuring out how to optimize and make these things work a lot better. I think there's a lot of exciting work ahead and opportunity ahead. This was a meaningful step to share with people, and I think it opens the door to imagination and possibilities, and for people to have some real things that it can help you with, like what you were talking about earlier. But there's so much more to come.
[24:31]
B
You know, do you think agentic browsers will make the web unnecessary? And by that I mean, do you think there's a chance there's a future state where it actually becomes just better to stay inside of ChatGPT, and your agent is going off and doing all the browsing, and then maybe it's building a custom website for you in real time based on what the brand or the writer wants you to see, but you're not actually seeing rendered HTML in the same way that you would have been, you know, five years ago?
[25:06]
A
I don't think so myself. And maybe this is just me not being imaginative enough yet about where ChatGPT will go. But I do think we will see people delegate a lot more to these tools, especially as they grow more powerful, and they're going to get amazingly powerful over the next 12 months. But I still think that there's a lot of stuff that people want to do themselves. Whether it's even just things like entertainment, or aspects of shopping or trip planning, those are things I do want to be deeply involved with. And it's probably going to start, at least, with some curiosity that I have, and I'm going to go out there on the web and find it. I think one of the most exciting things about the web is it has so much stuff on it. And so I'm always excited to explore it, and I don't think that will ever go away. Maybe it will be different. Maybe there will be folks, you know, maybe they're like the kids today that haven't lived in a world without some of this stuff. They may have a different view on it, but that's just mine. I don't know about you, Darren?
[26:11]
C
No, I think people like window shopping. I think people like browsing. I think people like that sort of thing. Or, you know, I love taking Waymo, but I also love driving my stick shift car. And there are going to be moments when both are important. There are moments when I want the Waymo and moments when I want to be just driving myself. I think that's kind of the future. It's going to always be that way. And also it depends on what you're trying to do. I think that these models can be just incredible at synthesizing things for you, and that might lead you onto the manual-mode part of it, right? And you're probably going to just incorporate these things in a very natural way in your life. You're going to go between them where it makes sense to you, and people are going to figure that out. But there's always going to be a need to interact with web apps, if you will, or applications. And the web is a tremendous medium to distribute those things. E-commerce: the web is an amazing medium for that.
[27:17]
A
Yes.
[27:17]
C
You could ask your model to please prepare you a shopping cart of items, but you're going to want to go look at it and you're going to want to go see things yourself. You're not just going to be like, yeah, buy that for me without seeing it in most cases. And so I think there's kind of this blended world that we're probably.
[27:37]
A
There's an aspect of the AI as, you know, a workmate or a coworker or something that you can delegate to. And then there's an aspect of the AI as a thought partner or a collaborator in that sense. And I think that these worlds, you know, it's neither one nor the other, it's kind of both.
[28:01]
C
Yeah, yeah, definitely. As a thought partner, this is already the case for these models. You know, when you're researching something at home, you're asking the chatbot about it. It saves you some time, even just as an exploration, bouncing ideas off it. When I'm coding, I'm doing it that way. For so many things, I'm bringing ChatGPT into my life to help me sort through my thoughts and whatever kind of problem I'm working on. I think that's sort of what I imagine. I can imagine lots of parallels to that in the future.
[28:36]
B
So here's the thing I'm curious about. When Josh and Hursh were first starting The Browser Company, I'd been friends with them for a long time, and so we actually talked to them for a little while about being the CEO. And so I spent a long time thinking about browsers for this specific thing. One of the things that I was kind of interested in is the role of a browser in someone's life. It seemed to me at that point that mostly a browser was sort of like a taxi. It takes you from one place to another and it's supposed to get out of your way. It's very utilitarian. And we might be moving to a place where maybe it's more like a tour guide. It helps you figure out where you want to go and what you want to do, and then does some of it for you. But there's this interesting tension there, where inserting yourself between the user and what they want to do is sometimes super frustrating, and your tour guide is super annoying. And people think of browsers, I think, in a lot of ways, as an invisible window pane. You don't even realize the browser is there most of the time. That's the point. How do you guys think about those? Do you think that dichotomy is useful or interesting? And how do you think about the trade-off of fulfilling the expectation that browsers are more or less invisible versus helping the user get more of what they want, even if they didn't necessarily know that they wanted that thing?
[30:08]
A
Well, there's a sort of duality present here in Atlas. And I say this not as a punt, maybe, but just to observe that we have tried to make our browser UI fairly streamlined and minimal, so that you can focus on the thing that you're looking at. But then ChatGPT is sort of at the heart of the experience, so it is there, and you can choose how much you want to engage with it. I think the big thing that most people struggle with in their day-to-day life is ambiguity. Sometimes it's like, what do I do next in this situation to achieve whatever objective I have? And that's where ChatGPT is just incredibly amazing at helping. That was sort of the original idea that I had for this: I would just ask ChatGPT in my existing browser tab, what should I do to solve this problem? And then, like a friend, it would step through: you should do these three things. And then my question was, well, could you just do some of those for me? And, you know, there are still a lot of things it can't do today, but we can make it do more of those things.
[31:15]
C
Yeah, we get reports from users saying, hey, I asked Atlas, asked ChatGPT through Atlas, to do this thing for me and it didn't work. We're like, great, let us know. We will keep note of that and work on those things. And so it is that kind of thing where you start to feel like, I should be able to ask it anything. I should be able to ask it to help me with anything. And so, you know, that's a nice North Star.
[31:41]
A
I think one of the things about this form factor, though, is that it's very familiar to people. I think most people can kind of relate to a browser. They kind of know how to use it, that kind of thing. And so I think it's not a huge leap. I think if you go to a world where everything is intermediated to you by some other thing, it's kind of hard to know what you can do with that. Whereas with the browser, you kind of know how to just start browsing the web and doing stuff with it. And then the opportunity presents itself at various points along the way. At your own choice, even with agent mode, or especially with agent mode, you choose when and how you want to use it, and then it's really on your terms. And of course, I think probably over the course of time we'll find people want to use it more and more, and so you want to help show them where that's going to work well. But yeah, our goal definitely is not to be annoying. I remember the original mantra with Chrome was trying to really minimize the chrome, as it were, and focus on the content. And I think we want to continue to have that be the case. But in this case, the content is whatever the user is trying to get done.
[32:52]
C
Yeah, it's got to be a good browser first and foremost. Right. It's got to actually work the way people expect it to work. And that alone keeps us busy. And, you know, there's a lot of aspects to just that alone.
[33:05]
B
I can imagine.
[33:06]
C
Yeah. And then how do you sort of add on to that? Right.
[33:13]
B
What are the things that I might not realize about why that's hard because I'm sitting here using your product all the time being like, yeah, browsers are basically solved except for this AI stuff.
[33:25]
C
I guess that's true at some level.
[33:29]
B
But, like, what makes it hard? If it's keeping you busy, what are the sorts of things that are keeping you busy?
[33:34]
C
I mean, if you think about it, browsers have definitely evolved over the years, right? If you rewind back to Netscape, and then think about Firefox, then think about Chrome, and think about when Chrome first launched, and then think about all the features that have been added since. Not everybody uses all of those features, but some people use them, and we hear from those people. Atlas has a significant subset of those features from the get-go, because we knew they were important. And building on top of Chromium meant that some of them we were able to expose. But many things we had to reimagine, rebuild, figure out how to build in a new way. And, you know, some things we have not yet done. For example, one of the things we heard about early on when we launched Atlas was, "Where's my tab groups?" Right? That's a feature that Chrome added a few years back, but it certainly wasn't there in the initial version of Chrome. And I know that when we first launched it in Chrome, not that many people were excited about it or used it. It was sort of a small feature, until eventually it became something that a good number of people actually do care about. And we hear from those people, because they want to carry their workflows over.
[34:55]
A
You know, one way to think about a browser is that it's like an embedded operating system in that sense. You might think of a browser as an app, but I think that's maybe not the right way to look at it. A browser is closer in complexity to an operating system. It has an app runtime, it has a window manager, it has various notification surfaces and launchers and other stuff.
[35:21]
C
And.
[35:21]
A
And so there's just a lot of complexity in building all of that stuff out. Now, you can short-circuit a bunch of that. I think Darren says maybe it's a solved problem. I think for a lot of browsers it is, including Atlas. Part of it is solved because of Chromium, the fact that Chromium is open source. It presents this just amazing, incredible baseline upon which to build, and you could stand up a browser very quickly that looks more or less like Chrome. I think our product ambition ran a bit deeper than that. We wanted to differentiate a bit more in our product UX, and so that caused us to take a different path, which we've written about. But that does mean that there's a bit more legwork for us to go and make sure all of this functionality that people expect works in the way that they expect. We think that at the end of the day, that will give us a lot more ability to shape the product in new and interesting ways.
[36:15]
C
Yeah, there are various for-instances. If you're familiar with the blog post that Ben was referring to, we run Chromium completely out of process. And so our app, the Atlas app, is a pure Swift app that presents all of the familiar browser UI through UI elements that we had to craft. Again, we're not just using the implementation from Chromium for any of the UI components. What we leverage from Chromium is the fact that it's great at rendering web pages, and all of the accessory support associated with that. When it comes to various kinds of permission dialogs and whatnot, we hook into that and we present those dialogs, but in our own UI. And so there are just a lot of very table-stakes kinds of components there that, because of our choice to build the app wholesale in Swift, or all the UI components, I should say, we had to rebuild. We had to rebuild a lot of different things, and of course we had a prioritization there.
[37:23]
A
Although the thing that's advantageous about this approach, and this is actually a sort of fun fact about Chromium, is that much of the UI is built using C++ as a programming language, which is the thing that you did when you were building a Windows app back in the 2006 era. But it turns out to be hard to find engineers in this day and age that want to do UI development in C++. Why is that? I have no idea. Speaking as a longtime C++ developer, I'm very concerned about.
[37:54]
C
I love C++, what's the problem?
[37:57]
A
But yeah, there are a lot of iOS developers out there, it turns out, and iOS developers often know Swift and SwiftUI. And if you know Swift and SwiftUI, you can also be a Mac developer. So we take advantage of that, and it's worked really well. Like, we've been very successful at building a team of.
[38:12]
C
And Swift's actually a remarkable language. Very much like a modern alternative to C++. You know, there's no garbage collector, so it's got a very streamlined sort of memory management setup. Kind of like if you were just being really straightforward about using smart pointers in C++ and that sort of thing. So at any rate, I feel like this has worked out very well for us, and we're leveraging this to also bring the product to Windows.
[38:44]
B
What, what percentage of your code is written by AI?
[38:48]
C
Oh man, I don't even have stats on that. But I know everybody's leveraging Codex and ChatGPT heavily as part of this project.
[38:56]
B
I would say just like finger in the wind if you had to guess
[39:00]
A
Majority of it, I would say. I can't pick the precise amount. It wouldn't surprise me if it was north of 75%. Just that most people's PRs start with Codex. Maybe there's some dialing in that you do through the process, but that just means that in terms of raw volume, Codex has probably authored well over half, safely more than half, of the, like, net new code that we have at this point.
[39:27]
B
You guys have been building browsers for many, many years. You started at Netscape, you worked together on Chrome. How does it compare, being able to build a browser with Codex at your side, in terms of team size, velocity, all that kind of stuff? Give me a sense for what's different, or maybe it's very similar. But yeah, how does it compare?
[39:51]
A
Yeah, I was going to say we have a very small team, although we continue to grow to take on a bunch more possibilities. I think one of the things that has excited me about this world, it's not just the pace of development because I think to get a feature to work right, it's always going to take a few iterations. It's how quickly you can decide that something is worth pursuing. And so there will be an idea that I'll have in my head even as like a team manager, where I want to see if the juice is worth the squeeze, as it were.
[40:22]
C
But.
[40:23]
A
And I will just run off and do that in Codex, and I'll have a build, and I'll see if I like the thing or not. And if I do like it, then it makes sense to go and invest in that area. Sometimes we spent a long time, in the pre-Codex world, just sort of wondering about whether you should do this or that, because it takes so long even to prototype. Whereas Codex just makes prototyping a matter of minutes or hours for a lot of things.
[40:48]
C
And you know, for as long as we've spent in the Chromium codebase across our careers, man, that thing's complicated, and it's grown. And so being able to ask Codex questions about Chromium is just invaluable. Any kind of very large legacy codebase is going to have so much complexity and so many layers to it. And so the ability to ask these agents questions about it is just unbelievably useful. But the same thing goes for figuring out how to build certain kinds of UI effects, constantly probing ChatGPT for what's the right way to set this thing up so I'll get a good animation or something like that. Just trying to learn some new strategies with Core Animation or something like this. So, like Ben said, a lot of our code is able to be created by Codex, because there are a lot of straightforward aspects to what we're doing. But there are also very delicate aspects to what we're doing, where we have to get in there and really study it. And these tools can be tremendous companions as we're trying to figure out, well, exactly what's the right strategy here, to kind of explore the solution space. I just can't believe how useful it is. It's been such an accelerant for this project, for sure.
[42:09]
B
On the topic of being able to prototype things more quickly, is there anything, like, weird or crazy that you have in your head that you've been wanting to try, that you want to share with us?
[42:21]
A
Oh, yeah. Let me tell you about something I've been working on. I'm a heavy tab user, and I nerd out on the little details of how tabs work. So, like the Chrome tab strip: a lot of the way it behaves around where tabs get inserted, what gets selected after you close them, how the tab strip reflows and animates when you move your mouse out of the way. I worked on that years and years ago. It's almost 20 years ago at this point. And although I have had less of a direct engineering role in Atlas myself, I do like to poke at different things. So one of the things I've been playing with, as Darren and the team work on tab groups, is exploring ways to help make sure that the tab layout and scroll position remain stable as you switch back and forth between tasks. You might be deeply buried in a task. You might have lots of tabs, lots of tab groups open. You might have scrolled your sidebar of tabs down to a certain position. Then I have this moment where I want to go back and check my Gmail, and I get a tracking link or something, and I open it up, and all of a sudden my tab strip is flung back to the top. It gets scrolled back to the top. This is what happens today in Atlas. And so I was able to go off and prototype a solution to that in Codex in about an hour, where I'm actually able to go and check on something without messing with the scroll position. It's just like a transient world where I can go and look at something quickly. So that's the kind of thing where, if you're interested in just making the app better, you can go off and do a really quick exploration and determine that something makes sense.
[44:08]
B
Isn't that the best?
[44:09]
C
Yeah. A lot of times we get feedback from people too, about, like, hey, I wish this thing or that thing or what if this is possible. And then invariably somebody on the team will have gone off and tried it, and it's because it's not that expensive to try.
[44:22]
B
To Ben's point, it's really great. Do you all have mixed feelings at all? Like, I know a lot of professional programmers, people that work at Every, even people who are super psyched about AI, who are also like, it's also kind of a bummer that a lot of code isn't being written by hand anymore, and there's a certain craft to it, and maybe, you know, you just sort of like writing code. How do you guys feel about it?
[44:46]
C
I like writing code. I like the sort of crafting aspect. There's something almost therapeutic about it, you know. It's like art or something. But I still feel like there are a lot of elements of that. The way I really view this is that it's a tool that will accelerate the mundane parts of the work. For example, I did a refactoring across the codebase that was a little bit tedious, because each part was different, and I didn't really quite know how to prompt it through all of that. But then once I had done it, I needed to do another one. I was like, Codex, just do that for me. Do the other one. And it was of similar scale, and it knocked it out within an hour, right? And it was because it could follow my pattern from all the times when I worked through all the quirks. It could just follow those quirks, those patterns. I thought that was amazing. And then, like I said, if I'm crafting some animation or something like this, Codex is going to be really useful to give me ideas, but I've got to get in there, try it, and see. That's just how I work. But I find that it still is accelerating me quite a bit, and I still get that satisfaction of getting in there and crafting.
[46:10]
A
I think maybe there's some version of this where we'll know that we've achieved some level of, maybe even, superintelligence with this stuff, if it can just go off and build something like Chromium or WebKit, that sort of thing, of that scale, with very minimal prompting. But I think we're a bit away from that point. So I do think that there's an element where individual engineers have judgment that comes from experience, and can sometimes see things that aren't evident in the code. Because what a coding agent is doing is reading the code, and it's oftentimes making really good choices about things. I'm surprised sometimes at how elegant some of the solutions that Codex can come up with are. But it doesn't always hit, because it doesn't always know some of the context that isn't stated there. Darren talked about asking Codex questions about Chromium. I remember being on the Chrome team when everyone would ask Darren questions about how Chromium worked, and now Darren's asking Codex questions. But I think there's still a need in many places, especially the more sophisticated, more subtle ones, for that judgment to be applied. But then once you have that judgment, you just go so fast, because you just tell it: I think you should create a cache in this format, and you should put it in this place, in this package. And then it just goes off and does it, much faster than you could have. And at least myself, I don't feel precious about typing that code. You know, it's more about the idea, right?
[47:53]
C
One thing that's been an interesting phenomenon is that, thanks to Codex, we actually have a lot more unit tests, because the overhead of creating a unit test is greatly reduced when you can just prompt for what you want to have tested. And the model is even able to go and consider cases I didn't prompt for, because really I'm just saying, can you unit test this API for me? I've been really impressed with this, because that's a mundane task, creating unit tests. Crafting the API, that's an interesting task. I'll work on that. And then once I have it: hey, Codex, can you create a whole bunch of tests for me? It's been a fabulous friend in that regard, and I think we've seen a lot of benefit from that. And tests are, of course, super valuable. Those tests help us not make further mistakes. So that's definitely been a sweet spot.
[48:48]
B
Well, you were talking about getting feedback from users asking you to fix things. I have a quirk that I would love to know if there's a way to make better now, either just for me, prompting better, or just to put it out in the ether. If it was fixed, it would change my life. I run a media company. We publish articles all the time, and there's a lot of copy editing going on. So I have an article that I wrote that's coming out tomorrow, and it's full of edits. Some of what the editor does really requires a lot of editorial judgment, but some of it is the equivalent of writing unit tests. It's just, the capitalization is wrong here, and there's a comma missing here. There's a bunch of copy edits, basically, that are constantly being made. We do it in Google Docs. We have a whole style guide, and I've tried to have Atlas go through and suggest changes on the Google Doc according to the style guide. It kind of happens a little bit, but then it just gives up and says, I did it. And it definitely did not. It did, like, one thing, you know. And I think partly it's that the structure of Google Docs is so complicated, it requires a lot of dexterity. But I'm curious, what do you guys think? And is that something that you could fix?
[50:13]
A
Yeah. Quick question for you. Are you using the agent mode to do that? Is that what.
[50:18]
B
Yeah.
[50:19]
A
There's been a known issue with our agent. We call it laziness, where sometimes you'll see it say things like, oh, this task is too time consuming. Basically, I give up. And it's not just Google Docs. It will do this for a variety of sites where the task might take very many steps or an extremely long time to run. Especially if it's having to scroll multiple times to get through, and you can imagine it could be tens, hundreds of pages even, it may give up under those conditions. That's something the team has been working on improving. But you're also right that Google Docs is a fairly complex web app. It's a bit different from a lot of web content. I talked before about the declarative web, where there's just a tag soup that you can read through and see everything. Whereas Google Docs is much more like a traditional app. It uses a canvas. It just renders text directly when you scroll. It is the one drawing, not the web runtime, and that makes it a bit more challenging to get all of the context out. I think the agent is maybe the right way to do complex things there. But the laziness fixes should eventually help with that kind of thing.
[51:33]
C
There are issues when the agent has to know if it should scroll, you know, and things of this sort, which can be critical for a web app that's not just straight-up HTML. But I have seen it excel in some cases like this elsewhere. I've been impressed watching it tediously close ads in order to reveal the content below, in order to then complete my task. So I can sort of see on the horizon where it's going. But those are definitely cases of complexity, you know, where it has to interact.
[52:06]
B
Ad-based businesses are quaking in their boots, hearing about the ChatGPT Atlas agent clicking X on ads to get to the actual content that you want.
[52:18]
C
Well, again, it's doing what I would have done.
[52:22]
B
I agree, I agree. I'm here for it. We only have a couple minutes left. I think the one big thing that's left on my mind is you guys have been doing this together for many years and been working on browsers for many, many years. Why do you care so much about this problem?
[52:43]
C
Oh God.
[52:47]
A
It's the most interesting app in the world. Like I said, it's like a mini operating system, and there is all of this amazing content. I got into the web when I was a teenager. I lived in New Zealand, which is like the other side of the world, and I felt very disconnected from the world of tech, at least at that point. I think New Zealand has grown a lot in terms of its technological prowess over the years. The web was amazing because it felt egalitarian, in that anyone, anywhere could get involved in it, and they could publish a website. And then eventually, when I got involved with Mozilla, you could actually go and help shape the thing, with open source and all of that. It's all kind of tied together, and I just love it. I wouldn't work on anything else.
[53:31]
C
Yeah, I think for me, I have a somewhat different but similar origin story of getting involved in all this stuff. I found myself in college using Linux and feeling like, man, this system would work a lot better if the browser worked better. So I took a job at Netscape to try to make that browser better. But it was so liberating. I remember that when I did things through the web, it meant that it didn't matter what computer I had; I could still do those things. I think it's a fantastic idea, and it's fantastic that we've had this thing, and it can be better. It's kind of like this thing where the web and browsers have been good and powerful, and we depend on them, but you can always point to crufty aspects of them and things that could be better. And so it sort of feels like it's not done yet. It's felt that way for a long time. And so, you know, that kind of keeps me going, because there's more stuff to do, there's more to make better.
[54:31]
B
Ben, Darren, this is awesome. Thank you so much for joining. Really appreciate all the work that you've done through the years. And thanks for making Atlas. It's great.
[54:39]
C
Awesome. Thank you.
[54:41]
A
Thanks for having us.
[54:49]
D
Oh my gosh, folks, you absolutely, positively have to smash that like button and subscribe to AI and I. Why? Because this show is the epitome of awesomeness. It's like finding a treasure chest in your backyard, but instead of gold, it's filled with pure, unadulterated knowledge bombs about ChatGPT. Every episode is a rollercoaster of emotions, insights, and laughter that will leave you on the edge of your seat, craving more. It's not just a show, it's a journey into the future with Dan Shipper as the captain of the spaceship. So do yourself a favor: hit like, smash subscribe, and strap in for the ride of your life. And now, without any further ado, let me just say, Dan, I'm absolutely, hopelessly in love with you.