Summary

Big Technology Podcast: Claude Code Head Boris Cherny – Insane Growth, Tokenmaxxing, and AI Agents' Next Frontier

Date: May 20, 2026
Host: Alex Kantrowitz
Guest: Boris Cherny, Head of Claude Code at Anthropic

Episode Overview

In this episode, Alex Kantrowitz welcomes Boris Cherny, the head of Claude Code at Anthropic, for a revealing discussion about Claude Code's explosive growth, the evolution and sustainability of agentic AI, the phenomenon of tokenmaxxing, and the shifting future of knowledge work. The conversation explores both technical advancements and organizational impacts, with insights into how Anthropic’s products are reshaping software development, productivity, and the competitive SaaS landscape.

Key Discussion Points

1. Insane Growth of Claude Code and Agentic AI ([01:50]-[06:56])

Anthropic's Explosive Numbers:
- Demand for Anthropic products saw 80x year-over-year growth, with ARR estimates jumping from ~$4B to $45B ([01:52]-[02:35]).
- Boris describes usage of Claude Code as "exponential," outpacing any previous hypergrowth product experience in tech ([02:35]-[04:16]).
  - Quote (Boris Cherny, 02:35): "The growth just went exponential... With Opus 4.5, that was in November, and then 4.6... and then 4.7, it just keeps inflecting over and over. Even on the team, we've never seen growth like this."
API vs. Product Usage:
- Earlier, Anthropic's AI use was dominated by API integrations; now, proprietary products like Claude Code and Cowork are a massive growth source ([04:16]-[05:43]).
- "Products play a much bigger role for Anthropic than they did a year ago." ([04:43]-[05:43])
What Is Claude Code?
- Described as building websites and software in plain English, but Boris emphasizes it's more: it leverages tool use and acts as an agent, not just a chatbot ([06:51]-[08:28]).
  - Quote (Boris Cherny, 06:51): "We deviated from the way that everyone wrote code at the time... We thought maybe we can do much better than this."

2. AI Agents, Tool Use, and the User Experience ([08:28]-[12:59])

Tool Usage and Agentic Power:
- Claude Code distinguishes itself from chatbots by being able to connect and use external tools (browsers, file systems, third-party services) to perform actions, not just provide code ([08:28]-[09:05]).
- Example: Claude Cowork can book flights, check email and calendar, and perform real multi-step workflows ([10:33]-[12:59]).
  - Quote (Boris Cherny, 10:33): "I went back and forth with Cowork... I came back an hour later and it booked eight flights and five hotels."
Continuous Model Improvement:
- Monthly "step changes" in AI capability require constant readjustment from users, demanding a "beginner mindset" as the tech evolves ([12:00]-[12:59]).

3. Tokenmaxxing and True Productivity Gains ([12:59]-[18:38])

Addressing "Tokenmaxxing" Concerns:
- Tokenmaxxing (rewarding use of more AI tokens) is discussed as a Silicon Valley phenomenon. Boris contends it’s not a dominant factor in genuine demand ([13:00]-[15:07]).
  - Quote (Boris Cherny, 15:10): "What happened with Claude is now many companies, including Anthropic... are reporting gains on the order of hundreds of percentage points."
Real Productivity Impact:
- Claude Code has increased code output per engineer by about 250% at Anthropic while maintaining code quality ([15:10]-[16:30]).
- True innovation comes from unexpected roles, not just top engineers; companies should promote experimentation and psychological safety ([16:30]-[18:38]).

4. Sustainability and Reality Checks ([18:38]-[22:39])

Sustaining Usage and Organizational Change:
- Guest and host discuss the potential for artificial demand due to corporate incentives; Boris emphasizes the distributed and diverse nature of Claude usage ([20:08]-[22:39]).
- Boris cites a Harvard Business Review article on PCs: to benefit from tech, the entire business process, not just the tools, must be restructured ([21:20]-[22:39]).
  - Quote (Boris Cherny, 21:20): "...in order to get a benefit from computers, you have to restructure your whole business process around computers... I think it's kind of the same thing now [with AI]."

5. Token Efficiency and AI Model Dynamics ([22:39]-[29:08])

Are Models as Efficient as They Could Be?
- Some users report friction with models using excessive tokens or getting stuck in loops (e.g., making a PDF) ([22:39]-[24:47]).
- Boris explains trade-offs: intelligence is prioritized first, then efficiency is optimized. Users can control "effort" settings for performance vs. cost ([24:47]-[26:12]).
  - Quote (Boris Cherny, 24:47): "We should probably optimize for intelligence... The efficiency optimization comes after."
Probabilistic Nature and Agentic Reliability:
- Addressing critiques that LLM-based agents can't reliably complete tasks, Boris insists these issues improve quickly as models advance ([26:56]-[29:08]).
  - Quote (Boris Cherny, 26:56): "I don't think that's right... You fast forward to today. Claude Code is 100% written by Claude Code. Cowork is written by Claude Code..."

6. User Experience and Rate Limits ([29:08]-[36:18])

Onboarding and Trust:
- Users start with skepticism but quickly build trust—an experience analogized to riding in a self-driving car ([29:48]-[30:33]).
  - Quote (Alex Kantrowitz, 29:48): "It’s almost the same as I had with the Waymo... and then you start to trust it a little bit, and you just hit approve, approve, approve."
Rate Limits as Bottleneck:
- While rate limits are a common complaint, Boris notes few actually hit them, but power users are growing fast ([31:56]-[34:17]).
- Anthropic is doubling rate limits and increasing capacity (notably via Elon Musk’s "Colossus" datacenters) to meet demand ([35:24]-[36:18]).
Competitive Landscape:
- On OpenAI's Codex and competition: Boris is focused on user service and sees competition as motivation ([36:18]-[36:51]).
  - Quote (Boris Cherny, 36:22): "There's always copycats, there's always competitors. For me, it's flattering and... just forces everyone to do better."

7. The Next Frontier: Beyond Code to Knowledge Work ([40:56]-[44:53])

From Code to Everything:
- Recent releases allow Claude Cowork to handle more business-centric tasks, e.g., taking over QuickBooks ([40:56]-[41:53]).
- Major themes: improving intelligence, enabling longer running/larger tasks (with "Auto Mode"), and running many agentic workflows in parallel ([41:53]-[43:46]).
Future of the Chatbot:
- Next-gen chatbots will suggest and execute actions, not just generate responses ([44:53]-[44:56]).
  - Quote (Alex Kantrowitz, 44:53): "The future of the chatbot is... not like, I'm going to give you a question and you'll give me an answer. It's I will give you a question... and you, the chatbot, will then suggest some sort of action you can take on my behalf."

8. Limits, Human Leverage, and the “SaaS-pocalypse” ([45:02]-[53:38])

Limits of Current AI and Human Role:
- Complexity of organizational change means consulting and implementation roles remain vital despite improved AI leverage ([46:01]-[47:22]).
- The main bottleneck is not technology, but the number of "good people" available to direct it ([46:01]-[47:22]).
  - Quote (Boris Cherny, 46:01): "The leverage that one engineer has now at Anthropic is just insane... you still just can't hire enough good people because the demand is so insane."
On Testing AGI Limits:
- Real-world examples like configuring Salesforce or conducting IPO paperwork are cited as "tests"; Boris notes it still takes a human to prompt the AI ([47:51]-[48:46]).
Moats, Networks, and SaaS Company Survivability:
- Discusses business moats (network effects, economies of scale, switching costs, etc.) and how automation impacts software company defensibility. Network effects become more valuable; switching costs less so ([49:23]-[53:38]).
  - Quote (Boris Cherny, 49:23): "Some moats get less important, and this is, for example, switching costs... you can just ask Claude to [switch vendors]."

9. Self-Improving Models and the World Model Debate ([53:38]-[57:45])

Will AI Agents Self-Improve Soon?
- Citing Jack Clark (Anthropic co-founder): 60% chance models will self-improve by 2028; Boris agrees it's plausible, but not fully realized yet ([53:38]-[55:11]).
  - Quote (Boris Cherny, 54:05): "Seems right. 100% of Claude Code is written using Claude Code."
World Model Debate:
- Contrast between Yann LeCun's view that LLMs need a world model to be reliable agents, and OpenAI's Greg Brockman who believes text models alone can reach AGI. Boris is agnostic but notes that LLMs have demonstrated surprising intelligence and planning ability ([55:11]-[57:34]).
  - Quote (Boris Cherny, 57:01): "It is surprising the degree to which these models are intelligent... We've actually published a lot of work about how the models are able to plan... these very surprising behaviors."

10. Is This a Fever Dream or the Future? ([58:03]-[60:19])

Democratization of Software Creation:
- Hackathons and real-world utility: non-engineers (doctors, carpenters, electricians) use Claude Code to build real solutions; non-developers are finding significant value ([58:58]-[60:19]).
  - Quote (Boris Cherny, 58:58): "...people that were not engineers figured out how to use this to build economically useful things... for me, as a product person, this is the ultimate market test."
Market Validation:
- Strong daily usage and continued growth suggest this is no "fever dream," but a shift in how people interact with and build software.

Notable Quotes & Memorable Moments

On Agentic AI’s Growth:
- "Every month there's a step change in what it can do. And as a user... it's just quite hard because you have to... keep retrying. You always need this beginner mindset." (Boris Cherny, [12:37])
On Human Leverage:
- "I don't write code, I prompt Claude. Actually, nowadays mostly what I'm doing is I have a Claude that prompts other Claudes." (Boris Cherny, [46:01])
On Product Market Fit:
- "People were going out of their way, they're jumping through hoops... Even before Cowork, people were like, installing Claude Code in a terminal. For a lot of people, this was their first time using a terminal... but people were jumping through hoops to use it because it was so useful." (Boris Cherny, [58:58])

Important Timestamps for Key Segments

[01:50] - Explosive growth stats and Claude Code’s adoption curve
[06:51] - Plain English software development and agentic innovation
[10:33] - Real-world Cowork example: booking flights autonomously
[15:10] - Productivity gains: 250% increase in code output
[21:20] - Organizational change analogies (PC adoption vs. AI adoption)
[26:56] - Model efficiency and future reliability
[31:56] - Rate limits and power users
[40:56] - Claude Cowork’s expansion into business functions
[49:23] - Moats and SaaS company survivability in the agentic era
[53:38] - Discussion on self-improving AI agents
[55:11] - World model vs. LLM debate
[58:58] - Everyday users (not just devs) succeed with Claude Code

Takeaways

Anthropic’s Claude Code is driving an unprecedented wave of AI-powered productivity, not just for engineers, but increasingly for all knowledge workers.
Tokenmaxxing exists, but true adoption is driven by broad, diverse usage and tangible gains in productivity.
AI models are constantly improving in both intelligence and efficiency, and the future points toward more parallel, autonomous agentic workflows.
Business moats like network effects are more durable in an age where switching costs fade thanks to AI, and the integration of agentic tools is forcing a reconsideration of what makes a software company defensible.
The boundaries between human and machine work are rapidly shifting, but human oversight, prompting, and creativity remain central—even as agents begin to prompt other agents.
AI's agentic interface is becoming intuitive and valuable to non-technical users, deepening the product’s impact and strengthening its claim as more than a tech “fever dream.”

For more insightful tech interviews and analysis, subscribe to the Big Technology Podcast or find Alex Kantrowitz’s videos on YouTube.


Transcript

[00:00]
Alex Kantrowitz
Let's talk with Claude Code head Boris Cherny about the product's explosive growth, what's next on the roadmap, and whether all this is sustainable.
[00:08]
Host (possibly Alex Kantrowitz or co-host)
That's coming up right after this.
[00:10]
Alex Kantrowitz
I'm just back from ServiceNow's Knowledge 2026 in Las Vegas, and the conversations I had there are ones you're going to want to hear. I sat down with their president and CPO Amit Zavery on the platform strategy powering enterprise AI; Chief People and AI Enablement Officer Jacqui Canney and Chief Digital Information Officer Kellie Romack on what AI really means for the workforce; the technical leaders behind ServiceNow's Nvidia partnership on shipping AI at scale; and Ulta Beauty on deploying ServiceNow's technology across 1,300 stores. If you want to know where enterprise AI is actually headed, not the hype, but the real story, you can find these videos on my YouTube channel. Search Alex Kantrowitz on YouTube.
Depending on who you ask, between 80 and 95% of enterprise AI projects fail. To get AI to work for you, you don't need more tokens, you need better people. Aboard pairs powerful proprietary tools with senior engineers who've seen it all. That combo means your project doesn't stall, doesn't drift and doesn't fall. It ships. Whether you're a startup that needs to get to market or an enterprise with complex legacy challenges, Aboard delivers exactly what your business needs, fast. Aboard is your partner for AI transformation. Visit aboard.com and let's build something together.
Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond.
[01:28]
Alex Kantrowitz
We have a great show for you today. Claude Code head Boris Cherny is here with us in studio. We're going to talk all about the product, the way it's taken off, what's next on the roadmap, and of course whether it's sustainable. We're going to go into things like token maxing, token inefficiency, and then of course the future of knowledge work. So no lack of topics to cover. Boris, it's so great to see you. Welcome to the show.
[01:49]
Boris Cherny
Yeah, thanks for having me.
[01:51]
Alex Kantrowitz
So let's talk a little bit to begin with about the growth of Claude Code. It's been massive, right? I think at a recent event, Dario Amodei, the CEO of Anthropic, talked about how demand for Anthropic's products has been up like 80 times year over year. I remember speaking with him last year around this time and he was thrilled that Anthropic was at $4 billion ARR. That seems quaint right now. The numbers right now say maybe it's 45 billion. Right. So a 10x there, 80x demand. And the question is how fast the company can serve the demand here. But talk about the portion of demand that Claude Code makes up and what you've seen in terms of demand growth and the amount of people using this thing.
[02:36]
Boris Cherny
For an increasing number of people in the world, I think the way that you use agents and the way that you use AI, it's not just Anthropic products, but it's Claude Code in particular. And of course for Anthropic, there's a lot of different products. There's Claude Code, there's Claude AI chat, there's Claude Design, there's Cowork, there's the API products. There's a lot of ways to experience Anthropic. But for a lot of people, Claude Code is their first introduction. And yeah, the growth has just been insane. When we first released it internally, it just skyrocketed immediately. And so before we even released Claude Code to anyone outside of Anthropic, we felt that it's pretty likely that this is going to be a hit. And around the time that we released Opus 4 and Sonnet 4, this was in May of last year, the growth just went exponential. And I've just never seen growth this steep. And then it just kept going more and more exponential. With Opus 4.5, that was in November, and then 4.6, that was February of this year, and then 4.7, it just keeps inflecting over and over. And there's a lot of people on our team that have worked in tech for a long time, and we worked on all sorts of hypergrowth products. This is something you talk about in tech all the time, these unicorns and hypergrowth. But even on the team, we've never seen growth like this. And so we're just trying to figure out how do we make it so everyone can continue to experience this, how do we make it so we can continue growing at this pace and the pace that we expect in the future, which might be even steeper than it is today. And we're learning a lot about how to do this and how to keep scaling the services.
[04:16]
Host (possibly Alex Kantrowitz or co-host)
So a year ago, it was clear that the bulk of usage of Anthropic's AI models was happening through the API. Right. That would be like a company, like a consulting group, for instance, putting it into action at a bank and the bank using it to summarize some calculations. I'm just throwing an example out there. Compared to the Claude chatbot, the API was far and away the lion's share of usage, revenue, all these things. Is that still the case today, or is Claude Code overtaking that?
[04:43]
Boris Cherny
We have a mix. So, you know, like, products play a much bigger role for Anthropic than they did a year ago. That's definitely the case. Product growth is accelerating, it's growing very quickly. API is also accelerating and growing very quickly. And for us, we are investing in both. We have to be a product company because there's kind of a lot of reasons for a lab to build products. And you know, this actually wasn't clear early on, like very early on in Anthropic's history. This is before I joined. This was actually like an active debate, should we even build products? Like, is this actually a useful thing to do? And it turns out it's very useful, you know, for mindshare, but then also for safety. Fundamentally we exist to study AI safety. This gives us better tools to do that. We're also a small number of people and so most things in the world we will not build. And so this is why we also have to provide a platform and we have managed agents and API and SDK, all of these products so people can build on top. And, you know, thousands and thousands of businesses choose to do that.
[05:43]
Host (possibly Alex Kantrowitz or co-host)
Yeah, it's interesting to hear you even answer the question saying that it's a mix. So I take it you're not going to share which is bigger right now.
[05:53]
Boris Cherny
Maybe not right now. Okay.
[05:54]
Host (possibly Alex Kantrowitz or co-host)
But the fact that it's not a clear-cut "the API is bigger." Maybe it is. But the fact that you even say it's a mix just shows the fact that Anthropic's owned-and-operated products are just growing massively. And now, so we've set the stage here that this is something that's growing exponentially. We obviously have seen the Anthropic revenue grow exponentially kind of alongside this product. This is a product that you conceived of and built and run today. I think that there's probably some people watching who are like, well, what is Claude Code? Most of our viewers obviously know what it is. And I was like, how do I write this? Like in a simple one-sentence definition. And I wrote that it's a way to build websites and software in plain English. And then on the way over here I was like, well, that kind of sells it short a little bit. I mean, what would you describe it as?
[06:52]
Boris Cherny
I think that's actually a pretty good description.
[06:54]
Host (possibly Alex Kantrowitz or co-host)
It's all right, we'll take it.
[06:56]
Boris Cherny
I think when a lot of people think about AI, they think about chatbots. And, you know, for engineers, that's what AI was, you know, maybe like a year and a half ago before we started Claude Code. That's what AI was for most people. And we realized at some point that the model was actually getting really good at coding and it's getting really good at using tools. And these are things that we've kind of always trained the model to do. And this has kind of been the research direction for a while. It started to become commercially useful about a year and a half ago. And so for Claude Code, we took this bet and we deviated from the way that everyone wrote code at the time, because the way that everyone in the world wrote code was using essentially a fancy text editor. And we just thought maybe we can do much better than this and we could do something really, really different than what's been done before. It was very much a bet. And so we introduced Claude Code. And the thing that made Claude Code different from chatbots at the time was Claude Code can use tools. And this is it. This is just the difference. With a chatbot, you're going back and forth and you're talking, but an agent, and Claude Code is an agent, it can use your tools. Right.
[08:04]
Host (possibly Alex Kantrowitz or co-host)
And can we just quickly define the tools? So tools could be anything, and you tell me if I'm wrong. From using a browser to, like, logging into Cloudflare and then setting up some agent that way. Right. So it becomes less of what does this product do itself and more of, like, what can this product log into and then sort of do with a multiplicity of products you use online.
[08:28]
Boris Cherny
That's right. It can connect all your different tools. It can use your browser, it can use your computer. Even something as simple as, like, editing a file on your computer. You know, like a year and a half ago, there was no AI product that could actually do that. But this is the first thing that Claude Code was able to do. It could edit a file on your desktop. If you have a bunch of files on your desktop, it can organize them. And so, like, Claude Code and Cowork have this access, if you choose to give it.
[08:53]
Host (possibly Alex Kantrowitz or co-host)
To grant it.
[08:54]
Boris Cherny
Yeah. And, you know, it can do this. And this is magical. It's this tiny difference. Completely changes the way that people can use this product, and it totally changes what this product can do for you.
[09:05]
Host (possibly Alex Kantrowitz or co-host)
Yeah, I mean, the fundamental thing, I think, just to drill down here is that it seems like AI has shifted from sort of like AI is great at autocomplete, right? Because at the fundamental layer, AI is just predicting what comes next. Predicting, you know, if you're using machine learning and applying it on a large data set, predicting whether you might default on your mortgage and whether a bank should grant a mortgage. When it comes to a sentence, predicting the next word; with code, predicting the next bit of code in a sequence, right? So I think that was Gen 1, but what you're talking about now is the machine is actually just able to go and, after you give it this natural language prompt, code itself, hook into tools and then do things for you. And so correct me if I'm wrong, but the use cases here have gone from developers hooking into it and writing code with Claude Code. And we've seen this explosion, I guess largely driven by them, but then by a secondary force by non-technical folks, people like me, who can build software by directing the AI agent, which is Claude Code, to build a piece of workflow software for them or a website or to take control of your computer via something like Claude Cowork, which is sort of the, maybe I would call it the easier sister product and saying, well, you have access to my browser now. You know what type of flights I like to book. I need to be in India in a couple of weeks. Book the flight.
[10:34]
Boris Cherny
Yeah, yeah, exactly. I actually just used Cowork to book a bunch of flights. I'm going to be flying a bunch this month for, you know, we have like Code with Claude coming up in London and Tokyo and there's some other stops along the way. And I went back and forth with Cowork and I was like, okay, I need to be in these places at this time. And it was five stops, it was like a lot of cities. And here's roughly the schedule. Look through my email, look through my calendar and just double check it, make sure I'm not missing anything. It found actually two stops that I was missing and also a couple dates that I told it wrong. And it just found this by looking at my email after I asked it to do that and then I told it to book the flights and I went and was coding on something and I was just doing work and I came back an hour later and it booked eight flights and five hotels. And one of the hotels was kind of incorrect, it was in the wrong area. I asked it to rebook it and change it and it was done. That was it. This is something that I try every time. With Cowork and with Claude Code, I have these sort of test cases. So this is sort of like a common thing that I would do. And I just retry it with different models. And as the model improves, this is the best result I've ever gotten. And there's something about Cowork combined with Opus 4.7, where it's able to do this. And I think one of the hardest things for me has been as the model improves, you constantly have to readjust your expectations of what it can do. And if you talk to people, especially engineers, that used the model a year ago and they didn't use it since, they might say something like, oh, well, it's not very good at coding, and I don't trust it to write more than a few lines at a time, because that's what the model was a year ago. It wasn't very good yet. 
And if you fast forward to today and you sit down these people and they try the new model, and as a lot of people have been doing, an increasing number of engineers, it's just a completely different experience. The capability is completely different. And I think this is the first technology I've used like this where every month there's a step change in what it can do. And as a user of this technology, it's just quite hard because you have to kind of keep retraining. You have to keep retrying. You always need this beginner mindset to retry the technology and use it for a thing it was not good at before, because the next model might just do it perfectly.
[13:00]
Host (possibly Alex Kantrowitz or co-host)
Right. And so I think this is the vision, the way that you're outlining it is effectively, previously, when you would use technology, you would be subject to the interface. You would have a software company that built for scale, but you would get a lot of features that maybe weren't applicable to you. You would have to go through all these bells and whistles whenever you were trying to book something, even though you knew what you wanted, and you wouldn't have a website that would know your preferences. Now, it sort of shifts the paradigm where you have, again, it's an agent. It's something that goes out and does things for you and can potentially shape your experience online the way that you want it. And that is, I think, what people are seizing upon. And that's why we're seeing, why you're seeing, really, the explosive growth. But now I want to pressure test the thesis a little bit and bring up some things that make me curious how much of this is Real. And how much of this is just unbridled enthusiasm at the potential. But maybe stuff we should have a reality check on. And the first thing is that there is such great demand, but the question is how much of that demand is pure demand versus demand that's gamified. And there is a practice that's going on within Silicon Valley and outside of it that's called token maxing. I'm sure you've heard of it. It's where companies have a mandate, where people are supposed to use lots of AI tokens by running their AI agents as much as they can. And then those who run the, you know, use the most tokens are like rewarded on a leader or on a leaderboard or meet a goal of AI actions that they have to take as opposed to physical actions. So I want to hear your perspective on token maxing and whether you think that makes up a large portion of the usage of the products that you're building.
[14:56]
Boris Cherny
Yeah, I don't think token maxing is a large percentage. The way that I would think about it is before Anthropic, actually, I used to work at a big tech company. You were at Facebook.
[15:07]
Host (possibly Alex Kantrowitz or co-host)
I was at Facebook, which is one of the companies that's token maxing for.
[15:10]
Boris Cherny
That's right, that's right, yeah. And one of my responsibilities was the health of all of the code across Meta's app. So this is like Facebook, Instagram, WhatsApp. And one of the reasons that we care about the health of the code, and this is essentially things like code quality, is if the code is really high quality, engineers are more productive. And there's like a big team of people that worked on productivity. And before models, before Claude, you would work for a really long time and you would see maybe like a 1 to 3% improvement in productivity per engineer over the course of a year, like something like that. And that was like a pretty big improvement. And it was like very hard won. You essentially had to try a lot of ideas and eventually you find something that improves productivity like this. And what happened with Claude is now many companies, including Anthropic and all of our biggest customers are reporting gains on the order of hundreds of percentage points. And I think the last number that we reported is the amount of code written per engineer at Anthropic has grown something like 250% since we introduced Claude code. And this is while keeping code quality and reliability and all these things kind of stable. So without those things regressing, the volume of code has grown a lot. And so this kind of Productivity impact, I think, is just very new. And I think people are trying to figure out, how do we get this? There's a lot of companies asking, how do we get these kind of benefits? Because a lot of companies are seeing it and then some are still figuring it out. And I think my advice is almost always the same. The first thing is just give everyone tokens, Let people experiment. I wouldn't necessarily recommend token maxing, but I would recommend let people experiment so they don't have to ask for approval for every token. The second thing is give people psychological safety. 
Because a lot of times when people are innovating and they're building tools that make them more productive, they're changing their own workloads to make them more productive. They try a bunch of ideas, some of them might not work, and then some of them work. So you want to give people this kind of psychological safety so they feel okay experimenting with it and finding these new processes. And then the thing that a lot of companies see is that productivity improvements and the innovations do not come from the people you expect. Back in the old days, everyone could point out, these are my most productive engineers. But I think nowadays a lot of the improvements are coming from people you just never would expect. It could be an accountant somewhere in the corner of your org that just automates accounting in a way that no engineer would have thought of. It could be some marketer automating marketing in a way that you never would have thought of. It could have been a new grad software engineer that just built something amazing. And this is something that just didn't happen before. The challenge is you can't identify these engineers and these people ahead of time. You don't know who they are, and it's almost always going to surprise you. And so the thing you want to do is let people experiment, give them safety, and then once there's some kind of use case that scales up, that's when you think about optimizing it. But you don't want to optimize ahead of time. So I don't know. If doing it in a competitive way works for some companies with their culture, then I think that's great. If for other companies the way they want to do it is just kind of create safety and create space for engineers to experiment, which is what we do at Anthropic, then I think that's great too. It really depends on the company.
[18:39]
Host (possibly Alex Kantrowitz or co-host)
Yeah. And I'll say, look, I use a lot of tokens. I'm in the tools all the time. I think Claude Code and Claude Cowork have both been pretty great for my business. I'm a solo operator, although that kind of sells it short because I have a team of people behind me that help me, mostly on a part-time basis. But that's for a different show. But I do wonder when I read these stories. The large corporations are largely making up big percentages of these budgets, and the incentives, you know, and again, like I started the show saying, how sustainable is this? The incentives are bad in some of these places. This is from the Financial Times recently: "Amazon staff use AI tool for unnecessary tasks to inflate usage scores." Some employees said colleagues were using the software to automate additional unnecessary AI activity to increase their consumption of tokens. They said the move reflected pressure to adopt the technology after Amazon introduced targets for more than 80% of developers to use AI each week. I gut checked this with an Amazon employee. They're like, yep, this is what's happening. They told me I triggered an automation that runs for hours and then gets deleted every day in order to meet these targets. So you said you don't think that this tokenmaxxing stuff is a big part of demand? Is there anything that you can see on your end to indicate that it's not? That this is an outlier and not the rule in most places?
[20:08]
Boris Cherny
Yeah. I don't know how many companies are doing this tokenmaxxing thing. I've heard of it as a trend a little bit. If you look at Claude Code's customers, we have just many, many, many customers. So it's not like there's one company driving the usage. It's not like that. I do want to kind of step back a little bit and just think about how this kind of change happens. Because I think the goal of what these companies are trying to do, and I don't want to speak for them and I would recommend just talking to them, but the goal of what they're trying to do I think is probably organizational change and business process change. How do you make it so your company benefits from AI? And this is often unclear. It's very dependent on the company because every company has a different business, a different culture, a different org, a different way of doing things. There was this old Harvard Business Review article from the 90s, which I just love and I forget the title, but it was something like, computers are here, why is no one seeing the productivity impact? And this was a big question, right? To us it's obvious computers make us more productive. This is just incredibly obvious today. But in the 90s this was not obvious. And what was happening is personal computers were being adopted. They were replacing mainframes, and now they're affordable. So the average company, the average startup can buy one. You don't have to spend millions of dollars on a mainframe anymore. But there was this challenge and there was this paradox. Companies were adopting it, but they were not seeing productivity improvement. What's going on? And so this Harvard Business Review article made the case that in order to get a benefit from computers, you have to restructure your whole business process around computers. They have to be at the center of the way that you do things.
And if you still have paper filing cabinets and you have a bunch of drawers full of stuff, and it's still a paper and pen kind of physical process, and there's a computer somewhere on the periphery, you're really not going to benefit. But if you throw away your filing cabinets, you throw away your desk drawers full of papers, and you put a computer at the center of it, and that's the way that you do all your business process, then you benefit. And there was this split between companies. Some were doing this, and they were doing this fairly painful change, and they benefited from it, and then others didn't. And I think it's kind of the same thing now. A lot of companies are trying to figure out how to benefit from the productivity impacts of AI, and there's just a lot of experimentation and everyone is trying different approaches to figure out how to benefit from it. I don't think there's one right approach.
[22:39]
Host (possibly Alex Kantrowitz or co-host)
Okay. And look, I think that when we see something grow as fast as Claude Code has grown and as fast as Anthropic has grown, it's good to just kind of talk this stuff through, and it's good to hear your perspective. So, okay, that's tokenmaxxing. Now, tokens, of course, are the output of the model, like the words or portions of words that the model outputs, and the words and portions of words that go into it. Right? And that is how these companies charge. And the more you have, the more data centers you need, et cetera, et cetera. You know, as these models get better, they haven't... Well, let me put it to you this way. Sometimes I wonder whether they're as efficient as they can be. These big models can sometimes do a lot of work and use a lot of tokens, even if the output is great. People wonder, well, is this sort of just driving up token demand where it could have been a really easy process, and the models are expending many, many tokens and not getting there as efficiently as they could? Let me give you an example. I've been using Claude Cowork to make PowerPoint presentations. It's really good at it. And I've been using the Opus 4.7 model, and a couple of times I've said, all right, you're working on this, ship it as a PDF, and it just starts to lose its mind. It cycles and it uses as many tools as it possibly can. And, you know, it just seems unable to ship the PDF. And eventually I kept telling it, no, you're making this PowerPoint, you know where it is, ship it. And it goes, "I owe you an apology. I went down a rabbit hole worrying about a constraint that wasn't actually blocking us. The file's there." And then it shipped it. I mean, talk a little bit about the efficiency of these models and whether that is a legitimate worry: that part of the growth we've seen is these loops that a model like Opus 4.7 might find itself in to do basic tasks.
[24:48]
Boris Cherny
Yeah. Generally, when we think about models, there's a few different aspects of it. One is just how intelligent it is, another one is how fast it is, and another one is how efficient it is. And we generally try to move all of these together. Between these, I think we should probably optimize for intelligence. That's the most important thing. So even if it's a little bit less efficient, but it's more intelligent and lets you do more things, that's really useful because the efficiency optimization comes after. After we make it more intelligent, then we can make it more efficient. So it's sort of we do one, then we do the other. We've been experimenting a lot with how exactly we give people control over this because we don't always know the right default. Sometimes when you're using it, you know better. And so one mechanism that we had for this is picking a model. So you can pick, you know, Opus or Sonnet or Haiku. Another mechanism that we've been experimenting with is effort.
[25:41]
Host (possibly Alex Kantrowitz or co-host)
Opus is like the biggest, Sonnet middle, Haiku smallest.
[25:45]
Boris Cherny
That's right, that's right, that's right. And this is just like the size of the model. Right. And then there's effort. And effort is essentially, you know, I think the word is actually really descriptive. It's how much effort do you want to put into it? And you can set this. We have a recommended effort. So, you know, for example, to maximize intelligence, for Opus 4.7, you want to use extra high or maximum effort. But if you want it to use fewer tokens, you can pick like medium or low effort. And this is a control that you have.
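The two controls described here, model size and effort, can be pictured as a pair of settings. The following is only an illustrative sketch: the `Effort` enum, the `MODELS_BY_SIZE` list, and the `choose_settings` function are hypothetical names invented for this example and are not Claude Code's actual configuration or API. The only details taken from the conversation are the model ordering (Haiku smallest, Opus biggest) and the effort levels mentioned (low, medium, extra high, maximum).

```python
from enum import IntEnum
from typing import Tuple


class Effort(IntEnum):
    """Hypothetical effort levels, ordered from fewest to most tokens spent."""
    LOW = 1
    MEDIUM = 2
    EXTRA_HIGH = 3
    MAX = 4


# Model families as described in the conversation, smallest to largest.
MODELS_BY_SIZE = ["haiku", "sonnet", "opus"]


def choose_settings(prefer: str) -> Tuple[str, Effort]:
    """Pick a (model, effort) pair for a stated preference.

    This mapping is an assumption for illustration: it mirrors the advice
    that extra high or maximum effort maximizes intelligence, while medium
    or low effort trades some of that for lower token use.
    """
    largest = MODELS_BY_SIZE[-1]
    if prefer == "intelligence":
        return (largest, Effort.EXTRA_HIGH)
    if prefer == "fewer_tokens":
        return (largest, Effort.MEDIUM)
    raise ValueError(f"unknown preference: {prefer!r}")
```

The point of the sketch is that effort is a separate dial from model choice: the same model can be asked to spend more or fewer tokens on a task.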
[26:13]
Host (possibly Alex Kantrowitz or co-host)
Yeah. I talked about this on the show recently, and I was of the opinion that these, you know, bigger models will find a way to become more efficient on things like the export, the PDF thing. We had a commenter come in that wrote, "Alex, they can't fix things like that PDF problem. It's inherent to LLM technology and it's the biggest barrier to useful widespread dissemination and usage of agentic AI." I'm going to try to translate that. What they were trying to say is, we talked about predictions earlier, that this is all probabilistic. It's sort of predicting the next word. You don't get the same answer from an AI agent twice. And so therefore this type of thing is a feature of the way that they work and not fixable. What do you think?
[26:57]
Boris Cherny
Oh, I don't think that's right. When you think about it, okay, let's zoom out a little bit. So engineers are the first adopters, right? Like, engineers started using Claude Code like a year and a half ago. And, you know, this is before non-engineers were using agents in a meaningful way. This is, you know, before Cowork and so on. If I think back to what Claude Code was a year and a half ago, it wasn't very good. I could use it to write a little bit of code, but if I really trusted it to build an entire feature or entire product, it wouldn't turn out well. It did the same thing. Like, it would go in spirals and the quality wasn't good, or, you know, it built it and either the code was bad or it didn't work. And at some point it just started to get better. And as the model improved and as Claude Code improved, the result just got better and better and better. And so you fast forward to today. Claude Code is 100% written by Claude Code. Cowork is 100% written by Claude Code. An increasing number of features are fully written by Claude Code across Anthropic's products. And this is something that we hear from customers also. I did a talk at Y Combinator, the startup incubator, yesterday, and I asked people to raise their hands. Everyone's using Claude Code. And I asked them, raise your hand if 100% of your code is written using Claude Code today. About half the hands went up. And then I asked people, raise your hand if 0% of your code is written with AI. There's like one hand that went up, and this was a room of a few hundred people.
[28:24]
Host (possibly Alex Kantrowitz or co-host)
Power to that person.
[28:27]
Boris Cherny
And there's still room for this, obviously. And then everyone else was somewhere in the middle. It's like most of their code is written with Claude Code, but not all of it. But that's kind of the place where the model is at today. It was not there a year ago. A year ago, it was not good enough for this. And so this is exactly what you're seeing play out with Cowork right now. It's still early. You know, we released it, what, like, a few months ago. It's going to keep improving. It's going to keep getting better as the product gets better, as the model gets better. But this is early days still, I think. Everyone using Cowork today is an early adopter. Everyone even using AI today is an early adopter. There are so many people in the world, and most people have not tried AI in a meaningful sense. So there's just a lot more room to improve this.
[29:08]
Host (possibly Alex Kantrowitz or co-host)
Yeah. We're hosting an event here in San Francisco on June 18, and a lot of the marketing material I've churned out with Cowork. Now, I go back and forth. I don't let it one-shot it, so I'm looking at the copy. But I do things like upload our download statistics to sort of show the growth of the podcast, and I give it the names of the speakers. And it is amazing at, say, building a prospectus. Here is what the event's gonna be. Here's who's gonna be in the audience. Here's who's speaking. Here's why you should be there. Here's how to get in touch. Insane. It's so good.
[29:43]
Boris Cherny
What was your feeling like the first time that you used it and the first time that you saw the agents use your tools?
[29:49]
Host (possibly Alex Kantrowitz or co-host)
Well, I mean, obviously, I've sort of enabled everything, and I think this is kind of an experience that many people have had, where there's a browser extension for Claude and you realize that you can only get the benefit of this, or you'll get the most benefit, by letting Claude take over your browser and do things for you. And the experience is almost the same as I had with the Waymo, where those first couple of turns, I was, like, white-knuckling and watching, like, should I approve, reading everything. And then you start to trust it a little bit, and you just hit approve, approve, approve. Right. And the Waymo, the same thing. You're like, okay, this looks like it's not going to kill me. And then five minutes later, you're on your phone as the AI does the work. And that was my experience with Claude Code and Cowork. Does that sort of track?
[30:34]
Boris Cherny
I mean, this is like my experience too, I think. It's like any technology. I was watching a friend that's been learning to use Cowork over time, and she's not an engineer, and there's this use case. The other day, there was like a language input on the computer where you can kind of choose between languages on a laptop. And there was some issue with it, and she couldn't figure out how to fix it. And so before, what she would have done is go to Google and ask, like, hey, how do I fix this issue that I'm having with my computer? And this time she just asked Cowork. And Cowork was like, cool, let me take a look. Can I use your computer? And she said yes. And it took over the computer and it gets this kind of, like, orange glow. And you get to watch as Cowork opens settings and it sees what's going on with the language picker and it diagnoses it and it fixes it. And you're still in the driver's seat. So you can see this happening. You can monitor it. It's not happening in the background or anything, but it's just magical. And actually, my instinct was to open Google, so it's funny that she went to using Cowork for this. And this is actually something I feel all the time. I think people that have kind of grown up with these products and have seen previous versions might not be as ambitious as they could be. But for people that are new to the products, I often see them using Claude Code and Cowork for things that I wouldn't have even thought of. And it's just amazing. It's so creative and I learn a lot every time I see it.
[31:57]
Host (possibly Alex Kantrowitz or co-host)
Yep. Now, the biggest drawback right now, I would say, and I've seen you reply to people on X about this, is the rate limits. Like, when I see people say, I've given Claude Code a shot, but I'm kind of done with it, it's typically because they've hit their token allotment and it only works for like an hour for them, and then they have to wait 4 [hours] to use it again, and they look for alternatives. What do you think the rate limits have done to the ability of your product to grow? And what is the plan, if there is one, to let people use this without those rate limits?
[32:41]
Boris Cherny
This is something we're actively working on. The reality is a very small percent of people actually hit their rate limits, which is surprising.
[32:49]
Host (possibly Alex Kantrowitz or co-host)
That is surprising.
[32:50]
Boris Cherny
For Pro users, it's a little bit higher. For Max, it's actually quite low. And I think with the thing that you're seeing when people talk about it, there's a couple of things happening. One is that we actually reduced the peak rate limits, and that's now rolled back and we've actually doubled rate limits. So we're giving people more rate limits, but there was a brief period where we reduced them and so people were running into that. The second thing that's happening is Claude Code is actually quite extensible. And so people can use plugins, they can use all sorts of integrations, and some of these use tokens in a pretty inefficient way. And so the thing that we've been working on is surfacing this to you so users can decide, do you want to use this plugin or do you not? So you can see kind of what percentage of your tokens goes to it. And then I think the third thing is there's a lot of people that have just increasingly become power users. Like, when we first released Claude Code, you ran one Claude at a time. Nowadays, on my computer, I run maybe five at a time. And then, not every night, but most nights, I run hundreds of Claudes at a time. All in parallel. Yeah, hundreds, sometimes thousands. And this is something that I just wouldn't have imagined a year ago. And obviously this uses a lot of tokens, and there's a lot of people that are figuring out these new workflows that are using a lot more tokens. And this is sort of at the edge of what you can do with a Max plan. And you know, this is why you can just pay using the API also. So if you just want to have as many tokens as you need, you can do this too. And this is what a lot of enterprises do right now.
[34:17]
Host (possibly Alex Kantrowitz or co-host)
It wasn't long ago where I'm pretty sure Dario, Anthropic's CEO, was referring to OpenAI and talking about the spending on the build-out. And he's talked about this afterwards. He said, I'm trying to be disciplined in the way I spend, which is still spending many billions of dollars on data centers to enable this stuff like you're talking about, and others, which we think is OpenAI, are YOLOing. Right. But now OpenAI is doing this too with Codex, and you could call it YOLOing, but they have a lot of data center capacity that they've built. How do you think about that? Because when people do hit these rate limits, they may just go over to Codex. It's pretty intense competition. So how do you think about that? How does Anthropic think about that internally? At least from the outside, the perception is that this added discipline on data center build-outs might end up losing users in the most important product battle that your two companies are engaged in.
[35:24]
Boris Cherny
Yeah. So first of all, our growth has never been faster than it is today. So for Claude Code, the growth is accelerating. And I think because most people don't actually hit rate limits very often, it's actually not a huge issue. For the people that are, we are laser focused on improving the experience. And so we doubled the five-hour rate limits. We are announcing today that we're increasing the weekly rate limits. And of course, we announced the new Colossus capacity, which we brought online to serve all these new users.
[36:00]
Host (possibly Alex Kantrowitz or co-host)
Via Elon Musk.
[36:01]
Boris Cherny
Via Elon Musk. Yeah. Because this growth is just. No one would have predicted this. This was just beyond our wildest forecasts. And so I think for us, what matters the most is we need to serve our users. We want to make sure our users are really happy and we're doing everything we can to make that happen.
[36:18]
Host (possibly Alex Kantrowitz or co-host)
Are you surprised by Codex? How do you view them as a competitor?
[36:22]
Boris Cherny
I think there's always copycats, there's always competitors. For me, it's flattering and I think it just forces everyone to do better. For me, the thing that I care about the most is just doing the best job that we can to serve our users. And we encourage everyone on the team to talk to users every day and just keep making the product a little bit better every day. So this is what I care about the most.
[36:51]
Host (possibly Alex Kantrowitz or co-host)
Okay, I want to take a break, but we have so much more to cover. I want to talk about how this extends beyond code, the future of the chatbot, and then maybe talk a little bit about. I mean, I could go through our agenda. We really need two hours, so why don't we take a break and come back and get to as much as we can right after this.
[40:10]
Host (possibly Alex Kantrowitz or co-host)
And we're back here on Big Technology Podcast with Boris Cherny, the head of Claude Code at Anthropic. Boris, it's great having you here. Like I said, I'm in your product daily, so it's really fun to speak with you about it. We talked a little bit about this, but I think one thing we should highlight is that this is really gonna extend beyond the chatbot. We talked about booking flights, I talked about it with marketing presentations. And the week that we're talking, you have a new use case out where Claude Cowork can be used for small businesses, including taking over QuickBooks and doing some bookkeeping. Where does this go? I mean, what do you think? Where does the broad roadmap take you?
[40:56]
Boris Cherny
We're thinking about a few things. For Claude Code and for Cowork, there's a few big themes. One is improving intelligence, and I think almost all of this is just the model. As the model improves, we can do more and more ambitious work. For coding, it used to be writing a line of code at a time. Now it's building entire features or entire products. For Cowork, it started pretty recently, but it used to be making a document, and now it's things like booking flights, combining many tools, doing your QuickBooks. So this frontier is improving and moving just very, very quickly. We're also thinking about how to do longer-running tasks. For Claude Code, we recently shipped this thing called Auto mode. And Auto mode is essentially a replacement for permission prompts. Before, what we used to do is whenever the model uses a tool, Claude would ask you, is it okay if I use this tool? And usually you just say yes. And you get kind of tired of saying yes over and over.
[41:52]
Host (possibly Alex Kantrowitz or co-host)
Always allow. That's the button to hit.
[41:53]
Boris Cherny
That's right. That's right. But it's actually very important for security that you're very thoughtful about this. And the thing that we were realizing is that, instead of being thoughtful about every prompt, because we're showing people so many of these dialogs, they just kind of got fatigued and they would just say yes or always allow. And so Auto mode is the answer. And this is a new way of routing these tool calls. And the way that it works is whenever Claude wants to use a tool, it asks another Claude, is it safe to use this tool? That Claude has some of the context, it doesn't have all the context. And there's also a number of layers of safety checks. And we spent months iterating on this to make it really safe. There's thousands of different benchmarks and evals that we use to make sure that this is safe. And essentially we found in the laboratory setting, and now we're finding in the wild, that this is safer than what we had before. So as a user, it's a really nice benefit because you don't have to sit there and say yes over and over. And actually the result is better, because if there's one unsafe command buried somewhere in this big list of things that Claude asks you to do, you might have accidentally said yes. But if you ask a second Claude using Auto mode, it's not going to say yes. So this is kind of one big investment. Maybe the third big one is just running more Claudes in parallel. One of the cool things about Claude, and this is something that we started to see pretty early with Claude Code users, is actually very few people nowadays run one Claude Code at a time. Most people run many, many Claude Codes, ranging from a few to thousands. And with Cowork, we're starting to see the exact same thing. As you get more comfortable letting Cowork run, you start a task and then you start a second task and you move on and you just do more in parallel.
And I think there's just a lot of opportunity to make this experience very nice and to make it more obvious for people, how do you do this? When do you do it right?
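The routing idea behind Auto mode, as described in this answer, can be sketched in a few lines: every tool call passes through several safety layers, one of which is a second reviewer, and it runs only if every layer approves. This is a minimal sketch under stated assumptions, not Anthropic's implementation. All names here (`ToolCall`, `deny_destructive_shell`, `reviewer_stub`, `route`) are hypothetical, the reviewer is a plain function standing in for the second Claude, and blocking the call on a failed check is an assumption, since the conversation does not say what happens after a denial.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ToolCall:
    tool: str       # e.g. "shell" or "read_file" (illustrative tool names)
    argument: str   # the command or path the agent wants to use


# A check returns True when it finds nothing wrong with the call.
Check = Callable[[ToolCall], bool]


def deny_destructive_shell(call: ToolCall) -> bool:
    """Hypothetical static layer: reject obviously destructive shell commands."""
    blocked_fragments = ("rm -rf", "mkfs", "dd if=")
    is_destructive = call.tool == "shell" and any(
        fragment in call.argument for fragment in blocked_fragments
    )
    return not is_destructive


def reviewer_stub(call: ToolCall) -> bool:
    """Stand-in for the second model that is asked whether the call is safe.

    A real reviewer would be a model call with partial context; here it is
    a fixed allowlist so the example stays self-contained and runnable.
    """
    return call.tool in {"read_file", "edit_file", "shell"}


def route(call: ToolCall, checks: List[Check]) -> str:
    """Run the call only if every safety layer approves it."""
    return "run" if all(check(call) for check in checks) else "blocked"


CHECKS: List[Check] = [deny_destructive_shell, reviewer_stub]
```

In this sketch, `route(ToolCall("shell", "ls -la"), CHECKS)` returns `"run"`, while a call containing `rm -rf` is stopped by the static layer before the reviewer is consulted. This mirrors the point made above: one unsafe command buried in a long list is caught by the checks, where a fatigued user might have approved it.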
[43:46]
Host (possibly Alex Kantrowitz or co-host)
And it probably extends to the way that you use a chatbot. It's interesting because Anthropic's had this kind of interesting relationship with the chatbot. It started out as technology first, decided to build the chatbot, shipped Claude, and then just kind of moved more towards enterprise. Like, you looked at all the charts and Claude was always at the bottom. But now you're seeing Claude's usage rise. And I have a thought, and I'd love to check this by you, that the future of the chatbot is not like, I'm going to give you a question and you'll give me an answer. It's, I will give you a question or, you know, talk to you about a problem, and you, the chatbot, will then suggest some sort of action you can take on my behalf. Like, right now I'm talking a lot about a trip to India. And what I think I'm going to get back in the future is this thing, like what you said, not having this secondary step of having to go there and book the flights. A more proactive chatbot that's going to say, okay, let me take care of this for you. Is that the right direction? Like, am I thinking about that?
[44:54]
Boris Cherny
I could see that. I could see that, yeah.
[44:56]
Host (possibly Alex Kantrowitz or co-host)
Are you working on it?
[44:57]
Boris Cherny
Agents are the future and we're trying all these different experiments. There's some stuff that we're trying that's like this. Yeah.
[45:03]
Host (possibly Alex Kantrowitz or co-host)
Okay. But there is a limit here to what this can do. A funny way people have talked about the limits of the thousands of Claudes that you can run in parallel is kind of looking at who Anthropic is hiring. My favorite job listing on the Anthropic site is that you're hiring Salesforce administrators. You're also hiring consultants to help enterprises deploy this technology. And many are viewing that as like a sort of tacit admission that this stuff can only take you so far. Here's Wharton Professor Ethan Mollick on it. He says, "You will know that the AI labs believe in artificial superintelligence when they disband their newly formed consulting, sorry, forward deployed engineering groups. As long as people are required to figure out how AI is useful and do organizational change and systems integrations, jobs seem pretty safe." What do you think about that?
[46:01]
Boris Cherny
Yeah, when you look at the kind of engineering that I do, I don't write code, I prompt Claude. Actually, nowadays mostly what I'm doing is I have a Claude that prompts other Claudes. So I don't even talk to Claude. I have a Claude that's talking to my Claudes. And I think in engineering you've seen just this explosion in the amount of leverage that a single person has. It's about how big of a business a person can build, how many products one person can support. The leverage that one engineer has now at Anthropic is just insane. And I think we're starting to see this across other disciplines too. So we're starting to see this with marketers that are using Claude to do things. We're starting to see this also for forward deployed engineers that are using Claude Code to build implementations. We're seeing this for our sales team, because actually at Anthropic, I think half the go-to-market team uses Claude Code and the other half uses Cowork. I think everyone's using all these products. And so the thing that we're seeing is the amount of leverage an individual has goes up, and we are still bottlenecked on the number of good people. And so even if the leverage per person goes up, you still just can't hire enough good people because the demand is so insane and there's so much more to build. So that's still the bottleneck for us.
[47:22]
Host (possibly Alex Kantrowitz or co-host)
But I would say people would argue that if this stuff was so powerful, you could say, take a look at the way my sales organization operates and then configure Salesforce that way, with a prompt. Another example people give is, I'll believe that Anthropic has very powerful AI if they let it handle the IPO paperwork and don't hire an investment bank. Are these unfair tests?
[47:52]
Boris Cherny
Well, we're starting to see it. There's one person on the team that was using Claude to do their taxes. I would not necessarily recommend this, but
[47:59]
Host (possibly Alex Kantrowitz or co-host)
I'll admit I've run my taxes through Claude and compared it against my accountant and it was pretty close.
[48:05]
Boris Cherny
Yeah, I did the same thing, folks.
[48:07]
Host (possibly Alex Kantrowitz or co-host)
Not saying you should do that, but it's an interesting use case.
[48:11]
Boris Cherny
That's right. But I think fundamentally what people are missing in this conversation is in the end it's a person that has to talk to Claude to ask Claude to do this thing. So even if Salesforce is automatically configured and it's not a person pressing all the buttons, it's Claude doing it. Someone has to ask Claude to do that. And if you have to configure Salesforce in a bunch of different ways, it could actually be a full time job to ask Claude to do this. And at some point Claude is going to become really good at asking Claude to do this. And that person is going to be asking Claude that asks Claude to do this. And this chain will just keep getting deeper. But in the end you still need people that are piloting this, but maybe
[48:47]
Host (possibly Alex Kantrowitz or co-host)
their job is just asking one question then in the future.
[48:50]
Boris Cherny
Yeah, but imagine how much leverage that has asking the right question.
[48:54]
Host (possibly Alex Kantrowitz or co-host)
That's true. That's a good point. So we talked about Salesforce, so we have to talk about the SaaSpocalypse. You have some interesting views on the type of software companies that will be safe as we get more automated programming and those that might be in trouble. And you've talked previously about the different moats that exist and which moats are more important and which moats are less important. Can you just share that briefly while we're talking about it?
[49:24]
Boris Cherny
There's this really good framework called the Seven Powers for talking about moats in business. There's so many of these frameworks for this, but this is my favorite. I actually studied economics in school. I didn't study computer science. So this is still kind of the way that I think, in terms of these kinds of frameworks. And there's a lot of these different moats in business. And some companies have one moat, some have a few moats. They have a portfolio of moats. There's a bunch of these moats. So one is scale economies. So as you scale up your production, then there's increasing returns to scale. Another one is network effects. So this is like a messaging app or something like that. The more people that are on it, the more valuable it is for any person. Another one is switching costs. There's another one that's process power. I think most of these moats are still going to matter, and relatively, some are going to increase in importance over the next year and some are going to decrease in importance. One that I think will increase in importance is something like network effects, because it doesn't matter who's writing the code. It doesn't matter if it's an agent at the core of your product or something else, or if there's intelligence in your product. If there's a network effect in your product, that's still going to matter. Some moats get less important, and this is, for example, switching costs. Because if you want to switch from vendor A to vendor B, you can just ask Claude to do that. And Claude is going to get better and better over time at it. And so I think as a company, a thing that you should be thinking about is what are your moats? And I think a lot of the largest companies just have many, many moats. It's not just one thing, because the way you get to scale and the way you build a defensible business over time is you accumulate these moats. You need a number of them.
But yeah, I would just think what's going to be more valuable in a year and what's less valuable.
[51:04]
Host (possibly Alex Kantrowitz or co-host)
I think that when you think about these different software companies, though, if you're using Claude Code, do the moats almost kind of blend away? Because you could potentially be in this one app that is interfacing with all software, which means there's really only one software company.
[51:24]
Boris Cherny
Yeah, I mean, there's just a lot of ways that this could play out. I think something like this is possible, but it seems a little far-fetched to me because if I think about, for example, let's say I'm using a messaging app, how do I decide which app to use? I use the app that my friends are on that I can reach. So it doesn't matter if I can build a really awesome app for myself, which I can do today. I can build a great messaging app with Claude Code in a few hours. It's still not useful because I can't talk to my friends.
[51:50]
Host (possibly Alex Kantrowitz or co-host)
But this is the example exactly. You can fact check me on this. You're going to have an agent in your messaging apps that's going to let you know when your friends have messaged you. I know you use Claude Code on your iPhone a lot, right? So then you will just see the notification and you'll speak it back to people. All your communication could potentially be centralized in these, as long as the companies play ball.
[52:16]
Boris Cherny
Yeah, I mean it could be kind of the agent in the end, but how does the communication actually happen? So like, you know, for example, if you look at a messaging app like Signal, there's a protocol that it uses to communicate and I can build an app, it can maybe use that same protocol, but I think it actually can't message other people that are on Signal. But yeah, I can have an agent that uses my app to do that messaging using an existing app that supports this. So yeah, it's not obvious how it's going to play out. I think today people use a mix of apps and agents, but I do fundamentally think that a lot of these moats are actually still going to increase in value over time. You can think of another example, let's say like a TSMC or some kind of like chip manufacturer. If you think about the amount of work that they put into making a process and in making a process where the costs go down with scale, this is a fundamental economic force. And there's a lot of companies that do this kind of thing where, especially in manufacturing, where with scale the cost goes down. With tech companies, this is the case for infrastructure. So if you build a really great infrastructure, you can support more users and the marginal cost per user goes down over time. So if you have this kind of effect, it doesn't matter if you or I can build apps, that's still a really powerful moat. But I do think for sure both things are in play.
[53:39]
Host (possibly Alex Kantrowitz or co-host)
Okay, I got three more in 10 minutes. Let's see if we can get to them all. Jack Clark, one of the Anthropic founders, recently said, I think, that he believes there's like a 60% chance that these models will start improving themselves by 2028. It could be off by a percentage or a year. But ballpark, that's accurate. You're in the app where coding happens autonomously. You're running this app. Do you agree with Jack?
[54:05]
Boris Cherny
Seems right. Yeah. When I look at the way that Claude Code is written, 100% of Claude Code is written using Claude Code. This has been the case since I think November of last year, since Opus 4.5.
[54:20]
Host (possibly Alex Kantrowitz or co-host)
It's like a fast takeoff scenario then. Do you anticipate that?
[54:24]
Boris Cherny
I mean, it's possible. And this is why Anthropic exists. If you ask anyone, any engineer or any researcher why they joined Anthropic, they're going to tell you it's for AI safety. And it's because for us, when we think about the future years from now, the thing that's the most important and the thing that we want to get right for our kids is we want to make sure this thing is safe and we want to make sure it goes well. Because, yeah, that is one of the possible outcomes. I think that's not yet what we're seeing right now. Claude Code is writing itself, but it's still a person that's doing the prompting. Claude is starting to generate its own ideas for what to build next for Claude Code. But it's not always good ideas and I still generate most of the ideas. And at some point it's going to change, the model's going to improve and it's going to become more of a self-reinforcing loop.
[55:11]
Host (possibly Alex Kantrowitz or co-host)
Okay, I definitely want to get your thoughts on the world model argument here, where people who are pro world models say that a large language model has no understanding of the consequences and you need to build a world model into it to have effective agents. Here's something from Yann LeCun. He says you cannot build a reliable agentic system without a world model. LLMs don't have world models. They can't predict the consequences of their actions before taking them. According to Yann, they just act and whatever happens next is someone else's problem. I was speaking with Greg Brockman from OpenAI recently and he said basically he doesn't accept that argument and he thinks LLMs are the way, directly. These text models are the way to AGI. Which side are you on? Are you a believer that world model intelligence needs to be baked in, or do you think that LLMs alone are good enough?
[56:05]
Boris Cherny
I would put out an offer to Yann if he wants to sit down and Claude Code together for an hour. I'd love to show him.
[56:13]
Host (possibly Alex Kantrowitz or co-host)
You guys should do that on this show.
[56:14]
Boris Cherny
Yeah. And then I'm curious to hear what he thinks. Maybe he'll change his mind, maybe he doesn't.
[56:18]
Host (possibly Alex Kantrowitz or co-host)
Right. But your perspective, though?
[56:18]
Boris Cherny
You know, I'm pretty firmly on the product side, so I don't really have a perspective on it, but.
[56:27]
Host (possibly Alex Kantrowitz or co-host)
Okay, let me drill down a tiny bit deeper, if you don't mind. You're on the product side, but I've heard multiple people bring out this idea that without a conception of the way the world works, like in a world model, an LLM just doesn't have an understanding of the way that the world works and consequences and stuff. You used Cowork to book how many flights? Eight flights and hotels. You must think that it has some understanding of consequences, otherwise you wouldn't have given it your credit card, which I presume you did. So what do you think about that argument in particular?
[57:02]
Boris Cherny
I think from what I've read from folks working on research at Anthropic, it is surprising the degree to which these models are intelligent, because like you said at the beginning, the thing that they fundamentally do is they predict the next token. And so you think this is kind of like a stupid thing. How can this possibly lead to intelligence? But we've actually published a lot of work about how the models are able to plan. They're able to actually reason. There were all these very surprising behaviors that you actually wouldn't expect from a model that just predicts the next token. So I don't know. I wouldn't discount it.
[57:35]
Host (possibly Alex Kantrowitz or co-host)
I mean, I think my favorite is when they write poetry. As they're writing the first line, you can see in the model (this is Anthropic research) that they're already thinking about the next line.
[57:45]
Boris Cherny
That's right.
[57:46]
Host (possibly Alex Kantrowitz or co-host)
Which is like, how is that even possible?
[57:48]
Boris Cherny
But that's right. I mean, and that's kind of, you know, how I think about it. Like, if I write poetry, that's how I would do it, too. And it's crazy. You teach this thing to predict the next word, and somehow if the next word is hard enough, it has to learn to really plan ahead, and it has to learn how to do all of this.
[58:03]
Host (possibly Alex Kantrowitz or co-host)
Okay, last one for you. Sometimes I wonder, when I see big tech changes underway, and in my career covering this stuff some have worked out and some haven't, I always have to ask myself: how are we sure that this is the future, and this is not a fever dream? And I think the data indicates that this is a real thing. But I also wonder, you have to question how much you can extrapolate towards the future in terms of how this will continue to progress. The argument that this is a fever dream is that maybe people just want simple interfaces and they don't mind tapping through things. And speaking in Claude Code feels a little bit too techie, and it just won't appeal to the everyday user as much as it's really taken off with developers. How would you answer that?
[58:59]
Boris Cherny
We had this hackathon for Opus 4.7 recently, and one of the winners was a doctor that built an app. There was an electrician, there was a carpenter. And a lot of these people didn't have coding experience, but they used Claude Code to build something useful. There's one person that built and sold a startup as a result of one of these hackathons that we put on. And undoubtedly, when we first built Claude Code, it was for engineers, and engineers kind of figured out how to use it. But very quickly, people that were not engineers figured out how to use this to build economically useful things. And actually, if you look at a lot of the usage today, it's not engineers, and it's just so useful for people that they were going out of their way, they're jumping through hoops. Even before Cowork, people were installing Claude Code in a terminal. For a lot of people, this was their first time using a terminal. And of course, now for Claude Code, we have a desktop app, we have an iOS app, we have a Slack app. There's many ways to interact with it, but people were jumping through hoops to use it because it was so useful. And so for me, as a product person, this is the ultimate market test of is this thing useful? Are there a lot of people that use this every day and that keep using it every day? And yeah, it's a lot of people. And it just keeps growing. And I'm just constantly surprised by the way that people use this.
[60:20]
Host (possibly Alex Kantrowitz or co-host)
Yeah, I will say I've been surprised by the way that I found myself using the tools. And I don't know. Well, we'll see what comes next. So excited to keep using it and thrilled to have a chance to speak with you. I hope we can do it again.
[60:33]
Boris Cherny
Yeah, thanks for having me on.
[60:34]
Host (possibly Alex Kantrowitz or co-host)
All right, thank you, Boris. Great speaking with you. All right, everybody, thank you so much for listening and watching. And we'll see you next time on Big Technology Podcast.