Summary

Y Combinator Podcast — Inside Claude Code With Its Creator Boris Cherny

Date: February 17, 2026
Guest: Boris Cherny, Creator Engineer of Claude Code (Anthropic)
Host & Panel: Y Combinator team, Ben Mann, YC Partner

Episode Overview

This episode dives deep into the genesis, vision, and future of Claude Code—a revolutionary AI-powered coding assistant—directly from its creator, Boris Cherny. The discussion covers Anthropic's product philosophy, rapid iteration, latent demand, technical and cultural shifts in software engineering, and practical guidance for startup founders and engineers building with the latest AI models.

Key Themes & Discussion Points

1. Iterative & Forward-Looking Development

Building for the Future of AI: Boris emphasizes Anthropic’s philosophy:

"We don't build for the model of today, we build for the model six months from now." (00:00, 01:52, 43:43)
This strategic bet on AI’s exponential progress shapes all aspects of Claude Code’s evolution.
Constant Reinvention:

"All of Claude Code has just been written and rewritten and rewritten and rewritten over and over and over. There is no part of Claude Code that was around six months ago." (00:09, 39:27)
The product’s codebase and features are continually updated in response to both model advances and user feedback.

2. Origins of Claude Code: Accidents & Latent Demand

Prototyping in Terminal: Initial development was accidental and pragmatic, choosing the terminal for simplicity and speed.

"There was no pressure because we didn't even know what we wanted to build. The team was just in Explore mode." (04:43)
Spotting Latent Demand: Real value emerges from observing how people naturally use tools—focusing on what they already try to do.

"Probably the single, for me biggest principle in product is latent demand. And just every bit of this product is built through latent demand after the initial CLI." (07:57, 28:47)

3. User-Centric Iteration

Dogfooding & Organic Growth: The earliest adopters were Anthropic engineers—no mandates, just internal word-of-mouth and usage.

"I just posted about it and they'd just been like, telling each other about it. Honestly, it was just accidental." (06:37)
Design By Feedback: New features (“plan mode”, verbosity controls) arrive via listening to real usage and conversation:

"Literally this was like Sunday night at 10pm...I just wrote this thing in like 30 minutes and then shipped it that night." (25:45)

4. Changing Landscape of Software Engineering

Productivity Leap:

"Productivity per engineer grew something like 70%...since Claude Code came out, productivity per engineer at Anthropic has grown 150%." (40:30)
"I think the title software engineer is going to go away...Everyone on our team codes—engineers, PMs, designers, EMs, finance." (43:37)
Generalists and Multi-Disciplinary Teams: The most successful engineers are either extreme specialists or “hypergeneralists.” (18:53)

5. Founder and Hiring Advice in the AI Era

Beginner’s Mindset & Humility:

"Engineers, as a discipline, we've learned to have very strong opinions...but it actually turns out a lot of this stuff just isn't relevant anymore." (16:05)
First Principles & Willingness to be Wrong:

"I sometimes ask about what's an example of when you're wrong...For me personally, I'm wrong probably half the time." (16:42)
Hiring via Agent Interaction: Evaluate candidates by how well they interact with AI agents—not just through traditional interviews.

"You can upload a transcript of you coding a feature with Claude Code...you can figure out how someone thinks..." (17:32)

6. Claude Code's Architecture and Usage Patterns

Terminal as a Form Factor:
- Minimal UI, highly iterative:
  
  "We felt there is no UI we could build that would still be relevant in six months because the model was improving so quickly." (08:35)
- Accidental longevity:
  
  "If you asked me this a year ago, I would have said the terminal has a three month lifespan..." (30:17)
Plan Mode and Beyond:
- Plan mode arose due to user workflows, but is expected to become obsolete as models improve.
  
  "Plan mode probably has a limited lifespan..." (25:07)
  "Claude Code can now enter plan mode by itself..." (25:19)

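Boris describes plan mode as adding a single sentence to the prompt asking the model not to code (25:26). A minimal sketch of that idea follows. The function name, the constant, and the exact wording of the sentence are assumptions for illustration; the source does not give the real prompt text.

```python
# Hypothetical wording: the source only says the added sentence is
# "like, please don't code".
PLAN_MODE_SENTENCE = "Please don't write any code yet; describe your plan first."


def build_prompt(user_request: str, plan_mode: bool = False) -> str:
    """Return the prompt text, with one extra sentence when plan mode is on."""
    if plan_mode:
        return f"{user_request}\n\n{PLAN_MODE_SENTENCE}"
    return user_request


if __name__ == "__main__":
    print(build_prompt("Add a login page", plan_mode=True))
```

Because the whole mechanism is one appended sentence, it is easy to drop once a model starts planning on its own, which fits the expectation above that plan mode has a limited lifespan.
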
7. Collaboration & Swarms of Agents

Agent Topologies:
- The product vision includes collaborative “swarms” of agents working in parallel, especially for complex tasks.
- Example: The entire Plugins feature was built by a swarm of agents with minimal human intervention—agents self-organized, wrote tasks, and executed.
  
  "The Plugins feature was entirely built by a swarm over a weekend..." (21:59, 23:02)

8. Broader Impact and Future Vision

Trajectory & Exponential Progress:

"Really all it is is you just trace the exponential...Coding will be generally solved for everyone." (43:37)
The Bitter Lesson Philosophy:
- Emphasizes the inevitability of more general learning systems winning over hand-crafted scaffolding/code.
  
  "Never bet against the model...assume that whatever the scaffolding is, it's just tech debt." (43:43, 37:43)
Mission-Driven Focus & Safety:
- Anthropic’s drive is not just about speed, but about building safe systems—AI safety is discussed constantly within the organization. (41:45)

Memorable Quotes & Moments

On Model-First Mindset:

"Build for the model of six months from now." — Boris Cherny (00:00, 43:43)
On Accidental Product-Market Fit:

"It just wants to use tools. That's how it wants." — Boris Cherny (05:24)
On Early Usefulness:

"Robert...just had Claude Code on his computer and he was using it to code. I was like, what? What are you doing? Like, this thing isn't ready. It's just a prototype. But, yeah, it was already useful in that form factor." — Boris Cherny (06:24)
On User Feedback:

"I remember early on, I tried to get rid of Bash output...everyone just revolted. I want to see my Bash." — Boris Cherny (12:05)
On Plan Mode:

"There's no big secret to it. All it does is it adds one sentence to the prompt that's like, please don't code." — Boris Cherny (25:26)
On Productivity Leap:

"The team doubled in size last year, but productivity per engineer grew something like 70%...since Claude Code came out, productivity per engineer at Anthropic has grown 150%." — Boris Cherny (40:30)
On the Future of Engineering:

"I think we're going to start to see the title software engineer go away." — Boris Cherny (43:37)
On the Bitter Lesson:

"The idea is the more general model will always beat the more specific model." — Boris Cherny (37:43)

Timestamps of Key Segments

| Time | Segment |
|-------|---------|
| 00:00 | Boris explains Anthropic’s model-first, six-months-ahead product philosophy |
| 01:19 | The inception of Claude Code; “three months, no vacation” |
| 04:23 | Early days: prototyping with the terminal, Bash, and tool use in Anthropic Labs |
| 05:24 | First “AGI moment”: model scripting AppleScript |
| 06:07 | Internal adoption: “What? You’re using this?” |
| 07:17 | Early use cases: automation, Git, unit tests, Markdown/Claude MD emerges |
| 09:01 | Boris shares his minimal Claude MD, importance of lean user instructions |
| 12:05 | Internal revolt against hiding Bash output—listening to users |
| 16:05 | Advice for founders: staying humble, beginner’s mindset, and learning to be wrong |
| 17:32 | Discussing hiring via agent/coding transcripts |
| 18:53 | Two archetypes of effective engineers: hyperspecialists and hypergeneralists |
| 21:59 | Vision for Claude Teams and agent topologies |
| 23:02 | Plugins feature “built by a swarm”—process and outcomes |
| 25:07 | Discussion of Plan Mode’s inevitable obsolescence |
| 28:47 | Latent demand & origins of Plan Mode |
| 30:17 | Terminal’s longevity, evolving interface experiments |
| 31:11 | DevTool founder advice: serving both engineers and agent desires |
| 34:43 | Product parallels: TypeScript’s practical design, build for how people really work |
| 37:43 | The Bitter Lesson: betting on general models over scaffolding, tech debt mindset |
| 39:27 | Codebase churn: “rewritten and rewritten,” rapid obsolescence |
| 40:30 | Internal productivity metrics—1000x improvements, industry transformation |
| 41:45 | Boris’s motivation: AI transition, safety, and mission-driven culture at Anthropic |
| 43:37 | The future: end of "software engineer" title, generalist teams, and dangers as AI scales |
| 46:21 | Claude Code’s adoption surge, NASA/Mars Rover, prevalence in startups |
| 47:47 | The rise of Cowork: non-technical use cases and GUI interfaces |
| 49:32 | Closing: host’s thanks, Boris’s invitation to send bugs |

Practical Takeaways for Founders & Builders

Build for the AI model that’s coming, not the one that exists today.
Search for “latent demand” by studying user behaviors; don’t force new workflows.
Ship minimal UI—function matters most when platform capabilities are evolving rapidly.
Invest in scientific mindset, humility, and rapid iteration—old expertise may not transfer.
Hire and partner with people who “think weird”—generalists and outliers can thrive now.
Assume all scaffolding and custom code is temporary; prefer waiting for model progress over hand-coded patches.
Be vigilant about AI safety: adopt a mission-driven culture as capabilities rapidly increase.

Notable Anecdotes

NASA uses Claude Code to plot the Mars Rover course (46:21).
The original plan mode feature was coded and shipped in 30 minutes based on user feedback (25:45).
Anthropic has a framed copy of Rich Sutton's The Bitter Lesson in the Claude Code team area, underscoring their philosophy (37:43).
“Plan mode” and other features are expected to become obsolete as models autonomously improve (25:07, 26:47).
Productivity at Anthropic has increased 1000x compared to Google in its peak engineering years (40:00).

End of summary.

Transcript

[00:00]
Boris Cherny
At Anthropic, the way that we thought about it is we don't build for the model of today, we build for the model six months from now. That's actually still my advice to founders that are building on LLMs. Just try to think about what is that frontier where the model is not very good at today because it's going to get good at it. All of Claude Code has just been written and rewritten and rewritten and rewritten over and over and over. There is no part of Claude Code that was around six months ago. You try a thing, you give it to users, you talk to users, you learn, and then eventually you might end up at a good idea. Sometimes you don't.
[00:27]
Ben Mann
Are you also in the back of your mind thinking that maybe in six months you won't need to prompt that explicitly? Model would just be good enough to figure out on its own?
[00:34]
Boris Cherny
Maybe in a month.
[00:36]
YC Partner
No more need for plan mode in a month.
[00:38]
Podcast Host
Oh, my God. Welcome to another episode of the Lightcone. And today we have an extremely special guest, Boris Cherny, the creator engineer of Claude Code. Boris, thanks for joining us.
[00:59]
Boris Cherny
Thanks for having me.
[00:59]
Podcast Host
Thanks for creating a thing that has taken away my sleep for about three weeks straight. I am very addicted to Claude Code and it feels like rocket boosters. Has it felt like this for people, like, for, you know, months? At this point, I think it was like end of November is where a lot of my friends said, like, something changed.
[01:19]
Boris Cherny
I remember for me, I felt this way when I first created Claude Code and I didn't yet know if I was onto something. I kind of felt like I was onto something. And then that's when I wasn't sleeping. And that was just like three straight months. This was September 2024. Yeah, it was like three straight months. I didn't take a single day of vacation, worked through the weekends, worked every single night. I was just like, oh, my God, I think this is going to be a thing. I don't know if it's useful yet because it couldn't actually code yet.
[01:45]
Podcast Host
If you look back on those moments to now, what would be the most surprising thing about this moment right now?
[01:53]
Boris Cherny
It's unbelievable that we're still using a terminal. That was supposed to be the starting point. I didn't think that would be the ending point. And then the second one is that it's even useful, because at the beginning, it didn't really write code. Even in February, when we GA'd it, it wrote maybe like 10% of my code or something like that. I didn't really use it to write code. It wasn't very good at it. I still wrote most of my code by hand. So the fact that actually our bets paid off and it got good at the thing that we thought it was going to get good at, because it wasn't obvious. At Anthropic, the way that we thought about it is we don't build for the model of today, we build for the model six months from now. And that's actually still my advice to founders that are building on LLMs, is just try to think about what is that frontier where the model is not very good at today because it's going to get good at it, and you just have to wait.
[02:38]
Ben Mann
Going back, do you remember when you first got the idea, can you just talk us through that? Was there a spark? Or what was even the first version of it in your mind?
[02:47]
Boris Cherny
You know, it's funny, it was so accidental that it just kind of evolved into this. At Anthropic, I think for Ant, the bet has been coding for a long time, and the bet has been the path to safe AGI is through coding. And this has kind of always been the idea. And the way you get there is you teach the model how to code, then you teach it how to use tools, then you teach it how to use computers. And you can kind of see that because the first team that I joined at Anthropic, it was called the Anthropic Labs team and it produced three products. It was Claude Code, MCP, and the desktop app. So you can kind of see how these weave together. The particular product that we built, no one asked me to build a CLI. We kind of knew maybe it was time to build some kind of coding product because it seemed like the model was ready, but no one had yet really built a product that harnessed this capability. So still there's this insane feeling of product overhang. But at the time it was just even crazier because no one had built this yet. And so I started hacking around and I was like, okay, we build a coding product. What do I have to do? First I have to understand how to use the API, because I hadn't used the Anthropic API at that point. And so I just built a little terminal app to use the API. That's all that it did. And it was a little chat app, because you think about the AI applications of the time and for non-coders today, what are most people using is just a chat app. So that's what I built. And it was in a terminal. I could ask questions, it gave answers. Then I think tool use came out. I just wanted to try out tool use because I didn't really understand what this was. I was like, tool use, this is cool. Is this actually useful? Probably not. Let me just try it.
[04:24]
Ben Mann
You built it in Terminal just because it was the easiest way to get something up and running?
[04:28]
Boris Cherny
Yes, because I didn't have to build a UI.
[04:29]
Ben Mann
Okay.
[04:30]
Boris Cherny
It was just me at that point.
[04:32]
Ben Mann
It was like the IDEs, Cursor, Windsurf, were the things that were really taking off. Were you sort of under any pressure or getting lots of suggestions of, hey, we should build this out as a plugin or as a fully featured IDE itself?
[04:43]
Boris Cherny
There was no pressure because we didn't even know what we wanted to build. The team was just in Explore mode. We knew vaguely we wanted to do something in coding, but it wasn't obvious what. No one was high-confidence enough. That was my job to figure out. And so I gave the model the Bash tool. That was the first tool that I gave it, just because I think that was literally the example in our docs. I just took the example. It was in Python. I just ported it to TypeScript because that's how I wrote it. I didn't know what the model could do with Bash, so I asked it to read a file. It could cat the files. Like, that was cool. And then I was like, okay, what can you actually do? And I asked it, what music am I listening to? It wrote some AppleScript to script my Mac and look up the music in my music player.
[05:23]
Podcast Host
Oh, my God.
[05:25]
Boris Cherny
And this was Sonnet 3.5. And I didn't think the model could do that. And that was my first, I think, ever feel-the-AGI moment, where I was just like, oh, my God, the model, it just wants to use tools. That's how it wants.
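The Bash tool Boris describes can be pictured as a tool description plus a handler that runs whatever command the model requests and hands the output back. The sketch below is an illustration under stated assumptions, not the docs example he mentions: `BASH_TOOL`, `run_bash`, and `handle_tool_call` are hypothetical names, and the dictionary only mimics a name/description/input-schema style of tool definition.

```python
import subprocess

# Hypothetical tool description in a name/description/input_schema shape;
# the wording is illustrative only.
BASH_TOOL = {
    "name": "bash",
    "description": "Run a shell command and return its output.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}


def run_bash(command: str, timeout_s: float = 30.0) -> str:
    """Execute a shell command and return combined stdout and stderr."""
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return f"error: command timed out after {timeout_s} seconds"
    return result.stdout + result.stderr


def handle_tool_call(name: str, tool_input: dict) -> str:
    """Dispatch a model-requested tool call to the matching handler."""
    if name == BASH_TOOL["name"]:
        return run_bash(tool_input["command"])
    return f"error: unknown tool {name!r}"


if __name__ == "__main__":
    # Stand-in for the "read a file" experiment: the model asks for a command,
    # the harness runs it and returns the text.
    print(handle_tool_call("bash", {"command": "echo hello"}))
```

A single general tool like this is what lets the model improvise, for example by writing AppleScript to query a music player. It also runs arbitrary commands, so a real harness would presumably want permission checks around it.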
[05:38]
YC Partner
That's kind of fascinating. I mean, it's very kind of contrarian that Claude Code works so well in such an elegant, simple form factor. I mean, terminals have been around for a really long time, and that seemed to be like a good design constraint that allowed a lot of interesting developer experiences. It doesn't feel like working, it just feels fun. As a developer, I don't think about files, where everything is. And that came by accident, almost.
[06:07]
Boris Cherny
Yeah, it was an accident. So after the terminal started to take off internally and honestly, after building this thing, I think two days after the first prototype, I started giving it to my team just for dogfooding, because if you come up with an idea, and it seems useful, the first thing you want to do is you want to give it to people to see how they use it. And then I came in the next day, and then Robert, who sits across from me, who's another engineer, he just had Claude Code on his computer and he was using it to code. I was like, what? What are you doing? Like, this thing isn't ready. It's just a prototype. But, yeah, it was already useful in that form factor. And I remember when we did our launch review to kind of launch Claude Code externally. This was in December, November or something like that, in 2024. Dario asked, and he was like, the usage chart internally, like, the DAU chart is, like, vertical. Are you, like, forcing engineers to use it? Like, why are you mandating them? And I was just like, no, no, we didn't. I just, like, posted about it and they'd just been like, telling each other about it. Honestly, it was just accidental. We started with the CLI because it was the cheapest thing, and it just kind of stayed there for a bit.
[07:09]
Ben Mann
So in that 2024 period, how were the engineers using it? Were they sort of shipping code with it yet, or were they using it in a different way?
[07:18]
Boris Cherny
The model is not very good at coding yet. I was using it personally for automating Git. I think at this point I've probably forgotten most of my Git because Claude Code's just been doing it for so long. But, yeah, automating Bash commands, that was a very early use case, and operating Kubernetes and kind of things like this. People were using it for coding. So there were some early signs of this. I think the first use case was actually writing unit tests because it's a little bit lower risk. And the model was still pretty bad at it, but people were kind of figuring it out and they were figuring out how to use this thing. And one thing that we saw is people started writing these markdown files for themselves and then having the model read that markdown file. And this is where CLAUDE.md came from. Probably the single, for me biggest principle in product is latent demand. And just every bit of this product is built through latent demand after the initial CLI. And so CLAUDE.md is an example of that. There's this other general principle that I think is maybe interesting, where you can build for the model and then you can build scaffolding around the model in order to improve performance a little bit. And depending on the domain, you can improve performance maybe 10, 20%, something like that, and then essentially the gain is wiped out with the next model. So either you can build the scaffolding and then get some performance gain and then rebuild it again, or you just wait for the next model and then you kind of get it for free. The CLAUDE.md and kind of the scaffolding is an example of that. And really, I think that's why we stayed in the CLI, is because we felt there is no UI we could build that would still be relevant in six months because the model was improving so quickly.
[08:49]
Podcast Host
Earlier we were saying like, we should compare CLAUDE.mds, but you said something very profound, which is, you know, yours is actually very short, which is almost like the opposite of what people might expect. Why is that? What's in your CLAUDE.md?
[09:02]
Boris Cherny
Okay, so I checked this before we came. So my CLAUDE.md has two things. One is it's just two lines. So the first line is whenever you put up a PR, enable auto-merge, so as soon as someone accepts it, it's merged. That's just so I can code and I don't have to kind of go back and forth with CR or whatever. And then the second one is whenever I put up a PR, post it in our internal team stamps channel, just so someone can stamp it and I can get unblocked. And the idea is every other instruction is in our CLAUDE.md that's checked into the code base, and it's something our entire team contributes to multiple times a week. And very often I'll see someone's PR and they make some mistake that's totally preventable. And I'll just literally tag Claude on the PR. I'll just do, like, @Claude, you know, like, add this to the CLAUDE.md, and I'll do this many times a week.
[09:51]
Podcast Host
Do you have to compact the CLAUDE.md? I definitely reached a point where I got the message at the top saying your CLAUDE.md is like thousands of tokens. Now what do you do when you guys hit that?
[10:02]
Boris Cherny
So our CLAUDE.md is actually pretty short. I think it's like a couple thousand tokens, something like that. If you hit this, my recommendation would be delete your CLAUDE.md and just start fresh.
[10:10]
Podcast Host
Interesting.
[10:11]
Boris Cherny
I think a lot of people, they try to over-engineer this, and really the capability changes with every model. And so the thing that you want is do the minimal possible thing in order to get the model on track. And so if you delete your CLAUDE.md and then, you know, the model is getting off track, it does the wrong thing, that's when you kind of add back a little bit at a time. And what you're probably going to find is with every model you have to add less and less. For me, I consider myself a pretty average engineer, to be honest. Like, I don't use a lot of fancy tools. Like I don't use like Vim, I use, you know, VS Code because it's somewhere I don't really.
[10:43]
Listener/Guest
Wait, wait, really? I would have assumed that because you built this in the terminal that you were sort of like a die-hard terminal, like, Vim-only person, you know, screw those VS Code people.
[10:52]
Boris Cherny
Well, we have people like that on the team. You know, like Adam Wolf for example, he's on the team, he's like, you will never take Vim from my cold, dead hands. Yeah, so there's definitely a lot of people like that on the team. And this is one of the things that I learned early on, is every engineer likes to hold their dev tools differently. They like to use different tools. There's just no one tool that works for everyone. But I think also this is one of the things that makes it possible for Claude Code to be so good, because I kind of think about it as what is the product that I would use that makes sense to me. And so to use Claude Code, you don't have to understand Vim, you don't have to understand tmux, you don't have to know how to, like, SSH, you don't have to know all this stuff. You just have to open up the tool and it'll guide you, it'll do all this stuff.
[11:29]
Podcast Host
How do you decide how verbose you want the terminal to be? Sometimes you have to go Ctrl+O and check it out. Is it internal bikeshed battles around longer, shorter? I mean, every user probably has a different opinion. How do you make those sorts of decisions?
[11:46]
Boris Cherny
What's your opinion? Is it too verbose right now?
[11:48]
Podcast Host
Oh, I love the verbosity because basically sometimes it just goes off the deep end and I'm watching and then I can just read very quickly and it's like, oh, no, no, it's not that. And then I escape and then just stop it. And then it just like stops an entire bug farm as it's happening. I mean, that's usually when I didn't do plan mode properly.
[12:06]
Boris Cherny
This is something that we probably change pretty often. I remember early on, this is maybe six months ago, I tried to get rid of Bash output just internally, just to like summarize it, because I was like, these giant long Bash commands, I don't actually care. And then I gave it to Anthropic employees for a day and everyone just revolted. I want to see my Bash, because it actually is quite useful. For something like Git output, maybe it's not useful, but if you're running Kubernetes jobs or something like this, you actually do want to see it. We recently hid the file reads and file searches, so you'll notice instead of saying read foo.md it'll say read one file, searched one pattern. And this is something I think we could not have shipped six months ago because the model just was not ready. It still read the wrong thing pretty often. As a user, you still had to be there and kind of catch it and debug it. But nowadays I just noticed it's on the right track almost every time. And because it's using tools so much, it's actually a lot better just to summarize it. But then we shipped it, we dogfooded it for like a month, and then people on GitHub didn't like it. So there was a big issue where people were like, no, I want to see the details. And that was really great feedback. And so we added a new verbose mode. And so that's just like in config you can enable verbose mode and if you want to see all the file outputs, you can continue to do that. And then I posted on the issue and people still didn't like it, which is again awesome because like my favorite thing in the world is just hearing people's feedback and hearing how they actually want to use it. And so we just like iterated more and more and more to get that really good and to make it the thing that people want.
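The summarized-versus-verbose display Boris describes can be sketched as a small formatting function. This is only an illustration of the idea, not Claude Code's implementation: `summarize_tool_calls`, the `LABELS` table, and the `verbose` flag are hypothetical names, and the exact output wording is an assumption.

```python
from collections import Counter

# Hypothetical past-tense verb and noun for each kind of tool call.
LABELS = {"read": ("read", "file"), "search": ("searched", "pattern")}


def summarize_tool_calls(calls: list[tuple[str, str]], verbose: bool = False) -> str:
    """Render (kind, target) tool calls as one summary line, or in full if verbose."""
    if verbose:
        # Verbose mode: one line per call, including the file or pattern.
        return "\n".join(f"{LABELS[kind][0]} {target}" for kind, target in calls)
    # Default mode: collapse calls into counts per kind, in first-seen order.
    counts = Counter(kind for kind, _ in calls)
    parts = []
    for kind, n in counts.items():
        verb, noun = LABELS[kind]
        parts.append(f"{verb} {n} {noun}{'' if n == 1 else 's'}")
    return ", ".join(parts)


if __name__ == "__main__":
    calls = [("read", "foo.md"), ("search", "TODO")]
    print(summarize_tool_calls(calls))
    print(summarize_tool_calls(calls, verbose=True))
```

Keeping the detail behind a flag reflects the compromise in the story: the default stays compact, and users who want to see every file can opt back in.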
[13:32]
Podcast Host
I'm amazed, like how much I enjoy fixing bugs now. And then all you have to do is have really good logging and then even just say like, hey, check out that, you know, this particular object, it messed up in this way, and it like searches the log, it figures everything out. It can like go into your... You can make a production tunnel and it'll look at your production DB for you. It's like, this is insane. Bug fixing is just going to Sentry, copy markdown. You know, pretty soon it's just going to be straight MCP. It's like an auto bug fixing, like, and test making. Sort of, what's the new term they call it, like a making a startup factory. Oh yeah, right. There's like all these concepts now of, rather than having to review the code... You know, I'm old school, so I like the verbosity. I like to say, oh, well, you're doing this, but I want you to do that. Right. But there's a totally different school of thought now that says, like, anytime a real human being has to look at code, that's bad.
[14:29]
Boris Cherny
Yeah, yeah, yeah.
[14:30]
Podcast Host
It's fascinating.
[14:31]
Boris Cherny
I think, like, Dan Shipper talks about this a lot as kind of whenever you see the model make a mistake, try to put it in the CLAUDE.md, try to put it in, like, skills or something like this. So it's reasonable. But I think there's this meta point that I actually struggle with a lot. And people talk about agents can do this, agents can do that, but actually what agents can do, it changes with every single model. And so sometimes there's a new person that joins the team and they actually use Claude Code more than I would have used it. And I'm just constantly surprised by this. For example, we had a memory leak and we were trying to debug it. And by the way, Jarred Sumner has just been on this crusade, killing all the memory leaks, and it's just been amazing. But before Jarred was on the team, I had to do this, and there was this memory leak. I was trying to debug it, and so I took a heap dump. I opened it in DevTools, I was looking through the profile, then I was looking through the code, and I was just trying to figure this out. And then another engineer on the team, Chris, he just asked Claude Code, he was like, hey, I think there's a memory leak. Can you run this and then try to figure it out? And Claude Code took the heap dump. It wrote a little tool for itself to analyze the heap dump, and then it found the leak faster than I did. And this is just something I have to constantly relearn because my brain is still stuck somewhere six months ago at times.
[15:44]
YC Partner
So what would be some advice for technical founders to really become maximalists at the latest model release? It sounds like people fresh off of school or that don't have any assumptions might be better suited than maybe sometimes engineers who have been working at it for a long time. And how do the experts get better?
[16:05]
Boris Cherny
I think for yourself, it's kind of beginner mindset and I don't know, maybe just like humility. I feel like engineers, as a discipline, we've learned to have very strong opinions, and senior engineers are kind of rewarded for this. In my old job at a big company, when I hired architects and this kind of type of engineer, you look for people that have a lot of experience and really strong opinions, but it actually turns out a lot of this stuff just isn't relevant anymore, and a lot of these opinions should change because the model is getting better. So I think actually the biggest skill is people that can think scientifically and can just think from first principles.
[16:38]
YC Partner
How do you screen for that when you try to hire someone now for your team?
[16:42]
Boris Cherny
I sometimes ask about what's an example of when you're wrong. It's a really good one. Some of these classic behavioral questions, not even coding questions, I think are quite useful because you can see if people can recognize their mistake in hindsight, if they can claim credit for the mistake, and if they learn something from it. And I think a lot of these very senior people, especially there are some founder types like this, but I think founders in particular are actually quite good at it. But other people sometimes will never really take. They'll never take the blame for a mistake. But I don't know, for me personally, I'm wrong probably half the time, half my ideas are bad. And you just have to try stuff. And you try a thing, you give it to users, you talk to users, you learn, and then eventually you might end up at a good idea. Sometimes you don't. And this is the skill that I think in the past was very important for founders, but now I think it's very important for every engineer.
[17:32]
Podcast Host
Do you think you would ever hire someone based on the Claude Code transcript of them working with the agent? Because we're actively doing that right now. We just added, just as a test, like, you can upload a transcript of you coding a feature with Claude Code or Codex or whatever it is. Personally, I think that, like, it's going to work. I mean, you can figure out how someone thinks, like, whether they're looking at the logs or not. Like, can they correct the agent if it goes off the rails? Like, do they use plan mode? You know, when they use plan mode, do they make sure that there are tests or, you know, all these different things that, you know, do they think about systems? Do they even understand systems? Like, there's just so much that's sort of embedded in that that I imagine I just want like a spider, a spider web graph, you know, like in those video games like NBA 2K. It's like, oh, this person's really good at shooting or defense. It's like you can imagine a spiderweb graph of, like, you know, someone's Claude Code skill level.
[18:30]
Boris Cherny
Yeah. What would the skills be? What would those be?
[18:33]
Podcast Host
I mean, I think it's like systems, testing, must be like, user behavior. I mean, there's got to be a
[18:38]
Boris Cherny
design part for sure, like product sense, maybe also just like automating stuff.
[18:43]
Podcast Host
My favorite thing in CLAUDE.md for me is I have a thing that says for every plan, decide whether it's overengineered, underengineered or perfectly engineered. And why.
[18:53]
Boris Cherny
I think this is something that we're trying to figure out too, because I think when I look at engineers on the team that I think are the most effective, there's essentially two. It's very bimodal. There's one side where it's extreme specialists. And so I named Jarred before, he's a really good example of this. And kind of the Bun team is a really good example. Just hyper-specialists. They understand dev tools better than anyone else. They understand JavaScript runtime systems better than anyone else. And then there's the flip side of kind of hyper-generalists. And that's kind of the rest of the team. And a lot of people, they span product and infra, or product and design, or product and user research, product and business. I really like to see people that just do weird stuff. I think that's one of these things that was kind of a warning sign in the past because it's like, can these people actually build something useful?
[19:38]
Podcast Host
That's the litmus test.
[19:39]
Boris Cherny
Yeah, that's the litmus test. But nowadays, for example, an engineer on the team, Daisy, she was on a different team and then she transferred onto our team. And the reason that I wanted her to transfer is she put up a PR for Claude Code, like a couple weeks after she joined or something. And the PR was to add a new feature to Claude Code. And then instead of just adding the feature, what she did is first she put up a PR to give Claude Code a tool so that it can test an arbitrary tool and verify that that works. And then she put up that PR and then she had Claude write its own tool instead of herself implementing it. And I think it's this kind of out-of-the-box thinking that is just so interesting because not a lot of people get it yet. You know, like, we use the Claude Agent SDK to automate pretty much every part of development. It automates code review, security review, it labels all of our issues, it shepherds things to production, it does pretty much everything for us. But I think externally I'm seeing a lot of people start to figure this out, but it's actually taken a while to figure out how do you use LLMs in this way? How do you use this new kind of automation? So it's kind of a new skill.
[20:41]
Podcast Host
I guess one of the funnier things that I've been having office hours with various founders about is you have sort of the visionary founder who has the idea. They've built this crystal palace of the product that they want to build. They've totally loaded in their brain, you know, who the user is and what they feel and what they're motivated by. And then they're sitting in Claude Code and they can do like, you know, 50x work. But they have engineers who work for them who, like, don't have the, you know, crystal memory palace of, like, the platonic ideal of the product that the founder has. And they can only do like 5x work. Are you hearing stories like that? There's usually a person who's like the core, like, designer of a thing and they're just like, you know, trying to blast it out of their brain. What's the nature of, like, teams like that? You know, it seems like that's almost a stable configuration. Like you're going to have the visionary who, like, now is unleashed, but, you know, maybe going back to the top of it. Like, I'm experiencing this right now. It's like, oh, well, I'm only a solo person and, you know, I need to eat and sleep and I have, you know, a whole job. It's like, how am I going to do this? You know?
[21:51]
Boris Cherny
You know, like, we just launched Claude teams and, you know, this is a way to do it, but you can also just build your own way to do it. It's pretty easy.
[21:57]
Podcast Host
What's the vision for Claude teams?
[21:59]
Boris Cherny
It's collaboration. There's this whole new field of agent topologies that people are exploring. What are the ways that you can configure agents? There's this one sub-idea, which is uncorrelated context windows, and the idea is just multiple agents. They have fresh context windows that aren't essentially polluted with each other's context or their own previous context. And if you throw more context at a problem, that's like a form of test-time compute, and so you just get more capability that way. And then if you have the right topology on top of it, so the agents can communicate in the right way, they're laid out in the right way, then they can just build bigger stuff. And so teams is kind of like one idea. There's a few more that are coming pretty soon, and the idea is just maybe it can build a little bit more. I think the first kind of big example where it worked is our plugins feature was entirely built by a swarm over a weekend. It just ran for a few days. There wasn't really human intervention, and plugins is pretty much in the form that it was when it came out.
[22:52]
Podcast Host
How did you set that up? Like, did you spec out sort of the outcome that you were hoping for and then let it sort of figure out the details and then let it run?
[23:02]
Boris Cherny
Yeah, an engineer on the team just gave Claude a spec and told Claude to use an Asana board. And then Claude just put up a bunch of tickets on Asana and then spawned a bunch of agents, and the agents started picking up tasks, and the main Claude just gave it instructions and they all just figured it out.
[23:19]
YC Partner
The independent agents that didn't have the context of the bigger spec.
[23:23]
Boris Cherny
Right, right. If you think about the way that agents are actually started nowadays, and I haven't pulled the data on this, but I would bet the majority of agents are actually prompted by Claude today in the form of subagents. Because a subagent is just like a recursive Claude Code. That's all it is in the code, and it's just prompted by, we call her Mama Claude, and that's all it is. And I think probably if you look at most agents that are launched in
[23:47]
Ben Mann
this way, my Claude insights just told me to do this more for debugging, so that I get like, I spent a lot of time on debugging and it would just be better to have multiple subagents spin up and debug something in parallel. And so then I just added that to my CLAUDE.md to just be like, hey, next time you try and fix a bug, have one agent that looks in the logs, like, one that looks in the code path.
[24:07]
Podcast Host
That just seems sort of inevitable for weird, scary bugs. I try to fix bugs in plan mode and then it seems to use the agents to sort of search everything. Whereas like, when you're just trying to do it in line, it's like, okay, okay, I'm going to do this one task instead of search wide.
[24:23]
Boris Cherny
This is something I do all the time too. I just say if the task seems kind of hard, this kind of research task, I'll calibrate the number of subagents I ask it to use based on the difficulty of the task. So if it's really hard, I'll say use three or maybe five or even 10 subagents, research in parallel and then see what they come up with.
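A minimal sketch of the fan-out pattern described here, in TypeScript. Everything named below (`Subagent`, `Difficulty`, `SUBAGENTS_BY_DIFFICULTY`, `buildSubagentPrompts`, `fanOut`) is hypothetical and for illustration only. It is not Claude Code's API, and in practice the same effect comes from simply asking for more subagents in the prompt.

```typescript
// Hypothetical: a "subagent" is any async function that takes a prompt
// and returns its findings as text. This is not a real Claude Code API.
type Subagent = (prompt: string) => Promise<string>;

type Difficulty = "moderate" | "hard" | "very hard";

// Scale the number of parallel researchers with task difficulty,
// mirroring the "three or maybe five or even 10" heuristic from the talk.
const SUBAGENTS_BY_DIFFICULTY: Record<Difficulty, number> = {
  moderate: 3,
  hard: 5,
  "very hard": 10,
};

// Give each subagent the same task but a different angle, so each one
// starts from a fresh context focused on a single line of inquiry.
function buildSubagentPrompts(
  task: string,
  angles: string[],
  difficulty: Difficulty,
): string[] {
  return angles
    .slice(0, SUBAGENTS_BY_DIFFICULTY[difficulty])
    .map((angle) => `${task}\nInvestigate only this angle: ${angle}`);
}

// Run all subagents concurrently and collect their findings in order.
async function fanOut(prompts: string[], run: Subagent): Promise<string[]> {
  return Promise.all(prompts.map((prompt) => run(prompt)));
}
```

Given a stub `run` function, `fanOut(buildSubagentPrompts(task, angles, "hard"), run)` resolves to one result per angle, capped at five.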
[24:39]
Ben Mann
I'm curious though, then why don't you put that in your CLAUDE.md file?
[24:43]
Boris Cherny
It's kind of case by case. CLAUDE.md, what is it? It's a shortcut. If you find yourself repeating the same thing over and over, you put it in the CLAUDE.md. But otherwise you don't have to put everything there. You can just prompt Claude.
[24:55]
Ben Mann
Are you also in the back of your mind thinking that maybe in six months you won't need to prompt that explicitly? The model will just be good enough to figure it out on its own.
[25:03]
Boris Cherny
Maybe in a month.
[25:05]
YC Partner
No more need for plan mode in a month.
[25:07]
Podcast Host
Oh, my God.
[25:08]
Boris Cherny
I think plan mode probably has a limited lifespan.
[25:10]
Podcast Host
Interesting.
[25:11]
YC Partner
That's some alpha for everyone here. What would the world look like without plan mode? Do you just describe it at the prompt level, and it would just one-shot it?
[25:19]
Boris Cherny
Yeah. We've started experimenting with this because Claude Code can now enter plan mode by itself. I don't know if you guys have seen that.
[25:26]
Podcast Host
Yeah.
[25:27]
Boris Cherny
So we're trying to kind of get this experience really good. So it would enter plan mode at the same point where a human would have wanted to enter it. So I think it's something like this. But actually plan mode, there's no big secret to it. All it does is it adds one sentence to the prompt that's like, please don't code. That's all it is. You can actually just say that.
[25:46]
YC Partner
So it sounds like a lot of the feature development for Claude Code is very much what we talk about at YC: talk to your users and then you come and implement it. It wasn't the other way, that you had this master plan and then implemented all the features?
[25:58]
Boris Cherny
Yeah, yeah. I mean, that's all it was. Plan mode was. We saw users that were like, hey, Claude, come up with an idea. Plan this out, but don't write any code yet. And there was kind of various versions of this. Sometimes it was just talking through an idea, sometimes it was these very sophisticated specs that they were asking Claude to write. But the common dimension was do a thing without coding yet. And so literally this was like Sunday night at 10pm I was just looking at GitHub issues and kind of seeing what people were talking about and looking at our internal Slack feedback channel. And I just wrote this thing in like 30 minutes and then shipped it that night. It went out Monday morning. That was plan mode.
[26:31]
Ben Mann
So do you mean that there will be no need for plan mode in the sense of. I'm worried that the model is going to do, like, it's going to do like the wrong thing or head off in the wrong direction, but there will still be a need for that. You need to think through the idea and figure out exactly what it is that you want. And you have to do that somewhere.
[26:47]
Boris Cherny
I kind of think about it in terms of increasing model capabilities. So maybe six months ago a plan was insufficient. So you get Claude to make a plan. Let's say even with plan mode, you still have to kind of sit there and babysit because it can go off track. Nowadays what I do is probably 80% of my sessions. I say plan mode has a limited lifespan, but I'm a heavy plan mode user. Probably 80 percent of my sessions I start in plan mode and Claude will, you know, it'll start making a plan. I'll move on to my second terminal tab and then I'll have it make another plan. And then when I run out of tabs, I open the desktop app and then I go to the Code tab and then I just start a bunch of tabs there and they all start in plan mode. Probably like 80% of the time, once the plan is good, and sometimes it takes a little back and forth, you just get Claude to execute. And nowadays what I find with Opus 4.5, I think it started with 4.6, it got really good. Once the plan is good, it just stays on track and it'll just do the thing exactly right almost every time. And so before, you had to babysit after the plan and before the plan. Now it's just before the plan. So maybe the next thing is you just won't have to babysit. You can just kind of give a prompt and Claude will figure it out.
[27:52]
Podcast Host
The next step is Claude just speaks to your users directly.
[27:56]
Ben Mann
Yeah, it just bypasses you entirely.
[27:58]
Boris Cherny
It's funny. This is actually the current state for us. Our Claudes actually, like, they talk to each other. They talk to our users on Slack, at least internally, pretty often. My Claude will, like, tweet once in a while.
[28:08]
Podcast Host
No way.
[28:09]
Boris Cherny
But I actually like delete it. It's just like, it's a little cheesy. I don't love the tone.
[28:14]
Podcast Host
What does it want to tweet about?
[28:16]
Boris Cherny
Sometimes it'll just respond to someone because I always have Cowork running in the background. And it's the Cowork Claude that really loves to do that because it likes using a browser.
[28:24]
Podcast Host
That's funny.
[28:24]
Boris Cherny
A really common pattern is I ask Claude to build something. It'll look in the codebase, it'll see some engineer touched something in the git blame, and then it'll message that engineer on Slack, just like asking a clarifying question. And then once it gets the answer back, it'll keep going.
[28:37]
YC Partner
What are some tips for founders now on how to build for the future? Sounds like everything is really changing. What are some principles that will stay on and what will change?
[28:48]
Boris Cherny
So I think some of these are pretty basic, but I think they're even more important now than they were before. So one example is latent demand. Like I mentioned it a thousand times, for me it's just like the single biggest idea in product. It's a thing that no one understands. It's a thing I certainly did not understand my first few startups. And the idea is people will only do a thing that they already do. You can't get people to do a new thing. If people are trying to do a thing and you make it easier, that's a good idea. But if people are doing a thing and you try to make them do a different thing, they're not going to do that. And so you just have to make the thing that they're trying to do easier. And I think Claude is going to get increasingly good at figuring out these kind of product ideas for you. Just because it can look at feedback, it can look at debug logs, it can kind of figure this out.
[29:29]
Ben Mann
That's what you mean by plan mode was latent demand: that people already had their Claude chat window open in a browser and were talking to it to figure out the spec and what it should do. And now plan mode just became that. You just do it in Claude Code?
[29:45]
Boris Cherny
Yeah, yeah, that's it. Sometimes what I'll do is I'll just walk around the office on our floor and I'll just kind of stand behind people. I'll say, like, hi, so it's not creepy. And then I'll just see kind of like how they're using Claude Code. And this is also just something I saw a lot, but it also came up in GitHub issues, like people were
[30:01]
Ben Mann
talking about it, it seems like. So you're surprised how far the terminal has gone and how far it's been pushed? How far do you think it has left to go? Just given with this world of small multiple agents, do you think there's going to be a need for a different UI on top of it?
[30:18]
Boris Cherny
It's funny, if you asked me this a year ago, I would have said the terminal has a three-month lifespan and then we're going to move on to the next thing. And you can see us experimenting with this. Right, because Claude Code started in a terminal, but now it's in, you know, it's on web, like claude.ai/code, it's in the desktop app. You know, we've had that for, you know, like three months or six months or something. Just in the Code tab, it's in the iOS and Android apps, just like in the Code tab, it's in Slack, it's in GitHub. There's VS Code extensions, there's JetBrains extensions. So we're just like, we're always experimenting with different form factors for this thing to figure out what's the next thing. I've been wrong so far about the lifespan of the CLI, so I'm probably not the person to forecast this.
[30:57]
Ben Mann
What about your advice to devtool founders? Someone's building a devtool company today. Should they just be building for engineers and humans, or should they be thinking more about what Claude is going to think and want, and build for, sort of, like, the agent?
[31:11]
Boris Cherny
The way I would frame it is think about the thing that the model wants to do and figure out, how do you make that easier? And that's something that we saw. When I first started hacking on Claude Code, I realized this thing just wants to use tools. It just wants to interact with the world. And how do you enable that? Well, the way you don't do it is you put it in a box and you're like, here's the API, here's how you interact with me, and here's how you interact with the world. The way you do it is you see what tools it wants to use, you see what it's trying to do, and you enable that the same way that you do for your users. And so if you're building a devtool startup, I would think about what is the problem you want to solve for the user. And then when you apply the model to solving this problem, what is the thing the model wants to do? And then what is the technical and product solution that serves the latent demand of both?
[31:56]
Podcast Host
YC's next batch is now taking applications. Got a startup in you? Apply at ycombinator.com/apply. It's never too early, and filling out the app will level up your idea. Okay, back to the video.
[32:10]
YC Partner
Back in the day, more than 10 years ago, you were a very heavy user and you wrote a book about TypeScript, right, before TypeScript was cool. This is when everyone was deep in JavaScript. This is back in the early 2010s, right?
[32:26]
Boris Cherny
Yeah, something like that.
[32:27]
YC Partner
Before TypeScript was a thing. Because back then it was a very weird language. It's not supposed to do a lot of things with being typed in JavaScript, and now it's the right thing. And it feels like Claude Code in the terminal has a lot of parallels with TypeScript.
[32:43]
Boris Cherny
At the beginning, TypeScript makes a lot of really weird language decisions. So if you look at the type system, pretty much anything can be a literal type, for example. And this is super weird because even Haskell doesn't even do this. It's just like it's too extreme or it has conditional types, which I don't think any language thought of at all.
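For readers who have not used the two features mentioned here, a short TypeScript sketch of literal types and conditional types. The names (`Mode`, `Port`, `ElementOf`, `describeMode`) are invented for illustration and do not come from the conversation.

```typescript
// A string literal type: only these exact values are allowed.
type Mode = "plan" | "execute";

// Numbers (and booleans) can be literal types too.
type Port = 80 | 443;

// A conditional type: choose a result type by testing another type.
// If T is an array, produce its element type; otherwise produce T itself.
type ElementOf<T> = T extends readonly (infer U)[] ? U : T;

// `as const` keeps the literal types instead of widening to string[].
const modes = ["plan", "execute"] as const;

// Resolves to the union "plan" | "execute", the same as Mode.
type ModeFromArray = ElementOf<typeof modes>;

function describeMode(mode: ModeFromArray): string {
  return mode === "plan" ? "think first, no edits" : "make the changes";
}

const httpsPort: Port = 443; // 8080 here would be a compile-time error
```

Calling `describeMode("review")` is rejected by the compiler, because `"review"` is not one of the literal values in the union.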
[33:05]
YC Partner
It was very strongly typed.
[33:07]
Boris Cherny
Yeah, it was very strongly typed. And the idea was when Joe Pamer and Anders and the early team was building this thing, the way they built it is okay. We have these teams with these big untyped JavaScript code bases. We have to get types in there, but we're not going to get engineers to change the way that they code. You're not going to get JavaScript people to have 15 layers of class inheritance like you would a Java programmer. They're going to write code the way they're going to write it. They're going to use reflection and they're going to use mutation and they're going to use all these features that traditionally are very, very difficult to type.
[33:37]
YC Partner
They're a very unsafe type to any strong functional programmer.
[33:40]
Boris Cherny
That's right. And so the thing that they did, instead of getting people to kind of change the way that they code, they built a type system around this. And it was just brilliant because there's all these ideas that no one was thinking about. Even in academia, no one thought of a bunch of these ideas. It purely came out of the practice of observing people and seeing how JavaScript programmers want to write code. And so for Claude Code, there are some ideas that are kind of similar in that you can use it like a Unix utility, you can pipe into it, you can pipe out of it. In some ways it is kind of rigorous in this way, but in almost every other way, it's just the tool that we wanted. I built a tool for myself, and then the team built the tool for themselves and then for Anthropic employees, and then for users, and it just ends up being really useful. It's not this principled and academic thing,
[34:28]
YC Partner
which I think the proof is actually in the results. Now, fast forward more than 15 years later, not many code bases are in Haskell, which is more academic, and there's tons of them now on TypeScript because it's way more practical, which is interesting.
[34:43]
Boris Cherny
Yeah, it is interesting, right. It's like TypeScript solves a problem, I guess.
[34:46]
YC Partner
One thing that's cool, I don't know how many people know, but the terminal is actually one of the most beautiful terminal apps out there and is actually written with React terminal.
[34:57]
Boris Cherny
When I first started building it, you know, like I did front-end engineering for a while. And I was also like a, you know, I'm sort of like a hybrid. Like I do like design and user research and, you know, write code and all this stuff, and we love hiring engineers that are like this. We just, we love generalists. So for me it's like, okay, I'm building a thing for the terminal. I'm actually kind of a shitty vim user. So like how do I build a thing for people like me that, you know, are going to be working in a terminal? And I think just the delight is so important, and I feel like at YC this is something you talk about a lot, right? It's like build a thing that people love.
[35:28]
Ben Mann
Build a.
[35:29]
Boris Cherny
If the product is useful but you don't fall in love with it, that's not great. So it kind of has to do both. Designing for the terminal, honestly, has been hard, right? It's like 80 by 100 characters or whatever. You have like 256 colors, you have one font size. You don't have mouse interactions. There's all this stuff you can't do and there's all these very hard trade-offs. So a little-known thing, for example, is you can actually enable mouse interactions in a terminal, so you can enable clicking and stuff.
[35:53]
Listener/Guest
Oh, how do you do that in Claude Code? I've been trying to figure out how to do this.
[35:57]
Boris Cherny
We don't have it in Claude Code because we actually prototyped it a few times and it felt really bad, because the trade-off is you have to virtualize scrolling, and so there's all these weird trade-offs because, like, the way terminals work is like there's no DOM, right? It's like there's like ANSI escape codes and these kind of weird organically evolved specs since like the 1960s or whatever.
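For context on what enabling mouse interactions involves: xterm-compatible terminals start reporting clicks once an application writes certain escape sequences, and the clicks then arrive as more escape sequences on standard input. The TypeScript sketch below uses the widely supported xterm private modes 1000 (button press and release reporting) and 1006 (SGR coordinate encoding). The names `ENABLE_MOUSE`, `DISABLE_MOUSE`, `TerminalMouseEvent`, and `parseSgrMouse` are illustrative, and this is not necessarily how the Claude Code prototypes were built.

```typescript
// Escape sequences an app writes to stdout to turn mouse reporting on/off.
// 1000 = report button press/release, 1006 = SGR (decimal) encoding.
const ENABLE_MOUSE = "\x1b[?1000h\x1b[?1006h";
const DISABLE_MOUSE = "\x1b[?1006l\x1b[?1000l";

interface TerminalMouseEvent {
  button: number; // raw button code; 0 is a plain left-button event
  column: number; // 1-based column of the cell under the pointer
  row: number; // 1-based row of the cell under the pointer
  pressed: boolean; // true for press ("M"), false for release ("m")
}

// Parse one SGR mouse report of the form ESC [ < button ; column ; row M|m.
// Returns null when the input is not a complete SGR mouse report.
function parseSgrMouse(input: string): TerminalMouseEvent | null {
  const match = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/.exec(input);
  if (!match) return null;
  return {
    button: Number(match[1]),
    column: Number(match[2]),
    row: Number(match[3]),
    pressed: match[4] === "M",
  };
}
```

Once mouse reporting is on, the terminal typically stops handling wheel scrolling and text selection itself, which is the kind of trade-off mentioned above: the application has to take over scrolling.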
[36:14]
Podcast Host
Yeah, it feels like BBS's. It's like a BBS door game.
[36:17]
Boris Cherny
Yeah, yeah, yeah.
[36:18]
Podcast Host
Oh my gosh.
[36:18]
Boris Cherny
That's like, that's like a great compliment.
[36:20]
Podcast Host
Yeah, yeah.
[36:20]
Boris Cherny
Like it should feel like you're discovering
[36:22]
Podcast Host
Lord of the Red Dragons. Fantastic. Oh my God.
[36:25]
Boris Cherny
Yeah. But we've had to just like discover all these kind of UX principles for building for the terminal because no one really writes about this stuff. And if you look at the big terminal apps of the 80s or 90s or 2000s or whatever, they use ncurses and they have all these windows and things like this, and it just looks kind of janky by modern standards. It just looks too heavy and complicated. And so we had to reinvent a lot. And for example, something like the terminal spinner, just like the spinner words, it's gone through probably, I want to say, like 50, maybe 100 iterations at this point, and probably 80% of those didn't ship. So we tried it. It didn't feel good. Move on to the next one. Try it. It didn't feel good. Move on to the next one. And this was sort of one of the amazing things about Claude Code, right? Is like, you can write these prototypes and you can just do like 20 prototypes back to back, see which one you like, and then ship that. And the whole thing takes maybe a couple hours. Whereas in the past, what you would have had to do is learn to use Origami or Framer or something like this. You built maybe three prototypes. It took two weeks. It just took much, much longer. And so we have this luxury of, we have to discover this new thing. We have to build a thing. We don't know what the right endpoint is, but we can iterate there so quickly. And that's what makes it really easy. And that's what lets us build a product that's joyous and that people like to use.
[37:37]
Listener/Guest
Boris, you had other advice for builders, and we kept interrupting you because we have so many questions.
[37:44]
Boris Cherny
I would say, okay, so maybe two pieces of advice that are kind of weird because it's about building for the model. So one is don't build for the model of today. Build for the model of six months from now. This is sort of weird, right? Because you can't find PMF if the product doesn't work. But actually, this is the thing that you should do, because otherwise what will happen is you spend a bunch of work, you find PMF for the product right now, and then you're just going to get leapfrogged by someone else because they're building for the next model, and a new model comes out every few months. Use the model, feel out the boundary of what it can do, and then build for the model that you think will be the model maybe six months from now. I think the second thing is, actually, in the Claude Code area where we sit, we have a framed copy of The Bitter Lesson on the wall. And this is this Rich Sutton blog post. Everyone should read it if you haven't. And the idea is the more general model will always beat the more specific model. And there's a lot of corollaries to this. But essentially what it boils down to is never bet against the model. And so this is just like a thing that we always think about, where we could build a feature into Claude Code, we could make it better as a product. And we call this scaffolding. It's all this code that's not the model itself. But we could also just wait a couple months and the model can probably just do the thing instead. And there's always this trade-off. Right. It's like engineering work now, and you can kind of extend the capability a little bit, maybe 10, 20% or whatever, in whatever domain on this spider chart of what you're trying to extend, or you can just wait and the next model will do it. So just always think in terms of this trade-off. Where do you actually want to invest? And assume that whatever the scaffolding is, it's just tech debt.
[39:17]
YC Partner
How often do you rewrite the codebase of Claude Code? Is this every six months with this
[39:22]
Listener/Guest
first principle, is there scaffolding that you've deleted because you don't need it anymore because the model just improved oh so much?
[39:28]
Boris Cherny
Yeah. All of Claude Code has just been written and rewritten and rewritten and rewritten over and over and over. We unship tools every couple weeks, we add new tools every couple weeks. There's no part of Claude Code that was around six months ago. It's just constantly rewritten.
[39:41]
YC Partner
Would you say most of the codebase for current Claude Code, say 80% of it, is less than a couple months old?
[39:48]
Boris Cherny
Yeah, definitely. It might even be like less than. Yeah, maybe a couple months. That feels about right.
[39:53]
YC Partner
So the lifecycle of code now, that's another alpha: expecting the shelf life to be just a couple months. Yeah, for the best founders.
[40:00]
Podcast Host
Did you see Steve Yegge's post about how awesome working at Anthropic is? And I think there's a line in there that says that an Anthropic engineer currently averages 1000x more productivity than a Google engineer at Google's peak, which is really an insane number, honestly, like 1000x. Like, you know, three years ago, we were still talking about 10x engineers. Now we're talking about 1000x on top of a Google engineer in their prime. Like this is unbelievable, honestly.
[40:31]
YC Partner
Yeah.
[40:31]
Boris Cherny
I mean, internally, if you look at technical employees, they all use Claude Code every day. And even non-technical employees. I think like half the sales team uses Claude Code. They've started switching to Cowork because it's a little easier to use. It has a VM, so it's a little bit safer. But yeah, we just pulled a stat and I think the team doubled in size last year, but productivity per engineer grew something like 70% as measured by just like the simplest, stupidest measure, pull requests. But we also kind of cross-check that against commits and the lifetime of commits and things like this. And since Claude Code came out, productivity per engineer at Anthropic has grown 150%.
[41:07]
Podcast Host
Oh my God.
[41:08]
Boris Cherny
And this is crazy because in my old life I was responsible for code quality at Meta, and I was responsible for the quality of all of our codebases across every product, across Facebook, Instagram, WhatsApp, whatever. And one of the things that the team worked on was improving productivity. And back then, seeing a gain of something like 2% in productivity, that was a year of work by hundreds of people. And so this like 100%, this is just like unheard of, just completely unheard of.
[41:35]
Podcast Host
What drove you to come over to Anthropic? I mean, basically, as a builder, you could go anywhere. What was the moment that made you say like, actually this is the set of people or this is the approach.
[41:45]
Boris Cherny
I was living in rural Japan and I was opening up Hacker News every morning and I was reading the news and it was all, it just started to be like AI stuff at some point. And I started to use some of these early products. And I remember the first couple times that I used it, I was just like, it just took my breath away. That's very cheesy to say, but that was actually the feeling. It was amazing. As a builder, I'd just never felt this feeling, using these very, very early products. That was in the Claude 2 days or something like that. And so I just started talking to friends at labs just to kind of see what was going on. And I met Ben Mann, who's one of the founders at Anthropic, and he just immediately won me over. And as soon as I met kind of the rest of the team at Ant, it just won me over, and I think probably in two ways. So one is it operates as a research lab. So the product work was teeny, teeny tiny. It's really all about building a safe model. That's all that matters. And so this idea of just being very close to the model and being very close to development and being not the most important thing, because the product isn't anymore. It's just the model is the thing that's the most important. That really resonated with me after building product for many years. And then the second thing was just how mission-driven it is. I'm a huge sci-fi reader. My bookshelf is just filled with sci-fi. And so I just know how bad this can go. And when I kind of think about what's going to happen this year, it's going to be totally insane. And in the worst case, it can go very, very bad. And so I just wanted to be at a place that really understood that and kind of really internalized that. And at Ant, if you overhear conversations in the lunchroom or in the hallway, people are talking about AI safety. This is really the thing that everyone cares about more than anything. And so I just wanted to be in a place like that. I know for me personally, mission is just so important.
[43:37]
Listener/Guest
What is going to happen this year?
[43:40]
Boris Czerny
Okay, so if you think back six months ago and what are the predictions that people are making? So Dario predicted that 90% of the code at Anthropic would be written by Quad. This is true for me personally. It's been 100% since Opus 4.5. I uninstalled my IDE. I don't edit a single line of code by hand. It's just 100% quad code. And Opus and I land 20 PR a day, every day. If you look at Anthropic overall, it ranges between 70 to 90% depending on the team. For a lot of teams it's also like 100%. For a lot of people, it's 100%. And I remember making this prediction back in may when we GA'd quad code that you wouldn't need an IDE to code anymore. And it was totally crazy to say. I feel like people in the audience gasped because it was such a silly prediction at the time, but really all it is is you just trace the exponential. And this is just so deep in the DNA at ant, because three of our founders were co authors of the scaling laws paper, they saw this very early. And so this is just like tracing the exponential. This is what's going to happen. And yes, that happened. So continuing to trace the exponential, I think what will happen is coding will be generally solved for everyone. And I think today coding is practically solved for me. And I think it'll be the case for everyone, regardless of domain. I think we're going to start to see the title software engineer go away. And I think it's just going to be maybe builder, maybe product manager, maybe we'll keep the title as kind of a vestigial thing. But the work that people do, it's not just going to be coding software. Engineers are also going to be writing specs, they're going to be talking to users. This thing that we're starting to see right now in our team, where engineers are very much generalists, and every single function on our team codes like RPM's code, our designers code, our EM codes, our finance guy codes, everyone on our team codes. We're going to start to see this everywhere. So this is sort of. 
This is kind of like the lower bound, if we just continue the trend. The upper bound, I think, is a lot scarier. And this is something like we hit ASL-4. At Anthropic, we talked about these safety levels. ASL-3 is where the models are right now. ASL-4 is the model is recursively self-improving. And so if this happens, essentially we have to meet a bunch of criteria before we can release a model. And so the extreme is that this happens or there's some kind of catastrophic misuse, like people are using the model to design bioviruses, design zero days, stuff like this. And this is something that we're really, really actively working on so that doesn't happen. Honestly, it's just been so exciting and humbling seeing how people are using Claude Code. I just wanted to build a cool thing and it ended up being really useful. And that was so surprising and so exciting.
[46:21]
Ben Mann
My impression from Twitter or just the outside is basically everyone went away over the holidays and then found out about Claude Code, and it's just been crazy ever since. Is that how it was for you internally? Were you having a nice Christmas break and then came back? What happened?
[46:37]
Boris Cherny
Well, actually, for all of December, I was traveling around and I took a coding vacation. So we were kind of traveling around and I was just coding every day, so that was really nice. And then I also started to use Twitter at the time, because I worked on Threads back then, way back when. So I've been a Threads user for a while, so I just tried to see kind of other platforms where people are. Yeah, I think for a lot of people, that was the moment where they discovered Opus 4.5. I kind of already knew. And internally, Claude Code's just been on this exponential tear for many, many months now. So that just became even more steep. That's what we saw. And if you look at Claude Code now, there was some stat from Mercury that 70% of startups are choosing Claude as their model of choice. There was some other stat from SemiAnalysis that 4% of all public commits are made by Claude Code. Of all code written everywhere. All the companies use Claude Code, from the biggest companies to the smallest startups. It plotted the course for Perseverance, the Mars rover. This is the coolest thing for me. We even printed posters because the team was like, wow, this is just so cool that NASA chooses to use this thing. So yeah, it's humbling, but it also feels like the very beginning.
[47:47]
Podcast Host
What's the sort of interaction between Claude Code and then Cowork? Was it a fork of Claude Code? Was it like you had Claude Code look at the Claude Code code and say, let's make a new spec for non-technical people that keeps all the lessons? And then it sort of went off for a couple days and did that? What's the genesis of that and where do you think that goes?
[48:11]
Boris Cherny
This is going to be like my fifth time using the words latent demand. It was just that. I mean, we were looking at Twitter and there was that one guy that was using Claude Code to monitor his tomato plants. There was like this other person that was using it to recover wedding photos off of a corrupted hard drive. There were people that were using it for finance. When we looked internally at Anthropic, every designer is using it. The entire finance team at this point is using it. The entire data science team is using it. Not for coding. People are jumping through hoops to install a thing in the terminal so that they could use this. So we knew for a while that we wanted to build something and so we were experimenting with a bunch of different ideas. And the thing that kind of took off was just a little Claude Code wrapper in a GUI in the desktop app. And that's all it is. It's just Claude Code under the hood. It's the same agent.
[48:55]
Podcast Host
Oh wow.
[48:56]
Boris Cherny
And Felix and the team, Felix was an early Electron contributor. He kind of knows that stack really well and he was hacking on various ideas and they built it in I think something like 10 days. It was just like 100% written by Claude Code and it just felt ready to release. There was a lot of stuff that we had to build for non-technical users. So it's a little bit different than for a technical audience. All the code runs in a virtual machine. There's a lot of protections for deletion and things like this. There's a lot of permission prompting and kind of other guardrails for users. Yeah, it was honestly pretty obvious.
[49:32]
Podcast Host
Boris, thank you so much for making something that is taking away all my sleep. But in return, it's making me feel creator mode again, sort of founder mode again. It's been an exhilarating three weeks. I can't believe I waited that long, since November to actually get into it. Thank you so much for being with us. Thank you for building what you're building.
[49:53]
Boris Cherny
Yeah, thanks for having me. And send bugs.
[49:56]
Podcast Host
Sounds good.