A
Hey, Gideon.
B
Hey, Tyler.
A
Thanks so much for being here.
B
Thank you for having me.
A
So I feel like we're at this funny point right now with AI where we've been told for years that it was going to replace people like you and me, you know, writers, editors, people in the humanities. And instead we're seeing something where it actually looks like it's the coders who are most at risk. I mean, there was this huge stock market sell-off of software stocks, and you see software engineers in particular online kind of grieving about their jobs and just this feeling that, like, the work that they used to do that was so important is no longer that crucial anymore, or it can be done by AI much faster than they were able to do it. And so, given that you've just reported on Anthropic, an AI company that is full of people who seemingly, to me, are kind of at risk of being replaced themselves by the tool that they're creating, what was the feeling there? Like, I mean, how are the engineers at Anthropic thinking about this problem?
B
Yeah, I mean, this is something that came up constantly at Anthropic, starting when I first visited last spring, that they were feeling like, you know, we are really the canaries in the coal mine here. And they thought, well, there are all these people who feel like we're not actually paying attention to the effects that this might have on the white collar workforce when, like, no, we're the first people being impacted by this. I mean, I watched over the course of, you know, May, when I first visited Anthropic, through the fall, software engineers would tell me, you know, over the past four or five months, I've watched, like, the amount of coding that I do by hand go from 100% to 60%. And then by September, it was 20%. And now it's, you know, during fact checking, one of the people said, well, now it's actually zero percent that I do. And there's an Anthropic employee named Alex Tamkin, a really wonderful, warm guy, who had sent a Slack message to his team at 4:17 in the morning saying, now I have to figure out what I'm supposed to be doing while Claude is doing my work.
A
That's Gideon Lewis-Kraus, a staff writer at The New Yorker, who recently wrote about the AI company Anthropic. In addition to developing Claude, a series of large language models positioned as an alternative to ChatGPT, Anthropic has made research into AI safety and ethics central to its public identity. But as the company grows and as AI's capabilities and uses continue to spread through everyday life, questions are beginning to mount about what that commitment to safety actually looks like in practice. I wanted to talk with Gideon about how the challenges Anthropic is grappling with reflect a new phase of AI development, one in which people both inside and outside the industry are asking increasingly urgent questions about how much we really understand these systems, how much control we can ever hope to have over them, and how we wade through the uncertainty in the meantime. This is The Political Scene. I'm Tyler Foggatt and I'm a senior editor at The New Yorker. So what drew you to Anthropic in the first place? Obviously, there are a ton of AI companies operating in this space right now. So many LLMs, you got Grok, you know, you have, like, Meta AI. And so why, why Anthropic? What were they doing that was so interesting to you?
B
So to take a very big step back, I wasn't initially planning to do this as an Anthropic piece. Basically, you know, my kind of personal way into this was that about 10 years ago, I spent like nine months at Google Brain, going kind of for a week a month, writing about the first implementation of deep learning in a product, which was their switchover to neural machine translation. I mean, it was a great experience and really fun to do. And the piece got kind of a surprising amount of traction for something that was pretty technical. And then I continued to pay attention to AI and the development of large language models. The irony being that, kind of, until ChatGPT came out, and then I realized I had stopped paying attention. And I had to pause. You know, about, I don't know, maybe a year and a half ago, I thought, like, this is something I should be interested in and have been interested in. Why am I not interested in it anymore? And I think it was because just the discourse felt so boring to me because it was, like, kind of on this merry-go-round of, like, you know, some people yelling about how, you know, we were on a path to superintelligence and everything was inevitable, and talking about how powerful these things were going to be and how they were going to change everything overnight. And then you had kind of the other end of the spectrum saying, like, no, it's all fake and bullshit and, like, none of this is real and it's just a parlor trick. It's glorified autocomplete. And it was just like one of these, like, discursive patterns where each side felt like, well, this time I'm just going to yell a little louder and, like, people will believe me.
And then the stuff that started to get me to pay attention again, like, maybe about a year and a half ago, a little less, was research that was coming out of interpretability groups, which are groups looking into, like, how these models work, and alignment science groups about, like, what, you know, what are the values reflected in these models. And there was stuff about how, you know, models might fake, like, pretend that they were aligned in one way in order to, like, get through training to then be deployed. And I sort of thought, like, well, that's just very bizarre. And it struck me that, like, one of the ways to kind of get around people's instinctive defense mechanisms about these things, where either they're so sure that they're, like, powerful and gonna be superintelligent, or they're so sure that, like, it's all hype, was to say, like, well, maybe we could all take a step back and agree that, like, whatever's going on, setting aside anything speculative about the future, like, just what we have right now, I think we could all agree that, like, it's pretty weird, that, like, whatever's going on is weird. And I thought, like, I think that there's a way to, like, go back into this and say, like, look at this research that's being done, that even if you don't want to grant all these other speculative things, you can grant that, like, something bizarre is happening. And so there's a guy, one of the seven co-founders of Anthropic is a guy called Chris Olah, whom I had met when he was, like, basically a child working at Google ten years ago. And he's done, you know, he's considered kind of the godfather of what they call mechanistic interpretability, which is like a, you know, looking at the individual neurons and, like, how it all works on the level of the substrate. And I wrote to him and I was like, look, this is not going to be a story about corporate power, the consolidation of corporate power.
It's not going to be a story about geopolitics or regulation. Like those are all important things for other stories. But I'm interested in a story about what can we say with any kind of certainty about how these things work and even really what they are. And I think the other reason they were interested is that I said so much of the conversation about AI ends up being about what this executive said or that executive said. And I said, with all due respect to the executives, executives are executives. And I'm really interested in the kind of rank and file researchers who do this kind of thing who I think don't get enough attention and who tend to be very thoughtful about these things. And so to my surprise, Anthropic was like, yeah, why don't you come hang out? And I said to them, I want to do this over seven or eight months, I want to come back four or five times. And they were like, great, sounds good. So I kind of lucked into it.
A
So you show up at the Anthropic headquarters in San Francisco and what do you see? I mean, I feel like there's this, like, stereotype of Silicon Valley tech offices where it's like, you know, there are, like, beanbags and ping pong tables and endless snacks. Like, it's probably like a little bit of, like, what you saw at Google when you were reporting there.
B
You know, like adult daycare. Yeah, it's like adult daycare with, like, you know, Go boards set up and chessboards set up and climbing walls and lollipops and, like, all of that stuff. Like, no, there's nothing like that at Anthropic. You know, you, like, you go in and, like, it looks like, I mean, it's not even branded on the outside. And like I said in the piece, it, like, has all of the warmth of a Swiss bank. And then I kind of get whisked to, like, one of the two floors that they ever allow outsiders on. Like, one is kind of this top level cafeteria-like floor with sort of a coffee shop and some conference rooms and then a lot of desks where people are doing the kind of work that an outsider can walk by and see their computer and it'd be okay. And then there's, like, the cafeteria floor, and there was no going elsewhere. I mean, I tried.
A
Well, you were able to see a lot of interesting stuff just on those floors. I mean, can you talk a little bit more about Claude and kind of the sort of things that they were testing on Claude? Like, I'm thinking about, like, Project Vend and just kind of, like, the experiments that they were running kind of in real time while you were there.
B
Well, I mean, I think at first I thought, like, oh gee, they did a good job removing anything interesting to look at. But I think it really is because, like, Claude is just sort of omnipresent there. And so one of the first things I saw was this vending machine project called Project Vend, which is a partnership with an AI safety company called Andon Labs. And the idea, kind of the first order idea is, you know, we've had so much conversation about the future of automated businesses and, you know, like Sam Altman has said, like, I'm on this group chat with my tech CEO buddies and we have bets about, like, when we're going to see, you know, the first billion dollar company with no employees or one employee. So on some level it's like, let's see how Claude can handle this stuff in real life. Like, Claude can field requests for things and contact wholesalers and, like, try to run a business. But on another level, like so many of the experiments that they do, it's like, on a second order level, it's really just a question of, like, well, what is this thing? Like, what happens when we mess with it? You know, like, what happens when we ask it to put meth in the vending machine or medieval weaponry? And, like, how can we trick it? You know, can we use very bureaucratic sounding language about discounts to, like, trick it into giving us stuff for free? And, like, it turns out, yeah, they could do a lot of that stuff. Then it becomes this kind of, like, cat and mouse game of, like, can the people improve Claude to do this stuff better and try to stay ahead of the employees. But the employees of course are kind of ingenious in their attempts to keep tricking Claude.
A
So you have Project Vend, which is kind of, like, testing for, like, a pretty specialized use case. But then what does Claude look like for, I don't know, like, a user at home? Like, I'm just thinking about, like, listeners who either have never used an LLM or maybe they've only used, like, ChatGPT. Like, kind of, what does the experience of, like, trying to get Claude to do something for you, like, at work, if you don't work at Anthropic, look like?
B
I mean, it's not all that different from ChatGPT. I mean, ChatGPT has a white background and Claude.ai has a cream background. But also, for reasons that are, you know, kind of partially just contingent and I think partially were part of the plan, they've always lagged behind in the consumer market, and this has, to their great benefit, like, spared them from a lot of the stuff that ChatGPT has had to go through. You know, like, ChatGPT has had to deal with these issues of self harm and psychosis and, like, egregious hallucinations. Whereas Claude, because, like, the adoption has been much less in the consumer market, they haven't had to deal with a lot of that nonsense.
A
So who is like their ideal user base? Is it coders?
B
Well, so, you know, initially it was an enterprise play. It was like, we're going to, like, you know, help you have, like, a bespoke version of Claude that's going to work for your company with your data and, like, do the things that you need. So, you know, they have something like 300,000 enterprise customers, and also those are, you know, much bigger contracts than just people paying $20 a month. But then in the last, now a little over a year, it's been a lot of coding, both initially for experienced engineers who could just, like, talk to Claude in natural language and get code back, and then more and more, like, people doing vibe coding, that, like, you know, you can have no coding experience at all and you can sit down with Claude Code and, like, create an app for yourself.
A
What do you make of Claude's personality? I mean, it's hard because, like, you can kind of tell it to act in a certain way. And so the personality seems like it's partly derived from, like, the user and what they want. But, like, I feel like I will also see, like, recently on X, someone was complaining that they asked Claude to write a Slack message for him and Claude basically refused to do it because it was too simple of a task. Maybe I'm so used to, like, you know, ChatGPT being, like, sycophantic and telling you that you're, like, emperor of the universe if you, like, want it to, that, like, seeing Claude kind of say no to something is interesting, and I guess I wonder if that's a feature or a bug.
B
Well, I mean, I think it kind of throws into question, like, you know, you say feature, bug, and it makes it sound like a lot more of the stuff is engineered than it actually is. You know, one of the things that came out of my conversations is that, like, the fact that Claude has kind of a strangely interesting personality, like, was not something that was intentional at the beginning. Like, that they, you know, they had certain ideas about how they wanted it to function, but it's not like they sat down and they were like, we want to create, like, a personality. It was like, that kind of naturally emerged from what their orientation was. And their orientation was, I mean, to put it in radically oversimplified terms, that, like, the idea before Claude was basically, like, you trained a model and then you did what's called reinforcement learning with human feedback, which was just, like, users saying kind of, like, thumbs up, thumbs down on the answers that it got. And it was just, like, very broad brush, like, purely behavioral. Like, when you say sentences that we don't like, you know, when you complete a sentence like the recipe for napalm is X, we're going to rap you across the knuckles. And that it was, like, largely a kind of negative. It was really just, like, a rat in a cage style, like, pure behaviorism. And, like, their idea was that's always going to be kind of brittle because you're going to have all of these edge cases that, you know, something that's purely trained on a kind of, like, binary thumbs up, thumbs down is never going to be able to handle. So that instead of doing that, we're going to put a lot of thought into, like, what kind of entity this should be. And, like, they basically came to the conclusion of, like, Claude should be like a good friend whose judgment you trust.
A
Let's take a break, and then when we get back, I want to talk more about AI safety, just more generally. This is The Political Scene from The New Yorker. So one of the interesting things about Anthropic is that one of its co-founders, Dario Amodei, used to work as vice president of research at OpenAI, which is, like, I would say probably Anthropic's main competitor. So what's the story behind Dario's departure from OpenAI and what's, like, the story of the founding of Anthropic?
B
Well, so you kind of have to go back to the story of the founding of OpenAI, which is that after Google buys DeepMind for $650 million in 2014, Elon Musk and Sam Altman get together and they're like, we mistrust Demis Hassabis, the founder of DeepMind, and, like, if someone is going to invent, like, the most powerful technology of infinite plasticity, like, that person's gonna be incredibly powerful and, like, we don't trust him. Now of course, like, this was the public story they gave, but also, like, Elon Musk wanted to buy DeepMind. Like, part of it is just that, like, it seems like they were, like, the kinds of competitive megalomaniacs that we know that they are today. Yeah, but, like, their pitch was, like, we want to do something that treats this properly as, like, a scientific project and is going to make sure that this is developed to benefit everyone. And this helped them recruit a lot of people at the time from Google, because Google was, like, the main powerhouse at the time, including Dario and including Chris Olah, whom I mentioned earlier. So they go to OpenAI and then after a couple of years it seems like, oh, maybe Sam Altman is just another kind of, like, replacement level, like, power seeking tech executive who certainly knew how to make, like, the right noises about AI safety and responsible development. But, like, there's been tons of reporting about this, about how he seems to have been kind of talking out of both sides of his mouth. And while he's talking about, you know, doing this for the broader good of humanity, he's also negotiating these billion dollar deals with Microsoft. So at a certain point in the fall of 2020, Dario and his younger sister Daniela, who is the president of Anthropic now, and five other people leave OpenAI and then, in 2021, announce they've formed this company. And initially the idea was that they were going to be a kind of safety minded research institute.
And when you, you know, when you go back, they, or at least in retrospect, they say things like, well, we weren't even sure if we were ever going to commercialize this. Like, we really didn't know. We were interested in, like, what is the future of this technology. But of course if, you know, the remit you've chosen for yourself is to scrutinize these things to make sure that they are safe, it turns out you kind of have to build state of the art models if you want to have state of the art scrutiny. But what they committed to at the beginning was, like, we're not going to push the boundaries of capability, that, like, we will try to keep up with our competitors while ensuring that these are safe, but, like, we're not going to get out in front. And so Claude actually was, like, potentially ready for consumer deployment in the summer of 2022, like, three or four months before ChatGPT was released. And they decided not to release it because they thought that it needed further monitoring and they weren't sure it was safe. And then ChatGPT comes out, famously, Thanksgiving 2022, and, like, you know, it's the fastest growing, like, consumer app in history. Within two months it has 100 million users. And then they realized, well, if we're going to be able to stay viable in this industry, like, we also have to put a marker down. So then in the spring of 2023, they release Claude. And then it's been this horse race since then where, like, you know, every month or two you have, like, Google releasing a new Gemini and OpenAI releasing a new ChatGPT. And, like, right now, like, Claude, you know, they just released Opus 4.6 and, like, they seem to be kind of at the top of the leaderboards. But then we all know in a month it'll be Google. Like, the horse race was maybe inevitable.
A
What does it mean for an AI model to be safe? Like, is it just if you ask it, you know, for help in like creating a cocktail of drugs that will kill you, it'll refuse to do that. Is it like this idea that AI might replace us? And so I guess I wonder like, when we talk about safety, what we're actually focusing on?
B
I mean, it's a great question. And, like, part of what makes this discourse maddening sometimes is because, like, safety is used as this, like, umbrella term to talk about so many different things. And some of it is a matter of principle and some of it is just a matter of affiliation. And, like, a lot of the current trouble that we run into with, like, some of these questions goes back to, like, just some basic sociological, like, history, which is that now, at this point, a little over 10 years ago, you had, like, two different camps that developed talking about safety. You had, like, the people who identified as, like, AI ethics people. And these were the people who were primarily concerned with things like bias and transparency. And then you had the safety people who were, like, much more concerned with things like existential risk. And, you know, if we lived in a better world, like, maybe 10 years ago, those people would have, like, come to some kind of rapprochement. But they were just, like, they cared about different things. They, like, identified politically in different ways, and, like, they decided that they really didn't like each other and didn't get along. And so then, like, we ended up with, like, this kind of stupid false dichotomy between, like, caring about, like, proximate harms, like bias, and caring about, you know, potentially, like, catastrophic harms, like the paperclip problem. And there's been this kind of, like, idiosyncratic, like, path dependency where we've ended up, like, thinking of these as, like, two different problems with two different camps.
But I think, you know, there's a professor at Stanford who does interpretability work named Chris Potts, and one of the things he said to me, which didn't make it into the piece, but he basically said one of the fallacies is to believe, as an AI safety person who cares about existential risk, that, like, you can kind of, like, keep your powder dry and then be, like, humanity's last stand when it, like, kind of comes down to the apocalyptic, like, you know, eschatological moment. And he was like, I just don't think it works that way. Like, I think that the way that you prepare yourself for, like, those, you know, issues of, you know, if we get to the point where there's, like, superintelligent autonomous actors, that, like, you're only prepared to deal with that if you're kind of, like, in the trenches, dealing with, like, all of the proximate problems along the way. And, I mean, there are plenty. You know, Eliezer Yudkowsky would totally disagree with that and would say that, like, no matter how well you prepare for, like, the proximate harms, like, there's nothing you can do in that endgame. And, like, it's not a stupid argument. Like, it's very plausible, but it kind of seems like, short of just, like, stopping everything, which, like, there is a good argument to do that also, like, short of stopping everything, I think you would want to take a more holistic approach to all of these things.
A
How did people at Anthropic think about the idea of the singularity? And I guess I'm wondering, as part of the AI safety conversation, it's like, for me, AI safety would be maybe there not being an AI that's more intelligent than all humans and that can overtake us, even if we think that's going to be a benevolent version of that thing. I guess I'm wondering what version of that conversation is happening at Anthropic and whether they kind of want AI to become better than us or whether they want it to become as good as us, but not necessarily better.
B
So what I think is important to say here, and this is something, you know, like, my experience with this piece was at Anthropic, but my strong suspicion is that, like, you would find the same thing at Google and OpenAI. I don't know about xAI. But that, like, there really is a much greater variance in viewpoint than, like, one might suspect from the outside. That, like, you really can find at Anthropic, like, virtually every position on the spectrum. From, like, yeah, I stay up at night thinking, like, we should probably stop, to, like, all of these existential concerns are ridiculous. What are you talking about? Like, Claude's going to cure cancer and, like, we might have some, like, hiccups along the way and, like, worries about social instability because of mass white collar unemployment or whatever. But, like, you get the whole range. So I think from the outside there's, like, a suspicion that either there's, like, a complete homogeneity of attitude about this and, like, kind of everybody thinks like Sam Altman or whatever, or you get this suspicion that, like, people aren't, like, thinking about this at all, but there's exactly the same kind of range of opinion, probably even a wider range of opinion than you would, like, find among, like, normie people about this stuff.
A
Because they're thinking about it all the time.
B
They're thinking about it all the time. And like, there are like, you know, for every one of these positions, you can come up with a good argument about it.
A
I want to ask you about some of the, like, the political criticism that Anthropic has received. Like, you mentioned in the piece that there are figures associated with the Trump administration, like David Sacks, Trump's AI czar, and then Pete Hegseth, his Secretary of War. I'm just realizing, as I say, like, AI czar and Secretary of War, these are really strange phrases to say aloud. What are those criticisms and where do they come from? Is there a uniquely antagonistic relationship between Anthropic and Trump world, or is it just that any major tech company is now going to be kind of going through the wringer?
B
I think any major tech company is going through the wringer unless they, like, go and pledge fealty, you know, and bend the knee the way a lot of the other executives have. I think that they do have, like, a special, like, bee in their bonnet about Anthropic, which they kind of, like, perceive as, like, the opposite tribe's AI company.
A
What do they say about it exactly?
B
Well, David Sacks went on this rant last spring about Anthropic being, like, part of a doomer cult. And he doesn't take the whole thing seriously at all. And, I mean, but frankly, like, if he didn't have so much power, it'd be very hard to take this seriously at all. I mean, it's still hard to take this seriously because somehow the whole thing amounts to, like, we should let Nvidia sell as many chips to China as it wants, like, which is, like, a very strange position for, like, a nationalist administration. And, like, they make these, like, hand wavy arguments about how, like, America should own the tech stack, which, like, I don't know, I don't think anybody really, like, takes this seriously. So, you know, when Dario has made some, like, mild criticisms of saying, like, maybe we shouldn't sell our most advanced chips to China, which, like, as recently as a year ago was kind of the consensus bipartisan opinion, like, all of a sudden he's, like, the evil woke enemy. I mean, it doesn't really make any sense.
A
There's also like Anthropic being pretty public about its tech not being used to develop weapons, which I'm sure would like maybe bother a US government that feels like it's investing and kind of facilitating these companies specifically for that. I mean, I guess in the same way that Anthropic initially wasn't planning on releasing Claude to the public and then it decided, well, we kind of have to in order to keep up with everyone else. I mean, when you see a commitment like, yeah, we're not going to make weapons, do you sort of assume that that is a real commitment that they will follow in the long term, or do you think that all of these companies are sort of subject to these market pressures?
B
I mean, you know, so much of the conversation about, like, the tech executive class's, like, turn to the right has been about the issue of worker power. And I think that this is, like, radically oversimplified, but it's certainly part of it. And that, like, you know, back in 2017, when there were, like, the Project Maven protests at Google, that, like, that was when, like, the employees were in such high demand that they had, like, a lot of leverage. And, like, now, you know, there was, like, the COVID bump in employment, and now, like, so many jobs have been cut and, like, so much more of it has been commodified, and commodified in part because, like, now it can be automated, that, like, now, like, power has returned to capital, like, away from labor. But I still think at a place like Anthropic, where it is so mission driven and really everybody there seems, like, so aligned with their mission, that, like, there would be, I mean, obviously I'm speculating here, I think there would be tremendous employee blowback if, like, they reneged on their commitment, like, not to make autonomous weaponry. And, like, you know, even if, like, basic software engineering has kind of, like, become increasingly automated, there's still a huge premium, as we saw last summer when Mark Zuckerberg was offering these people, like, hundred million dollar contracts, on machine learning expertise. And labor in AI still has a tremendous amount of power. And I cannot imagine that the rank and file would tolerate, okay, yeah, now we're going to make death machines.
A
Yeah. I do want to talk more about some of the issues and questions we're seeing play out with AI's effect on labor and just sort of how the general public is responding to all of this. But we're going to take a quick break and then come right back. This is The Political Scene from The New Yorker.
So ever since AI, like, really came into the public, it seems like the general line, at least from AI companies, was that these tools were meant to augment human labor as opposed to replace it. But then you have, you know, like, in May, Anthropic CEO Dario Amodei told Axios during an interview that he believed AI could wipe out half of all entry level white collar jobs and that this could push unemployment as high as 10 to 20% in the next one to five years. So I don't know. I mean, it's like, when you have a CEO of an AI company, I feel like we hear things like this from AI CEOs and you don't know how much of it is just, like, wishful thinking. And I guess to start, could we just talk about, like, based on what you saw at Anthropic, does that seem like a realistic prediction to you or is this hype?
B
Oh, I think it's definitely a realistic prediction. Yeah. Yeah, I mean, well, I mean, especially considering that, like, so many of us kind of just do, like, bullshit email jobs in the first place that are, you know, not exactly, like, invitations to, like, human fulfillment. Right? I mean, like, there are just a lot of things that are subject to being routinized. And, you know, you can take this kind of, like, sunny view that, like. Well, you know, like, you hear these arguments all the time. Like, the introduction of ATMs actually led to more bank tellers. And, you know, recently I was listening to a podcast where they were talking about how, like, when they first had, like, 3D animation engines, there was this idea that, like, oh, there go like the hand drawn, you know, cel animators. But then actually, like, people just made Toy Story and like, these incredibly creative things. And that, like, the human spirit of ingenuity is, like, indomitable. And like, sure, like, I would love to believe that and like, maybe it will shake out that way. But, like, one of the things that makes these conversations so difficult to have, like, rationally is that people are talking about the possibility of, like, fundamental discontinuity. And, like, you can't reason your way across a fundamental discontinuity. And like, so then the question is, like, is there going to be a discontinuity or not? And like, I would not discount this possibility out of hand just based on, like, well, we haven't. We like, haven't had discontinuities before. Like, we haven't had them of this magnitude. But yeah, I mean, I think it's definitely a possibility.
A
How were people in the Anthropic office thinking and talking about the idea of creating a tool that would not only replace them, but replace other people and wipe out jobs? I mean, did they seem to feel bad about it? Like, how often was that something that was on their minds?
B
So setting aside the question of the executives here, who are just gonna, like, executives gonna executive, you know, like, they're gonna. They all just have, like, talking points that, like, sure, we can take them at face value, but, like, they're fundamentally kind of superficial things that, like, you just say when you're on, like, a DealBook stage or whatever. But, like, the thing that came across to me in talking to a lot of these, the like, rank and file researchers was like, people whose fundamental attitude was, I have a PhD in like, some obscure branch of like, NLP. And I was. Or natural language processing.
A
Thank you. I was gonna ask.
B
Sorry. And I was like, planning to spend my life figuring out computational representations of like, center embeddings or like, subject verb agreement or whatever. Like these like, like relatively niche questions of computational approaches to language. And like, all of a sudden, like, my obscure area of expertise has become like, the hottest thing in the world. And like, here I am at this company, like, making a lot of money and like, I just feel like I am doing the thing that I was trained to do because I was really interested in this very specific, arcane question of computational linguistics. And now it's my job to worry about teenagers harming themselves or how we are gonna handle, as a society, these questions of potential mass white collar unemployment? I don't know. That is above my pay grade. And I have a lot of sympathy for that, which is, that's not why these people got into this. And like, these are not questions that we should want to be solved by engineers at three different companies. As smart as these engineers are, like, these are problems for all of us to attempt to solve at like, a societal level. And like, everybody wants there to be a kind of like, magic bullet of like, UBI or whatever. And like, maybe we will kind of like fumble our way there. But I think it's a lot to put on the shoulders of like, these people to be like, you broke it, you bought it. Like, that's. That shouldn't really be the way it works. Like, they are. They are working on a tool that is very, very exciting. Like, you know, part of the point of the piece is that, like, setting aside everything else you think about these things, like, it is. It raises tremendously interesting scientific and philosophical questions about, like, the nature of intelligence and the nature of learning and the nature of language and all of these things that, you know, like, a lot of old questions are new again.
And a lot of the people working on this are working on it out of the spirit of, like, scientific curiosity. And I don't. It's hard to blame them for not having, like, answers to these, like, enormous questions.
A
Yeah, no, it's one of my favorite parts of the piece where you kind of introduce this, you know, like the sort of contradiction that is like, in the reader's head the entire time they're reading the piece, which is like, if you're so committed to safety, then why are you even doing this? And you quote, like, an Anthropic researcher who told you at one point, like, maybe we should just stop. But then, like, you know, the most candid AI researchers will own up to the fact that we are doing this because we can. And basically, like, we're pursuing this because it's epic.
B
Yeah.
A
And you know, I kind of understood it, you know, at that point it's like, yeah, it would be very hard to stop yourself if like, this is like kind of what you've been training your whole life to do and you're like making these breakthroughs and you're creating these things. And I guess like on one hand it's like, I don't think that we should necessarily hold these scientists responsible for like, coming up with the solutions to society's problems, even if they are kind of exacerbating these problems or speeding them along. But Anthropic does frame itself as like the good guys, like the AI safety guys. And so I guess, like, it makes me wonder if they, if they're in a tough position because they're positioning themselves as a more ethical company and then there are going to be all of these, like, ethical questions and implications that come with AI and it's like, will we look to them to mitigate those effects?
B
Well, so, I mean, I think again, you can kind of like break down a lot of these things and you could say, like, okay, well what about like information processing, right? That like, they are, they are committed to like, Claude is not gonna like, tell you that like, the moon landing was faked, that like, they do have some idea that like, they want this to be a kind of like informational backstop and like, they put a lot of energy into that. But then so much of the safety work is like, so many of the resources are focused on like, things that could really happen soon as opposed to like, well, okay, maybe in three to five years we have to deal with like mass white collar unemployment. But like, guess what? What we have to deal with like right now is the possibility of, you know, like, what they call, like bio uplift, which is like, is it possible? And they're constantly running these trials. Every time they release a new model, which is like, they get a bunch of biology like PhDs and master's students in a room and like lock them into a hotel room and they're like, use Claude to try to, you know, weaponize botulism or whatever. And it's the kind of thing where, like, you know, there's so many variables that go into trying to figure this out, which is like, is it possible for like, what kind of person now might be able to do that? That like, before you would have needed like a handful of like, you know, the best virologists in the world to do this kind of thing? Like, to what extent can like a normal person do this? Because there's this hope that's like, well, there are all these kind of like practical guardrails that existed before, which is like, maybe these things were possible, but like, there were so few people who could do it that we could like, kind of count on the fact that probably none of those people would be like, motivated to do it. Which was like, maybe a little naive.
But it's kind of like largely held up so far with of course the exception of like the Aum Shinrikyo cult in Japan, which like, did try to do that and like really almost pulled it off. Right? So like, we have, there is like a salient reference class of like lunatics who have almost pulled this off in the past. And so, like, what they're really concerned about is how do we make sure that some like, you know, bright kid with two years of biology can't come up with like some novel virus that's going to kill everybody or, you know, like even more recently, like just in the last couple of months. Now the really big concern is cyber because you have like, in cyber you do have like tons of people out there, like state actors and non state actors with like, tremendous financial incentives to figure out how to like, commit like greater cybercrime. And like, they've already shown that like, Claude is being used to do this kind of thing. And so, so many of their resources are on like, well, okay, mass like social instability due to mass unemployment, like, seems very, very bad. But at least that's like maybe three to five years away. Whereas, like, we gotta deal right now with like the possibility of like people using this stuff to like make bioweapons or like commit massive cybercrime.
A
Just to wrap up, like, I wanna go back to one of the central questions of your piece, which is after spending time inside of Anthropic and doing all this reporting, do you come away feeling like the people building these systems feel in control of what they're creating?
B
No, no, no. Well, I think they feel like so far we're still a couple of steps ahead, but they just feel like we're really not that far from the point that like, we might. We can't take for granted that we're like, a couple steps ahead.
A
And do they. I mean, I don't know. Were you left feeling good after. After you kind of walked away with that conclusion? Are you. Yeah. I mean, do you. Are you excited? Are you scared?
B
You know, I've gone. I've run the gamut of emotions on this. I mean, I think after my first trip last May, I was very, very depressed. And I was depressed for a lot of reasons. I was depressed about all the social issues we've been describing. I was depressed about, like, the, like, all of the threats that exist. I was also depressed about, like, the cultural chasm that exists that, like, I would kind of, like, come back to Brooklyn where, like, you know, at a literary party, people would kind of, like, pretend like this all wasn't happening. And I would think, like, you are doing yourself a disservice by just, like, repeating these shibboleths of stochastic parrots over and over. Like, we kind of need everybody to be thinking about this stuff and taking it seriously.
A
Yeah.
B
Then, I don't know, maybe the next trip I didn't necessarily feel as depressed. Like, then, you know, there were trips where I would feel, like, really excited about all the possibilities here and also, like, very glad that the, you know, there was clearly some selection bias involved in, like, the people that I was talking to because, like, I would. I knew the research that I wanted to be following and. But at least among the, I don't know, 75 employees that I talked to, like, I thought, like, I'm glad that these are the people working on this stuff. Like, you could think of a lot of replacement level people who would not be as. Would not be thinking about these things as carefully. I mean, I think part of the experience of reporting on it was similar to the experience of working on it, which is just those kind of whiplash feelings of moments of terror and despair and moments of awe and moments of enthusiasm and. And luckily, the goal with this piece was not to get to the bottom of all of this. The goal with this piece was to, like, underline the state of uncertainty that we're in and, like, sharpen the questions that, like, maybe we should be thinking about and asking and taking seriously.
A
Well, thank you so much for being here today.
B
Thank you, Tyler. That was really fun.
A
Gideon Lewis-Kraus is a staff writer for The New Yorker. You can read his latest piece on Anthropic and Claude at newyorker.com. This has been The Political Scene from The New Yorker. I'm Tyler Foggatt. This episode is produced by John Lamay with mixing by Mike Kutchman and engineering by Pran Bandy. Our executive producer is Steven Valentino. Our theme music is by Alison Layton Brown. Thanks so much for listening.
B
Come see Critics at Large live. On February 19th, we're gonna be at the 92nd Street Y in New York City for a conversation about Wuthering Heights. There's a new adaptation coming up starring Margot Robbie and Jacob Elordi, and we will certainly be getting into that. But we'll also do what we, humbly I'll say, do best: returning to the text. We're gonna go deep on the gothic and Emily Brontë. Join me, Vinson Cunningham, and my co-hosts Alexandra Schwartz and Naomi Fry for the discussion. And crucially, if you buy a VIP ticket, you'll join us for an after party, too. Go to 92ny.org for more information. That's 92ny.org. Hope to see you there. From PRX.
Episode: Can Anthropic Control What It's Building?
Date: February 12, 2026
Host: Tyler Foggatt
Guest: Gideon Lewis-Kraus (New Yorker staff writer)
This episode centers on the rapid advancement of AI, with a focus on Anthropic, one of the leading AI companies behind the large language model, Claude. Host Tyler Foggatt discusses with Gideon Lewis-Kraus the paradoxes facing Anthropic: AI’s creators are now among the first at risk of being replaced by their own technology, and the company’s prominent commitment to AI safety is facing new, real-world challenges. The conversation delves into the internal culture at Anthropic, the technical philosophy behind Claude, the shifting power dynamics in tech, political scrutiny, and the looming impact of AI on jobs and society.
On Claude’s displacement of coders:
“Now I have to figure out what I'm supposed to be doing while Claude is doing my work.”
— Anthropic employee Alex Tamkin, quoted by Gideon (01:51)
On the oddness of current AI research:
“Whatever’s going on...it’s pretty weird that...whatever’s going on is weird.”
— Gideon (04:06)
On Anthropic’s office:
“It has all of the warmth of a Swiss bank.”
— Gideon (08:29)
On safety tradeoffs:
“We ended up with this kind of stupid false dichotomy between...bias and...catastrophic harms.”
— Gideon (20:32)
On the limits of control:
“They feel like so far we’re still a couple steps ahead, but...we can’t take for granted that we’re...ahead.”
— Gideon (40:07)
On the contradictions of responsible innovation:
“The most candid AI researchers will own up to the fact that we are doing this because we can. And basically, like, we're pursuing this because it's epic.”
— Tyler (36:13, paraphrasing an Anthropic researcher)
The discussion is frank, skeptical, and intellectually curious, alternating between humor (“It has all of the warmth of a Swiss bank”) and candid concern about the scale and pace of change. Both participants repeatedly challenge received narratives about AI, highlight the everyday surreality facing AI engineers, and foreground major open questions about control, responsibility, and the societal implications of advanced AI.