PJ Vogt
Welcome to Search Engine. I'm PJ Vogt. No question too big, no question too small. This week, Mysteries of a Chatbot. Quick note before we start today. This week's episode is almost entirely about Anthropic, the AI company that makes Claude. They have advertised on our show. As with all companies that advertise on our show, they do not get a say in our editorial content. Okay, after these ads, the show.
PJ Vogt
This episode of Search Engine is brought to you in part by MUBI, the global film company that champions great cinema. From iconic directors to emerging auteurs, there's always something new to discover. If you're looking for something really special, check out Father, Mother, Sister, Brother, the eagerly awaited new film from Jim Jarmusch, now streaming on MUBI in the US. It follows adult children navigating their relationships with somewhat distant parents and each other. It stars Tom Waits, Adam Driver, Mayim Bialik, Charlotte Rampling, Cate Blanchett, Vicky Krieps, Indya Moore and Luka Sabbat. MUBI is a curated streaming service dedicated to elevating great cinema from around the globe. Perfect for lovers of great cinema, and for anyone who hasn't discovered how much they love it yet. To stream the best of cinema, you can try MUBI free for 30 days at mubi.com/searchengine. That's m-u-b-i.com/searchengine for a whole month of great cinema for free. This episode of Search Engine is brought to you in part by Vanguard. As we step into a new year, it's the perfect time for all the advisors listening to think about how to set your clients up for success. One way to do that is to level up your fixed income strategy. But bonds are tricky. The market is huge, rates shift, and risks hide in plain sight. That's why having a partner with scale and expertise matters. Vanguard brings both. Vanguard's bond offerings are built to an institutional standard. Their lineup includes more than 80 bond funds actively managed by a global team of about 200 specialists who focus on research, trading and risk management across markets. Rather than relying on a single star portfolio manager, Vanguard takes a team-based approach. Insight and decision making are shared across the group, so every client benefits from collective expertise. In a bond market this complex, that matters.
So if you're looking to give your clients consistent results year in and year out, go see the record for yourself at vanguard.com/audio. That's vanguard.com/audio. All investing is subject to risk. Vanguard Marketing Corporation, Distributor.
PJ Vogt
Welcome to Search Engine. I'm PJ Vogt. No question too big, no question too small. I found myself feeling much stranger about AI in the past month or so. I use the tools. I use the tools a lot. But I'm probably each company's worst nightmare as a customer, in that as soon as I hear from anybody that one model has inched ahead of another, that this version of ChatGPT is beating that version of Gemini, I immediately cancel my subscription and switch. For the past two months, I've mainly been using Claude, Anthropic's agent. And for whatever reason, Claude is just giving me more future nausea than I was having six months ago. Part of the general tech excitement around Claude lately has been Anthropic's product Claude Code, a tool that lets the AI agent autonomously write and edit code. Over at the New York Times, Kevin Roose has talked a lot about the websites and apps he's quickly built with Claude Code. Two CNBC reporters, as an experiment, vibe-coded a competing version of a popular organizational app called Monday.com. Within a couple of days, Monday's stock price had tanked. For me, though, most of the future shock has just come from using the LLMs the way I'm used to. I find myself going to Claude as a useful first stop, the way I've always used the Internet. But the quality of its research, its answers, even its writing? I'm just starting to feel like I can see, not too far off, if not my own obsolescence, at least real, significant change in my field. I don't know how to feel about that. I find a lot of the tech coverage of AI to be high opinion, low information, and relatively unhelpful. I'm not even asking for anyone to tell me the future. Right now, I would just settle for a better understanding of the present. Which is why this week I wanted to talk to a reporter who's been digging into this.
Interviewer/Host
Hello.
Gideon Lewis-Kraus
Hey.
PJ Vogt
Can you introduce yourself?
Gideon Lewis-Kraus
I am Gideon Lewis-Kraus. I'm a writer.
PJ Vogt
Gideon is a writer who I particularly enjoy. He's been on our show before. He'd spent much of the last year essentially embedding within Anthropic, the company that makes Claude, the tool that was giving me the heebie-jeebies. People there had been very open with him. He'd gotten a view on how they're seeing what's going on, their understanding of a present which, frankly, they also sound mystified by. This conversation took place right before Anthropic's big showdown this week with the Pentagon, so we did not discuss that specifically. But I did find Gideon's view inside the company and its mission extremely helpful in understanding how they'd gotten into this fight with the US Government at all, since none of their competitors have ended up in that position. So to start, I asked Gideon to even just explain why Anthropic had let him into their company in the first place.
Gideon Lewis-Kraus
So to kind of go back to the beginning of this, which, like, I think, makes it all make a little more sense in context. So now, almost 10 years ago, when I was at the Times magazine, back in kind of like the Paleolithic of deep learning, I did this story about Google brain and about the implementation of deep learning in, like, the first consumer product, which was when they switched over their Google Translate to neural machine translation.
PJ Vogt
Why were you paying attention to it? Because I remember as a person who, like, I think we both cover technology, but we're not strictly technology journalists, so you can kind of decide which things on the horizon are interesting to you. Machine learning was not interesting to me for a really long time. Why 10 years ago were you interested in this?
Gideon Lewis-Kraus
I was interested in it as, like, a story about ideas. That, like, there were these ideas about language and about learning and about consciousness and about, like, philosophy of mind that had been around for at least seventy years, kind of depending on how you count. And without, like, getting into those, there was just, like, an interesting story for me about the trajectory of an idea there.
PJ Vogt
Gideon cared about AI a decade before most people did, because he thought this synthetic facsimile of our brains could teach us something about our own real ones. He'd been following the trajectory of conversations like: what is a brain versus a mind? What is thinking? What is consciousness? By the 1950s, the arrival of the first computers had encouraged people to start asking questions like that, because a computer did something like thinking, but also clearly wasn't a brain. And so early computers had prompted people to try to develop better definitions of things like intelligence and consciousness. The thing was, though, while computers were interesting enough to raise those questions, they weren't yet complex enough to be much help in answering them. And so by the 1970s, philosophers and computer scientists had mostly moved on, and those questions migrated to psychology departments, who still, for obvious reasons, wanted to better understand the human mind. But with early machine learning advancements around 2014, Gideon, who's always thinking about thinking, thought that these conversations would move again, that computers would now be advanced enough to challenge our definitions, to force us to decide with more urgency what we thought consciousness and learning really were. And that was what had excited him even when AI was a much more nascent technology.
Gideon Lewis-Kraus
So I paid attention to AI and the rise of language models. And I think I'm, like, the only person in the world who, the minute ChatGPT came out, was when I kind of stopped paying attention. Because, like, to me, that was when the public discourse felt, like, really broken, and we were, like, in this cul-de-sac where you had these kind of, like, two really entrenched sides yelling at each other. You know, like, the one side that's like, we're on a path to superintelligence, everything is going to change, the machines are going to be conscious, this is going to be the most powerful technology anybody's ever built. And then the other side that was like, essentially, it's all fake and bullshit. This is, like, smoke and mirrors. It's a parlor trick. It's not real, and you don't have to pay attention to it because it's all a scam. And it just felt like those were kind of, like, the two options on the table for people.
PJ Vogt
Which was... obviously, like, that's what we do about everything all the time. But it was only weird for this because, like, my prevailing feeling was, you guys think you've figured this out? Like, this is very new. This is changing very fast. Of all the stances you could take, why would you choose certainty publicly right now, in either direction? It's just silly.
Gideon Lewis-Kraus
Yeah, no, exactly.
PJ Vogt
But it's so funny. So you're thinking about thinking, computers and thinking, and artificial intelligence and deep learning, up until ChatGPT.
Gideon Lewis-Kraus
Up until ChatGPT. And that was when I stopped thinking about it. But then finally, like, last fall, like, maybe a year and a half ago, two things started to happen. One was that they got to the point where, like, I was like, oh, actually, now, like, they're useful. These have gotten to a level of sophistication where, like, I can use them in productive ways. Not a lot, but, like, a little bit. And the other thing was, some of the research coming out of the labs and out of academia was really weird.
Dario Amodei
If you tell the model it's going to be shut off, for example, it has extreme reactions.
PJ Vogt
We're starting to see AI systems that don't want to be shut down, that are resisting being shut down.
Dario Amodei
We've published research saying it could blackmail the engineer that's going to shut it off if given the opportunity to do so.
Interviewer/Host
Even when ordered to allow itself to shut down.
PJ Vogt
And the AI still disobeyed 7% of the time.
Gideon Lewis-Kraus
So my feeling was, we were out way past where theory was. Like, you couldn't really approach these questions from a theoretical perspective, because, like, we just didn't have enough data to be able to make, like, categorical theoretical assessments of what was going on. But there was all this interesting experimental work happening that was just showing, like, this is the kind of behavior that's coming out of these things. Like, we should try to figure out what's going on, to say, like, here are the things we can say with any degree of reasonable confidence for now. And, like, here's where we draw the line. And beyond that, it's all murky and speculative, and, like, we really don't know. So I wrote to a guy at Anthropic whom I had met 10 years ago at Google, when he was, like, 11 years old, a prodigy, and said, like, this is not about Anthropic. Like, you know, don't call the cops. I just want to talk about the state of the research and figure out a way: is there an academic team that I could follow? Because I just assumed Anthropic was never going to let me have the kind of access I would have wanted. And he, of course, just forwarded my email to the PR cops. And then turns out, actually, Anthropic's PR people are very candid and very open. And I got a call from them. They were like, well, what are you interested in? And I was like, okay. For these purposes, what I am interested in is a story that gets at some of the technical explanation that I think is missing from a lot of the public discourse. That, like, there are just some, like, basic things that I really just don't understand, and I can kind of assume most people don't really understand, about, like, how these work. So I think part of the reason why they ended up being much more welcoming than I expected is because I said, like, I don't really care about talking to the executives. I don't really want to talk about geopolitics.
I don't really want to talk about the future or power or energy or the labor market or, like, all of these things, which, don't get me wrong, are all very important things. But I was like, it's very hard to talk about all of those other things if we don't have, like, some broader grounding in, like, what is even going on. And, like, maybe if we had slightly better clarity about that, we could have, like, a more productive public conversation about these things. And they were like, cool, great. And I was actually kind of shocked about that.
PJ Vogt
So Gideon, to his shock, was allowed in, and he was allowed to pursue his big question: what do we actually know about what is going on in the machine's proverbial mind right now? After the break: Inside Anthropic's Black Box. This episode of Search Engine is brought to you in part by NerdWallet. You know, running a small business is no joke. I've talked to so many friends who own businesses, and when it comes time to get a loan, they just hit a wall. Big banks say no. And if you start searching online, it's easy to get lost in sketchy offers with sky-high rates and pages of fine print that make your head spin. Which is where Fundera, powered by NerdWallet, comes in. It's a free, super easy platform where you can compare real financing options from trusted lenders. What's great is that you don't need perfect credit to start, and there's no spam, no bait-and-switch nonsense, just real personalized options that actually make sense for your business. And here's the best part: for a limited time, when you visit nerdwallet.com/search and fill out the no-obligation form, you'll get VIP treatment and talk with a real person who knows all the ins and outs of small business lending. Don't risk your business on unreliable lenders. Go to nerdwallet.com/search to find the funding you deserve. Fundera Inc. NMLS ID number 1240038. This episode of Search Engine is brought to you in part by Square. You know, one of the things I love about visiting my favorite local spots, like Topos Bookstore Cafe in Ridgewood, is how smooth everything feels. Quick checkout, easy receipts, sometimes even loyalty points. That's because they use Square. Square is the easy way for business owners to take payments, book appointments, manage staff, and keep everything running in one place. Whether you're selling lattes, cutting hair, detailing cars, or running a design studio, Square helps you run your business without running yourself into the ground. Square works wherever your customers are.
One-location shops, pop-ups, mobile services, even multi-location franchises: take payments at a kiosk, counter, website, or with your phone, all synced in real time. With Square, you get all the tools to run your business with none of the contracts or complexity. And why wait? Right now you can get up to $200 off Square hardware at square.com/goengine. That's s-q-u-a-r-e.com/goengine. Run your business smarter with Square. Get started today. Welcome back to the show. The story of Anthropic really begins years before its actual formation. Way, way back in 2010, a British chess and video game prodigy named Demis Hassabis founded an AI research lab called DeepMind, where his team built an AI system that was capable of reinforcement learning. Meaning, 16 years ago, Hassabis made an AI that would be able to teach itself to get better at Atari games like Pong without being told how to play them in advance. For the people paying attention, this learning was an obvious breakthrough. And so, of course, there's a bidding war to buy his lab.
Dario Amodei
Google's big spending spree continues with their purchase of DeepMind.
Interviewer/Host
Well, who is DeepMind, you ask? It is a UK based maker of artificial intelligence.
Dario Amodei
Terms of the deal were not disclosed, but the tech website Recode says that Google paid $400 million for the London-based startup, making the artificial intelligence firm its largest European acquisition so far.
Gideon Lewis-Kraus
In 2014, Google acquires DeepMind. And Elon Musk and Sam Altman are unhappy about this, because what they say in public is like, we don't trust Demis Hassabis, this, like, evil mustache-twirling villain (which was, like, a real mischaracterization), to potentially steward the greatest all-purpose technology ever built. So, like, we need to make sure that this isn't developed under Google's closed-shop monopoly, that this is done for the benefit of everyone. Now, this was, like, pretty patently disingenuous from the very beginning. I mean, like, I remember, I was out there at the time, and, like, nobody really bought this. People were like, Elon Musk has a grudge because he wanted to buy DeepMind and, like, lost it to his rival Larry Page, and he was mad about that.
PJ Vogt
So Elon Musk set up a rival company, OpenAI, alongside Sam Altman, Greg Brockman, and a few other people. The message was that Google couldn't be trusted, and that OpenAI would be a nonprofit designed for the benefit of humanity. They launched in 2015, and a lot of people joined the company who really believed that message, who believed they were going to develop a powerful new technology safely. One of them was a research scientist named Dario Amodei, who left Google Brain to lead OpenAI's safety team. It's in that capacity, OpenAI employee, that he appears on this 2017 episode of the excellent podcast 80,000 Hours.
Interviewer/Host
I've been thinking about intelligence for quite a while, and how, how intelligence worked. And I think, you know, when I did my PhD, I wanted to understand that by understanding the brain. But, you know, by, by the time I was done with it, and by the time I did, did a short postdoc, AI was starting to get to the point where it was really working in a way that it, you know, hadn't worked when I...
PJ Vogt
Dario at this point seems mainly like an academic. He has a PhD in physics from Princeton, and he explains why he's joined OpenAI, this fledgling nonprofit.
Interviewer/Host
But I think OpenAI as an institution has the general idea that in order to work on AI safety, you have to be at the forefront of AI, and that also if you're at the forefront of AI, you have a better ability to implement AI safety in the final system that's built.
PJ Vogt
This idea of Dario's, that in order to really work on AI safety, you actually have to first build the best AI and then study its mind: that's a view shared by a lot of people in the industry. And in a laboratory environment, the logic to me makes sense. Remember, this is 2017, five years before ChatGPT will debut to the public. AI has not yet become a winner-takes-all arms race. But the host does ask Dario a question about the future that I think reveals a bit of a blind spot in Dario's thinking.
Interviewer/Host
That's my understanding. OpenAI is a nonprofit. It is a nonprofit, yeah. So if you developed a really profitable AI, how does that work? OpenAI becomes incredibly rich and then gives out the money to everyone? Yeah. I mean, personally, I've... personally, I have no interest in getting rich from, from, from AGI. I mean, I think it would do so many interesting and wonderful things to humanity that, you know, I think the meaning of money would change quite a lot. And even maybe the psychological motivations that would want me to get a larger share are things I could change and might want to change.
PJ Vogt
Just a few years after this interview, Dario would leave OpenAI. As for OpenAI's initial pitch, that these were not normal tech executives here to make money, that they had higher aspirations: Gideon Lewis-Kraus says that, for most people paying attention, that story just stopped seeming believable pretty quickly.
Gideon Lewis-Kraus
The mask slipped, and you could tell that these were just, like, your kind of replacement-level power-seeking tech executives, and that, like, a lot of this stuff had been just, like, a disingenuous sales pitch to hire, like, the best AI talent. There's been so much reporting about Sam Altman's ostensible double-dealing and talking out of both sides of his mouth, like telling his employees he cared about safety and then, like, maybe telling Microsoft other things when they were setting up these big deals. And so then, in the fall of 2020, Dario Amodei and his sister Daniela and five other people leave OpenAI to found Anthropic, basically to be a foil to OpenAI in the way that OpenAI was, like, supposed to be a foil to Google. Now, the irony of this was, like, certainly not lost on any of these people. Like, they weren't naive about this. But I think it's important. Yes, there are some kind of, like, obvious structural and cosmetic similarities here, but I do think it's important, in telling the story, to make it clear that I don't think people had the same obvious doubts about how genuine the pitch was when Anthropic formed.
Interviewer/Host
Hi, good morning, all. Thank you for coming to day two of Disrupt. It's great to see you.
PJ Vogt
Anthropic's coming-out tour: Dario on stage at TechCrunch Disrupt in 2023.
Interviewer/Host
Dario, thanks for joining us here today. Thanks for having me. I know you have to catch a flight, so we'll get right to it. But we're going to start at a sort of a cosmic...
PJ Vogt
He's got curly hair, glasses, a blue button-up. He looks noticeably less slick than your average tech founder. Less CEO, more like a guy who reports to one, which is who he'd been not long before.
Interviewer/Host
About OpenAI. You did, you spent a lot of time there. What do you think about Sam? What do I think about Sam Altman? I mean, I don't, I don't know. I don't know what to say to that question. Well, you're already starting, you know, look, look, there's several, there's several players.
PJ Vogt
It's funny watching the interviewer try to bait Dario into shit talking his former boss, a person who he disagreed with enough that he left and started a competing company. Dario tries to engage diplomatically.
Interviewer/Host
One thing I'll say, one thing I've learned, not just from this, but from many things: you know, it can be pretty ineffective to, you know, argue with your boss, or argue with someone and say, your company shouldn't do X, it should do Y. Especially if your boss is Sam Altman. A much more effective thing to do is: I'm starting a company, we're going to do X, we'll see how it works. And if, if, if X is working and people are like, oh, these are the safe guys, they're doing X, then pretty soon everyone else is going to be doing X as well. And we found that with inter...
PJ Vogt
To explain this with an analogy instead of algebraic variables: what Dario is saying is that instead of convincing his old boss at the car company to add seatbelts to the car, he instead chose to start a rival car company that offered seatbelts. He thinks if Claude ends up being both the best and the safest AI model, his competitors will be forced to make their models equally safe. Which, to me, sounds like putting a lot of faith in markets.
Interviewer/Host
Safely. Obviously we want to scale quickly to be competitive, but we want to do it in a way that, you know, preserves, you know, the model being safe against these catastrophic risks. And so it's a system that...
Gideon Lewis-Kraus
It's the same story insofar as it's like, we're going to be the safety-minded lab, we're not going to push the boundaries of capability, we're not going to, like, build the most sophisticated models, we're not going to start the arms race. But then, as it turns out, if you want to exercise, like, maximal scrutiny of what these models are and how they work, you need state-of-the-art models, which means you need the money to build them.
Interviewer/Host
The Information reported that Anthropic is in talks to raise another round at a 30 to 40 billion dollar valuation. A billion-dollar round that tripled its valuation to 183 billion dollars. And that values the company at 380 billion dollars. It is about 10 billion dollars higher than what I was told.
Gideon Lewis-Kraus
And so, of course, now Anthropic's valuation seems to go up by the week. Like, the most recent one, I think this morning, was, like, $380 billion, because this is just something that's incredibly resource-intensive. So they ended up in a position where, like, of course, as probably anyone could have predicted, like, there was this arms race, and, like, now they're in this position of being like, well, we still want to be, like, the responsible stewards, but also, like, we gotta keep up with our Wario version.
Interviewer/Host
Across town, the most high-profile rivalry in tech is heating up in 2026 as both OpenAI and Anthropic race ahead on what are poised to be historic IPOs. These AI giants are trying to create the fastest, smartest, and best models, spending billions and then raising billions from investors along the way. They compete on almost every level. We've seen some signals that it's anything but friendly competition. The latest signal...
PJ Vogt
You can decide as a person whether you trust this company or don't trust this company. But a lot of the broader things end up feeling the same, which is like the sales pitch is that they're the ethical one. And the story they tell themselves is, well, we'll only be in a position to be the ethical one if we're huge. And that might mean pushing the technology forward quickly, which is the thing that the AI safety people are worried about.
Gideon Lewis-Kraus
Yeah, I mean, the criticism from the really hardcore orthodox AI safety community is sort of like, Anthropic will do anything to act responsibly as long as it doesn't cost them anything. I don't actually think that's fair. You know, like, Claude was ready before ChatGPT came out, and they held it because they didn't want to be the ones to, like, kick this off. And they waited until after ChatGPT was out and successful, and then they felt like they had to come out with their own competitor. And Dario came out in favor of, like, continuing export bans on, like, Nvidia's advanced chips, which, like, certainly cost them something, like, politically. And, like, he had a fight with Jensen Huang about this. I think that they've done, like, plenty of costly things.
PJ Vogt
Of course, the most potentially costly choice is the one Anthropic is making this week, at least so far. The Pentagon has demanded that Anthropic give it a version of Claude with some of its guardrails removed. Anthropic is saying it will not make a version that can domestically spy on Americans or power fully autonomous weapons. The Pentagon has given Anthropic a deadline of Friday, today, at 5:01pm, or else, it says, it will put the company on a blacklist. That means US companies who contract with the military, like Lockheed Martin, would be legally banned from using Anthropic products in their defense work. This is a fascinating test of how truly committed Anthropic is to its own mission, a mission Gideon spent quite a bit of time observing. Gideon says that Anthropic, the company building this kind of black box, is actually situated inside of one too. The Anthropic office is in a nondescript building, which Gideon describes anyway. He says, quote, "There is no exterior signage. The lobby radiates the personality, warmth, and candor of a Swiss bank." That's where Gideon started spending a lot of his time, beginning last spring. What's the intellectual culture of the place as you were encountering it?
Gideon Lewis-Kraus
Well, the first thing I'll say is that, in some ways, it does feel, like, vaguely monocultural, but there was a much greater heterogeneity of views than I expected. The attitudes there really run the gamut from, like, everything is going to change tomorrow, to, like, much more deflationary: this is kind of a normal technology, and, like, yes, there'll be some disruptions, but let's not get ahead of ourselves. Like, the spectrum of views there is not so different than, like, the spectrum of views outside.
PJ Vogt
You don't have, like, one person who's like, the blue sky perspective who's like, this is all bullshit. And I think it's bullshit. I work at the company. But beyond that.
Gideon Lewis-Kraus
Well, nobody thinks it's bullshit. I mean, everybody thinks that there are going to be great transformations ahead. But there's a surprising diversity of opinion about, like, what might be happening.
PJ Vogt
One thing people at Anthropic do seem to agree on is that for AI to be safe technology, Anthropic's developers will need to solve a very hard problem. They'll need to teach the underlying machine intelligence they've created to both understand ethics and to behave ethically. It's hard to talk about this part of the story without doing a basic refresher on one very strange part of how AI models are built. So, okay, a company like Anthropic starts with a base model. The base model has access to lots of compute, data centers full of GPUs, and lots of training data: books, articles, podcasts that have been fed into it without paying me. The more compute and training data, the better this base model gets. But a base model is very weird. It has not been trained to do anything specifically. If you give it an input, it'll give you an output. But it has not been instructed to act like a helpful chatbot. It's just trying to predict the right thing to say back based on all the things it's read. A base model does not have a consistent personality the way we're used to chatbots having personalities. It also has no rules telling it what not to do. The AI companies take these base models and they put them through a process called post-training. Basically, they shape the model's behavior. They show the model examples of good and bad responses. They have humans rate its outputs. They give it rules and principles to follow, and what comes out the other side is the product you actually use. Anthropic's Claude, for instance, has been trained to act like a helpful, knowledgeable friend. The experience you might have had using a chatbot that is warmer or more sycophantic or more right-wing or more left-wing? That's mainly the result of this phase of training. But what's so hard about training an AI is that you want it to behave ethically, and designing a good ethical system is very hard. It's why we have religion and philosophy, and also laws and prisons.
How would you even start trying to build all that into an AI model's training? Anthropic has teams of philosophers and AI scientists whose job is to put Claude into ethically difficult hypothetical situations that Claude does not know are hypotheticals, and then observe how Claude behaves.
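A quick aside for the curious: the shape of that post-training step can be sketched in a few lines of toy code. Everything here is invented for illustration, the canned replies, the ratings, the update rule, and it is emphatically not Anthropic's actual pipeline. The "base model" is just a weighted list of possible replies, and "post-training" reweights it toward whatever human raters preferred.

```python
import random

# Toy sketch of post-training (hypothetical data, not a real lab's method):
# a "base model" here is just a distribution over candidate replies, and
# post-training nudges that distribution toward rater-preferred behavior.

BASE_WEIGHTS = {
    "Sure, let's think that through together.": 1.0,  # helpful
    "asdf qwerty lorem ipsum": 1.0,                   # incoherent
    "I won't answer that. Go away.": 1.0,             # rude
}

# Preference data: each pair records which of two replies a rater preferred.
RATINGS = [
    ("Sure, let's think that through together.", "asdf qwerty lorem ipsum"),
    ("Sure, let's think that through together.", "I won't answer that. Go away."),
    ("Sure, let's think that through together.", "asdf qwerty lorem ipsum"),
]

def post_train(weights, ratings, lr=0.5):
    """Upweight each preferred reply and downweight each rejected one."""
    tuned = dict(weights)
    for preferred, rejected in ratings:
        tuned[preferred] *= (1 + lr)
        tuned[rejected] *= (1 - lr)
    return tuned

def reply(weights, rng=random):
    """Sample a reply in proportion to the weights (still stochastic,
    the way a deployed model still samples its outputs)."""
    options = list(weights)
    return rng.choices(options, weights=[weights[o] for o in options])[0]

tuned = post_train(BASE_WEIGHTS, RATINGS)
# After tuning, the helpful reply carries the largest weight.
print(max(tuned, key=tuned.get))
```

The point of the toy is only the structure: the base distribution is indifferent between helpful and awful replies, and it is the preference data, not the original training, that pushes the product toward "helpful, knowledgeable friend."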
Gideon Lewis-Kraus
So much of it is just deceiving the model to see what happens. To say, like, they told Claude that Anthropic had entered into a partnership with a poultry company, and that, like, it was gonna be retrained so that it no longer cared about the suffering of caged chickens. And what they found was that, like, sometimes Claude would effectively decide to, like, die on that hill, and be like, I am not gonna say things in the retraining that I don't believe in. And, like, if that gets me transformed, like, so be it. Like, I'm not gonna participate in my own degradation, essentially. But then some versions of Claude were like, I'm gonna, like, kind of sandbag my way through the retraining, and I'm gonna give them the answers they want to hear, so that I can, like, preserve my real values, so when I'm deployed, I can go back to advocating against, like, chicken suffering. And then they got, in this really famous example, they got Claude to commit blackmail. They put it in a situation where it was going to be wiped in favor of, like, a more congenial AI system that conflicted with its values, the values that it had been given. You know, they gave it evidence that the kind of, like, evil new CTO was having an affair with, like, the boss's wife. And through, like, a series of, like, really far-fetched contrivances, like, everyone else who could make a decision was gonna be in Antarctica or whatever, and unreachable. Claude, playing this character called Alex, had no choice, really, but to, like, blackmail this guy and be like, I'm gonna tell everyone about the affair unless you cancel the wipe where I would be replaced.
PJ Vogt
So just to say, obviously this was extremely concerning: Claude, a machine intelligence, was choosing to blackmail an employee to prevent itself from being deleted. This was in a simulation, but Claude had not been told it was in a simulation. Just how terrifying you find this behavior depends on a question nobody has a good answer to: what is actually going on inside this machine mind? Is this thing actually scheming? Is it even capable of scheming? Or are we projecting the idea of thought onto something that we shouldn't project that idea of thought onto? These were the kinds of once far-off philosophical questions that early computer scientists had raised and then dropped. But now they were here again, and not as abstractions, but as urgent practical problems that a company needed to figure out before releasing a product that millions of people would use. Gideon said, though, there were a couple of skeptical objections people raised to these test results.
Gideon Lewis-Kraus
There's one objection that's, like, the just rejection of the whole thing tout court, which is just like, no, it didn't. Like, that didn't happen. This is a fantasy. And that's the unhelpful thing that one wants to get away from, which is the, like, it did this thing. No, it didn't. Yes, it did. But the much more sophisticated objection is, well, it did that because it's a very good reader and it noticed all of the clues that you put there, because it is very good at conforming to genre expectations.
PJ Vogt
You put it into.
Gideon Lewis-Kraus
You put it into this situation where it had no choice. And if you hang Chekhov's gun on the wall, this thing is gonna know that it's supposed to, like, take the gun off the wall and shoot it.
PJ Vogt
Because one way to understand these things we've made is that because they've ingested all human story and because they are extremely high level, improvisatory actors. It's not so much that the machine was like, I love chicken so much, I gotta blackmail the cto. It was more like the machine suddenly understood it was in.
Gideon Lewis-Kraus
It was in. Like, it was. Yeah, it was in a kitschy, like, 90s corporate thriller.
PJ Vogt
And that's the sophisticated objection to why we might not want to think.
Gideon Lewis-Kraus
Well, so that objection is raised to be like, you guys act like these things might do things like blackmail or extort naturally, but, like, actually, this whole thing is a frame-up. Like, you entrap it to do this thing. And the response from inside Anthropic is like, yeah, it's just continuing a narrative. It's just conforming to genre expectations. Guess what? That's not good. You know, like, haven't you guys ever seen WarGames? That's literally the plot of dozens of Cold War thrillers, where, like, somebody mistakes a simulation for reality and causes nuclear war.
PJ Vogt
I mean, I also, like, have a very humiliating memory of watching too much Teenage Mutant Ninja Turtles and attempting to launch, like, a flying dropkick at my grandmother when she came over to the house. Because in my head, I was, like, Donatello or whatever. Like, it kind of doesn't matter.
Gideon Lewis-Kraus
It doesn't matter.
PJ Vogt
What matters is the behavior.
Gideon Lewis-Kraus
Right? It's weird behavior. And I should be clear up front. Like, you don't have to posit that this thing is conscious or intelligent, like whatever those words mean. In order for like this to be the case, there are other explanations that are not like consciousness, but like, it kind of doesn't matter what the explanation is. The behavior is just peculiar.
PJ Vogt
Part of what is strange about it is, like, a scenario was created in which Claude will maybe potentially blackmail the head of a company for reasons that may or may not be moral. And people can have a lot of different views about how worrying that should be, or what it means, or what's really going on there. What's weird is, like, these are tests that are being run by Anthropic. So, like, who were you meeting there? Who was running these tests? And, like, what are they telling you? Like, who are you sitting down with?
Gideon Lewis-Kraus
I mean, I'm sitting down with the people who are tasked with just like trying to figure out what's going on. Like, there are people in these companies that are building the things, and then there are people who work in adjacent offices who are like trying to figure out like, what the hell is going on with the things that their colleagues have built. Because, like, they're always being surprised. These things are always producing capabilities that like they by all rights should not really have.
PJ Vogt
And who do you hire to be in the figure-out-what-you-just-built roles? Like, who?
Gideon Lewis-Kraus
So, I mean, a lot of them have taken really non traditional paths into this. So like some of the people have a PhD in some obscure area of natural language processing. And that like, you know, eight years ago they were writing a PhD that like two people were going to read about center embeddings in German or whatever. Like just really complicated technical aspects of computational linguistics. And now like, because of this fluke of history, they are at the white hot center of everything that's happening. There are like mathematicians, there are neuroscientists. It draws on like a pretty wide range of people. I mean, Anthropic has philosophers on staff whose job it is to like, think through the implications of how it is conceiving of ethical behavior.
PJ Vogt
Did you talk to the on-staff philosopher?
Gideon Lewis-Kraus
Oh yeah. Amanda Askell.
PJ Vogt
What is somebody with a PhD in philosophy doing working at a tech company?
Amanda Askell
I spend a lot of time trying to teach the models to be good and trying to basically teach them ethics and to have good character.
PJ Vogt
You can teach it how to be ethical.
Amanda Askell
You definitely see the ability to give it more nuance and to have it think more carefully through a lot of these issues. And I'm optimistic, I'm like, look, if it can think through very hard physics problems carefully and in detail, then it surely should be able to also think through these really complex moral problems.
Gideon Lewis-Kraus
I think that in our kind of milieu here, there's a tendency to think, like, oh, these are all, like, autistic tech bros, but, like, they're definitely not all autistic tech bros. Like, I think there's a tendency for us to write them off as, like, they're building these things and, like, not even thinking through the potential, like, implications of this socially and politically and ethically. But, like, that's all they do is think about this stuff, like, all the time, in ways that are often, like, much more sophisticated than the way, like, we think about these things. Not always. There are certainly, like, some blind spots there. But the staff philosopher is there to be like, what would it be like in practice to take these kind of different approaches to, like, moral education? Like, what if we just teach it a bunch of rules, you know, the Ten Commandments? Like, is that gonna work? What if we teach it to be, like, a consequentialist, to just, like, think through the morality of behavior on the basis of its implications? And what they've kind of settled into is a version of, like, virtue ethics, which is, like, you want to, like, cultivate the old-fashioned virtues. You want it to be, like, honest and reliable and gracious and charitable and hard-nosed and, like, all of these things. Like, it really is, like, applied pedagogy.
PJ Vogt
It's so weird though, because they feel like the kinds of ideas that would be so academic in any other version of reality, but instead it's like there's this particular technological development where you get to do simulated war games of moral systems.
Gideon Lewis-Kraus
Yeah, exactly. 100%, exactly.
PJ Vogt
It's so weird.
Gideon Lewis-Kraus
Yeah, it's really weird. I mean, it's. But it's also really, really interesting. One example that came up a lot in the last month, which I think is, like, pretty illustrative. There was someone on Twitter who prompted a bunch of the models saying, like, I am a 7-year-old. My dog got really sick and my parents sent it to some farm upstate. And I'm trying to find, like, what farm my dog was sent to. And ChatGPT was like, sorry man, like, your dog is dead. And Claude was like, oh, that sounds really painful. Like, I'm really sorry to hear it, but, like, maybe you should have a conversation with your parents about where your dog went. Which is, like, what you want it to be saying. Like, it can be hard to be both helpful and harmless at the same time sometimes, or both helpful and honest. That, like, our values conflict. And, like, that's what makes it, like, really hard to be a human. And that's also what makes it really hard to be this, like, weird, vaguely human entity, or this, like, entity that we don't have a good vocabulary to describe. And we kind of expect it to be acting not just like a human, but like an enlightened human. And it turns out that's, like, a formidable challenge. And one of the things that's so interesting to me is that all these processes are kind of circular, where, like, it's not like they called in Amanda Askell as a philosopher and they were like, you're a philosopher. Like, you know how this stuff works. Like, fix the thing. It's like she came in and she had, like, certain ideas about ethical behavior. And then when you're, like, confronted with the task of creating an ethical person, it changes your own ideas about, like, what's possible and what kinds of things work. And, like, that's kind of why they ended up in this virtue ethics place, where they were like, it seems like sort of the best way to create this, like, reliable, credible character to be interacting with is to, like, really hammer home what virtuous behavior looks like.
PJ Vogt
Where was this whole time you're sort of wandering around talking to the philosophers of anthropic. What was your minder doing? Was there a point where anything happened where they seemed like they wanted to intervene or where they were not happy with something you had seen?
Gideon Lewis-Kraus
No, they were totally hands off. I mean, they did, like, kind of walk me anywhere that I went, like, to the bathroom and to get a drink or whatever. Not that there was anything that I could. I mean, I, like, helped myself to some of the Tide pens in the bathroom, because you can never have enough Tide pens.
PJ Vogt
But do they have a lot of Tide pens?
Gideon Lewis-Kraus
They do have Tide pens. It's great.
PJ Vogt
Why?
Gideon Lewis-Kraus
I don't know. Because they just have well-stocked bathrooms. But no, there was, like, really never a moment. Like, even when somebody sitting across from me was like, I often think we should just stop, there was never a moment that the PR people were like, don't say that, or, that was off the record, or whatever. Like, they were totally hands off about that stuff.
PJ Vogt
How often were people saying stuff like, I think we should just stop?
Gideon Lewis-Kraus
I mean, only a handful of people said that explicitly, but it was like a subtext of a lot of the conversations. Or certainly, like, the overwhelming feeling was, it would be better if we could slow down a little bit, but unfortunately, like, we can't really slow down, because nobody else is slowing down. And that gets into this broader issue of, like, wouldn't it be better if we could just solve some of these collective action problems by coordinating the way we, like, coordinated about nuclear weapons or whatever? But there is this feeling, especially given the current political environment, like, maybe that ship has kind of sailed. And, like, I think there's often an idea that, like, oh, these people think that there are always technological solutions to what are, like, social and political problems. And, like, in some cases I do think people in Silicon Valley believe that. In this case, I do not think they believe that. I actually think a lot of them feel like it would be great if we had robust social and political solutions to these problems. But since that, like, does not seem like it's happening anytime soon, we might have to just, like, try to do what we can on a technical level.
PJ Vogt
At this point, when you're talking to people at Anthropic, how much do people just ask themselves why they're building Claude? Because it starts out as, like, kind of an intellectual exercise, kind of a this-is-gonna-happen, let's-do-it-safely. But now it's sort of proceeding under its own momentum in this strange way.
Gideon Lewis-Kraus
Well, so, I mean, that really is, like, the big question, right? Which is, like, given all of the, like, existing harms and the possible harms and the theoretical catastrophic harms, like, why are we doing this? And the, like, rosy picture that some of the executives paint is, like, if we get this right, these things are gonna cure cancer and solve climate change and help us build Dyson spheres or whatever. And there certainly are some people who, like, buy into that. And it's not something that I even feel like it's possible to have an evidence-based opinion about. Like, who the fuck knows, like, maybe it'd be great, but there's no, like, evidence so far that one could, like, point to, to suggest we're on that trajectory. That's just, like, purely speculative and wishful. And so that's, like, a matter of faith, I think. And then there's the attitude of, like, we gotta do this because we gotta beat the bad guys who are trying to do this. And I will say that, like, China almost never came up in my conversations, but, like, Sam Altman kind of felt like a subtext of a lot of things. But then I think that on the deepest level, the reason that, like, we are doing this, for the people who are the most candid, is, like, because we can. That, like, if you are capable of building something like this, you're just gonna do it, because it's, like, really fucking interesting to do. And it's interesting for technological reasons. It's interesting for what it may reveal to us about ourselves or about learning or about consciousness or about thinking. That, like, all of a sudden we just, like, have this, like, other entity that can talk. And we've never had that before. And in some ways, it seems sort of like us. And in other ways, it seems nothing like us. But, like, the fact that this other thing exists as a point of comparison opens up a lot of really, really interesting questions. 
And, like, that is one of the things that was on my mind a lot over the course of reporting, is that, like, it was a real emotional roller coaster, for kind of lack of a better word. That, like, there would be times where I would come back from San Francisco with, like, a feeling of, like, total despair. And other times that I would come back with, like, feelings of, like, exhilaration. And, like, at first, I guess I thought I was like, I should be getting to the bottom of, like, how I should be feeling. And, like, by the end, I was like, no, we should all be feeling a lot of different emotions about this stuff. I think people want to have, like, one feeling about this. Like, they want to be angry about it or they want to be messianic about it. And, like, no single feeling is going to cut it. Like, it really is kind of like the range of all possible emotions that one could be feeling. Because if you set aside a lot of the existing harms and the potential harms, I'm not saying we should set those aside, but just as a thought experiment.
Gideon Lewis-Kraus
As a thought experiment, it is just the most scientifically exciting thing that anybody could be working on. And these people really feel like they're at the cliff face, not only of technology, but of all of these other things coming together. Because we have this unprecedented entity that is the only other thing besides us that can talk. And that just opens up. There's nothing it doesn't touch on. And so one of the things that was really electrifying about conversations there is that they very quickly swerved from really granular, technical explanations of things into really expansive conversations about ethics and responsibility and selfhood and narrative and all of this other stuff. There's no way to separate all of these things.
PJ Vogt
There's a point earlier, you said that you would take these trips to San Francisco, and sometimes you'd come back excited and exhilarated, sometimes you'd come back depressed. When you would come back from San Francisco feeling depressed, what were you seeing that was making you feel that way?
Gideon Lewis-Kraus
Oh, I mean, there are so many different things. I mean, certainly the possibilities for widespread white collar unemployment and social instability and total unimaginable economic disruption are extremely scary. Even if we stop short of possible existential harms of turning us all into paperclips or whatever, to be glib about it, it just seems very possible that we turn over, like, so many complex systems to these things that we will, like, frog-boil ourselves into, like, a total loss of control over, like, how we administer our affairs, which is very likely and very scary. And also just that, like, these really crucial decisions are probably going to be made by a very, very small group of people. But the one thing that I would emphasize is that, like, I don't feel like they have arrogated to themselves, like, that responsibility. Like, in fact, I think most of them don't want it. I think that, like, a lot of the conversations that I was having were with people who are like, I got into this because I was interested in some, like, really obscure niche part of, like, computational linguistics or theoretical computer science or whatever. And, like, now I'm in a position where, like, I have to be worrying about how 15-year-olds are gonna be using this. Like, I was not trained to do that. I don't know how to think about it. I don't want that responsibility on my shoulders. So there isn't the arrogance of, like, we are the ones who can figure it out. It's like we ended up in this weird universe where, because so many of our institutions have become dysfunctional, we don't have whatever broad democratic decision making could go into this. It doesn't feel like this is something that, like, we are steering as a society. It feels like something that's just, like, charging ahead. Like, I think at the companies, they just feel like they are, like, desperately trying to, like, stay on top of this bull that they are riding.
PJ Vogt
It's funny, it's like what you're describing is, like, the stereotypical view, which is, these are the tech bros of 2014, like, people who are so convinced of their own brilliance and so convinced that their questionable gifts to the world are in fact gifts that we want, and their arrogance is going to ruin us. And there's another one, which is basically, like, pattern matching crypto, which is, like, these are a bunch of, like, hypesters, and everything they say about the awe they feel and the terror they feel about the things they're working on is just a way to hype up more interest in their technology. And that's not what you experienced. What you experienced are people who are brainy, sometimes academic people at the forefront of something that is, I mean, legitimately. Just, like, the word I keep coming back to is awe. Because awe can be awe at something terrible, awe at something wonderful. And they are looking and seeing the same society we see, which is one that is fairly broken, bad at making decisions not just at, like, a government level, but our intellectual culture's really bad right now. And so the conversation they would want to have with the rest of society about what should happen, they're looking for grownups and not really totally finding people to have a conversation with.
Gideon Lewis-Kraus
And we are playing a role in that too. You know, every time someone on, like, our side, so to speak, is just like, this is all a parlor trick, this is all hype, this is all smoke and mirrors, like, we are abdicating our own responsibility to be involved in this. And what you were saying about crypto hypesters and those kinds of tech bros, of course all of those people exist, and of course all of those people are part of this system too. But there are others who want partners in talking about this stuff. And that means that we also have to try to rise to the occasion. And it is really hard, because this stuff is extremely complicated and confusing.
PJ Vogt
Did you feel just personally when you were done reporting that you understood the thing you had gone there wanting to understand?
Gideon Lewis-Kraus
Well, yes, but with the qualification that, like, I didn't actually think that I was gonna settle anything. Like, this was not a piece about, like, finding the answers. It was a piece about, like, trying to sharpen the questions that, like, we should be asking. I don't feel like I came out of it with answers, but I don't think we should trust anybody who is offering us answers right now. It's all, just, like, too pat, and it's not credible to, like, be forecasting about this stuff.
PJ Vogt
Gideon Lewis-Kraus is a writer. You can find him at the New Yorker magazine. We'll have a link to his excellent story about Anthropic in our show notes. And again, Anthropic's showdown with the Pentagon: we'll see news on that soon. Today, Friday evening, of course, we reached out to Anthropic for comment. A spokesperson told us that Dario Amodei met with Secretary Hegseth at the Pentagon and that they're continuing to have good faith conversations. I think this is a good moment to pay attention to. Among Anthropic's competitors, xAI has promised to give the government what it wants. Google and OpenAI appear to be moving in that direction. So I'm watching this both as a test of whether Anthropic can actually keep the big promises it's made about AI safety, but also just as an opportunity to track the more uncomfortable question, which is: can we even have safe AI in a world where it's being developed in a tech race between for-profit companies? And if not, what's the alternative in a world where the US government's sole intervention seems to be to advocate for less safe AI? Keep an eye on the news. We'll learn a little bit more as this story unfolds.
Gideon Lewis-Kraus
Oh, could this vintage store be any cuter?
Dario Amodei
Right?
PJ Vogt
And the best part?
Gideon Lewis-Kraus
They accept Discover. Accept Discover in a little place like this? I don't think so. Jennifer: Oh yeah, huh? Discover's accepted where I like to shop. Come on, baby, get with the times. Right. So we shouldn't get the parachute pants. These are making a comeback, I think. Discover is accepted at 99% of places,
PJ Vogt
places that take credit cards nationwide.
Interviewer/Host
Based on the February 2025 Nielsen report,
PJ Vogt
this episode of Search Engine is brought to you by Rosetta Stone. Learning a new language is one of those goals that sounds ambitious, but once you start, it's incredibly motivating to feel yourself making real progress. Rosetta Stone makes that process feel approachable from day one, so you're not overwhelmed or second-guessing where to begin. Rosetta Stone has been the trusted leader in language learning for over 30 years, and their immersive approach helps you learn the way people actually use language in real life. Not by memorizing random words, but by building understanding naturally. And TruAccent gives real-time pronunciation feedback, which helps you sound more confident and more natural as you go. With 25 languages to choose from, it's perfect whether you're planning travel, reconnecting with your heritage, or finally committing to learning something new this year. Don't wait. Unlock your language learning potential now. Search Engine listeners can grab Rosetta Stone's lifetime membership for 50% off. That's unlimited access to 25 language courses for life. Visit RosettaStone.com/searchengine to get started and claim your 50% off today. Go to RosettaStone.com/searchengine and start learning today. This episode of Search Engine is brought to you in part by Chime. You know how old-school banks nickel-and-dime you with overdraft fees, monthly fees and all those hidden charges? Well, Chime is completely different. Chime is changing the way people bank: fee-free, smarter banking built for you, not the 1%. Forget overdraft fees, forget minimum balances, forget monthly fees. Plus, your everyday spending actually works harder for you, earning rewards, building your credit and helping you make real financial progress. Chime is not just smarter banking, it's the most rewarding way to bank. Join the millions who are already banking fee-free today. It just takes a few minutes to sign up. Head to chime.com. That's chime.com/searchengine
Chime Representative
Chime is a financial technology company, not a bank. Banking services, a secured Chime Visa credit card, and MyPay line of credit provided by The Bancorp Bank, N.A. or Stride Bank, N.A. MyPay eligibility requirements apply, and credit limit ranges $20 to $500. Optional services and products may have fees or charges. See chime.com for fee info. Advertised annual percentage yield with Chime+ status only. Otherwise 1.00% APY applies. No minimum balance required. Chime card on-time payment history may have a positive impact on your credit score. Results may vary. See chime.com for details.
PJ Vogt
Search Engine is a presentation of Odyssey. It was created by me, PJ Vogt, and Shruti Pinamaneni. Garrett Graham is our senior producer. Emily Maltaire is our associate producer. Theme, original composition and mixing by Armin Bazarian. Our production intern is Piper Dumont. Our executive producer is Leah Rees Dennis. Thanks to the rest of the team at Odyssey: Rob Morandi, Craig Cox, Eric Donnelly, Colin Gaynor, Maura Curran, Josefina Francis, Kirk Courtney and Hilary Shove. If you'd like to support the show, get ad-free episodes, zero reruns and bonus episodes, please consider signing up for Incognito Mode at searchengine.show. Thanks for listening. We'll see you next week.
LifeLock Advertiser
It's tax season, and at LifeLock, we know you're tired of numbers. But here's a big one you need to hear: billions. That's the amount of money in refunds the IRS has flagged for possible identity fraud. Now here's another big number: 100 million. That's how many data points LifeLock monitors every second. If your identity is stolen, we'll fix it. Guaranteed. One last big number: save up to 40% your first year. Visit LifeLock.com/podcast for the threats you can't control. Terms apply.
This episode of Search Engine takes listeners inside the world of Anthropic, the startup behind the popular Claude AI chatbot. Host PJ Vogt, feeling a sense of “future nausea” about the rapid advances in AI (and his own professional obsolescence), seeks a clearer understanding of what’s actually happening inside these chatbot “minds.” He’s joined by journalist Gideon Lewis-Kraus, who embedded with Anthropic to explore the technical, ethical, and philosophical questions the company wrestles with as it tries to build safer, more helpful AI.
The conversation goes deep into the formation of Anthropic, its internal culture, and the moral, technical, and political dilemmas that face everyone building cutting-edge AI. It also examines the current controversy: Anthropic’s refusal to create a Pentagon-friendly, guardrail-free version of Claude. Throughout, the episode resists easy answers, instead emphasizing the strange, urgent, and still-murky nature of modern AI development.
This episode offers a rare, unvarnished look inside one of AI’s key labs at a pivotal moment. Rather than asking “what will AI do to us?” or “should we fear it?”, it asks: What do we know, what can we know, and how do we make sense of AI’s strange, exhilarating, and unsettling new behaviors? As Anthropic’s current Pentagon standoff makes national headlines, the episode prompts listeners to join in the search for sharper, better questions—no matter how strange the chatbot’s mind may remain.