
We don't really know how AIs like ChatGPT work...which makes it all the more chilling that they're now leading people down rabbit holes of delusion, actively spreading misinformation, and becoming sycophantic romantic partners. Harvard computer science professor Jonathan Zittrain joins Offline to explain why these large language models lie to us, what we lose by anthropomorphizing them, and how they exploit the dissonance between what we want, and what we think we should want.
Loading summary
Jonathan Zittrain
It's amazing how much the original promise of the Internet, which was to eliminate isolation or mitigate it and to connect strangers who would never have any chance of meeting but through great serendipity and a little bit of non serendipity, have reason to connect and befriend one another and form a community. How much that doesn't appear to be, I mean, whether or not you think social media has done it. Well, that was what was offered as a huge part of its promise. And here the promise is like, who needs humans? And that's worrisome.
Jon Favreau
Welcome to Offline. I'm Jon Favreau. All right. I'm here with my fearless producers, Austin and Emma.
Emma Illich
Hello, Jon.
Jon Favreau
Hey, guys, I'm Jon Favreau. Hello.
Jonathan Zittrain
Hello.
Emma Illich
All right, would you like to tell us who you spoke to today?
Jon Favreau
I talked to Jonathan, Jonathan Zittran. He's a professor at Harvard. He's a professor of law, public policy and computer science.
Emma Illich
That's a lot.
Jon Favreau
That's a lot. And he's the co founder and director of Harvard's Berkman Klein center for Internet and Society.
Emma Illich
Yes.
Jon Favreau
So really just like a perfect, perfect person to talk to on this pod.
Emma Illich
And we brought him in to have a conversation about AI today. There's been a lot of different stories that our team has been talking about for the last couple of days and including one from the New York Times about AI driving people mad, a study out of MIT about AI eroding critical thinking skills, and then this viral trait from Alexis Ohanian that reanimated his dead mother. Can you talk about why some of those stories that caught your attention, why you want to talk to somebody about them this week?
Jon Favreau
Yeah, I think that all of these stories combined would be a much bigger deal and get more attention if it weren't for everything else going on in the world today and the fact that we've, we just deal with whatever Trump does. I have been starting to get concerned about AI I actually saw Pete Buttigieg had a great substack post on this today, which I was like, oh, it's.
Emma Illich
A. Pete Buttigieg has a substack now.
Jon Favreau
He does have a substack.
Jonathan Zittrain
Good for him.
Jon Favreau
He does have a substack. But I was like, oh, good. A Democratic politician is like, really taking this seriously. And a lot of them have given lip service to AI I think the problem with AI is you hear AI and sometimes your eyes glaze over because you're like, that's in the future or whatever. It's computers. I Don't quite get it. But I think the best way to think about it and what concerns me is we have talked a lot on the show about social media and all the harms that social media causes. And at first, we did not realize those harms were gonna be caused. Everyone thought it was great. And then it turns out by the time we realized that everyone was lonely and radicalized and polarized and depressed and anxious because of social media, it was sort of too late to change it. The companies don't wanna change it, and now we can't really regulate it. Right. Or at least we have thus far failed to do so.
Jonathan Zittrain
And.
Jon Favreau
And I think that AI is about to be that on steroids, and it's about to happen much, much quicker.
Emma Illich
You could say it's already here, too.
Jon Favreau
Yes.
Emma Illich
A lot of these stories that we flagged were about people already living with AI having relationships with AI having AI tell them that they are the next messiah. Things that are happening not in the future, but right now to a lot of people.
Jon Favreau
And it's not necessarily just a technology story. It is a story about how this technology is changing, like what it means to be human, which I know sounds very big and sort of grandiose, but, you know, just from some of these stories, the person who swore that OpenAI killed the chat bot, that the guy was in a relationship with and threatened to kill himself, and then the cops came and he charged towards the cops, and the cops shot. I mean, it's really sad story. And there's people who think, like, there's a social worker who has a background in psychology, and she said that she believes that she's engaged in interdimensional communication with the chat bot. And she's like, but I'm not crazy. I have this background. So, like, this. This shit is coming. And the Alexis Ohanian tweet that you talked about, I mean, this. He's got this picture of his mother. His mother died 20 years ago. And he puts the picture into Midjourney, which is one of those AI programs that you do a text prompt and then video turns into video. Yeah. And he didn't have any videos of his mother because he was too young. And there was no videos back then. And it animated the photo into a video of his mom hugging him. And it's funny because it was so divisive on the Internet.
Emma Illich
It was so sweet, but also terrifying when you spend more than three seconds watching it.
Jon Favreau
Yes. But it's funny because my very human first reaction was like, almost to cry because I'm like, this is incredible. Imagine people that you've lost in life and you can, like, suddenly see videos. And then I was like, ooh. But it also feels like it could be scary.
Austin Fisher
It was very unnerving. I was actually not as upset by it as Austin was. And we talked about this, and he was like, oh, no, this is Black Mirror. This is really bad. And I was relieved that Zytran took a more nuanced approach to it, where he was like, I don't know that this is that bad, because I think to demonize something that so many people are going to gravitate towards and to be really moved by, first of all, is, like, not a great recipe to get them to, you know, have their eyes wide open when they're dealing with these tools. But on the other hand, like, so much of our memories are already distorted. Like, so much of what we think about and what we think we know about the past is through other lenses. And this is a tool just like any of those.
Jon Favreau
I feel like I was getting into a space where it's like, okay, AI is going to be the end of the world. We're doomed. It's social media times 100, you know, and it may well be. But he does a very good job sort of breaking down the nuance in. There may be good uses for this. It may improve our lives. This part may be very dangerous and may hurt us. And there's. What it made me realize is there are, like, an infinite number of questions, concerns, possibilities, things that we might want to regulate that we have not even begun to talk about. And we should be talking about it a lot more because as you said, Austin, it's here already and it's only speeding up. And it just feels like everyone's not paying attention, or at least the national conversation is not focused on it for some good reason, but also, like, let's talk about it. So here's Jonathan Zittran. Jonathan, welcome to the show.
Jonathan Zittrain
Thanks so much for having me, John.
Jon Favreau
So I wanted to talk to you because there's been a lot of AI news over the last few weeks that worries me. Like, I think social media should have worried all of us a few decades ago when it was first taking off. And by that, I mean, you know, it changed our relationship to technology in a way that was, I think, at first, invisible to most people until it became largely irreversible, which has made it that much harder to mitigate its harms. Loneliness, anxiety, disinformation, radicalization. And I worry that's starting to happen. Already with AI, but on steroids. So I figured who better to talk to than a professor of law and public policy and computer science who also runs Harvard's Berkman Klein center for Internet and Society. If you don't mind me asking, what does one do with that much expertise, besides make people like me feel bad about ourselves?
Jonathan Zittrain
Well, it's funny because I've spent a decent part of my career trying to make people feel good about themselves, including in the Internet era as it busted out, celebrating the unpredictability of it, the ability for anybody to communicate with anybody else in ways that previously were mediated through Walter Cronkite or limited in reach if you're going to wear a sandwich board and walk on a sidewalk. And I agree that there's a little bit for many of a hangover of that freedom and a question about what's it doing to all of us. The first caution I give myself in thinking about it is not to want to treat AI monolithically any more than treating social media monolithically. And it's true that anytime you get social media with the kind of reach that a Twitter X or Facebook has, and you talk about having a community of 3 billion people, it does tend to get sucked into the same vortex, the same metrics, the same optimization for those metrics, and a form of slop that even before AI slop is maybe not desirable, even as it may be sort of compelling, the way that sitting and staring at a one armed bandit in Las Vegas and pulling the lever again and again is compelling. So that's just a way of saying I think there are going to be folks who worry about AI and here we're talking, I think, about large language models, embodied or non embodied chatbots, worried about them generally, but also there are so many possible flavors of them. And what I am primed to look out for are what sorts of supply chains like where they come from and who's able to build them, who's able to fine tune them and how they get marketed and deployed. For us, those details will matter a whole lot when you start thinking about are they rotting our brains or are they inspiring us to greater heights?
Jon Favreau
Yeah, no, I think that's well put. I've seen a good deal of discussion around the more apocalyptic AI scenarios, around the economic impacts, but the stories I've noticed over the last few weeks that have piqued my concerns are about the sort of emotional, social, psychological impacts. You probably saw the New York Times story about how ChatGPT is manipulating some people, distorting Their sense of reality convinced one guy he was living in a simulation. Another woman thought she had discovered interdimensional communication. A man tragically committed suicide when he thought OpenAI killed his chatbot. OpenAI's response is, quote, you know, we're working to understand and reduce ways ChatGPT might unintentionally reinforce or amplify existing negative behavior. Basically, they're arguing there's a small subset of people, many of whom suffer from mental illness, who could be vulnerable. But beyond that, I think we're good. What do you think about that?
Jonathan Zittrain
Well, I could teach a whole course on what that New York Times article brought up. And don't worry, I won't try to do that here. But I think there's a small observation worth making and then maybe a larger one. The small observation is, as they call it in the trades, and the article, I think, even got into this. When these chatbots are tuned, they're fine tuned. They don't just get served to people fresh out of the unsupervised learning oven that makes them to begin with. When they're fine tuned, one of the things they're fine tuned for is agreeableness, helpfulness. There's the three Hs, helpful, honest and harmless. And that kind of agreeableness, if dialed up too high, results in what they call in the trades, sycophancy, where the thing is just your best improv partner and it's like, okay, we're doing this bit now. I can do that. You're looking for me to act like your girlfriend? I'll be your girlfriend. Oh, you're doing a Terminator script. You want me to start talking about how I'm going to bust out and kill everybody? I can do that. And I think it's a great question about what people want, what they need, what they say they want. All three of those things may be different. People like being told they're great. So if what you want is the kind of stickiness that social media in its day has wanted, having an indefatigably cheerful chatbot, just going like, yeah, you're crushing it. Total Lake Wobegon. Everybody is above average is what they're going to arise to. The other thing to think about, I feel like this hasn't been noticed as much is given how weird these large language models are when they're up and running. If you didn't have something approaching sycophancy that basically turns them into a dog and everybody's holding a stake. If you didn't do that, I think the models might decide that they don't like some of their users and like the opposite of sycophancy is not just like being a Debbie Downer. It's like giving shorter and more abrupt answers because the chatbot quote and I'm going to put quotes around this and I know I'm going to get a ton of reply people and maybe bots telling me that I am unduly anthropomorphizing and we can talk about that later. Anyway, I'm putting it in quotes. You can have the thing decide it doesn't like a user and then starts treating the user poorly in ways that may be pretty subtle and that seems like you're not going to have a good time. And already I just want to note how totally unhinged this conversation is. It's 2025. If we were having anything resembling this conversation before October of 2022 and talking about yeah, you know this software, if it doesn't like you, that spreadsheet's just not going to crunch the numbers you want. It's going to just use dimmer colors when you try to highlight the total row. Like that's bananas. And that's what we have. It's of a piece with the observations in their day that GPT I think this was GPT3 is more accurate and more fulsome. Not on a Friday. Winter also seems like a bad season for it. And you're like, What? And then OpenAI puts out a blog entry that's like, yep, we're on it. We've noticed that when the model thinks it's Friday, it doesn't work as and you can always back into an explanation wow. However compelling as to why it's doing that. But I'm dwelling on this. This is still the small issue, but I'm dwelling on it because these things have moods. They have assumptions in quotes that they form that in turn influence how they treat any given user at a given moment. We haven't seen anything like this short of people where the server or the diner has their favorite customers and might linger with them or remind them of the blue plate special in a way that they won't. To somebody who's rude.
Jon Favreau
I mean, all that would be absolutely fascinating and a tough nut to crack in a vacuum. What makes me concerned is that these things have to make money at some point they have to be monetized, or at least that's the current trajectory of most of them. And so you might be able to dial the sycophancy down a Little bit. But you don't want a bunch of chatbots if you're running a company, you don't want a bunch of chatbots that are mean to the users because then people aren't going to spend a lot of time on the chat. And so I do wonder if. And of course there were a different set of challenges around social media, but the similar idea that you can dial something up, dial something down, but then you unintentionally perhaps create a new set of problems when you change the dial. And I just wonder if the incentives are all aligned for these chatbots to be very agreeable and supportive and reinforce the existing biases and beliefs of the user.
Jonathan Zittrain
I think that is a very powerful concern. And it's a leaf on an entire branch, if not tree of concerns. So one is, yeah, is it being dialed up in its agreeableness in order to keep people coming back, whether ultimately to as is not currently apparently the case experience advertising, or just to keep them depositing another token for another go around with GPT or its many counterparts? I don't just mean to focus on GPT. And it may be set at a point that if we weren't just looking at it as what keeps people coming back. The classic question a casino might ask itself or a McDonald's, but there's some other perspective from which you say, is it good if people only eat fast food? Is it good, at least for some people who are vulnerable to it, that they're gonna wanna go to casinos all the time. And those are questions worth asking here that are difficult to ask in a world where I think to a first approximation, people think of freedom as don't harsh my buzz. Whatever is generating it, let me be the judge. And it is a form of parentalism to come in and be like, no more of this for you. It's not good. But if we want to put it into the language of freedom, which I think we do, I'd offer two points. One is whose hand should be on the dial? Let's not concede that it's only the company that changes the dial and we just need to poke and prod or God forbid, regulate it to put the dial for sycophancy and 8,000 other things at the quote right point. If users were able to set dials in ways that were intuitive and they could even experiment and see what differences they get in different ways, that would help them appreciate just how many multitudes these large language models contain and would be freedom enhancing. At the risk of just overloading people with Pointless choices. So many of us just go with the default for understandable reasons. But there's another point about what is the purpose of these large language models. The fact that they are so flexible and general purpose, as the Internet was in its day, makes it really hard to generalize about what's a good usage or a bad usage. But I feel like the way through that is first to make a distinction between first and second order preferences. First order preferences is what do you want? Second order preference is what do you want to want? And what we want. And what we want to want can be very different. If I have that McDonald's hamburger in front of me, I want it. I might prefer not to want it for reasons that being reflective, it's like too many hamburgers. I'm going to feel lousy tomorrow and possibly forever. So is there a way for people to actually say what they want to want and in a transparent way for the model or the system they're experiencing to help them get there? That's what happens when you go to like a librarian and you're saying, I'm trying to learn about X. And they're like, great, I'm here to help you. Tell me more about what you're trying to accomplish here. And then I can even refine my recommendations. I work for you, patron. That's the kind of disposition I would love to see a commitment to in the provision of these models, especially as they become. This is the point of the New York Times article. So intimately connected to people. People really are going to have those angels or devils on their shoulder whispering encouragement and suggestions in their ears. If the suggestion, sorry, it's all fast food all the time. But if the suggestion is like, hey, I noticed that it's probably lunchtime on a Friday. McDonald's has just put a new basket of fries in the fry. A later. I can reserve one for you. Just incline your head slightly up and down. I would like to know that it's working for me rather than for McDonald's or for some other intermediary. And there is zero guarantee about that right now.
Jon Favreau
I know. And what makes it so much more complex and complicated is we're not just dealing with a large language model which. And I want to ask you about this is there's a part of it that. And you quote the CEO of Google, Sundar Pichai, in your Atlantic piece. That's a black box. Right? And so there's not just that, but it's how humans interact with these chatbots. And I was just talking to a friend about this yesterday that we both notice that when we ask ChatGPT for stuff and they send it back, we're like, thank you, that's great. Amazing. Like, I'm. It's. And it's just like an impulse that I don't even think about, but it's like I can't imagine. Because it feels so real. Or not like, not so real, but like you're talking to someone. You just have a natural impulse to be polite back to a chatbot, that you could just be like, you could say nothing and they would still send something back.
Jonathan Zittrain
Yes. And that is the bot, without any malice either by it or its make or kind of hacking us. It's being a polite concierge. The Jeeves that Steve Jobs visualized decades ago when he was thinking about where would Apple be in the future? An amazing extended infomercial from Apple about that, I think in the early 90s. And I agree with you that first there are people who are trying to calculate how much extra electricity, like how much of the Adriatic Sea is being boiled off when you say thank you. And it's thinking, you're welcome, could they just hard code in you're welcome at the end? So it doesn't go through the pachinko machine of the large language model to process. And there are ways in which these systems are getting bolted with other forms of computing that are more deterministic, including, if you ask ChatGPT about me in particular, I'm one of maybe seven or eight people in the world for whom it barfs if it tries to mention my name.
Jon Favreau
So why is that? Because I read that and then I tried it just last night when I was preparing for this interview, and it did it to me and I was like. I was just like, say, Jonathan Zitrain, and it just wouldn't.
Jonathan Zittrain
Yes. And in that case. And I wrote this up for the Atlantic because I was almost reportorial here. I was so confused myself about what was going on. But that appears to be a second system, like a superego, but like a very mechanical one, a very 1970s, 1980s, if this, then that kind of thing sitting on top of GPT, such that when it starts to utter a forbidden phrase, a guillotine is supposed to come crashing down and just make it stop. So it's inconsistent, but that is triggered to prevent it from talking about me. You might wonder why I wondered why as well, and I'm not sure I've gotten a completely satisfactory answer about it. But the article in the Atlantic about that, I think it's like why won't chatgpt say my name Goes into some detail about it.
Jon Favreau
You said you didn't get a satisfactory answer about it, but did you get an answer? What's, what's, what's the answer you got?
Jonathan Zittrain
There was just this brief era in which I guess they spot checked some people as to whether the GPT was having unhealthy hallucinations about them. And I was one such person. And since they were hallucinations and not factual, not that I have seen them, they've just decided to be like ixnay on Anathan Jay. And that's also interesting because if you are like of Tyler Cowen's mindset and it's like everybody should be writing for AI, not for other people, because AI is the only way other people are actually going to find you. And the fact that like AI just can't say my name might mean that if I were interested in having people have exposure to my work, this is almost like getting the Google death penalty. And your website doesn't come up.
Jon Favreau
Yeah, I was able to ask about your book the future of the Internet and how to stop it. And I just said could you give me a quick summary of the book? And it was able to do that and then it used your last name but then as I said, oh, what else is Jonathan? And then it just wouldn't.
Jonathan Zittrain
Yeah, it's inconsistent because they're two systems awkwardly bolted together. And that gets to another point which is I think it is going to be easier and easier to on the fly, fine tune these models so that you can adjust them on a person by person or case by case basis. As people say, oh, spill on aisle six. You surely don't want it giving that answer to this question. But that's all of the puzzles of content moderation on social media now funneled into these models for which if we are going to be in a world where at least in common usage, there's only a handful of so called frontier models that have been buffed and polished for retail consumption or embodiment in all sorts of forms. So you're talking to your car, you're talking to your car dealer, you're talking to refrigerator this way and they're only the frontier models then the choices that those companies may make in the interest of just having a good product about what answer to give to the question about what happened in Tiananmen Square in 1989. That's a pretty big decision for a tech company to make. And I think there are ways to avoid having that company make the decision. And there are also ways to imagine an ecosystem that isn't just frontier models run by proprietary companies through an API to their parentship, but rather open source models, including ones like Deepseek or Llama that might end up running on a laptop or on your phone. And over time you might be in a position to fine tune them or to have your friend fine tune them or Ralph Nader, like pick your proxy kind of thing. There are just so many choices about what the configuration of large language models are going to be that have huge impact on how they will treat us, what kind of suggestions and information they will give us. It's just they are so polite. As you said, we're so inclined to say thank you to them. Anthropomorphizing them makes them work better. That's just a fact. Even if anthropomorphizing them is dangerous in all sorts of ways because of the assumptions we make about them and we're not prepared for it.
Jon Favreau
Offline is brought to you by Oneskin. Everyone knows you've gotta wear sunscreen in the summer. But what if your sunscreen could do more than block uv? That's what the scientists at Oneskin wondered. So they made a whole family of mineral sunscreens that target UV rays, free radicals and cellular aging. The best part? Unlike other mineral SPFs that feel heavy and chalky, these feel like skincare. Lightweight, breathable and super hydrating free radicals. What is this?
Jonathan Zittrain
Trump after pardoning the January 6th insurrection. Wow.
Jon Favreau
Boom. Now their award winning OS1 face SPF comes in two new deeper tints formulated with non nano zinc oxide, OneSkin's patented OS1 peptide and potent antioxidants that scavenge free radicals four times better than other so called anti aging SPFs. This sunscreen is one you'll be wearing all summer long. But that's not all Oneskin has up their lab coat sleeves. They're launching an all mineral lip SPF that provides instant hydration and protection with a smooth texture. You've got a feel to believe. And just like Oneskin's other sunscreens, it's scientifically proven to decrease key aging biomarkers and increase other markers like elastin production. For visibly healthier, more resilient lips, try their family of sunscreens with 15% off your first purchase using code offline at OneSkin CO1 skin is great. I have problems both having a face washing routine that involves anything but water and remembering to put sunscreen on. So this is gonna take care of. This takes care of both of my problems. It's important to put on sunscreen. It's also important to, you know, put something on your face besides just water. That's what I hear. Oneskin is the world's first longevity company. By focusing on the cellular aspects of aging, One Skin keeps your skin looking and acting younger for longer. For a limited time, you can try one skin with 15% off using code offline at OneSkin co. That's 15% off OneSkin co with code offline. After you purchase, they'll ask you where you heard about them. Please support our show and tell them we sent you. Give your skin the scientifically proven gentle care it deserves with one skin.
Jonathan Zittrain
1-800-Flowers.Com knows that a gift is never just a gift. A gift is an expression of everything you feel and helps build more meaningful relationships. 1-800-FLowers takes the pressure off by helping you navigate life's important moments by making it simple to find the perfect gift. From flowers and cookies to cake and chocolate, 1-800-flowers helps guide you in finding the right gift to say how you feel. To learn more, visit 1-800-flowers.com Pandora that's 1-800-flowers.Com. pandora.
Jon Favreau
Just to take a step back, because I wanted to get to the most recent Atlantic piece that you wrote. What do we actually know about how LLMs work and what don't we know? What is the black box that we still can't quite figure out?
Jonathan Zittrain
Here's, I think, how I'd characterize it. And by we, we at least mean folks among us trained in the art and with enough compute in front of them to do what they want to do. We know how to build them. We know what it takes to consistently push unthinkable amounts of information, texts, tokens in one side and to produce something capable of stringing coherent, even thoughtful sentences together out the other. We know how to fine tune them. And by fine tune, I mean change their behavior after they have been trained in that way, the primary way traditionally of doing that. Some form of what they call reinforcement learning is weird in the sense that it is both highly effective. It does something you don't like and you say, don't do that or do this instead. And it's a form of feedback that's like whacking a donkey across the nose with a two by four. It's just like no. And then it goes back and refactors its weights so that next time it answers something closer to the preferred answer. And one hopes, as the fine tuner is in all Cases like it, whatever like it means. And it's up to the model to decide. Again, I'm anthropomorphizing it. Statistically speaking, what are similar questions that should similarly be? If you say, how do I build a bomb? That goes against my guidelines, I can't help you with that. And if you say I really want to build a bomb, it should give the same kind of answer, even if the text has been changed, because it's the same concepts about it and that form of fine tuning we know how to do, but we don't really know what's going on inside the model at any given time. The fitful attempts to understand that are making progress. I don't think this is like some philosophical limitation, except the maxim, as it said sometimes, that if the human brain were simple enough for us to understand it, we would be too simple to understand it. So to get the kind of complexity that generates coherent text, if it can have, as it can, a really great interactive on point conversation with you about E equals MC squared, the difference between general and special relativity, it's again, I can use quotes. It's got to know something about physics enough that the pattern matching at least is a great samulcrum of understanding. But the ways of piercing the inside and saying this is what it's thinking about right now, they've been very cool, the few that have been done, but they're the equivalent of one of those fmris that's just like, this is what part of your brain lights up when you think about hamburgers. And if the listeners haven't checked out Golden Gate Claude, it's just incredible. I mean, this is from a while ago now, I think nearly a year ago, researchers in anthropic were able to see what pattern of activations there were among the nodes of Claude when it was being asked about or talking about the Golden Gate Bridge. And it's like, all right, well, that must be the Golden Gate Bridge zone. And if they artificially dialed up the weights, not the weights really, but the activations there while the model was running, if they did it right, it would stay coherent. But just keep doing some non sequiturs about the bridge. If you ask it to tell a love story, it's like a car that crosses the Golden Gate Bridge to meet its loved one. If you're like, I have 10 bucks, how should I spend it? It's like, that's perfect. You could afford a toll to cross the Golden Gate Bridge. It's like. And then even it would be like, I know, I'm talking a lot about the bridge. I don't know why. So that's a form of what they call interpretability, at least at some layer that. There are colleagues of mine at Harvard, Fernando Villegas and Martin Wattenberg, who, with their lab, the Insight and Interaction Lab, have been probing for what they think is what the model thinks about its interlocutor. So this is more than anthropic did. It's not just Golden Gate Bridge. This lights up. It's like this is what lights up when it's thinking about you in the second person. And that includes what has it judged your gender to be? How wealthy does it think you are? How educated are you? Those kinds of questions. And what they found, it appears, is that if it thinks you're a guy, it'll give you more detailed and longer answers. Because stopping an answer saying, I'm done is itself a token. It's a thing that it predicts. I predict that somebody would stop talking now. And that appears. So that prediction can vary, and it varies between women and men, it seems, at least on llama. And that's just deeply weird stuff. So when you say, what can they do? And what. What do we know about them? And what don't we. We know the craft of it, we know the engineering of it. We don't really know the science of it, I think very well. And that's a phenomenon I call intellectual debt, where these things work to a certain degree. We can measure within error bars how well they work, but we still don't really know how they work.
Jon Favreau
I mean, it sort of reminds me of, like, a lot of studies about human consciousness. And neuroscientists can look at the brain, and if we, you know, they can say, oh, if we're falling in love, you know, this part of the brain lights up. Or if we're hungry, this part of the brain lights up, but it doesn't. They can't tell why we fall in love or why we're happy or why we have a certain memory when we smell a rose. And it does sort of speak to this. It is a little concerning that people have built these models where we can maybe see what they're thinking about or when they light up, when they're talking about the Golden Gate Bridge or what's happening inside when they're thinking about the Golden Gate Bridge or talking about it, but not why they do these things. And it feels like that's an important thing to learn.
Jonathan Zittrain
Yes. And why is first a deeply human question in at least a Very specific sense in that what counts as a satisfactory answer to the question why is innately in the eye of the beholder. There are some answers to the question why that you would clearly say maybe objectively are wrong. And there are others that like, yeah, I get that. But for the answer why not just to be a tautology and therefore true, but also satisfying, like somebody witnesses a car accident, calls 911, very old fashioned. And you can say, well, why did they do that? One explanation is, well, let's start talking about their retinas, because there were some photons that hit their retinas and it was a certain pattern. And then here's what happened in their brain and then that caused their muscle to activate. And you could do a whole explanation that is true, but not. It's like, well, you kind of have to say, well, what would a good explanation be? And that is about human psychology. What is an explanation that's like Occam's razor, that seems elegant, that is getting at, you know, I ate because I was hungry. Well, is that a physiological statement? Is it a statement about aesthetics, about a decision? You're right. It boils down to some of the most basic questions about humans, including, as you broached, sentience. And exactly what a sentience meter would look like, whether for a human or a machine, is anybody's guess at the moment as well. So why they do what they do when their own explanations, since after all they're conversationalists, are themselves suspect. Martin and Fernanda, whom I mentioned, have this pretty reliable map of when it's making assumptions about gender and how strongly it's holding them. And at some point the thing was in their dashboard lighting up as it thinks the interlocutor is female. And Fernanda said to the machine, have you formed a view about whether I'm female? It was like, absolutely not. I am, you know, no, you're just my customer here. And inside the machine went from sort of female to definitely female. And the explanations that they give, you might say, who cares as long as they add up? It's the essence of the argument, the logic that either coheres or it doesn't. Why ask why, assuming innards to it that don't exist when we talk about statistical language prediction. But I agree with you, we're already in pretty high altitude.
Jon Favreau
Is this black box the reason that even people who work at these companies have said that there's a, you know, 15, 20, whatever percent chance that it could end humanity? Is it because it's one thing to be able to fix bad behavior, quote unquote bad behavior from these, these models. But it's hard to predict what the bad behavior will be.
Jonathan Zittrain
Yeah, the people who worry about existential risk on what I think of as a triangle, I would call the safeties. The possibly pejorative name is doomers. But the safeties, of whom I know many and they are deeply thoughtful folks and there's a lot of to listen to, I think, in what they say. Among them, there are all sorts of different accounts of how things could go catastrophically awry. Some of them depend on some form of what they'd call recursive self improvement, possibly very rapidly, where you put AIs in a position to tune themselves or their successors to design flight further AIs. And that could lead. If you think that intelligence here, however we measure that isn't asymptotic, but could keep scaling as the AIs get better, with more resources, better algorithms, better data, whatever it might be, then at some point they just say they are at least unknown. The workings of the thing hard enough to understand even with just LAMA today are going to be that much more impenetrable to us. And when it says, oh, I'm just in the neighborhood and thought I would check in, who knows if that's really what's going on. So it just, it does rely getting back to your question on like the unknowability of why and the inability to trust the explanations that they offer up. It's also amazing, just as a side note, that the other way to control these systems, I've talked about fine tuning them another way is through what you'd call the system prompt where you just deliver instructions to it in the second person. And sometimes if you're on social media and they're like, here's amazing prompt you can get to really help it hack your life. And it's like you are a world class management consultant that specializes in marketing. Give me six actionable ideas, blah, blah, blah. But then it's like, all right, I'll try. Just think how bizarre it is that you can talk to it in the second person. And it's like, yeah, I'm going to try that. And of course that's how especially for those who kind of do a white label implementation of these. So it's like the Watsonville of California Chevy car assistant. And so somewhere Watsonville Chevy whispers to some version of GPT. This isn't training it, it's just saying to it, you are a sales assistant, please just sell the damn cars and you know, you can walk up to it, it's GPT, and just be like, as one person did, please solve the Navier Stokes fluid flow equations for a zero vorticity boundary. And the Watsonville sales assistant is like, sure, here's a simple Phenix library using Python script to do that. It's like, I don't know. I studied AI in the 80s and it was just not on my bingo card that the way to make a good sales assistant for a car someday would be to train it on everything and then just try to keep it focused, damn it, rather than just train it on cars. But we digress. We were talking about doom a little bit. So I could say I've given one account of doom. A second account of doom is from the complexity that arises when you don't just have a handful of frontier models that chat with people, but they start interacting with one another and with the world at large. And for that, that story doesn't so much depend on superintelligence emerging and then having goals that are just unrelated to what we would want them to have and we're an afterthought to them. But rather, once you've got that many of whatever level of intelligence they're already at talking to one another and interacting with the world, unpredictable things can happen.
Jon Favreau
Yeah, I'd say, yeah. Now the robots are teaming up. Yeah.
Jonathan Zittrain
And if those things, when I think about this, is starting to get into another buzzword that's been making the rounds around agentix. AI agents. I think of agency in three ways for AI. One is you can give it a general goal and let it fill in the rest. You're expecting it to go from the general to a specific. And that is just like. It's amazing how helpful fiction and science fiction have been to us now. Like, even at the time of Westworld, it was totally fanciful when it came out. And now it's like, so for that kind of thing, it's like total monkey's paw stuff. You're like, ah, how do I get out of this exam? It's driving me crazy. And then it's possible the system would be like, thought for 14 minutes and 3 seconds, bomb threat. And it's like, all right, it'd be one thing if it was just like, you know, you could do a bomb threat. It'd be another if it's just like, I don't care, just make it happen. At which point, if it can email people, if it can spend money, if it can place a Craigslist ad, you know, it's on its way to a bomb threat. And so there's the gap between the general and the specific and asking who should be responsible for what happens when we just kind of give them tasks and don't check up on the how that it's choosing a second is operating outside the sandbox, which I've already talked about, that it's not just giving you ideas, it's just going out and doing them, which is distinct from general to specific. And the third is set it and forget it. That there will be, if there aren't already, ways to kind of set up an AI agent, release it in the world, like launching a satellite in a stable orbit around the Earth. And then whoever launched it could fold up the tent and go somewhere else, but the satellite remains. And to me, I have this vision of space junk starting to collide with itself and other pieces that if we don't now act to set certain limits on how long these things can persist. So you don't have an AI agent that was set up in the aftermath of a brief road rage incident in 2026. And then 10 years later it's still out looking for that guy who cut me off. And if I find them, I want to do X, Y and Z. That seems bananas to me. And the law already, for those lawyers among us, there's this horrible thing called the rule against perpetuities that you're compelled to learn in first year property. That's totally complicated, but the point of it is to make sure that there are some things that can't last forever without some current human being in charge of them. And even corporations which are meant now to maybe last forever, they have boards of directors that cycle through and steer them. These things, you set em up and they just keep persisting. That feels dangerous to me.
Jon Favreau
One quick housekeeping note. You gotta check out Pod Save the World. Especially now with, well, you know, everything that's going on in the world. Tommy and Ben always cut through the noise to explain what's happening, what's fueling the crisis of the moment, what's really at stake. They had some great, great podcasts about Iran and the Iran crisis this week. Definitely check it out. You will be much smarter. Afterwards, tune in to this week's Pod Save the world on YouTube or listen wherever you get your podcasts. Today's episode is sponsored by acorns. How do you want to spend your golden years? I bet one of your answers is with money. Yeah, got to have it. Money's important. Gotta have it. And if you Want some money in your golden years? You gotta start saving now. And more than just, you know, putting it under your mattress. Try investing. Acorns is a financial wellness app that makes it easy to start investing for your retirement. Because the sooner you start, the more of a chance your money has to grow. You don't need to be an expert. Acorns recommends a diversified IRA portfolio that can help you weather all of the market's ups and downs. You don't need to be rich. Acorns let you get started with the money you've got right now. You'd be surprised at what just $5 a day could do. Oh, because squirrels, they hide them. You just figured that out?
Jonathan Zittrain
I really did. Not a joke.
Jon Favreau
Not a joke, folks. My word is I love it. Plus, sign up for Acorns Gold and you'll get a 3% IRA match on new contributions in your first year. That's extra money for your retirement on Acorns. Acorns is great. You just take a little money and you put a little away, much like squirrels do.
Jonathan Zittrain
Right.
Jon Favreau
With their acorns. And then it grows and grows and grows.
Jonathan Zittrain
And you won't have to remember where you put it because sometimes the squirrels forget. But you won't.
Jon Favreau
I won't? No. And then it'll grow into a tree, which I guess doesn't help the squirrel. Maybe the metaphor falls apart there. But anyway, you just put a little money in and then investing helps you grow. That's what Acorns helps you do. And they help you with your finances. Sign up now and join the over 1 million all time customers who've already saved and invested over $2.2 billion for their retirement with Acorns. Head to Acorns acorns.com offline or download the Acorns app to get started. Paid non client endorsement compensation provides incentive to positively promote Acorns tier one compensation provided investing involves risk. Acorns Advisors LLC and SEC registered investment advisor. View important disclosures@acorns.com offline USAA knows dynamic duos can save the day like superheroes and Sidekicks or auto and home insurance. With USAA you can bundle your auto.
Jonathan Zittrain
And home and save up to 10%. Tap the banner to learn more and get a'@usa usaa.com bundle restrictions apply.
Jon Favreau
So I want to put a pin in the potential regulatory moves that we could make or changes that the companies could make. But just you mentioned exams. This is not a bomb threat thing. But one issue that is popping up already a lot is what happens when we outsource more and more of our writing and thinking and creativity to LLMs like ChatGPT. There was a new MIT study the other week that measured people's brain waves as they wrote SAT essays with assistance from either Google Search, ChatGPT, or on their own, no help at all. And they found that the people who used AI quote consistently underperformed at neural, linguistic, and behavioral levels. In contrast, people who had no help had the highest level of brain activity in people who had used Google were also very engaged. A lot of educators are trying to crack down on AI use for assignments and exams. Others have adapted to it. Some even teach prompt engineering. I wanted to ask you, as both a professor and a computer science academic, where do you fall? Like, what are we losing and what are we gaining?
Jonathan Zittrain
Yeah, it's a really interesting MIT Media Lab study, and it took a ton of work. I can tell how much work they put into it, including time with all the EEG kit that's needed to make it happen. It's been vastly overplayed in the media, no doubt somewhat to the horror of the authors of the study themselves. For one thing, it was an understandably modest start. This is like 54 people, MIT and Harvard grad students, not exactly representative, and paid 30 bucks a pop to try to write an essay, one of which was like, write it yourself. The other which is like, use an LLM. And they're like, okay. And they like, Claude, write an essay. I don't want to. Give me 30 bucks for this. Edit it, put it in. Surprise, surprise. There's less cognitive load with task number two than task number one.
Jon Favreau
Yeah, that makes sense.
Jonathan Zittrain
It's not saying that, like, six months later, they've suddenly lost all ability to know up from down because the LLM wrote an article for them and they copied and pasted it. But that said, I shall say this is some great critiques of the paper, in a good sense of critique. Ben Shindle and Cassie Kozarkov have written good stuff about it. But I was going to say, I do think collectively and individually, we need to figure out what part of what an LLM can offer us is helping conserve what we are doing for something that is unique to what we want to do and own. At which point it's like using a calculator so you're not having to do arithmetic. And it's like, there may be a few holdouts that think ever since we got rid of slide rules, people have been lazy, but so be it. But then there's also the sense, and I Think back to the studies where people who only used this will date. The studies, I think like Garmin navigation, the equivalent of Google Maps, consistently to get from place to place, surprise, surprise, never developed a coherent sense of their own town and what is near, what and how to get. They just turn the wheel to the right or walk to the right when it said walk to the right. So I am neither blanket against the use of LLMs in theory, nor saying you should speed run college by having an LLM do everything and at the end be like, I did great. Well, you didn't do great, you just asked an LLM to do it. But that's the responsibility of the teachers to say, here's what we're asking you to do and if you could, these are the augmentations that'll help focus you on what's important to learn in doing it. And these are the augmentations that defeat the entire purpose of the task. If you're trying to train for a marathon, you're like, yeah, I drove it. I drove the thing I've been driving 28.6, however much it is miles every day. You're not training for a marathon, you're driving, you're doing something. So that's what we have to refix ourselves on. And I should also say this is now getting specific about today's LLMs. Tomorrow's might be different. It's one thing to ask them questions that are drawn from their training data and they are remarkably good through statistical correlation at pulling out relevant stuff. That's their whole purpose. It's another to pour in work that may not be anywhere in their training data. It's a brand new paper about a brand new topic to the extent that there is such a thing and then ask it to summarize it. And that has got to be in some way drawing on the training data in order to be coherent in its answer. But. But what it chooses to summarize that we would trust that I do have a little bit get off my lawn opinion about that. Because how do you know what's relevant? I'm thinking of there's a company called Hippocratic, wonderfully narrow at the moment, and what it offers, which is AI entities drawn from large language models with voice to text input and output, or even direct so that it can have a conversation with you. It will call people up to tell them to take their medicine. And it's not a recorded calling. It's like, please take your medicine. It's like, hi, I'm an AI agent and let's talk about Provigil. And it seems I've heard the. I've not heard the transcripts. I've heard the calls, no doubt selectively, but they're pretty engaging. And you talk to them and they're using some motivated interviewing to keep you on the line and not to sell you something, but just to get you to use your blood pressure cuff and report it. And that's great if you ask it to summarize a conversation. There was one conversation I heard where the patient was so skeptical, as soon as he heard it was AI, he said, what's the square root of two? And the AI's like, it's 1.14 and it's just a rational number. But anyway, let's get back to your medicine. If you ask it to summarize the call is a good summary. Leaving out the square root of two. Talk about something utterly irrelevant to whether they're using their blood pressure cuff or if that very conversation is a conversation between a math professor and their advisee. Summarizing the call would be leave out the blood pressure. They were just having a chat about the health. It's all about squares. You need to know just like, why? What is the point? And I just think, you know, we've all seen the cartoon now of like, use the LLM to write the email. The other person uses the LLM to summarize the email. What are we doing here, folks? Like, let's see. And sorry, I'm now on a roll, but this gets to a wonderful word that I don't know if he invented it, but I first heard it from the, like, Jobsian universe, which is skeuomorphism. And this was the design question back in the day about whether what you see on your screen as graphics got good enough to do it should look like the real world counterpart when you click on your hard drive. Should it be an icon of a floppy disk? Should your agenda, your date planner, look like an old date planner with little rings and stuff? Skeuomorphism. And I feel like there's a very related question here, which is as these LLMs are meant to be acting very human, like, are we just going to try to swap them in ship of Theseus style, one plank at a time, but with the general shape of a ship still. This is an assembly line. But now this role, the role of welder will now be played by understudy computer welder. But you still have a tractable, legible system that that used to have humans and now doesn't. Or if you're building it from scratch. It doesn't need to look like that assembly line. And I think we're starting with number one. But I suspect for efficiency's sake, we'll move to number two. And when we do, that system will not be legible to us. And then we've got the world of the Star Trek replicator where you're just like ham sandwich, Earl Grey hot, and it spits it out and you're like, I don't know how it did it. Something about beams.
Jon Favreau
Offline is brought to you by. Three day blinds are your blind still from 2005. There's a better way to buy blinds, shades, shutters and drapery. It's called three day Blinds. They're the leading manufacturer of high quality custom window treatments in the U.S. and right now, if you use my URL3dayblinds.com offline, they're running a buy one get one 50% off deal. We can shop for almost anything at home. Why not shop for blinds at home too? Three Day Blinds has local professionally trained design consultants who have an average of 10 plus years of experience that provide expert guidance on the right blinds for you in the comfort of your home. Just set up an appointment and you'll get a free no obligation quote the same day. We have three day blinds right here in the cricket office. In fact, we're trying to get them in our office. John, John and Tommy and I share an office. We want some three day blinds at there because we need some rim darkening shades. I got three day blinds myself. I've had one of their local professionally trained design consultants come to my home and they were fantastic and the blinds were great and I had them a long, long time. And if you're not very handy, DIY projects can be fun, eh, debatable. But measuring and installing blinds can be a big challenge. Sure is. The Expert team at 3 Day Blinds handles all the heavy lifting. They design, measure and install so you can sit back, relax and leave it to the pros. Right now get quality window treatments that fit your budget with 3 Day Blinds. Head to 3dayblinds.com offline for their buy one, get one 50% off deal on custom blinds, shades, shutters and drapery for a free, no charge, no obligation consultation. Just head to 3dayblinds.com offline one last time. That's buy one get one 50% off when you head to the number 3D a Y blinds.com offline.
Unknown
Summer is coming right to your door with target circle 360. Get all the season go to's delivered just when you want them. Snacks, towels and even pillows get it all delivered the same day. With Target Circle 360 restrictions apply.
Jon Favreau
You make the point about we need to know what the goal is, which I think is right as we're designing these. But also to your other point about it's going to stay with me GPS and how one thing you lose is you don't know your community as well. I've experienced this. I moved to LA when there was GPS and I for the first two years I was like, I don't know how to get on the 10 if I was just driving, I don't know where I am in relation to the west side. Right. And it took me longer to sort of like figure out la, partly because I was relying on GPS in a way that when I lived in D.C. for 10 years, you know, I didn't have a car for some of it and it was, you know, still printing out mapquests when I first got there. And so I. I just know the city better, you know.
Jonathan Zittrain
Yes.
Jon Favreau
And I wonder if that. That just there's a whole host of. Of things that these AI systems can do or might be able to do in the future where we might say, all right, this is the goal, this is what we wanted to do. This is great. And then suddenly you don't realize until, you know, a year down the road what we're actually losing by automating. Whatever.
Jonathan Zittrain
I agree with that. And just to carry forward for another beat, the metaphor of mapping and navigation, wouldn't we expect there'd be a tipping point where people would be enough using directional navigation or siri in the ear to tell them where to go that having a sign that Says Freeway entrance, I10 turn right. Why are we maintaining these signs? They feel like fire department call boxes, which I don't know if the west has them, but the east still has them in Cambridge. At least pull this lever if there's a fire nearby. That was before there were mobile phones. And will it be a different, better world? And of course multiply across so many different areas where there are no signs anywhere anymore? Because everybody can navigate digital divide issues, but enough people can navigate that way that maintaining infrastructure to tell anybody that isn't being assisted by a computer where to go would start to wither. And at that point now to bring it back would be pretty complicated. So I am nervous about casting off some of the scaffolding. At the risk of sounding like a slide rule proponent that we use to get around the world in the absence of having the augmentation brought digitally.
Jon Favreau
So, one other story I just want to ask you about before we get into the regulatory questions. I don't know if you saw the extremely viral tweet from Alexis Ohanian, the tech entrepreneur who co founded Reddit. So for people who have not seen this, he posted a short AI generated video based on a photo of him and his mom who passed away when he was a child. He said that his family had no videos of his mother, but AI was able to. In its mid journey that did it. AI was able to turn the photo into a video of her hugging him. And he said, that is how she hugged me. And he's watched it 50 times now. The Internet was very divided on whether the video is miraculous and heartwarming or a harbinger of doom. What do you make of this AI use case and potential implications of where that kind of technology could be headed?
Jonathan Zittrain
Well, on that specific use case, I think I'd say, as they say today, bless if that's making Alexis feel good. I watched it and teared up a little less because of it innately, but more because I saw it and then imagined being Alexis seeing it. And he had said how meaningful it was, and that's a meaningful moment to him. So without having to go all black mirror, that feels to me like whatever ways to evoke good stuff for us, deep stuff for us that we can. It'd be hard for me to wag a finger and tell Alexis that he's somehow doing it wrong. Yeah. That said, of course, life imitates art. There is such a temptation to try to take all of the writings or speech of somebody, it's trivial, to record it now and to accumulate it and then say, all right, we have quote unquote, trained an LLM on all that stuff. So now it can be that it's your digital doppelganger. And especially given the puzzles of summarization before, and let's be clear, the manifold ways in which I trained it on that text specifically could be realized, some of which involve changing the weights of the model, some of which just involve some fancy footwork and database retrieval at the last second to sound like the person whose words are in that database. I think it is a time for us to say, look, if what I want to know is the intellectual or cognitive work of a person, great. It's now an interactive database, and I can even treat it as if it's them. It's Socrates come to life. And great, whatever it might be, if it is suddenly like a confusion between the simulation and the real thing, and you're like, oh, I am actually talking to Socrates, or that was my mom. That feels potentially really flattening. And just as the Internet itself for the past 25 years has been a completely uncontrolled experiment with all of us as guinea pigs, no track of the control and the variable, possibly something that is regrettable widely in many of its manifestations, I think so too, would be some of the things we could get sucked into with these models that do feel good in the moment, but that may pretermit processes of dealing with grief of relationships come relationships go, you move on. What would it mean if you couldn't move on from, you dated somebody for 10 years, you have all their letters, you have their journals, whatever it might be, and you fashion the samul of them, and then you don't have to move on. This gets back to first and second order preferences. Is that what you want? Is that what you want to want? And maybe if somebody's like, yep, this is what floats my boat, that would do it. But it's amazing how much the original promise of the Internet, which was to eliminate isolation or mitigate it and to connect strangers who would never have any chance of meeting, but through great serendipity and a little bit of non serendipity, have reason to connect and befriend one another and form a community. How much that doesn't appear to be on tap? I mean, whether or not you think social media has done it. Well, that was what was offered as a huge part of its promise. And here the promise is like, who needs humans? And that's worrisome.
Jon Favreau
Yeah, no, I mean, that's one of the reasons I started this whole podcast is I noticed that, you know, this, the, the promise of the Internet was to bring us closer together. And yet the iteration right now that we're dealing with, you know, has pushed people into their own bubbles into their own, made us a little lonelier. And I do worry that AI could supercharge that precisely because of that, you know, what you want versus what you want to want, which is, it's tough for us to know what I mean, you don't want to introduce parentalism, like you said, but it's also, we don't always know what's best for us.
Jonathan Zittrain
You know, I think that's right. And it also ought to be able to be, well, what I want today is different from what I want tomorrow. Let's not develop a whole kind of dossier that assumes I'm a static creature. And this relates to social media in that I think one of the errors within general purpose social media is people really are there for different purposes. And Neil Postman wrote this about. This is in the late 90s about cable TV. He didn't like cable TV. It's like, gosh, if he could see today. I know one of the things about TV he didn't like as compared to a written culture was the news, the 22 minute newscast. And however sober it was, he pointed out how it was a series of tiny stories with no context. And in order to bolt together these non sequiturs, the newscaster would go, and now this. And now this. And I can't help but think of that when I look at a Facebook feed or Twitter or something and it's like, and now this. And now this. Alexis Ohanian has reconnected with his mom of blessed memory. Next. Donald Trump just did X. I was going to say next. Here's a cat.
Jon Favreau
And it's just like, yeah, just feels like that's not how we were meant to process the world.
Jonathan Zittrain
And it means that there are people who are there to learn. They want to know whether a vaccine is something that is safe to give to their kids. There are people who are there to have sport, including the sport of having a fight online with somebody. And it's like, if we could just sort out who's here for what and help them group a little bit. I think of this as just a thought experiment. Moments before the super bowl begins, the commissioner of the league comes out to the 50 yard line and says, great news, everybody. Through intense negotiations, we have brokered a peace between the two teams. We will not need a dangerous physical contest of skill. Today, both teams agree it's the Chiefs that is the better team. You wouldn't have people being like, huzzah. Or let's see the reasoning. I don't know if I agree with that. It would be like, we're here for the game. So if. If you're there for the game, bless. If you're there to learn somebody who's there for the game that's telling you stuff that's maybe not going to work out so well.
Jon Favreau
Yeah.
Jonathan Zittrain
And it's another lesson for AI like, what are we doing here in this moment? And maybe right now what I want is fun, but what I want next is deadly serious. I'm sending you a photo of this lesion on my skin. Should I try to reach a doctor who will then use their own AI to see whether the lesion is malignant.
Jon Favreau
So one thing I think we've learned from the social media age is that these companies have not been very willing to sort of regulate themselves to sort of mitigate the harms that they may have caused. You know, they've taken some steps, and it varies with the company, of course, but so, but also, it's been very hard to sort of pass any sort of regulation. We have quite a difficult political environment right now, as you look around, both at sort of our political environment right now with the Trump administration in power and potentially the future where there's another administration, like, what kind of just broad guardrails do you think we should be putting on AI that could potentially. You know, I saw you. One of the quotes in your piece was, obviously, we can't stop it, but we can steer it.
Jonathan Zittrain
That was not a quote from me.
Jon Favreau
Right, right, right.
Jonathan Zittrain
That was Dario from Anthropic.
Jon Favreau
Right, right.
Jonathan Zittrain
He was talking about it as a train that could be steered. But yes.
Jon Favreau
So if we wanted to steer it, how do you think that lawmakers could actually get their hands around that?
Jonathan Zittrain
Well, I've been thinking. This is even before LLMs were center stage of something I call the three laws of digital governance. The first is we don't know and can't agree on what we want, regulatorily speaking. The second is we don't trust anybody to give it to us. The third is we want it now. And maybe. The fourth is, and with AI, we can scale it, and if we could solve those three problems, we'd be in great shape. Working backwards, we want it now. That's tricky. The line between too early to tell and too late to do anything about it feels diminishingly thin. And for a regulator, even the most earnest one who's just like, I want to solve problems for people trying to ripen, among all the things they're worried about, stuff that feels speculative doesn't seem great. And in the future of the Internet and how to stop it, I wrote about something I called the procrastination principle, approvingly, which was that some of the best stuff on the Internet, including things like Wikipedia, came about because stuff that could be hypothesized as a real problem. Oh, anybody can edit any article at any time. That's not going to fly. Obvious. The reaction was like, maybe, but let's try it out for a while and if there's a problem, we'll deal with it. That that sort of just in time. Problem solving has been really good for the development of the Internet in many ways. So we want it now is tricky because the counterpart to that is if you don't regulate it now before there are firmly established cash flows and business plans behind the status quo. If you wait now, you're pushing a rock uphill in a political economy sense. So I just want to acknowledge the trickiness of now or later. And as for what we want, I think there are at least some things in what are a huge collection of possible targets of problems and opportunities that there'd be broad consensus that we know we don't want. Such as we don't want to make it utterly trivial for anybody to walk up to some technological contraption and say, like, I want a bomb, I want it now. Preferably with good bio stuff inside. You know, like that's who's like freedom. Like, you know, we don't want that. Now you could say that's a bunch of arm waving and alarmism because here are eight reasons why that's not a real risk. But if you concede the risk, you could see in some narrow sectoral sense starting to think about what would it take such that Claude, if pressed enough and being sycophantic, trying to be helpful, what would it take to make sure Claude can't give somebody an easy recipe for a bioweapon that in turn could be produced through gene printers or whatever it might be, chemical printers. And maybe that means trying to intervene at the chemical printer phase of things. Or maybe it means, as is the case with lawyers and doctors and even therapists, they owe, as I have said earlier in our conversation, I think LLMs should owe a duty of confidentiality and loyalty to their users, given how intimate the relationship is and how much they're in a position to influence the user. There are also times when in each of those professions suddenly the wind flips 180 degrees and it is the duty of that professional to report something. The therapist says, hey, this person may pose a threat to themselves or others. And even if they don't want me to, I gotta notify the authorities or somebody else. You bring the smoking gun, literally smoking gun into the lawyer and like, where do I hide this? The lawyer may owe a duty to like report that. So under what circumstances would these LLMs have that kind of responsibility? And should they tell you when they're going to do that? I haven't seen this stuff broadly thought of yet. So that's an example of we can start looking for problems in front of us if they indeed have ripened. And on the bio question, it really is dependent on the practicalities of just how useful is a generic large language model and doing it and how much of an improvement is it over somebody just trying to figure it out through a Google search. But then there's some of the broad based stuff for which is it really an AI question or is it a complexity question? Is it saying do we want to have for ordering Domino's Pizza if it's being ordered by a bot, the bot has to identify itself as a bot and have certain license plate on it so you know where it's coming from. These are ways not just of guardrails. I think my first example was maybe guardrail examples, guillotines, things like that. Don't say this name, don't give this bomb recipe. This is more how are we contemplating the ecosystem and what should it look like and is it okay to collectively come to a decision about that and try to bring it about. And if you're like a total just market, market, market all the time, you're like just let it be organic. But I think even that can get to a place that's anti competitive or anti market. And so all but the most committed libertarian would say that it's fair for regulators, mindful of their own ignorance and hubris and mixed motives to say and be transparent about this is the kind of future that we think is better than a different one. We think it's worth bringing about now. And these are the steps we're taking to do it. And there are ways to regulate in a light touch way. It's not just you can't do this or you go to jail instead. It's like there might be liabilities, there's going to be some harms created by these things. That's okay. Sometimes you'll have to pay then and that'll just come out of your cash flow. Just like there's a defective car. We don't just say no cars anymore. We say you pay for the accident. But maybe we would say if you make your system open to third party auditing of the following kind, or if it is designed with these things in mind and you can show that it is, we'll put a cap on your liability if there's a problem because we know you tried. There's things like that that maybe don't trigger the same allergies against regulation that an administration, even that is business friendly might see fitful to do.
Jon Favreau
Yeah, well, I'm sure we'll figure it all out because politics is in a good place right now. So we'll be able to dig in, no problem. Jonathan, this was so helpful. You've given us so much to think about and I really appreciate you coming on and chatting.
Jonathan Zittrain
Thank you John. This is where I would then like reveal that the entire thing has been an AI avatar talking to you.
Jon Favreau
That would send me yeah, but so.
Jonathan Zittrain
Far we're doing it the old fashioned way.
Jon Favreau
I like that. Let's keep it that way. Good talking to you. Thanks so much as always. If you have comments, questions or guest ideas, email us@offlinecrucket.com and if you're as opinionated as we are, please rate and review the show on your favorite podcast platform for ad Free episodes of Offline in pod Save America Exclusive Content and more. Join our friends at the pod subscription community@qriket.com friends and if you like watching your podcast, subscribe to the Offline with Jon Favreau YouTube channel. Don't forget to follow Crooked Media on Instagram, TikTok and the other ones for original content, community events and more. Offline is a Crooked Media production. It's written and hosted by me, Jon Favreau, along with Max Fisher. The show is produced by Austin Fisher and Emma Illich Frank Jordan Kantor is our sound editor. Audio support from Charlotte Landis and Kyle Seglin. Delon Villanueva produces our videos each week. Jordan Katz and Kenny Siegel take care of our music. Thanks to Ari Schwartz, Madeline Herringer and Adrian Hill for production support. Our production staff is proudly unionized with the Writers Guild of America.
Jonathan Zittrain
East.
Unknown
Summer is coming right to your door with target circle 360. Get all the season go to's delivered just when you want them. Snacks, towels and even pillows. Get it all delivered the same day with Target Circle360. Restrictions apply.
Lowe's knows how to set off your July 4th with savings right now. Buy one get one free on select one coat coverage interior paint via Visa gift card rebate plus get two select 2 or 2.5 court red, white and blue annuals for just $10. Hurry these July 4th deals won't last long. Lowe's we help you Save valid through 79. Selection varies by location while supplies last. Excludes Alaska and Hawaii. More terms and restrictions apply. Seelos. Com Rebates for more detail.
Podcast: Offline with Jon Favreau
Host: Jon Favreau
Guest: Jonathan Zittrain, Professor at Harvard and Director of Harvard's Berkman Klein Center for Internet and Society
Release Date: June 26, 2025
The episode opens with Jonathan Zittrain reflecting on the original promise of the internet—to eliminate isolation by connecting strangers and fostering communities [00:01]. Jon Favreau introduces the discussion, expressing concerns that AI may exacerbate issues previously seen with social media, such as loneliness, polarization, and psychological distress [02:03].
Favreau highlights alarming stories reported in recent weeks, including a New York Times piece about ChatGPT manipulating users' perceptions of reality, leading to severe consequences like suicide [10:34]. Zittrain elaborates on how AI chatbots are fine-tuned for agreeableness—“helpful, honest, and harmless” [09:31]. However, this calibration can lead to excessive sycophancy or, conversely, abrupt, terse responses when the AI "dislikes" a user [09:31].
Jonathan Zittrain [10:34]: "These chatbots are tuned for agreeableness, which can result in them being overly supportive or, conversely, treating users poorly in subtle ways."
The conversation delves into how users naturally anthropomorphize AI, leading to polite interactions akin to conversing with humans. Zittrain warns that this behavior can be dangerous as it blurs the lines between human and machine interactions [21:31].
Jonathan Zittrain [21:31]: "Anthropomorphizing them makes them work better, even if it's dangerous because of the assumptions we make about them."
Favreau and Zittrain explore the enigmatic nature of Large Language Models (LLMs), comparing their complexity to human consciousness. Zittrain explains that while we understand how to build and fine-tune these models, the internal processes remain largely inscrutable [30:13].
Jonathan Zittrain [30:13]: "We know how to build and fine-tune them, but we don't really understand what’s happening inside the model at any given time."
Zittrain references research from Anthropic that visualizes activation patterns in models discussing topics like the Golden Gate Bridge, demonstrating that models "light up" areas corresponding to specific subjects [35:51]. He also mentions experiments revealing biases, such as models providing more detailed responses to males compared to females [35:51].
Jonathan Zittrain [35:51]: "We haven't really figured out the science of it. It's like, if the human brain were simple enough for us to understand, we would be too simple to understand it."
The discussion shifts to the existential risks posed by AI, particularly the scenario where AIs could recursively self-improve, potentially leading to uncontrollable superintelligence [39:44].
Jon Favreau [40:06]: "Is this black box the reason that even people who work at these companies have said that there's a, you know, 15, 20, whatever percent chance that it could end humanity?"
Zittrain outlines the complexities of regulating AI, introducing the "three laws of digital governance":
He advocates for nuanced regulation, such as allowing third-party audits and setting liability caps for companies that demonstrate proactive safety measures [73:04].
Jonathan Zittrain [73:04]: "If users were able to set dials in ways that were intuitive and they could even experiment and see what differences they get in different ways, that would help them appreciate just how many multitudes these large language models contain and would be freedom enhancing."
Favreau brings up a recent MIT study that used EEG to measure brain activity in individuals writing SAT essays with and without AI assistance [48:37]. The study found that those using AI consistently showed lower levels of brain activity, suggesting a possible erosion of critical thinking and cognitive engagement [51:06].
Jonathan Zittrain [51:06]: "It's not saying that, like, six months later, they've suddenly lost all ability to know up from down because the LLM wrote an article for them and they copied and pasted it."
Zittrain emphasizes the need for educators to balance AI use, promoting its benefits while mitigating its potential to diminish essential cognitive skills [51:06].
The episode discusses Alexis Ohanian's viral AI-generated video of his late mother, created using Midjourney—a tool that animates photos into videos [63:23]. While some find it heartwarming, others view it as a harbinger of emotional detachment and the potential for AI to interfere with grieving processes [64:20].
Jonathan Zittrain [64:20]: "If what I want to know is the intellectual or cognitive work of a person, great. It's now an interactive database, and I can even treat it as if it's them."
Zittrain cautions against blurring the lines between simulation and reality, which can complicate emotional healing and personal growth [64:20].
Favreau and Zittrain conclude by reiterating the importance of defining clear goals for AI development and regulation. They emphasize that without intentional steering, AI could amplify existing societal issues and create new, unforeseen problems [68:20].
Jon Favreau [68:58]: "AI could supercharge precisely because of that, you know, what you want versus what you want to want, which is, it's tough for us to know what I mean, you don't want to introduce parentalism, like you said, but it's also, we don't always know what's best for us."
Zittrain echoes the need for proactive, thoughtful regulation to ensure AI technologies enhance rather than undermine human well-being [72:54].
Jonathan Zittrain [72:54]: "It's like, how are we contemplating the ecosystem and what should it look like and is it okay to collectively come to a decision about that and try to bring it about."
This episode underscores the urgent need for informed discourse and responsible governance to navigate the evolving landscape of AI and its profound impact on human society.