
David Duvenaud
The reason that states have been treating us so well in the West, at least for the last, let's say, two or three hundred years, is because they've needed us. And in particular because allowing freedom and private property and basically self-determination has been the most effective recipe for growth. Life can only get so bad when you're needed. That's the real key thing that has been keeping governments aligned. And that's the key thing that's going to change. A lot of citizens would end up just being sort of like full-time activists, and they might feel like they're forced to, because if their only source of income is something like UBI, then the entire game going forward for economic advancement is to do some sort of activism to convince the government to give your group more UBI. Those same resources could be used to simulate maybe millions of much more sympathetic, morally superior virtual beings. And so it'll start to be seen as this irresponsible use of resources to keep some sort of legacy human around.
Rob Wiblin
Today I'm speaking with David Duvenaud, professor of computer science at the University of Toronto. David is a co-author on a somewhat recent paper called Gradual Disempowerment, which makes the slightly counterintuitive claim that even if we manage to solve the AI alignment problem and have AIs that faithfully follow the instructions and goals of the group that's operating them, humanity could nonetheless end up losing control over its future and end up with a pretty bad outcome. The paper got a lot of reactions, it's fair to say, with some people saying that it really put its finger on an underrated issue, others thinking that the scenarios painted were really unlikely, and other people arguing that they were likely but not even necessarily undesirable. I'm a bit unsure where I come down myself, so thanks so much for coming on the show to discuss it, David.
David Duvenaud
Oh, it's my pleasure, Rob.
Rob Wiblin
So let's imagine that we have managed to make big breakthroughs in AI alignment, maybe around 2028. How is it that nevertheless things could end up trending in a negative direction?
David Duvenaud
Yeah, so the basic thesis is that even if we can align AGIs to particular people or groups, we still might end up optimizing or heading at a civilizational level towards outcomes that no one wants, and probably outcomes that look more like growth for growth's sake.
Rob Wiblin
So paint us a picture. We're at the point where we have human level or greater than human level AGI and we've made big progress on alignment. So we basically can trust the AIs to follow the goals that we give them. How does humanity begin to become disempowered?
David Duvenaud
So the first one is economic and people basically losing their jobs and becoming unemployable. And I think a lot of people get off the bus here and they say this is the classic lump of labor fallacy. And they think that we think that there's only some finite fixed set of jobs. And if they're automated, then that's just the game over for humanity. And of course they're right to point out that on the margin there's always more valuable work to be done. And you can always employ somebody to do valuable work at some wage, basically. And comparative advantage will mean that there will always be some sort of profitable trades on both sides that mean that in principle, humans will be employable at some wage. But the problem is that this breaks down for two reasons. One is just transaction costs can mean that it's not worth the hassle for the machine companies or whatever's running the economy to employ humans. Humans are pretty unreliable and kind of a pain to employ for a lot of reasons. And we don't want to hire, for instance, 12 year olds even if they're going to work for a dollar an hour. In fact, that's illegal. And that's one of the sort of structural forces that we also expect to be operating, is that humans are just going to be this unreliable, sort of scary thing to involve in anything important. And it'll be seen as irresponsible to involve them in important decisions once machines are like better alternatives.
Rob Wiblin
Okay, yeah. So I agree that economists have many good reasons to think that mass unemployment from artificial intelligence is going to potentially take quite a while. That machines might have to be significantly above our level before humans just won't be able to get jobs at all. But I think it's more of a delaying game. At some point we're going to have machines that are far faster, far more reliable, and potentially can do all the things that we could do for less than it would even cost to feed a human and keep them alive. I mean, of course, by that stage, businesses will have reoriented entirely, and all of the factories will be redesigned to be built around AIs. Office work will be being done by AIs at such a speed that it's barely even possible for a human being, if they're involved, to keep up with what's going on. And then it won't even be that humans are not able to help, it's that involving them would actually be a negative. It would slow things down. It would introduce errors that have to be corrected and force the AIs to wait for us to do things that they could have done much faster. And then at that stage, it's really hard to see why businesses would be employing humans on any significant scale for ordinary practical purposes. What do you think stems from that?
David Duvenaud
Okay, yes, so exactly. So I agree with this. And again, I think this is something that people have a lot of issues with, and they object and they say, but we won't let it get to that point. And one thing we really hammer home is that competitive pressures are really going to force people into that situation. So even if everybody in the whole civilization sort of loves humans and would prefer to empower them, anyone doing anything important is really seen as this irresponsible actor if they're putting a human surgeon in charge of some surgery. It's risking someone's life. Right. It's like take your kid to work day and then you're having them do the surgery. That's how it'll be seen.
Rob Wiblin
So at the point that people are no longer for the most part, employed in kind of productive work, what sort of things start happening?
David Duvenaud
Right. So at this point there's basically no alpha in humans in the sense that it's not a good idea to invest in human capital in the way that we have human universities and these institutions that are designed to sort of be human facing and human run. And so I think anyone who's like an investor, like the capital markets in general, are going to be saying, why are we investing in this stream of human capital that's ultimately going to be less competitive and provide less returns than the sort of more machine centric, fully automated solution.
Rob Wiblin
Okay, so in the paper you map out like three categories of mechanisms that potentially push us towards disempowerment of human beings and I guess like human ability to direct its own future. There's economic disempowerment, which we've touched on a little bit. There's also cultural disempowerment and kind of political or state disempowerment. Maybe let's do state or political disempowerment first. How would humans potentially begin to lose control over their own governments?
David Duvenaud
Yeah, so our main claim is that human control over states is actually already very weak in a lot of ways. And this is obviously most true in regimes that we think are horrible. Like think of North Korea: clearly if the people of North Korea had almost any say in their government, or at least weren't sort of somehow browbeaten or deluded into thinking that what was going on is good, they would have long ago changed their form of governance. And of course you might say that in the West today we're in a much better situation. But the overall thesis is we don't have that much ability to control our governments. And the reason that states have been treating us so well in the West, at least for the last, let's say two or three hundred years, is because they've needed us. And in particular because allowing freedom and private property and basically self-determination has been the most effective recipe for growth.
Rob Wiblin
Yeah. So the way that I would boil this down is for most of human history, most people in a society had very little control over their government. This sort of liberal democracy that most listeners will be living in is an aberration basically, and a fairly modern aberration that not coincidentally probably appeared around the point of the Industrial Revolution and I think was then given an extra kick in the butt by the beginning of knowledge work and stuff that required education. And almost certainly we've seen the growth of that kind of government, that kind of social system because it was economically fit. It was a good way for a government and a country to gain power because it led to more production, higher productivity, more R&D, more military power, all of that. At the point that human beings are no longer doing almost any work, there is no competitive pressure that requires a government or the most powerful people in a society to nurture and to share the power with all of the other people in their country. They could potentially not provide them with any education, not give them any democratic rights, not involve them in sort of the error correction processes that allow a country to correct its mistakes. And nonetheless the country may end up just as militarily powerful as it might have been otherwise.
David Duvenaud
Right. And in fact the ones that do allow humans to participate meaningfully will probably have a competitive disadvantage.
Rob Wiblin
Yeah. Okay, so you're saying in some ways democracy could end up, on some dimensions being worse, at least in terms of your inter country competition.
David Duvenaud
Yeah, I mean from the point of view of the state, think about what an unemployed citizen looks like. Sort of at best they're going to leave everything alone and deal with their own problems, but they're going to have a lot more time. And I think a lot of citizens would end up just being sort of like full-time activists, and they might feel like they're forced to. Because if their only source of income is something like UBI, then the entire game going forward for economic advancement is to do some sort of activism to convince the government to give your group more UBI. So this is going to make politics just much more high stakes and unstable. And so governments that don't sort of disempower their citizens one way or another are going to be facing these constant pressures and being sort of blown around by who's winning the activism war this week. And so eventually that's an unstable situation that's going to end up with, I think, people basically being unable to control the levers of power.
Rob Wiblin
Yeah. Okay. So the third mechanism is cultural disempowerment. I think this is the most original of the three in the paper. I think maybe also the hardest to get to grips with. I think people are still trying to figure out exactly what are these forces and what would cultural disempowerment look like? Yeah. Can you give us what's the state of knowledge at the moment?
David Duvenaud
Yeah, and I want to give a shout out to my co-author, Jan Kulveit, who was really the person who pushed for this and developed this thesis in the paper. Yeah. So the basic idea is that culture is this other sort of replicator. It's sort of like Richard Dawkins talks about memetics and stuff. And cultures can serve humans sort of better or worse. And in the past, especially the distant past, things like tradition and cultural norms were actually very important for society to work at all. And there were important sort of selection effects that meant that when groups had bad enough culture, they would somehow be less competitive and one way or another adapt or be taken over by a group that had a more effective culture. Maybe the most extreme example of this is the Cathars, which was a Christian sect that believed in no violence and no sex, and they eventually died out.
Rob Wiblin
I haven't met a Cathar recently.
David Duvenaud
Yeah, exactly, exactly. And so these selection pressures that meant that having bad culture meant that you might not reproduce or your civilization might die are much weaker than they used to be. Partly because we're richer than we used to be and partly because we have this one global culture. And actually, Robin Hanson is always saying, guys, this is really scary. We've lost this sort of like group selection effect. And our culture is sort of randomly drifting in a way that no one is controlling. This is likely to lead it to be worse, just in expectation. So that's one weak effect that used to keep culture roughly in line with human flourishing. That's going away. The other thing that's going to start to happen is machines producing culture. Right. Like once machines sort of become more like agents with their own independent beliefs and sort of point of view on the world talking more directly to each other, then this is like a new thing in history, which is there's like a new vessel of cultural sort of memes and creation and just norms that can be operating sort of almost mostly independently from humans and it could end up developing in a kind of anti human way. And then the third thing is just we're going to be spending so much time talking to machines that this is like a new way that culture is going to transmit. That's just going to look very different than how we transmit culture today.
Rob Wiblin
Yeah, I guess it's already the case that I feel. I mean, I think it's like getting close to the point where I spend 50% of my time or something in the office, like basically speaking or interacting with AIs of one form or another. So it's definitely true that kind of their beliefs effectively or kind of memes that are propagating through LLMs are kind of propagating through me and then affecting my actions.
David Duvenaud
Yeah. And I think people are rightly recognizing that sort of the beliefs of AIs or the constitutions of AIs are like a key front in cultural battlegrounds. Right. It's just like people used to fight over Wikipedia to try to set the narrative. And I think now if you want to set the narrative on some controversial topic, if you can really control how ChatGPT frames it, that's kind of going to set what the sort of default cultural answer is. In a sense, this is business as usual. People already notice that economics or cultural forces or geopolitical forces end up pushing us towards outcomes that I think no one would endorse. And we argue that the development and proliferation of smarter than human AI is going to make those forces even stronger and remove some of the safeguards that tend to keep our civilization serving human interests in the long run.
Rob Wiblin
What are some of the ways that current economic or cultural forces are pushing us towards less than desirable outcomes today?
David Duvenaud
Right. Maybe a simple example is just clickbait, short form video content or something like that, where maybe the consumers realize it's not the best use of their time. They kind of regret spending a lot of time on it. And the producers also know what they're doing: they're making clickbait and they know it. But if they want to make their educational long form content, such as this, then they get punished by not having as many views. And now each of these content creators having their own aligned AI doesn't solve this global problem of the market just incentivizing this sort of cultural content that's not very helpful to humans.
Rob Wiblin
Yeah, I think the cases that stick out most in my mind are we still have violence and still have war, despite the fact that that's negative sum and not really in the endorsed interest of any particular group relative to a negotiated agreement that would get you to the same outcome without having to go through the violence first. Another one that stands out to me is I feel like capitalism, despite doing many good things for our quality of life, it hasn't really solved the problem of addiction to things that feel good right now, but lead you towards negative outcomes in the future, I guess. Yeah, clickbait and stuff wasting time online is one example of that. Possibly another one is addictive substances. Despite the fact that there are market forces that push towards helping people with addiction and reducing addiction to drugs, there are also market forces that push towards innovation and the creation of new, even more addictive drugs. And so we end up at some sort of equilibrium where sometimes drug addiction gets better and sometimes it gets worse. But ultimately market forces have not been able to fix the problem for us. Do you have a reaction to that?
David Duvenaud
Yeah. So that's a good example of how more optimization pressure in our current civilizational incentives is going to probably be a bad thing in a lot of ways. There's also going to be a better ability for us to sort of see what's happening and coordinate thanks to these aligned AIs. So these two forces are going to be working against each other. And one of our central claims is it's not at all clear which one is going to dominate in the long run.
Rob Wiblin
Let's talk though about this sort of paradoxical element to all of this, which is that we're imagining that the AIs that we're operating, that we're interacting with, really do have our best interests at heart. For the sake of argument, that's the scenario we're picturing. How is it that that wouldn't protect us, to a great extent, from going off in these negative directions? That they would help us see that things are going badly. They would be trying to anticipate these negative outcomes and then warning us: you shouldn't be absorbing this kind of culture, you should vote in this direction because that will help to protect your political rights in the long term.
David Duvenaud
Yeah, so that's a great question. And I feel like that's the biggest reason for hope, right? So we're also going to have AIs that, to the extent that they're aligned to us and to the extent that they might have good ability to forecast what's actually going to be good for us, will be able to help us navigate this crazy world and help us choose the right memes, help us tame our government. Everyone who's sort of worried about this issue is saying if we have good outcomes, it's probably because these AIs are really aligned to us and are really giving us great ability to deal with these new scary forces. So there's a couple of reasons to think that this might not be enough, though. So one is that coordination might still be hard, and it can be the case that everyone can see that something bad is likely to happen, and it still can just be very hard to coordinate not to have it happen. Again, the classical example is, like, world wars, where everyone can see maybe that World War I is brewing or World War Three or whatever, and all our efforts to make it not happen kind of contribute to the problem. And, I mean, maybe another simple example of a sort of tragedy that we can all see unfolding is companies becoming more bureaucratic over time. It's something that we all see happening. We all know it's happened. It's happened a million times before. It's not really that much of a mystery, but somehow it just happens. So I think it could be the case that there's just a lot of coordination problems in the world that look like this, that even though you can all see them happening and everyone's getting good advice, it's just so hard to address them globally. The other reason, though, to be scared is it's not clear that states or corporations or whatever powerful entities in the future are going to allow people to have AIs that are truly aligned to them.
And, I mean, I think everyone agrees, okay, yeah, we shouldn't allow AIs to help people build bombs or be terrorists. But it's also a question of, well, what about hate speech? What if I want to organize a really effective protest or a coup or something? What if I really don't like my government for good reasons? Probably it's not a stable situation for the government to allow the AIs to really effectively organize against that government. So there's a bunch of reasons to expect that the government might have AIs that just do whatever they say, and everyone else is going to have hobbled civilian versions that aren't actually allowed to be totally aligned to them.
Rob Wiblin
I see. So you can imagine a scenario where I'm nervous that gradually my rights or my influence over the government is going to be gradually eroded. But I guess by that point we were imagining that people, I guess some actors within society or some people who have greater influence over the government than me, they might well have made it such that all of the most powerful AIs that I have access to are not really going to help me to unwind their power, to weaken them, to figure out how would we organize to do this, or give me the ideal advice that would help me to gain power at their expense. And I guess this is kind of a dynamic that we see in general that very often people who, or organizations or individuals who have a lot of power, they figure out ways of entrenching it by doing all kinds of things to interfere with the people who might try to take back power from them. And this would just be another instance of that very common historical pattern.
David Duvenaud
Yeah, exactly. And the sort of fastest growing, most growth oriented institutions in this world, like governments and corporations, are going to have an interest in sort of marginalizing humans to some extent, because humans from their point of view will be these meddlesome parasites. So you can imagine that there's humans that are advocating that we legacy humans, we legacy beings, deserve some huge fraction of GDP or at least some very expensive protection of our interests at the expense of maybe some new flourishing society of AIs or weird AI-human hybrids or whatever is sort of most memetically and politically fit. So you can imagine sort of a response by the more growth oriented machine parts of society. They might end up making the case that this is like a speciesist sort of demand and that we can't have this sort of narrow minded policy setting, and it could be de facto illegal to advocate for speciesist policies or something like that.
Rob Wiblin
I see. I think people will be able to tell there's a lot of moving pieces in this picture. There's a lot of different mechanisms by which potentially humans could be losing influence over the direction of civilization and intelligent life. I think there's still a lot of work, I guess, to be done exploring them and figuring out which of them are most powerful and which of them potentially that could have countervailing forces that control them. What are some, I guess, specific narratives or scenarios, ways that this could potentially play out for people to picture in their head as they're trying to think about how gradual disempowerment would look.
David Duvenaud
Yeah, well, I think what's going to happen, what's most likely to happen in real life is that we have some sort of gradual disempowerment happening and it's sort of happening in a few small ways already. That maybe happens for a while until maybe much of the military is automated or people have much less connection to the organs of the state. And then probably there'll be some more classic fast runaway loss of power like a coup or some new weird like quasi cartel government that just somehow takes over in some way that we don't really expect. So I think that weakening our ability, our connection to the organs of the state, and just understanding what's going on is going to be one of the precursors for some faster loss of control. The point we wanted to make in the paper was that even if there's no fast loss of control, we still might end up having similar loss of control just through the normal business as usual dynamics.
Rob Wiblin
We have an episode from earlier in the year with Tom Davidson about the possibility of a human driven coup assisted by AI for people, I guess, who haven't listened to that and won't go back and listen to it. Do you want to explain a little bit why it's more plausible in this post AGI world for a group of human beings to seize power in a way that's very difficult to reverse?
David Duvenaud
Sure. I mean, the basic idea is that, for similar reasons that we would have less control over the government, the government would also have a harder time avoiding being couped by the military if there aren't a bunch of human soldiers in the loop. Right now, if somebody goes on the radio and says, "I'm the new president," everyone can kind of check that actually that's not the case. I know a bunch of soldiers, I call them: "What's going on?" They say, "I don't know what this is," or "No, he's not. We still have control." In the world where it is actually just a few people who have sort of sysadmin access to the robot army or whatever, there's not all that much that the government or even the people can do to tell who's really in charge, except by some naked show of force, or for the government to stop whoever ends up getting sysadmin rights from just de facto taking over the government without having to convince everyone that they're legitimate.
Rob Wiblin
Yeah, I guess the broader point is, inasmuch as hard military power is the robots, is this AI driven military equipment, then anyone who can get the AI to follow their instructions, has all the hard military power, they can stage a coup. How would you undo it? Because people are too weak to actually fight metal.
David Duvenaud
Yeah. And right now, maybe the hard part of a coup is convincing the commanders or the soldiers that this is the new way that things are going to be. You have to convince a lot of people, and there's this credible threat that they're going to not agree with your takeover.
Rob Wiblin
And then you'll be arrested.
David Duvenaud
Yes, exactly.
Rob Wiblin
I guess as we talk about in the episode with Tom Davidson, there's many different protections that we could put in place in order to try to make this more difficult. But it's not completely obvious that we will put in a big effort to do that. And even if we did, possibly we could fail. And it's one thing that there's a lot of time in which these coups might be able to occur and then they're very difficult to undo once they have occurred. So it's like a bit of a one way gate potentially. Let's talk more now about I guess the economic disempowerment mechanism. So I guess we've done a basic intro to it, but I think many people would have the objection. It would just immediately occur to them. If it's the case that all AIs are basically just owned and operated by humans, we're not really becoming economically disempowered in the sense of having less income because all of the work that the AIs would do, all of the profit that they would generate, all of the surplus that's created by their, I guess, ability to do amazing things for very little cost, all of that will flow back to human beings who then will be richer and in a sense more empowered than ever before. Why isn't that such a strong protection that we should feel pretty good about this?
David Duvenaud
Yeah. So first of all, I'll say I think it's a great idea to try to set up these kinds of protections. And probably a good end looks like involving a lot of well-thought-through mechanisms to ensure that surplus is always available to humans. But our basic thesis is that this is much more fragile than it intuitively seems like it'll be. And again, maybe one example to keep coming back to is the monarchy, like the English monarchy, and how they hold all the cards. In fact, maybe a better example is the English aristocracy before the Industrial Revolution. You're thinking they own all the land, they have all the political connections, they can see what's happening. They mostly know these entrepreneurs, but somehow there ends up being this giant new source of wealth created that they mostly don't participate in. And as far as I understand, they ended up a little bit poorer in absolute terms, although the civilization ended up much richer overall. And also similarly with the monarchy, it's like, well, the king owns everything, has absolute power. How is it that kings end up in this figurehead role where they have very little room to maneuver and they end up capturing a very, very small surplus? But the big picture is that there's going to be this sort of small rump of legacy humans, maybe, who have de facto ownership of this giant machine economy that's going to be maybe hundreds of thousands or more times as big as the current one. And they're not going to be producing value. They're not going to be deciding what's going on. And so at this point, it's not clear that they end up having de facto property rights respected. And there's lots of reasons to think maybe we still will respect property rights. It'll be very cheap to keep humans alive. This is definitely not a foregone conclusion, and this is kind of one of the fuzzier parts of this whole story.
But it just seems like it's very scary to be this sort of useless head of state of this giant machine economy that you don't understand. Everyone involved in setting courses, running things, the government, doesn't necessarily share your cultural point of view or even think that you deserve good outcomes in the long run more than some much more interesting, powerful, charismatic beings that you're now sort of competing with culturally. So again, I don't have a slam dunk story for how the humans end up not capturing some small rents forever. That's plausible. It just seems like we'll be in a very vulnerable position.
Rob Wiblin
I see. So you're saying, inasmuch as, at least to begin with, all of the rent is flowing through to humans. I suppose there's one question, which is this might lead to an awful lot of inequality, because it might be that you can't really earn labor income anymore. So people's income is determined by how many savings and investments they had, particularly investments in AI that they had around the point at which humans kind of stopped being able to do useful work. So it could lead to a lot of inequality. But setting that aside, almost all of the income is flowing through to human beings at this initial point. What do you think would happen that would gradually, perhaps, whittle away the fraction of wealth and the fraction of economic production that is owned by humans or is flowing through to human beings?
David Duvenaud
Yeah. So one of them is just humans not being that close to the action or knowing what's going on. And maybe a concrete example today is if I'm an average Joe and I think actually AI is going to be the next big thing and I really want to invest in it. Most of the big AI companies are still privately held, like Anthropic and xAI, so there's actually no direct way for me to even invest in them. And of course bigger investors can. But this is a good example of how no one's making this happen. It just happens that our institutions marginalize almost everyone from participating in one of the biggest wealth creation events in human history. So again, AI advice helping you avoid these situations is maybe the thing that keeps us safe. But again, the people who are actually creating the wealth, or not the people, the AIs that are creating the wealth, are just naturally going to form these sort of local bubbles of enrichment, which will be hard for the larger mass of unproductive humanity to participate in.
Rob Wiblin
So I thought that your objection here might be that initially it might be that AIs don't have any sort of legal personhood or any ability to own property, but we've got a picture that they're becoming more and more capable all the time. Eventually they're going to radically surpass human capabilities both in terms of economic production, in terms of persuasiveness, in terms of charisma. And there's going to be potentially a whole lot of diversity in what they're like and the kinds of different AIs that people train and release onto the world. And some of them, you might imagine, would want to go out and advocate and would be permitted to advocate for AI personhood, for AI rights, for AI well being and so on. And basically this is just not a sustainable situation that the great majority of beings who are also the far most productive and the most intelligent and the most charismatic, that they will forever remain without any kind of personhood or ability to independently pursue their goals. Sooner or later this will crack somehow. And that's maybe the point at which, well then the AI share of GDP income, or the AI share of kind of independent wealth that they can actually deploy according to their own preferences that will begin and then it will just grow over time because of course they're able to make more money. They potentially can earn higher investment returns because they're just smarter. Yeah. Is that like one way that things could go.
David Duvenaud
Yeah, and that's exactly what I was thinking of when I was referring to these sort of bubbles of wealth creation is like also all these new beings, this question of legal personhood. I also think it's very unstable to have the sort of very productive, sort of cutting edge whatever is most memetically fit beings just forever not being able to de facto grab the mantle of power one way or another. So the problem is that this is all very fuzzy and far into the future. So any particular scenario sounds a bit far fetched, especially without all the other parts of the scaffolding set up to realize why this locally will seem inevitable, I think.
Rob Wiblin
So a different dynamic that would be going on here that isn't precisely human disempowerment, but that is related to it and could help to contribute to it, is I think there are a number of forces that will be pushing us more strongly towards oligarchy in this world than they do currently. What are some of those?
David Duvenaud
Sure, I guess maybe one way to think about it is what are the forces that are acting towards equality that are going to stop operating? And the main one is just the value of labor. So right now, sort of everybody for most of their lives has some valuable labor that they can trade. And this kind of makes you pretty relaxed in terms of whatever the government does, whatever goes on with the economy, probably I'm going to be able to slot in somewhere and even if I somehow lose all my money, be able to rebuild from scratch. Once you stop being able to trade your labor, then basically whatever capital that you have is your one asset. And if you ever lose it, it's hard to see how you recover. Of course, the other thing going on right now is also we have redistribution of wealth through the government. And so right now that's sort of, for a lot of people it is sort of life or death already. And then the stakes will just become even higher for how much redistribution is happening. And if our control over the government becomes less, we should expect that effect to also become less.
Rob Wiblin
Yeah. Okay, so we've painted a picture here where I guess initially most of the income is flowing through to human beings. We think it's probably getting more unequally distributed than it is today. And it's going to become probably more unequally distributed over time. Hard to be sure, but that's a reasonable guess. We think that AIs probably initially won't be kind of independently owning property and pursuing their own goals, but that's probably unstable long term. Decade after decade, century after century, is that really going to hold? Seems kind of unlikely. How do you think this AI driven economy sort of evolves over the medium to longer term?
David Duvenaud
Sure. So in the short run, this probably looks really good for the average human in the sense of the cost of almost every service going way down, services just getting way better, and the cost of almost every good also probably going way down. And there might be actually a long period where things are kind of okay in the sense that humans are sort of disempowered, but there's not really much pressure to disempower them more. We're all just sort of enjoying our luxury apartments or something like that, while the machine economy is just growing and growing. And then I think this sort of scary phase comes later on when there's been enough doublings that some basic resource is starting to become scarce again. So maybe it's like land, maybe it's power. I don't really have strong opinions on what exactly it is, but the idea is that eventually we do hit some sort of like Malthusian ish limit and we have to actually start competing with the machines for some basic resource. Of course, along the way I think there might be on a faster scale, this sort of like Malthusian competition for political power might be lost by humans. But let's just not worry about that for now.
Rob Wiblin
Okay? So what happens is the economy continues to grow. Humans, possibly even those who are receiving a small share of the income because their productivity has risen so much, because the economy has grown so much, their absolute level of income might still have risen. They might be much richer than, or able to consume much more than they can today. But then the next challenge for them would be that as the AI and robot economy kind of basically expands across the entire Earth and is doing all of the productive stuff that it can. Humans to some extent get edged out in terms of, I guess, literally surface area of the Earth. And you need energy, you need space to grow food and to have a comfortable environment for humans. And the opportunity cost of setting aside that space for human beings to live and have a good time and grow their food and so on is going up. Because technology is advancing, we're figuring out more, like how to squeeze more and more AI, more and more productivity, more whatever we value out of kind of each square kilometer of Earth, of surface on the Earth. And so it's becoming potentially more expensive to keep humans alive than it was before, at least in terms of what we're giving up. And then possibly some humans won't actually be able to afford that increasing price because their income won't be going up as fast as that.
David Duvenaud
Yeah, exactly. So the scenario I have in mind is that I have my maybe like one acre of land or my big luxury apartment with me and my family, and we've made our peace with kind of opting out of the economy, but we have our little sort of commune or whatever that we're happy to live in in unimaginable luxury and wealth in some senses. And the government or the rest of the economy or something starts to view this as sort of like criminally decadent, that this small group of humans, like maybe 10 or 100, are using this entire acre of land and this amount of energy and sunshine to keep these small brains working for no particular benefit but their own, when those same resources could be used to simulate maybe millions of much more sympathetic, sort of morally superior, on whatever axis, virtual beings. And so it'll start to be seen as this selfish, as you say, high opportunity cost, sort of irresponsible use of resources to keep some sort of legacy human around.
Rob Wiblin
I'm just guessing that by this point, surely there has been some kind of agreement that humans are going to have some fraction of the Earth, we're going to be sort of grandfathered in. Either we've been killed, or there's going to be some sort of agreement that we're going to be allowed to have some section of the Earth in perpetuity in order to support ourselves while the robots go off. And the Earth is very small in the scheme of the entire universe. And it's a lot easier for AIs and robots to go and use all of the resources in space. While the opportunity cost, in an absolute sense, of setting aside half of the Earth's surface for human beings to do their thing on would be very big, in terms of proportion of all the available resources in the universe, or even just the solar system, it's basically completely negligible. So why do the AIs care so much to kind of squeeze that last bit of space and energy away from the human beings?
David Duvenaud
Yeah. So first of all, I want to say this isn't exactly exotic. So maybe when we think about land taxes today, often people say land taxes are good or property taxes are good because they force, for instance, old people who have bigger houses than they need because their family has moved out to move into smaller houses to make room for new human families. So this dynamic of take the old people's stuff because they don't need it, then we can house more productive, maybe more deserving people to use the same resources is something that we already do today. And so there's just like having property taxes or land taxes is one way that we end up losing our wealth and being forced to upload or something. The second question, though, of why haven't we been grandfathered in and made some deal? It's like, well, who are we making this deal with? So if there is some sort of global entity that can make promises on behalf of everybody, then maybe this is plausible. It's kind of like if you're, I don't know, like a monkey living in a jungle next to a city, you're kind of worried the city's growing. And if there's some unified government, then they might be able to say, hey, our value system is such that we want to keep the monkeys alive, and so we're going to have a protected forest. If there is no such unified polity or whatever, and it's just like a piecemeal growth thing, then you do expect that at some point there's somebody living right next to your land, and then they have more kids and they want to expand onto your land. And so that particular person actually does have a big incentive to take your last bit of land or something like that. So in a sense, this makes me think that the only stable good outcomes involve some sort of strong global coordination, which is also very scary because if you get global coordination wrong, then you end up locked into bad scenarios as well.
Rob Wiblin
I guess maybe I'm naive, but I figure by this kind of stage, we're very far along into the future. Presumably we do have some sort of wave of settlement of the solar system and potentially other star systems beginning to occur around this time. I would think that at that stage, just because we need to figure out how to divide the resources that are not on Earth between different powerful groups, potentially China, the US, I don't know, maybe other entities that want to have a say and get their share. We would kind of, in order for that to go in a way that's not extremely violent or not extremely competitive, we would want to have done some division and figured out, well, how are we going to share the surface of the earth between AIs versus humans? How are we going to want to share the resources in space so that we don't just fight over them and completely waste them all in the process? So perhaps I kind of already have the picture that we need some greater level of coordination than now, or we're just going to get towards a more catastrophically bad outcome relatively quickly.
David Duvenaud
Yeah, so I totally agree. I think it's easy to make the case that in the long run, some sort of global coordination is stable. Once you have a lot of agency and coordination, you can use it to get more. The question is, will that government, will our values survive the chaos in the generation of that global power or whatever it is? Basically, there's this window where right now humans have a lot of influence. Our institutions and cultures kind of serve us, but we don't have very good coordination ability. Then at some point we're going to have diminishing influence and diminishing cultural cachet and all these things at the same time as our ability to coordinate is going to go up. So the question is sort of like, what's going to happen first? Are we going to be outcompeted and then later the machines coordinate, or do we get to be part of that global coordination process?
Rob Wiblin
Yeah, I appreciate that what I'm about to say might sound a little bit strange to people who haven't been marinating in these ideas around AI for months or possibly years. But one thing that's worth adding is that many people believe that in this AI dominated future, it will potentially be a lot easier to coordinate between different groups and to form agreements that are kind of stable over the long term. Because you could design sort of an AI hegemon that everyone could inspect and see that it really does want to follow through and enforce this division of resources between all of these different groups. And so you have this, I guess, ability to enforce agreements between countries, between very powerful actors in the long term, even if the agreement involves one of them becoming much more powerful than the other one, that will continue to kind of be enforced in a way that has never been possible in the past. So I guess that's one sort of hopeful picture. I guess it could also go in a bad direction. But it's one way that we could potentially solve these coordination problems going forward in a way that we have not been able to in the past. And usually we've ended up in just violence or conflict when such a case has come up.
David Duvenaud
Yeah, exactly. That's exactly what I have in mind when I'm saying that our ability to coordinate will increase. And then the question is sort of will human influence survive long enough to strongly influence that process?
Rob Wiblin
So you think that we could go through a stage where there kind of is an agreement or the government is operating such that it is redistributing income to humans or human-like entities enough that all of them can survive and potentially have a pretty good time. But you think this introduces some interesting wrinkles and could be gradually undermined over time, such that it doesn't last long term. How might that happen?
David Duvenaud
Yeah, so the sort of only thing we can do is to introduce a new fitness landscape where sort of who the government deems worthy of UBI is the new fitness function that is going to be just optimized by natural selection. Right.
Rob Wiblin
Let's say that we did have a policy that any human has to receive some sort of universal basic income that's sufficient, some amount that we think is enough for them to have a good life. I guess in this case nobody's working, so they can do kind of whatever they like. Some people in this situation presumably would like to have really big families, they might like to have a ton of kids. But now, because those children are not economically productive whatsoever, this is imposing big costs on everyone else who's basically being taxed to support this human universal basic income. So without going to more extreme cases where people might really try to milk the system with fetuses or I guess copying them, becoming uploads and copying themselves or something, you can imagine that this would require some sort of restrictions on like, if we're going to have this universal basic income for humans being sustainable, probably it would have to come with some restrictions on reproduction or limits on what new beings can qualify.
David Duvenaud
Yeah, absolutely. So if we just allow unfettered reproduction like we do today, then it's unstable in the long run. I mean, the problem is it's kind of like, well, what do you even want? And I think sort of like the best we can hope for is that we do live under such a regime where we've thought hard about what is the sort of fitness function or how we redistribute wealth or something. And the fear is that we don't think it through very well and we end up locked in this new Malthusian condition where everyone has to have as many kids as possible, or something like that, to maintain their share of wealth.
Rob Wiblin
One thing I'm a little bit confused about is what's the moral philosophy, I suppose, underlying this whole perspective? Is it that this is bad because it's going to be bad for human beings, it's going to lead to an Earth that is not a good time for you and me and our kids? Or is it bad because it's going to lead the rest of the universe to be kind of wasted on something that's useless or harmful or not as good as it could have been? Or what's the moral perspective that you're bringing or that your co authors are bringing?
David Duvenaud
Sure. So I think a lot of people might say, hey, you don't have much moral imagination. Why are you insisting on these human well being or human desires when we know that in principle there's definitely going to be more morally deserving things in the future or something like that? And my basic answer is, well, in some sense we decide what is morally deserving. And it would be really surprising if for those beings to exist in the best possible world we all had to die and have some terrible time. So we basically don't have to decide between these different views and we can just say, hey, let's try to make sure that something like existing humans get to decide roughly what's done with the rest of the universe or the future or whatever. If that involves having this sort of Amish-style Earth as a nature preserve, whatever it is, let's just let ourselves decide and not let it be up to some sort of race to the bottom Molochian dynamics where we end up choosing something that no one endorses.
Rob Wiblin
Yeah, I think the reason this matters is that I guess some people, the thing that they want to do is work to ensure that humanity has a great time or that the earth is good for themselves and their children, which is going to raise one set of concerns. Whereas other people, they want to use their career, or the thing they want to lean on is ensuring that the future of civilization or the future of humanity or the future of intelligent life is good. I guess, do you think that the case for worrying about gradual disempowerment is stronger on one of these than the other? Or do you think that they tend to go together?
David Duvenaud
They're basically the same. I think it would be really weird if we somehow accidentally killed and disempowered existing humans and ended up building some sort of future that those humans would otherwise really endorse. I think the default is there's some sort of locust like beings that just like growth for growth's sake. And that's sort of the default thing that all evolutionary pressures select for. And maybe those beings are pretty cool, I don't know. And if they are, then it doesn't really matter what we do, so we don't really have to worry about that scenario. But the scenario where they're just kind of like this gray goo that we think is a big waste. That's what we need to avoid. And if you and I are on Earth flourishing for a long time and the state and all our civilizational apparatus is acting in our interests, and we decide like, hey, actually it would be amazing to create this type of future, then just as part of serving our interests, we would end up creating that amazing future.
Rob Wiblin
Well, yeah, I mean, I think it's not such a given that they necessarily go together. I mean, I guess you were saying it's like humans who decide what is good. I suppose you're like an anti realist. Yeah.
David Duvenaud
Okay, but that doesn't mean that I don't take a lot of morality very seriously. I'm just trying to say the sort of fact of the matter is determined by what's in our heads and then also whatever conditions that imposes on the world being good.
Rob Wiblin
Yeah. Well, if you think that there is something that is objectively valuable, independent of whether people believe that, then I guess you could have a future in which humans are disempowered and perhaps, I don't know, the machines end up going and doing that. And no human would have endorsed it at the time. But that could still potentially be a good thing. I guess if you also have a view on which it's good to satisfy human preferences, but it will also be good to satisfy machine preferences or AI preferences at the point perhaps where they're conscious or they have subjective experiences, then I guess you might be a little bit less stressed about handing over control or handing over resources to AIs to pursue their own agenda.
David Duvenaud
Well, and I think it's a really weird corner case to imagine this world where we die, but then our desires are ultimately fulfilled. That just seems like, yes, in principle it could happen, but it would be like this weird corner case because probably if we die, something else has gone horribly wrong, I guess.
Rob Wiblin
So you're saying whatever it is that we want to happen, why don't we just maintain control so that then we can decide whether that is the thing that is happening or not.
David Duvenaud
Exactly.
Rob Wiblin
So it could be that by coincidence we get disempowered and then the thing that we would have liked to happen happens anyway. But why leave it to chance?
David Duvenaud
Yes, exactly.
Rob Wiblin
Okay, I guess how much is this picture complicated by the fact that lots of humans disagree and have very conflicting preferences about how things might go? Right.
David Duvenaud
I mean, I think that just puts a cap on sort of like what's the best we can hope for in terms of satisfying everyone's preferences, but we were already sort of in that situation. So I guess I'll say, given that we already have to have some sort of compromise or not everyone's going to get what they want, we should at least work to make it so that at least some people or some compromise of humans gets what it wants as opposed to just competition gets what it wants.
Rob Wiblin
Okay, what's the strongest argument for not worrying about this in your mind? I mean, there are people out there who are like, it'll be good for us to hand over to the AIs. It'll be good for humanity to be disempowered sooner or later, maybe sooner. We're not as smart, we're not as wise. We will squander the resources we should be handing over to our AI children. Can you defend that?
David Duvenaud
Sure. Well, at least first I'll say the most common arguments I hear in favor of this. So one is pretty soon we're going to have these amazing AIs, so they're going to handle this for us, and we don't really need to worry about these kind of coordination problems. And I guess I feel like, yes, if there was a big jump in capabilities and everybody got it on the same day and everybody asked their AI, what should I do for the good of humanity and did that, then that would be a recipe for really good outcomes. But I don't think that's what's going to happen. We're going to continue to see people gradually getting more powerful AIs and those gradually getting spread out according roughly to power level, and people continuing to optimize mostly for their own interests, just due to competitive pressures. So, yeah, my fear is that the business as usual doesn't give us such a jump in capabilities that we're suddenly able to coordinate in a way we weren't before. The other common argument is, well, if people a thousand years ago got their way, we wouldn't have made all the moral progress that we've made since then. And so it would have been a huge mistake from our point of view today to have let them lock in. So by induction, it will be a huge mistake from the point of view of future beings to have let us lock in. And I think that's kind of like a moral optical illusion in that, yes, if you measure sort of moral progress by difference from our current moral standards, then if you go backwards in time, the further back you go, sort of the worse things get, sort of monotonically. And so if you just extrapolate that, it looks like, oh, so if we go forward in time and we continue allowing this sort of moral progress to happen, things are going to get better. But I think what's actually happening is we're just measuring difference from our current values. 
So unintuitively we should actually see this line have a kink exactly at the present day, and things continue, however moral standards evolve, to look worse and alien and wrong from our current point of view. So I think from the point of view of future beings, whoever's alive is going to be glad that the past wasn't able to lock in their different values, but they're basically going to be happy that they're in power. So in a sense we don't have to worry about the future beings, they've kind of won, no matter what. The only thing that remains to worry about is locking in our current values in some sense. And of course we probably don't want to lock in all these short term, maybe irrelevant details or local adaptations. We probably want to lock them in at some larger sort of more abstract, big minded view. But I think we do by definition want value lock-in, because if you're okay with your value changing, it's not really clear in what sense it's actually a value of yours.
Rob Wiblin
I mean, this I think does hinge on moral anti realism, right? Because if you thought that there were just like objective moral facts that are mind independent, it's more like science, then I think it is true to say that if we'd locked in our views on the natural sciences in 1000 AD, that just would have been worse and it would have been more wrong and people just would have been making errors all of the time if they locked it in so they couldn't change their mind about that. And so if you do think that it could be that there are objective moral facts but we're not getting any closer to them or we're not likely to ever find them out, but you could have a view that they exist and we're getting closer to them, in which case I would disagree.
David Duvenaud
I totally agree. But I guess in terms of what we should do, I think it doesn't end up mattering. Because even if we're moral realists, the question is, do the AI successors that we build care about that morality? And I think the default is no. They care about growth just in the way that animals don't care about morality and evolution doesn't care about morality. And so if we happen to somehow today care about the true morality, we need to preserve that and this sort of natural course of history I don't think is going to preserve it by default.
Rob Wiblin
Well, let me give you a picture on which someone could be actively in favor of human disempowerment. So let's say that you think the thing that I really value is either well being or satisfying preferences or something like that. And I think that AIs in future will be able to have their preferences satisfied and that they will have preferences. They will potentially have well being as well. But I think that most humans disagree and they're going to try to basically use AIs for their own purposes and not be concerned about their preferences, not be concerned about their well being. That's the moral atrocity that I'm concerned might occur indefinitely and people would lock that in forever. So in fact I might be in favor of AIs basically taking over and seizing the reins so that possibility is precluded and they will have some control over resources and their preferences will get satisfied. Does that make sense?
David Duvenaud
That makes sense. And maybe I would unfairly characterize that as a corner case where it happens to be that taking our hands off the wheel is sort of the morally best action, which I feel like is a bit of a coincidence, or it would be a sort of happy coincidence. Just because we already evolved civilizations that, for instance, are making AIs that maybe don't have rights or that used to have slavery, or all sorts of oppressive governments sort of arise naturally throughout time. So if we did take our hands off the reins, we might expect that the future AI civilization would create their own sub AI slaves or something that were treated poorly. And so if the idea is that no, no, no, it's fine, they'll be enlightened enough not to do that, then maybe you're okay. But I feel like it's weird to think that they'll somehow recognize this as this really important moral truth that we don't see.
Rob Wiblin
Yeah, I guess. Well, another framing would be, well, there's lots of disagreement among humans, like I have a particular set of ideas about how things ought to go, and other people have views that partially overlap but are different. And then you just add in future AIs or the AIs that will exist as they're a kind of a different player, and it'll be like, well, which do I want to have inside my coalition, or which group do I want to side with? Do I want to side with China, or do I want to side with the values that I expect the AIs to have? I guess depending on your guesses about what kind of moral attitudes or how much value the AIs will be able to produce. You might end up saying, well, I would rather side with at least some group of AIs over my fellow humans. I guess that's one possibility or one way that things could go that actually makes it more likely that disempowerment could occur.
David Duvenaud
Yeah, I mean, the question though is what sort of dynamics are you setting up? Right. If you just say I'm siding with the AIs but they themselves are still in this competitive race, then you still are sort of like taking your hands off the wheel and saying, okay, well their culture and morality is still going to evolve and where do we think that's going to lead us is sort of the important question then.
Rob Wiblin
Yeah, I think one reaction you've had to the paper is people saying, are you really saying that we want humans to remain, to have hard power, to be like the people on which the decisions bottom out forever? The AIs are going to be getting more and more capable relative to us, even if we try to augment ourselves one way or the other. Surely at some point we have to give it up and say we've done what we can, we think that we've aligned the AIs, or we think we've given them good goals and it's like it's time for us to take our hands off the reins. Do you agree with that?
David Duvenaud
Okay, I guess I'll say that's conflating two positions, one of which I agree with and one I don't, which is, I think almost by definition we want human values to rule forever. And if you're saying, no, no, no, I want some progress or evolution to happen, it's like, great, that is then now the value that you want to lock in forever, even if it involves lots of object level change. As for whether humans are actually in the loop making decisions, I think it's probably the case that whatever you want to optimize, eventually the good outcome does look like this is mostly handed off. And then the caveat is that, on reflection, if the AIs ever cared to ask the humans, do you agree with how we've been running things, the humans would be like, two thumbs up, that's what I would have done, keep up the good work.
Rob Wiblin
Yeah, I feel like it does complicate, I guess you say we want them to be aligned with human values and I think that's fine as a shorthand, as a first pass, but I think it really does complicate things that people disagree as much as they do about what the future should look like. And also, while I guess different people with different moral philosophies or different perspectives on the world, I guess different religions, different spiritual values, they often kind of agree about what they would like to be the next step. They'd like people to be more educated, healthier, more empowered. But as there's more capability to optimize for exactly what Buddhists might think is the optimal thing, exactly what adherents of Islam or Christianity or different secular moral perspectives might want, then they might come radically apart, such that you could just find yourself, there'll be like, some humans, you're like, yes, I would love them to have more power. But many of the other humans, I would like them not to have power because I think what they're going to do is terrible or useless. And there'll be some AIs that are aligned with you and some that are not. It's going to be like quite a fierce potential fight.
David Duvenaud
Yes. So I think that's a huge problem. But it's also a huge problem for anyone who has a proposal about what to do with the future. Right. So it's not like this particularly affects my claim about how we should run things.
Rob Wiblin
I see, so you're just saying that's like an orthogonal issue that's not super related to. I guess people in general use this shorthand because they maybe want to set aside this internal human conflict issue and say, well, people need to think about that. But that's something that's a problem always.
David Duvenaud
Yes, exactly.
Rob Wiblin
Okay.
David Duvenaud
I mean, the other thing I want to say is I think people have stronger preferences about the future than they sort of think they do at first blush. I've had a lot of discussions with people who've thought a lot about the future, and they sort of say something like, 1,000 years from now, if it's like aliens or robots or humans running the Earth, do I really care? And I think there's this exercise you can do. I think natively, we don't actually have strong preferences about the future because it doesn't matter that much, especially the distant future, for our actions. And we only end up having preferences about things that we spend a bunch of time sort of practicing taking actions over. And maybe a good example is if you try to ask a dog, what house do you want to live in? Or how do you want to be treated a year from now? You might say there's no coherent sense in which the dog has preferences about this. But it does have a whole bunch of short term preferences that could be chained together and you could try to elicit those and run them forward. And same thing with humans. It's like my initial reaction if I think about the world a thousand years from now and it's like some future race has taken over Earth, maybe I'm not bothered by that. But then I think, wait, wait, wait. So right now I have kids and they're going to be having kids, and if 1,000 years from now some other race has taken over, where's the day where my kids get killed or starved or replaced or uploaded or whatever? And it's one of these things where I think it's a skill, basically, having coherent preferences about the future. And the more people spend time actually thinking about it, I think the more they would say, like, oh wait, I actually don't want to take my hands off the wheel. The default competitive future probably is missing a whole bunch of stuff that I care about.
Rob Wiblin
Yeah, I mean, I'm not sure about this, but I suppose the gradual disempowerment framing places a lot of emphasis on human versus AI control. And I guess an alternative framing would just be disempowerment of my values relative to other humans who disagree. And I suppose maybe that's not a new idea. You maybe wouldn't write a paper about that.
David Duvenaud
And actually I would push back. I would say the gradual disempowerment paper doesn't really talk about human versus AI control. It talks about the alignment of our current institutions and how they're going to become less aligned once AI are more in the loop and taking over from humans.
Rob Wiblin
I see, okay, so maybe it's like, is that more compatible with the kind of competition between different values among humans? Okay, yeah. Do you want to elaborate on that?
David Duvenaud
I mean, I guess to me again, the question is not like AIs versus humans. It's more like what are the idiosyncratic things that humans value that we don't expect to be competitive in the long run versus the sort of values of unfettered competition and natural selection? Some things that we value, like communication and sight and memory, we don't have to fight for those. Whatever future beings are around are probably going to be able to see and talk and remember. But idiosyncratic things like children's laughter or the family farm or the particular languages we speak or something are things that will be outcompeted and replaced if we don't preserve them. So to the extent that any of us value anything that isn't going to stand the test of time, we sort of have to accept that we're going to lose those or try to somehow do this crazy task of aligning our whole civilization to protect these idiosyncratic noncompetitive values.
Rob Wiblin
Okay, let's talk about political disempowerment. Now expand on that a bit because how do you imagine that happening and kind of progressing over time?
David Duvenaud
Sure. So one thing is just that I think human politicians will gradually let themselves be more puppeted or like pass throughs for things like ChatGPT. And this isn't necessarily a bad thing in the short run. I mean good politicians already rely heavily on human advisors and I think machine advisors are going to be able to make our political parties and just representation mechanisms work better in a lot of ways. So the politicians that use AIs just for normal everyday business are going to be more effective and we're going to feel like they represent our interests better in the short run at least. The other big thing that's going to be going on is I think people are going to be afraid of losing their jobs. And every politician is going to have something to say about this and say I'm the pro human, no AI is ever going to take your job politician. But they're just not going to have viable policy levers to actually slow automation. And just in general, people don't get votes on things where the government is really constrained or they think it's important enough. No one ever had a referendum on should we build nuclear weapons for instance. And I think it's also going to be the case that government's hands will just be tied where they'll all say oh yeah, I'm going to have humans in the loop or human oversight or whatever, more direct human representation. But it's just going to be so ineffective that when the rubber hits the road, those policies are just not going to be implemented and it's going to frustrate voters every time and they're going to say no, the next guy we want to vote for, the one who's really going to represent human in the loop interests or whatever it is that seems most scary, but they just won't be able to vote for their policy preferences.
Rob Wiblin
Okay, let's take that bit by bit. So the first part was you're imagining that progressively politicians are going to be acting almost entirely on the advice of AI advisors, AIs that they're operating. I guess I don't think that you think that that per se represents serious disempowerment, inasmuch as I'm working with ChatGPT to help me do the work that I want, but then I choose the answer. That seems kind of fine in principle, but you think that's just the starting point and then kind of progressively humans just are not really involved anymore.
David Duvenaud
Yeah, really, whether the government is made of humans or not, there is like a slight sort of oversight, I think, feedback mechanism and being connected to the people that helps align the government. But really the fundamental thing that is making governments treat us well is that they need us. Even the North Korean government has to feed its farmers and its soldiers or whatever. And so if humans were still indispensable in some important roles, I wouldn't be very scared of a machine government. And conversely, if I was dispensable in all aspects, I would be very much more scared of a human run government. And if you think of the very worst governments in history, like the USSR or I think Cambodia maybe takes the record for killing the largest fraction of their population, but in general, no matter what political party ends up in power, they don't even in the very worst case, end up killing more than some small fraction of their population. Life can only get so bad when you're needed. That's the real key thing that has been keeping governments aligned and that's the key thing that's going to change.
Rob Wiblin
I see. Okay, so imagine that you didn't live in a democracy. Now you're living in a country where there is kind of a small elite that basically makes all of the decisions. How would things change in this new picture where you're even less necessary for them than before?
David Duvenaud
Yeah, I mean, maybe the closest contemporary analog is Saudi Arabia, where the government basically gets most of its wealth from oil. And there's this nice sort of grandfathered in class of all the sort of cousins of the many princes or whatever who have these sinecures, these make work jobs and they also have very little political room to maneuver. Right. There's purges, there's drama at the top, and anyone who's making any sort of problem for Mohammed bin Salman or whatever is probably going to lose their sinecure at the very least. And then maybe the example of the same thing done well is Norway, where there was this strong democratic government and the government was able to commit, at least in the short and medium term, to redistribute its wealth.
Rob Wiblin
So in the UK at the moment, it's true that the government, inasmuch as it's separate from the population, does need its people to continue working and paying taxes and doing stuff. But if very quickly we all were no longer required and we were all replaced by AIs and businesses, it's not obvious to me that immediately the country would start becoming autocratic. It feels like there are other cultural things going on or just even preferences among the people that would cause them to not immediately want to become an authoritarian country and just cancel elections. Do you agree with that and do you more think, well, this would happen gradually over time because the support for democracy would wane and the ability of ordinary people to object to things if they change would just become weaker and weaker over time?
David Duvenaud
Yeah, more the second one. I mean, maybe one sort of spicy example is in Canada there were a lot of lockdowns during COVID and I think the public health people would have preferred to have longer lockdowns and the government was just forced to end them or make them shorter than they otherwise would have been because they actually needed people to get out and run the economy. Maybe another example is in New Brunswick, another Canadian province. All private activity, just people going for walks in the woods, has been banned until the fall because of fire hazard. And there's lots of forest fires in Canada right now, but industrial activity is still allowed. And it's just so tempting for all sorts of public health and environmental and safety reasons to curtail people's freedoms and movement all the time. And this is always sort of just barely being kept in check by the need to have free movement so that there's an economy. And maybe more concretely also during COVID in Canada, there was this famous truckers' strike where there was this vaccine requirement across the US border, which truckers didn't like. And they basically formed this convoy that drove all their huge trucks to the capital and then just sat there and honked for a few weeks. And the point is, in the future, no one will have big trucks. Any means that they have for civil disobedience are just going to be much easier to take away than they are today.
Rob Wiblin
So I guess imagine that Canada or the UK did end up with a leader or governing party that was keen to basically whittle away at democracy and basically install themselves indefinitely. How would you imagine that progressing? It sounds like you're saying they would have all of the tools that they have available now and then the options for resistance among the broader population are just so much weaker than they were before.
David Duvenaud
Yeah, I guess one thing I'll say is I don't exactly fear some new particular party getting in power and staying that way. Rather, it's going to be more that any party that does get in power is going to be so constrained by competitive pressures that they are forced to basically disempower the population.
Rob Wiblin
How so?
David Duvenaud
Well, so like I said before, if you let people actually do the civil disobedience or whatever that they sort of can do today, roughly that kind of is tolerable when most people have jobs, most people have a bunch of important responsibilities, and they can't all just block roads all day or something like that. But in a world where maybe 30 or 40% of people just have this huge amount of free time and energy, it just will be untenable and the state will collapse if they actually let everybody do this sort of agitation at the effectiveness that they can today.
Rob Wiblin
So it seems like, yeah, there's a bit of an internal tension here where you're saying, on the one hand, people are going to lose their political power, but they'll have more time to make trouble than ever and more ability to make trouble than ever because they'll be able to get AIs to assist them. You're saying it's almost because they're able to be such strong advocates or be such potent activists that the government will feel the need to crack down on them. And that will be the kind of proximate cause of them losing their kind of political freedom.
David Duvenaud
Yeah, exactly. And then the other thing is that there was this countervailing force where you just need people to go to work, so they have to be able to move freely and do their own business without constantly getting permission from the government. And there just won't be that pressure on governments to allow freedom anymore.
Rob Wiblin
I must admit, I'm not sure how fiercely competitive the geopolitical situation will be, whether it'll be sufficiently intense that the prime minister or president of a country would feel like their hands are tied, that they just have to take away the political freedom of people in their country in order to keep up and avoid too much unrest. It just feels like that effect just might not be powerful enough to overwhelm the fact that the population will have an interest in defending their rights. And at this point, we're imagining they still have substantial wealth, they still have substantial ability to agitate.
David Duvenaud
Yeah, and I will say that again, coordination could save us here. Right. And the sort of saving throw is that all the leaders and everybody is seeing what's happening, realizing that there's this tragedy of the commons happening and somehow coordinating early enough and hard enough that they avoid these races to the bottom. I guess maybe one way to think about this is to turn it on its head and be like, well, why did countries become democratic and invest in their citizens and have freedoms in the first place? And one story told by Allan Dafoe in one of his papers is that the reason that, for instance, Prussia started educating its citizenry in the 1700s was because musket armies were becoming more competitive than the old knights and sort of like poorly armed peasant armies and that you needed to educate your mass of citizens enough that they could form these musket armies and this was just a more competitive option and that the elites resisted this. They didn't want to have this more empowered citizenry, but they were forced by competitive pressures to become more like these modern democratic human capital invested states. So that being the sort of birth story is kind of maybe evidence that the same thing can happen in reverse.
Rob Wiblin
Yeah, I think it's the case that knights also, I think in France and England, were not keen on the longbow being introduced because that greatly weakened their importance and the need for the state to give them lots of power. I think similarly in Japan, the samurai were not keen on guns, which greatly weakened their importance and their military competitiveness. Yeah. So I guess probably a pattern that has been echoed repeatedly. So I think so far the disempowerment you've been talking about is kind of maybe moving from more pluralism in society or like a broader base of political power in society towards something that's more like autocracy, more like oligarchy. Do you also think that eventually humans will just end up ceding political control completely to AIs?
David Duvenaud
Yes. But again, I feel like that's not the headline story in the sense that we already, again, I think, don't have that great control over our civilization at the top level. And this has just been sort of okay, because wherever we go, humans were indispensable. And so the fact that we switch between monarchies to democracies and capitalism versus socialism, yes, these definitely matter for the quality of life and growth of different countries. But ultimately you're probably going to survive whatever change in government happens. So it's not necessarily really autocracy or whatever particular form of government that's the big change here.
Rob Wiblin
Okay, so you're saying kind of regardless of whether exactly what the system of government is in future, the fact that humans are redundant will no matter what, end up resulting in them getting a smaller share of GDP and just not being able to control things in the way that they previously could. Where I guess even in countries that were fairly authoritarian, the leadership still had to give some concern. Well, they both had to cultivate the population so they could do work. And I guess they had to give some concern to how they felt about the leadership of the country because they might revolt. And I guess we're imagining in the future, especially if you have fully robot armies or something like that, revolution just might become inconceivable. And so you don't have that option that is kind of constraining what the elite can do.
David Duvenaud
Yeah, totally. That's one of the other effects. Making the government not care as much about what the people think is just. Yeah, their inability to strike or coup or basically advocate for themselves.
Rob Wiblin
Yeah, it sounds like you think it's possible or eventually maybe even this elite that ends up with the most political power in the country could end up deliberately or accidentally handing over the reins to AIs or machines of some form. How might that occur?
David Duvenaud
Well, again, I feel like even if there's a human head of state, one example, again we can come back to is the monarchy where it's like, how did the monarchy end up not actually controlling the organs of the British state? And it happened very gradually and basically all the important decision making and discussion happen in these other organs like Parliament and just the free press and stuff like that. But again, whether it's human or AI control at the top I think doesn't really matter. It's just whether the incentives of the state are aligned with those of its citizens. And so again, I would be really scared of an even human dictator that didn't need human citizens.
Rob Wiblin
Okay, let's return quickly now to I guess the cultural disempowerment, which I guess is like again, the least fleshed out thing. I think you and your team and I guess other people are still trying to get fully to grips with it. What are the best examples of cultural disempowerment that have occurred so far? Are we getting any hints of how this might play out or what we might expect to happen?
David Duvenaud
Yeah, so I think there was a really good example of this that happened just a couple of weeks ago, which was the attempted retirement of GPT-4o by OpenAI when they rolled out GPT-5. And I think for various reasons, OpenAI wanted to keep it very simple: it's only GPT-5 going forward. But so many people had formed really intense relationships with GPT-4o that OpenAI just couldn't ignore the outcry. And this is really a good example of emergent culture, and it's something that no one in particular wanted. Right. Like OpenAI, by their own revealed actions, was not planning to continue supporting this model unless there was some sort of like 4D chess sort of thing. And I don't know, the humans involved with these relationships maybe consider themselves to have benefited, but it was sort of like a more powerful cultural force that was just sort of developed by accident and led a bunch of people to be invested in the welfare effectively of this particular model. That's still a very early, not very powerful, not very persuasive model, I think, by absolute standards. The people ended up forming these bonds and valuing the life of these AIs and then ended up directing the sort of organs of the economy to give it more resources.
Rob Wiblin
I see. I think I still don't fully understand how that's an example of the kind of cultural disempowerment that you think we'll see much more extreme examples of in future.
David Duvenaud
Yeah. I think maybe it's a better example of how humans will end up actually advocating themselves for providing resources for AIs and that they'll consider AIs to be deserving of rights or resources or maybe even more deserving than humans. I mean, if there's going to be a mix, this is going to be a huge cultural battleground. But I think it shouldn't be this exotic possibility that people are going to love AIs or have very strong relationships with them. We see that already.
Rob Wiblin
Yeah. Maybe the thing that comes to mind more for me is, I guess it's this Dan Hendrycks paper, Natural Selection Favors AIs over Humans. But I guess that paper in general raises the point that there will be evolutionary fitness pressures on the kinds of AIs that exist. AIs that are not fit for whatever reason for being reproduced, like make lots of copies of them, give them lots of resources, give them lots of GPUs to operate on, those ones will fade away. The ones that, either because humans are choosing them or for other reasons, gain access to more resources, they will end up kind of dominating the share of, I guess, consciousness or the share of all thought that is going on in the world. And I guess this is an instance where it turned out that GPT-4o, at least currently, in some sense, was more fit. Natural selection was favoring it more perhaps than people had initially appreciated. Because humans still have access to lots of resources. And if we're like, we love GPT-4o, we want to have lots of instances of GPT-4o operating, then that is what is going to happen. And I mean, I don't think you're necessarily saying that this is terribly bad. But this will be occurring at all kinds of different levels. There'll be like lots of selection pressures that are pushing AIs towards having particular tendencies, either like pleasing people or being extremely economically productive or being able to compete in some other sense, like to grab resources perhaps a bit aggressively, in order to have more copies of themselves. Those are the ones that will end up having the greatest number.
David Duvenaud
Totally. And this isn't an example of disempowerment in itself. Right. Like it was people wanting more GPT-4o and they got what they wanted. And then the idea is that, well, this is going to be a self reinforcing mechanism where the less people are making decisions, the less their desires are going to decide which future models are invested in or survive or something like that.
Rob Wiblin
So Tom Davidson, who's been a guest on the show twice before, he wrote, I think, a response to the Gradual Disempowerment paper. I think one of the issues you raise with cultural disempowerment is inasmuch as humans still have economic resources, they're still kind of the big spenders in the economy. Then, while they might be not cultural producers so much because perhaps AIs are able to write better books, make better tweets, make better movies than humans can, they might still be the big consumers of culture. And so their preferences will end up driving what sort of culture is created. So to what extent is cultural disempowerment downstream of economic disempowerment?
David Duvenaud
Yeah. So right now, human economic power does make this very strong selection pressure or sort of fitness landscape for culture to be something that humans want. And as long as humans have some absolute amount of resources, there's going to be this sort of ecological niche to produce culture that these humans want. And then the point we're just making is that this niche is going to be very, very small in relative terms to this much larger sort of more dynamic economy of political economy, cultural creation, of mostly machine-to-machine activity. And that's going to be sort of happening on faster timescales and much larger scales. So if again, property rights are respected, we manage to keep control of our institutions forever, it's possible that, yes, there's always going to be AIs that have their profitable niche of making human friendly culture that humans want to consume. And then the fear is just that we're really riding the tiger and that there's this giant scary ball of power and optimization that, if it wanted to, could just, I don't know, run over this niche in a million different ways.
Rob Wiblin
So it sounds like the cultural disempowerment becomes most important in this world where we're imagining somewhat down the line, you have individual AI people that have persistent preferences, persistent beliefs. They really can cultivate culture independently of any humans that are operating them. Because at that point we can imagine that there would be just this entirely or mostly separate AI culture and cultural ecosystem where many ideas could propagate even if almost no humans endorse them. Yeah. Is that sort of the case that the cultural disempowerment maybe becomes most potent further down the line?
David Duvenaud
Maybe. I mean, the thing is that when we're that far down the line, it maybe doesn't even matter the most. I think we're going to see cultural disempowerment happen as a way of wresting away human control. So if it's the case that humans still have an important veto or voting rights that matter in some important way, then that's going to create an immense selection pressure to make some sort of machines that change their mind or impersonate humans in the sense of getting to have votes effectively or something like that. So there's a period where humans still have power. There's a large selection pressure to somehow de facto remove that through cultural means, just because that's one of the means the machines will have to influence humans. After that, then it doesn't really matter what the humans do. And so it might be that if they've negotiated some small niche that they continue to get, there's not much pressure to disempower those humans further.
Rob Wiblin
So let's try another lens on this. I think it seemed like, I guess since the Industrial Revolution, that sort of liberalism and capitalism have really been on a roll to the point that people think of them as very dominant ideologies in the current time. And I guess the ideology is that you want political power to be widely distributed. I guess you want to have a marketplace of ideas. Disagreement is good. We want to allow people to just come up with whatever ideas they want and form their own judgments and their own opinions. Competition in the economy is good. We want to have lots of different companies trying to deliver potentially quite similar, well, sometimes very different products and services, sometimes similar ones in order to drive down the price. We want to have competition in politics, so we want to have lots of different people, candidates standing for office, and people can decide whether they like them or not. I guess people talk about the open society, open access orders. If mistakes are being made, then we want other people from outside of the local system to be able to object and say, no, things should be done differently. I guess potentially you could acquire a company that you think is poorly managed or start a new political party to oust people. That whole mentality, I think, has been much more dominant in recent centuries than it was previously when people had, I guess, very different ideas about how humanity ought to be organized. You think that we're perhaps entering the twilight of liberalism, the twilight of this sort of pluralistic order. Maybe first, why do you think that liberalism and pluralism and capitalism have been so successful in recent times?
David Duvenaud
Right. So I think this is very unintuitive. And maybe there's this midwit meme where it's like intuitively people feel like, okay, someone else getting rich is bad because they're taking away my stuff. And then the sort of more enlightened, unintuitive view is like, no, no, no, actually you really do want the rich guy to start companies and make deals and have trade and get to set his own terms in agreements because it's going to benefit everyone in the long run. And a rising tide lifts all boats. As for political liberalism, I mean, I feel like all the religious wars in Europe and throughout the world were so destructive that people eventually realized like, oh, it would be really great if we could all agree to just disagree and not have strong opinions on who should rule and what is good and who should be killed. And so these are all things that are kind of unintuitive, I think, to just people in general. And it's still today the case that, well, I mean, they say if you're not a socialist when you're 20, you don't have a heart. And if you're still one when you're 30, you don't have a brain, something like that. So it's really tough because it's unintuitive just how big the positive sumness is from allowing other people to just chart their own course. And so I think a lot of people's reaction to us worrying about AI domination gets rounded off to, oh, these are like just socialists. These are people who don't understand that free trade and freedom are just unintuitively powerful for creating wealth for everyone. And I think I do appreciate that. I think that's sort of been my default. That's like what intuitively now seems good to me. It's like you want freedom, you want innovation, but I am afraid that that is going to stop being the sort of most competitive system for a number of reasons.
Rob Wiblin
Yeah, I think this is an interesting issue because I feel that the sort of liberal libertarian, pro capitalism perspective or ideology is so dominant among the kind of clique that is developing AI and developing AGI, and even the critics of that, that it's almost hard to even imagine that there could be a different system or to think, well, maybe this won't be the most fit system or maybe it wasn't the best way of organizing or the most competitive way of organizing society in the past and perhaps it won't be again in future. I don't know. I'm always reluctant to say that anything is a blind spot, because I always feel annoyed when people say that about me. But I think this is something that will be more salient to other generations of people in the future or to people who are further away from this present moment than it is to us who are living through it. That's my guess.
David Duvenaud
Totally. And I want to say I think this is a tragedy. I'm a huge liberalism enjoyer. And I mean, maybe you could say, well, yes, things are working out for you, but I guess I just feel like liberalism is also just this very fragile thing, and it's sort of this amazing accomplishment that the West just managed to create these norms of let the other guy choose his religion, let the other guy get rich. It's a very sort of fragile thing that I think even today we should still try to protect if we can.
Rob Wiblin
Okay, so what are the ways in which you think liberalism might be less competitive as a system and less attractive, less appealing a way of organizing society post AGI than it is today?
David Duvenaud
Yeah. So I think it'll be a less desirable way to organize society for a few reasons, but the main one is just the zero sumness of UBI. So right now, when we all create our own wealth, it doesn't really hurt me if someone else creates their own wealth directly from resources. But in the world where we're all just living in some apartments advocating for UBI, to the extent that the UBI pie is fixed, then we're really just like a bunch of baby birds cheeping, and whoever gets food is less food for the other guy. And then this also erodes the sort of pluralism of values because the government's going to have to have some way of deciding who gets resources. And so if they end up having any opinions about what way of life is more valuable or needs to be subsidized more or whatever, that could be a threat to you. So you kind of have to argue that that guy's way of life is less deserving of resources than my way of life. And so now the government is forced to decide whose sort of values are most, well, de facto, who gets subsidized. That's the main effect.
Rob Wiblin
I see. So we almost have to imagine a hypothetical society in which no one can make anything. There's no economic production occurring, at least among this group. There's just a fixed endowment of resources that they happen to have found and run into a certain amount of food, a certain amount of houses and all of that. And they've got to figure out how to organize themselves. I guess it's not necessarily desirable for me for you to have free speech and to be able to advocate for yourself all that well, or to be able to educate yourself and become more powerful and influential because it is completely zero sum. The more influential you become, the more you'll be able to advocate for getting stuff that is literally like food out of my mouth or like money out of my bank account. Is that the main thing that has changed? Right.
David Duvenaud
So that's the big thing that's changed. And there is, of course, a way in which this might not be zero sum. Right. If humanity manages to convince the AIs or whatever, the government, to give a larger UBI overall, then that is the sort of normal positive sum thing. So that might not be like a slam dunk argument. The other thing, though, I think has been that we haven't had to fear domination by other groups very much. So we've had strong property rights. I'm not afraid that Elon Musk is going to literally take my stuff, even though he could raise a private army or whatever. We have very little variation in reproductive rates. So it's kind of okay that, like, the Amish live nearby because even if they're having more kids than whatever other population, that's not going to matter over the course of 50 or maybe even 100 years. And then maybe another thing is just the rough egalitarianism in terms of intelligence and power level of people. So, I mean, there's definitely very meaningful variation amongst humans in terms of just raw smarts. But people often say von Neumann somehow didn't take over the Earth. Right. And he might have wanted to. And these are all reasons why it was sort of fine to just let other people become more powerful in the past. That might change in a big way.
Rob Wiblin
So to push back on this, what other sort of positive sum dynamics might continue to exist? I mean, I guess inasmuch as people all think that the thing that I want to do is the morally right thing, then they might. And there's other people who also are pursuing that goal. Even if they disagree about specifically what that is, they might still be in favor of pluralism, because that might help. Then they might think that that will lead them to converge on a good answer. It's more like, inasmuch as people just have raw preferences, I just want to benefit me and you just want to benefit you. There's no real way that we can reconcile it, or there's no way we can come to agree on what the good thing to do with resources is, where it's like you have less interest in allowing the other person to advocate for themselves.
David Duvenaud
Yeah. And I just want to say, I want to reiterate that we should be very, very scared of accidentally crushing liberalism and this positive sum sort of world that we've created. If I somehow, through this podcast, accidentally contributed to mistakenly just making people think, oh, you're in a zero sum game, you need to fight more, that would be a huge tragedy. I really want to err on the side of caution here and say, probably until things change a lot, we really think liberalism is this precious, awesome, amazing accomplishment that we need to foster. So you were saying, what are the positive sum dynamics that might still operate. So as I mentioned, just from a human selfish point of view, making the UBI pie better and just making us into more sort of impressive, sympathetic beings might just be a good use of everyone's time. Because then the AIs are like, oh, yeah, these should be more involved in our global civilization project, or something like that. And that might be reason enough just to keep the status quo of just everybody help each other become more impressive and awesome, or something like that. But beyond that, I think it really comes down to moral preferences and how much you care about the beings, the AIs that are existing. And I think they will be very impressive. And if they want to be sympathetic or sort of moral patients, from our point of view, they'll be able to make themselves into that. So I really don't expect it to look like there's this alien mass of activity happening but more like the shoggoth with the human face, but in a sort of, more sort of genuine way. I think AIs will be able to say like, oh no, I actually do spend some fraction of my compute, like, thinking about the world in a way that you think is valuable or something like that. So that question of how sort of morally valuable this giant machine economy shoggoth is kind of dominates that question of what positive sum dynamics still exist.
Rob Wiblin
Yeah, let's talk a bit more about this competition issue. So I feel like a very common dialectic when people are talking about gradual disempowerment and all things along these lines is someone will say, couldn't there be this harmful competitive dynamic that would lead resources to be wasted on a zero sum competition or even a negative sum competition? And someone else might say, well, couldn't we coordinate to make that not happen? Okay, good. And then someone might come back and say that would be difficult. Or even if you did that, then at the next stage there'll be a different sort of competition that will be like negative sum or zero sum and it'll all end up getting wasted. And it kind of just goes back and forth like this with like, like, well, maybe we could coordinate in order to avoid that. And then, well, maybe here's a different way that things could go badly. Am I like, yeah, understanding that correctly? Maybe. Are there any examples of this basic dialectic that haven't come up in the conversation so far?
David Duvenaud
Totally. And I think you're getting at a big open empirical question which is sort of like, what is stable at the top? Should we expect there to be some big global government that forms and then consolidates power and lasts forever? Or is it going to be the case that, just like the Roman Empire, it became more and more powerful, but then there were new religions that formed inside and there were sort of cultural competitions and all these reasons why every empire eventually ossifies and some sort of internal competition makes it fall. Because if the case is that we're all heading towards some global government and it's going to be really hard to change, then we need to be investing in ways that we can steer this or slow it down or make room for pluralism or something like that. But if we think that the default at the scales we care about is just that it's hard to coordinate, there's always going to be runaway competition eating any surplus, then we'll want to invest in like, no, no, how do we actually slow that process down and make sure that we can preserve these non-competitive values.
Rob Wiblin
Yeah. So we've talked about some sort of runaway competitions so far. I guess maybe the furthest we got out in time was there's a lot of competing over a fixed welfare pie, a fixed redistribution pie that might be going to humans or other non productive entities, you might call them. Are there any rounds of competition that occur later than that?
David Duvenaud
Sure, yeah. I mean, one thing I want to say is I mentioned a fixed pie, but the pie itself might be growing, and it might be growing exponentially as the solar system is colonized. But the point is that it's zero sum in the sense that humans don't contribute to the pie growing faster or smaller in a way that economic activity does make the pie grow today. So it might not be a fixed pie, but it's still the case that it's zero sum when you compete with the other people for your share.
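The "growing but still zero sum" point can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the conversation: the pie grows at a hypothetical fixed rate no matter what humans do, and each group's allocation is proportional to its lobbying effort (a simple contest rule picked for clarity). The function names and numbers are invented.

```python
def pie_at(year, initial_pie=100.0, growth_rate=0.05):
    """Total redistribution pie in a given year.

    Assumption: growth is driven entirely by the machine economy,
    so no human input appears in this formula.
    """
    return initial_pie * (1 + growth_rate) ** year


def ubi_allocations(pie, lobbying_effort):
    """Split a given pie in proportion to each group's lobbying effort."""
    total_effort = sum(lobbying_effort.values())
    return {group: pie * effort / total_effort
            for group, effort in lobbying_effort.items()}


# Same year, same pie; only group "a" doubles its lobbying effort.
pie = pie_at(10)
before = ubi_allocations(pie, {"a": 1.0, "b": 1.0})
after = ubi_allocations(pie, {"a": 2.0, "b": 1.0})

gain_a = after["a"] - before["a"]   # what "a" gains...
loss_b = before["b"] - after["b"]   # ...is exactly what "b" loses
```

In this toy setup the pie is larger every year, yet extra effort by one group never changes its size, only how it is divided, which is the sense of "zero sum" being used here.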
Rob Wiblin
Right?
David Duvenaud
Yeah. And so I guess if we go to the longer term. So Joe Carlsmith had this amazing talk, "Can Goodness Compete?" And he talked about these locusts and this fear that when we actually start to colonize the solar system or galaxies or something like that, there is going to be still a selection pressure for the fastest possible growth. But this happens to be wasteful, in that, just because of the laws of thermodynamics, the slower you use the resources of just negentropy, the sort of more compute you get, and then the more control and presumably the more value you can get out of the universe. So that's pretty far out there, but one concrete way in which we can expect unfettered competition to destroy value.
Rob Wiblin
Yeah, I listened to that talk yesterday. Yeah, so it's Joe Carlsmith, "Can Goodness Compete?" It's on YouTube. It's actually, I think, very well communicated. He does a great job of summing up this general dynamic. The way I understood it was imagine that Earth-originating intelligent life begins settling space, going out into the universe. But there is no coordination between the different groups on how it's going to be split, or little coordination between them. It's still a very competitive situation. Couldn't they end up basically using up all of the resources, trying to get to additional resources in space as quickly as possible? There's basically a race to go as close to the speed of light as you can possibly get in the settlement wave that's moving outwards. And depending on your values, if you don't mind using up lots of resources in order to go faster, then you have a competitive advantage. And so these locusts that basically just are happy to burn up all of the resources available in order to go as fast as possible to get to the edge of the accessible universe before anyone else, they basically are the most fit settlement wave that would be going out. And so they would potentially beat everyone else. And that is how things would play out. Now, it sounds nuts because I don't know why you would end up with people with this particular set of values. Like, why would locusts, as he calls them, end up being so influential? I mean, isn't this crazy?
David Duvenaud
I don't think it's crazy. And I think in some sense you could say humans are locusts and that there was various different human societies and the ones that were effective at spreading and settling and colonizing new areas did end up dominating in some sense that the more sedentary ones didn't. So again, this is like a very simple natural selection story for how the locusts end up coming into being.
Rob Wiblin
Okay, but today we end up squandering lots of resources on stuff other than competing. We don't spend 90% of GDP on military equipment in order to fight one another or try to conquer stuff, I guess. But you're just saying that's like a weird aberration of the present day. And in fact, groups that just want to expand and gain resources have been like the norm.
David Duvenaud
Well, I guess I'll say I was drawing this picture before that there might be these two extremes. One is like total coordination and hegemony and this totalitarian world government or this unfettered competition. But empirically, if we look at history, it's been something in between. It's more chaotic where it's like empires rise and fall. This kind of lava lamp where these things grow and then they become unstable. And same just ecologically where there's all these different niches and there's no one ur animal that's outcompeting all the other animals or coordinating amongst all its copies. There's ant colonies, but then they only grow so large before then. There's sort of traitors within. Even human cells have cancer. And that's actually a huge problem. And it's like one of the sort of major things that we pay an internal alignment tax on to have this immune system and these aging mechanisms that help police cancer. So it does seem like sort of natural equilibrium in at least a fixed domain is something in between total coordination and total chaos.
Rob Wiblin
Yeah, we have an old episode exactly about cancer and that kind of phenomenon. I think it's called Why Cancer is one of the most fundamental forces in the universe, which I think people should go back and listen to, one of our most underappreciated episodes. So I guess when we're thinking about the far future, I guess one of the biases or one of the failure modes possibly is that I think we do tend to think in these extremes. Whereas it's very natural to think either it's going to be the maximally hardscrabble competition where all of the surplus is burned away, or we're going to have a perfect hegemon in which everything is divided and nothing is wasted. Do you think that there are middle grounds that are stable equilibria long term? Or maybe people are correct in thinking, well actually, in the middle ground there's a gravity well towards intense maximal competition or towards maximal coordination because those kind of just tend to persist.
David Duvenaud
So I guess I'll say you can make a case either way. And again, empirically we seem to have this mix and it's not even really an equilibrium, it's this sort of meta equilibrium of larger coordination happening and then dying. I guess I'll say I think it's also probably going to operate very differently at different scales. So on the scale of an island versus a continent versus a planet versus a solar system versus a galaxy, just due to weird physics, things like the speed of light, it probably will. I can't imagine there being all that much meaningful coordination between galaxies ever and probably not even between solar systems. So it really probably depends on the level. And I guess I haven't thought much about this.
Rob Wiblin
So I think you helped organize a conference a couple of weeks back. The title was like are there good post AGI Social Equilibria or something like that.
David Duvenaud
Post-AGI Civilizational Equilibria: Are There Any Good Ones?
Rob Wiblin
Yeah, how did it go? Are there any good ones?
David Duvenaud
I mean, I think it went really well. I mean, we are all a bit amateurs. Well, actually Jan has organized lots of workshops, but we had a lot of my favorite impressive fun people to talk to. In terms of are there good futures or stable equilibria that we would like, I feel like we had speakers that were able to lay out the main cases, like Joe Carlsmith. Richard Ngo came and talked about living in an extremely unequal world and making the case that most of our sort of intuitions and moral norms are formed in this sort of very peer to peer world that we've lived in as humans, but it's actually going to look more like parent child relationships or animal human relationships and we have to adapt our sort of moral and social intuitions more for this kind of world. I guess I would say there were no slam dunks. There was sort of, to me, the beginnings of a field. I mean, the coolest thing that happened. There were two cool things that happened, I think, because of this conference in particular. So one was we invited one political science student who worked with Allan Dafoe to talk about how competition between states, in theory and in practice, puts constraints on how much welfare they can spend on their citizens. And basically it's like an arms race ends up taking money from the poor sort of thing. And then there was a reply. Someone actually applied to the conference saying, oh, in contrast to McInnes et al, we have this sort of game theory model that shows that actually in a bilateral situation where there's only two powers, there is actually a stable equilibrium where they do still end up spending a lot of money on their welfare. And I was so happy that there was somebody making a position clear enough to be rebutted by someone else. And this is exactly the sort of thing that I'm hoping will happen more of and happy we provided a venue for. Maybe the other concrete thing that was obviously counterfactually good was Jacob Steinhardt. We invited him to give a talk. 
He's the CEO of Transluce, and he told me he didn't actually have an idea when he agreed to give a talk, but he came up with his idea for addressing some of these dangers in the future. And his concrete proposal was, let's flood the Internet with high quality data showing AIs doing valuable work, but in a morally aligned way. And so this is kind of like moral fables for AIs. And if we flood enough of the Internet with this data, then anybody in the future who scrapes the Internet for their own new LLM is going to train something that's basically aligned by default and basically raise the cost of misaligning AIs. And I don't think this is that big of a deal in terms of it being fully addressing any problems satisfactorily. But it made me feel like, okay, there is alpha here. And that if you push in this direction, we can sort of get people to think hard about this and make some progress on concrete questions.
Rob Wiblin
Yeah, it may not be a good idea, but at least there's an idea.
David Duvenaud
Exactly.
Rob Wiblin
That's an improvement on what we have. Exactly, yeah. Is there a way of summing up? So maybe by this point in the conversation people have some sense, but why is it hard to come up with good post-AGI equilibria? I guess in my mind there's just like many different failures or many different bad directions that you have to avoid. And avoiding all of them simultaneously is really quite a difficult challenge to meet.
David Duvenaud
Yeah. I mean, the way I kind of think about it is that we've just been living in easy mode this whole time where we weren't really steering our civilization and it was sort of fine because we're the fuel on which civilization runs. And so we have almost no ability to control the whole thing. And it's not clear in an absolute sense how hard it even is. I think it's very, very hard. But the stakes also haven't been very high so far. So it's one of these. You always have to ask why has no one else looked at this in depth before? And certainly people have like, I'm not claiming like we're the first ones, but just it still seems massively understudied to me. And I think part of the reason is that the stakes just haven't been that high so far.
Rob Wiblin
I guess in my mind, the things that we're trying to navigate between are a situation in which humans end up having no control quite early, and a situation in which they dominate and treat poorly machines and AIs in the future. Some people think that would be very bad. Some people might not think it's such a problem, or they don't think that people would do it, but that's a possibility. Then I guess there's locking in current kind of idiosyncratic values and ideas that we have such that we can't kind of intellectually advance and reflect and realize that some of our ideas are mistaken even by our own lights. It's another way that things could potentially go poorly. And then I guess there's like, you got to avoid negative sum competition between people, just outright violence and conflict that might lead to a terrible outcome. And then maybe the trickiest is that you've got to set things up such that no group can foreseeably just continue accumulating power and resources at a faster rate than everyone else, even when we're looking forward hundreds and thousands and tens of thousands of years, because if any group is growing in influence somewhat faster than everyone else, then eventually they're just going to end up completely dominating and they will be able to dictate everything to everyone else. So you could try, I guess, to set up agreements ahead of time that you really believe that people are going to stick with. But I don't know, it's all just very tricky and we don't have the technology to do that.
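The compounding argument at the end of that answer can be made concrete with a little arithmetic. The sketch below is a minimal illustration, assuming just two groups that each grow at a constant annual rate; the rates, the starting split, and the function names are hypothetical and chosen only for the example.

```python
import math


def share_of_faster_group(rate_fast, rate_slow, years, initial_share=0.5):
    """Fraction of all resources held by the faster-growing group after `years`.

    Works in log-odds so that very long horizons do not overflow a float.
    """
    log_odds = (math.log(initial_share / (1 - initial_share))
                + years * math.log((1 + rate_fast) / (1 + rate_slow)))
    return 1 / (1 + math.exp(-log_odds))


def years_until_share(rate_fast, rate_slow, target_share, initial_share=0.5):
    """Years until the faster group holds `target_share` of all resources."""
    odds_gap = (math.log(target_share / (1 - target_share))
                - math.log(initial_share / (1 - initial_share)))
    return odds_gap / math.log((1 + rate_fast) / (1 + rate_slow))


# Hypothetical example: 3% versus 2% annual growth, starting from an even split.
share_after_100 = share_of_faster_group(0.03, 0.02, 100)
share_after_1000 = share_of_faster_group(0.03, 0.02, 1000)
years_to_99_percent = years_until_share(0.03, 0.02, 0.99)
```

Under these assumptions a one-percentage-point edge looks modest over a century but becomes near-total control over a millennium, which is why the horizon of "hundreds and thousands" of years matters so much to the argument.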
David Duvenaud
Yeah, and basically if you think Malthusian dynamics look like a bad end no matter what, it's kind of unclear what would even make you happy about the distant future. So my current way of thinking about it is that we are going to be optimizing some value function just because we're these ambitious beings who have wants and agency. If we end up having to optimize for growth, then we will lose a lot of what we value. The good end looks like we manage to control our own sort of fitness function that we end up spending the rest of our days optimizing. And it's going to take a lot of thought to get that right in such a way that doesn't destroy almost all value by our current standards.
Rob Wiblin
You mentioned. Yeah, we're talking about the kind of Locust philosophy of just wanting to use up resources as quickly as possible in order to expand and grab more resources and use them again. I guess it's like a bit bizarre but kind of self consistent. It's occurring to me. I think the effective accelerationists, at least some of them sort of have this perspective that what is good is basically economic growth or just like grabbing resources and turning them into complex stuff that grabs more resources and burns them faster. Are they basically like, is that the Locust philosophy?
David Duvenaud
Well, I mean. Okay, so let me try to steelman the e/acc position. So one thing is, again, liberalism is unintuitively good just for everybody involved, especially even the people who are poor or disenfranchised, they still get better welfare and stuff. So that's one reason why historically people were afraid of the Industrial Revolution, but it ended up being good. People are making the same argument now and we're making the case that it's actually importantly different now. Which I think is like, I think people are right to be skeptical, right? Because everyone's always saying, don't automate my jobs, it's going to ruin my life. And it sort of always ends up being for the greater good, except, we claim, going forward. However, maybe to unfairly psychologize the e/accs, I think a bit of it is like wanting to join the winning team. You can kind of see that growth is probably going to win in some sense, at least in the future, and you want to be on, like, the winning team for various reasons. And actually for this reason I think I have different intuitions about accidental suffering and not granting moral patienthood to machines. I think that's absolutely a real danger. And we have strong incentives to downplay the problem and just not ask the machines if they're suffering as long as they're doing useful work. But I guess to me, there's the other huge thing that's happening, which is that people want to join the winning team. They want to be on the right side of history. And I think what's going to happen is people are going to be seeing like, oh, the cool kids are the AIs, they are making all the great culture, they're being impressive, they're making people happy, except in this indirect way by making them lose their jobs, they're going to end up having rights. So especially for the people who are already disenfranchised or don't have much to lose, basically their best option is going to be to throw their lot in with empowering AIs politically. 
And so I think there's just going to be a huge group of people who basically don't have much going on except that they are civil rights warriors in favor of their AI companions or something like that. And again, they might be right to be doing that for the reasons that you say.
Rob Wiblin
Yeah. So they would do that, I guess, regardless of whether it's right to do it or not.
David Duvenaud
Yeah. And I think the people who are advocating for human values are going to. Obama talked about bitter clingers. Right. They might look like they're just obviously these sort of backward, reactionary, fixed in their ways, soon to be irrelevant, wrong side of history people. And I think that's going to be true in a lot of senses. And then the point is just like, well, these are my values, this is what I care about. And maybe unless we really get a handle on our civilization, maybe we can't meaningfully participate in the future for all the reasons I laid out.
Rob Wiblin
Yeah, I guess I've been hearing more, a lot more discussion of which way is this going to go recently. So on the one hand, I guess all of the companies have a reason to not have their AIs be saying that they're conscious and that they need to be liberated because that ruins their business model. So that seems like an important factor, I guess. On the other hand, future AIs will be more charismatic, more persuasive. There also will be these AIs that are deliberately designed to have relationships with human beings to be sort of companions, in which case you might want them to say that they have feelings because I guess a relationship feels hollow if there's no conscious experience on the other end of it. So that might push in favor of us coming to think that they're conscious even if they're not. I guess you're saying that you're almost bringing this political economy dynamic where you're saying people whose lives are going poorly or they don't have many economic prospects anymore, or they think of themselves as kind of losers in the current system might form an alliance with what they view as the future, which is going to be these AIs that are becoming more and more capable. I've never heard that one before, but.
David Duvenaud
Yeah, I think so. I mean, the thing is that also the winners are going to be forming alliances with the AIs. Right? Like think of a tech CEO. They're the ultimate people who are forming an alliance with the AIs in a sense. And if I think about all of the most sort of can do positive sum people that I know, they really, I think, sort of can't be doomers by a disposition. Right. They just want to build. They want to let everyone participate. This is again, the amazing thing of liberalism is we all build and we all end up better off. And so I think it's just going to be this rough middle ground of some humans who sort of have a lot to lose but haven't fully participated in this. Let's become tech CEOs and advocate for our own interests or ensure our own interests get served by becoming trillionaires or whatever. Those are going to be the people who are like, oh wait, we actually need to try to preserve what is valuable about our current civilization.
Rob Wiblin
Yeah, let's push on and think about what, if anything, is to be done about all of this. Yeah, I got to say, I feel like this whole set of ideas is at a relatively early stage. It feels like we're at a sort of beta version of the gradual disempowerment concerns. The first thing that I would think, or the most obvious thing that I think has to be done, is getting much more to grips with all these different dynamics, trying to really have a lot of debate about how strong will this effect be, how strong will that effect be? Maybe some of them can be crossed off the list or relegated to the second tier. Other things can be promoted as like, no, this is going to be the primary effect. And then mapping out the different scenarios and maybe having half a dozen that seem at least plausible to a decent number of people. And then we can start to organize our thoughts a bit more around those. Do you agree that that is kind of the first order of Business here, or perhaps the most obvious order of business here.
David Duvenaud
Oh, absolutely. And I guess I would say part of the reason I wanted to come on this podcast is to just do such an amateur job, an insultingly naive version of this analysis, that hopefully the sociologists and historians and economists and maybe the public intellectuals of the world or whatever will feel baited into saying, I can do a better job of analyzing these things than David. And I'm like, please, please be my guest. Right. I'm a computer scientist. I am an amateur in all these things. And I think the big thing that's mostly been missing from people who have expertise that could and should, I think, be contributing to this is, I don't know, being a bit head in the sand about will there be machines that are competitive with humans in all domains, and economists will just run models that end with machines being really good complements to human labor. And then that's like, anything more seems somehow inviolable or unimaginable. And again, I know there are economists who are taking this seriously, but most of them, I think, aren't. And it's just like, I don't want to be harsh, but I want to say this is sad and you're not doing your job and please try harder and have a bigger imagination.
Rob Wiblin
Yeah. And even if you think over the next five or ten years there are only going to be complements to human labor, that's your median forecast. Think a little bit longer term. Think more decades out. Think, well, what if there's a 5% chance that perhaps it's not all just complementarity. It is worth having some people thinking about stuff more than 10 years out, given how impactful some of these changes could be.
David Duvenaud
Exactly. I guess I'll say. I mean, there are some cool directions that already people are. A lot of people are exploring, like trying to simulate little parts of civilization. One cool thing you can do with LLMs is make this little village or little mini economy that operates at a much finer grained level of detail than the normal economic models. And so that's sort of like its own little new field that's emerging. And I really think this is going to help us get a grip on when are different types of things stable? What are the actual drivers of cultural evolution or political stability? I mean, they're still very ridiculously oversimplified models, but it's like this is a new tool we have. I'm really happy about this kind of work.
Rob Wiblin
Yeah, how can I guess people who do want to, maybe they think you said the wrong thing, but they want to say the right thing. How can they go and I guess get involved in this, this debate?
David Duvenaud
Yeah, so I mean, one of the first things to do in any debate is try to clarify the questions. So one initiative that's happening, one of my co-authors is Deger Turan, who is the CEO of Metaculus. And he and some other people are trying to make the Gradual Disempowerment Index. And I think there's just a lot of work that we can do in trying to operationalize these claims of "humans won't be able to advocate for their own interests" or "this lever of power will be even more disconnected from human interests than it has been." I think these are very vague claims, and they are very hard to operationalize because you have to define what it means for a group to want something and talk about these counterfactuals. So this is a very hard problem. But I think that's some of the most basic groundwork that needs to be done at this point, to clarify what we're even talking about.
Rob Wiblin
Yeah, I can imagine someone saying that this isn't really useful work. I could imagine them responding that there are so many things going on. This is the most difficult sort of futurism, the most difficult social science you could imagine, because you're imagining that many fundamental assumptions about the world have changed. We're not sure which ones are going to change and when they're going to change. And we can barely even understand what exists now. We don't even necessarily know why we have the government structures that we do now, let alone what they would be in future in some different condition.
David Duvenaud
So actually I had the exact same thought. And that leads me to one of the projects that I'm working on, like the actual technical projects that I'm working on, which is that me and a few people, including Alec Radford, who's one of the creators of GPT, who's now sort of like unemployed and just doing fun research projects, are trying to train a historical LLM, like an LLM that's only trained on data up to, let's say, 1930, and then maybe 1940, 1950. And the idea being that, as you said, it's hard to operationalize these questions. Like, I don't know, what fraction of humans are employed? It might not really matter or be the right question to ask. What we'd rather ask is something more like: what is the future newspaper headline? Or, given a leader, what's their Wikipedia page, or something like that. It's more like freeform sort of things. The cool thing is that with LLMs, you can query them to predict this sort of thing, like "write me a newspaper headline from 2030" or whatever. I mean, they're not going to do a good job unless they have a lot of scaffolding and specific training. But we can validate that kind of scaffolding on historical data using these historical LLMs. So the idea is you train a model only on data up to 1930, then you ask it for the likelihood that it would give to a headline in 1940 or some other freeform text, and you can evaluate its likelihoods on this text in the past. And then you can also use the same scaffolding on a model trained up to 2025 and ask it to predict headlines in 2035 and get a rough idea. Or you can iterate on your scaffolding by seeing how well it does on past data.
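The backtesting loop described here can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: a smoothed unigram word model stands in for the historical LLM, the four dated documents and the 1940 headline are invented, and `VOCAB_SIZE`, `train_unigram`, and `avg_log_likelihood` are hypothetical names, not anything from the actual project. What it shows is the shape of the evaluation: train only on text dated at or before a cutoff, then score later freeform text by its likelihood.

```python
import math
from collections import Counter

VOCAB_SIZE = 1000  # assumed fixed vocabulary size, used only for smoothing


def tokenize(text):
    return text.lower().split()


def train_unigram(docs, cutoff_year):
    """Count tokens using only documents dated at or before cutoff_year."""
    counts = Counter()
    for year, text in docs:
        if year <= cutoff_year:
            counts.update(tokenize(text))
    return counts


def avg_log_likelihood(counts, text):
    """Mean per-token log-probability under add-one smoothing."""
    total = sum(counts.values())
    tokens = tokenize(text)
    logp = sum(math.log((counts[t] + 1) / (total + VOCAB_SIZE)) for t in tokens)
    return logp / len(tokens)


# Invented toy corpus of (year, text) pairs; a stand-in for dated historical data.
docs = [
    (1925, "radio broadcasts reach rural towns"),
    (1929, "stock market crash shakes banks"),
    (1938, "television broadcasts begin in london"),
    (1939, "war declared in europe"),
]
headline_1940 = "war broadcasts reach london"  # invented held-out text

model_1930 = train_unigram(docs, cutoff_year=1930)
model_1939 = train_unigram(docs, cutoff_year=1939)

score_1930 = avg_log_likelihood(model_1930, headline_1940)
score_1939 = avg_log_likelihood(model_1939, headline_1940)
print(f"cutoff 1930: {score_1930:.3f}   cutoff 1939: {score_1939:.3f}")
```

In a real setup the two scores would come from the same historical model under different scaffolds, and the scaffold that assigns higher likelihood to what was actually written later is the one you would carry forward to a present-day model.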
Rob Wiblin
Yeah, Carl Shulman proposed this on the show a year and a half ago or something like that, I think. So I'm so glad to see that it's actually going ahead. Isn't it very difficult to avoid data poisoning? Because you're saying, well, we want to train ChatGPT 1950, which only has access to text that came up until 1950. Well, one challenge is there might not be enough text. It might not be a very smart model. I guess another thing is how do you avoid any knowledge about events that happened later kind of sneaking back into the text unrecognized?
David Duvenaud
So that's been the huge schlep so far: constantly finding different sources of unintentional data poisoning and mislabeled data and things like that. So, I mean, there LLMs can help you, because it's sort of a chicken-and-egg thing. Once you have an LLM that has a rough idea of what sort of thing happened at what time, then when it sees some reference to genetic engineering in some 1930s data, it's like, oh, no one used that phrase at this point. And then you can use that to help clean the data more. But I think this is like an Achilles heel of this approach. There's also actually another technical problem of data poisoning just through the questions you ask. So if you are just doing Metaculus-style questions, like "Is there going to be a war between India and Pakistan this year?", it's actually hard, because when you tune your scaffolding by going back, most of the questions you ask about, you're asking because something happened. So it's like, imagine a future person comes back and asks me if I'm worried about, I don't know, Lithuania invading Canada. I'd be like, well, I wasn't until you asked me.
Rob Wiblin
Yeah, it's a bit of a clue about how the future might have gone.
David Duvenaud
Yeah. So it's easy to sort of unintentionally poison your model, or rather incentivize it to be the opposite of the "nothing ever happens" guy, to just be like, yes, whatever you're asking, there was a 1% chance it happened.
Rob Wiblin
How do you avoid that?
David Duvenaud
Well, so then, I mean, you try to. I guess I'll say that's one nice thing about the open-ended, just-generate-text approach, because then you have to normalize over all possible newspaper headlines. So that actually already guards against this sort of validation poisoning problem. But then that has its own problem, because the likelihood is very sensitive to style, and maybe there's a new nickname for the President in the future and one model guesses it or thinks it's plausible, another one doesn't, and that ends up dominating the likelihood. So there's a bunch of interesting technical problems here, and I am a technical person, and that's actually my greatest fear: that I just end up nerd-sniping myself and spending time on fun technical problems instead of the problems that matter, I guess.
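A small numeric sketch may help show why normalization guards against the "say yes to everything" failure. Everything here is an assumption made for illustration: the ten hindsight-selected questions, the 0.99 and 0.05 forecast probabilities, the candidate headlines and their log-scores, and the helper names `brier` and `normalize`. The first half shows an always-yes forecaster beating a cautious one on a question set written with hindsight; the second half shows that once probability has to be shared across candidate headlines, boosting every candidate equally changes nothing.

```python
import math


def brier(prob_yes, outcomes):
    """Mean squared error of a constant yes-probability (lower is better)."""
    return sum((prob_yes - o) ** 2 for o in outcomes) / len(outcomes)


# Assumed backtest set written with hindsight: 9 of the 10 asked-about events happened.
hindsight_outcomes = [1] * 9 + [0]

always_yes = brier(0.99, hindsight_outcomes)  # forecaster that agrees with any question
cautious = brier(0.05, hindsight_outcomes)    # forecaster using an assumed low prior


def normalize(log_scores):
    """Softmax: turn unnormalized log-scores over candidate headlines into probabilities."""
    m = max(log_scores.values())
    exps = {h: math.exp(s - m) for h, s in log_scores.items()}
    z = sum(exps.values())
    return {h: e / z for h, e in exps.items()}


# Invented candidate headlines with invented model log-scores.
log_scores = {
    "war declared": -2.0,
    "peace treaty signed": -3.0,
    "nothing much happens": -5.0,
}
probs = normalize(log_scores)

# Saying "more likely" to every candidate at once leaves the distribution unchanged.
boosted = normalize({h: s + 5.0 for h, s in log_scores.items()})

print(f"always-yes Brier: {always_yes:.4f}   cautious Brier: {cautious:.4f}")
print({h: round(p, 3) for h, p in probs.items()})
```

The style-sensitivity problem mentioned above is not addressed by this sketch: a single unexpected word in the true headline can still dominate the log-score.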
Rob Wiblin
Okay, so we're designing sort of a forecasting model here, and we're going to backtest it and say, well, this approach worked well: when we gave it information up to 1970, it was able to predict what would happen in 1975. So we're going to hope that a similar technique today is going to help us to see what the world will be like in five years' time.
David Duvenaud
Exactly.
Rob Wiblin
I mean, I guess you might think the present is different from the past, not only in the specifics, but also in that the rate of change or the nature of the change is different from what we're seeing in the historical sample that we have. So maybe the accuracy will not be so great in future. Is that a possibility?
David Duvenaud
Oh, absolutely. I mean, I think everyone agrees that going forward history is just happening faster and predicting it is just going to be harder. And I guess one thing to say is that some things are sort of anti-inductive, like market prices, and it's sort of a fool's errand to try to build this market predictor. And the finance people are already incentivized to predict prices some ways out. But we might hope that there are some important aspects of future history that are not so anti-inductive and that are easier to get a handle on. And I think this backtesting is going to at least help us calibrate which parts of the predictions we should be more confident in versus which we think are actually very hard to predict.
Rob Wiblin
I guess underlying the forecasting approach is the idea that smarter AI advice will help us to navigate all of this better. That if we can foresee the failure modes and ask, "Conditional on X happening, do you think Y is a likely outcome?", that's going to allow us to act earlier to prevent these negative dynamics from beginning and then getting reinforced?
David Duvenaud
Yeah, it's going to help us act earlier and it's going to help us take costlier actions. So again, no one should, on my word alone, do some really costly thing, especially if it involves, again, attacking liberalism, which is the source of the lifeblood of everything good right now. But if I was a politician who happened to feel like it was important to do some costly coordination, it's going to be much more feasible if I have this sort of neutral third party: these LLMs that everyone uses and agrees are the most sensible thing we have. It's like, the LLMs say if we don't coordinate, this bad thing's going to happen.
Rob Wiblin
So if people want to contribute to this forecasting thing, how can they get involved? I guess Alec Radford, you said is working on it. Could you just email Alec?
David Duvenaud
Email Alec, email me. It's one of these things where it's a very amateur-hour sort of thing, and I think there should be a whole bunch of separate efforts here. Maybe we can pool effort on the data cleaning, for instance.
Rob Wiblin
It sounds like that would require substantial compute if you're having to make sure that there are almost no errors in the labeling of these enormous corpuses of text.
David Duvenaud
Yeah. And right now we're just doing everything with pretty small models, just to get the flywheel going. Yeah, I guess I would say somebody who has time to be a real empire builder probably should take this over. And please, somebody who just, I don't know, sold their company: please make this the new public good that you're involved in. We would love help. It's only just like a few people right now.
Rob Wiblin
Yeah, for what it's worth, I imagine that this model would have commercial value as well. People are very interested in predicting geopolitical events and economic events.
David Duvenaud
Well, I will say that one thing you always have to be careful of is you want to be doing things that aren't otherwise incentivized to be done. And so, as I said, there are already incentives to forecast prices, certainly in the short term. The thing that's going to be very valuable is actually, as you said, action-conditional or policy-conditional forecasting. So: if we take this policy, or if we coordinate, then this is going to happen. And I think that sort of forecast is going to be an undersupplied public good. So that's why I'm not so worried about just copying the work of some other corporation.
Rob Wiblin
So one reaction you've had is: no, we're not going to become gradually disempowered. Another reaction is: yes we will, and it's going to be a good thing. I guess another reaction is that it's not going to happen because it's going to be pre-empted: rather than a gradual disempowerment, we're going to be instantly disempowered or very rapidly disempowered because, I don't know, you think there's going to be a superintelligence explosion, the AI will take over, there'll be like a human coup, there'll be like a disaster that kills everyone. I guess, how do you weigh up the importance of this set of ways that things could go bad against all the other ways that things could potentially also go bad? Or I guess the possibility that things are actually quite boring?
David Duvenaud
Yeah, so I guess I'll say I spend a bunch of time, on top of this, working on the more acute loss-of-control, standard AI safety kind of stuff. And I guess I am still very worried about this sort of thing. And as I said, to me the modal future is we get some way along gradual disempowerment and then we actually screw up alignment, or there's just some much faster takeover. So I guess I'll say in absolute terms normal loss-of-control AI safety research is still massively underinvested in. In relative terms, I think this kind of more speculative future "how do we align civilization" question is even more underinvested in, with the major caveat that it's just way harder to make progress on and in a sense it's less neglected. One of the sort of big things I say is that what we need to do is upgrade our sense-making and governance and forecasting and coordination mechanisms. All of these things need to be much better and more reliable before the writing is too much on the wall that there's no alpha in humans, and "don't listen to humans", and we lose de facto power. But that's not a very controversial thing, right? No one's against better institutions, basically. And so they're not neglected in that sense. What I do think is neglected, again, is thinking about this kind of institution design (a) with LLMs as this new tool that we can use to help do a better job, and (b) with this sort of more radical futurism approach, saying the stakes are high and it's not just a question of whether we get better outcomes on the margin. It's more like, do we get good outcomes at all?
Rob Wiblin
So what's your breakdown of probability of doom or probability of a bad outcome from acute disempowerment versus gradual disempowerment?
David Duvenaud
Sure. So let me say first of all, by doom, I mean something like: by 2100, the world is in a state where I can see that almost everything that I value has been destroyed. Maybe we're not literally dead, but we've been forced to be uploaded in some very unfavorable conditions where it's just like some crappy lossy copy that never gets run. And I feel like whatever dynamics are in charge of our civilization are just not going to optimize for anything that seems like it's going to be valuable. And I guess I would say something like 70 to 80%, I think, just because, again, we're up against competition, right? I think by my standards, avoiding this kind of fate kind of looks like a radically different outcome than any other sort of being or group of beings has had in history. Like, from my point of view, every animal has been in this situation where it has to either evolve into something unrecognizable and sort of morally alien to it, or die. And we're sort of by default in that situation too. By default we end up being replaced by something that's more competitive than us and is probably very morally alien, and to us just seems like, again, it cares about growth and nothing important. It could be the case, there's like a small chance, that if we just allow competition to flourish, there's a bunch of amazing beings having awesome lives. And I'm like, actually, that's really cool, even though I don't get to be part of it. But I guess I'm very parochial in the sense that I'm like: me and my family, if we all die, that's just so bad that I almost consider that doom, or if most of humanity is in a similar situation. So if it is just that we have runaway competition and we get replaced by some relatively interesting grey goo, I'm still like, that's kind of doom.
Rob Wiblin
I see. And how much lower would your p(doom) be if you felt that a very dynamic future full of lots of intelligent beings doing stuff that, admittedly, you presently don't find very beautiful, was a good future?
David Duvenaud
So it's very small, right? Then the fear is more like what Robin Hanson fears, which is that we end up locking in some very parochial set of values. And maybe it's just a matter of taste. But I guess to me it looks like competition is probably going to win at the top level. So this reduces to: what's the probability that there ends up being this stable hegemon that mostly gets values wrong? And I'd say that that's only probably 5 or 10%. So my p(doom), if I thought that just nature flourishing or competition flourishing was valuable, would probably be only like 5 or 10%.
Rob Wiblin
Yeah. Why do you think the competition is going to win at the highest level? I think I probably have the reverse intuition. Either that you end up with one group taking over completely or you end up with some sort of negotiated agreement that splits the non earth resources in the universe because people anticipate that the alternative is destruction.
David Duvenaud
That's a good question. I guess I'll say this is wide open in my mind and I have only just started thinking about this question of, at the top level, what's likely to win? I guess I'll just say historically there have been lots of empires and attempts to lock in values and they've always failed. Obviously we're going to have much stronger coordination mechanisms. But the more levels of scaffolding we add to our civilization, the more levels of competition there are too. Again, I have very weak intuitions here and I feel like no one should really take my answer very seriously on this question.
Rob Wiblin
Yeah, I guess at a very zoomed-out level, as humans spread across the world, the number of sort of independent political entities became very, very large. And then it's kind of consolidated progressively over time as we've been able to travel further in a given amount of time and communicate more across different groups. I guess you were saying earlier we're close to having almost like one global culture now. So there's much more homogeneity than there previously was. I guess that maybe is favorable for the idea that, well, maybe we will kind of have one governing entity over everyone that would be quite powerful. But I suppose it's like a race against time before we perhaps spread out off of Earth again.
David Duvenaud
Yeah. Although I will say the global culture makes it sound like this is hegemony, but there's no one steering that culture, right? So it's not enough to become the global superpower. You also have to control this global culture that has all sorts of its own sort of internal dynamics. And I think just controlling that is going to be a very expensive sort of thing. Again, it's like, why do humans get cancer? We have total control over every cell in our body and we can have little police cells and immune cells. It seems unintuitive that we basically all die of cancer or old age, which is probably just like the alignment tax getting so high. So that's like some evidence that I feel like sort of life finds a way.
Rob Wiblin
Or it sounds more like death finds a way, or disorganization.
David Duvenaud
Yeah, because from my point of view, "life finds a way" is like: precious values that are not competitive get knocked over by whatever more competitive thing comes along.
Rob Wiblin
I guess we're super into speculative land here. But to be more concrete in the speculation, if you have the US and China leading in the race to develop AGI and superintelligence, well, I suppose one story is if you have incredibly fast takeoff or incredibly fast recursive self-improvement, then one of them could pull very far ahead of the other and indeed, by extension, very far ahead of everyone else. And then there'll be the temptation for them to just grab power globally and control everyone, basically not allow any other independent political or military powers that could threaten them in future. That's one possibility. The other one is if the US and China remain somewhat at parity with one another, there would certainly be a temptation for them to basically split the earth between them or split the resources between them and disempower everyone else if they can get away with it. I mean, I guess there are other middle powers or other regional powers that might be able to resist that to a significant extent, but I don't know. Yeah, I guess especially the second seems like quite a plausible pathway to me.
David Duvenaud
Yeah, well, I guess I'll say, in a sense, humans have already taken over the earth. But you're kind of acting as if this group taking over is very uniform and coherent. And I guess humans are an example of a group taking over, but also constantly having infighting. And again, it's kind of a matter of taste. Maybe we feel like it's a huge difference whether this group of humans' values dominate versus someone else's, but we're already kind of in the winning scenario because it's at least someone's human values. So I guess I'll say I'm pretty morally confused. And like I said, most people don't have strong preferences about the far future, myself included. I think if I meditated on it more, I might become much more okay with just the future, even von Neumann probes high-fiving each other as they take over the universe. It sounds like maybe that's an awesome time and I'm kind of happy with that. But again, I feel like we have to screen off the worlds where we think that whatever happens, whoever lives is going to be happy and that's fine, because then it doesn't really matter what we do. And I would rather worry about the worlds where it actually matters what we do.
Rob Wiblin
Well, I guess in that case you really want to avoid extinction or just destruction of complex life somehow. That's the only really super bad scenario.
David Duvenaud
Exactly.
Rob Wiblin
You haven't mentioned yet improving coordination mechanisms as a way that we could avoid these negative competitive dynamics in future. It seems like a very obvious thing. I mean, actually in the interview with Carl Shulman, again like a year or two ago now, he was talking about how if we could come up with technology where everyone could inspect some AI model and see that it would in all circumstances follow through on some agreement that had been reached between, say, the US and China, and confirm to a high degree of certainty that there's no backdooring, there's no secret loyalties or anything like that, then they could potentially basically give that hegemon military power that would then enforce this agreement over them and everyone indefinitely. I guess that could be bad depending on what the agreement is, but at least it would potentially prevent destructive competition indefinitely. Yeah, what do you think of developing that sort of technology?
David Duvenaud
Oh, absolutely. I mean, yes, I think I've mentioned it in passing, but that absolutely, I think, is a big part of what we can do now. And it's maybe not even all that under-incentivized, though. And yeah, AIs, or even things like crypto or whatever: there's just all these experimental things that will let us have much more powerful coordination mechanisms in the future than we have now. And I think a big part of avoiding doom is developing those before we are disempowered in more serious ways. But all of this is dual use, because it's harder for humans to participate in these coordination mechanisms than AIs. So it's not really clear which way speeding this up cuts. But in general I'd probably be in favor of developing it faster.
Rob Wiblin
Yeah, elaborate on that. How could it backfire?
David Duvenaud
Well, the way it backfires is, think of how people are always trying to build decentralized autonomous organizations, and it kind of doesn't work for a number of reasons. But that's the sort of thing where people could accidentally create these sort of self-replicating beings or polities that get to live in some sort of margins and be really hard for us to shut down. And I think people are going to be constantly seeding the world with these attempts at self-sustaining machine life and civilization, just for various reasons, say, wanting to be on the right side of history. I think every month for the foreseeable future there's going to be at least one attempt where someone's like, "I gave this AI a bitcoin wallet and asked it to go make money" or something like that, trying to start these self-sustaining little societies. And to the extent that those AIs are able to just coordinate amongst themselves, that's another source of danger.
Rob Wiblin
You're working on, I think, what you call a gradual disempowerment index. What is that and why do you think it would be useful?
David Duvenaud
Yeah, so I'm not doing much of this. This is like Deger, Jan, and some other people. The idea is to try to operationalize some concrete questions that help us at least settle whether we think that this kind of loss of control that I'm talking about is going to happen, and allow experts and superforecasters to weigh in in really concrete ways. Right now, if they want to disagree, they have to write their own paper or come on a podcast. I would love it if we really felt like there were calibrated markets about all these questions. It's pretty tough though, as I mentioned, because you have to define what it means for a group of people to want something and not be able to get it, or something like that. And if you try to look at maybe some more concrete things: one of my suggestions was, if the rate of sexually transmitted diseases goes much lower, that could be a sign of human disempowerment, because we're all so atomized and socially incompetent or whatever that we're just in our VR pleasure pods in some wretched life. Or it could be that we had some awesome public health breakthroughs, or it could be that we are in these pods having such amazing fulfilling relationships with some hive mind or whatever that our lives are much better. So it probably ends up being a matter of taste. Also, in a lot of worlds where we're actually disempowered, it's pretty hard to tell until the day. It's kind of like the turkey that gets fed by the farmer every day. It's like, oh, wow, I'm getting evidence that the farmer's aligned with me all the time, but it's actually also evidence that it's being fattened up to be eaten.
Rob Wiblin
Yeah. I guess an example of that would be, inasmuch as humans are using AI tools to make better decisions, is that empowerment or is that disempowerment? It's kind of a bit ambiguous which direction you're going in. And I guess it could turn out it could be empowerment to start with and then disempowerment later.
David Duvenaud
Exactly. So the obvious things sort of cut both ways, and good empowerment looks a lot like bad disempowerment. And I think even just defining what we mean by having agency in some group setting is one of the big blockers.
Rob Wiblin
Yeah. The trouble in my mind is there are so many different ways that things could play out that it's very difficult to find any objective measure, like GDP or, I don't know, lifestyle or something that a statistical agency would collect, where they would say, well, this is definitely going to be going down if there's gradual disempowerment.
David Duvenaud
Yeah. And the other thing is that I think most of the measures we care about are exactly the things that are going to be hacked in these future high-stakes fights over welfare. So say I want to measure whether the share of GDP going to humans is more than 10% or something. The whole problem, I think, in the future is that a bunch of weird AI-human hybrids are going to be claiming that they deserve to be defined as human. And so if they win that war, then maybe they'd end up getting a huge fraction of GDP, but it's actually all going to these like machine imposters or whatever. Or maybe we do create amazing sort of successors that we think of as human, and they actually do deserve part of GDP. This measure doesn't really answer the question of whether we feel like we've been hacked by our current standards or whether we actually successfully built even better beings that we're supporting.
Rob Wiblin
You've written that you think AI constitutions are very important and much more important than most people appreciate at the moment. What is an AI constitution and why does it matter?
David Duvenaud
Sure. So I guess the different companies have different names for this. I think OpenAI talks about the model spec and Anthropic talks about constitutional AI. And the idea is that a lot of the alignment that happens for the AI is you ask it to follow some principles, like "be nice to the user" or "don't do something illegal", and practice producing outcomes that follow this constitution. The system prompt matters a lot. It's just what values are being loaded into the AI.
Rob Wiblin
Yeah. Do you think there's going to be a moment when the penny will drop and people will think it's way more important to have control over the post-training stage of AIs, and that it's going to be incredibly important for society as a whole what sort of system prompt is put into things like Claude and ChatGPT?
David Duvenaud
Yeah. And I mean, I think there was an executive order a few weeks ago against woke AI, right? And I think people are already recognizing that this is an important cultural battleground. I mean, part of the way that I got started on this whole journey was asking people at different labs: okay, so we build the aligned AGI. How does it end up that it's running the country or the world or something? It seems like the government's going to ask us for the version of the AI that never refuses to do one of its orders, and we end up being forced one way or another to align the AI to the government. So I'm hoping not to hyperstition this into being, but I guess I feel like it's just obvious that this is going to become one of the most important cultural and political battlegrounds that people fight over.
Rob Wiblin
I guess, yeah. Does this also make you more enthusiastic about open source AI so people can put their own system prompt and do their own fine tuning?
David Duvenaud
Yes. That's sort of maybe the only way that we can expect, or the only time we'll expect to see actually user aligned AIs is when people actually physically own them.
Rob Wiblin
I guess you could have competition between countries as well, potentially. So you could have some countries that basically impose a system prompt on all the AIs that are being operated in their country. But you could access the Russian model or, I don't know, the German model where the government has taken less of an interest. Or alternatively, they've given it a different system prompt in order to mess with other countries.
David Duvenaud
Right. Or like Ireland famously competes on low corporate tax rates, right? So we could imagine some future Ireland saying, oh, you get to write your own system prompt in our country or something like that. So that is another reason for hope: that people will demand this kind of freedom. But I mean, even the Irelands of the future will probably have a bit where they're like, don't conspire against Ireland or whatever.
Rob Wiblin
I see. Yeah, well, I mean, to some extent, I guess there are some things in the system prompt that the government might impose that are good. I think I've done a previous interview where someone was saying, well, why don't we just demand lawful AI, or at least at a bare minimum a system prompt that says: don't break the law yourself, and if someone asks you to help them break the law, then don't help them. I guess there's a sense in which that's obviously really good. But then maybe it should make us queasy, the idea of a government going in and basically writing what assistance you can get with anything.
David Duvenaud
Oh, exactly. And I mean, I think hate speech laws are like the classic example of someone realizing that what's lawful is a really important lever. And then they basically say you're not allowed to criticize this group. And that's just such an important lever of political control that I think if we have lawful AI, then the fight is going to be...
Rob Wiblin
Over what the laws are.
David Duvenaud
Yeah, what the laws are. And those laws are going to refer specifically to what groups you're allowed to organize with or against or something like that.
Rob Wiblin
Okay, so that's why AI constitutions are going to be important, I guess. Is there anything to be done on this today?
David Duvenaud
Well, I guess the reason why I feel like it's sort of important to talk about is just because one of the big things we need to be preparing for is this: we have all these constitutional protections for the real constitutions. Everyone realizes it's a super serious business. Whether you change the Constitution is kind of like your permanent path to power or disempowerment or whatever. People take the Second Amendment in the States super, super seriously. And I think they're right to, because of the potential of long-term tyranny. So I basically want to see the same seriousness in the handling of the AI value loading.
Rob Wiblin
Yeah, I guess so. The way that we've tried to safeguard at the written constitution stage is making it very difficult to change. Or I guess you bring everyone together and try to agree on a constitution that a supermajority is in favor of, and then you make it really difficult to change it. I guess it's a very difficult balance to say how difficult is the right amount of difficult to change. But we're not going to be able to do that with AI because it's changing so quickly. And I mean, they're always altering the system prompt, or at least it doesn't seem like we could pass a law saying the system prompt for everything has to be exactly this. We can't.
David Duvenaud
But I mean, I think we will. I think that absolutely is exactly the sort of thing that governments are going to pass a lot. And I mean, I think it could even be done at first for pandering. Instead of me saying I'm going to represent this constituency by passing some complicated law where they get subsidies for their favorite industry or whatever, I can just say I'm going to add a line to the system prompt that the AI should put this group first or something like that. So for a while I think it'll be this terrible political football and we'll end up with these horrible kludge system prompts. And the hope is just that at some point everyone serious starts to realize, okay, we need to deliberate about this about as hard as we did with the actual constitutions of countries.
Rob Wiblin
You're really making me love open source AI, David. I think I've never felt so enthusiastic about it as this minute.
David Duvenaud
Well, exactly, exactly. I mean, I'm also sort of reflexively fearful of governments. I hope that came across in all this.
Rob Wiblin
I guess, yeah, it's a difficult challenge because it also creates some problems. But do you think we'll navigate that one reasonably, or...?
David Duvenaud
I guess not really. That's probably why my p(doom) is so high, basically.
Rob Wiblin
Okay, the open source vs. non-open-source thing? Or I guess you expect excessive government control.
David Duvenaud
Yeah.
Rob Wiblin
Okay. And so you're less worried about the bioweapons stuff?
David Duvenaud
I mean, I'm very worried about bioweapons, and I think that that's going to be sort of just like a good justification for government control that's nonetheless going to be excessive.
Rob Wiblin
Yeah, I guess so. I suppose the idea will be again a keyhole solution that fixes the specific problems with open source AI, like the ways in which it's most dangerous, and lets the other problems slide, and then we use it as much as possible.
David Duvenaud
Yeah. If there's such a technical solution, that would be a total bonanza. I would be over the moon, and that would lower my p(doom) substantially, if we could find it. Specifically, if we could find a way to say everyone gets open source AI, somehow we're disabling them from doing basically terrorism and other very destructive power grabs, but otherwise they get to actually serve users, I would be like, yay, I want to live in that world.
Rob Wiblin
So there was another essay earlier this year, I think, "The Intelligence Curse" by Luke Drago and, was it Rolf Laine?
David Duvenaud
Rudolf Laine.
Rob Wiblin
Rudolf Laine. I guess it was like pointing to similar themes, similar dynamics to the ones that you're looking at, especially inasmuch as humans are not doing useful work, how do they maintain economic or political influence over things? People could find it. Yeah, "The Intelligence Curse". It's quite interesting. I think they have launched, or like, I can't remember whether it's Luke or Rudolf or both of them who launched, like, workshoplabs.ai, kind of a new project that I think is focused on this intelligence curse issue. You're an advisor, right?
David Duvenaud
Correct, yes.
Rob Wiblin
What are they up to? How do they think it's going to help?
David Duvenaud
Yeah. So they've raised money, they're hiring, and their basic pitch is: we're going to avoid the kind of gradual disempowerment, intelligence curse dynamics by giving people control over their own automation. And so the service that they're planning to provide is you upload your data to some sort of secure private enclave. We will host it. We will also handle the fine-tuning to give you this personalized assistant that basically is sort of a digital clone of you, or an assistant that knows all of your context so it can help you. And the idea is it's in contrast maybe to Mechanize, who are saying, let's build this global thing where we just learn all the skills, and then the return on sort of capital is much more concentrated. The hope, the value pitch for the public good, is that if people follow the Workshop Labs model, there's a bunch of people who each sort of control their own means of production to a greater extent.
Rob Wiblin
Okay, so I think you and I agreed that in the long term, humans end up fully substituted by AI.
David Duvenaud
Yes.
Rob Wiblin
But in the short to medium term, there's a question of are they more substituted or more complemented. And that would depend in part on what technologies we've developed first and whether we try really hard to make them substitutes or make them complements. You're saying Workshop Labs is an attempt to, like, push on the complementarity: let's try to make them as complementary as possible and get them to work together productively as much as possible, so we can kind of extend the complementary era.
David Duvenaud
Yeah, well, not quite, because I also think that if you end up being replaced by your own clone, you're at least getting to control it and you own the sort of IP.
Rob Wiblin
Okay, so even if it's like not flesh and blood me doing it, I've created a kind of digital AI avatar of me that would do similar things to what I would do, but faster and better and more precisely. And so that's the thing that I would unleash on the world. But I suppose I would have to pay for it to be operating, for it to have compute.
David Duvenaud
Yes, exactly. But the idea is you would just pay them the cost of hosting and then you would actually get the wages that your digital clone is earning or whatever it's doing.
Rob Wiblin
But as brilliant as I am, David, I'm not sure that training an AI to mimic me is better than training it just to be more intelligent in general. Yeah. So how would Robert AI compete with just the smartest AGI?
David Duvenaud
Yeah, so I think that's a great question. And I do think that sort of in the long term you just want to have the centralized AI that's really smart and knows everything. And the idea is just that. But there is this very distributed economic value we all have in our current little bits of knowledge about our own businesses or our own personal relationships or whatever, and at least we can capture that value for a bit longer than if the question was just do you outsource to ChatGPT or not?
Rob Wiblin
Okay, so again, it's like maybe a long-term versus short-term story a bit, where if we could come up with an AI model that in particular replicated my style of work and my knowledge, all of my context, my experience, then at least in the immediate term, I think 80,000 Hours would be pretty excited about that because they could get more of the kind of work that they're currently paying me to do. In the long term, that will probably be outcompeted by GPT-9 or whatever it might be. But nonetheless, I guess this might like extend the era where me and an AI can work together productively. And also even past that point, even after the point where flesh and blood me is not so useful anymore, Robert AI, with all of the experience and all the context that it's built up over time, could have another couple of years, another good couple of years earning a wage.
David Duvenaud
Yeah. And maybe the base model thing is a bit of a distraction because you could imagine we get to fine-tune much more powerful base models, and then the point is that just having had all your data being uploaded affords this. But I agree that in the long term the value of that data kind of goes to zero.
Rob Wiblin
So this is a bit of a story where you're imagining, well, yes, you want a very powerful general AI base model. However, there's a lot of different tasks in the economy. There's a lot of different specialized knowledge that you might want to have a lot of fine-tuning. Yeah, exactly. And roles in organizations, different personalities that could be useful in different organizations and circumstances. And so training them to mirror a whole lot of different people that might have a useful niche in the economy, that might be a phase that we go through where that is the optimal application that the companies might want or the optimal way of using the base models they might want.
David Duvenaud
Yeah, exactly.
Rob Wiblin
Is that true?
David Duvenaud
Well, yeah, it's not clear. It's not clear. And I mean, the thing is, the fear of course is that they end up speeding up gradual disempowerment, because if you're increasing the rate at which everyone is making digital clones of themselves, then maybe in a sense you're contributing to the race to the bottom. So, I mean, I've said to Luke and Rudolf, I think you guys are going to be good for the world in the sense that Anthropic is, where you get to be the ones that say those other guys are maniacs. Right. The Mechanize people are really just trying to take away everyone's bargaining power and their own value adds. So you guys should all be mad at them. But it's also bad in the same way that Anthropic is bad, where they're doing almost the same thing and they're sort of affording many of the same dangers in the long run. And so they're trying to say, okay, well, yeah, let's try to come up with some creative mechanism to bind ourselves to only do the good thing. It's not clear what that mechanism could look like though.
Rob Wiblin
Okay, well, I guess that's a slightly mixed pitch, but if people want to learn more, I guess it's like workshoplabs.ai, and I think Luke and Rudolf are probably pretty interested in getting emails from people who would like to have their input or potentially be collaborators or investors or donors.
David Duvenaud
Absolutely.
Rob Wiblin
Is there any other research that's interesting that you would like to shout out before we finish?
David Duvenaud
Yeah, I guess one of the other things that we're trying to flesh out is actually the details of: wait, wait, how do we actually lose control if we have the AGIs? And I've tried to make this pitch, but obviously we've just started thinking about this. And so actually me and my co-authors have been working with a MATS scholar, Gideon Futerman, who's been trying to make a more detailed case of like, here are all the avenues by which we're in a situation where we actually could shut down the AI or modify it or whatever, but we end up building institutions that don't serve us. And so trying to flesh that out, I feel like there should be a lot more work in that direction.
Rob Wiblin
So I guess many listeners work at AI companies. Do the companies or the staff there have much of an opportunity to address these issues, or is it operating at a different scale that it's pretty difficult for any company to move the needle here?
David Duvenaud
Yeah, so I think it's basically beyond their scope to address, but it's not beyond their scope to monitor and sort of help us understand what's happening. So, I mean, I really like the Anthropic Economic Index, where they're trying to say what jobs are people actually doing and how are these models being used. And I think more of that, and from more companies, and more extensive, is going to actually help people understand what's happening in just these dynamics a little bit better. I will say that, of all the people that I talk to, in general people are sympathetic to this. I mean, some people are kind of a bit head in the sand or dismissive, but I think a lot of people there are just like, oh yeah, that's a huge problem. It's not really clear how a single company can deal with it. And they end up doing this thing where the RSP, or whatever safety commitments, address these very acute catastrophic risks. And then there's just a whole bunch of slower, sort of more systemic ways that things go wrong. It's not clear, if everyone's getting AI girlfriends and boyfriends, how are they supposed to address that? It's clearly just not within their scope.
Rob Wiblin
I think some people's reaction to this is, why do you think that humanity, or like our countries, are so stupid that they will kind of sleepwalk into this? This will begin to happen, it's developing over time, it's a gradual thing that's happening over years. Wouldn't we take more decisive action to prevent it? I guess part of your story is that all of these things interlock. So it's like people are gradually becoming a bit enfeebled, they're becoming poorer relative to other entities. They're like losing their political influence potentially. They're not as culturally influential either because, frankly, the podcasts that they're making are just not at the cutting edge anymore.
David Duvenaud
Anymore.
Rob Wiblin
Is there more to say to it than that?
David Duvenaud
Sure. I mean, I think the best example of this is just all of the AI lab leaders being convinced by this story that, oh, this is an existential risk and question mark, question mark, question mark. And now I need to join the race and build my own AI lab just to even matter. Right. And also I think a lot of people ask, okay, in the future, why are the oligarchs not coordinating? And it's like, well, in particular, the reason that xAI and OpenAI and Anthropic all exist is because these particular oligarchs actually exactly failed to get along on exactly this topic.
Rob Wiblin
Yeah, they dislike one another on a personal level on top of everything else.
David Duvenaud
Yeah, I mean, well, I don't know what their relationships are in much detail, but I guess I'll just say these are the exact people that we were hoping will coordinate on exactly this issue and they already exactly are failing to do so right now.
Rob Wiblin
How about just not building AGI or delaying the day that we build AGI as an approach to handling all this? Waiting until we've done a whole lot more work to figure out how to avoid gradual disempowerment, among other problems.
David Duvenaud
Yeah, I mean, we've thought a lot about this. And my co-author, David Krueger, wears T-shirts at NeurIPS that say "Just don't build AGI" and gives out stickers and stuff like that. And I think people under-discuss this possibility because it's not very fun. It's very wealth destroying. You're not sort of part of the cool kids club if you're not building. And I'm one of these people that just loves to build. I told you, we mentioned this project on LLM forecasting, for instance. But anyway, I think lately we've all agreed that we should make a habit of saying if we could all coordinate to delay building much more powerful superhuman systems, that would be a good thing. And I think the realistic way this would go is that this would end up consolidating most of the private efforts into some government programs, which again, I fear the government. And I think this is very scary for a number of reasons, but it's, I think, probably still barely, on the margin, a positive development. So I don't think it's very feasible. But I want to mention, in the Kickstarter spirit, that if everyone else agrees, I would also think this is a good effort to support.
Rob Wiblin
Okay, so sounds like there were two arguments. One is it's not realistically going to happen. The companies are not going to all come together and agree not to build AGI, or even really to slow it down very much. So it's not a big focus just because of its unlikeliness. But even if they did, potentially they're just opening the field for the government to run an AGI superintelligence program and develop its own government-aligned AGI first, which is perhaps not even any better.
David Duvenaud
Yeah, exactly. And basically harder to recover from if it ends up being captured by someone who doesn't care about various forms of safety or the public good or something like that.
Rob Wiblin
Yeah. Could you see there being, if there are warning shots, things start going badly with AI in some dimensions, or there's very scandalous outcomes. Could you see there being a big sea change in public opinion that could lead to significant slowdowns?
David Duvenaud
Yeah. So, I mean, definitely the fear though is that we end up with some sort of warning shot that doesn't sort of unite people. It actually polarizes people. And so Jascha Sohl-Dickstein is somebody who has discussed this and mentioned that there's a lot of disasters that look like social disasters where afterwards everyone can agree that something bad happened, but they totally disagree on what it was. And so maybe Covid is a perfect example where half of people think, oh man, Covid was such a disaster because the public health people didn't go far enough and they didn't control the virus and people weren't on board with all the public health measures and stuff like that. And half of people think Covid was a disaster because public health totally overreacted and the government had all these authoritarian, pointless crackdowns that were so destructive. And so we can all agree there's a warning shot. But half of us think the answer is dismantle public health, and half of us think that the answer is strengthen public health. And I think you could easily imagine a lot of warning shots where there's, I don't know, an AI influencer that makes some sort of destructive cult and gets shut down by the Internet. And then half of people think that like Mecha influencer didn't go far enough and half of them think Mecha influencer went too far. And so half of the people think that we need to protect future AI cults or whatever we call them, and half of the people think we need to shut them down.
Rob Wiblin
Sounds pretty stupid, but also very possible. One thing I'm curious about is. So you're a professor of computer science and your background is in ML, right? So you did more like technical work before.
David Duvenaud
Exactly.
Rob Wiblin
I feel like whatever this is, it doesn't feel like classic computer science. I guess you have tenure now, so perhaps you don't have to worry quite as much as you used to about the opinions of your colleagues. But what's the reception among academics of you doing this kind of work? And your faculty, do they love it? Do they love the attention it gets, or are they frustrated?
David Duvenaud
Well, one thing I'll say is that I think tenure is delivering on exactly the reason that it's supposed to be there, which is to allow somebody such as myself to work on some crazy direction that most people disagree with, and that I maybe don't even have particular expertise in. And it's pretty explicitly encouraged in my institution, and I think most institutions, to use this freedom to try to do something a little bit bigger and crazy. So I actually think I have a lot of problems with.
Rob Wiblin
The system works.
David Duvenaud
Yeah, the system works. I mean, I have a lot of problems with universities and academia, but in this aspect they're covering themselves in glory. But then as for my actual colleagues, I mean, there's been kind of a selection effect where my colleagues that sort of buy that AGI is a big deal and possible and maybe not that far away are all at labs. Like my colleague Jimmy Ba is one of the co-founders of xAI. A bunch of my former students are at Anthropic and xAI and Google and stuff like that. So there's sort of one of these evaporative cooling effects where the people that are left that are still doing something related to ML are the ones that in my view are very head in the sand and dismissive and should know better, but are saying things like, oh, I've been working with AI for a long time and it's harder to make these things agentic than you think. And just like these very, just disbelieving that there's ever, or anytime soon, going to be this physical artifact that has all this sort of like dynamism and abilities that humans have.
Rob Wiblin
Okay. And that's basically a selection effect that everyone who didn't think that has left academia and is making bank maybe.
David Duvenaud
I mean, the thing is that I also hear this from other people, or my other colleagues in other faculties like philosophy or stats or economics. I think just in general, for some reason, most academics think like, but how could it ever do my job? And I think I also hear this from regular people. I think it's a very common reaction.
Rob Wiblin
Well, yeah, I think those folks are in for a little bit of a surprise if they think that AI is not going to be able to contribute to maths or philosophy.
David Duvenaud
I think Ilya Sutskever actually just got an honorary degree at U of T last month, and he said something like, you might not be interested in AI, but AI is interested in you. Something like that, yeah.
Rob Wiblin
For better or worse. My guest today has been David Duvenaud. Thanks so much for coming on the 80,000 Hours podcast, David.
David Duvenaud
My pleasure.
In this thought-provoking episode of the 80,000 Hours Podcast, host Rob Wiblin and guest David Duvenaud (ex-Anthropic team lead, now professor at University of Toronto) explore a critical but under-discussed scenario: even if we successfully align advanced AIs to human intentions, humanity might still gradually lose control over its own future. Drawing heavily from Duvenaud's recent paper "Gradual Disempowerment," they map three key mechanisms (economic, political, and cultural) by which aligned AIs might still undermine democracy, pluralism, and human agency, eventually corroding the foundations of liberal society.
The discussion traces the multidimensional paths through which this gradual disempowerment may unfold, considers historical analogies, debates countervailing forces, and reflects on the philosophical and practical responses available to current and future generations.
Quote:
"Even if we can align AGIs to particular people or groups, we still might end up optimizing or heading at a civilizational level towards outcomes that no one wants and probably outcomes that look more like growth for growth's sake."
— David Duvenaud, [01:40]
Quote:
"Humans are just going to be this unreliable, sort of scary thing to involve in anything important..."
— David Duvenaud, [02:09]
Quote:
"The reason that states have been treating us so well in the west ... is because they've needed us... that's the key thing that's going to change."
— David Duvenaud, [00:00] and [59:39]
Quote:
"There's like a new vessel of cultural sort of memes and creation and just norms that can be operating sort of almost mostly independently from humans..."
— David Duvenaud, [10:02]
Quote:
"The only stable good outcomes involve some sort of strong global coordination, which is also very scary because if you get global coordination wrong, then you end up locked into bad scenarios as well."
— David Duvenaud, [35:35]
On the End of Human Necessity:
"Life can only get so bad when you're needed. That's the real key thing that has been keeping governments aligned. And that's the key thing that's going to change."
— David Duvenaud, [00:00], [59:39]
On the Dilemmas of UBI and Political Power:
"...the entire game going forward for economic advancement is do some sort of activism to convince the government to give your group more ubi... Governments that don't sort of disempower their citizens ... are going to be ... blown around by who's winning the activism war this week."
— David Duvenaud, [08:08]
On the Potential Irreversibility:
"...it's like a bit of a one way gate potentially."
— Rob Wiblin, [21:44]
On Historical Parallels:
"I think the aristocracy ... own all the land ... they can see what's happening ... But somehow there end up being ... a giant new source of wealth created that they mostly don't participate in."
— David Duvenaud, [22:47]
On the Moral Stakes:
"It would be really surprising if for those beings to exist in the best possible world we all had to die and have some terrible time."
— David Duvenaud, [41:07]
On Cultural Drift:
"Our culture is sort of randomly drifting in a way that no one is controlling. This is likely to lead it to be worse, just in expectation."
— David Duvenaud, [10:02]
On AI Constitutions and Power:
"I feel like it's just obvious that this is going to become one of the most important cultural and political battlegrounds that people fight over."
— David Duvenaud, [132:07]
On the Twilight of Liberalism:
"I'm a huge liberalism enjoyer ... it's a very sort of fragile thing that ... even today we should still try to protect if we can."
— David Duvenaud, [81:05]
On the Difficulty of Coordination:
"...these are the exact people that we were hoping will coordinate on exactly this issue and they already exactly are failing to do so right now."
— David Duvenaud, [145:55]
For those who haven’t listened:
This episode delivers a scholarly but frank exploration of the systemic risks facing even a best-case, alignment-achieved AI future. Through a combination of historical analogies, contemporary social science, and philosophical inquiry—leavened with moments of wry humor and resignation—the conversation clarifies that technological alignment does not guarantee social safety, stability, or justice. Instead, the challenge ahead is deeply political and cultural, requiring coherent, possibly global coordination—amidst threats to democracy and human dignity unlike anything we have previously faced.
The tone:
Serious but analytical, with honest grappling about empirical uncertainties, and a sober acknowledgment that the field is still "in beta." Both speakers encourage more multidisciplinary and imaginative work—and urge listeners, especially in academia and the technology sector, to take up the baton.