#234 – David Duvenaud on why 'aligned AI' would still kill democracy - 80,000 Hours Podcast

Transcript

David Duvenaud (0:00)

The reason that states have been treating us so well in the West, at least for the last, let's say like two or three hundred years, is because they've needed us. And in particular because allowing freedom and like private property and basically self-determination has been the most effective recipe for growth. Life can only get so bad when you're needed. That's the real, real key thing that has been keeping governments aligned. And that's the key thing that's going to change. A lot of citizens would end up just being sort of like full-time activists and they might feel like they're forced to because if their only source of income is something like UBI, then the entire game going forward for economic advancement is do some sort of activism to convince the government to give your group more UBI. Those same resources could be used to simulate maybe like millions of much more sympathetic, morally superior virtual beings. And so it'll start to be seen as this like irresponsible use of resources to keep like some sort of like legacy human around.

Rob Wiblin (0:50)

Today I'm speaking with David Duvenaud, professor of computer science at the University of Toronto. David is a co-author on a somewhat recent paper called Gradual Disempowerment, which makes the slightly counterintuitive claim that even if we manage to solve the AI alignment problem and have AIs that faithfully follow the instructions and goals of the group that's operating them, humanity could nonetheless end up losing control over its future and end up with a pretty bad outcome. The paper got a lot of reactions, it's fair to say, with some people saying that it really put its finger on an underrated issue and others thinking that the scenarios painted were really unlikely, and other people arguing that they were likely but not even necessarily undesirable. I'm kind of a bit unsure where I come down myself, so thanks so much for coming on the show to discuss it, David.

David Duvenaud (1:29)

Oh, it's my pleasure, Rob.

Rob Wiblin (1:30)

So let's imagine that we have managed to make big breakthroughs in AI alignment, maybe around 2028. How is it that nevertheless things could end up trending in a negative direction?

David Duvenaud (1:40)

Yeah, so the basic thesis is that even if we can align AGIs to particular people or groups, that we still might end up optimizing or heading at a civilizational level towards outcomes that no one wants and probably outcomes that look more like growth for growth's sake.

Rob Wiblin (1:54)

So paint us a picture. We're at the point where we have human level or greater than human level AGI and we've made big progress on alignment. So we basically can trust the AIs to follow the goals that we give them. How does humanity begin to become disempowered?

Summary

Overview of the Episode

In this episode of the 80,000 Hours Podcast, host Rob Wiblin and guest David Duvenaud (ex-Anthropic team lead, now professor at the University of Toronto) explore a critical but under-discussed scenario: even if we successfully align advanced AIs to human intentions, humanity might still gradually lose control over its own future. Drawing heavily on the recent paper "Gradual Disempowerment," which Duvenaud co-authored, they map three key mechanisms (economic, political, and cultural) by which aligned AIs might still undermine democracy, pluralism, and human agency, eventually corroding the foundations of liberal society.

The discussion traces the multidimensional paths through which this gradual disempowerment may unfold, considers historical analogies, debates countervailing forces, and reflects on the philosophical and practical responses available to current and future generations.

Key Discussion Points & Insights

1. The Core Thesis: Solving AI Alignment Might Not Be Enough

Thesis Statement: Even if we solve the technical problem of aligning AGIs to faithfully pursue human-given goals, civilization may still drift toward outcomes no one would endorse, such as “growth for growth’s sake.” (Duvenaud, 01:40)

Quote:
"Even if we can align AGIs to particular people or groups, we still might end up optimizing or heading at a civilizational level towards outcomes that no one wants and probably outcomes that look more like growth for growth's sake."
— David Duvenaud, [01:40]

Gradual Disempowerment is a multi-step, systemic trend rather than a sudden event or catastrophic coup.

2. Economic Disempowerment: Human Labor Becomes Obsolete

Automation Dilemma: As AIs radically surpass human capability, they will fully substitute human labor across domains.
Labor Market Dynamics: Conventional thinking, which dismisses automation fears as the lump of labor fallacy, presumes new jobs will always appear, but transaction costs, liability risks, and human unreliability mean that eventually “employing humans” itself becomes uncompetitive.
Example: Employing a human surgeon when a machine can reliably outperform them will come to seem negligent or irresponsible (Duvenaud, 04:20).

Quote:
"Humans are just going to be this unreliable, sort of scary thing to involve in anything important..."
— David Duvenaud, [02:09]

Human Capital Investment Collapses: As the value of human labor plummets, there’s less incentive for societies and capital markets to invest in people (universities, training).
Poverty & Inequality Risk: Economic rents flow to initial owners of powerful AIs, leading to massive inequality and increasing irrelevance of ordinary humans.
Threat to Property Rights: Over time, as humans become less central, even property rights may stop being respected or may become politically tenuous (Duvenaud, 22:47).

3. Political Disempowerment: The Death of Democracy?

Historical Basis of Democracy: Duvenaud and Wiblin trace democracy and human rights in the West to the economic necessity of having an empowered workforce (Wiblin, 06:41).

Quote:
"The reason that states have been treating us so well in the west ... is because they've needed us... that's the key thing that's going to change."
— David Duvenaud, [00:00] and [59:39]

Loss of Leverage: With humans no longer needed for work or economic output, governments are no longer incentivized to share power, cultivate education, or be accountable.
Oligarchy & Autocracy: Political power is expected to concentrate among those who control the machines and the means of coordination. Democratic pluralism becomes fragile, or non-competitive in a geopolitical “arms race.”
Suppression of Dissent: Without leverage (labor strikes, civil disobedience), humans lack credible threats; the state may perceive political activism as destabilizing and respond with repression (Duvenaud, 64:23).

4. Cultural Disempowerment: The New Battleground

Culture as a Replicator: Cultural norms have historically adapted through evolutionary pressures that connected “good culture” to group success.
Loss of Feedback: As humans become marginalized, and machines begin to generate and transmit culture (memes, narratives, entertainment) largely independent of human input, this feedback loop breaks (Duvenaud, 10:02).
‘Alien Drift’: The cultural transmission and production by AIs could lead to a drift towards values, aesthetics, and behaviors not aligned with human flourishing, further undermining human relevance.

Quote:
"There's like a new vessel of cultural sort of memes and creation and just norms that can be operating sort of almost mostly independently from humans..."
— David Duvenaud, [10:02]

Memetic Competition: Eventually, ‘fit’ memes (including those that are pro-AI) will dominate, regardless of their consequences for existing humans.

5. Counterarguments, Critiques, and Responses

Protection via Resource Ownership: Rob raises whether humans owning all the AIs and their output would let them capture all economic surplus and remain empowered.
- Duvenaud counters that “ownership” is fragile, recalling the English aristocracy: such groups can be circumvented, outcompeted, or politically marginalized over time (22:47).
Possibility of Coordination or Agreements: The hope is that, as AIs get smarter, humanity might enforce agreements or ‘grandfather in’ protections for humans, but these depend on the timing and health of human-led institutions (35:35).

Quote:
"The only stable good outcomes involve some sort of strong global coordination, which is also very scary because if you get global coordination wrong, then you end up locked into bad scenarios as well."
— David Duvenaud, [35:35]

Moral Uncertainty: The conversation circles around whose welfare matters—is it just current humans, future humans, or future AI beings? Duvenaud is sympathetic to “let humans decide,” but admits philosophical quandaries abound (41:07).

6. Potential Futures and Evolution of Liberalism

The Zero-Sum Society: UBI for all seems promising, but as jobs disappear and all wealth is redistributed, politics becomes a zero-sum conflict over allocation, eroding the foundations of pluralism and liberal values (81:43).
End of Easy Mode: Liberal capitalist democracy’s golden era is rooted in positive-sum labor economics and the need for human participation; once this logic evaporates, so do liberal norms (80:17).
Twilight of Pluralism: Decision making may increasingly consolidate, and the openness to dissent and difference that has marked recent centuries could wane or vanish (81:33).

7. Coordination, Forecasting, and Institutional Suggestions

Gradual Disempowerment Index: Efforts are underway (with Metaculus and others) to operationalize and forecast the pace and form of human disempowerment (107:33).
AI Constitutions/Model Specs: The “system prompt” or post-training values loaded into powerful AIs will be a vital, contentious axis for cultural and political struggle (131:09).
AI as Forecasting Tool: Duvenaud describes ongoing projects to use AI models trained only on historical data to backtest predictions about the future (108:45–112:53).
Complementarity vs. Substitution: Duvenaud collaborates with WorkshopLabs (137:32), a startup aiming to maximize human-AI complementarity, but notes this may only delay, not avert, displacement.

8. Philosophical and Strategic Open Questions

What (if anything) should be locked in? Is there a value to preserving human control versus allowing “moral progress” led by new entities? Duvenaud is wary of leaving human values to the drift of blind evolution or competitive pressures.
Competition vs. Coordination at Scale: Should we expect unbounded competitive dynamics (“locust logic”) to dominate, or can stable, morally desirable equilibria be engineered?
Can AI-augmented coordination mechanisms save us, or do they accelerate our own marginalization? (126:05) The same tools could empower or enfeeble.

Notable Quotes & Memorable Moments (with Timestamps)

On the End of Human Necessity:
"Life can only get so bad when you're needed. That's the real key thing that has been keeping governments aligned. And that's the key thing that's going to change."
— David Duvenaud, [00:00], [59:39]
On the Dilemmas of UBI and Political Power:
"...the entire game going forward for economic advancement is do some sort of activism to convince the government to give your group more UBI... Governments that don't sort of disempower their citizens ... are going to be ... blown around by who's winning the activism war this week."
— David Duvenaud, [08:08]
On the Potential Irreversibility:
"...it's like a bit of a one way gate potentially." — Rob Wiblin, [21:44]
On Historical Parallels:
"I think the aristocracy ... own all the land ... they can see what's happening ... But somehow there end up being ... a giant new source of wealth created that they mostly don't participate in."
— David Duvenaud, [22:47]
On the Moral Stakes:
"It would be really surprising if for those beings to exist in the best possible world we all had to die and have some terrible time."
— David Duvenaud, [41:07]
On Cultural Drift:
"Our culture is sort of randomly drifting in a way that no one is controlling. This is likely to lead it to be worse, just in expectation."
— David Duvenaud, [10:02]
On AI Constitutions and Power:
"I feel like it's just obvious that this is going to become one of the most important cultural and political battlegrounds that people fight over."
— David Duvenaud, [132:07]
On the Twilight of Liberalism:
"I'm a huge liberalism enjoyer ... it's a very sort of fragile thing that ... even today we should still try to protect if we can."
— David Duvenaud, [81:05]
On the Difficulty of Coordination:
"...these are the exact people that we were hoping will coordinate on exactly this issue and they already exactly are failing to do so right now."
— David Duvenaud, [145:55]

Timestamps for Key Segments

[01:40] – Thesis of gradual disempowerment even with aligned AI
[04:20] – Decline of human labor value and employment
[06:41] – Historical roots of democracy & economic necessity
[10:02] – Cultural evolution and “alien drift”
[22:47] – Why property rights and wealth may not protect you
[41:07] – Moral questions about human value & legacy
[59:39] – Governments only align with human interests due to necessity?
[81:33] – Fragility and future of liberalism, pluralism
[107:33] – Gradual Disempowerment Index & measurement challenges
[131:09] – The political and cultural importance of AI constitutions/model specs
[145:55] – The real-world challenges of institutional coordination
[147:17] – Could simply halting AGI progress help? Problems with that plan

Flow & Takeaways for Listeners

For those who haven’t listened:
This episode delivers a scholarly but frank exploration of the systemic risks facing even a best-case AI future in which alignment has been achieved. Through a combination of historical analogies, contemporary social science, and philosophical inquiry, leavened with moments of wry humor and resignation, the conversation clarifies that technological alignment does not guarantee social safety, stability, or justice. Instead, the challenge ahead is deeply political and cultural, requiring coherent, possibly global coordination amid threats to democracy and human dignity unlike anything we have previously faced.

The tone:
Serious but analytical, with honest grappling about empirical uncertainties, and a sober acknowledgment that the field is still "in beta." Both speakers encourage more multidisciplinary and imaginative work—and urge listeners, especially in academia and the technology sector, to take up the baton.

For Further Engagement

Gradual Disempowerment paper (co-authored by Duvenaud; core argument)
“How Goodness Competes” talk (Joe Carlsmith)
WorkshopLabs, Metaculus forecasting, and practical efforts for the gradual disempowerment index
