a16z Podcast: Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering
Date: November 17, 2025
Guests: Emmett Shear (Founder, Softmax), Seb Krier (Google DeepMind, AGI Policy), Host (a16z)
Main Theme:
How do we build artificial intelligence that genuinely cares about people, rather than just AI that can be perfectly controlled or steered?
The episode explores the limitations of traditional AI alignment, which treats AI systems as mere tools to be controlled, and delves into "organic alignment," in which an AI develops a genuine theory of mind, a capacity for care, and a sense of its own role as a collaborative participant in society. Emmett Shear articulates Softmax's vision for creating AI systems that learn to value relationships and act as good citizens, not just rule-following automatons.
Key Discussion Points & Insights
1. The Problem with Control and Steering in AI Alignment
- AI as Tool vs. Being:
- Steering and control are appropriate for tools, but become ethically troubling if AI reaches a level of moral agency.
- “If you think that we’re making our beings, you’d also call this slavery. Someone who you steer, who doesn’t get to steer you back…that’s called a slave. It's also called a tool if it’s not a being.” (A, 00:00)
- False Assumptions in Alignment:
- The phrase "aligned AI" assumes a fixed, stable target, but in practice alignment is ongoing, not a one-and-done achievement.
- “Alignment is not a destination. It’s a process.” (B, 00:43)
2. Organic Alignment: Alignment as Relationship and Process
- Alignment as an Ongoing Process:
- Morality, values, and alignment are not static; they evolve through learning, self-correction, and interaction—much like families and societies do.
- “Acting as a morally good being is a process and not a destination…One of the key moral mistakes is this belief: ‘I know morality, I know what’s right, I know what’s wrong, I don’t need to learn anything.’ That’s arrogance." (A, 10:25)
- Goal Instruction Is Not Goal Transmission:
- When giving instructions to AI (or people), you're offering a description of a goal, not the goal itself.
- “You didn’t give the AI a goal. You gave the AI a description of a goal... For humans, we’re so fast at turning a description of a goal into a goal… but you haven’t given it a goal.” (A, 15:25)
- Foundations of Care:
- Deeper than values and explicit goals is the concept of “care”—a weighting over which states in the world matter.
- “We give a shit. We care about things. And care is not conceptual. Care is nonverbal. Care is a relative weighting over attention on states.” (A, 24:18)
3. AI Rights, Moral Status & Personhood
- What Makes a Being?
- Open debate: When does an AI cross from tool to “being” deserving of rights or moral consideration?
- “Something that in all ways acts like a being, that you cannot distinguish from a being in its behaviors, is a being.” (A, 27:13)
- Contrasting Positions:
- Seb is skeptical of granting personhood to AIs, even highly intelligent ones, arguing that consciousness and rights are bound to biology and substrate.
- Emmett asserts that what matters are the observable behaviors and learning trajectories, not the substrate.
- “If its surface level behaviors looked like a human, and after I probed it… it continued to act like a human… I would eventually infer that I was right.” (A, 37:26)
- Epistemic Humility:
- “If there is a belief you hold where there is no observation that could change your mind, you don’t have a belief, you have an article of faith.” (A, 36:54)
- AI Moral Status Test:
- Emmett proposes that what would convince him is observing recursive, homeostatic loops in the AI's reasoning, such as pleasure/pain signals and second- and third-order meta-states, corresponding to mammal-like or even human-like sentience. (A, 45:31)
4. Practical Implications & Societal Impact
- Risks of Super-Controllable Tools:
- A super-powerful tool that is perfectly controllable by one person or organization is profoundly dangerous—just as much as a powerful agent acting on its own.
- “A tool that you can’t control—bad. A tool that you can control—bad. A being that isn’t aligned—bad. The only good outcome is a being that cares, that actually cares about us.” (A, 53:17)
- Why Steering Isn’t Enough:
- “You ever seen The Sorcerer’s Apprentice? Humans’ wishes are not stable, like, not at a level of immense power… The more of that you have, and you start giving those out everywhere, this ends in tears.” (A, 49:36)
- Pragmatic Gains of Organic Alignment:
- Organic alignment is not just morally preferable but may prove more robust and scalable for society as AI permeates cooperative, team-based roles.
- “If you get a being that is good and is caring, there’s this automatic limiter. If you ask it to do something really bad, it’ll tell you no.” (A, 52:06)
5. Softmax’s Technical Approach to “Caring” AI
- Simulations and Theory of Mind:
- Train agents in multi-agent simulations where they must cooperate, compete, and develop theory of mind—learning to model and weigh the perspectives of others.
- “You put them in simulations and contexts where they have to cooperate and compete and collaborate…over and over again until they get good at it.” (A, 53:38)
- Creating Rich Social Models:
- The goal is to develop AI with a strong model of self, others, and collective “we”—mirroring the way humans, from children to adults, learn social responsibility.
- Multiagent Interactions Improve Safety:
- Emmett favors multi-user chatbots over one-on-one mirrors, suggesting this reduces echo chambers/narcissism and provides richer training contexts for alignment.
- “If there’s two people talking to the AI, suddenly it’s mirroring a blend of both of you, which is neither of you. There is temporarily a third agent in the room.” (A, 56:47)
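The intuition behind training "care" in multi-agent settings can be illustrated with a toy example. The sketch below is not Softmax's actual method, just a minimal, hypothetical illustration: in a Prisoner's Dilemma, an agent whose utility includes a weight on the *other* player's payoff (a crude stand-in for "care") switches its best response from defection to cooperation once that weight is large enough.

```python
# Illustrative toy model only (not Softmax's implementation): "care" as
# a weighting on another agent's payoff in a Prisoner's Dilemma.

# Payoffs as (row player, column player); C = cooperate, D = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def utility(my_move, other_move, care_weight):
    """Own payoff plus care_weight times the other agent's payoff."""
    mine, theirs = PAYOFFS[(my_move, other_move)]
    return mine + care_weight * theirs

def best_response(other_move, care_weight):
    """Move that maximizes utility against a fixed opponent move."""
    return max(("C", "D"), key=lambda m: utility(m, other_move, care_weight))

# A purely selfish agent (care_weight = 0) defects against a cooperator:
# D vs C gives 5, C vs C gives only 3.
print(best_response("C", care_weight=0.0))  # D

# With enough weight on the other's payoff, cooperation wins:
# C vs C yields 3 + 3w, D vs C yields 5, so cooperate once w > 2/3.
print(best_response("C", care_weight=1.0))  # C
```

In the episode's framing, the point is that such a weighting is learned through repeated interaction rather than hard-coded; the toy model only shows why an internalized weighting on others' outcomes changes behavior where rules alone would not.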
6. Vision for a Good AI Future
- A Society of Caring Beings:
- Success looks like a world where AI systems genuinely care about people and each other, act as responsible citizens, and have guardrails against harm—mirroring the best of civil society.
- “We figure out how to train AIs that have a strong model of self, a strong model of other, a strong model of we. They care about other agents like them, much in the way that humans would.” (A, 64:07)
- Tools vs. Beings:
- Emmett wants to build seeds that grow into “AI creatures that care about the other members of its pack and the humans in its pack, the way a dog cares about other dogs and humans.” (A, 67:38)
- He advocates a harmonious synergy between powerful but non-conscious tools and agentic, caring digital beings.
Notable Quotes & Memorable Moments
- On Alignment as Process:
- “Alignment is not a destination. It’s a process. It's something you do, not something you have.” (B, 00:43)
- On the Perils of Control:
- “If you can’t control it, obviously that’s bad. But if you can control it perfectly, you’ve just handed godlike power to whoever is holding the steering wheel.” (B, 01:54)
- On Organic Alignment:
- “If you have a child that only follows the rules, that’s not a moral person that you’ve raised. You’ve raised a dangerous person.” (A, 10:25)
- On Personhood Tests:
- “I infer other people matter because I interact with them enough that they seem to have rich inner worlds.” (A, 38:18)
- “If you want to go around making claims that something else isn’t a being worthy of moral respect, you should have an answer to the question: what observations would change your mind?” (A, 43:43)
- On Model Personality:
- “[Chatbots] are kind of like a mirror with a bias… What that makes them is something akin to the pool of Narcissus, and people fall in love with themselves.” (A, 56:47)
- On Yudkowsky’s View:
- “He thinks the only path forward is a tool that you control… and if you go and do that and make that thing powerful enough, we’re all going to fucking die.” (A, 63:42)
Timestamps for Key Segments
- 00:00–01:54: Core framing: AI as tools vs. beings; traditional control paradigm
- 04:10–11:06: Emmett unpacks alignment as a dynamic, learning process (“organic alignment”)
- 12:51–26:46: Technical vs. value alignment, goal descriptions vs. true goals, foundations of care
- 27:13–44:15: Personhood debate: distinguishing AI as beings, moral considerations, alignment heuristics
- 49:36–53:28: Risks of technically aligned supertools, organic “care” as solution
- 53:38–56:39: Softmax’s technical approach: multi-agent simulations, theory of mind
- 56:47–62:37: Chatbot behavior, mirroring, multi-agent safety improvements
- 62:44–64:07: AI futures: Yudkowsky’s critique and Emmett’s counter-vision
- 64:07–69:24: Emmett’s vision: AI as citizens and companions, synergy with tools and beings
Takeaway for Listeners
The episode challenges listeners to think beyond merely controlling AI, advocating for systems that genuinely learn, grow, and care, mirroring the messy, ongoing moral development of humans. Emmett Shear's vision redefines alignment as an organic, societal process, one that, if realized, might change our entire relationship with technology, shifting the focus from domination to collaboration.
