“Fail safe(r) at alignment by channeling reward-hacking into a “spillway” motivation” by Anders Cairns Woodruff, Alex Mallen - LessWrong (30+ Karma) | Wave AI Podcast Notes
Mon Apr 27 2026