“We found an open weight model that games alignment honeypots” by Thomas Read, Joseph Bloom - LessWrong (30+ Karma) | Wave AI Podcast Notes