“LLM Misalignment Can be One Gradient Step Away, and Blackbox Evaluation Cannot Detect It.” by Yavuz Bakman - LessWrong (30+ Karma) | Wave AI Podcast Notes