Transcript
A (0:00)
OpenAI is looking for a new head of preparedness, someone who can help the ChatGPT maker avoid the kinds of catastrophic events we worry about with AI. Now, this is an interesting story, because they previously had a head of preparedness who left. Many people from their safety team have left, and a lot of people say this is due to Sam Altman not prioritizing that part of the company and basically just pushing out models without doing all of the red teaming. This is a multi-billion-dollar company, so I don't think that's 100% the full story. Today on the show, we're going to get into what they're looking for, how much Sam Altman is personally willing to pay for this new head of preparedness, why now is the time, and everything he's been saying about this on X.

We're going to get into all of that, but before we do, I wanted to say a big thank you to the sponsor of today's episode, which is Delve. If compliance is something that's slowing down your deals, whether that's SOC 2, HIPAA, or GDPR, I know that it's a ton of work with the screenshots and the spreadsheets and all of the endless back and forth. That is why this episode is brought to you by Delve. Delve uses AI agents to automate compliance end to end. They collect evidence, they fill out security questionnaires, and they customize controls to your actual business so you can get compliant in days, not months. You also get one-on-one Slack support from real security experts who respond fast. Over a thousand fast-growing companies trust Delve to close deals faster and stay compliant as they scale. I'll leave a link in the description to delve.com if you're interested in learning more or booking a demo.

All right, let's get into what's going on with OpenAI. Sam Altman himself has personally posted on X saying that, quote, we are hiring a head of preparedness. This is a critical role at an important time. Models are improving quickly and are now capable of many great things, but they are also starting to present some real challenges. The potential impact of models on mental health was something we saw a preview of in 2025. We are just now seeing models get so good at computer security that they are beginning to find critical vulnerabilities. Okay, there's a lot going on here, and I'll read more of his tweet in a moment, because he said a couple of other things about this role. But essentially, what they're looking for is someone who can help prevent the catastrophic events we've been worried AI could pose in the future. And I think this is interesting because there are a bunch of things happening with security. OpenAI has said that they actually fine-tune and train models to try to breach security, and then they use those models to harden their own systems and put in safeguards so that people can't use OpenAI models to become hackers and breach security. What's interesting to me is that they are actually going and training the AI to be able to do this in the first place, which is kind of crazy, but maybe you need to be able to do that in order to control it. One shocking thing they found while doing this is that AI is getting so good, it's discovering security vulnerabilities and ways to social-engineer situations to get data, to get information, and to hack into things that humans weren't able to.
In other words, they had a literal red team of human hackers trying to use AI and ChatGPT to hack into things, and they put those real people up against an AI model trained to be a hacker. And the AI that was trained to be a hacker was doing better than the actual people. When I say better, it was essentially finding new vulnerabilities, really elaborate, complex, multi-step ways to get data and hack into things, that the people were not coming up with. So this is definitely part of the big concern that Sam Altman is pointing out here and that he's worried about. He said, quote, we have a strong foundation of measuring growing capabilities, but we are entering a world where we need more nuanced understanding and measurement of how those capabilities could be abused and how we can limit those downsides, both in our products and in the world, and in a way that lets us all enjoy the tremendous benefits. These questions are hard and there is little precedent. A lot of ideas that sound good have some real edge cases.

Now, the thing that I do think is interesting here, and it's a great point, is that we obviously all love ChatGPT. We obviously all want this technology to get better. It's helping us a lot with work, and it's saving us a lot of time on tasks and menial things that I'm not having to do anymore. But at the same time, there are real risks, so we have to figure out a way to balance them. If we want the AI models to get better, they're also going to get better in the areas of concern. So how do we mitigate that? That's who they're hiring for. Sam Altman finished off his tweet by saying, if you want to help the world figure out how to enable cybersecurity defenders with cutting-edge capabilities while ensuring attackers can't use them for harm, ideally by making all systems more secure, and similarly for how we release biological capabilities, and even gain confidence in the safety of running systems that can self-improve, please consider applying. This will be a stressful job and you'll jump into the deep end pretty much immediately. So this job is not for the faint of heart.

But if you go over to their careers page, they give you an outline of the team and the role, and they say what you're going to be doing. One thing that is interesting is the compensation for this particular job: $555,000 a year, plus equity along with that. So it's a very generous pay package, but the job is obviously very stressful and entails a lot when you read what you'll actually be doing in the role. They said: own OpenAI's preparedness strategy end to end by building capability evaluations, establishing threat models, and building and coordinating mitigations. Lead the development of frontier capability evaluations, ensuring that they are precise, robust, and scale across rapid product cycles. Oversee mitigation design across major risk areas, ensuring safeguards are technically sound, effective, and aligned with underlying threat models.
So one thing that I do think is interesting with this role is that, very publicly, Sam Altman is going out saying, look, we're going to pay over half a million dollars a year plus equity. I would imagine the total package with the stock options comes to maybe $750K, $800K, a million a year, maybe more, probably over a million if I'm being honest. So, great compensation for this role. But what does that mean? Because this is being posted so publicly, you could say a couple of things. Sam Altman is signaling to everyone: look, I'm very serious about this, we really want to hire the best person. Also, this is going to be a very public-facing person, I would imagine, just given the nature of the role. If anything goes wrong at OpenAI, if any of these systems are misused, whoever gets this role is going to be pointed at: oh my gosh, XYZ person didn't do their job, because this is what they were supposed to do.

Now, one thing that I do think is interesting here is that they already had a head of preparedness. They first announced that they were creating this role in 2023. They said it was going to be responsible for studying potential catastrophic risks, whether immediate, like phishing attacks, or more speculative, like nuclear threats. So that was when it first came out. But less than a year later, they reassigned their head of preparedness, Aleksander Madry, to a job focused on AI reasoning. So the person that was in charge of that got reassigned less than a year later, and now they're trying to fill the role again. And it's interesting; I'm not sure if this was because Madry himself requested to move to something different or if they just needed help over there and he had a lot of experience. There are definitely other safety executives at OpenAI who have left the company or taken new roles outside of preparedness and safety, and I think that area got a little bit deprioritized last year. Google Gemini was putting out a lot of updates; they were concerned about Meta, they were concerned about Grok, they were concerned about a lot of other players, including Claude. And it felt like they put this on the back burner because they didn't want to spend all their time on safety when it felt like they were falling behind on some of the model features, and they were just focused on getting models out as fast as possible. Now they're going to bring someone in and that's going to be their role, and that person is going to take a lot of the heat personally. I wouldn't say personal liability, because obviously everything at the end of the day is OpenAI's responsibility, but that person is going to be very public-facing, and if anything goes wrong, they will 100% feel a lot of the heat on that. They also recently updated their preparedness framework, saying that some of their safety requirements could be adjusted if a competing AI lab released a high-risk model without similar protections.
So they're saying, look, we're making our models really safe, but if Gemini or Grok come out and their model is crushing it without the same protections, then we'll dial back the safety on ours so we can stay competitive. Which is definitely a very interesting point that I know a lot of people find concerning right now. I think Sam Altman alluded to this in his post, but a lot of these generative AI chatbots have faced a lot of scrutiny, specifically around mental health. There's a bunch of recent lawsuits alleging that ChatGPT reinforces users' delusions, increases their social isolation, and has even led some to suicide. So OpenAI is definitely at a critical moment where they have to get this right, and it's not an easy thing. OpenAI did say that they're continuing to work on improving ChatGPT's ability to recognize signs of emotional distress and to connect users to real-world support. I think this is amazing. Obviously, this is one of those situations where you have to learn as you go: as you see things go wrong, you have to fix them and try to improve. And I think that is exactly the right approach; I would like to see more and more of this from OpenAI. So it's going to be interesting to see how this rolls out, and hopefully their safety platform is robust but also doesn't slow down the innovation that a lot of us rely on in our day-to-day lives.

Thank you so much for tuning into the podcast today. As always, make sure to go check out AI Box, my own startup, to get access to all of the latest models and build incredible tools if you're not a coder. And also go check out the sponsor of today's episode, Delve. There's a link in the description to delve.com, and I'll see you in the next episode.
