
The rules of responsible disclosure were written …
Greg Otto
Security researchers have more power than ever to find critical vulnerabilities. So why does it feel like the relationship with the industry is more broken than ever? We'll talk about it on this episode of Safe Mode.
Welcome to Safe Mode. I'm Greg Otto, editor in chief at CyberScoop. Every week, we break down the most pressing security issues in technology, providing you the knowledge and the tools to stay ahead of the latest threats, while also taking you behind the scenes of the biggest stories in cybersecurity. An attack is coming. It's about keeping us safe. It's just a disc.
Gal Abaz
She's a super hacker. Stay alert.
Derek Johnson
Stay safe.
Gal Abaz
Stay safe.
Greg Otto
This is Safe Mode.
Welcome to this week's episode of Safe Mode. I am your host, Greg Otto. In our interview segment, we're going to be talking with Gal Abaz, the co-founder and CTO of Oligo Security, about the fight that we're currently seeing play out between some vulnerability researchers and Microsoft. Over the past two weeks, I would have said that fight was the biggest mess in cybersecurity. And then last week, particularly Friday, happened, and Anthropic and the US government are going through it right now when it comes to the Fable and Mythos 5 models. Look, there's been a lot of noise this week around this. Only if you've been living under a rock have you not seen all the reporting that has been going on. But there's a lot of noise flooding the zone right now, so I wanted to bring on Derek Johnson to talk about what exactly is happening here in terms of the technical capabilities, the issues with the technical capabilities, and how this fight really may not be about that, and how it's not necessarily just about Anthropic, either, in the long run. So let's dive into it, I think first by talking about the cybersecurity community's reaction to this. With the banning and the way that it shook out in terms of what is or is not possible with these tools, the cybersecurity community signed what is basically an open petition. And these are luminaries, people we all know, sources that we all talk to, who basically said these are...
Derek Johnson
These are cybersecurity officials who use AI at the cutting edge every day in their work.
Greg Otto
Right. And them basically coming together to say, this isn't good because this is the way these tools are supposed to work. And if everybody's doing their work the right way, this isn't something that we necessarily need to worry about.
Derek Johnson
Yeah. "When is a jailbreak not a jailbreak?" is, I think, the really relevant question here, because it's really focused on these reports that we've seen publicly, one of which has been described by people like Katie Moussouris, who have looked at them. They describe what the researchers call jailbreaks, and in some ways perhaps they are, but they're not jailbreaks in the way that we talk about when we talk about preventing a lot of the really harmful capabilities that Mythos has, like setting up complicated and automated hacking schemes or getting access to really high-level biology information that could help you make a bioweapon. These are the guardrails that Anthropic was really focused on. The research from one of the independent researchers managed to get Fable 5 to spit up its system prompt, which is impressive, but it's not about accessing some of these higher-level responses. And from Katie Moussouris's description of the Amazon report, it really sounds like they just asked Fable to look at some open source code that already had some vulnerabilities baked into it. Fable initially refused, but using a multi-step, manual process, they got it to develop some automated scripts to test patching, which is totally normal and not super high-level hacking stuff.
Greg Otto
Right. I think the process that you're describing there goes into the defense-oriented prompting. I'm not a total expert on this, and obviously I do not have access to these tools. But researchers have told me that this is an important part of using these tools to find these bugs, and that this defense-oriented prompting is part of what caught the Department of Commerce and the Trump administration off guard on what is possible with Anthropic and Fable. And while I can understand seeing that and going, wow, this is really scary, the way that Katie and other researchers have talked about it is that this is the way these tools are used, whether it's Anthropic, OpenAI, or some of the Chinese LLMs that are out there. This is using AI. This is not something that is unique to Anthropic.
Derek Johnson
The techniques that are described in these reports can be reproduced not just by other frontier models like OpenAI's Daybreak; they can be reproduced by older, existing models that are already on the market. You're talking about Claude Opus, Claude Sonnet, Chinese models, and open source models that have been able to reproduce the same thing that the administration is allegedly using to justify putting these export restrictions in place. So it's very, very odd to single out Fable for capabilities that are not at all unique, according to a lot of cybersecurity officials. If you are worried about that, that is something that literally everyone can do with models right now, today. And that was one of the big points from the cybersecurity researchers.
Greg Otto
So that all aside, I think something that is very interesting here is the reporting about how we got to this point in time with Mythos and Fable, and how this has been building over the first half of this year, because obviously Anthropic was working on this particular model and the capabilities came along with it. And look, when we wrote about Project Glasswing, one of the big things that Anthropic said to us was: we are not giving this out. We understand the capabilities that are in this, so we are keeping it somewhat under lock and key. We are not just going to put it out there for the world, because we know what is possible here. That doesn't seem to have changed at all. So part of the calculus here, from the government standpoint, is what I have trouble with. I wonder, why now? It just seems that the capabilities have been there for months. And now, even with a model that is nerfed, to use a word that we have used, everybody's getting so worked up.
Derek Johnson
Yeah, I mean, again, I wonder how much of it just comes down to the term jailbreak. And to Anthropic's credit, they saw this in advance. In their announcement for Fable 5, they said that they had done a thousand hours of testing and had found no universal jailbreaks. Now you could go into, you know, what kind of jailbreaks did you find? What kind of partial jailbreaks did you find? But if you are to believe Anthropic, none of their testing showed that the safeguards people are really concerned about have been circumvented. And none of the research that has come out since has proven otherwise. Maybe that will change, because as we know, the more that you let the hackers get their hands on these models, the more that they can do. But more broadly, you really have to look at the kinds of capabilities that the administration is putting export controls on and realize that if we keep progressing at the pace that we have been for the last few years, along the path and the plan that the administration has, the sort of all-speed-ahead approach, you're going to have open source and older, commercially available models that give you those capabilities within the year. That's what a lot of folks think. It's important not to look at Mythos as some huge outlier. I mean, it is an outlier to a certain extent, but it's also another step in the chain of model improvements. There are going to be improvements beyond this. And so putting export controls on one piece of technology, for capabilities that are already available today, because you're worried about them in the future: it's weird.
Greg Otto
The export control part is really interesting to me. Look, I am not a lawyer at all, but I covered these conversations almost a decade ago, when this was really hot in cybersecurity policy circles and talk about intrusion software or zero-days was lined up with talk about the Wassenaar Arrangement and all of these export controls. My understanding was that export controls were in place partly because we were handing something over. When you are talking about an actual zero-day, I can take that, put it on a USB drive, go to China or Russia or anywhere else, and basically turn it over as a piece of property. And the export controls are supposed to be there to say, no, no, no, don't do that. We do not want that to fall into the hands of our adversaries. That's something I can totally understand even if I don't know all the ins and outs; it's not like I have law degrees on the wall. With Anthropic, I cannot give you a disk or a thumb drive and say, you now have this model. It is a cloud service. So are the export controls that we're talking about here really on the outputs? And that is not Anthropic. Anthropic can't control that. Or is it...
Derek Johnson
Yeah, just license output.
Greg Otto
Right. In the same vein, it reminds me of this, and I would never do this, obviously, I am presenting a hypothetical: if I use Microsoft Word to write a manifesto of all the bad things that I'm going to do, and possibly get arrested for that, Microsoft Word is not to blame. It is a piece of software that is giving me the outputs. So use that metaphor, and I ask you as a listener or viewer to think about that too when it comes to these export controls. Because as I think through this, I'm like, wait a minute, the dynamic has changed so much in terms of what is possible here that it just seems that we are pulling an emergency brake without thinking through the operational, legal, governmental, or policy ramifications of what this means.
Derek Johnson
Well, that's what's sticky, and really contradictory, I think, about the way this is being handled, and Mark Warner pointed this out. Maybe you could make an argument that the fully completed models on the frontier side are these fully packaged products, and maybe you say that you want to put those under a different kind of export regime. But the Trump administration has lifted export restrictions on AI chips and all other kinds of technology to countries like China. And so it is very odd to say you can go to the grocery store and buy all the ingredients to make this meal, but you can't buy this one specific brand's ready-made meal.
Greg Otto
That's an interesting way to look at it.
Derek Johnson
Yeah. And again, I think it's the Trump administration figuring this stuff out in real time, partly because it's a fast-moving space. But we're going to have to figure out what parts of this tech we don't want to sell internationally and what parts we do, because another part of this is that the Trump administration wants other governments and other countries to use American AI. They want other countries to have Claude and OpenAI and Codex and things like that. They want that.
Greg Otto
Right. So that puts a very, very fine point on it, where we have spent this administration going, "American AI is the greatest AI, and you should use the American AI." Now, all of a sudden: no, no, no, not like that. Do other cool stuff. Don't do the hacking. The hacking is not welcome.
Derek Johnson
So they're trying to do a lot of things at once, and I think they're still figuring it out.
Greg Otto
Well, we're all just kind of figuring it out, aren't we? So I can't wait to see what fresh hell the next couple of weeks or months gives us. Derek, thanks for joining us and helping us figure out what exactly is going on here. Thanks. Now to my talk with Gal Abaz, the CTO and co-founder of Oligo Security. If you've been paying attention, the vulnerability disclosure mess that's been happening over the past month really centers on a researcher who goes by Nightmare Eclipse and the fight that they've been having with Microsoft Security. It really is a conversation that's rooted in the frayed trust that we see pop up from time to time between vulnerability researchers and those responsible for dealing with their disclosures. The fight itself doesn't really get into AI, but AI really factors in. There is so much going on, and it's moving so fast, that the 90-day vulnerability disclosure window is really being stress tested. We talked to Gal about his experience as both a vulnerability researcher and somebody who has had to deal with disclosures, and about how that trust can be fixed, if it can be fixed at all. Check it out.
All right. Joining us on this week's episode of Safe Mode is Gal Abaz, the co-founder and CTO of Oligo Security. Gal comes to us at a time when responsible disclosure is just a mess. Vulnerability disclosure is a mess right now due to an absolute tidal wave of bug reports, thanks to AI. Anybody who has been paying attention knows about the fighting going on between the research community and Microsoft, and I wanted to have Gal on to talk about it. I know this is something near and dear to your heart. So Gal, thanks for joining the program.
Gal Abaz
Thank you very much. It's a pleasure to be here. I'll mention that besides being the co-founder of Oligo, I also lead the research part of our company. Our team has found and disclosed some of the most talked-about vulnerabilities and campaigns, and we did exactly that in collaboration with the biggest organizations in the world. I've presented our research at Black Hat, and I come from that profession myself: I was a hands-on researcher for many years at Check Point. So I have dealt with this myself, both wearing the researcher hat and leading the communication efforts with the vendors, and I can share both aspects. But this is actually a point in time where everything is changing due to AI, AI-assisted vulnerability research, and expectations. So it's an exciting time to be here, and a scary time sometimes. But I would love to share more from my experience, now and before.
Greg Otto
Yeah, so I guess, look, we know things have changed, but I wonder just how you see things changing. It's just been so quick. I feel like this has really happened overnight, and I think that a lot of the problems that we are seeing stem from that. I mean, just take the 90-day disclosure timeline. That's been the industry standard for almost a decade, and overnight it's totally changed. If we think about 90 days right now, Mythos was not a thing in the industry or in the public's collective mind 90 days ago. So the world has changed in 90 days, let alone the disclosure timeline. So talk about how you adjust to the breakneck speed at which this particular avenue in the industry has changed.
Gal Abaz
Absolutely. So I think there are two different perspectives, right? First, you mentioned it yourself: what we all know as responsible disclosure, the traditional 90 days, was from when humans communicated with humans over bugs that humans found. But just to understand it from both perspectives, the vendor side and the researcher side: we're in an era where MITRE literally admits, hey, we cannot keep up with all of the reports. We're going to drop reports by design, because we are still humans behind the scenes who need to validate each and every thing that AI models are now reporting to us. Along the same lines, even Linus Torvalds, the creator of Linux, says it himself: even we, the Linux Foundation, cannot keep up with the amount of vulnerability reports that we get on the Linux kernel. So of course, I think everyone, both vendors and researchers, needs to understand that something has changed. And let's take the Microsoft example, just for us to learn from those two different perspectives. For Microsoft, as someone that wants to collaborate with the community, you want researchers to tell you about the vulnerabilities before the world knows about them, or of course before threat actors can utilize them. Threatening researchers with legal implications is of course something that can frighten the community, and it does the exact opposite of building the trust you need, where researchers will want to play along and disclose those bugs to the vendors as they should. But on the other hand, researchers need to understand that it's absolutely not okay to drop six zero-days. That is literally a sort of weapon, released when everyone is still exposed. There's a reason why it's called responsible disclosure.
I think both sides need to be responsible for that, because it's not about the vendor or the researcher. It's about us as a community using software as the infrastructure and the invisible backbone of everything we do. We cannot just drop zero-days on the world and break that trust between those parties. So that was the traditional way. Now, when we're talking about AI, I think 90 days, as you mentioned, is pretty insane to even measure against when we see a Mythos-like model find and exploit a new zero-day within minutes. So there has to be an adaptation to this new world. Of course I don't have the answer for everything, but I think as an industry we should be much more focused on stuff that is actually validated, that literally has an exploit or is even exploited in the wild. For those, we should give the vendor hours to days to respond. Instead of getting thousands of maybe, possibly, theoretical bugs, put the focus on the ones that are vicious, that can actually make a really big impact, and maybe shorten the timelines for that type of bug, just to address it from both ends, I think.
Greg Otto
So with that I'm wondering, as we think through this, what assumptions need to be totally reworked when it comes to responsible disclosure? Because I do feel like, as you said, exploit development right now has gone from maybe weeks to hours. And I'm wondering just how it changes everything, because it really does change everything. What are the assumptions that have been baked into the responsible disclosure timeline that now, with the way things are, need to just be thrown out so we can start from scratch?
Gal Abaz
Yeah, I think that is a great way to look at it: how we can rebuild everything now to adapt to our era. And I think there are a few big aspects I would mention, from both ends, since I've worn the vendor hat myself and I was the researcher. The first is moving from the cookie-cutter timeline that we all know to exploitability-based ones. I think researchers know when a bug they found is better described as a weapon than as a vulnerability. A vicious RCE, a remote code execution that can take over a machine, is a 10 out of 10. I think everyone can agree on what is severe enough. And in our world it's also much easier to prove it and show a working exploit, which adds another angle: not just the theoretical aspect of what the possible risk is, but a much better understood determination of what is a really vicious bug. I also think researchers must not abuse the superpower. In our era of AI-assisted research, the security research community has real leverage right now, and I think that leverage only stays meaningful if we use it responsibly and in the service of our broader ecosystem. We can't get to a place where a researcher just drops zero-days on Twitter and the entire industry gets impacted, just to generate buzz or get some fame. And I would mention something that I think can change a lot of the discussions that happen between researchers and vendors: the criteria. The more severe the bug is, the bigger the reward you can get. So I think vendors should publish more about what their criteria are and how they measure, and maybe be more open about what they really validate internally and what they actually care about.
Then both ends can be much more meaningful, focusing on stuff that actually matters to both the vendor side and the researcher side. Researchers want to find the biggest, most vicious bug and get the biggest rewards. But they keep hearing from the vendors, oh, it's not critical to us. And what a vendor may describe as a risk is a very different thing. Of course there's PR and communication as part of it, and there's "does that really affect me or not?" A lot of questions get into it that are not about the pure aspect of the bug itself, when they shouldn't. I think if organizations were much more willing and open to share their real criteria for what's important and critical to them and how they evaluate it, it could create a much more open conversation and a much better way to determine real criticality and risk.
Greg Otto
So yeah, let's dive into that a little bit more. In your opinion, who should own the decision to accelerate a disclosure timeline? Because, like we have been saying, I think the general 90-day assumption is going to go by the wayside. So when it comes to accelerating that timeline, where does that conversation lie? Does it lie with the researchers? Does it lie with the vendors? Does it lie with a third party that needs to be created to decide whether there needs to be a delay or more of a compressed timeline for public disclosure? Where do you think the responsibility lies there?
Gal Abaz
Yeah, so I think you mentioned all the parties involved. In a perfect world, on one hand, you want to delegate more power to the vendors, and that's why you have the CNA, the authority to issue your own CVE, exactly for that reason, so vendors don't need to wait for a third-party authority. But there might be disagreement on the score, and there might sometimes be a conflict of interest on the vendor side about sharing the whole truth. That's why there's a third party, like CISA, someone you can actually report to who validates it as a third party. But the problem is that CISA is also human beings behind the scenes, and that's why they also cannot keep up with everything. So I think we should maybe have different rules rewritten. We all know responsible disclosure is 90 days. We all know what the CVSS score is and what the criteria of a CVE are. But maybe we could all agree that, for example, for RCEs that can run code on a project with this many stars, which means this amount of exposure, there is no time for excuses, and whether it's the vendor or the other side, we give it a week or two weeks. And there have been cases like that before. I remember when we were at Check Point, one of the vicious bugs that we uncovered was a wormable RCE through a very famous Windows update service, something that on one hand can hack a machine, but can also propagate. So it was a very vicious bug. And I remember that Microsoft took it really, really seriously and fixed it within two weeks. And they wanted to do it, not because we asked them to. So I think there can be shared responsibility from both ends. The researchers know what a vulnerable bug is. This is their proficiency, this is what they do.
And on the other hand, the vendors also have different calls to action for different types of severity and impact. Again, we can all agree on what is, for sure, an "oh my God" moment when we both talk about the criteria of this new CVE.
Greg Otto
Do you think it's worth instituting some type of new framework for vulnerabilities? You brought up CVSS scores, and I know from previous coverage and from my own time in industry that people argue about CVSS scores all the time, and we see revisions. One day it might be a 6.5, some work is done, and two weeks later it's an 8.1, and behind the scenes people get agitated by that too. So there are arguments just about that score. I feel like with the compression that we're seeing with AI, it's only going to compound the problem. So I'm wondering, is it worth it to maybe discuss new frameworks, or consider separating out what AI can do from what humans can do? Only now, all of that stuff is being blended. These are the types of things that run through my head as I'm doing coverage. I imagine they run through your head too.
Gal Abaz
Absolutely. So I'll answer that in a couple of areas. First, yes, we should have different frameworks. And the frameworks that we do have are constantly changing. We had CVSS 1.0 and 2.0, now we have 3.0, and the criteria might change. So of course it is a framework that tends to be adapted to the industry. The whole goal of a CVE is for a vendor to know about the presence of a bug it should apply a fix to. So in that regard, adapting to a new world is absolutely a must, and maybe changing the CVSS score and adding more criteria is absolutely something we should all do. I'll mention that there are things across the industry that have really helped over the years to support that goal. For example, the whole CISA KEV catalog, which is not about whether the CVSS score is 10 or nine. It's about what is actually exploited in the wild. It can be a CVSS score of 6.5, but who cares, because attackers are abusing it. And it can be a 10 out of 10, but the focus is there because right now it's exploited in the wild. So when you use those threat feeds on top of the existing frameworks, it can help organizations and defenders better act on true risks. But I would mention that it's not always that easy to determine. I'll give some examples. When you give a CVSS score to a certain bug, you have different criteria: is it reachable from the network or not? Does it lead to a privilege escalation from a non-privileged user to a more privileged user? And sometimes there are different points that should be discussed. There's a critical bug, but only if it's reachable from the network, and then there can be a discussion.
So I'm not saying it's that easy to just determine and scale and automate all of that. But with AI now coming into play, the same way that we use AI to find vulnerabilities or to weaponize vulnerabilities, we can actually use AI in our favor. Someone needs to pay for it, right? We'll talk about the tokens problem. But let's say we could actually take a lot of the hard work that people are doing behind the scenes and have AI help validate and automate it. Is that reachable from the network or not?
Greg Otto
Is it?
Gal Abaz
It's something that AI can help answer. The same way it can help find and weaponize bugs, we can also use it for good, for the defenders' type of work. So I would mention that we have to think about every angle of it, and I would absolutely think about adding more. It's not, hey, let's leave what we did. It's, hey, let's adapt and add more criteria and more points to consider when we're discussing risk in this era.
Greg Otto
So earlier you said the security research community has all the leverage right now, and we've seen the blowback that Nightmare Eclipse has gotten. There's a lot of viciousness in the conversations that we are seeing around this. So I'm wondering, is the security research community pushing that lever a little bit too hard right now? And does there need to be a little bit of a rollback to community norms, and not just individual heroics, just due to the volume on its own? Because like you said, there is a human aspect to this. But on the flip side, there's a human aspect for the research community too. There's money on the line, and look, there's ego on the line. A lot of security researchers like to do this for the clout. So I'm wondering what needs to happen on the security research side to understand the power that they have here. It's almost like the Spider-Man cliche: with great power comes great responsibility. So do you think that there needs to be a little bit of pumping the brakes, with everybody stepping back to say, all right, guys, we do have a great opportunity here, let's not screw it up just because we're mad at Microsoft for their own vulnerability disclosure problems?
Gal Abaz
Yeah, absolutely. I would divide it in two. You have professional teams that do this for a living, like Check Point, where I used to work, or the Oligo research team, where it's more about, as you mentioned, the marketing angle and the story. And I think that's very different from individual people, for whom it might be their job, like bug hunters. As you mentioned, there are people problems and money problems, and there's money on the line. On that side, of course people will abuse it, let's be honest. But at least the well-funded professional security teams should not be the ones abusing it, as they have more power, more tokens, and more ability to do stuff. In an era when everyone is excited about finding a bug and doing another story, I think the ones that have the ability should be much more open and willing to collaborate with the vendor side and make sure that the industry is not impacted. If there's a way for you to collaborate, maybe give them a little bit more time, or help them fix it, and not pour more gasoline onto the fire, when right now there's a different situation that we should also pay attention to. As I mentioned, when Linus Torvalds says it to the world, you understand that a really different approach needs to be taken from both ends. So, sharing my own personal perspective and how I treat that: when I disclose a bug to a vendor, I tend to collaborate, and I want to help. We both need to be respectful to the actual users in the world that use those products. They are first and foremost the most important ones. We don't want to impact people, and that's the whole reason why we do this stuff.
We want to find the bad bugs before the bad guys find them and weaponize them. And that's why it should be like a conversation where both ends are willing to collaborate. And I think that's on the vendor side that of course you can burn much more tokens and prove to them and find more bugs. But it's not about that, right? It's not about proving that you can burn more tokens to prove your point. Just find a way to collaborate on it and maybe suggest help, or even just hop on a call, and don't just, you know, disclose it as an email, a thread that, you know, is much more cold and not human-involved. Again, we are still humans behind the scenes and we should understand we are working with humans. Right.
Greg Otto
I was going to ask a little bit from the vendor side of the conversation, what does meeting researchers halfway look like? Like you talk about the collaboration, but let's dive a little bit deeper. Like I don't know what the collaboration looks like during vulnerability disclosure. Is it just basically an email chain back and forth? Is it a "thank you for your submission, we'll get back to you in 14 days, six months or whenever"? And could that collaboration be better to the point where there's just more of it? Or is it just something different altogether where researchers, not so much that they can feel validated, but that they can actually show some respect for the work that is being done, even if it does come from something that is AI generated and something that does come down the line that turns out to be a low level bug or just a flat out hallucination, which we know can still happen.
Gal Abaz
Absolutely. And I'll mention from my perspective the way we do stuff, and we tend to disclose bugs to really big vendors, but also to very famous open source projects that might have few maintainers, right. So we always adjust what we do. And again, the general process that everyone should follow is like, first just disclose over email, like in an encrypted way, the PGP way, or like, you know, the preferred way that the vendor asks to disclose the bug, and have that initial report and finding. Because in the end you don't want to waste both sides' time, right? Or say like, hey, just for you to see, this is the report, we're serious, this is what we're thinking, this is the information. And once you get the, you know, the ack from the other side and you get that feedback back that they saw it and they acknowledged there is an issue. Not talking about like severity and stuff, just hey, there's an issue. We always reach out and say, hey, we would love to help, if you want to hop on a short call, we'll share with you our goals. They see there's no malicious intention behind it and we're not after the money. I think it just takes down a little bit of the stress, or like, you know, the walls that you usually have when someone is either, you know, trying to claim money or, you know, disclosing a bug, and it might sound scary, a problem that you have that someone found. So I think just hopping on a call, again, if they like, right, not to push it but just to suggest help or to share our perspective, hear their thoughts, maybe hear what they think, it just solves a lot of the problems from both ends. I think again, but only for the stuff that you really think is worth the time. I think from both ends it can fix a lot of just the email back and forth and people misreading, misinterpreting the other side's intention.
Greg Otto
So Gal, to wrap this up, you've done a good job showing that you've been on both the research side and the vendor side. So what's one thing that you would change right now if you could with the way that disclosure is happening? Because we are at a really big moment of change. So what is one thing that you would plant your flag and say, look, everybody, calm down, breathe, do this and life will be better for everybody?
Gal Abaz
Yeah, I think. And again, that's just my 2 cents and my personal take on it. And there's a lot of angles to actually fix that big problem. But I think something that can help everyone, both vendors and researchers, to like put the noise aside, is to have a better definition of what are those critical bugs that we should all, like, say, oh my God, versus, you know, what is just another disclosure, another bug that should be addressed. But like, we all should focus on what actually is exploitable, hitting us in the wild and making a real impact, and like, better frame, you know, the framing of RCEs or the framing of what you care about. Both as a vendor, the one that gets those reports, and as a, you know, as a researcher, what you should actually fight for, and make sure you're spending your time and the efforts on what's actually important to the other side. I think it's both ends giving back time on, like, those discussions on, again, what should scare us and keep us up at night versus like, yeah, it's another marketing stunt of another vendor. And it's just like, you know, this wolf, wolf scenario, like, crying wolf scenario, when you're like, just, you know, starting to lose trust, right? We want those reports, when someone says, hey, red alert, like, guys, go and fix it, it should be something real. So I think from both ends, that you get the real reward on that as a researcher, but also that we have a much better clarification about those critical bugs, and I think like adding exploitability and more color into stuff. What is exploited and can be exploited at mass scale. And that's where we should put the focus.
Greg Otto
Thanks for listening to Safe Mode, a weekly podcast on cyber security and digital privacy brought to you by CyberScoop. If you enjoyed this episode, please leave a rating and a review and share it with your friends, your co-workers, your CISOs, your sysadmins, your mom, your dad. Anybody that wants to know more about cyber security. To find out more information or to contact me, please look for all of our social media handles or visit cyberscoop.com. Thanks for listening. Check us out next week.
Date: June 18, 2026
Host: Greg Otto (Editor-in-Chief, CyberScoop)
Guests: Derek Johnson; Gal Abaz (Co-founder and CTO, Oligo Security)
This episode explores the rapidly shifting landscape of vulnerability disclosure, driven by AI-assisted bug finding and recent controversies involving major tech players like Microsoft and Anthropic. Greg Otto leads nuanced discussions with Derek Johnson and Gal Abaz on how traditional norms in responsible disclosure are being upended, why trust between security researchers and vendors is fraying, and what new frameworks and mindsets are required to keep both security and collaboration viable in the context of unprecedented technical change.
[00:39–13:35] Greg Otto & Derek Johnson
"When is a jailbreak not a jailbreak? ... The research from independent researchers managed to get the Fable 5 to spit up its system prompt, which is impressive, but it's not about accessing some of these higher-level responses."
"The techniques can be reproduced not just by other frontier models like OpenAI's Daybreak, but by older, existing models ... so it’s very odd to single out Fable for capabilities that are not unique."
"Export controls ... were on part of it because we were handing something over. Like when you are talking about an actual zero day ... I can take that, put that on a USB drive ... With Anthropic, I cannot give you a disk or a thumb drive and say, you now have this model. It is a cloud service."
"It's very really contradictory ... Maybe you could make an argument that the fully completed models on the frontier side are these fully packaged products ... But the Trump administration has lifted export restrictions on like AI chips and all other kinds of technology ... so it is very odd to say you can buy all the parts ... but you can't buy this one specific brand's ready made meal."
"American AI is the greatest AI and you should use the American AI now all of a sudden ... No, no, no, not like that. ... The hacking is not welcome."
[13:35–14:55] Greg Otto (Transition to Interview with Gal Abaz)
[15:09–39:27] Greg Otto & Gal Abaz
"MITRE literally says ... we cannot keep up with all of the reports. ... Even Linus Torvalds ... says the Linux Foundation cannot keep up with that amount of vulnerabilities."
"The world has changed in 90 days, let alone the disclosure of a timeline."
"If organizations would be much more willing and open to share their real criteria about what’s important and critical for them ... it can be a much better way to determine real criticality risk."
"Both the researchers know what is, you know, a vulnerable bag ... and on the other hand the vendors also have those like different calls of actions to different type of severities ... We all can agree like for sure, 'oh my God' moment when we talk about the criteria of this new CVE."
"Is it worth it to maybe discuss new frameworks or consider separating out what AI can do from what humans can do?"
"We should have different frameworks ... The whole goal of a CVE is for a vendor to know about the presence of a bug it should apply a fix to ... Adapting to a new world is absolutely a must. ... Maybe changing the CVSS score and adding more criteria is absolutely something we should all do."
"At least the well-funded professional security teams should not be the ones abusing it ... In an era when everyone is excited about finding a bug and doing another story, the ones that have the ability should be much more open and willing to collaborate."
"Hopping on a call ... just to suggest help or to share our perspective, hear their thoughts ... it solves a lot of the problem from both ends."
"Have a better definition of what is like those critical bugs that we should all say 'oh my God' versus what is just another disclosure ... focus on what actually is exploitable, hitting us in the wild, and making a real impact."
Derek Johnson [02:41]:
"When is a jailbreak not a jailbreak? ... The research from independent researchers managed to get the Fable 5 to spit up its system prompt, which is impressive, but it's not about accessing some of these higher-level responses."
Greg Otto [09:24]:
"Export controls ... were on part of it because we were handing something over. Like when you are talking about an actual zero day ... I can take that, put that on a USB drive ... With Anthropic, I cannot give you a disk or a thumb drive and say, you now have this model. It is a cloud service."
Gal Abaz [17:31]:
"MITRE literally says ... we cannot keep up with all of the reports. ... Even Linus Torvalds ... says the Linux Foundation cannot keep up with that amount of vulnerabilities."
Gal Abaz [21:17]:
"If organizations would be much more willing and open to share their real criteria about what’s important and critical for them ... it can be a much better way to determine real criticality risk."
Gal Abaz [27:57]:
"We should have different frameworks ... Adapting to a new world is absolutely a must."
Gal Abaz [37:42]:
"Have a better definition of what is like those critical bugs that we should all say 'oh my God' versus what is just another disclosure ... focus on what actually is exploitable, hitting us in the wild, and making a real impact."
The conversation is direct, practical, and at times anxious, reflecting both fascination and concern about a system close to overload. There is a repeated call for humility, collaboration, and adaptation, mixed with exasperation at slow-moving institutions and opportunistic behavior.