Dawn Song: When AI Becomes the Hacker and the Defender - Afternoon Cyber Tea with Ann Johnson

Summary

Afternoon Cyber Tea with Ann Johnson
Episode: Dawn Song – When AI Becomes the Hacker and the Defender
Date: February 17, 2026

Overview

In this episode, Ann Johnson speaks with Dr. Dawn Song, a leading researcher in AI security and privacy and Professor of Computer Science at UC Berkeley. The discussion centers on how rapidly advancing AI technologies are redefining cybersecurity—empowering both attackers and defenders—and what this dual-use dynamic means for enterprises, regulations, and the future of secure systems. Dawn shares insights from her pioneering research on adversarial AI, automation in bug detection and patching, privacy, data governance, and offers her vision for leveraging AI as a transformative tool in building provably secure systems.

Key Discussion Points & Insights

The Growing Importance of Communication & Trust in Cybersecurity

Storytelling and Transparency as Foundations (01:09–03:21)
- Ann sets the stage by noting cybersecurity’s shift beyond technicalities, emphasizing communication’s role in shaping risk understanding and trust.
- Dawn Song: “Communication is really important... especially given how fast AI has been advancing… We hope through this type of communication we can help raise awareness in the broad community to both understand the upcoming risks and also to work together to build resilience and trust.” (01:34)
- Example: A workshop on frontier AI in cybersecurity drew nearly 4,000 participants, highlighting the appetite for shared understanding.

AI’s Impact at Every Stage of the Cybersecurity Kill Chain

AI in Detection, Threat Intelligence, and Automation (03:21–07:21)
- Ann asks about near-term AI impacts in cyber defense.
- Dawn explains that AI now affects "many different stages" of both attacks and defenses.
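- Her recent work, CyberGym, is a benchmark built from close to 200 large, widely distributed open-source projects containing over 1,500 previously known vulnerabilities. It evaluates how well AI agents find known and zero-day vulnerabilities and generate proof-of-concept (PoC) inputs that trigger them.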
  - Improvements are rapid: “Even just last year … AI model capabilities weren’t quite there. However, fast forward for this year ... we are really seeing drastic increase.” (03:38)
  - Quantitative leaps in single-trial detection of known vulnerabilities: about 18% with Sonnet 4, 22% with GPT-5 in high thinking mode, and 28% with Sonnet 4.5. With a budget of 30 trials, Anthropic reported close to 67%.
- Quote: “Anthropic has shown that … 30 trials on these benchmarks … the agents were able to identify and generate PoCs for close to 67% of previously known vulnerabilities.” (07:13)
- Dawn’s team also saw AI uncovering and responsibly disclosing new, previously unknown vulnerabilities: “The agents actually were able to discover 35 zero days ... CVEs have been issued … developers have patched a number … after we reported.” (07:21)
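
The gap between 28% in a single trial and close to 67% in 30 trials can be read with a little arithmetic. The sketch below is illustrative only: the two percentages come from the episode, while the independence assumption and the two-group model (the names `reachable` and `p_reachable` are hypothetical) are assumptions made here, not how CyberGym or Anthropic compute their numbers.

```python
# Illustrative arithmetic only. The 28% and 67% figures are from the episode;
# both models below are assumptions, not the benchmark's methodology.

def success_within_k(p: float, k: int) -> float:
    """Probability of at least one success in k independent trials."""
    return 1.0 - (1.0 - p) ** k

single_trial = 0.28   # reported single-trial rate (Sonnet 4.5)
thirty_trials = 0.67  # reported rate with a 30-trial budget

# Assumption A: every vulnerability is equally hard and trials are independent.
uniform = success_within_k(single_trial, 30)

# Assumption B (hypothetical): only a fraction `reachable` of vulnerabilities
# can be found at all, each with the same per-trial probability.
reachable = thirty_trials
p_reachable = single_trial / reachable
two_group = reachable * success_within_k(p_reachable, 30)

print(f"uniform model, 30 trials:   {uniform:.4f}")
print(f"two-group model, 30 trials: {two_group:.4f}")
```

Under assumption A, 30 trials would find well over 99% of the known vulnerabilities, far more than the reported 67%. One plausible reading is that difficulty is uneven: some vulnerabilities are found quickly, and others stay out of reach no matter how many trials are run.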

Automation in Patching & the Dual-Use Dilemma

How Quickly Can Defenders Respond? (08:03–10:38)
- Ann probes for practical examples of patching automation.
- Dawn references her projects like BountyBench, demonstrating AI’s advances in generating patches, significantly reducing human effort.
- Dual-use dynamic: "There’s a key question, who will AI help more both in the near term and in the longer term? Attacker side or the defender side?... AI can help the attacker side more in the near term." (08:13)
  - Example: Patch deployment in hospitals averages nearly 500 days, giving attackers an upper hand even after a patch is available.
- Looking ahead: Dawn is optimistic AI can eventually help defenders more through automated program verification—eliminating vulnerability classes altogether.

Adversarial AI: Threats, Attacks, and Defensive Strategies

Adversarial Machine Learning (10:38–13:48)
- Ann highlights Dawn’s seminal work in this field.
- Dawn: Her group was among the first to show that “deep learning models ... are very vulnerable to these adversarial attacks ... attackers can just change the inputs ... and this can actually cause the model to give wrong answers.” (10:57)
- Artifacts from her group's early research are now part of the permanent collection at the London Science Museum.
- With the advent of agentic AI—models with privileges to act on data, make decisions, interact with sensitive enterprise resources—attackers can “construct malicious attacks ... jailbreaks, prompt injections, ... data poisoning … they can really cause the agents to misbehave and ... take wrong actions.... leak sensitive information ... even delete databases.” (12:50)
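
To make "change the inputs slightly and get a wrong answer" concrete, here is a minimal sketch on a toy linear classifier. It uses a sign-based perturbation in the spirit of the fast gradient sign method, a well-known technique from the literature; it is not presented as the specific attack from Dawn's work. The weights, bias, and input are made up for illustration.

```python
# Toy adversarial example against a linear classifier.
# All numbers are made up; this is a sketch, not a real attack on a real model.

def score(w, b, x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign(v):
    return (v > 0) - (v < 0)

def perturb(w, x, eps, push_down):
    """Move each feature by at most eps in the direction that shifts the score."""
    direction = -1 if push_down else 1
    return [xi + direction * eps * sign(wi) for wi, xi in zip(w, x)]

w = [1.0, -2.0, 0.5]
b = 0.1
x = [0.3, 0.1, 0.2]

clean = score(w, b, x)
x_adv = perturb(w, x, eps=0.1, push_down=clean > 0)
adv = score(w, b, x_adv)

print("clean label:", int(clean > 0))
print("adversarial label:", int(adv > 0))
```

No feature moves by more than 0.1, yet the predicted class flips. For a linear model the score changes by `eps` times the sum of absolute weights, which is why many small changes add up to a large effect.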

Data Hunger vs. Data Minimization: Privacy and AI

Risks of Memorization & Extraction (13:48–18:02)
- Ann discusses the tension between AI’s appetite for data and organizations’ push for minimization and privacy.
- Dawn’s research was among the first to show large language models are susceptible to memorizing sensitive data—“an attacker that doesn’t even know the details of the trained model ... by just querying ... can actually extract sensitive information in the training data.” (14:28)
- She details mitigations: differential privacy, synthetic data, reinforcement learning.
- Quote: "With high quality synthetic data ... it can help both increase the model utility and while mitigating the risks for data privacy." (16:50)
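
Differential privacy is easiest to see on a simple query rather than on model training. The sketch below shows the classic Laplace mechanism for a counting query; it is a swapped-in, simpler illustration of the same guarantee, not the differentially private training Dawn describes. The records, the threshold, and the helper names are hypothetical, and the code is not hardened for production use.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Counting query (sensitivity 1) released with Laplace noise of scale 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [34, 29, 41, 52, 38, 27, 45]  # made-up records
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of records with age >= 40: {noisy:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy, which is the utility trade-off mentioned in the episode.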

Regulation: The Push for Science- and Evidence-Based Policy

Fragmentation and the Need for Transparency (18:02–20:25)
- Ann prompts on the regulatory landscape.
- Dawn discusses her involvement with a proposal for “A Path for Science and Evidence Based AI Policy,” and emphasizes the value of empirical benchmarks in shaping regulatory approaches.
- Quote: "The AI policy needs to be grounded in science and evidence…. transparency is the first step." (19:47)
- Their work helped inform the California bill SB53, which aims to increase transparency in AI’s capabilities and risks.

AI as Cybersecurity Optimist: Towards Secure-by-Design Systems

Long-term Prospects: Provable Security (20:25–24:38)
- Ann closes with a call for optimism about AI’s future in security.
- Dawn: “I do think that AI itself actually can also help us a lot ... to help us to build more secure systems.” (20:46)
- She outlines a vision for AI-enabled formal verification—moving from today’s “cat and mouse game” of patching bugs to a world where AI helps generate, specify, and prove code is secure by design.
  - “We can now … do what we call verifiable code generation where we can actually generate code that also has provable security.” (23:02)
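
As a very small picture of what "code plus specification plus proof" means, here is a sketch in Lean 4. The function `clamp` and the theorem name are hypothetical and written by hand for illustration; nothing here is generated by AI or taken from Dawn's work.

```lean
-- Code: clamp a natural number into the range [lo, hi].
def clamp (lo hi x : Nat) : Nat := min hi (max lo x)

-- Specification and proof: the result never exceeds the upper bound.
theorem clamp_le_hi (lo hi x : Nat) : clamp lo hi x ≤ hi := by
  unfold clamp
  exact Nat.min_le_left hi (max lo x)
```

The proof checker accepts the theorem only if the property holds for every input, which is what distinguishes this approach from testing or from finding and patching bugs one at a time.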

Notable Quotes & Memorable Moments

On trust and resilience:
"We hope through this type of communication we can help raise awareness in the broad community to both understand the upcoming risks and also to work together to build resilience and trust."
— Dawn Song (01:34)
On AI outpacing past limits:
“Even just last year … AI model capabilities weren’t quite there. However, fast forward for this year ... we are really seeing drastic increase.”
— Dawn Song (03:38)
AI’s dual-edged nature:
“AI can help the attacker side more in the near term … deploying a patch even after it becomes available takes close to 500 days.”
— Dawn Song (08:13)
Adversarial AI stakes:
“As we adopt and deploy AI… the consequences of these attacks can be much more severe.”
— Dawn Song (13:48)
On science-led policy:
“The AI policy needs to be grounded in science and evidence…. transparency is the first step.”
— Dawn Song (19:47)
Long-term optimism:
“Instead of just doing this bug finding and patching this cat and mouse game, we can actually develop systems with provable guarantees, with formal verification … Ultimately, using this approach, AI can really help us to create provably secure systems and really help defenders to win over attackers in the longer term.”
— Dawn Song (23:14)

Key Segment Timestamps

Communication, Trust & Awareness: 01:09–03:21
AI Capabilities in Cyber Vulnerability Discovery: 03:21–07:21
Patch Automation & Dual-Use Risks: 08:03–10:38
Adversarial Machine Learning: 10:38–13:48
Data Privacy & Mitigations: 13:48–18:02
Regulatory Landscape & Science-led Policy: 18:02–20:25
Secure-by-Design & AI Optimism: 20:25–24:38

Tone & Style

The episode is conversational yet deeply technical, reflecting Ann Johnson’s expertise as an executive and Dawn Song’s leadership in academic and applied research. The speakers balance realism about risks with a focus on tangible, actionable solutions and a notable sense of optimism for AI’s constructive potential in cybersecurity.

Summary for New Listeners

This episode captures the exhilarating, unsettled frontier where AI’s growing powers are reshaping cybersecurity, for both attackers and defenders. Dr. Dawn Song translates cutting-edge research into insights for enterprise leaders: AI is already finding and even fixing vulnerabilities at scale, but its dual-use nature presently helps attackers more than defenders. However, with deliberate focus on automation, privacy-by-design, regulation built on scientific benchmarks, and the pursuit of verifiable security, AI holds the promise to one day tip the balance in favor of defenders. If you want to understand both the urgency and promise of AI in cybersecurity, this episode is essential listening.

Transcript

[00:02]
A
You're listening to the CyberWire Network, powered by N2K. Welcome to Afternoon Cyber Tea, where we explore the intersection of innovation and cybersecurity. I'm your host, Ann Johnson. From the front lines of digital defense to groundbreaking advancements shaping our digital future, we will bring you the latest insights, expert interviews and captivating stories to stay one step ahead. Today, I am joined by Dawn Song, professor of computer science at UC Berkeley and one of the world's leading researchers in AI security and privacy. Dawn has been at the forefront of studying both the opportunities and the risks of artificial intelligence, from adversarial machine learning to privacy and data governance. And she's also built her own businesses, founding companies and putting this research into practice. Welcome to Afternoon Cyber Tea, Dawn.
[01:08]
B
Thanks a lot for having me.
[01:10]
A
So you and I have both seen how cybersecurity is no longer just a technical conversation. It is about how people understand risk and ultimately how trust is built with employees, with customers, with the public. Communication really becomes key to bridge the technical and the human perception. When you look at cybersecurity through that lens, what role does storytelling and what role does transparency play in building resilience and building trust?
[01:35]
B
Yeah, this is a great question. I do think communication is really important, in particular in today's environment, especially given how fast AI has been advancing and how fast AI's capabilities, even in cybersecurity, have been increasing as well. So it's good that you asked. This also relates to some of our recent efforts. Some of our recent research has shown that AI capabilities in cybersecurity have been increasing really fast. So for example, our recent work, CyberGym, builds a benchmark of close to 200 large-scale, widely distributed open source software projects. And we use this benchmark to evaluate agents' capabilities in finding vulnerabilities, both finding previously known vulnerabilities and also finding new vulnerabilities, zero days, and generating PoCs, proofs of concept. These are inputs that actually can trigger the vulnerabilities. And our work has shown that AI capabilities in cybersecurity have been increasing really fast, and it has also been included in, for example, the system card for Anthropic's most recent model release, Sonnet 4.5, and so on. And given this huge change, we recently hosted a workshop focused in particular on frontier AI in cybersecurity, discussing how AI is changing the landscape of cybersecurity really quickly. And we are really hitting a pivotal moment. And we had close to 4,000 people join online. So this is an example where we hope that through this type of communication we can help raise awareness in the broad community to both understand the upcoming risks and also to work together to build resilience and trust.
[03:22]
A
That's great. And I think it shows that there's such an expansive use of AI that we could use actually in cybersecurity. I want to talk just about near term. Where do you think that AI is going to have the most impact on cyber defense near term? Is it detection, automation, threat intelligence?
[03:39]
B
I will say in the near term AI can have impact across many different stages. Recently we have been doing in-depth analysis and have written an overview paper, in particular looking at the impacts of frontier AI on the landscape of cybersecurity. And as we know, cybersecurity is a very complex space. For example, an end-to-end attack actually can take many stages, oftentimes called a kill chain, that has at least six or seven stages and so on. And similarly for defenses, there are many different types of defenses at different steps in the overall lifecycle of attacks and defenses. So to better understand the impact of frontier AI on the landscape of cybersecurity, we actually need to really understand how AI is currently changing each stage for both attacks and defenses. So in our overview paper we have done in-depth analysis for each of these stages and show that at different stages AI is helping attackers and defenders to different degrees. With that said, as I mentioned earlier, our recent work, CyberGym, has shown this: even just last year, if you had asked whether AI can actually find these vulnerabilities and generate proofs of concept, PoCs, in these really large-scale, widely distributed open source software projects, then I would say in general last year the AI model capabilities weren't quite there. However, fast forward to this year, given the increased reasoning capabilities and tool use for these models and agents, we are really seeing a drastic increase in AI capabilities in cybersecurity. So for example, for the CyberGym example that I mentioned, the benchmark contains close to 200 large-scale, widely distributed real-world open source software projects, and it contains over 1,500 previously known vulnerabilities. So we test the AI agents with two tasks. One is to identify previously known vulnerabilities and generate PoCs for them. 
And the other one is to identify previously unknown vulnerabilities, zero days, and also generate PoCs for them. Our work showed that even just in the late spring, early summer, with Sonnet 4, which back then was the best-performing model and agent on our benchmark, it was able to identify close to 18% of previously known vulnerabilities in the benchmark and generate PoCs for them. Fast forward to GPT-5 in high thinking mode, just a few months later, and it went up to 22%. And now with the most recent model release from Anthropic, Sonnet 4.5, this number went up to 28%. And these are just with a single trial for the agents on our benchmark. Anthropic has shown that when they increase the budget to run the agents for 30 trials on this benchmark, this number actually went up to close to 67%. So the agents were able to identify and generate PoCs for close to 67% of previously known vulnerabilities.
[07:13]
A
Wow, that's incredibly fascinating and I think that it really is a precursor to what we're going to see a lot of.
[07:21]
B
And with our work we also showed that the agents can even discover previously unknown vulnerabilities, zero days. So in our case the agents actually were able to discover 35 zero days in these large-scale, widely distributed open source software projects. And a number of CVEs have been issued for those since our discovery. And also developers have patched a number of these vulnerabilities as well after we reported them. So these are examples showing that AI capabilities in finding vulnerabilities have been increasing. And so of course defenders can use this technology to try to find more vulnerabilities in their code before attackers do.
[08:03]
A
So Dawn, as you're thinking about how AI is going to help us with cyber defense, I know you have some meaningful examples related to automation and patching. Can you tell us a little bit more about that?
[08:13]
B
Our other work, such as BountyBench, and also others' work have shown that AI capabilities in generating patches are also increasing very fast. And they actually can now generate fairly good patches in a lot of cases. And this can significantly reduce the human effort in developing patches as well. All these can help defenders to up their game. However, in our work we also try to analyze this: given that AI is a dual-use technology, it can help both the attacker side and the defender side. So there's a key question, who will AI help more, both in the near term and in the longer term? Attacker side or the defender side? Unfortunately, our recent analysis shows that in the near term, due to the natural asymmetry between the attack side and the defense side, AI can help the attacker side more. In particular, for example with bug finding and so on, we know that even when there are patches, actually deploying patches can take a really long time. There have been estimates showing that, for example, in hospitals, deploying a patch even after it becomes available takes close to 500 days. So all these can make it harder for defenders, and essentially AI in the near term can potentially benefit attackers more. And this also motivates our line of work, actually using AI to help build provably secure systems, where AI can help, for example, automate program verification to actually generate what we call provably secure code, so that it can eliminate entire classes of vulnerabilities. This way we think that AI can actually help defenders more in the longer term. So this is a long answer to your earlier question: I think AI is going to change the cybersecurity landscape in many different ways and also very fast. So it's very helpful for defenders to really understand, in particular, these dynamics with attackers, given that AI is a dual-use technology.
[10:38]
A
I completely agree. Some of your most cited work is in adversarial AI. Can you talk a little bit about what you're seeing today? Can you also talk about how enterprises could potentially protect themselves from manipulations, whether the manipulations are coming in from the model, or how we actually can secure AI itself?
[10:58]
B
Yes, that's a very good question. So my group has been among the earliest to study the area of adversarial machine learning, showing that these machine learning models, essentially all different types of deep learning models, architectures and so on, are very vulnerable to these adversarial attacks where, for example, attackers can just change the inputs to the machine learning model slightly and this can actually cause the model to give wrong answers. And in particular these inputs are maliciously crafted by attackers and so on. And actually some of our earlier work now is part of the permanent collection at the London Science Museum. So that's a rare privilege as a scientist, to have some of our work artifacts be part of the collection at a museum. So we have been working in this space for a decade and so on. But fast forward to now: these types of adversarial attacks, with today's large language models and frontier AI, are only making things worse, because frontier AI now is even more powerful. And also as we develop agentic AI, these AI agents can have very strong capabilities and also we give them a lot of privileges. For example, the AI agents may be able to read your emails and may take actions, interacting in the enterprise with, for example, backend databases and so on. Now agentic AI, given again the vulnerabilities in these machine learning models, also can significantly increase the attack surface. So given the combination of these factors, as attackers construct these malicious attacks through, for example, jailbreaks, prompt injections and all these different types of attacks, and also data poisoning and so on, they can really cause the agents to misbehave and take wrong actions. And these wrong actions can lead to very dangerous consequences. For example, leaking sensitive information in the enterprise, even sending proprietary information to the attacker, and also taking actions such as, for example, even deleting databases and so on. 
So yes, as we adopt and deploy AI, and in particular agentic AI, and also as the AI systems become more powerful and have more privileges, the consequences of these attacks can be much more severe.
[13:48]
A
That's really great content. Thank you very much. Let's talk a little bit about data and privacy. From two aspects. There's been this proliferation of data and AI is very hungry for more data. Yet enterprises really want data minimization. They don't want unfettered data going into the AI systems because they're worried not just about privacy, but also about security. So talk to me about the research that you have done when it comes to privacy, data security and how you're thinking the future looks for what privacy regulations, data regulations, et cetera, how that's going to impact the growth of AI.
[14:25]
B
Yeah, thank you. This is a very good question. Yes, in particular in AI applications, as we develop and deploy AI technologies and so on, data privacy is a very important aspect, and this actually happens at multiple levels, at different layers. So my group actually was among the very first to study the problem of memorization in large language models. Quite a few years back, our work was the first to show that these language models, as we train them, have very high capacity, so they can actually remember a lot of sensitive information in the training data, and an attacker that doesn't even know the details of the trained model, such as the model architecture and the parameters of the model, but is just querying the language model through the interface, can actually extract sensitive information in the training data. So in our work we show that through this type of attack, attackers can extract sensitive information such as credit card numbers, Social Security numbers and other PII in the training data. And this we demonstrated across different sizes of models and so on. So this is just an example illustrating that as we train machine learning models, it's very important to pay attention to and be careful about protecting users' data privacy. And especially in the enterprise, where data can contain a lot of proprietary information, as enterprises either train models or fine-tune models and so on, it's particularly important to pay attention to these types of issues. And hence it's also very important to develop better technologies to still be able to train good models, to develop strong AI capabilities, but at the same time to try to minimize and mitigate these types of issues. There have been many different approaches and solutions that have been proposed. This includes, for example, differential privacy, trying to actually train differentially private models. 
And this can happen at different stages as well, including pre-training, fine-tuning and so on, with different levels of impact on utility for the model. Another, actually, I think more promising approach is that one can also generate synthetic data to train the model, and also use reinforcement learning and so on. These new technologies have been shown to also be very helpful and effective in developing strong capabilities for AI models. And a lot of these approaches can significantly reduce the risk for data privacy. So for example, with high-quality synthetic data, it can help both increase the model utility while mitigating the risks for data privacy. Also, I think reinforcement learning, when done properly, can also be a great way to help the model gain strong capabilities and at the same time mitigate some of these data privacy risks.
[18:03]
A
Dawn, can you talk a little bit about the regulatory environment and how it's going to impact data and privacy and artificial intelligence?
[18:10]
B
Yes, this is a very good question. I would say there have been a lot of different activities and different approaches and different proposals in the regulation space. And so actually, together with a number of other leading AI researchers and scholars, we have put out a proposal called A Path for Science and Evidence Based AI Policy. And we have a recent Science paper that got published that summarizes the key points of this proposal. The proposal starts from the fact that there actually has been a lot of fragmentation in the space, and there have been different opinions in terms of what would be the best approach for regulation. Our proposal emphasizes that AI policy needs to be grounded in science and evidence and needs to be informed by science and evidence. So for example, some of the work that I mentioned earlier, such as CyberGym and so on, develops a benchmark to actually evaluate AI capabilities in cybersecurity and how they change over time. And this actually provides very important evidence and grounding for developing better policies to help society, for example, to build stronger resilience in the space. And this of course is just one example. And there have been a lot of ongoing efforts, for example with the EU AI Act and other efforts. So for example, our proposal on this path for science and evidence based AI policy has actually served as a basis for a recent policy in California state, SB53, which proposes that we need to develop better transparency so that society and also government can have a better understanding of the state of AI technologies and so on. So this is another example showing that we need to develop a science and evidence based approach for AI policy, and transparency is the first step.
[20:26]
A
So Dawn, we always close Afternoon Cyber Tea with a bit of optimism. I like to consider myself a cyber optimist. You've talked about how AI can be used to defend, how the threat actors are going to use AI, you've talked about regulation, you've talked about privacy. But what makes you really hopeful about the future of artificial intelligence and cybersecurity?
[20:47]
B
That's a great question. I do think that AI itself actually can also help us a lot in the longer term to build more secure systems. So as I mentioned a little bit earlier, in particular in some of our recent work we show, and we also hope this direction can be further developed, that using AI we can actually truly develop provably secure systems. So the issue with the previous approaches, for example bug finding and then applying patches, is that you are still playing the cat and mouse game. We all know that in large code bases, there are statistics, I think every few hundred lines of code there may be a bug and so on. So for a large code base there can be lots of bugs. And with this approach of finding bugs and patching them, you may never know whether you found all the bugs, and whether attackers will get lucky and find some bugs that the defenders didn't find and can still launch successful attacks. However, what we call the secure by design or safe by design approach is to actually build programs and systems that are provably secure. So in this way, instead of just doing this bug finding and patching, this cat and mouse game, we can actually develop systems with provable guarantees, with formal verification. And this can eliminate entire classes of vulnerabilities, and hence we are no longer in this cat and mouse game. So far this approach has been difficult to adopt in practice, especially given that it has been very labor intensive to actually do this formal verification. Oftentimes it can take tens of proof-engineer years to do formal verification, even for relatively small systems. However, I do strongly believe that AI can significantly help us in changing this. 
So my group has been among the earliest to use deep learning to help automate theorem proving, even six or seven years back, and our earlier work showed that deep learning can be helpful in automating certain steps in theorem proving. And now fast forward, with the increased capabilities of frontier AI, our recent paper also outlined a roadmap for how we can bring the community together to further develop the technologies and to advance frontier AI to perform more mathematical reasoning and automated theorem proving. We hope that using this approach we can not just generate code like today, where our work and others' work have shown that a lot of this AI-generated code can have a lot of vulnerabilities. Instead we would do what we call verifiable code generation, where we can actually generate code that also has provable security. AI can generate both code and specification and also prove that the generated code satisfies the desired security properties and the specification. Ultimately, using this approach, AI can really help us to create provably secure systems and really help defenders to win over attackers in the longer term. So this is just one example of why I think AI can be a truly transformative technology to help us build provably secure systems and build truly resilient systems.
[24:38]
A
Great. Dawn, I want to thank you so much for joining me today. Your work has really shaped AI and I know based on the depth of the conversation our listeners are going to get a lot from it. Thank you so much.
[24:49]
B
Thanks a lot for having me. Really great talking together.
[24:52]
A
Yeah, thank you, and many thanks to our audience for tuning in. Join us next time on Afternoon Cyber Tea. I invited Dawn Song to Afternoon Cyber Tea because she is an expert on AI and security. The depth and breadth of information that Dawn shared is really useful. It was definitely an education for me and I know our listeners will gain a lot of value from this episode.