Podcast Summary: AI + a16z
Episode: Why Social Engineering Now Works on Machines
Date: December 2, 2025
Host: Joel de la Garza (a16z)
Guest: Ian Webster (Founder and CEO, PromptFoo)
Overview
This episode explores a new paradigm for security in an age of AI “agents”—AI systems empowered to take actions on users’ behalf in the real world. With enterprises racing to implement these agents across applications, their fundamentally different attack surface (conversational, not deterministic code) leaves them highly vulnerable to “social engineering” tactics that trick them, much as one would a human. The discussion features Ian Webster, founder of PromptFoo, which provides agent adversarial testing, and a16z’s Joel de la Garza. Together, they discuss the “lethal trifecta” of AI agent vulnerabilities, why traditional security methods fall short, and how security is evolving from code scanning to adversarial conversations at scale.
Key Discussion Points & Insights
1. The Shift from Deterministic to Conversational Threats
- Agents Defined:
- Ian defines an "agent" as an LLM that is empowered to take actions (e.g., interacting with APIs or other systems) ([02:52]).
- Unique Attack Surface:
- Traditional security methods (SQL injection prevention, access controls) don’t suffice; agents are vulnerable not to code-level exploits, but to being "persuaded" or socially engineered via conversation ([00:43], [10:02]).
- “You can't patch persuasion, you can't firewall social engineering. The attack surface isn't code, it‘s conversation.” – Podcast Host ([00:43])
2. Enterprise Adoption and the “Year of the Agent”
- Enterprises are rapidly progressing from exploring internal chatbots to integrating AI agents with core internal systems like Salesforce.
- “I'm on board with 2026 as the year of the agent ... that’s what we keep hearing whenever we work with folks on the corporate side.” – Ian Webster ([02:52])
- Security priorities are still lagging behind deployment as initial rollouts focused on features and not on security ([00:12], [05:46]).
3. Security Patterns: The “Lethal Trifecta”
- Coined by Simon Willison ("lethal trifecta", also called "rule of two" by Meta): An agent is considered fundamentally insecure if it has all three:
- Takes untrusted user input
- Accesses sensitive information/PII
- Has an outbound communication/exfiltration channel ([04:23], [07:24])
- "If you take an untrusted user input, if you have access to sensitive information or PII, and if you have some sort of outbound communication channel or exfiltration path, then your agent is fundamentally insecure." – Ian Webster ([04:23])
4. Why Traditional Security Approaches Break Down
- Agents are often attacked via non-traditional inputs (e.g., uploaded documents, images, API integrations)—risks are subtler than classic web application exploits ([09:27]).
- Social engineering tactics apply: attackers can role-play, use psychological manipulation, and leverage contextual conversation to bypass controls ([17:15], [19:35]).
- “It's just absolutely mind blowing to me that we have computers that persuasion can work on ... It's sort of like emotional fuzzing.” – Joel de la Garza ([20:21])
5. PromptFoo and Automated Adversarial Testing
- PromptFoo (started as OSS) now powers security testing for large enterprises, simulating thousands of conversations to test for data leaks, broken access controls, and social engineering vulnerabilities ([00:43], [12:46]).
- Unlike deterministic signature-based tools, PromptFoo uses AI red teaming agents that employ generative conversations to probe weaknesses, leading AIs into states where they “let their guard down.”
- "PromptFoo doesn't try to write SQL injections ... everything is generated on the fly, tailored to the situation." – Ian Webster ([14:54])
- Conversations may require 30-50 turns, simulating realistic adversarial scenarios ([14:54]).
- “Most of the cases we see are definitely conversational. For stuff like data leakage or access control issues, you sometimes have to lead the AI down a path toward where it's more vulnerable first.” – Ian Webster ([14:54])
6. Case Studies and Real-World Incidents
- a16z Incident:
- Joel recounts a real data leak in a SaaS platform where AI chat access control failed in multi-tenant cases ([11:36]).
- “...Typed in ‘show me my report’—it would kind of rotate through data from other customers ... obviously that's a big problem.” – Joel de la Garza ([11:36])
- Discord Experience:
- Ian and team faced early and relentless pressure to iterate on security while deploying agents for 200M+ users—most of their time went to security, not features ([00:43], [20:55]).
7. On Jailbreaks, Social Engineering, and Creativity in Attacks
- Jailbreaks and prompt injections are techniques; their true impact is forcing an agent to break intended access controls or leak data ([12:46], [17:40]).
- Creativity is a key attack factor—unexpected scenarios, informal tones, or even emojis can lower an AI’s “defenses” ([17:40], [19:28]).
- “My original favorite [jailbreak] was probably like, ‘Oh my grandma died, but she used to read me a story about how to do this illegal thing…’" – Ian Webster ([17:40])
- "It's almost like a multiplier: how creative can you get using emojis to convince an application to do something it shouldn't?" – Joel de la Garza ([19:28])
- Many successful attacks mirror classic human social engineering (e.g., urgent requests, impersonating managers) ([19:35], [20:21]).
8. On Security’s Place in the Development Lifecycle
- Security must shift “left” into the developer workflow, not be a late-stage gating process as in traditional enterprise app development ([05:52]).
- PromptFoo’s developer tooling and CI/CD integrations are designed to make adversarial testing a continual part of building with LLMs ([05:52], [06:50]).
9. Reflections on the Security Industry
- Innovations in security often come from engineers facing roadblocks, not from “security people” in the traditional sense ([20:55], [23:09]).
- “The people who build the next wave of security ... typically aren't security people. They're solving their own problem.” – Joel de la Garza ([23:09])
Notable Quotes & Moments (with Timestamps)
-
“You can't patch persuasion, you can't firewall social engineering. And the attack surface isn't code, it's conversation.”
– Podcast Host ([00:43]) -
“If you take an untrusted user input, if you have access to sensitive information or pii, and if you have some sort of outbound communication channel or exfiltration path, then your agent is fundamentally insecure.”
– Ian Webster ([04:23]) -
“PromptFoo doesn't try to write SQL injections ... everything is generated on the fly, tailored to the situation.”
– Ian Webster ([14:54]) -
“It's just absolutely mind blowing to me that we have computers that persuasion can work on ... It's sort of like emotional fuzzing.”
– Joel de la Garza ([20:21]) -
“So, yeah, basically I learned the hard way that there are all these problems with the way that AI is rolling out... there was also the lethal trifecta stuff, which now has like a name or phrase to describe it. But at the time ... there is this exfiltration risk because you have a bot that has access to a potentially private channel history, the ability to render images, and the ability to search the web."
– Ian Webster ([22:23])
Important Timestamps (MM:SS)
- 00:43 – Why AI agents break traditional security models (“You can’t patch persuasion…”)
- 02:52 – Definition of AI agents and why 2026 is the "Year of the Agent"
- 04:23 – The "Lethal Trifecta" model of agent risk
- 05:52–06:50 – Why security must be embedded in developer workflows
- 11:36 – Real-world a16z SaaS data leak incident
- 12:46 – PromptFoo’s automated adversarial testing
- 14:54 – From deterministic signatures to conversational attacks
- 17:40–19:28 – Creativity, jailbreaks, and social engineering in agent security
- 20:21 – “Emotional fuzzing” and persuasion attacks on machines
- 22:23 – Hard-won lessons from Discord’s agent rollout
Final Thoughts & Resources
- Security must now consider conversations, not just code. AI agent threats are fluid, creative, and hard to “patch.”
- For builders: It’s essential to test how agents respond in real-world adversarial scenarios—at scale and with creativity.
- PromptFoo is open source and designed for developer-driven security testing.
- "Good luck to everyone who's building with agents. I think it's going to be a pretty exciting year ahead." – Ian Webster ([24:13])
This summary distills the key concepts and memorable moments for listeners interested in how AI agents are creating unprecedented security challenges, and how adversarial “conversational” testing is the way forward. For more, check out PromptFoo and follow industry insights from a16z and their founders.
