AI + a16z Podcast Summary
Episode: Durable Execution and the Infrastructure Powering AI Agents
Date: February 19, 2026
Guests: Samar Abbas (CEO of Temporal), Sarah Wang (a16z), Raghu Raghuram (a16z)
Theme:
Exploring the pivotal role of durable execution infrastructure, specifically Temporal, in powering modern AI agents and long-running, stateful applications. Samar Abbas explains how the emergence of agentic AI systems is elevating the need for reliable, recoverable backend orchestration, drawing from deep experience at Uber and beyond.
Episode Overview
This episode dives into the infrastructure needs for the new generation of AI agents—applications that must handle long-running, complex workflows reliably, even amid failures or distributed chaos. The conversation centers on the development and impact of Temporal (an evolution of Uber’s Cadence project), its role in the future of autonomous agentic applications, and its implications for scalability, context management, and productivity. The discussion also touches on the evolution of software engineering, market shifts, and building resilient companies.
Key Discussion Points & Insights
1. Defining Durable Execution (Temporal) and Its Importance
- Durable execution refers to infrastructure ensuring that, in the event of failure, application state is preserved and computations are resumed without developer intervention.
- Samar Abbas emphasizes abstraction: “We completely abstract out state management for you as a developer… You just code up your business logic and we are the execution authority of making sure every order gets processed exactly once in the presence of all sorts of chaos and failures in the system.” [00:00-01:50]
- Kitchen/restaurant analogy: managing orders through chaos is likened to how Temporal handles multiple services and failures.
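To make the idea concrete, here is a minimal Python sketch of durable execution as described above: each completed step is recorded in a journal, so a rerun after a crash skips finished work and resumes where it stopped. This is a toy illustration, not Temporal's API; `Journal`, `run_step`, `process_order`, and the order steps are hypothetical names invented for this example.

```python
import json
import os
import tempfile


class Journal:
    """Append-only log of completed step results, persisted to disk (toy model)."""

    def __init__(self, path):
        self.path = path
        self.results = {}
        # On startup, reload every step that completed before a crash.
        if os.path.exists(path):
            with open(path) as f:
                for line in f:
                    entry = json.loads(line)
                    self.results[entry["step"]] = entry["result"]

    def run_step(self, name, fn):
        # Replay: a step that already completed is not executed again.
        if name in self.results:
            return self.results[name]
        result = fn()
        with open(self.path, "a") as f:
            f.write(json.dumps({"step": name, "result": result}) + "\n")
        self.results[name] = result
        return result


calls = []  # records the side effects that really happened


def charge_card():
    calls.append("charge")
    return "charged"


def send_receipt(crash):
    if crash:
        raise RuntimeError("simulated worker crash")
    calls.append("receipt")
    return "sent"


def process_order(journal, crash=False):
    # Business logic reads as plain sequential code; the journal handles state.
    payment = journal.run_step("charge", charge_card)
    receipt = journal.run_step("receipt", lambda: send_receipt(crash))
    return payment, receipt


path = os.path.join(tempfile.mkdtemp(), "order-1.journal")

# First attempt: the card is charged, then the worker "crashes".
try:
    process_order(Journal(path), crash=True)
except RuntimeError:
    pass

# Second attempt: a fresh process reloads the journal and resumes.
outcome = process_order(Journal(path))
```

After the second attempt, `calls` is `["charge", "receipt"]`: the card was charged once even though the order logic ran twice. A crash between running a step and writing its journal entry would still repeat that step in this sketch, which is why real systems pair this pattern with idempotent steps.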
2. Origins in Uber and Evolution to Temporal
- Uber’s migration to microservices led to complex state management problems across hundreds of microservices (more services than engineers).
- Initial use cases included tipping flows and the Uber loyalty program—scenarios needing long-running, recoverable workflows.
- Samar: “Building stateful applications required orchestrating calls across dozens of those microservices with variable availability… State management became a big mess.” [05:18-06:13]
- Temporal (and its predecessor, Cadence) allowed for reliable, piecemeal adoption, powering everything from tipping retries to loyalty programs.
3. Why Durable Execution Matters for Modern AI Agents
- As AI agents become longer-lived and more expensive to restart (in both cost and time), guaranteed execution is now “mission critical.” [00:55-01:51]
- Raghu: “Recoverability and long running transactions seem to be two things that are very attractive to your customers.” [07:48-08:07]
- Managing failures, auditing, and seamless rollback are essential: “We could reset the workflow, go back in time…and suddenly you recovered all of the corrupted workflow.” – Samar [09:36-10:09]
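The "reset the workflow, go back in time" idea can be sketched with a toy event-sourced model in Python: state is derived by replaying a history of events, so recovery means truncating the history before the bad event and continuing with fixed logic. This is an assumption-laden illustration, not Temporal's actual reset mechanism; the event shapes and the `replay` and `reset` helpers are hypothetical.

```python
def apply(balance, event):
    """Fold one event into the running balance."""
    kind, amount = event
    return balance + amount if kind == "credit" else balance - amount


def replay(history):
    """Rebuild current state purely from the recorded events."""
    balance = 0
    for event in history:
        balance = apply(balance, event)
    return balance


def reset(history, to_index):
    """Return a copy of the history truncated to the first `to_index` events."""
    return list(history[:to_index])


# The third event came from a (hypothetical) buggy deploy.
history = [("credit", 100), ("debit", 30), ("debit", 500)]
corrupted = replay(history)

# Go back to just before the bad event, then redo the step with fixed code.
recovered_history = reset(history, 2)
recovered_history.append(("debit", 5))
recovered = replay(recovered_history)
```

Here `corrupted` is `-430` and `recovered` is `65`. The original history is left untouched, which is also what makes the run auditable after the fact.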
4. Scaling to Internet and Agent Scale
- Temporal now underpins critical operations at massive companies: OpenAI’s Codex, Snap Stories, Coinbase, and Yum Brands (KFC, Pizza Hut, Taco Bell).
- Temporal delivers a “five nines” (99.999%) operational SLA and multi-region business continuity features.
- “We actually already have a cloud system which can handle spikes of 150k actions per second on a moment’s notice… Snap scale is peanuts now.” – Samar [31:20-32:27]
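For a sense of what "five nines" means in practice, the short Python calculation below converts an availability percentage into allowed downtime per year. It assumes a 365-day year and is plain arithmetic, not a statement of how Temporal measures its SLA; the function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = allowed_downtime_minutes(99.999)   # roughly 5.26 minutes
three_nines = allowed_downtime_minutes(99.9)    # roughly 525.6 minutes
```

In other words, 99.999% availability leaves a budget of a little over five minutes of downtime per year, versus almost nine hours at 99.9%.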
5. Temporal’s Relevance in the Agentic AI Era
- The platform shift: AI agent apps automate complex work and are becoming increasingly autonomous and asynchronous.
- Developers (including non-traditional ones) can now create specialized apps rapidly; explosion in app volume is expected.
- “We are in the MS-DOS era of agents right now.… We will run out of what things you can do in a single sandbox, so there will be a whole swarm of those agents.” – Samar [18:19-19:47]
6. Patterns in the Agent Infrastructure Landscape
- Emergent best practices: use of sandboxes (to guard against dangerous tool access), prompt evaluation, robust observability, specialized sub-agents.
- “Observability… has always been a problem… but these agents are going to push the boundaries out in observability at a completely different scale.” – Samar [41:50-42:56]
7. Critical Role of Durable RPC and Industry Standards
- Need for durable RPC to orchestrate swarms of specialized agents—current standards (like MCP) still lack asynchronous support.
- Project Nexus: Temporal’s initiative toward an industry-wide protocol for durable, asynchronous tool/agent invocation.
- “How do you stitch together the swarm of agents… to manage state across that? I feel we have a project called Nexus, where we’re trying to drive an industry-wide standard.” – Samar [50:37-51:17]
8. Context Management for Agents
- Massive challenge in feeding agents with timely, reliable, multi-source context.
- Temporal is increasingly used for “retrievals”—pipelines connecting APIs, Slack, Google Docs, etc. to feed LLM context in real time.
- “Traditional data orchestration solutions are not powering context engineering… the context is coming from such a broad class of sources… everyone has their own reliability characteristics.… Temporal is the orchestrator.” – Samar [46:34-48:07]
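The point about sources with different reliability characteristics can be illustrated with a small Python sketch: each source gets its own retry budget, and the pipeline reports which context it could not obtain instead of failing outright. Everything here is hypothetical (the `FlakySource` class, the source names, the budgets), and a real orchestrator would add backoff, timeouts, and persistence between attempts.

```python
def with_retries(fetch, max_attempts):
    """Call fetch() up to max_attempts times, re-raising the last error."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch()
        except ConnectionError as err:
            last_error = err
    raise last_error


class FlakySource:
    """Stand-in for an external system that fails a fixed number of times."""

    def __init__(self, name, failures_before_success):
        self.name = name
        self.remaining_failures = failures_before_success

    def fetch(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError(f"{self.name} unavailable")
        return f"context from {self.name}"


def gather_context(sources, retry_budget):
    """Collect context from every source, tracking the ones that gave up."""
    context, missing = [], []
    for source in sources:
        try:
            context.append(with_retries(source.fetch, retry_budget[source.name]))
        except ConnectionError:
            missing.append(source.name)
    return context, missing


sources = [FlakySource("docs", 0), FlakySource("chat", 2), FlakySource("crm", 5)]
budget = {"docs": 1, "chat": 3, "crm": 3}
context, missing = gather_context(sources, budget)
```

With these settings, `context` holds the results from `docs` and `chat`, while `missing` is `["crm"]`, so the agent can proceed with partial context and know exactly what is absent.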
9. Business Model & Future of SaaS
- Despite “SaaS is dead” sentiment, Samar argues more value will concentrate in API-driven business logic and orchestration.
- “If you are providing the right valuable business outcomes through APIs, your business is going to skyrocket.… The world is evolving quickly—by the time you take a bet on [agentic] platforms, within six months they might disappear.” – Samar [37:31-39:13]
10. Lessons in Company Building & Leadership
- Unpacking the CEO/CTO switch at Temporal; importance of aligning leadership roles with current company challenges (strategy vs. execution).
- “Majority of the problem was not that we lacked product strategy or where we are… Majority of the problem is execution risk as the company grew bigger.” – Samar [56:12-57:19]
11. Advice for Startups in the AI Infrastructure Era
- Samar champions customer focus, resilience, and clarity on product value, treating capital as a lever to amplify impact rather than as a cushion.
- “The thing which has not changed for us is focus on solving customer needs and having the clarity on what value you are bringing to the customer… That doesn’t change, up market or down.” – Samar [60:13-61:25]
Notable Quotes & Memorable Moments
- On Durable Execution:
“We completely abstract out state management for you as a developer … you just code up your business logic and we guarantee all of that state or each and every order gets processed.”
– Samar Abbas [01:27, 03:31]
- On Exemplary Use Case:
“Uber’s loyalty program… that entire system was running on top of Cadence … running a workflow for every Uber rider, forever.”
– Samar Abbas [08:08]
- Agentic Shift Analogy:
“We are in the MS-DOS era of agents right now.… It’s very natural … there will be a whole swarm of those agents.”
– Samar Abbas [18:19]
- On Observability:
“Everything in your system is auditable … you get so much visibility into your business transactions now just because of that nature.”
– Samar Abbas [33:02]
- On Scaling:
“Last year, I was talking to Max and said, Max looks like we have overengineered on scale… Then, of course, the whole AI agentic wave happened—now you’re at agent scale and Snap scale is peanuts.”
– Samar Abbas [31:51-32:27]
- On the API Era:
“A lot of value is going to move to APIs… If you are providing the right valuable business outcomes through APIs, your business is going to skyrocket.”
– Samar Abbas [37:31]
- Leadership Transition:
“Majority of the problem is execution risk as the company grew bigger… we are in a phase where the company needs a different flavor of leadership for the next phase.”
– Samar Abbas [56:12-57:19]
- Advice to Founders:
“Focus on solving customer needs and having the clarity on what value you are bringing to the customer… That doesn’t change, up market or down.”
– Samar Abbas [60:13-61:25]
Key Timestamps for Important Segments
- Durable Execution Explained: [01:51-03:31]
- Uber Origins & Scaling Example: [05:18-09:22]
- Incident: Workflow Recovery and Event Sourcing: [09:35-10:09]
- Temporal’s Operational Guarantees (Five Nines, Multi-region): [11:02-12:45]
- Role in Modern AI Agents: [13:33-17:30]
- Patterns in Agent Infrastructure Stack: [40:56-43:03]
- Durable RPC & Industry Standards (Nexus): [50:14-52:44]
- Context Management for Agents: [46:34-49:42]
- Reflecting on SaaS & Value in APIs: [35:45-39:13]
- Leadership Journey at Temporal (CEO/CTO switch): [55:08-57:40]
- Startup Lessons and Fundraising Advice: [59:38-61:25]
Flow and Tone
The conversation flowed as an open, technical, and anecdote-rich dialogue, balancing deep dives into distributed system design with startup wisdom and a sense of market urgency. Samar provided vivid analogies and practical insights, while Sarah and Raghu asked pointed, high-level questions to connect infrastructure stakes to the broader arc of AI and software industry evolution.
Takeaways for Non-Listeners
- Durable execution is fast becoming essential for complex, long-running AI agents, not just traditional business workflows.
- The infrastructure beneath AI is rapidly evolving; scalable, reliable orchestration like Temporal is at the heart of the agentic revolution.
- Real-world incidents (like those described at Uber) showcase not just reliability but also recoverability and auditability, both crucial for business impact.
- The agentic stack is still in flux: best practices are emerging around state management, observability, sandboxes, and context engineering, but no standards exist yet.
- API-driven orchestration and value extraction are likely to outlast short-lived “agentic” platform trends.
- Founders should prioritize resilient execution and solving genuine customer problems—regardless of funding environments.
