Podcast Summary: Redis and AI Agent Memory with Andrew Brookins
Podcast: Software Engineering Daily
Host: Sean Falconer
Guest: Andrew Brookins, Principal Applied AI Engineer at Redis
Date: August 26, 2025
Overview
In this episode, Sean Falconer speaks with Andrew Brookins, Principal Applied AI Engineer at Redis, about the complexities of building agentic AI systems. The conversation focuses on engineering memory systems for AI agents, Redis’s expanding role in the agentic stack, the distinctions between different memory types, the evolution of search strategies such as hybrid search, and the pieces still missing from today’s infrastructure for truly autonomous AI agents.
Key Discussion Points & Insights
1. Why AI Agent Memory Is a Hard Problem
- Statelessness and Limited Context: LLMs are stateless and have limited context windows, making it challenging to maintain continuity and reliability across sequential interactions.
- Quote:
“LLMs don’t model state transitions like that for environments. That’s where they tend to break down.” – Andrew (02:24)
- From Chat to Agency: Demonstrations and proofs of concept tend to mask these weaknesses, but productionizing agents that act in, and predict changes to, real environments exposes the limits of traditional LLMs.
- Mapping Problems: Building effective agents requires deep understanding of which tasks are suited for agentic approaches and what extra engineering (especially around prediction and state) is required.
2. The Anatomy of Memory in AI Agents
- Layers of Memory:
- Message History (Short-term/Working Memory): Storing recent conversational exchanges.
- Summarization: Condensing those histories to fit context windows; beyond storage, involves task-specific engineering.
- Long-Term & Reference Memory: Extracting significant facts over time to persist and retrieve for future interactions.
- Metaphor:
“A cognitive system interacting with people all day… their brain in a background thread picks things out that are important.” – Andrew (06:26)
- Difference Between Knowledge Base and Long-Term Memory:
- Knowledge base = developer-supplied facts; long-term memory = runtime-learned facts by the agent.
- Quote:
“Retrieval looks probably similar… but long-term memory is stuff the agent learned at runtime, while RAG or knowledge base is what developers inject.” – Andrew (09:03)
- Durable State & Workflow: True agentic applications require not just LLM memory, but tracking environment state, tool calls, event-driven logic, and durable execution, sometimes requiring checkpointing and dynamic workflows.
- Quote:
“We're really talking about durable execution at some level; dynamic workflow that... is going to die in the middle of something and have to restart.” – Andrew (10:24)
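The layering described above (a short-term message buffer that gets condensed into a running summary once it outgrows the context budget, freeing important facts to be promoted to long-term memory) can be sketched in a few lines. This is an illustrative in-memory Python sketch, not Redis or episode code; in a real agent, the string-joining `_condense` step would be an LLM summarization call running in a background thread.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class WorkingMemory:
    """Short-term message buffer with a summarization threshold."""
    max_messages: int = 6          # context budget before we condense
    summary: str = ""              # running summary of older turns
    messages: deque = field(default_factory=deque)

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))
        if len(self.messages) > self.max_messages:
            self._condense()

    def _condense(self) -> None:
        # Pop the oldest half of the buffer and fold it into the summary.
        # A real agent would call an LLM to summarize these turns instead.
        older = [self.messages.popleft() for _ in range(len(self.messages) // 2)]
        condensed = "; ".join(f"{r}: {t}" for r, t in older)
        self.summary = (self.summary + " | " + condensed).strip(" |")

    def context(self) -> str:
        """Assemble prompt context: summary of old turns first, then recent turns."""
        recent = "\n".join(f"{r}: {t}" for r, t in self.messages)
        return f"[summary] {self.summary}\n{recent}" if self.summary else recent

wm = WorkingMemory(max_messages=4)
for i in range(6):
    wm.add("user", f"message {i}")
print(wm.context())
```

The buffer never exceeds the budget, and nothing is lost outright: older turns survive in compressed form, which is the trade the summarization layer makes to fit a fixed context window.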
3. Redis in the Agent Memory Stack
- Filling Multiple Roles: Redis is used for:
- Fast storage/retrieval of working memory (message history, context blobs)
- Vector search and indexing for both knowledge base and long-term memory (fast retrieval)
- Streams for managing workflow and orchestrating agent state
- Quote:
“Redis is absolutely a great fit [for working memory]. It’s super fast… and as a vector DB, it can serve retrieval too.” – Andrew (12:16)
- Recent Innovations:
- Addition of a query engine for dynamic queries and vector indexing (Redis 8).
- New vector sets data structure for more native vector operations.
- Semantic Caching: Caching based on similarity, not just keys, to avoid redundant LLM calls.
- Quote:
“Semantic caching is caching based not on a deterministic pattern, like a key… instead on similarity.” – Andrew (18:19)
- Integration and Ecosystem:
- Redis Vector Library (Redis VL) provides drop-in components for agent frameworks (LangGraph, LangChain).
- Move toward plug-and-play memory providers in frameworks, though complete standardization is difficult due to nuanced database behaviors.
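The similarity-based lookup behind semantic caching can be illustrated without any Redis machinery. The sketch below is a minimal in-process version: `toy_embed` is a stand-in fixed-vocabulary embedding (a real system would use an embedding model plus Redis vector search), and the 0.7 threshold is an arbitrary illustrative choice, not a recommended value.

```python
import math

# Toy fixed vocabulary; a real cache would embed with a model instead.
VOCAB = ["what", "is", "the", "capital", "of", "france", "germany"]

def toy_embed(text: str) -> list:
    """Bag-of-words vector over a fixed vocabulary (illustrative only)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Cache keyed by embedding similarity rather than exact string match."""
    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached answer)

    def put(self, query: str, answer: str) -> None:
        self.entries.append((toy_embed(query), answer))

    def get(self, query: str):
        qv = toy_embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]   # similar enough: reuse the cached answer
        return None          # cache miss: caller falls through to the LLM

cache = SemanticCache(threshold=0.7)
cache.put("what is the capital of France", "Paris")
print(cache.get("the capital of France"))  # → Paris (near-duplicate phrasing hits)
```

A differently worded but semantically close query reuses the stored answer, which is exactly the redundant-LLM-call savings the episode describes; an unrelated query falls below the threshold and misses.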
4. Engineering and Modeling Memory
- Schema and Recency: Storing structured data (timestamps, metadata) is crucial to supporting episodic (time-bound) and semantic (general fact) memory.
- Example: Storing both “likes apples” (semantic) and “preferred Cosmic Crisp this week” (episodic).
- Prioritization in Context Construction: Recent information often overrides older long-term memory, but prioritization logic is still an open engineering challenge.
- Quote:
“I feel like the answer is still out there… I’m just trying to screw with the prompt until I get better retrieval many times.” – Andrew (30:44)
- Hybrid Search: Combining vector (semantic) and keyword (exact) search, and choosing between them based on task—e.g., code navigation (semantic) vs. variable renaming (keyword).
- Quote:
“It often tends to be task-specific… the demands on search are different.” – Andrew (34:34)
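Andrew’s point about storing structured metadata alongside memory text can be made concrete with a small record type. The field names (`kind`, `referenced_at`) and the ranking rule below are illustrative assumptions, not a Redis or RedisVL schema; they simply show how timestamps enable recency-ordered retrieval and how the episodic/semantic split from the apples example might be modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class MemoryRecord:
    text: str
    kind: str                 # "semantic" (general fact) or "episodic" (time-bound)
    created_at: datetime      # when the memory was extracted
    referenced_at: datetime   # the time the user was talking about

def build_context(memories, limit=3):
    """Rank memories most recently referenced first; episodic wins ties."""
    ranked = sorted(
        memories,
        key=lambda m: (m.referenced_at, m.kind == "episodic"),
        reverse=True,
    )
    return [m.text for m in ranked[:limit]]

now = datetime(2025, 8, 26)
memories = [
    MemoryRecord("likes apples", "semantic",
                 now - timedelta(days=90), now - timedelta(days=90)),
    MemoryRecord("preferred Cosmic Crisp this week", "episodic", now, now),
]
print(build_context(memories))  # → ['preferred Cosmic Crisp this week', 'likes apples']
```

The recency-first ordering encodes one answer to the prioritization question above (newer episodic detail outranks the older general fact), but as Andrew notes, the right ranking logic is still an open engineering problem.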
5. “World Models” as the Next Frontier
- Limitation of Context & Retrieval: Agents can memorize and retrieve facts, but without the ability to predict environment state changes, they fall short of autonomy.
- Quote:
“Agents need to predict how the environment will change… agents that are playing text games, you can do a lot of context engineering, but they just don’t improve that much.” – Andrew (44:15)
- The Need for World Models: Inspired by DeepMind’s research, agents must be able to model and predict the consequences of their actions—true environment state transitions—beyond “just” managing memory.
Notable Quotes & Memorable Moments
- On the challenge of agent memory:
“It's actually quite difficult at a certain point when you have enough messages to try to figure out what exactly is relevant to the incoming question or the input.” – Andrew (05:32)
- On durable execution:
“When you realize… you’re really talking about durable execution at some level… like checkpointing, right? … it’s been around for a while.” – Andrew (10:24)
- On open standards for “memory”:
“Open standard for memory… doesn’t always work because all databases have slightly different traits… especially things like filtering the vector search.” – Andrew (20:55)
- On hybrid search merits:
“Vector search is great… when the input is less specific, you want to find clusters of related things… But when I tell the agent to rename a variable, I just need keyword search and exact matches.” – Andrew (34:34)
- On schema and recency:
“You definitely do want to store more than just the text… The time that the user referenced in the memory… Then you could order information by the times that users were talking about.” – Andrew (27:58)
- On the missing piece in AI agents:
“General agents need world models…to predict how an environment will change. That’s fundamentally not about predicting the language that will represent that afterward.” – Andrew (44:15)
Important Timestamps
- Statelessness & Prediction Gaps: 02:01–03:53
- Defining Agent Memory Layers: 04:26–09:48
- Durable State and Workflows: 09:48–11:47
- Redis for Working, Long-Term, and Knowledge Memory: 12:16–16:50
- Semantic Caching: 18:16
- Plug-in Memory Frameworks, Standards: 20:35–23:06
- Modeling Memory Types and Recency: 24:27–29:51
- Prioritizing Short- vs. Long-Term Memory for Context: 29:51–31:54
- Hybrid and Task-Specific Search: 32:53–37:46
- World Models and Agentic Limits: 44:15–47:04
Final Thoughts
Andrew and Sean’s conversation paints a vivid picture of the cutting-edge engineering challenges at the intersection of AI agents, memory, and data systems like Redis. Redis is no longer just a cache; it sits at the heart of next-gen agentic architectures, powering everything from fast working memory to hybrid vector-structured search to workflow orchestration. The path to robust, autonomous agents will require not just better memory systems and smarter context engineering, but crucial advances in world-modeling—the ability to understand and predict the consequences of actions in dynamic environments.
“That sounds like a project that absolutely is going to fail. And I am all in because that’s what's exciting. If I don’t see the path yet to that working really well, I’m very excited to find it.”
– Andrew (47:41)
