Practical AI Podcast
Episode: "AI at the Edge is a different operating environment"
Date: March 25, 2026
Hosts: Daniel Whitenack & Chris Benson
Guest: Brandon Shibley (Edge AI Solutions Engineering lead, Edge Impulse/Qualcomm)
Episode Overview
This episode dives deep into the current state and unique challenges of deploying artificial intelligence at the edge—in devices and environments outside the traditional cloud or data center. The hosts and guest explore the evolving definition of "the edge," advances in small and efficient AI models, the increasing practicality of edge computing, toolsets and hardware for developers, real-world use case considerations, and future directions. Brandon Shibley shares specialized insight as a leader in edge AI, blending technical depth with pragmatic advice.
Key Discussion Points & Insights
1. Defining AI at the Edge (02:27)
- "The edge" broadly includes anything not in the cloud. It encompasses “far edge,” “near edge,” “edge of network,” and any device or system embedded near where real-world data is sensed or acted upon ([02:27], Brandon).
- Past and present definitions vary, but the key is proximity to the source of data and distance from centralized computation.
2. Trends and Shifts in Edge AI Models (05:45)
- Explosion in both directions:
- Models are getting bigger in the cloud, and smaller (but more capable) at the edge.
- Small language models (SLMs), often in the range of single-digit to tens of billions of parameters, can now run on smart edge appliances with powerful modern NPUs or GPUs ([05:45], Brandon).
- Specialization at the edge: Smaller models are most effective when fine-tuned for domain-specific tasks rather than general knowledge ([07:40], Brandon).
- Model composition: Increasingly, edge solutions use ensembles or cascades of specialized models, balancing capability and resource efficiency.
3. Unique Operating Constraints of the Edge (09:17)
- Constraints:
- Size, power, cost, and (un)reliable connectivity are acute challenges ([09:17], Brandon).
- Privacy is both a challenge and an opportunity: keeping sensitive data close to its origin prevents exposure to cloud or Internet ([10:30], Brandon).
- Latency and deterministic performance are critical for applications like robotics or manufacturing.
Quote:
"These constraints are what we have to live and die by at the edge. Size, power, connectivity... we're also dealing with cost constraints."
— Brandon ([09:17])
4. Physical AI vs. Edge AI (12:21)
- Physical AI often refers to systems that not only sense/predict but also take action in the physical world (e.g., robotics, vehicles).
- Overlap is large: Physical AI is a subset of edge AI but brings in the "actuation" element ([12:21], Brandon).
5. Latency and Real-Time Demands (14:21)
- Response time ("real-time") must fit the task:
- Microseconds in manufacturing or driving
- Milliseconds in robotics
- Seconds for chat interfaces
- Deciding where models run depends on the necessary latency, available compute, and communications realities ([14:21], Brandon).
6. Model Cascades and Pipelines (17:26)
- Edge solutions often use multi-stage pipelines:
- Lightweight, efficient models (like object detectors, e.g., YOLO) run first, filtering out most data.
- More complex or heavy models (like VLMs or LLMs) engage only when something interesting is detected, saving power and compute.
- Example: Object detector filters frames; only frames with objects pass to a vision-language model for deeper analysis. This pattern applies to audio and sensor data, too ([17:26], Brandon).
Quote:
"What we'll do in many cases is have this pipeline or cascade where on the front end is some kind of very initial detection that can be done very efficiently... and then when we see an object that looks of interest... we can do much deeper or more dynamic analysis."
— Brandon ([17:26])
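The cascade pattern Brandon describes can be sketched in a few lines. This is a minimal illustration with stubbed-in models, not Edge Impulse code: `run_detector` and `run_vlm` are hypothetical placeholders standing in for a lightweight object detector and a heavyweight vision-language model.

```python
# Minimal sketch of a detect-then-analyze cascade.
# run_detector / run_vlm are hypothetical stubs, not real model APIs:
# in practice the first stage would be e.g. a YOLO-class detector and
# the second a VLM, each behind its own inference runtime.

def run_detector(frame):
    """Cheap first-stage model: flags frames with anything of interest."""
    return bool(frame.get("objects"))

def run_vlm(frame):
    """Expensive second-stage model: invoked only on flagged frames."""
    return f"analyzed {len(frame['objects'])} object(s)"

def cascade(frames):
    """Run the cheap gate on every frame; run the heavy model rarely."""
    results = []
    heavy_calls = 0
    for frame in frames:
        if run_detector(frame):           # runs on 100% of frames, cheaply
            results.append(run_vlm(frame))  # runs only when gated in
            heavy_calls += 1
    return results, heavy_calls
```

The power savings come from the ratio: if only a small fraction of frames contain objects of interest, the heavy model's duty cycle (and energy draw) shrinks proportionally.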
7. Advances in Tooling and Frameworks (21:36)
- State of tooling: Edge Impulse is highlighted as an abstraction layer that lets developers work with diverse hardware, managing data, model training, optimization, and deployment from one place.
- Fragmentation at the edge (unlike cloud, where Nvidia dominates) requires more sophisticated tools to support many hardware targets.
- Portability and optimization are essential—Edge Impulse offers target-aware conversion for different chips ([21:36], Brandon).
8. Agency, Autonomy, and MLOps at the Edge (24:22)
- Agency at the edge: More systems act and plan, not just infer ([24:22], Chris).
- MLOps & model management: Edge deployments require not only runtime efficiency but solutions for data acquisition, drift adaptation, model updates (often over intermittent connectivity), and control of distributed devices ([24:22]–[26:54], Brandon).
- Over-the-air updates and centralized aggregation are best where possible for governance.
9. Progress in Small/Medium Models & Their Effectiveness (28:13)
- Techniques like knowledge distillation and fine-tuning extract specialized knowledge into smaller, more efficient models tailored for edge tasks ([29:38], Brandon).
- TinyML and ultra-low-power microcontrollers continue to make specialized ML possible even for wearable devices.
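The distillation technique mentioned above can be made concrete with the standard soft-target loss (the temperature-scaled KL divergence from Hinton et al.'s distillation formulation). This is a generic sketch for illustration, not anything specific to the episode or to Edge Impulse's tooling:

```python
# Sketch of the knowledge-distillation soft-target loss: the student is
# trained to match the teacher's temperature-softened output distribution.
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2
    so gradients keep a consistent magnitude as T varies."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return T * T * kl
```

In practice this term is combined with an ordinary cross-entropy loss on the true labels, and the small student model inherits much of the large teacher's task-specific behavior at a fraction of the parameter count.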
10. Hardware Evolution and Vertically Integrated Approaches (31:44)
- Edge Impulse's opinionated approach: Abstraction + target-optimized deployment supports edge diversity and leverages Qualcomm NPUs for efficiency ([32:45], Brandon).
- Hardware advances: Specialized processors (NPUs, DSPs, ISPs) dramatically improve operations/watt, enabling richer on-device intelligence ([36:46], Brandon).
Quote:
"What we've previously been able to do, we can just keep building on... Once you have AI in the tool chest, it kind of just broadens the perspective of, like, what could the world be like if we put intelligence right where the data's at?"
— Brandon ([36:46])
11. Accessible Getting Started Advice for Developers (40:09)
- Start with problems you care about: Home automation, custom sensors, pet feeders, etc.
- Low-cost maker hardware is widely available (e.g., Arduino). Free tools like Edge Impulse help rapidly prototype real-world edge AI projects.
- Enterprises often begin with maker setups before scaling to robust hardware ([40:09], Brandon).
Quote:
"It's amazing that so many of these things are readily achievable with commodity like maker hardware that's out there. That's a great place to start."
— Brandon ([40:09])
12. Looking Ahead: The Edge’s Future (43:35)
- Vision: As power, compute, and cost constraints shrink, intelligence could be embedded everywhere—much like biological intelligence is co-located with sensors in living organisms ([43:35], Brandon).
- Anticipate more robotics, intelligent action models, and “edge-native” insight/actions transforming both mundane and extraordinary facets of life.
Quote:
"What if power and cost and compute, they basically kind of go to almost zero... it means that we could put intelligence literally anywhere right at the edge... What I see is we're going to continue bringing models to the edge, more of them."
— Brandon ([43:35])
Notable Quotes & Memorable Moments
- On edge constraints:
“These constraints are what we have to live and die by at the edge. Size, power, connectivity... cost constraints.” — Brandon ([09:17])
- On real-time needs:
“The application really drives home the requirement. It all comes down to what is the requirement for the type of behavior we're trying to get out of the system.” — Brandon ([14:21])
- On cascades of models:
“If you were using a large language model... and running it continuously on every frame that came through, it's a very quick way to burn through a lot of power.” — Brandon ([17:26])
- On developer accessibility:
“It's amazing that so many of these things are readily achievable with commodity like maker hardware that's out there. That's a great place to start.” — Brandon ([40:09])
- On the edge’s future:
“What could the world be like if we put intelligence right where the data's at?” — Brandon ([36:46])
Timestamps of Key Segments
- Intro & State of Edge AI: 00:41–04:38
- Edge Model Trends (SLMs, LLMs): 05:45–08:30
- Edge Constraints & Opportunities: 09:17–11:51
- Physical AI vs Edge AI: 11:51–13:21
- Latency & Location of AI: 13:21–15:57
- Model Cascades/Pipelines: 17:26–21:36
- Tooling/Frameworks: 21:36–24:22
- Agency, MLOps at the Edge: 24:22–28:13
- Small Model Techniques: 28:13–31:44
- Edge Impulse & Hardware: 31:44–36:46
- Battery Power & Capability: 36:46–39:01
- Getting Started in Edge AI: 40:09–42:16
- Looking Ahead/Future Vision: 43:35–45:40
Final Reflections
The episode showcases how edge computing is now poised to deliver on the promise of real-world AI: making systems smarter, faster, and more private by operating close to where data originates and actions are taken. With maturing hardware, advanced tooling, and developer accessibility, the barrier to entry is lower than ever—enabling both enthusiasts and enterprises to innovate “at the edge.” The future points toward intelligence everywhere—not just in the cloud, but ambiently embedded across the physical world.
For further exploration or to get started, check out EdgeImpulse.com and experiment with affordable hardware like Arduino, as encouraged by the guest.
