AWS Podcast #753: Amazon Bedrock Mantle and Developing at the Speed of AI
Release Date: January 26, 2026
Host: Simon Elisha
Guest: Joe Magerramov (VP & Distinguished Engineer, AWS)
Episode Overview
This episode is a deep dive into Amazon Bedrock Mantle, AWS's new inference engine, and an exploration of how AI tooling is transforming software engineering velocity, team workflows, and the mindset required for AI-driven development. Simon Elisha interviews Joe Magerramov about his two decades at Amazon, the scaling challenges behind Mantle, and concrete lessons learned from taking a truly "AI-first" approach to building production systems.
Key Discussion Points & Insights
Joe’s Two Decades at Amazon (00:23-03:46)
- Joe reflects on 20 years at Amazon, split between Amazon.com’s core retail backend (shipping, payments, marketplace) and AWS cloud infrastructure (VPC, load balancers, containers, serverless, Lambda, ECS, EKS).
- Joe’s recent focus is on Bedrock and leading the development of Mantle.
- Quote:
“Reliability and scale is at the heart of a lot of what we do at Amazon... There's this constant tension between engineering for scale versus engineering for reliability.”
— Joe (02:23)
What is Mantle? (04:22-07:21)
- Mantle is the new inference engine powering Amazon Bedrock, AWS’s managed service for foundation model inference.
- The key insight: serving model inference at scale is fundamentally a massive scheduling problem (prioritization, fairness, placement, resource efficiency) rather than a typical web service.
- Mantle was built to maximize customer experience (latency, features) and AWS’s own hardware efficiency.
- Quote:
“At the heart, inference is not quite a web service, but more of a scheduling system.”
— Joe (05:10)
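Joe's framing of inference as a scheduling problem (prioritization, fairness, placement) can be made concrete with a toy sketch. Everything below, including the class name and the (priority, per-tenant usage, arrival order) ranking, is illustrative only and is not Mantle's actual design:

```python
import heapq
import itertools

# Toy illustration of "inference as a scheduling problem": requests are queued
# and dispatched by (priority, per-tenant usage so far, arrival order), so a
# busy tenant cannot starve others at the same priority level.

class ToyScheduler:
    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: FIFO within a key
        self._usage = {}                   # tokens served per tenant

    def submit(self, tenant, priority, tokens):
        # Lower priority number = more urgent; tenants that have already
        # consumed more capacity sort later at equal priority (fairness).
        key = (priority, self._usage.get(tenant, 0), next(self._arrival))
        heapq.heappush(self._heap, (key, tenant, tokens))

    def dispatch(self):
        # Pop the best-ranked request and charge its cost to the tenant.
        if not self._heap:
            return None
        _, tenant, tokens = heapq.heappop(self._heap)
        self._usage[tenant] = self._usage.get(tenant, 0) + tokens
        return tenant, tokens

sched = ToyScheduler()
sched.submit("tenant-a", priority=1, tokens=500)
sched.submit("tenant-b", priority=1, tokens=100)
sched.submit("tenant-a", priority=0, tokens=50)
print(sched.dispatch())  # ('tenant-a', 50): the priority-0 request wins
```

A real inference scheduler must also handle placement across accelerators and preemption, but the core point survives even in this sketch: the interesting decisions are about ordering and fairness, not request/response handling.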
AI-First Development Approach (08:39-14:44)
- Joe describes his own journey using LLMs for coding: from “gimmick” status two years ago to producing near-production-quality code six months ago.
- Within Mantle, the rule is that every line of code must ultimately be attributed to and reviewed by a human. AI is treated as a tool (like a compiler or a language), not as an autonomous agent.
- Quote:
“At the end of the day, any line of code committed... has a human name attached to it and the human is ultimately responsible for the quality.”
— Joe (10:45)
- Joe describes his iterative cycle: prompt the LLM, review the solution, adjust, and sometimes rewrite or refine, much as one would with a junior engineer.
The Art and Science of Prompting (14:44-19:20)
- The team’s velocity hinged on skillful prompting: finding the “maximum supportable request” (not too big, not too small) and reducing ambiguity (“the better you prompt, the better the outcome”).
- Joe experiments with LLMs for brainstorming, using them not just as a code-writing tool but as a collaborative problem-solving partner:
“It’s super common for me to actually brainstorm a problem with the model first before we even start implementing.” (17:18)
Human-in-the-Loop & Accountability (20:12-22:34)
- AI is treated as a colleague: the human must still judge, decide, and understand the code. LLMs can surface alternative solutions, but the engineer remains accountable for validating new or unfamiliar approaches.
- Quote:
“The clarity of thought, clarity of what you want to do, is what provides the acceleration... that’s true even more so now.”
— Joe (21:20)
Managing Context Windows & Workflow (22:34-26:01)
- LLMs are limited by their input “context window,” so Joe’s workflow breaks projects into modules, resets context between steps, and captures design intent in durable files.
- This mirrors standard engineering decomposition but is now essential due to the constraints of LLM tooling.
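The decompose-and-reset workflow can be mimicked mechanically. Below is a minimal sketch that packs design sections into prompts that each fit a context budget; the 4-characters-per-token heuristic, the section contents, and the budget are all assumptions for illustration, not anything from the episode:

```python
# Minimal sketch of working within a fixed context window: estimate token
# cost with a rough 4-characters-per-token heuristic (an assumption, not a
# real tokenizer), and greedily split a large spec into chunks that each
# fit the budget, mirroring "reset context between steps".

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic, not a real tokenizer

def split_into_prompts(sections: list[str], budget_tokens: int) -> list[list[str]]:
    """Greedily pack design sections into prompts that fit the context budget."""
    prompts, current, used = [], [], 0
    for section in sections:
        cost = estimate_tokens(section)
        if current and used + cost > budget_tokens:
            prompts.append(current)      # budget exceeded: start a fresh context
            current, used = [], 0
        current.append(section)
        used += cost
    if current:
        prompts.append(current)
    return prompts

# Hypothetical spec sections of very different sizes.
sections = ["intro " * 100, "api design " * 300, "error handling " * 300]
batches = split_into_prompts(sections, budget_tokens=1000)
print(len(batches))  # prints 2
```

The durable part of the workflow, capturing design intent in files that survive a context reset, is exactly what lets each fresh prompt start from the same shared ground truth.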
10x+ Developer Velocity & Metrics (26:01-29:32)
- Joe’s team observed a documented ~10x increase in commit velocity after adopting AI (see the commit graph in their blog post).
- The team works more asynchronously: prompts can be submitted, then engineers do other tasks.
- Quote:
“I prompted the model to write some code [while we’re talking]... I’m literally writing code as we speak.”
— Joe (28:43)
- This lets engineers with heavy meeting loads fit in productive coding asynchronously, even overnight.
Testing, Quality, and Bug Management (29:32-36:27)
- Velocity brings risk: even if LLM-generated code has a lower bug rate, more code means more bugs overall.
- The team invests heavily in automating and accelerating their build and test cycles to avoid “tool down” moments when a bug disrupts everyone.
- Roughly 25% of the team’s energy goes into process and tooling to maintain quality at speed.
- Quote:
“At this rate of change, you need to have a way to curtail chaos or else it just explodes–exceeds humans’ ability to reason with.”
— Joe (33:30)
Communication & Collaboration Patterns (36:27-42:15)
- As team size grows, communication complexity scales. The Mantle team is intentionally co-located for high-throughput, high-fidelity interactions:
“Remote is not going to work... we all need to be sitting together just because the speed... is just going to be really hard for everybody not to be on the same page.” (37:40)
- They favor direct desk-side discussions over scheduled meetings.
- Meeting attendance is strictly rationed (participate only if you’re a sponsor, decision-maker, or directly involved) to maximize engineering time.
The Changing Nature of Engineering Work (45:32-49:38)
- With asynchronous LLM prompting, engineers fill cycles with whiteboard sessions, breaks, or starting additional prompts—sometimes running multiple LLM sessions in parallel, though this stretches human memory/context.
- Anticipate huge shifts in work/task management tooling as more developers juggle several LLMs and task threads at once.
- Quote:
“We have our own context windows that overflow and we need to actively manage it.”
— Joe (48:12)
- More innovation is expected in this area as LLMs grow more capable and the ways humans interact with them evolve.
Surprises & Takeaways from AI-Driven Development (49:43-52:38)
- The biggest surprise: asynchronous development is possible and highly productive.
- Well-functioning teams benefit from AI development much more than poorly functioning ones—the fundamentals (test harnesses, clear communication, tight feedback cycles) matter even more than before.
- Quote:
“AI-driven development is going to magnify well-functioning teams much more than... not so well-functioning teams.”
— Joe (52:38)
Advice for Teams Embarking on AI-First Engineering (53:00-54:47)
- "Just start building." Learning how AI can benefit your domain requires experimentation and hands-on practice.
- It’s crucial to have a positive, barrier-smashing mindset:
“What do I need to change about my systems, my approaches, my software development, to make [AI] work?”
— Joe (53:51)
- Persistence, flexibility, and a willingness to rethink ingrained habits are mandatory.
Notable Quotes
- On Human Responsibility:
“At the end of the day, any line of code committed into the repository has a human name attached to it and the human is ultimately responsible for the quality.” (10:45)
- On Prompting:
“You build up that intuition of what is that maximally supportable request, and it changes over time.” (16:37)
- On Team Fundamentals:
“AI-driven development is going to magnify well-functioning teams much more than... not so well-functioning teams.” (52:38)
- On Mindset:
“It pays to lean in... The thing that made Mantle team successful was the fact that we all believed it was possible. We knocked down the barriers.” (53:51)
- On Workflow:
“All of a sudden, I don’t actually need to be present—even paying attention—for 80% of the cycle of making a software change.” (50:28)
Highlight Timestamps
- 00:23: Joe’s background & AWS tenure
- 04:51: What is Mantle and why is inference a scheduling problem?
- 08:39: Journey to AI-first coding, human-in-the-loop
- 14:44: Prompting best practices and impact
- 22:34: Context window management for LLMs
- 26:25: Real data: 10x velocity & asynchronous workflow
- 29:32: Bugs, testing, and scaling quality at speed
- 36:59: Communication bottlenecks & team structure
- 45:32: What engineers do during LLM downtime, task management
- 49:43: Most surprising aspects of changing the workflow
- 53:00: Advice to listeners: experiment, adapt, be persistent
Final Takeaway
This episode provides a candid, detailed view into building state-of-the-art, AI-assisted cloud services. The Mantle team’s experience demonstrates that leveraging LLMs for development is not just viable but transformative—when paired with the right engineering foundations, accountability, and agile team practices. The high velocity achieved isn’t about relinquishing control to AI, but about thoughtfully retooling how humans and machines collaborate, keeping humans “in the loop” for quality and responsibility, and relentlessly refining surrounding processes and tools.
Summary by AWS Podcast Summarizer
