
Hosted by Scott Hanselman · EN

Scott talks with Aran Khanna, co-founder and CEO of Archera, about a new category of cloud financial tooling: "Insured Commitments." Instead of locking into 1- or 3-year reserved instance contracts and hoping your usage matches, Archera offers commitments as short as 30 days. They get into the economics of cloud purchasing, how AI workloads are changing capacity planning, and what FinOps looks like in 2026. http://archera.ai

Scott talks with Skyla Loomis, General Manager of IBM Z Software, about the ongoing relevance of mainframes in 2026. They discuss the enduring power of mainframes, how generative AI is transforming COBOL modernization, and why enterprise infrastructure still runs on IBM Z. Skyla shares insights on developer experience, compliance challenges, and the misconceptions about mainframe technology in a cloud-native world.

Greg Hinkle, co-founder of Nimbalyst and former VP of Software Engineering at Salesforce, joins Scott to discuss the future of AI-assisted development. They explore the challenges of managing multiple AI coding agents, finding flow state in an agentic world, and why visual workspaces matter. Greg shares Nimbalyst's opinionated approach to integrating tools like Excalidraw, task management, and session organization directly into the developer workflow. https://nimbalyst.com

Kelly Shortridge, author of "Security Chaos Engineering: Sustaining Resilience in Software and Systems" and CPO at Fastly, joins Scott for an ACM ByteCast joint episode about why security should be designed for failure rather than prevention. From airplane coffee makers causing critical failures to squirrels being the real "advanced persistent threat" to power grids, Kelly makes the case that no system is perfectly secure — and the teams that feel most in control are often the least prepared. The conversation covers metrics theater, the cost-resilience tradeoff, why software has unique advantages for simulation that we're not leveraging, and where LLMs fit (and don't fit) in security workflows.

Tori Westerhoff joins Scott to explore the intersection of AI, human psychology, and personal growth. As people increasingly use LLMs for introspection and decision-making, Tori argues that we're missing the diversity of thought that comes from community, particularly from random encounters with strangers. She reveals her own practice: a daily noon reminder to talk to strangers. "If you sycophant yourself, you're never going to grow," she explains. The conversation delves into how LLMs can create echo chambers of thought, and why the randomness of human connection, even just someone on the same bus, helps us update our mental frames and break out of programmed decision-making paradigms.

In this episode, in association with the ACM ByteCast, Scott talks with Eric Allman, one of the foundational figures of the early internet. Best known for creating Sendmail, the mail transfer agent that powered a large portion of global email infrastructure through the formative years of the network, Allman helped shape how messages move across the internet. Their conversation explores the origins of internet email, the messy realities of building software that must operate at planetary scale, and what lessons today’s engineers can learn from the systems and design decisions that quietly underpin modern computing.

Scott Hanselman sits down with Allen Stewart, Partner Director of Software Engineering at Microsoft, to explore how AI agents with persistent memory are transforming scientific research and software engineering. Allen explains how his team built an AI system that learns from every investigation, turning a 12-day autonomous drug discovery run into reusable knowledge that makes future research exponentially faster. Instead of starting from scratch each time, the AI inherits hypotheses, methodologies, and findings from previous work, saving hundreds of millions of tokens and weeks of effort.

In this episode, Scott talks with Don Syme about the emerging world of agentic developer workflows and what it means when coding tools move from autocomplete helpers to collaborators. They explore how modern tools like GitHub Copilot and GitHub Agentic Workflows are evolving into systems that can plan, execute, and iterate on tasks across a codebase, and what that means for software design, type systems, and developer responsibility. https://github.github.com/gh-aw/

This week on the show, Scott talks to Philip Kiley about his new book, Inference Engineering. Inference Engineering is your guide to becoming an expert in inference. It contains everything that Philip has learned in four years of working at Baseten. This book is based on the hundreds of thousands of words of documentation, blogs, and talks he's written on inference; interviews with dozens of experts from Baseten's engineering team; and countless conversations with customers and builders around the world. https://www.baseten.co/inference-engineering/

What does it take to design a programming language from scratch when the target isn’t just CPUs, but GPUs, accelerators, and the entire AI stack? In this episode, I sit down with legendary language architect Chris Lattner to talk about Mojo — his ambitious attempt to rethink systems programming for the machine learning era. We trace the arc from LLVM and Clang to Swift and now Mojo, unpacking the lessons Chris has carried forward into this new language. Mojo aims to combine Python’s ergonomics with C-level performance, but the real story is deeper: memory ownership, heterogeneous compute, compile-time metaprogramming, and giving developers precise control over how AI workloads hit silicon. Chris shares the motivation behind Modular, why today’s AI infrastructure demands new abstractions, and how Mojo fits into a rapidly evolving ecosystem of ML frameworks and hardware backends. We also dig into developer experience, safety vs performance tradeoffs, and what it means to build a language that spans research notebooks all the way down to kernel-level execution.