
Hosted by Teresa Torres

Guests
- Arzu Sandıkçı, Co-founder & CEO, Rhea's Factory
- Mert Topcu, Co-founder, Rhea's Factory

In this episode:
- Why only 10% of plastic gets recycled—and why mechanical and chemical methods hit a ceiling
- How enzymatic recycling breaks plastic all the way back to its original monomers, unlike traditional methods that just shorten polymer chains
- Why enzymes are selective: they can target specific plastic types even in mixed waste streams
- The discovery of a plastic-eating bacterium in Japan that opened the door to enzymatic recycling
- How AlphaFold and the Nobel Prize in Chemistry transformed what's possible in enzyme engineering
- How Rhea's Factory uses protein language models (PLMs) and multi-step AI pipelines to design novel enzymes computationally
- The evolution from a human-orchestrated pipeline to an agentic AI scientist
- How guardrails at each pipeline step keep the AI pointed in the right direction without limiting exploration
- Why wet lab data—even just hundreds of proprietary data points—can be enough to train a powerful domain-specific prediction model
- Why Mert sometimes wants the model to hallucinate (and how high temperature settings help explore the full enzyme design space)
- The business constraint: enzymatic recycling must compete economically with cheap, oil-based plastic production
- What's next: a process agent, a 5,000-ton demo plant in California, and enzymes for new plastic types

Resources & Links
- Rhea's Factory — Enzymatic plastic recycling technology
- AlphaFold — DeepMind's AI system for protein structure prediction, whose creators shared the 2024 Nobel Prize in Chemistry
- Maven AI Evals Course — The course Teresa took to learn about evals (35% off with Teresa's affiliate link)

Chapters
00:00 Meet the Founders
01:50 Why Plastic Circularity
03:19 Mechanical vs True Recycling
04:52 Biology as the New Tool
07:20 Necklace and Pearls Analogy
13:22 Low Energy Reactor Process
17:33 Origin Story and PET Enzyme
22:52 Protein Folding and AlphaFold
28:32 AI Designed Enzymes
34:28 Protein Language Models Stack
37:14 Multi Step Protein Generation
39:00 Building on Foundation Models
40:50 Lab First Success Metrics
43:10 From Human to Agentic Orchestration
43:59 Problem Statements as Inputs
46:18 Guardrails at Every Stage
47:48 Prediction Models and Data Limits
50:03 Industrial Reality and Cost
52:30 Agentic Parallels and Orchestrators
57:45 Impact on Timelines and Diversity
01:03:23 When Hallucination Helps
01:04:09 Scaling Up and Process Agents
01:06:56 Enzyme Blends for Mixed Plastics
01:07:49 Why Clamshells Aren't Recyclable
01:09:34 Closing Thoughts and Thanks
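The high-temperature sampling idea mentioned in this episode is generic to any generative model: raising the softmax temperature flattens the output distribution, so less likely candidates (here, more unusual enzyme designs) get sampled more often. A minimal sketch of that mechanic in plain Python, not Rhea's Factory's actual pipeline:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw model scores into sampling probabilities.
    Higher temperature flattens the distribution, giving rare
    candidates more chance of being sampled."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.0, 1.0, 0.2]
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
# At high temperature the top candidate dominates less and the
# tail candidates gain probability mass.
```

This is why "wanting the model to hallucinate" can be rational: in design-space exploration, the low-probability tail is exactly where novel candidates live.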

Guests
- Santi Marchiori, CEO, AITropos
- Juan Haedo, CTO, AITropos

You'll hear how they:
- Spent two years exploring hundreds of startup ideas before finding the specific niche of AI-powered order taking in hospitality
- Went through three product iterations — hardware for waiters, a waiter app, and finally a customer-facing WhatsApp agent — before landing on the right form factor
- Identified order item identification accuracy as their single most important KPI
- Chose a tools-based agent architecture over MCP or pipelines to hit real-time response speed requirements
- Built a parallelized pipeline that searches for multiple products simultaneously and pre-fetches product context before the agent even calls a tool
- Use smaller, fast sub-agents to build an "immediate system prompt" that injects relevant data into each turn without extra tool calls
- Test with thousands of agent-simulated customer conversations run overnight before deploying to new venues
- Reduced new customer onboarding from three months to a few weeks — and continue to shrink it as they build domain templates

Resources & Links
- AITropos

Chapters
00:00 Meet the Founders
00:59 What Tropos Builds
01:51 AI vs Human Touch
06:17 Restaurant Use Cases
08:16 Why Hospitality
10:47 Finding the Wedge
16:00 Early Prototypes
16:46 Hard Parts of Ordering
18:03 Speed and Channels
21:15 Iteration and Model Jumps
30:50 Customer Order Flow
35:48 Menu Discovery Question
36:07 Menus Inside WhatsApp
36:50 Finding the Chat Entry
37:37 Why Text Ordering Wins
38:30 Under the Hood Pipeline
40:54 Tools Over Workflows
45:05 Tooling and Prompt Composer
49:29 Preloading Context Fast
54:02 Founder Learning Mindset
57:21 Evaluating Order Accuracy
01:00:03 Testing and Human Takeover
01:03:56 Onboarding and Scaling Up
01:06:10 What's Next and Wrap

Guests
- Ernesto Garcia, Front-end Product Engineer, Doist
- Thomas Jost, Backend Software Engineer, Doist
- Hugo Fauquenoi, Product Manager, Doist

In this episode
- How Doist's 2-3 month AI exploration phase led to Ramble — and why voice-to-task emerged as the top contender
- The user research insight behind Ramble: people using pen and paper or ChatGPT voice to brainstorm tasks before committing them to Todoist
- Why Ramble skips transcription entirely and processes raw audio directly with a Gemini live audio model
- How the model makes tool calls (add task, edit task, delete task) in real time while the user is still speaking — no text output at all
- Designing for the driving use case: sound effects as audio confirmation cues alongside visual task cards
- The challenge of teaching an LLM to capture tasks literally without over-interpreting or doing them — and how temperature tuning played a role
- Date handling complexity: injecting the current date, normalizing to days vs. months, and always outputting dates in English for the natural language parser
- Building an LLM-judge eval system with 20+ language recordings from 100+ employees across 35 countries to catch prompt regressions
- Why Doist chose to inject the full project/label list into the system prompt instead of building a RAG pipeline — and why it worked
- How easy correction beats perfect first-time accuracy in natural language interfaces
- What's next: multimodal task capture from images and text blobs, Apple Watch support, and automation integrations

Resources & Links
- Todoist
- Doist
- Google Vertex AI (Gemini)

Chapters
00:00 Meet the Doist Team
01:40 What Doist Builds
02:27 Ramble Voice to Tasks
04:16 Why Voice Matters
07:42 Brain Dump Insight
09:46 Prototyping With LLMs
11:08 Live Audio Workflow
14:32 Driving Friendly UX
18:47 Tool Only Architecture
26:06 Evals and Multilingual Testing
28:41 Taming Dates and Time
33:28 Fixing Date Confusion
33:43 Defining Task Boundaries
34:34 Capture Versus Do
37:17 Tuning Creativity Levels
39:01 Evals Across Languages
41:23 Feedback and Regressions
44:09 Model Upgrades Over Time
46:33 Projects Labels Context
51:40 Handling Ambiguous Names
54:23 What's Next Multimodal
58:48 From Capture to Execution
59:46 Closing Thoughts
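In a tool-only architecture like the one described here, the model's entire output contract is a set of function calls: there is no free-text channel to the user. That makes it natural to validate every response against the declared tool schemas before acting on it. A generic sketch of such a validation gate; the tool names and fields are illustrative guesses, not Doist's actual schema:

```python
import json

# Illustrative tool contract: the model may ONLY respond with one of these.
TOOLS = {
    "add_task": {"required": ["content"]},
    "edit_task": {"required": ["task_id"]},
    "delete_task": {"required": ["task_id"]},
}

def validate_tool_call(raw: str) -> dict:
    """Parse a model response and enforce the tool-only contract:
    reject unknown tools and calls missing required arguments."""
    call = json.loads(raw)
    spec = TOOLS.get(call.get("name"))
    if spec is None:
        raise ValueError(f"unknown tool: {call.get('name')}")
    missing = [k for k in spec["required"] if k not in call.get("args", {})]
    if missing:
        raise ValueError(f"missing args: {missing}")
    return call

call = validate_tool_call(
    '{"name": "add_task", "args": {"content": "buy milk", "due": "tomorrow"}}'
)
```

Note how a date like "tomorrow" can stay as an English string regardless of the spoken language, deferring resolution to a downstream natural-language date parser, which matches the date-handling approach described in the episode.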

Guests
- Vlad Solomakha, CEO & Co-founder, Banani
- Vova Parkhomchuk, CTO & Co-founder, Banani
- Vlad Ostapovats, Founding Growth, Banani

In this episode
- Why Banani started as a Figma plugin and what they learned from early organic distribution
- The canvas-first approach: why Banani is built around a design canvas rather than a chat interface
- How their agent architecture splits prompts into surgical edits instead of regenerating full screens
- The "gulf of specification" problem and what Banani is building to help agents and designers speak the same visual language
- Managing context across canvases with hundreds of screens — per-screen history with shared project context
- Why Banani doesn't compile running applications — just HTML/CSS mockups — and how that shapes everything
- How they evaluate design quality without traditional evals: spinning up 10 screens from one prompt to compare models
- Their approach to building at the edge of what's possible: identifying which model limitations to work around vs. wait out
- The role of context engineering and specialized agent tools in producing tasteful, high-quality design

Resources & Links
- Banani
- tldraw

Chapters
00:00 Meet the Founders
01:12 What Banani Builds
02:18 Why an AI Designer
03:40 Raising the Design Floor
06:23 Why AI Was Finally Ready
10:48 First Prototype Figma Plugin
14:10 Early Growth and Distribution
15:25 Standing Out in a Crowded Market
20:13 Product Tour Canvas First AI
23:40 Autopilot vs Manual Control
27:07 Tech Behind High Quality Design
32:08 Craft Beyond 80 Percent
33:40 Gulf of Specification
36:44 Proactive Agent Interviews
38:40 Canvas First UX Choices
42:54 Agent Architecture Under Hood
48:48 State History Context Tricks
52:32 Tooling Context Engineering
56:04 Navigating Busy Canvases
01:00:13 Betting on Model Progress
01:03:47 Shipping Around Imperfections
01:07:20 Try Banani and Next Steps
01:07:52 Building the Banani MCP
01:09:19 Final Thanks and Wrap

Guests
- Luke Bates, Product Leader (Agent Studio), Medable
- Jen Brown, Product Manager, Medable
- Matt Schoolfield, Product Designer, Medable
- Fiachra Matthews, Principal Architect, Medable

What we cover in this episode:
- What Medable does: enabling global clinical trials across 100+ languages and accelerating drug-to-market timelines
- The two agents built on Agent Studio—eTMF (document classification) and CRA (clinical data monitoring)—and the problems they solve
- Why Medable chose a platform approach to agents instead of one-off builds
- How Agent Studio works: models, skills, knowledge bases, MCP connectors, versioning, and trigger types
- Three deployment models: Medable-built products, services-led custom builds, and self-serve platform access
- RAG approaches at scale: embeddings vs. markdown hierarchies vs. just-in-time MCP retrieval
- How they built a unified ontology layer to map terminology across 13 different clinical data systems
- Why they built custom MCPs with an authentication and credentialing wrapper
- Context window management with sub-agents and automatic tool filtering
- Evaluation design in a GxP-regulated environment: golden datasets, production monitoring, and the challenge of human feedback as ground truth
- How they document agent intent → specification → test evidence to satisfy regulatory bodies
- The "full self-driving" vision for clinical trials and what it would take to get there

Resources & Links
- Medable — Clinical trial platform powering Agent Studio

Chapters
00:00 Meet The Medable Team
01:14 Medable Mission And Scope
03:27 Agent Studio Platform Overview
06:29 eTMF Document Automation
08:47 CRA Agent For Monitoring
10:40 Clinical Trial Workflow Primer
14:34 Why Build A Platform
17:51 Learning AI As A Team
21:47 Early Days Of Agent Studio
23:15 How Agents Are Built
25:15 Customer Adoption And UX
30:00 Skills And MCP Standards
31:15 Scaling Context Retrieval
33:07 RAG Patterns And Tradeoffs
34:48 Ontology Data Layer Explained
38:01 Customer Friendly Agent Setup
42:19 MCP Security And Connectors
44:36 Tool Bloat And Subagents
50:44 Evals For Reliable Agents
54:40 Human Feedback Isn't Truth
57:43 GxP Compliance For Agents
01:03:34 Full Self Driving Trials
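Automatic tool filtering, as mentioned in this episode, is a common tactic for fighting tool bloat: instead of exposing every connector's tools on every turn, score each tool against the current task and pass only the top few schemas into the context window. A deliberately simple sketch using word overlap; the registry and scoring are illustrative, not Agent Studio's implementation:

```python
TOOL_CATALOG = {  # illustrative registry; a real one comes from MCP connectors
    "classify_document": "Classify a clinical trial document into eTMF categories",
    "query_visits": "Query patient visit and monitoring data",
    "send_email": "Send an email to a site coordinator",
    "translate_text": "Translate text between trial languages",
}

def filter_tools(task: str, catalog: dict[str, str], k: int = 2) -> list[str]:
    """Keep only the k tools whose descriptions best overlap the task,
    shrinking the tool schemas that ride along in the context window."""
    task_words = set(task.lower().split())

    def score(item: tuple[str, str]) -> int:
        _, desc = item
        return len(task_words & set(desc.lower().split()))

    ranked = sorted(catalog.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]

selected = filter_tools("classify this monitoring visit document", TOOL_CATALOG)
```

Production systems typically use embedding similarity rather than word overlap, and hand the filtered subset to a sub-agent, but the context-window payoff is the same: fewer schemas per call.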

Guests
- Matthias Kleverud - Co-Founder, Momental
- Charlotte Kleverud - Co-Founder, Momental

What we cover in this episode:
- What "GitHub for product management" means: finding merge conflicts in strategy, not code
- The product chain: signals → learnings → decisions → principles, and how AI maps it
- Three trees that model an organization: the product tree (OKRs to epics), the wisdom tree (decisions and their reasoning), and the people/time tree
- How a document processing agent uses OODA-loop thinking to extract and connect context across documents
- Why traditional chunking and RAG break down at scale and what Momental does instead
- The origin story: building a team of AI agents in 2024, only to discover agents hit the same alignment problems as humans
- Starting in 2022 with DaVinci 002 and learning that the market wasn't ready for AI-assisted product thinking
- How conflicts are detected, auto-resolved, or escalated to humans with merge options
- Why metadata—who said it, when, and in what context—is critical to preventing hallucinations
- The self-improving agent: collecting user feedback weekly and rewriting its own prompts
- Moving from chat-first to UI-first to proactive agents as an AI product design pattern
- Design partner strategy and what's next for Momental's public launch

Resources & Links
- Momental - GitHub for product management
- Spotify - Where both founders started their PM careers
- Claude Code - AI coding tool discussed in the conversation
- Perk episode on Just Now Possible - Referenced episode about eliminating shadow work

Chapters
00:00 Meet The Founders
01:14 GitHub For PMs Explained
03:19 Strategy Merge Conflicts
06:49 Product Chain Model
09:49 Capturing Context Fast
12:17 Context Graph And RAG Limits
16:52 Origin Story Since 2022
20:01 From Agent Team To Foundation
25:26 Three Trees Of Context
28:42 Two Agents Secret Sauce
31:37 How Document Processing Works
34:55 Agents Ask Better Questions
35:41 Human In The Loop Context
36:38 Data Models And Graphs
39:21 Beyond Documents And Vectors
42:25 Specialized Tools Win
44:50 Quarterly Planning At Scale
49:38 Discovery Versus Vibe Coding
51:00 Tree Building And Conflicts
53:49 UI Over Chat Interfaces
56:01 Proactive Agents In Practice
58:00 Quality Evals And Feedback
01:00:22 Launch Plans And Mission
01:03:19 Eliminating Shadow Work
01:03:56 Closing Thanks

Guests
- Yuri Vela Tulopov, Co-Founder and CEO, ShowMe
- Quique Gomez, Co-Founder and Lead Product Engineering, ShowMe

What we cover in this episode:
- How ShowMe builds AI digital workers that function as inbound sales reps
- The origin story: spotting the conversion gap at a previous company and realizing AI could fill it
- Why the first MVP was a voice agent with product videos and a simple RAG knowledge base
- Adding a realistic avatar via HeyGen and how it changed user engagement through better affordances
- Decomposing a single sales conversation into multiple specialized sub-agents (greetings, qualifying, pitching)
- The three agent types: conversation agents, evaluator agents, and creator agents
- How deterministic workflows manage the lead-to-close journey across days
- Building toward a smart orchestrator agent that breaks out of rigid workflow paths
- Ingesting sales transcripts and training materials to teach agents company-specific sales skills
- Customer-driven evaluation loops that start at 100% review and taper to ~5% over time
- Creating automated tests from customer feedback to prevent prompt regression
- Confidence scoring and frustration detection for real-time human handoff decisions
- Treating the agent as a coworker: onboarding via Slack, weekly reporting, CRM integration
- Future plans: self-serve PLG motion, smart orchestration, and expanding to customer success

Resources & Links
- ShowMe - AI digital sales reps for inbound teams
- HeyGen - AI avatar platform used for ShowMe's video calls

Chapters
00:00 Meet the Founders: Yuri & Quique Introduce ShowMe
00:45 What ShowMe Builds: AI Sales Reps as Digital Coworkers
02:17 Why Inbound-First Sales Agents (and Not Outbound Spam)
03:51 Origin Story: The Website Conversion Problem That Sparked the Idea
08:11 MVP Launch: Voice + Video Product Demos in Two Weeks
10:45 Bootstrapping the Knowledge Base: Videos, Docs, Scraping & RAG
11:54 Beyond Demos: Multi-Stage Buyer Journeys, Follow-Ups & Orchestration
14:34 Building Trust: Avatars, Video-Call UX, and AI "Affordances"
20:18 Whiteboard Architecture: Agents, Workflows, and the Orchestrator Layer
29:43 Where Agents Run: Creators, Evaluators, and Breaking Deterministic Flows
32:18 Sales Is High-Stakes: Personalization vs. Hallucinations & Revenue Risk
33:35 Conversation Agent Evolution: From Q&A Bot to Guided Sales Discovery
34:10 Why One Agent Becomes Many: Decomposing Stages for Latency & Memory
36:36 Orchestrator + Tooling: Routing Between Greeting, Qualify, Pitch, Next Steps
38:46 Teaching Sales Skills: Generic Prompting vs Company-Specific Playbooks
42:52 Ingesting Real Calls & Onboarding Like a Teammate (Transcripts, Training Docs)
45:05 Real-Time Voice + Avatar Demos: Latency Tricks and Video Clip Libraries
47:33 Creator vs Evaluator Agents: Data Cleaning, Custom Fields, Sentiment & Confidence
49:15 Human Handoff Guardrails: When Confidence Drops or Users Get Frustrated
50:15 Proving Quality in Production: POCs, A/B Rollouts, Dashboards, and CRM Logging
53:21 Evals at Scale: Customer Feedback Loops, Regression Tests, and the 5% Review Set
58:19 What's Next: Smarter Orchestration, Self-Serve Setup, and More Digital Workers
01:02:07 Closing Thoughts: Customer Insight as the Moat in a Fast-Moving AI World
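The confidence-plus-frustration handoff described here is easy to picture as a small decision gate evaluated on every conversational turn: if any guardrail trips, route the buyer to a human rep. A hypothetical sketch; the signal names and thresholds are made up for illustration and are not ShowMe's actual values:

```python
from dataclasses import dataclass

@dataclass
class TurnSignals:
    confidence: float        # agent's self-assessed answer confidence, 0..1
    frustration: float       # sentiment-derived frustration score, 0..1
    asked_for_human: bool    # explicit "let me talk to a person"

def should_hand_off(s: TurnSignals,
                    min_confidence: float = 0.6,
                    max_frustration: float = 0.7) -> bool:
    """Escalate to a human rep when any guardrail trips: an explicit
    request, low answer confidence, or rising frustration."""
    return (s.asked_for_human
            or s.confidence < min_confidence
            or s.frustration > max_frustration)

calm = TurnSignals(confidence=0.9, frustration=0.1, asked_for_human=False)
shaky = TurnSignals(confidence=0.4, frustration=0.2, asked_for_human=False)
```

Keeping the gate as a plain deterministic function (rather than asking the LLM whether to escalate) makes the handoff policy auditable and cheap to test, which matters when the taper from 100% review down to ~5% depends on trusting it.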

Guests
- Mark Barbir – CEO, Earmark
- Sanden Gocka – Co-Founder, Earmark

What we cover in this episode:
- How Earmark differs from generic AI notetakers by producing finished work, not just summaries
- The pivot from Apple Vision Pro presentation coaching to a web-based meeting assistant
- Running multiple agents in parallel during live meetings
- Template-based agents: Engineering Translator, Make Me Look Smart, Acronym Explainer
- Personas that simulate absent team members (security architect, legal, accessibility)
- Why ephemeral mode (no data storage) became a selling point for enterprise
- Reducing AI costs from $70/meeting to under $1 through prompt caching
- Why GPT-4.1 still beats newer models for prose quality in their use case
- The limits of vector search for analysis questions across meetings
- Building agentic search with multiple retrieval tools (RAG, BM25, metadata queries, bespoke summaries)
- Designing for product managers as the extreme user to solve for everyone
- Their vision for an AI chief of staff that goes beyond automating deliverables

Resources & Links
- Earmark — Productivity suite where the work completes itself
- ProductPlan — Roadmapping tool where both founders previously worked
- Granola — AI notetaker mentioned for comparison
- AssemblyAI — Speech-to-text service used by Earmark
- OpenAI API — LLM provider with prompt caching support
- Cursor — AI code editor with build integration in Earmark
- v0 by Vercel — AI prototyping tool with build integration in Earmark

Chapters
00:00 Introduction to Earmark Founders
00:28 Background and Experience
01:05 What Does Earmark Do?
01:23 AI and Productivity
03:09 Comparing Earmark to Competitors
03:41 Earmark's Unique Features
05:53 Templates and Personas
10:06 Technical Details and Development
17:12 Early Product Versions and Challenges
28:44 Understanding Prompt Caching
29:49 Managing Multiple Tools and Costs
30:59 Optimizing Transcript Summarization
35:11 Challenges with Context and Reasoning Models
38:10 Innovative Search and Retrieval Techniques
44:06 Creating Actionable Artifacts from Meetings
48:30 Ensuring Quality and Managing Hallucinations
58:20 Future Vision for AI Chief of Staff
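Prompt caching, the cost lever discussed in this episode, works because providers such as the OpenAI API cache a request's stable prefix: when many calls start with the same long system prompt and transcript, the repeated prefix is billed at a discounted cached rate and only the changing suffix costs full price. The key implementation detail is ordering: stable content first, per-request content last. A minimal sketch of that ordering, generic rather than Earmark's actual code:

```python
def build_messages(system_prompt: str, transcript: str, question: str) -> list[dict]:
    """Order messages so the long, unchanging parts come first.
    Providers with prompt caching can then reuse the cached prefix
    across the many agents querying the same meeting."""
    return [
        # Stable prefix: identical across all agents/turns for this meeting.
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Meeting transcript:\n{transcript}"},
        # Volatile suffix: the only part that changes per request.
        {"role": "user", "content": question},
    ]

m1 = build_messages("You are a meeting analyst.",
                    "Alice: we ship Friday. Bob: QA needs two days.",
                    "List the action items.")
m2 = build_messages("You are a meeting analyst.",
                    "Alice: we ship Friday. Bob: QA needs two days.",
                    "Who owns the launch?")
# m1[:2] == m2[:2], so the expensive prefix is shared and cacheable.
```

With many template agents running against one transcript in parallel, the transcript tokens are paid for (mostly) once instead of once per agent, which is the shape of the $70-to-under-$1 reduction described.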

Guests
- Jennifer Deal – SVP of Product Development, Healio
- Casey Utley – Senior UX Designer, Healio
- Matthew Skepner – VP of Technology, Healio

What we cover in this episode:
- Why physicians need AI at the point of care—and how they actually use it (hint: it's preparation, not bedside)
- The surprising discovery that physicians wanted help with patient communication and empathy, not just clinical answers
- Building a working prototype in a weekend with Cursor after starting with Figma mockups
- How Healio's RAG system combines lexical search, vector search, and semantic search across multiple trusted sources
- Why "just use PubMed" isn't simple—five different ways to access the same data, each with trade-offs
- Designing citations that physicians trust: subscripts, hover states, and progressive disclosure
- Serving contextual ads while the LLM processes queries—a practical monetization approach
- HIPAA compliance and input guardrails for masking personal health information
- Eight LLM judges for evals: safety, medical accuracy, faithfulness, relevancy, completeness, reasoning, clarity, and overall quality
- Why physician feedback trumps LLM-as-judge feedback in high-stakes medical contexts
- The role of the Healio Innovation Partners in ongoing discovery and validation

Resources & Links
- Healio — Medical news, education, and clinical guidance for healthcare professionals
- PubMed — Database of biomedical literature
- Cursor — AI-powered code editor used to build the prototype

Chapters
00:00 Introduction to Healio Team
01:00 Overview of Healio's Services
01:57 Introducing Healio AI
03:39 Addressing Physician Needs with AI
05:45 Building Trust in AI Solutions
13:56 Prototyping and Testing Healio AI
18:02 Refining the AI Product
21:48 Technical Architecture and Advertising Integration
25:16 Balancing Speed and Accuracy in AI Responses
26:30 Ensuring Credible and Trustworthy Content
27:41 Challenges in Data Integration and Web Crawling
29:00 Optimizing Search Strategies for Different Data Types
31:09 User Interface and Trust Building
34:31 Human Feedback and Continuous Improvement
35:41 Guardrails and Evaluations for Reliable AI
39:11 Experimenting with LLM as Judges
45:13 Future Directions and User-Centric Design
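Input guardrails for masking personal health information, as mentioned in this episode, typically run before a query ever reaches the model: pattern-match identifiers and replace them with placeholders. A deliberately simplified sketch covering only a few identifier types; real HIPAA de-identification covers 18 identifier categories under the Safe Harbor method and needs NER models alongside rules, not just regexes:

```python
import re

# Illustrative patterns only; production PHI detection needs far more coverage.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),
]

def mask_phi(text: str) -> str:
    """Replace recognizable identifiers with placeholders before the
    text is sent to an LLM or logged."""
    for pattern, placeholder in PHI_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

masked = mask_phi(
    "Pt DOB 04/12/1961, reach me at dr.smith@clinic.org or 555-867-5309"
)
```

Masking on input (rather than trying to scrub model output) keeps the PHI out of every downstream system: the model call, the retrieval logs, and the eval traces.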

Guests
- Daniel Kappler — CPO (Product & Design), Tendos AI
- Matthias Hilscher — CTO (Engineering), Tendos AI

Key Takeaways
- Start narrow to prove value: Tendos AI began with just radiators for one design partner before expanding to all building products
- Own the interface: building a web application (vs. integrating into legacy systems) gave them control over UX and the ability to iterate toward full automation
- Evaluate each agent, not just the chain: per-agent evals make debugging tractable and show exactly where performance changed
- Use review agents: a separate agent that checks work (like code review) catches errors before they reach humans
- Let customers pull you: customers asked Tendos to replace their CPQ software—a strong signal of product-market fit

Topics Covered
- The tendering chain in construction and why it's ripe for automation
- How domain expertise (the CEO's construction background) helped identify and validate the opportunity
- Entity extraction from PDFs ranging from 1 page to 1,800+ pages
- Planning patterns in agentic systems—creating and updating plans based on findings
- How agents evaluate product fit against customer requirements
- Building custom tracing and observability tools for complex agent chains
- The path toward self-learning systems through human feedback loops

Links & Resources
- Tendos AI

Chapters
00:00 Introduction to Tendos and Key Roles
01:01 Understanding the Tendering Chain
02:26 Real-World Construction Analogy
03:34 Challenges in the Construction Industry
04:48 AI's Role in Tendos' Product
12:59 Early Prototypes and AI Integration
18:31 Expanding Product Capabilities
28:56 Customer Collaboration and Workflow Automation
33:15 Strategic Partnerships and Technical Groundwork
34:20 Focusing on Specific Customer Segments
36:03 Product Evolution and Current Capabilities
38:17 Technical Workflow and Automation
40:12 Evaluating and Matching Product Requests
47:00 Dynamic Agent Architecture
55:29 Quality Measures and Evaluation
01:02:59 Future Directions and Customer-Centric Development
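The review-agent pattern highlighted in this episode mirrors code review: a generator produces an answer, and an independent checker validates it against the source before anything reaches a human. With the LLM calls stubbed out as plain functions, the control flow might look like this; all names and checks are illustrative, not Tendos AI's implementation:

```python
def extract_entities(document: str) -> dict:
    """Stand-in for an extraction agent (would be an LLM call)."""
    return {"product": "radiator", "quantity": "12", "norm": "EN 442"}

def review_extraction(document: str, extraction: dict) -> list[str]:
    """Stand-in for a review agent: independently check each extracted
    value against the source document, like a second pair of eyes."""
    return [field for field, value in extraction.items()
            if value not in document]

def run_with_review(document: str) -> dict:
    """Chain the two agents; flag anything the reviewer cannot verify."""
    result = extract_entities(document)
    issues = review_extraction(document, result)
    if issues:
        # A real chain would trigger re-extraction or human escalation here.
        result["needs_human_review"] = issues
    return result

doc = "Tender item: 12 radiator units per EN 442, delivery to site B."
out = run_with_review(doc)
```

Keeping the reviewer separate from the generator is what makes per-agent evals possible: each stage can be scored on its own golden set, so a regression shows up at the stage that caused it.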