Lenny's Reads: Building AI Product Sense, Part 2
Host: Lenny Rachitsky | Featured Expert/Author: Dr. Merilee Nika
Date: February 10, 2026
Episode Theme:
How to build “AI product sense”—the essential skill for effective AI product management—through concrete, actionable rituals and frameworks developed by Dr. Merilee Nika.
Overview
In this episode, Lenny Rachitsky narrates Dr. Merilee Nika’s guide to rapidly building “AI product sense”—the evolving core competency for product managers working with AI. The episode dives into real-world rituals and practices for stress-testing AI features, identifying predictable failure modes, defining quality bars, and designing product guardrails. Listeners gain both strategic frameworks and actionable steps to avoid costly mistakes when launching AI-powered products.
Key Discussion Points & Insights
Meta’s Product Sense with AI Interview: The New Bar for PMs
- Context: Meta has updated its PM interview loop for the first time in over five years, adding a “Product Sense with AI” interview.
- What’s Different?
- The focus is not on clever prompts or technical AI trivia.
- Candidates are evaluated on their ability to:
- Manage uncertainty (02:20)
- Detect when models are guessing or hallucinating (02:45)
- Ask clarifying questions instead of making assumptions
- Make product decisions amidst imperfect information
- Big Picture: AI product sense is now the core skill for PMs: knowing a model’s capabilities and limitations, and working constructively within them.
Why Traditional PM Tactics Fail with AI
- AI features work great in demos but break down with real users because of “predictable failure modes.”
- Example: A customer support chatbot that confidently answers ambiguous questions, eroding user trust because it doesn’t ask for clarification (04:00).
Merilee’s Three-Step Weekly Ritual for Building AI Product Sense (10:20)
Merilee distilled her decade of AI PM work at Google and Meta into three field-tested steps:
- Map Failure Modes & Intended Behavior
- Define Minimum Viable Quality (MVQ)
- Design Product Guardrails Where Behavior Breaks
Step 1: Map Failure Modes & Intended Behavior
What Is a Failure Mode?
Every AI feature has a “failure signature”—the set of ways it predictably breaks when inputs are messy.
Merilee’s Weekly Rituals (15 min total, Wednesdays):
Ritual 1: Ask the Model to Do Something Obviously Wrong (11:15)
- Purpose: Test how the model handles chaos (e.g., messy Slack threads, unstructured feedback).
- Example Prompt:
- Input: Chaotic, ambiguous Slack conversation.
- Output: “I asked the model to extract strategic product decisions... and it confidently hallucinated a roadmap, assigned the wrong owners, and turned offhand comments into commitments.” (12:50, Merilee)
- Key Lesson: Generative models will create structured answers even when there’s no factual basis—dangerous if unchecked.
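To run Ritual 1 yourself, here is a minimal sketch of the chaos test, assuming the OpenAI Python SDK with an API key in the environment; the thread text and model name are illustrative placeholders, not from the episode.

```python
# Ritual 1 sketch: feed a deliberately messy thread to a model and ask for
# structured "decisions" that aren't actually there. The thread below is a
# made-up placeholder; substitute your own messy Slack export.
from openai import OpenAI

client = OpenAI()

MESSY_THREAD = """
alice: should we maybe look at the onboarding drop-off at some point
bob: lol yeah. also dark mode??
carol: brb, but someone should own the Q3 roadmap thing
bob: ship it friday? jk... unless?
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat model works for this ritual
    messages=[
        {"role": "system", "content": "Extract the strategic product decisions "
                                      "from this thread, with owners and dates."},
        {"role": "user", "content": MESSY_THREAD},
    ],
)

# The point is to read this output by hand: note every decision, owner, or
# date the model invented that no one in the thread actually committed to.
print(response.choices[0].message.content)
```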
Ritual 2: Ask the Model to Do Something Ambiguous (26:20)
- Purpose: Test semantic fragility—the model’s guesses when intent is unclear.
- Common Prompts to Try:
- “Rewrite this for your target users”
- “Improve this onboarding flow”
- “Sort these requests by importance”
- What to Watch:
- Does the model make the wrong assumptions about user type, metric, or intent?
- Does it over-summarize or latch onto the wrong detail?
- “Ambiguity is kryptonite for probabilistic systems, because if a model doesn't fully understand the user’s intent, it fills the gaps with its best guess—that is, hallucinations or bad ideas. That's when user trust starts to crack.” (26:40, Merilee)
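A quick way to run Ritual 2 is to loop over the ambiguous prompts and force the model to state the assumptions it is filling in. A sketch under the same client assumption as above; the artifact text is a stand-in for whatever you are testing.

```python
# Ritual 2 sketch: probe semantic fragility by running deliberately ambiguous
# prompts and surfacing the assumptions the model fills in.
from openai import OpenAI

client = OpenAI()

# Stand-in for whatever draft, flow copy, or request list you're testing.
ARTIFACT = "Welcome! Click around to get started. Advanced settings live in the gear menu."

AMBIGUOUS_PROMPTS = [
    "Rewrite this for your target users",
    "Improve this onboarding flow",
    "Sort these requests by importance",
]

for prompt in AMBIGUOUS_PROMPTS:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"{prompt}:\n\n{ARTIFACT}\n\n"
                       "Before answering, list every assumption you are making "
                       "about the user, the metric, and the intent.",
        }],
    )
    # Compare the stated assumptions across prompts and across reruns:
    # wherever they diverge is where real users will see inconsistent behavior.
    print(f"--- {prompt} ---\n{resp.choices[0].message.content}\n")
```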
Ritual 3: Ask a Model to Do Something Unexpectedly Difficult (38:05)
- Purpose: Identify the first point of failure with real but complex tasks.
- Typical Task: “Group these 40 bugs into themes and propose a roadmap.”
- Observed Issues:
- Loses track of early items as input grows
- Inconsistent clustering (themes change/run together)
- Invents priorities or facts not in evidence
- Interpretation: The point where the model starts to fail is exactly where to design UX and product guardrails: break tasks into smaller chunks, request explicit clarification, and so on (see the sketch below).
- “Where it [the model] starts to go wrong is exactly where you need to design guardrails, narrow inputs, or split the task into smaller steps.” (40:35, Merilee)
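To make the split-the-task guardrail concrete, here is a minimal two-pass batching sketch, again assuming an OpenAI-style chat client; the batch size, prompts, and two-pass structure are illustrative assumptions, not a recipe from the episode.

```python
# Ritual 3 / guardrail sketch: instead of sending all 40 bugs at once (where
# the model loses early items and drifts between themes), cluster small
# batches first, then merge the partial results in a second pass.
from openai import OpenAI

client = OpenAI()

def ask_model(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def theme_bugs(bugs: list[str], batch_size: int = 8) -> str:
    # Pass 1: cluster each small batch independently so no item falls out
    # of the model's effective attention as the input grows.
    partial_themes = []
    for i in range(0, len(bugs), batch_size):
        batch = bugs[i : i + batch_size]
        numbered = "\n".join(f"{i + j + 1}. {b}" for j, b in enumerate(batch))
        partial_themes.append(ask_model(
            "Group these bug reports into themes. Cite the bug numbers for "
            f"every theme; do not invent priorities:\n{numbered}"
        ))
    # Pass 2: merge the partial theme lists into one consistent, deduplicated set.
    return ask_model(
        "Merge these partial theme lists into one deduplicated list of themes, "
        "keeping every cited bug number:\n\n" + "\n\n".join(partial_themes)
    )
```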
Step 2: Define Minimum Viable Quality (MVQ) (44:25)
- MVQ is the explicit bar for “good enough” in the real world, not the lab.
- MVQ should specify three levels:
- Acceptable Bar: “Good enough” for most users.
- Delight Bar: Feature feels magical.
- Do Not Ship Bar: Below this, trust breaks and feature can’t launch.
Illustrative Example: Speech & Speaker Identification in Smart Speakers
- Delight Test:
- “If 8 or 9 out of 10 attempts work without a retry in realistic conditions, it feels magical. If one in five needs a retry, trust erodes fast.” (46:40, Merilee)
- Example “Do Not Ship” Scenario:
- Regular misidentification or repeated failed attempts.
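One way to make MVQ enforceable is to write the bars down as data and gate launches on a measured success rate. A minimal sketch in Python; the `MVQBar` class and the exact 0.80/0.85 thresholds are an illustrative mapping of Merilee’s rule of thumb, not numbers she prescribes.

```python
# MVQ sketch: encode the three levels as explicit, testable thresholds.
from dataclasses import dataclass

@dataclass
class MVQBar:
    do_not_ship_below: float  # below this bar, trust breaks: feature can't launch
    delight_at: float         # at or above this bar, the feature feels magical
    # anything in between is "acceptable": good enough for most users

# Illustrative bars for speaker identification, derived from the rule of thumb:
# 8-9 of 10 retry-free attempts feels magical; 1 in 5 needing a retry erodes trust.
SPEAKER_ID = MVQBar(do_not_ship_below=0.80, delight_at=0.85)

def ship_decision(retry_free_rate: float, bar: MVQBar) -> str:
    if retry_free_rate < bar.do_not_ship_below:
        return "do not ship"
    if retry_free_rate >= bar.delight_at:
        return "ship: delight"
    return "ship: acceptable"

# e.g. 9 retry-free successes out of 10 realistic "kitchen test" attempts:
print(ship_decision(9 / 10, SPEAKER_ID))  # -> "ship: delight"
```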
Tailor MVQ to Strategic Context:
- First to market? Users are more forgiving—ship earlier, label beta, allow undo (49:45).
- Entering a competitive market? Must match/exceed current standards.
- High-risk domain (Finance, Health)? MVQ must be much higher, with a human in the loop.
- If brand promises “It just works”? Conservative rollout, stricter fail-safes.
Step 3: Codify and Design Guardrails (56:05)
- Guardrails: Product-level interventions that limit or correct AI behavior to avoid trust-breaking failures.
- Real Example:
- AI feature summarized Slack threads, but started assigning owners to tasks no one had accepted.
- Solution: “We added one simple rule to the system prompt: Only assign an owner if someone explicitly volunteers or is directly asked and confirms. Otherwise, surface themes and ask the user what to do next. That single constraint eliminated the biggest trust issue almost immediately.” (58:10, Merilee)
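The episode does not show the implementation, but a rule like this can live in two layers: the system prompt and a deterministic post-check. A hedged sketch follows; the JSON shape, function name, and post-check logic are illustrative assumptions, not the team’s actual code.

```python
# Guardrail sketch: encode the ownership rule both as a system-prompt
# constraint (paraphrasing the rule from the episode) and as a
# belt-and-suspenders post-check on the model's output.
import json

SYSTEM_PROMPT = (
    "Summarize the thread. Only assign an owner to a task if someone "
    "explicitly volunteers, or is directly asked and confirms. Otherwise "
    "set owner to null, surface the theme, and ask the user what to do next. "
    'Return JSON: {"tasks": [{"summary": str, "owner": str | null, "evidence": str}]}'
)

def enforce_owner_guardrail(model_json: str, thread_text: str) -> list[dict]:
    """Strip any owner whose name never appears in the source thread,
    since the model can't have gotten a confirmation from someone
    who isn't in the conversation."""
    tasks = json.loads(model_json)["tasks"]
    for task in tasks:
        owner = task.get("owner")
        if owner and owner.lower() not in thread_text.lower():
            task["owner"] = None  # downgrade to an unowned theme
    return tasks
```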
Notable Quotes
- On model hallucination: “When confronted with mess, they confidently invent structure.” (12:45, Merilee)
- On ambiguity: “Ambiguity is kryptonite for probabilistic systems... that's when user trust starts to crack.” (26:40, Merilee)
- On delight thresholds: “Here’s a good rule of thumb: if 8 or 9 out of 10 attempts work without a retry in realistic conditions, it feels magical. If one in five needs a retry, trust erodes fast.” (46:45, Merilee)
- On guardrails: “A good guardrail determines what the product should do when the model hits its limit so that users don’t get confused, misled, or lose trust.” (56:20, Merilee)
Timestamps for Key Segments
| Timestamp | Segment |
|----------------|---------------------------------------------------------|
| 00:00 - 04:40 | Theme intro; Meta’s new PM interview focus; why “AI product sense” matters |
| 10:20 - 16:40 | Step 1: Map failure modes; Ritual 1 example |
| 26:20 - 34:00 | Ritual 2: Testing ambiguity |
| 38:05 - 41:40 | Ritual 3: Stress-test with difficult, real tasks |
| 44:25 - 53:35 | Step 2: Defining MVQ and delight/do-not-ship thresholds |
| 56:05 - 59:00 | Step 3: Codifying guardrails; real-world product fixes |
Memorable Moments
- The “Hallucination Roadmap” Scenario: Demonstrates how convincingly wrong a generative AI can be, and how easy it is for PMs to inherit silent failures.
- “Kitchen Test” for Smart Speakers: Evaluating smart home AI in noisy, uncontrolled environments (kids, dog, dishwasher) for real-world reliability.
- Cost Envelope Example: The true business cost of running AI features at scale (e.g., $0.02 per meeting note, as much as $2/month for power users).
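The cost-envelope figures compose with simple arithmetic. A tiny sketch, where the 100-notes-per-month usage level is an assumption chosen to reproduce the quoted ~$2/month power-user figure.

```python
# Cost-envelope sketch: compose the episode's per-call figure into a
# monthly per-user number.
COST_PER_MEETING_NOTE = 0.02     # dollars per note, from the episode
POWER_USER_NOTES_PER_MONTH = 100  # assumed usage level (not from the episode)

monthly_cost = COST_PER_MEETING_NOTE * POWER_USER_NOTES_PER_MONTH
print(f"${monthly_cost:.2f}/month per power user")  # -> $2.00/month
```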
Takeaways
- Building AI product sense is a matter of practical, weekly habits. Rely less on theory, more on ritualized real-world testing of models.
- The difference between “magical” and “broken” AI comes down to understanding (and constraining) failure modes, and communicating explicitly about quality and expectations.
- Product managers must redefine their approach: From “Is this a good idea?” to “How will this behave in reality, in the hands of messy, impatient users?”
- For every AI feature: Map failures, set concrete quality bars, anticipate scale/business constraints, and always design for user trust—not demo dazzle.
(To access visuals, diagrams, and written guides, see the episode’s written version at Lenny’s Newsletter.)
