The Growth Podcast

How to Use Codex Like an OpenAI PM | Abhi Muchhal, PM OpenAI (ex-Meta and Nubank)

Today’s episode

Six months ago, I told you Codex is the best way to use ChatGPT for PM work.

Most of you tried it. Some of you stuck with it, and very few of you are running it the way the people who built it actually run it.

Today we get that inside look. Abhi Muchhal is an International Growth PM at OpenAI. Before that, Meta, Nubank, and a founder building on the OpenAI API. He is one of the people responsible for ChatGPT’s growth in India, Brazil, and Japan, markets that are now driving a meaningful share of OpenAI’s 900 million weekly active users.

He opened his actual setup on camera. The harness. The automations. The prompts that actually work. And the ones that failed before he figured it out.

----

Brought to you by:
* Bolt.new - Ship AI-powered products 10x faster
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7
* Customer.io - Send smarter messages using your product data
* Ariso - Ship AI agents and features faster, with fewer regressions
* Jira Product Discovery - Plan with purpose, ship with confidence

----

* If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, Relay.app, Magic Patterns, Speechify, Bolt.new and Mobbin - become an annual subscriber ($150), and grab Aakash’s bundle.
* If you want access to my AI PM customizations - PM OS, Job Search OS, and Prompt Library - become a founding subscriber ($250).

----

Key Takeaways:

1. The harness is what separates Codex users from Codex runners - The connectors, the permissions model, and the skills layer are the three components that make Codex a system rather than a chat tool. Without all three, you are using an expensive autocomplete.

2. Generic prompts hit the wrong data - Abhi's team had separate B2C and B2B tables that both matched "tell me about weekly active users." The generic query returned the wrong answer every time. Specificity is the skill: name the exact dashboard and the exact metric. It looks simple, but it saves a lot of time when you scale.

3. Three permission levels - Read tasks get full autonomy. Synthesis and drafts get full autonomy. Anything going to another human gets your eyes first. Treating permissions as binary, all control or all autonomy, breaks.

4. The person who cares most builds the skill - One OpenAI growth team built a skill that automates their entire experiment review process. It writes the hypothesis, monitors the run, and prepares the review doc.

5. Real automations run without you - Abhi runs three automations before he opens a single dashboard: a Slack triage, a 9:30 AM self-refreshing growth dashboard pulling from 7-8 sources, and a weekly stakeholder update that writes its own first draft. He reviews, makes edits if needed, and sends.

6. Prototype before you document - Build the working prototype first, then write the 10-question companion FAQ. Showing engineers something that runs changes the conversation from whether to build to how to build it.

7. India is OpenAI's second largest market and under 10% of working adults are knowledge workers - The ChatGPT use case that drove US growth does not reach the same share of people in the markets driving the most new users. Building for the world means knowing how different the world actually is.

8. The WhatsApp computer use loop ran in 68 seconds - Point Codex at the WhatsApp desktop app. It reads what you missed, identifies action items, checks your calendar, and types the draft in the composer. One tap to send. Every PM building for international markets should run this workflow at least once.

9. Speaking evals is the key to breaking into a frontier lab - Name a capability you care about. Describe how you would measure it. Say how you would know if the model improved. You do not need 50 evals under your belt. You need to understand why they exist and what a good one measures.

10. Building something real is non-negotiable for frontier lab applications - Abhi had a live Chrome extension running on the OpenAI API at the time of his application.

----

Related content

Podcasts:
* The Ultimate Guide to ChatGPT Codex
* How PMs Ship 100K Lines of Code at OpenAI
* Evals are the new PRD

Newsletters:
* OpenAI’s Claude Code Killer
* AI Agents Guide for PMs
* How to Land a $300K+ AI PM Job

----

Where to find Abhi Muchhal:
* LinkedIn: https://www.linkedin.com/in/abhimuchhal/

OpenAI:
* LinkedIn: https://www.linkedin.com/company/openai/

Where to find Aakash:
* X: https://x.com/aakashgupta
* LinkedIn: https://www.linkedin.com/in/aagupta/
* Newsletter: https://www.news.aakashg.com

----

PS. Please subscribe on YouTube and follow on Apple & Spotify. It helps!

If you want to advertise, email productgrowthppp at gmail.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.news.aakashg.com/subscribe
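
Takeaway 3 can be made concrete with a small sketch. The Python below is only an illustration, not Abhi's setup or any Codex API: the `Permission` enum, the example tasks, and `needs_human_review` are hypothetical names chosen for this example. It encodes the rule that read tasks and drafts run on their own, while anything sent to another person waits for review.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"          # querying dashboards, reading docs or threads
    DRAFT = "draft"        # synthesis and first drafts that stay with you
    OUTBOUND = "outbound"  # anything delivered to another human


def needs_human_review(permission: Permission) -> bool:
    """Only outbound work is gated; read and draft tasks run autonomously."""
    return permission is Permission.OUTBOUND


# Hypothetical task list, for illustration only.
tasks = {
    "pull weekly active users from the B2C dashboard": Permission.READ,
    "draft the weekly stakeholder update": Permission.DRAFT,
    "send the stakeholder update": Permission.OUTBOUND,
}

# Tasks that must wait for a person before they go out.
review_queue = [name for name, level in tasks.items() if needs_human_review(level)]
```

The point of keeping three levels rather than two is that drafts stay autonomous: only the final send lands in `review_queue`.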

How PMs Ship 100K Lines of Code at OpenAI with Ryan Lopopolo, Member of Technical Staff

2w ago · 01:14:45

Today’s episode

Most companies are still debating whether PMs should ship code.

OpenAI is already debating the best ways for PMs to ship code.

They’re living in the future.

The builder behind a lot of that harness engineering is Ryan Lopopolo. He wrote the OpenAI post on harness engineering and runs a frontier team where PMs, designers, and engineers all ship using the same system.

The wild part for me? His PMs shipped around 100K lines of production code.

Did they open the IDE? Hell no! Their coding happened through PRDs, tests, docs, and harness rules. The model did the typing.

As someone who spent a decade in PM growth roles, I’ve seen how long it takes to move a feature from PRD in a doc to code in prod. For most companies, that latency is weeks.

In Ryan’s world, it can be days, and the PM is inside the loop instead of watching from Jira. So I wanted to get to the bottom of this:
* What does the harness look like when PMs can ship like that?
* How do engineering teams set PMs up so they don’t ship slop?
* What changes in the EPD trio when code is cheap, and validation is the bottleneck?

That’s today’s episode, and I come with receipts as Ryan goes deep.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7
* Bolt - Ship AI-powered products 10x faster
* Customer.io - Send smarter messages using your product data
* Ariso - Ship AI agents and features faster, with fewer regressions
* Pendo - The #1 software experience management platform

----

* If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, Relay.app, Magic Patterns, Speechify, Bolt.new and Mobbin - become an annual subscriber ($150), and grab Aakash’s bundle.
* If you want access to my AI PM customizations - PM OS, Job Search OS, and Prompt Library - become a founding subscriber ($250).

----

Key Takeaways:

1. Code is a liability, not an asset - Every engineering org was built around the assumption that code is expensive to produce, validate, and deploy. Codex inverts this. Code is now the cheapest part of the stack and the constraint moves to how clearly you describe the problem.

2. The new constraint is product decisions per week - With code generation effectively free and parallel, the bottleneck is no longer keystrokes. It is the quality of the brief, the clarity of the architectural boundaries, and the speed of verification.

3. A billion tokens a day is the new floor - Ryan's claim is that if you are not running this volume you are negligent. The math comes out to roughly $2K to $3K per engineer per month, which is trivial against the headcount cost of human-only execution.

4. A single PR can burn 350 million tokens - One refactor that would have taken Ryan three weeks ran on Codex for 60 hours straight across three days. He gave it two prompts total after the initial spec. The output matched what he would have produced himself.

5. The harness is the actual product - Codex CLI is the surface. The harness is everything that gets the agent the right context at the right phase: pre-work, messy middle, and close. Each phase needs different context, different tools, and different verification.

6. agents.md is forcibly injected context - This file lives in the repository root and is always loaded into the agent's context. Use it for the operating model and the non-negotiable rules. Everything else gets pulled in dynamically because context is a hard, scarce resource.

7. The painted-door technique works inside the codebase - Ryan's team enforces package boundaries so a designer can paint a fake UI on top of stubbed APIs. Real usage signal, no backend cost. This only works because the architecture refuses to permit a ball of mud.

8. The PM's PRD can become a shipped PR in one week - In Ryan's setup, the PM wrote a markdown PRD, the team reviewed it in a Monday meeting, and a working feature shipped to customers by the following week with zero PM-to-engineer back-and-forth.

9. The Monday morning roadmap starts with legibility - The first move is making the repository legible to the agent. Write the implicit team decisions down in a documentation tree. @-mention Codex to keep that tree updated whenever a Slack thread surfaces a new guardrail.

10. One agent beats multi-agent handoffs - The lossy friction of agent-to-agent handoffs costs more than it saves. The right answer is one agent with full addressability over design, backend, and frontend, powered by a model good enough to hold the whole task in context.

----

Where to find Ryan Lopopolo
* X
* LinkedIn
* OpenAI

Related content

Podcasts:
* How to Run Evals in Claude Code with Aparna Dhinakaran
* How to Build a Full AI Dev Team in Claude Code with Gabor Mayer
* This CPO Uses Claude Code to Run His Entire Work Life with Dave Killeen

Newsletters:
* PM’s Guide to Claude with Pawel Huryn
* How to Become a Builder PM with Mahesh Yadav
* How to Build a Team OS in Claude Code with Hannah Stulberg
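
Takeaway 6 draws a line between context that is always loaded and context that is pulled in on demand. A rough sketch of that split follows. It assumes a hypothetical `build_context` helper, uses word count as a crude stand-in for tokens, and invents the document names; none of it is Codex's actual loading logic.

```python
def build_context(always_loaded: str, candidates: dict[str, str],
                  wanted: list[str], budget_words: int) -> list[str]:
    """Return context chunks: the always-loaded file first, then on-demand docs.

    Word count stands in for tokens here; a real harness would use a tokenizer.
    """
    chunks = [always_loaded]
    used = len(always_loaded.split())
    for name in wanted:
        doc = candidates.get(name)
        if doc is None:
            continue  # unknown docs are skipped rather than guessed at
        cost = len(doc.split())
        if used + cost > budget_words:
            break  # context is scarce: stop once the budget is spent
        chunks.append(doc)
        used += cost
    return chunks


# Hypothetical contents, for illustration only (each string is five words).
context = build_context(
    always_loaded="Operating model and non-negotiable rules.",
    candidates={
        "billing": "Billing package boundaries and owners.",
        "design": "Design tokens and component guidelines.",
    },
    wanted=["billing", "design"],
    budget_words=12,
)
```

With a 12-word budget, the always-loaded text and the billing doc fit, and the design doc is left out until a task actually needs it.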

How to Run Evals in Claude Code with Aparna Dhinakaran, Founder and CPO of Arize

2w ago · 01:19:32

Today’s episode

Many of the smartest AI teams I know are running their evals on Arize. Teams at Uber, Booking.com, Pepsi, and others.

It’s become one of the most important skills for PMs. I already had on the CEO of Braintrust, Hamel Husain and Shreya Shankar, and Ankit Shukla.

Today I’m adding to this knowledge base on evals with a masterclass on evals in Claude Code.

Aparna Dhinakaran is the founder of Arize. She’s also their CPO. And she gives a masterclass in how to run all of your evals through Claude Code.

So if you want to do AI evals like the best, like Uber, like Booking.com, check out this episode. For anyone building in Claude Code, it’s a doozy.

If a candidate did this in an interview, Aparna said she would hire them on the spot.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Superhuman - The fastest email experience ever made
* Sign up and get 1-month free of Superhuman Mail with my link: superhuman.com/akash
* Land PM Job - My 12-week AI PM + Job Search Course, first 10 enrollees get a FREE 30-min 1:1 consultation
* Vanta - Automate your compliance. Close deals faster
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7
* Bolt - Ship AI-powered products 10x faster

----

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

Do you want to become an AI PM? I’ve created a course for you. Starts soon.

----

Key Takeaways:

1. Trace before you eval - A trace is the full step-by-step playback of what your agent did. Without it, you have no evidence base for evals. Every LLM call, every tool call, every intermediate output needs to be visible before you write a single eval.

2. A span is your unit of evaluation - A span is one discrete step inside a trace. Evals run at the span level, not the trace level. "Did this specific scoring step get the priority right?" is a more useful question than "was the whole run good?"

3. Instrumentation is now a one-command job - Claude Code's instrumentation skills can set up observability for your agent automatically. Arize Phoenix's skill looks at your codebase, identifies the LLM calls and tool calls, and wires them to the tracing layer. No engineering support required.

4. The vibe eval is a draft, not a verdict - An LLM can suggest what your evals should test by looking at your traces. That suggestion will not know your bug-first policy, your comp logic, or your definition of "critical." Treat it as v0 and refine against your actual judgment.

5. When evals fire, two things could be wrong - The agent produced a bad output. Or the eval is miscalibrated. Reading the flagged span yourself is the only way to know which one needs fixing. Both are normal. Both are good news.

6. Evals drift and need regular realignment - Your priorities change. Your bug policy changes. Your product changes. An eval calibrated to last quarter will start misfiring this quarter. Regular alignment to human feedback is maintenance, not a failure.

7. The self-improvement loop is already running at the best teams - Fetch all spans where evals fired. Group by failure category. Propose a specific prompt fix. Review and approve. Ship the new version. This loop runs on a schedule and requires a human at the approval step.

8. Enterprise PMs: start with one internal agent - Not a customer-facing product. An internal tool that takes four hours off your week. Once you have it, you will naturally want to trace it. That is when observability starts to matter to you personally.

9. The context graph is the enterprise unlock - Agents are only as useful as the context they have. Enterprise data lives in silos. The teams breaking through are building unified context layers that give one agent access to CRM, Gong, analytics, GitHub, and Slack.

10. Product taste is still the alpha - Code is cheap now. Shipping speed is table stakes. The PMs who pull ahead are the ones with the sharpest judgment about what to build, and the loops that make their agents better every day.

----

Related content

Podcasts:
* AI Evals with Hamel Husain and Shreya Shankar
* Evals are the new PRD with Ankur Goyal
* AI PM Crash Course with Aman Khan

Newsletters:
* AI Evals for PMs: Everything You Need to Know to Get Started in 2026
* Your Complete AI PM Course & Career Roadmaps
* AI PM’s Guide to LLM Judges
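
The loop in takeaway 7 (fetch flagged spans, group by failure category, propose a fix, wait for approval) can be sketched in a few lines. This is a minimal illustration under stated assumptions: `FlaggedSpan`, `group_by_failure`, and `propose_fixes` are hypothetical names, the sample spans are invented, and nothing here calls the Arize or Phoenix APIs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlaggedSpan:
    span_id: str
    failure_category: str  # label assigned by the eval that fired


def group_by_failure(spans: list[FlaggedSpan]) -> dict[str, list[str]]:
    """Group flagged span ids by failure category."""
    groups: dict[str, list[str]] = defaultdict(list)
    for span in spans:
        groups[span.failure_category].append(span.span_id)
    return dict(groups)


def propose_fixes(groups: dict[str, list[str]]) -> list[dict]:
    """Create one pending proposal per category, largest group first."""
    ordered = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
    return [
        {"category": category, "span_ids": ids, "status": "pending_human_approval"}
        for category, ids in ordered
    ]


# Invented sample data standing in for spans fetched from a tracing backend.
flagged = [
    FlaggedSpan("s1", "wrong_priority"),
    FlaggedSpan("s2", "missing_context"),
    FlaggedSpan("s3", "wrong_priority"),
]
proposals = propose_fixes(group_by_failure(flagged))
```

Every proposal starts as `pending_human_approval`, which mirrors the requirement that a human signs off before a new prompt version ships.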

Claude Code for Non-Technical PMs, with Andre Albuquerque

3w ago · 01:09:38

Today’s episode

The market is looking tough for non-technical PMs.

Every single week, my comments look exactly the same: brilliant product managers who have the vision, specs, and roadmap in mind, but have zero coding skills. They want to build, and while thousands of technical resources exist online, those resources make a flawed assumption: that you already know how to code.

So when I invited Andre Albuquerque on my podcast, I had to ask him to share his setup. Andre is the founder of Builders Camp, a product school with 4,000+ students across 30 countries. He runs five businesses with Claude Code and has never been a developer.

Live on the episode, he built a fully functional product from scratch to show how easily a non-technical PM can go from 0 to 1. He also walked me through CLAUDE.md architecture, custom multi-agent skills, and the bridge between Lovable and Claude Code (which, by the way, not many people are talking about).

If you have been putting off Claude Code because it feels too technical or intimidating to set up, this episode is absolutely for you.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Customer.io - Send smarter messages using your product data
* Amplitude - The market-leader in product analytics
* Bolt - Ship AI-powered products 10x faster
* Arize - Ship AI agents and features faster, with fewer regressions
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7

----

* If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, Relay.app, Magic Patterns, Speechify, Bolt.new and Mobbin - become an annual subscriber ($150), and grab Aakash’s bundle.
* If you want access to my AI PM customizations - PM OS, Job Search OS, and Prompt Library - become a founding subscriber ($250).

----

Key Takeaways:

1. Non-technical PMs are stuck in Jira, Linear, and PowerPoints - Most European PMs are still product owners in disguise, paper-shuffling between strategy and engineering teams. The way out is to actually start building, not to lobby for more autonomy.

2. Start with Lovable on a personal project - Build something for your family, your friends, yourself. The codebase does not need to be pretty. The point is the safety to make mistakes without breaking anything that matters.

3. The Lovable + Claude Code bridge nobody documents - Connect both tools to the same GitHub repo. Write code in Claude Code with all its depth. QA visually in Lovable with its hosted preview. Publish from Lovable's button. The perfect transition layer.

4. Lovable, Cursor, and Vercel are not competitors - Lovable bundles the IDE, the hosting, and the deployment in one product. Vercel exposes the hosting layer so you can run real branches with real preview URLs. Cursor is just an IDE with a generous free tier.

5. Cursor has a free debugging agent - When Claude Code breaks, open a Cursor agent and paste the error. The free agent unsticks you instead of leaving you stuck at step zero.

6. CLAUDE.md is your team's culture - Loaded automatically every session. The first rule should be "for every task, call the PM agent." When you notice yourself fixing the same issue twice, update CLAUDE.md so it never happens again.

7. The PM agent never writes code - The PM orchestrator's only job is to decide which other agent should handle the work. The researcher investigates. The designer proposes. The engineer architects. The implementer writes.

8. Do not copy famous people's skills wholesale - Going on LinkedIn and downloading 100 skills from product celebrities creates more confusion than value. Look at how your real team works. Write each role down as an agent.

9. Fix the agent, not the feature - When something ships wrong, do not patch the output. Identify which agent in the pipeline failed, update its instructions, and run the pipeline again. The next session inherits the fix.

10. The Monday morning move is exactly three steps - Get added as a collaborator on a low-risk repo. Pick the oldest ticket in the backlog. Push a branch and demo by Friday.

----

Related content

Podcasts:
* Claude Code and agents with Gabor Meyer
* n8n, Claude Code, and OpenClaw with Mahesh Yadav
* Claude Code with Hannah Stulberg

Newsletters:
* How to Build a Full AI Dev Team
* How to Become a Builder PM
* How to build a Team OS in Claude Code

P.S. Reply with “CLAUDE” and I’ll send you Andre’s actual CLAUDE.md template. He said we could share it.
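
Takeaway 7 describes an orchestrator that only routes work. As a minimal sketch of that rule, assuming the four role names from the takeaway and invented work-type labels (this is not Andre's CLAUDE.md or a Claude Code API), the routing table simply has no entry that points back at the PM:

```python
# Role names come from takeaway 7; the work-type keys are hypothetical.
ROUTES = {
    "investigate": "researcher",
    "propose_design": "designer",
    "architect": "engineer",
    "write_code": "implementer",
}


def pm_route(work_type: str) -> str:
    """Return the agent that should handle the work. The PM never handles it itself."""
    try:
        return ROUTES[work_type]
    except KeyError:
        # An unmapped work type means a role is missing, so fail loudly.
        raise ValueError(f"No agent defined for {work_type!r}; add a role first") from None


assigned = pm_route("write_code")
```

Raising on an unknown work type follows takeaway 9: when routing fails, the fix belongs in the agent definitions, not in a one-off patch.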

PM's Guide to Claude - When to use Chat vs Cowork vs Code, with Pawel Huryn

3w ago · 01:32:40

Today’s episode

When do you use Claude chat vs Cowork vs Code? No one has created a resource that helps you get the most out of the Claude ecosystem.

Until now. I’ve brought back Pawel Huryn, the guest behind our most popular episode ever, the Complete Course on AI Product Management.

Today we’re covering everything you need to know to get the most out of the Claude ecosystem.

Most PMs open Claude chat. Ask something. Get an answer. Close the tab. Tomorrow, same thing. Fresh context. Zero memory.

The PM who tracked Anthropic’s 74 releases in 52 days stopped doing this entirely. He built a system where Claude organizes its own knowledge, extracts its own rules from data, promotes hypotheses when evidence confirms them, and demotes them when it does not. The system improves without him telling it what went wrong.

I sat down with Pawel Huryn, creator of the Product Compass newsletter. He has defined 60+ PM skills, built a PM skills marketplace that hit 10,000 GitHub stars, and runs his entire content operation across Cowork, Claude Code, and Dispatch.

In this episode, he walks through every screen live. Real files. Real agent workflows. Real self-improving knowledge bases.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Bolt - Ship AI-powered products 10x faster
* Amplitude - The market-leader in product analytics
* Jira Product Discovery - Plan with purpose, ship with confidence
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7
* Land PM Job - 12-week experience to master getting a PM job

----

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

I’m accepting applications for my third LandPMJob cohort. Join me.

----

Key Takeaways:

1. Stop using Claude Chat as your default. Cowork accesses real files, connects to Gmail and Slack via MCP, and runs parallel sub-agents. Chat does none of this.

2. Skills are the highest ROI investment. Install marketplace baselines, iterate 5-6 times with specific feedback, and Claude rewrites from first principles until it reaches 99% accuracy.

3. Progressive disclosure keeps context clean. The agent reads skill names and descriptions first. It loads full instructions only when the task matches. Hundreds of skills, minimal overhead.

4. Your CLAUDE.md should route, not store. Project structure and pointers only. Domain knowledge lives in separate files the agent loads on demand.

5. Build self-improving knowledge with three types. Rules are confirmed and applied by default. Hypotheses are tracked with evidence. Rejected patterns are kept to avoid retesting.

6. The three-line self-improving prompt works for any domain. Review rules before starting. Apply confirmed rules. Update after feedback. Testing, marketing, strategy, whatever.

7. Claude Code adds explorer view, hooks, subagents, and local MCP scoping. PMs need it once their system grows past 50 files.

8. Every Product Compass infographic was built in Claude Code. HTML generation, component library, iteration through conversation, PNG export. Zero code written by the human.

9. Use Agent Browser from Vercel instead of Chrome MCP. Chrome MCP screenshots every 0.5s and burns $100/hr. Agent Browser uses headless mode and is token-efficient.

10. Dispatch lets you run multiple tasks from your phone. Start an infographic, check emails, analyze competitors. Each runs as a separate thread. Your system works while you live.

----

Where to find Pawel Huryn
* LinkedIn
* Product Compass Newsletter
* PM Skills Marketplace on GitHub
* Quadathon - starts May 9th

Related content

Podcasts:
* n8n Masterclass with Pawel Huryn
* Claude Code PM OS with Dave Killeen
* Claude Code Team OS with Carl Vellotti

Newsletters:
* The complete Claude Cowork guide
* How to use Claude Code like a pro
* Build your PM operating system
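
Takeaway 5 names three knowledge types: rules, hypotheses, and rejected patterns. The sketch below shows one way the promotion and demotion could work. It is an illustration only: the `Pattern` class, the `record_evidence` function, the thresholds, and the example pattern are assumptions for this sketch, not values or code from Pawel's system.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    text: str
    status: str = "hypothesis"  # "hypothesis", "rule", or "rejected"
    confirmations: int = 0
    contradictions: int = 0


def record_evidence(pattern: Pattern, confirmed: bool,
                    promote_at: int = 3, reject_at: int = 2) -> Pattern:
    """Update counts and move a hypothesis to "rule" or "rejected".

    Both thresholds are assumptions made for this sketch.
    """
    if pattern.status != "hypothesis":
        return pattern  # rules and rejected patterns are left alone here
    if confirmed:
        pattern.confirmations += 1
    else:
        pattern.contradictions += 1
    if pattern.contradictions >= reject_at:
        pattern.status = "rejected"  # kept on file so it is not retested
    elif pattern.confirmations >= promote_at:
        pattern.status = "rule"      # applied by default from now on
    return pattern


# Hypothetical pattern, for illustration only.
headline = Pattern("Question headlines get more replies")
for outcome in (True, True, True):
    record_evidence(headline, outcome)
```

Rejected patterns stay in the store on purpose, so the agent can check them before proposing the same idea again.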

How to Build a Full AI Dev Team in Claude Code | Guide from Google PM Gabor Meyer

Apr 30 · 02:15:18

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Maven - Get a $675 discount off Gabor’s course with my code
* Amplitude - The market-leader in product analytics
* Testkube - The leading test orchestration platform
* Land PM Job - My 12-week AI PM + Job Search Course starts Monday!
* Product Faculty - Get $550 off their #1 AI PM Certification with code AAKASH550C7

Today’s episode

Here’s the problem with most Claude Code demos: they stop at the prototype.

Nobody shows what happens next. You try to add a second feature. The first one breaks. The styling reverts to default. The code is so tangled that you spend more time debugging than you saved by generating.

Gabor Mayer showed me what happens when you stop treating Claude Code like a magic prompt box and start treating it like a team.

He is a PM at Google. He has not written production code in 15 years. But over the past several months, he has been building real mobile apps using 21 specialized Claude Code agents. Not prototypes that live in a demo. Apps that are on the App Store.

In today’s episode, he walked through the entire workflow live and shared all the resources for free.

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

Do you want to become an AI PM? I’ve created a course for you. Starts next week.

Newsletter deep dive

Thank you for having me in your inbox. Here is the complete guide to building a full AI development team in Claude Code:
* Why one-prompt vibe coding fails
* The 21-agent team architecture
* The spec-first workflow
* From design to code without touching either
* What changes when PMs actually build

Save this. The full 10-step playbook on one page. Everything below is the why and how behind each step.

1. Why one-prompt vibe coding fails

Every PM I know has built something with Bolt, Lovable, or Replit. The prototype looks great. It runs. It impresses people in a Slack message.

Then you try to ship it to real users. And you hit a wall.

Blocker 1 - Context compression silently destroys your spec

This is the failure mode that nobody talks about in tutorials. When you give one agent one massive prompt, the model compresses context. Details get dropped. Not randomly. Strategically. The model decides what is “important” and what is not.

In the episode, Gabor defined a complete color palette. Oranges, neutrals, specific accent tones. The agent received everything. The output used none of it. The layout was there. The structure was solid. But every color was a default.

The reason is straightforward. When the context window is full, visual styling details are lower priority than functional logic. So the model drops them. Silently. Without warning. Without an error message. You just get generic output and wonder what went wrong.

The fix is not better prompts. It is context engineering. Smaller, scoped tasks. Each agent gets only the context it needs for its specific job. The designer agent gets the brand guideline. The CTO agent gets the architecture spec. Neither gets the full 50-page document.

Blocker 2 - AI-generated code compiles but is not maintainable

A Reddit comment that hit home for Gabor: “Vibe coding is just the rebranding of unmaintainable, low-quality source code.”

This is the real prototype-to-production gap. The code works today. You can demo it. You can push it to TestFlight. But the moment you touch it to add a feature, three other features break. No naming conventions. Circular references between modules. Zero comments explaining why anything was built the way it was.

The fix is a dedicated code quality agent. Gabor calls his the Spaghetti Agent. It runs after every sprint and checks naming conventions, circular references, comment coverage, and structural debt. When he ran it on his codebase for the first time, it caught issues he never would have found manually.

If you are building anything beyond a one-off demo, this agent is not optional. I covered similar quality patterns in my AI testing guide and my AI evals deep dive.

Blocker 3 - No dependency mapping means cascading failures

When you build without organizing work into sprints, agents try to build features that depend on code that does not exist yet. Front-end components reference API endpoints that have not been created. Database queries call tables that have not been defined.

The Atlassian MCP currently cannot create sprints directly in JIRA. That is a real limitation. Gabor uses tags as a workaround. He tags tickets as Sprint 1, Sprint 2, Sprint 3 and maps dependencies between them manually before starting the build. Without this step, the entire multi-agent workflow falls apart.

Every PM who has gone from prototype to production with AI agents has hit at least one of these blockers. The ones who shipped figured out the workarounds. The ones who quit assumed the tools were the problem.

Here is what the three blockers look like side by side, and what flips the moment you stop one-prompting and start running a team.

2. The 21-agent team architecture

You do not need 21 agents to start. Three will get you surprisingly far. But understanding the full architecture shows you where the complexity lives and which roles to add as your projects grow.

Here is the full roster: four clusters, 21 roles, and the markdown file pattern that makes them portable across every project you build next.

2a. The core agents every PM needs

The System Analyst is the linchpin. It breaks down product requirements into technical specifications. It asks clarifying questions one at a time. It documents decisions in Confluence. It creates tickets in JIRA. Without this agent, every other agent operates on incomplete context.

In the episode, the system analyst asked 14 clarifying questions before a single line of documentation was written. Vector DB choice. Usage limit mechanics. Conversation history handling. Search fallback strategy. API provider. Minimum iOS version. Screen count. Naming conventions. Each question one at a time so the answers stay deep.

The prompt pattern that makes this work:

“Please act like a good system analyst. Ask clarifying questions until you have a complete and comprehensive understanding. Ask questions one at a time. Do not start writing documentation until all questions are answered.”

Two critical instructions. “One at a time” prevents the agent from dumping 25 questions at once. “Do not start writing” stops it from jumping ahead before the spec is complete. Different LLMs have different tendencies. Some love to start coding instantly. You need to explicitly constrain them. This is the same principle behind the prompt engineering techniques that work across any AI tool.

The Spaghetti Agent handles code maintainability. Naming conventions. Circular references. Comment quality. Structural debt. Born from that Reddit comment. When Gabor ran it on his codebase for the first time, it caught problems he never knew existed.

The UX Flow Architect creates clickable prototypes using Figma’s built-in prototyping arrows. This is a small but important detail. The early versions of this agent placed visually drawn arrows between screens instead of using Figma’s actual prototyping connections. The prototype looked like it had navigation. But when you clicked play, nothing happened. It took months of iteration to fix.

Each agent has a specific Claude Code agent markdown file that defines its role, its constraints, and its interaction patterns. The setup mirrors how you would build a ...
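
The manual dependency mapping described under Blocker 3 can also be sketched in code. This is a minimal illustration, not Gabor's workflow: the `plan_sprints` function and the ticket names are hypothetical, and it relies only on Python's standard `graphlib.TopologicalSorter`. Tickets whose dependencies are all finished land in the same sprint tag.

```python
from graphlib import TopologicalSorter


def plan_sprints(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tickets into sprints so nothing is built before what it depends on.

    `dependencies` maps each ticket to the set of tickets it needs first.
    A circular dependency raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    sprints = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        sprints.append(ready)
        sorter.done(*ready)                 # mark the sprint complete
    return sprints


# Hypothetical tickets, for illustration only.
tickets = {
    "db-schema": set(),
    "api-endpoints": {"db-schema"},
    "chat-screen": {"api-endpoints"},
    "settings-screen": {"api-endpoints"},
}
sprints = plan_sprints(tickets)
# sprints[0] would be tagged Sprint 1, sprints[1] Sprint 2, and so on.
```

Because the sorter raises on cycles, the same pass also surfaces tickets that depend on each other before any agent starts building.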

How to Become a "Builder PM" with n8n, Claude Code, and OpenClaw | Mahesh Yadav (ex-Google, AWS, Meta, Microsoft; Founder LegalGraph AI)

Apr 20 · 01:36:50

Today’s episode

LinkedIn just changed the title of its product managers to product builders.

What does it even mean to be a “builder PM”?

Well, tools only get you so far. Learning Claude Code is helpful, but it means nothing if you don’t understand the underlying first principles.

That’s today’s episode.

Mahesh Yadav created one of our most popular episodes, with over 35K views on YouTube, and now he’s back. Earlier, he taught you AI agents. Today, he’s teaching you how to become a builder PM.

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

I’m giving a free talk on how to get interviews at the top AI PM companies on Thursday, April 23, 2026, at 9:00 AM PDT. Grab your seat.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Maven - Build cohort-based courses that scale
* Amplitude - The market leader in product analytics
* Jira Product Discovery - Prioritize what matters with confidence
* NayaOne - Airgapped cloud-agnostic sandbox to validate AI tools faster
* Product Faculty - Get $550 off their #1 AI PM Certification with my link

----

Key Takeaways:

1. Builder PM defined - A builder PM talks to customers, figures out what to build, and ships the first version to 10 customers without talking to any developer. The skill is knowing what to build, not knowing how to code.
2. Four agent components - Every agent that works has intelligence (model), tools (actions), memory (session context), and knowledge (your company data). Every agent that disappoints is missing at least one.
3. n8n for foundations - n8n is the best learning tool because you visually see every component of the agent architecture as separate nodes. Build your first multi-agent system and evaluation pipeline here.
4. Claude Code ate three company types - Context companies, action companies, and evaluation companies all got replaced by one agentic loop inside Claude Code. The three pieces collapsed into one tool.
5. Computer control is the real unlock - File system access plus bash commands equals full laptop capability. This is why Claude Code went from coding tool to work operating system.
6. Long-horizon jobs changed the game - AI agents went from 3-minute tasks to 3-6 hour sustained jobs in six months. This turns Claude Code from assistant to autonomous worker.
7. Continuous learning loops - Build a second agent that watches your corrections to the first agent's work. After five repeated patterns, it proposes a skill update. Your tools get better every day.
8. OpenClaw pattern - Delegation through existing channels, full machine sandboxing, model-agnostic. It is not a product but a pattern that Google and AWS will copy inside their ecosystems.
9. AI PM interviews changed - At L5 and L6, product sense questions are being replaced with live building exercises and system design for AI architectures. Pull out Claude Code during the interview or you are already out.
10. Compensation trajectory - From $120K at Microsoft to $1.3M at Google over 13 years, doubling every 18 months through AI-focused switches. He left because big companies kill innovation with six-week approval cycles.

----

Where to find Mahesh Yadav
* LinkedIn
* Maven Course

Related content

Podcasts:
* Claude Code Team OS with Carl Vellotti
* OpenClaw + Claude Code with Naman Pandey
* Claude Code OS with Dave Killeen

Newsletters:
* The complete context engineering guide
* How to use Claude Code like a pro
* Practical AI agents for PMs

----

PS. Please subscribe on YouTube and follow on Apple & Spotify. It helps!

If you want to advertise, email productgrowthppp at gmail.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.news.aakashg.com/subscribe
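The continuous learning loop in takeaway 7 (a second agent that watches corrections and proposes a skill update after five repeats) comes down to a counter with a threshold. Here is a minimal Python sketch of that counting logic only. The class name, method name, and sample correction are hypothetical and not from the episode, and how corrections are detected or how a skill file gets edited is left out.

```python
from collections import Counter
from typing import Optional

PROPOSAL_THRESHOLD = 5  # "after five repeated patterns" (takeaway 7)


class CorrectionWatcher:
    """Hypothetical second agent: counts repeated corrections to the first agent's work."""

    def __init__(self, threshold: int = PROPOSAL_THRESHOLD) -> None:
        self.threshold = threshold
        self.counts = Counter()   # correction pattern -> times seen
        self.proposed = set()     # patterns already turned into a proposal

    def record(self, pattern: str) -> Optional[str]:
        """Record one correction; return a proposal the first time a pattern reaches the threshold."""
        self.counts[pattern] += 1
        if self.counts[pattern] >= self.threshold and pattern not in self.proposed:
            self.proposed.add(pattern)
            return f"Propose skill update: always apply '{pattern}'"
        return None


watcher = CorrectionWatcher()
# The same correction made six times: only the fifth one triggers a proposal.
proposals = [watcher.record("use sentence-case headings") for _ in range(6)]
```

Keeping a `proposed` set means the watcher suggests each update once instead of nagging on every later repeat.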


How to Design like OpenAI and Figma

Apr 10 · 00:53:35

Today’s episode

The design process you learned is already dead.

Most teams still follow the same linear pipeline. Low fidelity to high fidelity to handoff. Sketch it. Spec it. Ship it over the wall. That pipeline was built around a constraint that no longer exists. High fidelity used to be expensive. It is not anymore.

I brought in two people who represent both sides of the new design infrastructure.

Ed Bayes is a member of the design staff at OpenAI. He leads design on Codex, which just crossed 2 million weekly users with usage surging 3X since the start of the year. He spends 70-80% of his time coding. He still calls himself a designer.

Gui Seiz is the Director of Product Design for AI at Figma. He leads design on all their AI features, including the Figma MCP server and Figma Make. His designers are now shipping PRs to production.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Bolt: Ship AI-powered products 10x faster
* Amplitude: The market leader in product analytics
* Pendo: The #1 software experience management platform
* NayaOne: Airgapped cloud-agnostic sandbox
* Product Faculty: Get $550 off their #1 AI PM Certification with my link

----

If you are trying to understand the new design workflow, this is the one episode to watch.

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

I’m putting on a free webinar on Behavioral and AI PM interviews. Join me.

----

Key Takeaways:

1. Code vs canvas is a false dichotomy - The best designers use both fluidly. Canvas for exploration, collaboration, and pixel-perfect craftsmanship. Code for interactions, responsive testing, and the last mile of polish. The question is what you are trying to learn, not which tool to commit to.
2. High fidelity is no longer expensive - The entire linear design process existed because building something interactive required engineering resources. That constraint is gone. A functional wireframe takes the same time as a paper sketch.
3. The Codex-Figma MCP makes handoff lossless - Import screens from a running React app into Figma with exact pixel values. Border radius, padding, shadows, all one to one. It is not a screenshot. It is a responsive, editable design artifact.
4. The reverse direction works seamlessly - Make changes in Figma, paste a component link into Codex, and it updates your code automatically. No redline spec, no handoff document.
5. Ed spends 70-80% of his time coding and still calls himself a designer - The medium changed but the mandate did not. Designers are still the voice of the user, still upholding craft. The tools expanded, the role stayed.
6. Figma designers are shipping PRs to production - Teams that six months ago were AI-curious are now banging down the door. Monetization designers who never wrote code are building technically complex prototypes.
7. "Prototypes, not PRDs" is the emerging norm - PMs at OpenAI bring working prototypes to design reviews. They ship PRs to stress-test ideas before handing off to engineering.
8. You do not need permission to start - Someone from OpenAI's GTM team built an iOS app with zero experience. Download Codex and build something for yourself tonight.
9. Curiosity is the defining skill for this era - Not code proficiency, not design talent. The AI is an infinitely patient tutor. Ask questions. Build understanding alongside output.
10. Total football is the mental model - Every player can play every position. Roles still have natural spikes. But the tool constraints that enforced rigid boundaries are dissolving.

----

Where to find Ed Bayes
* LinkedIn
* OpenAI
* X

Where to find Gui Seiz
* LinkedIn
* Figma
* X

Related content

Podcasts:
* Xinran Ma - Design with AI
* Carl Vellotti - Claude Code PM OS
* Codex PM Guide with Carl Vellotti

Newsletters:
* AI prototyping for PMs
* The PM guide to Bolt
* Codex PM guide

----

PS. Please subscribe on YouTube and follow on Apple & Spotify. It helps!

If you want to advertise, email productgrowthppp at gmail.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.news.aakashg.com/subscribe


How to build a Team OS in Claude Code with Hannah Stulberg, PM @ DoorDash

Apr 7 · 01:10:33

Today’s episode

The way PM teams are trending, one PM is going to support 20 people.

Not just engineers. Designers. Analysts. Strategy partners. GTM. Sales. Support.

You cannot answer everyone’s questions about everything. You cannot be in every Slack thread. You cannot be the bottleneck for context that already exists somewhere in a Google Doc no one can find.

But you can give them a high-context, well-organized repo.

Hannah Stulberg is a PM at DoorDash and a former Google PM. She has spent over 1,500 hours in Claude Code.

She wrote the viral Claude Code for Everything series. Her setup is not a personal productivity system. She has structured her entire team’s context into a shared repo that everyone queries.

Her strategy partner - completely non-technical - puts up pull requests every day. Her engineers query metric definitions without asking the analyst. Her designers pull product context without waiting on a PM.

If you are building a team that runs on AI, this is the episode to watch.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Bolt: Ship AI-powered products 10x faster
* Jira Product Discovery: Plan with purpose, ship with confidence
* Kameleoon: Leading AI experimentation platform
* Amplitude: The market leader in product analytics
* Product Faculty: Get $550 off their #1 AI PM Certification with my link

----

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

I’m putting on a free webinar on Behavioral and AI PM interviews. Join me.

----

Key Takeaways:

1. Build a Team OS, not a personal OS - A shared repo where every function checks in work. Engineers, designers, and analysts self-serve without asking the PM.
2. Root CLAUDE.md is everything - Doc index, team roster with Slack IDs, channel map. Keep it under one page or you burn context every session.
3. Nested indexes save 97% of context - Every folder gets a navigation CLAUDE.md. A customer query used only 3% of the context window.
4. Three token tiers - Always-loaded root (~500 tokens), folder indexes loaded on navigation (200-500 tokens), content files loaded on demand (1,000-10,000+ tokens).
5. Split analytics by product area - Metrics, queries, schemas separated. Progressive loading prevents waste.
6. Gate launches on repo updates - A feature is not shipped until metrics, queries, schemas, and playbooks are checked in.
7. Verified playbooks kill hallucinations - Analyst-audited methodology. Claude follows verified steps instead of inventing its own.
8. Plan mode makes 10x docs - Shift+Tab twice. Five phases: load context, ask questions, build plan, push thinking, review agents.
9. Split long docs across parallel agents - Each writes to a temp file. An orchestrating agent compiles. This prevents context overflow.
10. The flywheel compounds daily - Automate one task, free time, improve the repo. After 1,500 hours, she is still iterating every day.

----

Where to find Hannah Stulberg
* LinkedIn
* In the Weeds Substack

Related content

Podcasts:
* My Claude Code PM OS with Dave Killeen
* Claude Code + Analytics with Frank Lee
* Claude Code as PM OS with Carl Vellotti

Newsletters:
* The ultimate guide to context engineering
* Build your PM operating system
* How to use Claude Code like a pro

----

PS. Please subscribe on YouTube and follow on Apple & Spotify. It helps!

If you want to advertise, email productgrowthppp at gmail.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.news.aakashg.com/subscribe
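The three token tiers in takeaway 4 lend themselves to a quick back-of-the-envelope budget. The Python sketch below uses the tier ranges quoted in the takeaways. The 200,000-token context window, the function names, and the example query (two folders visited, one file opened) are assumptions for illustration, not figures from the episode.

```python
from typing import Tuple

CONTEXT_WINDOW = 200_000  # assumed window size; not stated in the episode notes

# (low, high) token estimates per tier, from takeaway 4.
# The "10,000+" upper bound is treated as 10,000 here.
TIERS = {
    "root": (500, 500),              # always loaded
    "folder_index": (200, 500),      # loaded on navigation
    "content_file": (1_000, 10_000), # loaded on demand
}


def query_cost(folders_visited: int, files_opened: int) -> Tuple[int, int]:
    """Return (low, high) token estimates for a single query."""
    low = (TIERS["root"][0]
           + folders_visited * TIERS["folder_index"][0]
           + files_opened * TIERS["content_file"][0])
    high = (TIERS["root"][1]
            + folders_visited * TIERS["folder_index"][1]
            + files_opened * TIERS["content_file"][1])
    return low, high


def share_of_window(tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Tokens as a percentage of the context window."""
    return 100 * tokens / window


low, high = query_cost(folders_visited=2, files_opened=1)
print(low, high)                                     # 1900 11500
print(share_of_window(low), share_of_window(high))   # 0.95 5.75
```

Under these assumptions a query that walks two folder indexes and opens one file costs roughly 1% to 6% of the window, which brackets the 3% quoted in takeaway 3.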


How to Turn Claude Code into an Operating System with Carl Vellotti

Mar 30 · 01:06:48

Today’s episode

Claude Code just hit $2.5 billion in annualized revenue in 9 months.

It is the fastest B2B software product ramp in history.

So why are most people still using it like a chatbot?

This is how most people use Claude Code. Type a prompt and get output. The context fills up. It compacts. You lose everything. You start over.

The top users flipped it. They built skills that interview through a framework before building anything. They use sub-agents that preserve context. They have operating systems where every file, every person, every project has a home.

That shift is what today’s episode is about.

I sat down with Carl Vellotti for the third time. His first episode was the beginner course. His second episode was the advanced masterclass. Together they crossed over a million views across platforms.

Today is the operating system layer. If you are already an 80 out of 100 on Claude Code, this episode will bring you to a 95 out of 100.

This episode covers context management, creating sub-agents to manage your context for you, auto-triggering skills with hooks, trustworthy data analysis with Jupyter notebooks, and building an operating system around it all.

If you are living in Claude Code 8 to 10 hours a day and want to stop fighting the tool, this is the one episode to watch.

----

Check out the conversation on Apple, Spotify, and YouTube.

Brought to you by:
* Bolt: Ship AI-powered products 10x faster
* Amplitude: The market leader in product analytics
* Pendo: The #1 software experience management platform
* NayaOne: Airgapped cloud-agnostic sandbox
* Product Faculty: Get $550 off their #1 AI PM Certification with my link

----

If you want access to my AI tool stack - Dovetail, Arize, Linear, Descript, Reforge Build, DeepSky, Relay.app, Magic Patterns, Speechify, and Mobbin - grab Aakash’s bundle.

I’m putting on a free webinar on Behavioral and AI PM interviews. Join me.

----

Key Takeaways:

1. Context management is the real skill - A single web search eats 10% of your context. Run /context to see what is consuming it. System prompt and MCPs take 10-16% before you type one message.
2. Sub-agents save 20x context - Delegate research to a sub-agent. The same task costs 0.5% instead of 10%. Your main session only gets the summary.
3. Replace MCPs with CLIs - MCPs eat context by existing. CLIs have zero overhead. GitHub CLI, Vercel CLI, Google Workspace CLI are all dramatically more efficient.
4. Powerful skills need zero code - Anthropic's front-end design plugin is just a good prompt. No APIs or tooling. Just rules that tell Claude "do not look like AI."
5. Give Claude self-checking tools - The make slides skill uses Puppeteer to screenshot output, measure overflow, and fix issues before you see them.
6. Repeat prompts for better quality - A Google paper showed pasting a prompt twice helps. Tell Claude to double-check against skill instructions after the first pass.
7. Use hooks to auto-invoke skills - A user_prompt_submit hook matches your words against skill keywords instantly. Zero context cost.
8. Jupyter notebooks solve data trust - Every analysis shows exact code, inputs, and outputs. Traceable and reproducible.
9. Build an operating system - Knowledge folder for people context. Projects folder for task isolation. Tools folder for scripts. CLAUDE.md for identity.
10. The people folder compounds - Connect meeting transcription. After every meeting, update each person's dossier. Every prompt gets more specific over time.

----

Related content

Podcasts:
* Claude Code Masterclass with Carl Vellotti (Ep 2)
* Claude Code PM OS with Dave Killeen
* OpenClaw Setup Guide with Naman Pandey

Newsletters:
* The ultimate guide to context engineering
* How to use Claude Code like a pro
* Claude Cowork and Code setup guide

PS. Please subscribe on YouTube and follow on Apple & Spotify. It helps!

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.news.aakashg.com/subscribe
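Takeaway 7 describes a hook that matches the words in your prompt against skill keywords. Below is a minimal Python sketch of just that matching step. The skill names, keyword sets, and function names are hypothetical. How the hook receives the prompt and hands text back to Claude Code is deliberately left out, because the episode notes do not specify it.

```python
import re
from typing import Dict, List, Set

# Hypothetical skills and trigger keywords; not taken from the episode.
SKILL_KEYWORDS: Dict[str, Set[str]] = {
    "make-slides": {"slides", "deck", "presentation"},
    "data-analysis": {"notebook", "cohort", "retention"},
}


def match_skills(prompt: str, skills: Dict[str, Set[str]] = SKILL_KEYWORDS) -> List[str]:
    """Return the names of skills whose keywords appear as whole words in the prompt."""
    words = set(re.findall(r"[a-z0-9]+", prompt.lower()))
    return sorted(name for name, keywords in skills.items() if words & keywords)


def reminder(prompt: str) -> str:
    """Build a one-line nudge naming the matched skills, or an empty string if none match."""
    matched = match_skills(prompt)
    if not matched:
        return ""
    return "Relevant skills for this prompt: " + ", ".join(matched)


print(reminder("Build a deck on Q3 retention"))
# Relevant skills for this prompt: data-analysis, make-slides
```

Plain set intersection keeps the check fast and deterministic, which fits the "instantly" framing in the takeaway: no model call is needed to decide which skill to surface.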


All episodes

How to Use Codex Like an OpenAI PM | Abhi Muchhal, PM OpenAI (ex-Meta and Nubank)

How PMs Ship 100K Lines of Code at OpenAI with Ryan Lopopolo, Member of Technical Staff

How to Run Evals in Claude Code with Aparna Dhinakaran, Founder and CPO of Arize

Claude Code for Non-Technical PMs, with Andre Albuquerque

PM's Guide to Claude - When to use Chat vs Cowork vs Code, with Pawel Huryn

How to Build a Full AI Dev Team in Claude Code | Guide from Google PM Gabor Meyer

How to Become a "Builder PM" with n8n, Claude Code, and OpenClaw | Mahesh Yadav (ex-Google, AWS, Meta, Microsoft; Founder LegalGraph AI)

How to Design like OpenAI and Figma

How to build a Team OS in Claude Code with Hannah Stulberg, PM @ DoorDash

How to Turn Claude Code into an Operating System with Carl Vellotti