Summary

Podcast Summary: The Growth Podcast

Episode Title: How to Become a "Builder PM" with n8n, Claude Code, and OpenClaw | Mahesh Yadav
Host: Akash Gupta
Guest: Mahesh Yadav (Founder, LegalGraph AI; ex-Google, AWS, Meta, Microsoft)
Date: April 20, 2026

Overview

This episode dives deep into what it means to be a "Builder PM" in the AI era. Mahesh Yadav, a veteran product manager from Microsoft, Meta, AWS, and Google, shares practical strategies and technical walkthroughs to empower PMs to harness modern AI tools like n8n, Claude Code, and OpenClaw. The conversation walks through concrete demos, foundational concepts (agents, memory, tools), product workflows, and how PMs’ roles and interview expectations are rapidly evolving due to recent AI advancements.

Key Discussion Points & Insights

1. The “Builder PM” Mentality

PMs’ Moment to Shine: The AI revolution is a once-in-a-generation opportunity for PMs to redefine their roles and make direct impacts.

“This is our time to shine... If you know your skills, you can build it in Claude Code and delegate that work to agents.” – Mahesh (00:00)
Builder PM Definition:
- Not just “using” tools, but understanding foundational concepts.
- Capability to go from customer problem > prototype > customer feedback independently—without waiting on engineering.
- Core skill: Translating customer needs directly into AI-empowered solutions.

2. Foundational Concepts: Agents, Memory, Tools, and Guardrails

Every effective AI agent needs:
- Knowledge (world understanding)
- Signals/State (perceiving current state)
- Tools (ability to act)
- Guardrails (safety, laws, constraints)
Analogy: Just like children learn basic facts, context, and how to interact, AI agents need scaffolding to be useful (05:18–12:00).

3. Hands-On Demo: n8n Agents and Memory

n8n as a Starting Point:
- Low-code/no-code way to experiment with agent creation, memory, and tool integration.
- Demo: Connecting an OpenAI model, adding a search tool, and then giving it conversational memory.
- Building blocks: Data loaders, embeddings, and retrieval-augmented generation (RAG) (15:00–21:00).
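The data-loader step can be sketched in a few lines. In the demo, the default splitter cuts each document into 1,000-character chunks with a 200-character overlap before the chunks go to the embedding model. The function name and signature below are hypothetical and written only for illustration; this is not n8n's actual implementation, just a minimal sketch of fixed-size chunking with overlap.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: the 1000/200 defaults mirror the numbers
    mentioned in the demo, not a documented n8n API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text,
        # so no trailing chunk is made purely of overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and stored so the agent can retrieve the relevant passages at question time. The overlap is there so that a clause cut at a chunk boundary still appears whole in at least one chunk.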
Real-World Example: Automating contract review and risk analysis, multi-agent orchestration, and even running evaluations (evals) of AI agent outputs against human “ground truth” (17:57–29:05).
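The eval idea can be illustrated with a small scoring function: compare the agent's risk flag for each contract term against the value a human reviewer recorded. All names below are hypothetical. The sketch covers only the risk-detection score; the demo's second score (quality of the suggested modification) relied on a judge step that is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermResult:
    """One row of the eval sheet: a contract term and whether it is a risk."""
    term: str
    is_risk: bool


def risk_detection_accuracy(
    predicted: list[TermResult], ground_truth: list[TermResult]
) -> float:
    """Fraction of ground-truth terms where the agent's flag matches the human's.

    Terms missing from the agent's output count as misses.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")

    predicted_by_term = {row.term: row.is_risk for row in predicted}
    matches = sum(
        1
        for row in ground_truth
        if predicted_by_term.get(row.term) == row.is_risk
    )
    return matches / len(ground_truth)
```

Running a score like this row by row over a labeled set gives a single number to track before a workflow is published.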

“If you build an agent with a tool or intelligence it will be a stupid agent because it doesn't have memory or it doesn't remember anything... Who wants to talk to a person who doesn't remember anything?” – Mahesh (15:49)
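The memory added in the demo takes a session ID and remembers the last five conversations. A minimal sketch of that pattern follows; the class and method names are hypothetical and do not reflect how the n8n memory node is implemented internally.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keep only the most recent exchanges for each session ID.

    Hypothetical illustration of a windowed conversation memory;
    the default window of 5 mirrors the number mentioned in the demo.
    """

    def __init__(self, window: int = 5) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        # deque(maxlen=...) silently drops the oldest item when full.
        self._sessions: defaultdict[str, deque] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def add(self, session_id: str, user_message: str, agent_reply: str) -> None:
        """Record one user/agent exchange for the given session."""
        self._sessions[session_id].append((user_message, agent_reply))

    def history(self, session_id: str) -> list[tuple[str, str]]:
        """Return the remembered exchanges, oldest first."""
        return list(self._sessions[session_id])
```

Before each model call, the agent would prepend `history(session_id)` to the prompt. That is what lets a follow-up such as "what conflict am I talking about?" be answered without calling the search tool again.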

4. Why and When to Move Beyond n8n

Strengths: Rapid prototyping, great for getting to first 10 customers (30:05).
Limits: Not suitable for large-scale collaboration, code reviews, versioning, or production-level deployment (30:05–31:10).

5. Claude Code: Transformational Advancements in AI Product Workflows

What Changed in Dec 2025: Introduction of agentic loops, cowork features, combining context, action, and evaluation in a single system (35:34–46:56).
Key Unlocks:
- Long horizon jobs (AI agents can work for hours, not just seconds)
- Full computer and browser control (file system, bash)
- Ability to create “skills,” schedule jobs, and parallelize work with subagents – all accessible to non-coders and power users alike.
- Automatic management of context/retrieval (no more manual RAG tuning)
Demo: Automated review of PRDs, learning from corrections, continuous, feedback-driven agent improvement via “learner.md” artifact (47:00–62:03).

“Anything... if you can do a job, just tell me the job and this agent will do it better than you. 365 days, 24 hours.” – Mahesh (45:46)

6. Expanding PM Impact: Beyond PRD Review

AI-Driven Product Lifecycle
- Competitive analysis agents
- Automated UI prototyping, mock creation
- User analytics and dashboard generation
- Drastically reduces idea-to-prototype-to-customer-signal time from “months to days” (62:10–65:30)

7. OpenClaw: The Next Generation

What is OpenClaw?
- Open-source, agentic AI pattern (not just a product) that anyone can deploy on local or cloud hardware
- Allows for trusted delegation and persistent, multi-channel agent workflows (WhatsApp, Slack, email, etc.)
- Supports open-source and closed models
- Emerged due to developer-driven innovation (not big tech), creates a “new operating system” paradigm for agentic work (65:41–76:33)
Sandboxing and Security:
- For big companies, security means running agentic loops inside controlled environments (sandboxed VMs, internal sandboxes)
- Next big problem: Safely exposing company context and actions to powerful autonomous agents (76:50–79:40)

“OpenClaw is not a technology, it's not a product, it's a pattern... And that, I think, is what I am excited about.” – Mahesh (76:50)

8. A Roadmap for PMs: Becoming a Builder PM

Blueprint for Upskilling:
- 3 weeks: Deep dive into agent basics (models, context, tools, memory)
- 4 weeks: Claude Code/Cowork—automate your daily PM tasks, implement human-in-the-loop feedback patterns
- 3 weeks: OpenClaw/agentic loop mastery—delegate a full workflow to an agent, safely isolate permissions, experiment with multi-agent systems
- Total: ~10 weeks path to “Builder PM” (79:57–82:20)

9. Redefining AI PM Interviews

New Expectations:
- Real-world problem solving during interviews—case studies, live builds, system design
- Show actual use and mastery of tools like Claude Code, OpenClaw
- Dusty “MBA-style” hypothetical questioning is out; system and agentic thinking is in (82:32–85:51)

“If I give you a job and if you're not pulling out your Claude code or some kind of tool like Lovable, you're already out.” – Mahesh (85:51)

10. Distinguishing Agentic AI from Classical AI

Classic AI: Fits patterns, analyzes data, returns informational outputs
Agentic AI: Understands, acts, remembers, evaluates, and adapts – direct automation of business workflows, not just insight generation (86:02–87:41)

11. Personal Wisdom: Compensation and Career Moves

Compensation Trajectory:
- “AI has worked very well for me.” Comp grew from $120K (2012) to $1.3M+ (2025) with each switch nearly doubling compensation.
- Now, AI PMs at top firms see $1.5M–$2.5M+ total comp (88:01–89:19)
Why Leave Big Tech:
- Despite pay, innovation happens outside big companies—open source projects and startups move faster.
- “Large companies have not produced shit in AI. The tools are distributed but these big companies have no environment to grow something which can be imagined, put in production and put in customer hands... This is the time to go build.” (91:46–95:25)

Notable Quotes & Memorable Moments

On PM Agency in AI Era:

“This is the time to build your own world and I believe in that future and that's why I left and I have zero regrets.”
— Mahesh (00:19 / 95:25)
On Agentic AI Fundamentals:

“For me, a builder PM is somebody who can talk to customers, figure out what needs to be built and build the first version and get to 10 customers without talking to any developer at all.”
— Mahesh (05:13)
On Claude Code Game Changer:

“Now it’s a superpower. I don’t know its limitations. It does anything that I wanted to do. This is an open challenge. If you can do a job, just tell me the job and this agent will do it better than you. 365 days, 24 hours.”
— Mahesh (45:46)
On Upskilling PMs:

“You have unlocked what the human potential was locked inside coding... If you know your skill, you can build it in Claude Code and delegate that work to agents.”
— Mahesh (44:00/79:57)
On Innovation Outside Big Tech:

“Large companies have not produced shit in AI... for me, I reached a point in my life that I wanted to just stay unbounded as much as I could.”
— Mahesh (91:46)

Important Timestamps

00:00–03:00: Introduction, defining the “Builder PM”
05:00–15:00: Agentic AI explained; n8n demo and agent construction blocks
17:30–29:00: Demo: Contract review automation, memory, multi-agent setups, agent evals
30:05–35:20: Where n8n falls short, why shift to Claude Code
35:22–47:00: Claude Code’s agentic loop, the December 2025 “inflection point”
47:00–62:00: Demo: Automating PRD review and continuous agent learning
62:03–65:30: Additional PM use cases: competitive analysis, prototyping, data dashboards
65:41–76:33: OpenClaw pattern—open-source, persistent agents, sandbox security
79:57–82:20: Blueprint to becoming a Builder PM (skills, timeline)
82:32–85:51: New model for AI PM interviews
88:01–91:00: Mahesh’s compensation journey and why he left big tech
91:46–95:25: Final reflections: career optimization, innovation, “builders’ moment”

Takeaways for Aspiring “Builder PMs”

Start with basics: Understand the scaffolding of agents and their components – don’t just “use” AI, learn how it works.
Leverage tools: Use n8n for prototyping, Claude Code for production-ready agent workflows, and look ahead to patterns introduced by OpenClaw.
Automate and iterate: Continuously integrate your expertise into agent-driven tools and workflows, collecting feedback and improvement data.
Stay current: PM interviews and real-world impact both require hands-on mastery of modern agentic AI platforms and the ability to deliver working solutions—fast.
Take initiative: The door is open for PMs to build, not just manage. This is the moment to claim that responsibility.


Transcript

[00:00]
Mahesh Yadav
I would love to send this message to all PMs that this is our time to shine. There is a lot of misconception there that, you know, if you start using Claude Code or if you configure OpenClaw, you become a builder PM.
[00:13]
Akash
Mahesh Yadav, who's been a PM everywhere. Microsoft, Amazon, Meta. Last but not least, Google.
[00:20]
Mahesh Yadav
This is the time to build your own world, and I believe in that future, and that's why I left, and I have zero regrets. If you know your skills, you can build it in Claude Code and delegate that work to agents. I don't know its limitations. It does anything that I wanted to do. This is an open challenge I have given to anybody: if you can do a job, just tell me the job and this agent will do it better than you. 365 days, 24 hours. Everything which used to take you almost two to three months, first to write the PRD, to get to mocks, from mocks to a real working prototype, from there to getting customers and seeing the signals, all that is getting squeezed with this Claude Code, and you become the builder PM that the world needs. The ability to sandbox these agents in a controlled way, that's an unsolved problem, and that, I think, is what I am excited about.
[01:12]
Akash
Google isn't going to allow you to just give your company access to OpenClaw.
[01:17]
Mahesh Yadav
Yeah, no, I 100% agree. I think the idea is before we
[01:23]
Akash
go any further, do me a favor and check that you are subscribed on YouTube and following on Apple and Spotify podcasts. And if you want to get access to amazing AI tools, check out my bundle where, if you become an annual subscriber to my newsletter, you get a full year free of the paid plans of Mobbin, Arize, Relay App, Dovetail, Linear, Magic Patterns, Deep Sky, Reforge Build, Descript and Speechify. So be sure to check that out at buildle.akashg.com. And now into today's episode. PMs are now being asked to push PRs. PMs are being asked to code. This is the rise of the builder PM. But what is a builder PM? Today I have Mahesh Yadav, who's been a PM everywhere. Microsoft, Amazon, Meta. He has seen everything. Last but not least, Google. So he has seen all the top AI companies. He has been an AI PM for a long time and now he's training AI PMs. And today he's going to help you understand: How do I become a builder PM? How do I use n8n? How do I use Claude Code? How do I use OpenClaw in order to become a more effective and efficient PM, even if I'm not building an AI feature? Mahesh, everybody loved our last episode. Thanks for coming back.
[02:36]
Mahesh Yadav
Oh, thank you for having me. And I think this is the time of urgency. So I would love to use your platform to send this message to all PMs that this is our time and we should be ready when this time arrived. I was always preparing for this time and now the time is right for all the PMs to shine. It's just little bit that we need to go learn. And if we learn what we need to learn, this is our moment.
[02:59]
Akash
So what is a builder PM and how does a PM become one?
[03:03]
Mahesh Yadav
That's a very good question, right? I mean, I had an engineering background. I was not a traditional PM who came from a B-school and then went to McKinsey and then became a PM. I had a very gradual move to PM. I was an engineer and I was always building, and then I became a PM because I was always building what the customer wanted, or working backward from customers, rather than just building for the sake of building, which is very popular at Microsoft, if you don't know. So for me, a builder PM is like, as PMs we are always building. Our job is to build the right thing. Earlier, if you had the tools, you would have built the whole product. But it was very hard to build anything. You needed at least three or six months of rigorous coding, testing, deployment; all that was needed to go build things. But in the new age, even people who have been engineers all along are saying that they are not writing code anymore; they are talking to customers and Claude Code does the coding for them. In that age, the skill that becomes important is what to build and what the customer wants, and that you have. And if you use, on the right-hand side, the tools to do the right prototyping and then build at least the first version of your product, you become the builder PM that the world needs. And I think all of us need to grow into that, whether engineers, designers or PMs, because without that we will not be able to diffuse the benefits of AI into the economy. For me, a builder PM is somebody who has taken the responsibility to diffuse the benefits of this awesome AI we have today into the economy so that we can all reap the benefits that large companies and research labs are putting so much money into building. So for me, in a nutshell, a builder PM is somebody who can talk to customers, figure out what needs to be built, build the first version and get to 10 customers without talking to any developer at all.
[05:14]
Akash
Amazing. Can you show us in action what are the skills and concepts we need to understand in order to get there?
[05:19]
Mahesh Yadav
Yeah, I think there is a lot of misconception there that if you start using Claude Code or if you configure OpenClaw, you become a builder PM. I think the hype is right, because this is the first time people see that they can manage their calendar, they can delegate work to another party, or they can just say things and it just happens. But I've been in AI for the last 10 years and built things all along. I think just knowing the layers, or understanding how these things work, is the first step of building these things. And obviously you talked about a lot of tools, so I will start my journey by first understanding these concepts. In an earlier day, I would start with something like TensorFlow or PyTorch and understand what a model is, or train a model and then run inference on a model. In the new world, in agentic AI, I will start with something like n8n and then go and say, hey, okay, what is an agent? How does it interact with the model? What is a model? What are its limitations? What is memory? What are tools? And maybe I can just show you, if it's cool. Maybe I just share my screen and show you, because it's not very hard to learn these things in n8n. n8n has done a good job, to be honest. Maybe it's an obsolete tool for building workflows now with Claude Code, but I think it's still an amazing tool to learn. So let me just share quickly with you what it takes to build, or what are the components involved in building, these agents. So just a revision from last time. If you look at last time, we talked about this idea that as we grow as humans, a kid is born, and they first need to have the knowledge of the world. So this is how the world works. Great, okay, I understand that and I can build my intelligence on that and I can also update it. And second, I need you to understand what is the current state of the world. If I want to do anything, I want to know.
We teach our kids all the time when they are born we will say to them that hey, this is hot, this is cold, all those signals which is current state of the world. Yeah, we know that this is a thing that gives us Power or this is where we cook. But this is the current state of the world. So this is your signals or memory tells you what is the current state of the world. And then if you did a good job, you can ask your kids to get a glass of water for you. And that's the tools where they can use tools like a glass and then open a tap and then move, hold it and get it back to you. So that's the tools piece. And then we learn the guardrails, the laws, what is possible in this country. If you are moving on a road, then you need to look left and right. It's your responsibility, not the responsibility of the driver. Especially on a high, especially on a busy road. So we tell our guardrails and laws. Yeah, if we do all this and if we just replace humans here with a model. So this model is analogous to what you get from OpenAI or anthropic or Google. The model is just the intelligence layer. It just trained to predict the next word and now have some reasoning. But you need all this harness, harness or people call it scaffolding to actually build something that can solve problems or build impact for you. So this harness is called agents. And then we use these frameworks like N8N to build these agents. Right. And then if you look at every agent that we have built so far or what people have built has one of these four things in it or the good ones have all these things in it. So let me show you, like if you want to build these agents, what is the memory, why you need this knowledge piece. What will happen if I don't add the knowledge? So let's spend like maybe 10 minutes and understand the basics first before we get into how can you automate your world with these latest agent on how you become a builder pm. 
So in my story, I would say the first step of becoming a builder PM is just getting to know the basics. So if you look at this, this is just my n8n, and what I'm doing here is, maybe I start from scratch, because what's the point of doing an Akash show with nets? Let's do it without nets, because that's the trapeze act without nets. So on the right-hand side you just search for AI agent and you get an AI agent. The AI agent has a model, which is the intelligence layer I was just talking about. So now I am connecting an OpenAI chat model. You can pick any model you want. I will just save money, because models are expensive, so I'm going to be a little cheap here and pick GPT-4.1 mini. So you can just pick that model. So now you've got a model. So this is the agent; this is like a little baby that doesn't know anything about your world, and now it has intelligence. So okay, if it has intelligence, we should be ready to go. So I can ask it some questions like, hi, what are neural networks? It can answer those questions. So what happens is this goes to the agent, and the agent calls the model. And one beautiful thing about this is that you can look at what message was sent, what the input and output were, from this screen here. Okay, it gave me an answer because it's trained on these things. But if I ask it a very simple question like, hey, what did President Trump say about ending the conflict in Iran? Do you think it will answer that question or not?
[12:05]
Akash
It's training data probably ended in like 2023, let's say.
[12:08]
Mahesh Yadav
Great answer. Let's see what happens. I asked this question and it says, hey, my knowledge cutoff is June 2024. Because you're cheap. You're picking a cheap model. That means that it doesn't have that knowledge, so it can't answer the question. So maybe we need a tool for the latest, greatest things. So let's give it a tool. And the tool can be Tavily, which is a search tool. So this allows the model to do searches. So now I go and I say, the way I think it's Avi. Yeah, this is the one. You got it.
[12:47]
Akash
If you've been enjoying this episode today, Mahesh's classic whiteboard style, his awesome labs and demos, then you will love Mahesh's cohort-based course. It has amongst the highest reviews on Maven, and I challenge you to go check this yourself. His courses are always rated for 4,000, 849. That's because of the 13 years in big tech at Microsoft, Google, Amazon, Meta. He's actually built the things; he's not just talking about it. And his course, everybody I know, my mentees who have taken his course, have had a positive opinion of it because it is so interactive. You get the first principles that you're sensing from the podcast today. The coolest thing about his new course, building OpenClaw and with Claude Code, is that you get a Mac Mini. So take advantage. Use the discount code that I have below in the description so you get a discount on his course. Get your Mac Mini. Better yet, get your company to pay for it. He's going to teach you not just how to become a builder PM, but he's going to be in the trenches with you. As he said, he'll even give you AI PM interview advice. So it's an all-in-one course to becoming an AI builder PM. Check that out in the link just below. And back to today's episode.
[13:54]
Sponsor/Advertisement Voice
Today's episode is brought to you by Amplitude. Replays of mobile user engagement are critical to building better products and experiences. But many session replay tools don't capture the full picture. Some tools take screenshots every second, leading to choppy replays and high storage costs from enormous capture sizes. Others use wireframes, but key moments go missing, creating gaps in your understanding. Neither approach gives you a truly mobile experience. Amplitude does things differently. Their mobile replays capture the full experience, every tap, every scroll and every gesture, with no lag and no performance hit. It's the most accurate way to understand mobile behavior. See the full story with Amplitude and
[14:34]
Mahesh Yadav
on query. So now I'm adding a tool. I have my account; it will go do the search, it will take the query. And where can it take the query from? I can hard-code the query or I can let the model define the query. So now I'm telling it, hey, let the intelligence layer, based on what the user is asking, define what you want to search on the Internet. So now we connected a tool, and let's go. So now I ask the same question again, which is what President Trump said about ending the conflict in Iran, and my spellings, you can see, are not that great. But now it's going to the search tool; it's doing some searching. Let's see what searching it is doing. Seems like it went to these websites, it found information, and then it went to a lot of websites, it captured all the information, and now it is able to answer the question, which is: President Donald Trump stated that the conflict could end at any moment based on his decision. He mentioned there is practically nothing left to target, and all that. So a lot of excitement. I'm happy about that. So that is what President Trump said. Okay, then if I ask it a question, which is, hey, what conflict am I talking about? You think that will work?
[15:46]
Akash
Oh, I don't know. It doesn't have memory, right?
[15:50]
Mahesh Yadav
Yeah, you trained well. If I ask it "what conflict?": I don't see any previous mention of conflict in this conversation. Could you please provide more context or specify which conflict? If you build an agent with a tool or intelligence, it will be a stupid agent, because it doesn't have memory or it doesn't remember anything. Who wants to talk to a person who doesn't remember anything? Let's add a memory. So I will just add a simple memory, and what it does is it takes a session ID and remembers the last five conversations. So now you know what an agent is, which is just a scaffolding. But the real work is happening in intelligence, memory or tools. So if I do the same query now and say, hey, what did President Trump say about ending the conflict in Iran? It will do the same thing, but this time you see it updates the memory. So it has put all the conversations, all the information, in memory. And if I ask my question, which is, hey, what is the conflict I'm talking about? It goes in and fetches this from the memory. Now it doesn't call the tool, because it seems like I'm just looking for information. You are referring to the conflict in Iran, especially the military conflict or war situation that began around February 2026. So that's what it takes. But maybe you have a larger context. Maybe you have contracts in your company and you want to ask questions on those contracts. Because the conflict in Iran is not going to make me money, but if I do good contracts it will make a lot of money. So can I ask a question like, hey, what are the payment term impacts of tariffs and of war in Iran on our contracts? Okay, so I run a company and we have contracts all over the world. I want to know that. Will that work?
[17:57]
Akash
I guess right now we haven't given it, like, a RAG database, right, of our contracts.
[18:02]
Mahesh Yadav
Yay. I think I got the best student here, right? Who has all the right prompts. So now it goes in and it says hey, I don't know or I will just handle it. The payment impact on tariffs and war on Iran contracts generally can include several key factor, increased cost, payment delays and all that. But it doesn't talk about my contracts. So I have another lab that we cover in our course where you can actually create a knowledge base or database of your knowledge how your world works. So this is important because the world is working perfectly how neural network works or world information. But your company information also need to be executed or needs to be built. So what this does is first one just you can upload contract. So I will just execute this workflow here. I can upload my contracts. So I will just upload a contract from my company and what you will see Is that it will go and create a great. Maybe we put this MSA in. So this is a master service agreement. And what it does is once I go submit, it goes and creates the chunks. Now you can see there is a data loader. You can see insert data. And now you're learning a new concept, how the data is getting into something that machines can understand because these agents or these machines don't understand like you and me understand text. So now you are seeing something called an embedding model. The embedding model allows you to convert text into embeddings. You're also seeing something called a data loader. What is a data loader? Oh, it takes the type of data, which is a binary, which is your input file, converts it into text, and then does simple text splitting. Okay, maybe I can do custom splitting. Oh, so custom splitting you have to specify. But what's happening in simple one, basically it's taking 1000 character with 200 character overlaps. So it divides your whole file into thousand characters and then it goes in and puts that into a database. 
Or it calls the embedding model, which goes and creates this awesome database, which is a RAG, or retrieval-augmented generation, database for you, the same database where we are entering this information. And now if I go here and I ask the same question, I execute this workflow, and not this one, give me this one. I select this one and I say execute workflow. Actually, don't use this trigger. Just come here. And now I ask the same question which I was asking earlier. I can just say, hey, what is the value of the contract? Do the payment terms change due to war in Iran or tariffs, based on our MSA contracts? It goes in, it queries this tool. This is where you have put all your information. It queries the tool, and it extracts all the text information from the document that you provided. So, your knowledge. And it says the document does not specify any provision. So I think we are good whether it happens or not. And now this information comes from your knowledge. Similar to this, the next stage of this learning is multi-agent systems. Single-agent systems are great, but if you want to do real checks or real things, this is a multi-agent system that we build. We use the same constructs that we just discussed. And in this one you see that you can send an email. So right now I can just send an email and ask for the same information by email. I have published that flow, and I can just ask like I ask my lawyers. Now I can send this request and say, hey, can you get me the risks in this contract before signing? So if you send me a contract, which I believe one day you should send me a contract, but right now let's say you send me a contract which is an NDA, before signing I will just send this email. And if I send this email, what happens is this email hits this provider, which is my Gmail, and if you look at it, I will get this: review contract key terms for this contract. 
Okay, if I get that, then what happens is that if I go to my workflow, I will show you in executions that in few seconds you will see that it will get a new running. So it automatically triggers this. I need not to do anything. It's a published workflow and now it's running and when it is done I will get a response which says that hey, you got these risks contract analysis report with all the risks in it. So now you start first understanding all these things which is, hey, what does these agents do and how they work with your things behind the scenes. So these are the connectors, these are your agents. It has multi agent system with playbooks or your database connected and then it can do end to end work like humans do. And the last thing I would love to show in N8N world which I think all of the builders PM should go build is this is great. All this is awesome. This is sending me risk. But what about evals? How are we going to evaluate these? Because these agents are not like us humans which does a lot of self evaluations and you can't. If they do a bad job, you are going to get fired. They are not going to lose their job. Even if they lose their job, it's not good for you. So what is this is then? Then we create this idea of ground truths. So what I have done here is I have taken this contract and what I do is I say hey, here are the terms that you need to look at. Here is the correct value and whether there is a risk or not based on our playbook is also written from a real lawyer. And then you can create a workflow like this where you can run this normally in the daytime. What you do is during the day you can have your people submitting these files. You can find all the risk and the results are getting stored here. And in evening what happens is you evaluate these using the judge or risk categories and modifications and you get a fancy report. So let me just execute this so not the eval Trigger but let's just first create this data. 
So if I go and execute this workflow, I can just upload a file, and on that file what it's going to do is extract these terms. So now it's going to find the risks in this contract and submit our results according to the same key terms here. So it runs in automation and then it updates a file. A human would go find the risk, but what it does is it will just get the results first. So now you will see, once it is done, these values will be populated here. So now it's just looking for governing law, justification, agreement terms. And once it is done, which is still executing, you will see this file getting updated and you will get the results like a human would have done. First it's extracting these values from the contract, then checking whether it is a risk or not. If it is a risk, it will justify why this is a risk and suggest a modification which allows you to take minimum risk if you sign this contract. This is what a lawyer does for you, by the way. They will look at your contract, compare it with some values, find whether it is a risk or not, and then give you a justification of why this is a risk and a modification. They never justify the risk, by the way. So now you see automatically that you can see the risks. And once you see these risks, then you can see the second part of this, which is you can run a quick eval on this by just going and changing it from form submission to eval. And now when I run it, it will go and run an automatic eval, comparing to my ground truth here, which is my real values, and tell you how much of these risks are correct, and how much of the justification or these modifications is what a human will accept as is or will want to change. So you can run these evals also. And n8n has this awesome tool. So now you run this flow, and what it does is it will take it row by row, run your evaluations, and find whether the risk is good and what the quality of the modification suggested by the AI is. 
It will continue doing that, but if I go to the evaluation tab, I can show you some of the previous evaluations. You see that it runs row by row, and eventually it tells you, hey, your risk quality is good, you're able to detect 80% of risks correctly, but your modification quality, the suggestions the AI is making, is only 30% as good as what a human would have done. So you have work to do. So now you understand, as a PM or as anybody who wants to survive this build wave, that it's not very hard to start with something very small, bring your context, make it multi-agent, because the world is multi-agent, and then run evaluations to make sure it's really good quality before you deploy it, so that when it does a job or responds to these emails, it does a good job. So I got my analysis, which I sent earlier, and it talks about all the findings. I need to set it up so that you can see it formatted correctly, but I just built it to show you quickly. But you see that the whole flow worked, and it does give you the risk: "this agreement shall be governed and construed under the laws of the State of Delaware, without regard to conflict of law principles." Delaware is a neutral jurisdiction under the criteria of the NDA. So it's talking about these things to you and giving you all the details.
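The row-by-row eval described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the n8n workflow from the demo: the `Row` schema, the field names, and the 0-to-1 judge score for each suggested modification are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Row:
    """One ground-truth row paired with the agent's answer (hypothetical schema)."""
    term: str                  # contract term, e.g. "governing law"
    expected_risk: bool        # ground truth written by the lawyer
    predicted_risk: bool       # what the agent reported
    modification_score: float  # assumed 0-1 judge rating of the suggested modification


def evaluate(rows: List[Row]) -> Tuple[float, float]:
    """Return (risk detection accuracy, mean modification quality)."""
    if not rows:
        raise ValueError("no rows to evaluate")
    correct = sum(r.expected_risk == r.predicted_risk for r in rows)
    risk_accuracy = correct / len(rows)
    modification_quality = sum(r.modification_score for r in rows) / len(rows)
    return risk_accuracy, modification_quality


# Made-up sample data, chosen only to exercise the function.
rows = [
    Row("governing law", True, True, 0.5),
    Row("agreement term", False, False, 0.25),
    Row("liability cap", True, True, 0.25),
    Row("non-compete", True, False, 0.25),
    Row("confidentiality", False, False, 0.25),
]
risk_accuracy, modification_quality = evaluate(rows)
print(f"risk accuracy: {risk_accuracy:.0%}, modification quality: {modification_quality:.0%}")
```

In a real setup the judge score would come from an LLM-as-a-judge step rather than being typed in by hand; the aggregation step stays the same.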
[29:06]
Akash
Yeah.
[29:07]
Mahesh Yadav
Okay. So this is what it takes to first get your footing and get started on your building journey. When people talk about AI, a lot of them say, hey, you want to be a builder PM, just start with OpenClaw. My suggestion would be to first understand what the harness looks like. What is the harness made of? Ideally, understand how the model is built: what are neural networks, what are transformers? We talk about that in a way that anybody can understand. But at least understand the scaffolding, because this is where things will break, even if you're using Claude Code. The Claude Code constructs like context, model, compression, knowledge, memory: first play with them and see what they are. And that's your first step in getting into building with AI.
[30:02]
Akash
When does n8n fall short? When do you move beyond n8n?
[30:05]
Mahesh Yadav
Yeah, so n8n is, I think, more like a tool which allows you to get to your first end customers. I think it's a very powerful tool, especially with webhooks. When we did the last session, I showed you how you can create your backend in n8n without any code, connect it to a Lovable or v0 front end, and build the whole solution, where you click a button and something happens in n8n, and you can debug everything without writing a single line of code. I think it's very powerful there. But then if you want to iterate and put things in production, if you want three people to contribute to your code, if you want to have a test set, or want to put a container around it and put it into production, n8n doesn't support that. That's where things fall short, and the worst part is that there is no way for people to see the code and get to a code mode. We do this in our cohorts, and afterwards people want to get to the next stage, which is, hey, I've done this, but now I want to put this in code and share it with my team so that they can see if it is good quality or bad quality. To take it to hundreds or thousands of users in the most efficient or latency-optimized way, n8n has no answer to those questions. So I think it just stops you beyond 10 customers. It's not the tool I would recommend at that point, and that's where I know what you want to get started with next.
[33:10]
Akash
Are you interested in becoming an AI product manager, making hundreds of thousands of dollars more, joining OpenAI and Anthropic? Then you might want to do a course that I've taken myself: the AI PM certificate run by OpenAI product leader Miqdad Jaffer. If you use my code and my link, you get a special discount on this course. It is a course that I highly recommend. We have done a lot of collaborations together on things like AI product strategy, so check out our newsletter articles if you want to see the quality of the type of thinking you'll get. One of my frequent collaborators, Pawel Huryn, is the Build Labs leader. So you're going to live-build an AI product with Pawel's feedback if you take this AI PM certificate. So be sure to check that out. Be sure to use my code and my link in order to get a special discount. And now back into today's episode. Yes, it's the hottest tool on the market: Claude Code. Please, Mahesh, show us: when should we be using Claude Code? How should we be using it?
[34:04]
Mahesh Yadav
Yeah, I think you should spend a good two weeks in n8n, and beyond that you should move to Claude Code. Because what has happened with Claude Code, especially in the last, I would say, three to six months... Claude Code has been there for more than a year now, but in the last six months there are just so many possibilities with Claude Code and Cowork to build things and put things in production for you as well as your team. It's the same tool chain which a person who has no coding experience can use: building with skills, creating their sub-agents, hooks, and then scheduled jobs. And then on the other side are people who know how to code, who have been coding all their lives, and they can also build on top of what you provided. So this idea of having somebody to allocate work to or do work with, and this idea of code, are combined in Claude Code. I think it started as something that people wanted to code with, but then the one thing they realized, or we all realized, is that if you can build something that codes well, it can do any task well. And that is what Claude Code is, and I think that's why it's the hottest tool in town.
[35:22]
Akash
And you talked at the beginning about needing to harness this moment and Andrej Karpathy talked about something changed in December 2025. What exactly changed?
[35:35]
Mahesh Yadav
Yeah, so I think for me also, right, I'm still contemplating this and I think I still have not wrapped my head around it. What has changed is this. If you look at the last three years: I was at Google and we were building AI, but I thought we were not building AI fast enough. So I left Google and started on my own. I'm building this company where we thought we could go and automate the back office and deliver the benefits of AI faster than what I could do at Google, with my own agency. Obviously Google is doing great with their agency. And if you look at all the companies in the last three years, what they did with AI is they took these models that we had, and I can take any company as an example and put what they do on this map. If you look at Gamma, Gamma did one thing. Gamma said, okay, I am going to connect these models with tools and make it very easy for you to create slides, and then I will provide you connectors, connectors to PDFs, connectors to everything, so that you can publish this in PPTs or Google Slides or just PDFs. They do just one job, and they are, I think, now at a billion-dollar valuation. So the first kind of companies that you saw were these, which made the model and tools work together and put a lot of connectors in place. Gamma is there. The second kind of companies said, hey, we're going to take this model and harness it on domain-specific knowledge. So the Harveys of the world, the Legoras of the world, went and said, hey, we're going to take this model and do the context engineering piece here, bring context to the lawyers and solve their problems, and they became one-to-ten-billion-dollar companies in two years. And some just went in and provided signals and memory well. And then there was a third kind of company which said, hey, can we provide you frameworks to build this?
Which is what you saw with Salesforce Agentforce and Amazon Q. They said, we'll just help you build with all these things faster. This was happening for the last two years. You saw a lot of companies come out, and every company was planning to do something like this. Then a breakthrough happens inside Anthropic, where they are building just a tool for coding, and they thought, for coding, let's just do one thing, which is build the agent loop. They called it the agent loop. And the idea is it will build the context, it will take the actions, which is all the connectors and tools, and it will do the evaluations, and based on that it can come back and keep doing that again and again. So now what happens is, if you are a context company like Harvey or Legora, if you are an action company like Gamma, if you are an eval company, or you did evals to make sure that your actions and all this are good, all of this is becoming part of this product called Claude Code. Now people realize that it's not only coding-specific; you can actually do work with it. Then they released these plugins for legal, marketing and sales. And they are able to do a better job at context management, taking actions, and evaluations. Why? Because for context there is another unlock in this world. I thought of it as the best thing that would happen with these models, but I thought maybe it is the browser where things will happen. Where things actually happened is this idea of computer control. And we all treated it as normal because we thought these are coding tools. So we gave them access to two things: one, your file system, and second, your bash commands. This is how the whole world works. This is how, if you're an engineer, you control your computer. And if you can control your computer and your browser and you have access to the file system, you can do all the context management, and with bash you can do all the actions.
And third, it has browser control as well. So now you have all the action powers. And on top of that they have evaluations. They are saying, hey, I can do the lint checks, do rule-based checks, I can go and do UI element checks in HTML, and I can do what I was showing you, which is LLM-as-a-judge, and all this I am going to do for you. So you don't need a third-party provider, you don't need Harvey, you don't need somebody to come in and do it for you. If you take Cowork and you use my tested skills, you can do it yourself. So this is the first time that the whole intelligence layer we were talking about, which was Claude, actually became the whole harness provider as well. This is Anthropic eating into everybody's lunch and saying, hey, we need this space also. And that 2 to 3 billion you're seeing in Claude Code is basically coming right now from the valuations of these companies, which thought they could go solve this for each customer but now can't, because everything is mostly these three things. So once you have this, then on top of it you can connect a UI layer. Beyond these three things, there is how they were able to teach the world. And obviously you have done a great job with all your podcasts, teaching the world these skills on top of it. Anything that you want to do can be a skill. Skills are powered by these action-taking sub-agents, which can be triggered by something called hooks. And now you can schedule jobs. People are learning all of this on top of computer control, which is powered by this context loop. And by the way, there is a bottom layer to this, which is that models like Opus 4.6 have grown and are able to do long-horizon jobs. What is that? They are trained in a way that they can run for three to six hours without breaking. This was not there before. If you looked at the METR benchmark six months ago, the longest-horizon job they could do was three minutes.
And this has grown exponentially in the last six months. So you put in a long-horizon job, you give it access to your bash and file system, and then users can create these skills naturally in English. You have unlocked the human potential that was locked inside these coding tools or some software, where some software provider would go and do it for you. Now it's just a skill. And if it is a good skill, and if you know your skill and you have your craft, you can put it in a skill which can be parallelized with sub-agents and triggered with events like hooks, and this whole operating system is available to everybody. I think that's what happened with Claude Code. With Claude Code, people realized that they need not do the hard piece which I was showing you with RAG: what the chunk size needs to be, and so on. We do this RAG in our labs with n8n, where I show people that it's so hard to do the right retrieval and check that retrieval with evals. So you have to do agentic RAG, then you have to do graph RAG. All that is now the responsibility of Claude Code. It goes and gets the right context. If the context becomes too large, it compacts the context. All that is happening for you. You need not code for it, you just pay for it. And you pay some 20 bucks or 200 bucks for what Harvey, by the way, was charging you $10,000 to do. So that's what every lawyer has now. That's what every law firm has now. And they are looking at it and saying, maybe we don't need these tools. And that's how the benefits of AI are getting diffused into the economy with Claude Code, then Cowork, and maybe we can talk about OpenClaw next. But that's what changed for me six months ago, when I actually took hold of Claude Code with the latest, greatest model. I had tried Claude Code before, by the way, and I was not very impressed with it for doing things beyond coding. I was like, yeah, it's a good tool, but Cursor does the same thing.
I have GitHub Copilot, which I got for free because I'm ex-Microsoft, you know. But then three months ago, I went back to Claude Code once Opus 4.6 came out, and now it's a superpower. I don't know its limitations. It does anything that I want it to do. And that's why when I talk about it, people say that I'm scaring them into taking my courses. My life is good without my courses, by the way. I'm just telling you. When I looked at ChatGPT, I was very excited about it. When I looked at Lovable, I was excited. That was the second moment for me: yes, front end is a solved problem. And then I looked at Claude Code, and it seems like everything is a solved problem. It can replace anybody. And this is an open challenge I have given to anybody: if you can do a job, just tell me the job and this agent will do it better than you, 365 days, 24 hours. And that's what we are playing with now, right? So that's the world you live in. And this is moving with each model. These long-horizon jobs are extending to more and more jobs. Context, action, and evals are getting better inside, so you need not worry about them. And more and more people are sharing their skills, their agents, and their scaffolding. This is the new scaffolding layer, which is a very thin layer in English, rather than figuring out all the RAG, all the tool calling, all the guardrails. Obviously guardrails are still your responsibility, but at least you can rely on these companies to not screw up, because that's the only thing they need to go solve next. So that's my take on the question, right, which is, hey, what has changed for all of us? What has changed for all of us is that if you know your skills, you can build it in Claude Code and delegate that work to agents.
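The agent loop Mahesh describes (build context, take an action, evaluate, repeat) can be sketched roughly as below. This is a toy illustration under stated assumptions, not how Claude Code is actually implemented: `agent_loop`, `build_context`, `act`, and `evaluate` are hypothetical names, and the stand-in functions replace what would really be model calls, tool calls, and checks such as linting or LLM-as-a-judge.

```python
from typing import Callable, List, Optional, Tuple


def agent_loop(
    task: str,
    build_context: Callable[[str, List[str]], str],
    act: Callable[[str], str],
    evaluate: Callable[[str], bool],
    max_steps: int = 5,
) -> Tuple[Optional[str], int]:
    """Repeat context -> action -> evaluation until the evaluation passes.

    Returns (accepted result or None, number of steps used).
    """
    history: List[str] = []
    for step in range(1, max_steps + 1):
        context = build_context(task, history)  # gather what the agent needs to know
        result = act(context)                   # take an action (tool call, edit, ...)
        history.append(result)
        if evaluate(result):                    # rule-based check or judge
            return result, step
    return None, max_steps


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def build_context(task: str, history: List[str]) -> str:
    return f"{task} | attempts so far: {len(history)}"


def act(context: str) -> str:
    attempts = int(context.rsplit(": ", 1)[1])
    return f"draft v{attempts + 1}"


def evaluate(result: str) -> bool:
    return result == "draft v3"


result, steps = agent_loop("review the PRD", build_context, act, evaluate)
print(result, steps)
```

The `max_steps` cap is a design choice for the sketch: a loop that re-tries until an evaluation passes needs some stopping rule for the case where it never does.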
[46:58]
Akash
Can you show us this in action? What does that look like?
[47:01]
Mahesh Yadav
Oh, you like action so much. So let me show you that also. So there is one job I do, right, because I started my own company. At this point I have like 20 people working with us. And once you do that, you figure out that most of your time goes into reviews. So let me share what I built for myself. As you can see, this is my screen, this is my Cowork. I have a bunch of stuff going on, but what people send me all the time is a PRD for review. So now I have given the review job to my agent. How it works: if you send me a PRD, I just start a new task and select my folder. So this is Cowork, same as Claude Code. You can provide it a context. So I will select my review context, and I will change my model. Opus 4.6 is too expensive, so let's do Sonnet. Because I'm paying them a lot, they are like, let's eat his tokens with Opus and give him the best results. No, Sonnet is good enough. So I go to the review folder, and now I can upload a file and say, hey, can you just put comments on it? Before this, if you were doing something like this, you would have had to go and create all of it yourself. So this is a product two-pager. We are just building a new product. This is live. And what happens here is that I can upload this file and then say, hey, can you use our checklist and review this file, and put comments as I would have done? Now if you look at this, this is a sophisticated tool that you need, because it needs my checklist, which I already provided in these reviews. But first it needs to build the context; it needs to understand. But now you need not build the query engine, the query-to-knowledge mapping. It automatically finds the skill which I have built, my PRD review skill.
And then it goes to my review instructions, and then it reads my checklist. Based on the checklist I have provided, it's going to review this file and put comments in it like I would have done. As I showed you earlier, I could have put this whole agent inside Slack, and somebody could have just asked for my review, and this would have given a review in five minutes and saved me a bunch of time on basic things which people forget, of course, because they are busy with their lives.
[49:51]
Akash
So that's the first step. Can we go inside the PRD review checklist MD and see what that's all about?
[49:57]
Mahesh Yadav
Definitely. So let's go to my VS Code. It's the same folder in VS Code, and now you can see this is my checklist. I've created this with a lot of blood and sweat, which is mostly prompting Claude. But I had a checklist before, because I was very fanatic about how we should write our two-pagers. And I'm a big fan of Amazon PR/FAQs. I like that format because with one page you can get the gist of what's going on. So now this is our checklist.
[50:31]
Akash
Let me interrupt you, just based on where your face is now. Can we move the camera down a little bit again? Oh, thank you so much.
[50:40]
Mahesh Yadav
Yeah, there is this circus we need to do because of this mic thing today. Yes, please. Thank you for pointing that out. Thank you. I have no clue because I get lost in these.
[50:52]
Akash
You're killing it. So you were talking about how you built this with blood, sweat, and tears of prompting.
[50:57]
Mahesh Yadav
Great. So I'm a big fan of Amazon PR/FAQs. I have looked at all the PRD formats and I'm still stuck with that one. We had one at Meta and we had one at Google, but I still love this whole two-pager thing of a PR/FAQ. And this is the checklist you are looking at, with all the things it checks: is the urgency of the problem clear? Is the solution differentiated from ChatGPT, copilots, or commodity AI wrappers? So I have put my AI-specific things here also, because this is specific to building and evaluating AI tools. I built it with a lot of the knowledge I have, but obviously with a lot of prompting as well. But I stand behind it. That's the deal. If you do what I have said here, you can be rejected. So as you have seen, what it is doing now is saying: okay, good, I have read the checklist and the doc. Let me unpack the doc. Now let me inspect the document to find the right paragraphs to anchor each comment to. And it's also going to use another skill that it has, which is updating Word documents. So it will comment inside the document: it will unpack the document and find which sections or points it has problems with. So now it's saying, all 7 comments added, now I need to insert the XML markers into the document XML to anchor each comment to its paragraph. And once that is done you will see a file which should look like this. So this is how it will go and say, okay, Mahesh has put this: market sizing is too broad. Moat is missing; add a section explaining the defensible advantage. What stops Datadog or Big Four consulting firms from building this? AI failure modes are unaddressed: what happens when attribution is wrong? How do you handle misclassification of AI versus human work? So not wishy-washy comments, but real, good comments which I would put in. So that's your first step. Maybe you got it right and maybe you didn't. If you go and build an agent, then this would be it.
If you build a chatbot, then this would be it. But then I look at this and I say, yeah, it did a good job, but I don't like this whole section-wise structure. So then I will put another comment. I'm adding my comments now, and I'm saying: we look for the PR/FAQ, so write in that format. This means a heading, problem and solution sections, and then a question-and-answer format. So it seems like the format was not picked up, and that was not in our checklist. So I added a new comment here. Okay, so the beautiful part earlier was that, yes, it did the job, but obviously it might miss a few things, and today I have a very different angle. One day I go and comment on everything. One day I don't comment on anything. So today I want to push the AI. The idea is that if AI does our work, what are we going to do? We're going to push it even further. So now I'm doing my job, which is adding to its comments. Then I look at this: "Why this problem is hard: general observability tools don't understand workflows, or multi-step, multi-tool adoption is invisible." This is too broad. So I can say, hey, this is too broad. Make it pointed, make it point to a human story of what happens to CXOs when they can't see the impact of their AI investments. Great. So now I have put in two comments. So this was the output file that you got, and now you have this whole thing. Then what I built on this is that I said, great, now AI does the review work for me because I have these skills. I have put my instructions in Claude Code and it does the work. But if you have somebody who is working for you, or if you work in real environments, you want it to learn from you every day. So for that I built another skill and another sub-agent, and I have scheduled them to go and check for my review comments.
So what this one does is it runs every two hours or every 30 minutes and checks what comments I am adding. It is automatically scheduled. It looks in the same reviews folder where I'm putting all my things. And then it creates this file called Learner MD, and it learns from my patterns. So now if you look, it's updating this Learner MD file, which Claude Code did not provide. This is what I added as a human. So first, every time a job is done, it files everything away. Because I'm a PM, I do a good job of organizing things, because I need to evaluate and I understand these concepts. So every time a job is done, it creates a folder here which says who did the job, what the job was about, and when it happened. And inside that it stores the checklist I used, the input document, the output document, and what the user-modified document looks like. So it creates all the artifacts. If you want to go back and check, you have all the data you need to debug or see what happened. Then every 30 minutes Claude goes and compares these two files and creates this Learner MD file, which says, hey, I looked at this job folder and the user added these comments. With that I have these new learnings: what I got right, what I missed, and these are the items I want to update in the checklist. But it doesn't update the checklist right away. I have another skill that looks for patterns here, and if it finds one for five days, it sends me an email and says, hey, I want to update your checklist. I have seen that you revised this many times. You have added the same comment five times today or five times in the past week. Do you want me to update the checklist? Here is the updated checklist. I quickly review it, and then my new checklist comes to life.
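The learning step described above, comparing the agent's comments with what the reviewer added and proposing a checklist change only once a pattern repeats, could be sketched like this. It is a simplified illustration, not Mahesh's actual skill: the function names are hypothetical, comments are matched by exact text after lowercasing (a real version would need fuzzier matching), and the threshold of five comes from the "five times" he mentions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def new_comments(agent_comments: Iterable[str], user_comments: Iterable[str]) -> List[str]:
    """Comments the human added on top of the agent's review."""
    seen = {c.strip().lower() for c in agent_comments}
    return [c for c in user_comments if c.strip().lower() not in seen]


def proposed_checklist_updates(
    jobs: Iterable[Tuple[List[str], List[str]]], threshold: int = 5
) -> List[str]:
    """Return human-added comments that recur at least `threshold` times.

    Each job is (agent comments, comments in the user-modified document).
    Nothing is applied automatically; the result is only a proposal to review.
    """
    counts: Counter = Counter()
    for agent_comments, user_comments in jobs:
        for comment in new_comments(agent_comments, user_comments):
            counts[comment.strip().lower()] += 1
    return sorted(c for c, n in counts.items() if n >= threshold)


# Made-up job history: the same human comment shows up in five reviews.
jobs = [
    (["Market sizing is too broad"], ["Market sizing is too broad", "Use PR/FAQ format"])
] * 5 + [([], ["Make it a human story"])]

updates = proposed_checklist_updates(jobs)
print(updates)
```

Keeping the proposal separate from the checklist itself mirrors the human-in-the-loop step in the description: the checklist only changes after the reviewer approves the suggested update.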
[58:10]
Akash
Now, how does that look?
[58:13]
Mahesh Yadav
Yeah, so that is the PRD review checklist, this checklist. And then you will see versions of this checklist getting stored above this whole thing. So there is a master version, and then there is a version that is in the works. So now if you look at it, what I'm trying to do is, okay, because I'm human, right, I have to push beyond what these folks have already solved for. What they have solved for is that you can create these skills, but they are pretty static. This is basically what I think all the people are trying to solve for. Let me just fix the camera first. Okay. So let's start here. What I'm trying to solve for is that, yes, this agent loop can solve world hunger, but it can't solve my hunger. What I have done is that I, as Mahesh, call this agent loop. We take the learnings from it, which is the Learner MD, and then I have my agent which sits on top of it, inside my machine. So this box is Mahesh's AI, and it goes and checks the work done by this agent loop in those folders, keeps tracking my comments or my work every day, and creates this Learner MD, which updates the checklist, which is an input to this loop. Not without my feedback. With my feedback. So now I have created a continuous learning loop. So now every day when I come in and use this system, my reviewer is going to be better than Akash's reviewer or anybody else's reviewer. Yeah, or the Claude Code reviewer. Because it has learned from my best practices. And everybody who's improving this file, maybe five people comment on this file. My organization has a different soul. We care about different things. Now all this can also be contributed if we build on top of the agent loop, bringing our own context. And this is the idea of continuous learners, where I have taken a first step today. But there is more, obviously, that we all can follow. But that's not coming inside the loop.
This is outside the loop, which I'm trying to build with this learner, checklist, and feedback. And once you do that, you can adapt these continuously. Now you have a continuous learner, like humans. You and I go to any company, or like this podcast: from last time to this time, I'm improving based on the feedback you gave me and the comments all of you gave me. In the same way, these agents are also learning now based on what you provide as feedback, which is very natural. We are not asking you for thumbs up and thumbs down. We're just asking you to do the job that you gave to us. And if you add to that job, the next time we will do better. So keep doing your jobs, keep these agents along with you, and they will learn your world. Every day they will get better and push you to get even better at your job. And that, I think, is a dream for all people. If you have a craft, and somebody is there to learn your craft from you and then push you further, I think that's the dream humans always had.
[62:03]
Akash
So we got a preview into PRD comments. But what else should PMs be using Claude Code to do?
[62:10]
Mahesh Yadav
I think first, do your first thing: you started in n8n, then you created your first agent, which was able to do a normal job that you do every day. For me, PRD reviews was the thing. For you, maybe writing PRDs is the thing, and you do that first job. But after that, I think the genie is out of the bottle. Let me show you what PMs do and what we help PMs do in our cohort. Once you understand that all these cool things exist, what can you do with them? So this is our first lab that you can use, which is creating agents in Claude Code. We talk about all the basic things that you need to do. And this one just does competitive analysis. So this is building your own competitive analyzer. It will go research the web and give you insights. Then there is a lab on creating sub-agents. Here you create sub-agents, and we have a lot of sub-agents now inside Claude Code, which will go and look at the different competitors you have, what's the insider news, what's the outsider news, and generate a report for you. Oh, that's not enough? Then this one allows you to create mocks. Okay, once you do your user research and market research, you can create PRDs in Claude Code, but you can also create mocks and visualizations, which earlier you were waiting for your design teams to do. Oh, but that's not fun? Maybe you can clone your mocks from the source and then modify the mocks to build an end-to-end prototype of your product. So here we build a whole product from the mocks. We take these screens and modify them and build an end-to-end working prototype that you can publish, and customers can feel it, touch it, and give you feedback, not on mocks but on a real product. And then you can analyze data, so you can see who is using it, how they are using it, and where things are failing.
So then we give you this lab, which allows you not only to collect data but also to analyze it and create fancy dashboards which show you how many contracts have been analyzed, what the processing time is, what the average rating is, compliance rate, bug reports, all of it. So your app gathers the data, and now you are creating these dashboards. Everything which used to take you almost two to three months, first to write the PRD, to get to mocks, from mocks to a real working prototype, from there to getting customers and seeing the signals, all that is getting squeezed with Claude Code. All this you can do with Claude Code, and we give you labs for that, and the labs are public. For all of Akash's audience we will even do one-hour free sessions, whatever it takes. Because the mission is to make sure that everybody can build in this new age, because there is a lot of spend on these. This technology is very expensive, and if we can't build, if we can't diffuse the benefits of this technology into the economy, we are all going to fail. So with that, at least I will do my part, and Akash is doing his part by giving all of us this platform to spread whatever we have learned or whatever we have seen out there.
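The dashboard numbers mentioned above (contracts analyzed, processing time, average rating, compliance rate, bug reports) boil down to a small aggregation over usage events. The sketch below shows one way to compute them; the event shape and field names are assumptions invented for the example, not the format used in the lab.

```python
from statistics import mean
from typing import Any, Dict, List


def summarize(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate raw app events into the headline dashboard metrics."""
    analyzed = [e for e in events if e["type"] == "contract_analyzed"]
    bug_reports = sum(1 for e in events if e["type"] == "bug_report")
    if not analyzed:
        # Averages are undefined with no analyzed contracts, so report None.
        return {
            "contracts_analyzed": 0,
            "avg_processing_seconds": None,
            "avg_rating": None,
            "compliance_rate": None,
            "bug_reports": bug_reports,
        }
    return {
        "contracts_analyzed": len(analyzed),
        "avg_processing_seconds": mean(e["seconds"] for e in analyzed),
        "avg_rating": mean(e["rating"] for e in analyzed),
        "compliance_rate": sum(e["compliant"] for e in analyzed) / len(analyzed),
        "bug_reports": bug_reports,
    }


# Made-up events with a hypothetical schema.
events = [
    {"type": "contract_analyzed", "seconds": 40.0, "rating": 4, "compliant": True},
    {"type": "contract_analyzed", "seconds": 60.0, "rating": 5, "compliant": False},
    {"type": "bug_report"},
]
summary = summarize(events)
print(summary)
```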
[65:30]
Akash
So Claude Code, it's obviously great. It's a session-based power tool. What are the limitations of Claude Code, and when should PMs be thinking about using OpenClaw?
[65:42]
Mahesh Yadav
Oh, another one. So this is an amazing session. We started with chatbots, then we went to n8n, then we went to Claude Code, and now the most beautiful thing, called OpenClaw. So as we talked about, there's this agentic loop, right? What happened? This is another exciting thing that happened in December. Generally what happens is that in November and December, people get time to actually do things, to sit down and not just run through the errands of life. So what happened is that Peter, who is a developer from Austria, by the way, and all these things need to come from developers and not from big companies. That's the one pattern you are seeing: the big breakthrough came from ChatGPT, which was like one small team at OpenAI; a few people launched something and then they could see an amazing thing happen. Similarly Lovable: a team outside the US just built something and became an overnight sensation. And then you saw OpenClaw. So what is OpenClaw? OpenClaw said, hey, now this agentic loop is open, anybody can build on it. So initially the idea was an open Claude: because this agentic loop comes in the Agent SDK, I can write the same kind of tool as Cowork or Claude Code, anybody can write that. And the beautiful part is that it can connect to any model. When Anthropic built it, the Agent SDK was open, so it doesn't only work with Claude models, it can work with any model. So what he did is he said, this loop is great, but people are having problems connecting to different channels. So as the first layer, he connected to these channels, which is WhatsApp, Signal, Slack, Telegram and everything. And at this point there are hundreds of these. And then he created this gateway, which automatically opens a port and makes sure that these are good. So he did the hard work of taking the formats and making sure that they can all be processed. And then think of what I was passing through in my email example.
You were sending this email, and I was passing it to the model. He takes all these inputs, puts the Agent SDK on them, and says, hey, this is the intelligence layer. Now anybody sending me messages here, I will process them here and then come back, and you can connect your tools and your models the same way. And all this is coming as one thing called OpenClaw. By the way, he said, I'm going to open source all of this and give everybody free access to it. So it's not tied to any company, it's out there, anybody can audit it. So this is the first version of OpenClaw. He said, I'm going to open source Claude, which by the way, yesterday the code got leaked. But this is the code getting leaked in a very nice way. Now, if you look at it, because it's developed by a developer, it has two things that Claude Code didn't have at that time. One, it allowed everybody to just go and delegate work to it. As you were saying about sessions, the idea of Claude Code was that it was built for developers, and a developer does iterations. But here there were no iterations. Here what they did is delegate the work. So you can delegate the work: I will go do the work, and when the work is done I will come back to you. And you need not bother about it, because I'm coming to you directly through your channels and you need not be in the terminal with me, going back and forth. So one idea is the delegation idea. The second idea is this idea of a shell, or this sandbox. Instead of you giving me permissions on every file, every folder, why can't you just install me on a machine? So install me on a Mac Mini. That's why the Mac Mini is out of stock or on a three-week delay now, because if I can install this, this is the new operating system. These tools became the UI, the mouse-clickable things that you can assign work through. And now this whole compute works for you.
And the third unlock was that you can connect any model, even open source models, and you are no longer tied to the limits of Claude Code that basically everybody is hitting every day. So now you can tie it to any open source model. So that is OpenClaw for you. This is the template that you will see everywhere now. This template is going to be the next operating system, if my predictions are right, and I've been predicting these trends all along, by the way. I said agents will be great in 2023. Then I said, hey, you know what, we are going to go to multi-agent and orchestration in 2025. And this year I've been saying that this pattern you have seen here will be copied again and again and again, and everybody will build on top of it. So now this is the new agent declaration, or the new agent definition. You can give it the work, the work gets done by the agent, and you get the output back. So you can measure input and output rather than measuring tools, evaluations and all. So that's OpenClaw for you, and you can see it in action as well, because I know you will just say, hey, can you show it to me in action? So now I know, right? I've done it enough times with you. By the way, I have my Mac Mini here also. But for today I have done a simple installation, because it's hard to show you my Mac Mini; there are just a lot of things going on there which I can't share. But another way to run OpenClaw, beyond a Mac Mini, is this tool, UTM. On your Mac you can create a new VM using this tool. And I have my whole setup; by the way, we have labs for these. So you can see that this is my session, you can see my overview, you can see which channels I have connected. I have connected only WhatsApp as a channel. And you can see my usage of it, which is pretty low, because this is not what I use daily. The daily one is on the Mac Mini.
And then you can set up cron jobs, which are like scheduled jobs. You can define your agent skills and nodes here also. So how can you delegate work? Okay, I can just go in here, go to my WhatsApp, and start chatting with it. I can say, hey, do deep research on what the agentic loop and long-horizon job capabilities are and how they can impact the software market. Give me a full report. By the way, I can say all this through my command line as well. So it goes in. Now it's going to process this request and give me results. And this is just me sending a message on WhatsApp, as if to a friend, to an agent on the other side of the world. And you see that it says, hey, here is the agentic loop: introduction, landscape, emergence of autonomous software agents, increased demand, companies leveraging it. And this is just vanilla. I have not even put in the skills and everything that I have put on my other setups. But you can just connect a channel, and it takes 30 minutes. We give you labs as well. So for this also we have set up labs where you can go and create your whole... Just give me one second. Okay, this is coming up. OpenClaw. So here also we have labs for you where you can go and set it up with WhatsApp. But you can also automate your whole world using this lab, where you can connect your Gmail, which I have connected on my Mac Mini and couldn't show because there is so much personal stuff. But you can connect it and make it your personal assistant. You can let it manage your calendar, step by step, following this guide. The second thing we have done here is that you can go and create a whole discovery process or a whole autonomous developer for yourself, which scans your GitHub, fixes the bugs that are P2 or P3 for you, and then sends a pull request to your development team. So if they are becoming a threat to you, you are becoming a threat to them.
So you can build the whole lab where it scans your GitHub first, the agent goes and runs tests for your UI or a new feature, then files the bugs. And then instead of assigning these bugs to developers, it goes and tries to fix the bugs as well, and then sends a pull request for final approval to engineering, because they still control the code. But you can build all that in OpenClaw. I'm also building a mini PM in OpenClaw which will do all the PM's jobs. But first I thought maybe I should build the dev, because that's what is scarce for me. Maybe the engineers are building the mini PM. So that's OpenClaw. If you take this kind of structure, you can set up and assign work to it through the channels that you are already familiar with. And then you have this idea of sandboxing, of giving it a whole machine to itself: you are just giving it the work and the permissions to go access things and do your jobs.
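The gateway-plus-delegation pattern described in this turn can be sketched roughly as follows. This is an illustrative sketch only, not OpenClaw's code or API: the `Message` type, the per-channel parsers, the payload field names and the `echo_agent` stand-in are all hypothetical. It shows the two ideas from the talk: each channel's format is normalized into one common message, and the reply is routed back to the channel the request came from.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    """Common format every channel payload is normalized into (hypothetical)."""
    channel: str
    sender: str
    text: str


# Hypothetical per-channel parsers; real channel payloads look different.
def from_whatsapp(payload: dict) -> Message:
    return Message("whatsapp", payload["from"], payload["body"])


def from_slack(payload: dict) -> Message:
    return Message("slack", payload["user"], payload["text"])


PARSERS: dict[str, Callable[[dict], Message]] = {
    "whatsapp": from_whatsapp,
    "slack": from_slack,
}


class Gateway:
    """Normalizes inbound messages, delegates to an agent, routes the reply back."""

    def __init__(self, agent: Callable[[str], str]) -> None:
        self.agent = agent  # the "intelligence layer"; any model could sit behind it
        self.outbox: list[tuple[str, str, str]] = []  # (channel, recipient, reply)

    def handle(self, channel: str, payload: dict) -> str:
        parser = PARSERS.get(channel)
        if parser is None:
            raise ValueError(f"unsupported channel: {channel}")
        msg = parser(payload)
        reply = self.agent(msg.text)  # delegate the work, no back-and-forth
        self.outbox.append((msg.channel, msg.sender, reply))
        return reply


def echo_agent(task: str) -> str:
    """Stand-in for a real agent loop; it only echoes the task."""
    return f"Report for: {task}"


if __name__ == "__main__":
    gateway = Gateway(echo_agent)
    print(gateway.handle("whatsapp", {"from": "+100", "body": "research agentic loops"}))
```

Because the agent is just a callable, swapping in a different model or a longer-running loop would not change the channel side, which is the point of the gateway layer in the description above.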
[76:33]
Akash
So you've been a PM at all these big tech companies like Google. Let's be real, right? Google isn't going to allow you to just give OpenClaw access to your company. How should a PM at a big company mitigate security concerns? How should they be using this latest technology?
[76:51]
Mahesh Yadav
Yeah, no, I 100% agree. I think the idea is that OpenClaw is not a technology, it's not a product, it's a pattern. For me it's a pattern for how these agents can be useful with an agentic loop, and they will copy the pattern and offer it to you in a sandboxed way, which can be inside their Antigravity or inside Gmail or Workspace or on GCP. So on GCP, if you ask somebody today and say, hey, my Kubernetes cluster is down, they can't debug that for you today. But with this pattern, this message will be sent to their sandbox VM, which will be running OpenClaw or some version of a similar pattern, and it will first go and simulate your Kubernetes cluster, try to make the same deployment, and then debug it: first reproduce the whole problem. And that's what people do as humans. We first try to reproduce the problem, we try to build the same cluster. But agents couldn't do that. Single loops can't do it, but now we can, because we have full control of a full machine. Now the agent can reproduce your problem, suggest a solution, try the solution, and then come back to you. And that pattern on that VM is fully controlled by Google. As a user, all you are seeing is that I provided a solution to your problem; you don't know that I tested it, but now I can test it. And that, I think, is what I am excited about. Obviously there are challenges all around the security layer, which I think you already talked about in your earlier podcasts, where there are a lot of skills and a lot of attacks around those. But I think sandboxing is the next big thing. Now what is left, right? We have the agent loop, we have the whole pattern, then what is left? I think the ability to sandbox these agents in a controlled way. That's an unsolved problem and Google will solve it. And I think OpenAI is solving it.
And that's what you saw with Manus also. By the way, this idea of OpenClaw is not new. Manus, the personal agent company, gave you a VM and they run their code on it, and last I looked at it, I could actually log into that VM and start doing web browsing on it, or do whatever I wanted to that VM. So this idea of enabling a lot of possibilities was always there. OpenClaw just made it so popular, so famous, because it was open source. So Google will bring it to companies in its own sandboxes and solve end-to-end problems which humans used to solve on their laptops, and then dismantle the sandbox after each query or each problem it solves.
[79:41]
Akash
So can you put it all together, what we've learned today? How do we organize this knowledge, what drawers do we put it in? Basic knowledge about ChatGPT and n8n agents, Claude Code and OpenClaw, how does it all come together to create that builder PM?
[79:57]
Mahesh Yadav
Yeah, that's a great question. So you're saying, Mahesh, you said a lot of things, how can I have a plan for it? I think for the first two to three weeks, just understand the basics. Without that you won't be able to leverage or even understand this. It will become overwhelming once you reach the OpenClaw stage. With the people who join our cohort, I spend the first six weeks explaining what the model is, what the intelligence or knowledge is, how these tools actually work. So spend those first three weeks, then get to Claude Code or Cowork and automate your world: whatever you do now, agents should be doing, and you should be building systems which allow agents to continuously learn or follow your patterns. In my example today that was two things, my checklist and my learner, and then this human-in-the-loop pattern, which was: update everything every night but keep me in the loop. That's what I would love you to build as the second stage. And the third thing I would love you to spend another month on is understanding the ins and outs of OpenClaw, and seeing how you can have one thing in your life, in your job, that you can just give to a machine, and the machine does the work and gives you the results somewhere else, and the whole machine is controlled by this agent. Give permissions left and right, but make sure that you are not giving permissions to your own world; just create a separate world for this, and then see if you can delegate work and get it done. And once you have done these three things, then just read, obviously, your newsletter, or take any product and see: is it a variant of OpenClaw or the agent loop, or is it something that is starting from scratch, like model knowledge? Then you will be able to see what possibilities exist out there and which ones work for your company, for your feature, for the product that you're going to build next. And that's your next three weeks.
So this is like three to four weeks first, then three weeks, then two weeks. So you're looking at nine to 10 weeks of a good walkthrough of building with AI, or becoming a builder PM.
[82:21]
Akash
So you became a builder PM, now you're trying to interview for the role. How has the PM interview changed with AI? What should PMs expect in the new AI PM interviews?
[82:32]
Mahesh Yadav
Yeah, I think there is one thing that I would love to share with everybody, because I'm doing a lot of research on this and a lot of people are coming to me every day. My calendar is booked with 15-minute calls. I do four of those a day with our cohort members. Just 15 minutes, because they have interviews, and I don't charge for it if you have done our cohort, because I just want to help and I want to stay updated on what's happening as well. So let me tell you the things that are happening. One is that, especially for level 5 and level 6 AI roles, the idea of doing a normal product sense round is gone. You will be given a problem and you will be asked to solve it, either in a case study or during your interview. That is becoming a pattern, and in that, people are trying to see: hey, do you understand where we stand, how the world is working, or are you stuck in some past from six months or a year ago? Are you the person who is going to take us to the new world, or drag us back to the old world, making old decisions? That's the first thing I'm going to check by giving you some assignment, a case study, or a random problem. The second thing people are trying to assess is: hey, have you done some kind of system design work? Because there is still a lot to understand in these things. So they will ask you, hey, design a system for me. Or: here is a system design, where do you think AI with OpenClaw should improve it, or with a Claude Code-based agentic loop, how will you redesign the system? Sometimes PMs come to me about this. We have this story where my wife was interviewing for an MBA job after doing her MBA. She's a very good software engineer and this was a hybrid job. And the recruiter asked her what a linked list is, or how to reverse a linked list, which is a very basic question for engineers. But she was expecting an MBA question.
So she put down the phone, and she's like, I did the whole MBA to get away from linked lists, and here they are again. So some people get offended by these: hey, why is there a system design question for me in an AI PM interview, or a PM interview? But it's becoming normal. Because if you don't understand the design of these systems, you are not going to find the right capabilities that we should be building on. These tools are coming up every day, and each design is elevating what's possible. But if you don't understand the design, you can't see what the possibilities are. That's why people are trying to test you there. So these are the two things. Beyond that, of course, great product sense, great taste in the product, paying attention to detail: those are not going anywhere. But these are the two new things that I will add in testing whether you're a builder PM or not. And I do that all the time. If I give you a job and you're not pulling out your Claude Code or some kind of tool like Lovable, you're already out. If you just ask me, hey, can I use a drawing tool, or can I create mocks in Figma, those are the things that are done, done. I'm not interested. So that's the new world.
[85:51]
Akash
One of the distinctions that you make pretty frequently is this distinction between agentic AI and AI. Specifically, what is the difference? What do people need to understand?
[86:03]
Mahesh Yadav
Yeah, AI is this idea that you can find patterns, this idea that we all have data. Machine learning helps you find patterns in data. AI helps you use those patterns and make money. That's broader AI, AI as an umbrella. And then agentic AI is the thing which allows you to actually take actions, do jobs and finish work. So the idea is that agentic AI needs to have these three or four components which we talked about. Can I understand the world that I am living in? Can I understand what's happening right now in that world? That's building my context from those two. Once I have built it, can I take actions, which is running bash commands or calling tools or MCP servers? When I do that, can I run my own evals and check whether I have achieved the goal or not? If I can do all of these, then I am an agentic AI product. But if I just send you something like, hey, is this a positive or negative emotion, then I'm doing more of an AI thing, which is: I can do one thing specifically, and if you send it to me in this format then I will work, else best of luck. That's the whole world of AI, or cognitive services as we used to call it at Microsoft. But this is a world where we are relying a lot on the model, on this intelligence, giving it loosely connected tools, knowledge and memory, and then just trusting it to solve world hunger or any problem thrown at it. That's agentic AI, and that's where most of the excitement and money is today.
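The contrast drawn in this turn can be sketched in a few lines. This is a toy illustration under stated assumptions, not any product's real loop: `classify_sentiment` is a keyword stand-in for a single-purpose model, and `run_agent`, its parameters, and the counter example are all hypothetical. The loop shows the components named above: knowledge plus current state as context, tools to act, and a self-evaluation that decides when the goal is met.

```python
from typing import Callable


def classify_sentiment(text: str) -> str:
    """Single-purpose 'AI': one fixed input, one fixed job (keyword stand-in)."""
    return "positive" if "good" in text.lower() else "negative"


def run_agent(
    knowledge: dict,
    state: dict,
    tools: dict[str, Callable[[dict], None]],
    choose_tool: Callable[[dict, dict], str],
    goal_met: Callable[[dict], bool],
    max_steps: int = 10,
) -> list[str]:
    """Toy agentic loop: evaluate, build context, pick a tool, act, repeat."""
    trace: list[str] = []
    for _ in range(max_steps):
        if goal_met(state):                   # self-evaluation: am I done?
            break
        name = choose_tool(knowledge, state)  # context = knowledge + current state
        tools[name](state)                    # act: here a tool mutates the state
        trace.append(name)
    return trace


if __name__ == "__main__":
    knowledge = {"target": 3}
    state = {"count": 0}
    tools = {"increment": lambda s: s.update(count=s["count"] + 1)}
    trace = run_agent(
        knowledge,
        state,
        tools,
        choose_tool=lambda k, s: "increment",  # stands in for the model's decision
        goal_met=lambda s: s["count"] >= knowledge["target"],
    )
    print(trace, state)
```

The `max_steps` cap is a simple guardrail: without it, a loop whose evaluation never passes would run forever.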
[87:41]
Akash
I want to ask you a couple of hot personal questions. You spent 13 years in big tech. You started at Microsoft around 2012, and you left Google in 2025. Can you share, honestly, what people can expect in terms of total compensation trajectory? What was yours over those 13 years?
[88:01]
Mahesh Yadav
Yeah, I think at first it's pretty standard, right? You start with 120. I think AI worked very well for me. So I started with 120 and spent all my life at Microsoft growing it to 360, 400. At that time I felt like I had achieved nirvana, like this is the best it can be. And then I think I got a 70% bump when I joined Meta, and then another 70%. So this is all because of the AI work I had done. After that I pretty much doubled my salary every 18 months to two years. Every switch I made was a double-salary switch. So in the end, my last comp was looking at 1.3, 1.4 million, which is what you make easily at my level. In my experience, if you're working in AI, this is not you applying for jobs. This is them saying, hey, we need you, we are doing this new thing, and it seems like you are the only one who has done this before. You tell us what you are getting, we will give you 30, 40% on top of whatever you're making. And then you can say, I need 100%, and generally they don't say no. And this is not only my story, right? All my friends, at any stage, if they were at Meta, they are working at Nvidia today and their total comp is looking at 2, 2.5 million.
[89:19]
Akash
Wow, that's insane. So is that why you bounced around so much? Because I never see people who have worked at all four companies.
[89:27]
Mahesh Yadav
I wish. Well, it's not that, right? I loved Meta. Meta happened because I was just bored at Microsoft and I wanted to get out of Seattle; for personal reasons I wanted to live in the Bay Area. Meta was a great company and I would have never left Meta. But Meta had this legal visa problem. Once I switched, I needed my green card filed, and I needed to switch my role to product manager. And they couldn't do it because they had a USCIS case pending, so they couldn't file my green card. And I was in that line for 10 years. You might know a little about that waiting state. That's like one life thread that's always running. So I gave them a good eight months to a year to resolve that USCIS case, but it was not moving anywhere, and at that time I needed to make a call. At the same time, Bedrock at AWS was just being built, and they needed somebody, and they threw a lot of money at me. I could have waited more, but they didn't let me wait at all. So that was a big deal. And then Google was more like a dream company, to be honest. My wife always told me that if you are a PM and you have not worked at Google, you are not a PM. That was her way of judging me. Because, you know, if you've done an MBA, you need not prove anything to the world. When you come from an engineering background and become a product manager, you have to prove to the world that you are a legit product manager, not a developer who just wore a suit. So that was just that. And they needed somebody who had built frameworks and actually built agents in 2024. That, I think, is what got me in: not my awesome frameworks or Porter's five forces, it was mostly how much I knew about AI. And so was my interview. My interview was very AI-driven: what I have done in AI and how I can bring AI to production. So that was my Google. So that's the comp. And I think it was never driven by money, otherwise I would have never left. Right?
[91:36]
Akash
So why did you leave? It was 1.3 at Google when you left, and it would be something like 2.5, 2.6 now if you had stayed and maybe jumped again. So why did you leave? What are you up to now?
[91:47]
Mahesh Yadav
Yeah, so the idea is, I think these companies are going to throw a lot of money at you to keep you and then waste you. That was my observation, especially at Google. Right? I mean, I think I'm far enough away from them now, and I'm not going back ever. So the idea is that these are large companies, and if you look at what happened in AI, large companies have not produced shit in AI. If you look at OpenAI, it was a small company which created ChatGPT. Then it was a small company that created Lovable. It didn't come out of Google. And then if you look at Claude Code, it was created by a very small team inside Anthropic, when Anthropic was not big. And then OpenClaw. So what has happened with AI is that the tools are distributed, but these big companies have no environment to grow something which can be imagined, put in production and put in customers' hands. I love the people at Google. Most of the level 3 or level 4 thinkers live inside Google. I will go any day to spend even two hours with those people. I love them. But the company will never launch something like OpenClaw. Something like this will be killed. Maybe you are thrown out of the company for trying something like OpenClaw. That's the kind of environment and that's the kind of guardrails they have put inside their whole ecosystem, because it's such a big machine. Same with ChatGPT, right? If you look at OpenAI today, they have become so big that it's very hard for you to just come up with new ideas, throw them on Twitter, take feedback, iterate for six months, and then one day say, I have created something which is, I think, the most liked repo on GitHub. That is not possible. And for me, I reached a point in my life where I wanted to stay as unbounded as I could. And I was very blessed that I had this course, and that was giving me a lot of satisfaction, staying with people and not losing the one thread I had. Money was secondary, to be honest.
The main problem was that you get so dependent on these institutions for learning, for staying up to date. I do miss that, right? I would pay them today to be around Google. So that is the main thing. But for me, that course that I started teaching on Maven always helped me stay even more up to date than what is there. As you see, I have tried OpenClaw, I have built my mini PM. I have done more here than I would have done at Google. And given a choice, I will never go to Google again or, to be honest, to any company. They just kill every intelligent neuron you have. You won't believe it, but for a two-page document you have to have a page of approvals, and that takes like six weeks. In six weeks a non-builder PM becomes a builder PM, and then they can build anything they want, ever. So that's the world we live in, and these companies are backward. I know most people will watch this podcast for the branding of these big companies and the dream of how you get in. I just want to assure you that you have more today than you will have once you start working at these companies, especially in AI and especially for the next two years. This is the time to go build. This is the time to build your own world. I believe in that future, that's why I left, and I have zero regrets.
[95:28]
Akash
What a way to end it. Mahesh, this is amazing. Your last episode was crazy. I think in the first two weeks or so it hit 8,000 views, and every month since, it consistently gets 3,000 to 4,000 views because your content delivers. It's evergreen, and this episode was a perfect demonstration of that, starting from first principles through to actions. Thank you so, so much.
[95:53]
Mahesh Yadav
Oh Akash, thanks a lot.
[95:55]
Akash
I hope you enjoyed that episode. If you could take a moment to double check that you have followed on Apple and Spotify podcasts, subscribed on YouTube, left a rating or review on Apple or Spotify, and commented on YouTube, all these things will help the algorithm distribute the show to more and more people. As we distribute the show to more people, we can grow the show and improve the quality of the content and the production to get you better insights to stay ahead in your career. Finally, do check out my bundle at bundle.akashg.com to get access to nine AI products for an entire year for free. This includes Dovetail, Mobbin, Linear, Reforge, Build, Descript, and many other amazing tools that will help you succeed as an AI product manager or builder. I'll see you in the next episode.