Check out the conversation on Apple, Spotify, and YouTube.
What is a builder PM? (1:53)
Aakash: PMs are being asked to push PRs, PMs are being asked to code. This is the rise of the builder PM. But what is a builder PM? Today I have Mahesh Yadav, who has been a PM at Microsoft, Amazon, Meta, and Google. He has seen all the top AI companies, he has been an AI PM for a long time, and now he is training AI PMs. Today he is going to help you understand how to become a builder PM, how to use n8n, Claude Code, and OpenClaw to become a more effective PM. Mahesh, everybody loved our last episode. Thanks for coming back.
Mahesh: Thank you for having me. I think this is the time of urgency. I would love to use your platform to send this message to all PMs that this is our time and we should be ready. When this time arrived, I was always preparing for this. And now the time is right for all the PMs to shine. It’s just a little bit that we need to go learn. And if we learn what we need to learn, this is our moment.
Aakash: So what is a builder PM and how does a PM become one?
Mahesh: I had an engineering background. I was not a traditional PM who came from a B school and then went to McKinsey and then became a PM. I had a very gradual move to PM. I was an engineer and I was always building. And then I became a PM because I was always building what the customer wanted, working backward from customers rather than just building for the sake of building.
For me, a builder PM is like, as PMs, we are always building. Our job is to build the right thing. Now earlier, if you had the tools, you would have built the whole product, but it was very hard. You needed at least three or six months of rigorous coding, testing, deployment. But in the new age, even people who are engineers are saying they are not writing code anymore. They are talking to customers and Claude Code does coding for them.
In that age, the skill that becomes important is what to build and what does the customer want. And if you use the tools to do the right prototyping and build at least the first version of your product, then you become the builder PM. In a nutshell, a builder PM is somebody who can talk to customers, figure out what needs to be built, and build the first version and get to 10 customers without talking to any developer at all.
The four components of every agent (6:29)
Aakash: Can you show us in action? What are the skills and concepts we need to understand?
Mahesh: There is a lot of misconception that if you start using Claude Code or configure OpenClaw, you become a builder PM. I think just knowing the layers or understanding how these things work is the first step. I will start my journey first in understanding these concepts.
In the new world of agentic AI, I would start with something like n8n and ask: what is an agent? How does it interact with a model? What is a model? What are its limitations? What is memory? What are tools?
Think of it like how we grow as humans. A kid is born. The first step is gaining knowledge of the world. We teach our kids, this is hot, this is cold, all those signals that describe the current state of the world. That is your signals or memory. Then if you did a good job, you can ask your kids to get a glass of water for you. That is tools: they can take a glass, open a tap, fill it, and bring it back. Then we learn the guardrails, the laws, what is possible.
If we replace humans here with a model, the model is just the intelligence layer. It is trained to predict the next word and now has some reasoning. But you need all this harness, or scaffolding, to actually build something that can solve problems. This harness is called an agent. Every agent has at least one of four things in it, and the good ones have all four.
Building an agent from scratch in n8n (6:29)
Mahesh: Let me share my screen. So this is my n8n. On the right hand side, you just take an AI agent. The AI agent has a model, which is the intelligence layer. So now I am connecting an OpenAI chat model. I will save money and pick GPT-4.1 mini. So now you got the model. This is like a little baby, does not know anything about your world, but now it has intelligence.
So I can ask it some questions like, what are neural networks? It can answer. But if I ask it what President Trump said about ending the conflict in Iran, it says my knowledge cutoff is June 2024 because you are picking a cheap model. It does not have that knowledge, so it cannot answer.
So maybe we need a tool. I will add Tavily, which is a search tool. This allows the model to search. I am telling it that based on what the user is asking, define what you want to search on the internet. Now I ask the same question and it goes to the search tool, it finds information, and it is able to answer.
Aakash: Its training data probably ended in like 2023. Let us see.
Mahesh: Then if I ask it, what conflict am I talking about? It says, I do not see any previous mention of conflict. So if you build an agent with the tool and intelligence, it will be a stupid agent because it does not have memory. Who wants to talk to a person who does not remember anything?
So I add a simple memory which takes a session ID and remembers the last five conversations. Now I ask the same question, it does the search, updates the memory. And if I ask what conflict I am talking about, it fetches from memory. Now it does not call the tool because it sees I am looking for information it already has.
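The model, tool, and memory pieces from this demo can be sketched in a few lines of Python. This is a minimal illustration, not how n8n implements it: `fake_search`, `SimpleAgent`, and the keyword rule for choosing between memory and the tool are all hypothetical stand-ins. In a real agent the model itself decides whether to call the search tool.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple


def fake_search(query: str) -> str:
    """Hypothetical stand-in for a search tool such as Tavily."""
    return f"[search result for: {query}]"


class SimpleAgent:
    """Toy agent: one tool plus per-session memory of the last few exchanges."""

    def __init__(self, search_tool: Callable[[str], str], window: int = 5):
        self.search_tool = search_tool
        self.window = window
        # session ID -> bounded queue of (question, answer) pairs
        self.sessions: Dict[str, Deque[Tuple[str, str]]] = {}

    def ask(self, session_id: str, question: str) -> Tuple[str, str]:
        memory = self.sessions.setdefault(session_id, deque(maxlen=self.window))
        # Assumption: a crude keyword rule replaces the model's own judgment
        # about whether the answer is already in memory.
        if memory and "talking about" in question.lower():
            last_question, _ = memory[-1]
            source, answer = "memory", f"You were asking about: {last_question}"
        else:
            source, answer = "tool", self.search_tool(question)
        memory.append((question, answer))
        return source, answer
```

Asking a follow-up in the same session is answered from memory, while the same follow-up in a fresh session falls back to the tool, which mirrors the behavior shown on screen.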
Adding knowledge – RAG and embeddings (18:30)
Mahesh: But maybe you have contracts in your company and you want to ask questions on those. Can I ask a question like, what are the payment term impacts of tariffs and war on our MSA contracts?
Aakash: I guess right now we have not given it a RAG database.
Mahesh: So I have another lab where you can create a knowledge base of your company information. You can upload contracts. What it does is it creates chunks. You see a data loader which takes the type of data, converts it into text, and then does text splitting. It takes 1,000-character chunks with a 200-character overlap. It divides your whole file, then calls the embedding model which creates a RAG-based database for you.
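The splitting step described here (1,000 characters with a 200-character overlap) could look roughly like the following. The function name `split_text` is an assumption for illustration; n8n's own text splitter may break on separators rather than fixed character offsets.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks that overlap their neighbors."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new chunk starts after the previous one
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

The overlap means the last 200 characters of one chunk are repeated at the start of the next, so a clause that straddles a boundary still appears whole in at least one chunk.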
Now I ask the same question based on our MSA contracts. It queries the tool, extracts information from the document you provided. It says the document does not specify any provision, so we are good. That information comes from your knowledge.
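To make the retrieval step concrete, here is a toy sketch of how a question gets matched against stored chunks. It assumes a word-count vector in place of a real embedding model, and the names `embed`, `cosine`, and `top_chunks` are made up for this example. A production setup would call an embedding API and a vector store instead.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the question, best first."""
    query_vector = embed(question)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The agent then passes the top-scoring chunks to the model as context, which is why the answer can be traced back to the uploaded document rather than to the model's training data.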
Multi-agent systems and evaluations (21:54)
Mahesh: Single agent systems are great, but if you want to do real checks, you need multi-agent systems. In this one, you can send an email. I say, can you get me the risks in this contract before signing? The email hits this provider, which is my Gmail. It automatically triggers the workflow. When it is done, I get a response with all the risks in it.
You start first understanding all these things. These are the connectors, these are your agents. It has multi-agent systems with playbooks or your database connected. Then it can do end-to-end work like humans do.
The last thing is evals. How are we going to evaluate these agents? Because if they do a bad job, you are going to get fired. They are not going to lose their job. So we create ground truth. I have taken a contract, here are the terms you need to look at, here is the correct value, and whether there is a risk or not based on our playbook written by a real lawyer.
You can run a workflow where during the day, people submit files, you find the risk, results get stored. In the evening, you evaluate using the judge. It goes through row by row, eventually telling you that you have a risk quality that detects 80 percent of risk correctly. But your modification quality is only 30 percent as good as a human would have done. So you have work to do.
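The risk-detection number in this kind of eval can be computed directly from the ground-truth rows. The sketch below assumes a simplified row shape (`EvalRow`) and measures only how many true risks the agent flagged. The modification-quality score mentioned above needs a judge model and is not covered here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvalRow:
    term: str             # contract term being checked
    expected_risk: bool   # ground truth from the lawyer-written playbook
    predicted_risk: bool  # what the agent reported


def risk_detection_rate(rows: List[EvalRow]) -> float:
    """Fraction of truly risky terms that the agent also flagged as risky."""
    risky = [row for row in rows if row.expected_risk]
    if not risky:
        return 0.0
    caught = sum(1 for row in risky if row.predicted_risk)
    return caught / len(risky)
```

Running this over the day's stored results each evening gives a single number to track over time, in the spirit of the 80 percent figure from the demo.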
When n8n falls short (30:00)
Aakash: When does n8n fall short? When do you move beyond n8n?
Mahesh: n8n is more like a tool which allows you to get to your first 10 customers. It is very powerful, especially with Webhooks. Last time I showed you how to create your backend in n8n without any code and connect it to a Lovable or v0 frontend and build the whole solution.
But if you want to iterate, put things in production, if you want three people to contribute to your code, if you want test suites or containers, it does not support that. The worst part is there is no way for people to see the code and get to code mode. Once people want to get to the next stage, to put this in code and share it with their team so they can see quality, and take it to hundreds or thousands of users in the most latency-optimized way, it has no answer. It just stops you beyond 10 customers.
Why Claude Code changed everything (33:53)
Aakash: Please, Mahesh, show us when should we be using Claude Code, how should we be using it?
Mahesh: You should spend about two weeks in n8n and beyond that you should move to Claude Code. In the last six months, there are so many possibilities with Claude Code and Cowork to build things and put things in production.
It is the same tool chain which a person who has no coding experience can use, like building with skills, creating subagents, hooks, scheduled jobs. And on the right hand side, people who know how to code can also build on top of what you provided. If you can build something that codes well, it can do any task well. That is what Claude Code is.
The agentic loop and what Claude Code replaced (35:26)
Aakash: You talked at the beginning about needing to harness this moment. Andrej Karpathy talked about something changed in December 2025. What exactly changed?
Mahesh: If you look at the last three years, companies did three things with AI. The first kind, like Gamma, connected models with tools and connectors. They just made it easy to create slides and they are at a billion dollar valuation. The second kind, like Harvey and Legora, took models and harnessed them to domain-specific knowledge. They became one to ten billion dollar companies in two years. The third kind, like Salesforce Agentforce and Amazon Q, provided frameworks to build agents.
Then a breakthrough happens inside Anthropic where they built the agent loop for coding. The idea is it will build the context, take actions, do evaluations, and based on that, come back and keep doing that again and again. If you are a context company, if you are an action company, if you are an eval company, all this is now part of Claude Code.
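The loop described here (build context, take an action, evaluate, repeat) has a simple shape when written out. The following is a generic sketch under stated assumptions, not Claude Code's internals: `agent_loop` and its three callback parameters are hypothetical names, and real systems plug a model, tools, and checks into those slots.

```python
from typing import Any, Callable, List, Tuple


def agent_loop(
    goal: Any,
    gather_context: Callable[[Any, List[Any]], Any],
    act: Callable[[Any], Any],
    evaluate: Callable[[Any, Any], bool],
    max_steps: int = 5,
) -> Tuple[bool, List[Any]]:
    """Repeat context -> action -> evaluation until the goal is met or steps run out."""
    history: List[Any] = []
    for _ in range(max_steps):
        context = gather_context(goal, history)  # e.g. read files, recall prior results
        result = act(context)                    # e.g. run a command, edit a file
        history.append(result)
        if evaluate(goal, result):               # e.g. run tests, check output
            return True, history
    return False, history
```

The `max_steps` cap is a design choice for the sketch: without some budget, a loop whose evaluation never passes would run forever.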
The real unlock was computer control. We gave them access to your file system and bash commands. This is how the whole world works. If you can control your computer and your browser and have access to your file system, you can do all the context management. With bash, you can do all the actions.
On top of that, models like Opus 4.6 have grown to do long-horizon jobs. Six months ago, the longest horizon job was three minutes. This has grown exponentially in the last six months. You put in a long-horizon job, give access to bash and the file system, and users can create skills naturally in plain English. You have unlocked the human potential that was locked inside coding.
PRD review automation in Claude Code (47:38)
Mahesh: Let me show you what I built for myself. People send me PRDs for review all the time. I have given the review job to my agent. This is Cowork, same as Claude Code. You provide it context, I select my review context. I upload a file and say, can you just put comments on it?
It needs my checklist, which I already provided in these reviews. It automatically finds out the skill, my PRD review skill. Then it reads my checklist. I am a big fan of Amazon PR/FAQs. I have looked at all the PRD formats and still love the two-pager format.
This is my checklist with all the things it checks. Does the problem have urgency. Is the solution differentiated from ChatGPT, copilots, or commodity AI wrappers. I have put my AI-specific things here also. It goes and reviews the file, then uploads comments inside the document.
It says market sizing is too broad, moat is missing, add section explaining defensible advantage. AI failure modes are unaddressed. What happens when attribution is wrong? How do you handle misclassification? So not wishy-washy, real good comments.
The continuous learning loop (56:13)
Mahesh: Once AI does the review work, what I built on top is that every two hours or every 30 minutes, another agent checks for my review comments. It looks in the same folder and creates a file called learner.md and it learns from my patterns.
Every time a job is done, it creates a folder which says who did the job, what was it about, when it happened. Inside that, the checklist used, input document, output document, and user-modified document. It creates all the artifacts.
Each 30 minutes, Claude goes and compares the AI output and the user modifications and creates the learner.md file. It says, I looked at this job folder and the user added these comments. Here are the new learnings. What Claude got right, what it missed, and what checklists it wants to update.
But it does not update the checklist right away. I have another skill that sees the patterns and if it finds it for five days, it sends me an email and says, I want to update your checklist. I have seen that you revised the same thing five times in the past week. Do you want me to update? Here is the updated checklist. I quickly review it and then my new checklist comes into life.
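The core of this learning loop is a comparison between what the AI wrote and what the reviewer changed, plus a threshold before proposing a checklist update. The sketch below assumes comments can be compared as exact strings and that each job is a `(day, ai_comments, user_comments)` tuple. Both are simplifications, and the function names are hypothetical. In the setup described, a model does the pattern matching.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Job = Tuple[str, List[str], List[str]]  # (day, ai_comments, user_comments)


def missed_comments(ai_comments: List[str], user_comments: List[str]) -> Set[str]:
    """Comments the reviewer added that the AI review did not contain."""
    return set(user_comments) - set(ai_comments)


def proposed_checklist_updates(jobs: Iterable[Job], min_days: int = 5) -> List[str]:
    """Return comments the reviewer had to add on at least min_days distinct days."""
    days_seen: Dict[str, Set[str]] = defaultdict(set)
    for day, ai_comments, user_comments in jobs:
        for comment in missed_comments(ai_comments, user_comments):
            days_seen[comment].add(day)
    return sorted(c for c, days in days_seen.items() if len(days) >= min_days)
```

Anything returned by `proposed_checklist_updates` would go into the approval email rather than straight into the checklist, which keeps a human review step before the agent changes its own instructions.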
Now every day, my reviewer is going to be better. Every day these agents are learning based on what you provide as feedback. Keep doing your jobs, keep these agents along with you and they will learn your world.
Beyond reviews – competitive analysis, mocks, prototypes (1:02:48)
Aakash: We got a preview into PRD comments, but what else should PMs be using Claude Code to do?
Mahesh: Once you understand the basics, here is what you can do. First, creating agents in Claude Code for competitive analysis. Sub-agents that look at different competitors and generate a report. Second, you can create mocks. Once you do user research and market research, you can create PRDs and mocks and visualizations.
But that is not fun. Maybe you can clone your mocks from the source and modify them to build an end-to-end prototype. We take screens, modify them, and build a whole working prototype that you can publish and customers can touch and give you feedback. Not in mocks, as a real product. Then you can analyze data and create dashboards showing contracts analyzed, processing time, average rating, compliance rate, bug reports.
Everything which used to take two to three months, from writing the PRD to mocks to a real working prototype to getting customers and seeing signals, all getting squeezed with Claude Code.
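The competitive-analysis pattern mentioned in this section, where sub-agents each cover one competitor and a parent assembles the report, is a fan-out and gather. A rough sketch follows, assuming a stub `research_competitor` function in place of a real sub-agent. The names and report format are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def research_competitor(name: str) -> str:
    """Hypothetical sub-agent: a real one would browse, read docs, and summarize."""
    return f"{name}: summary pending (stub sub-agent)"


def competitive_report(
    competitors: List[str],
    sub_agent: Callable[[str], str] = research_competitor,
) -> str:
    """Run one sub-agent per competitor in parallel and join the findings."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the input list
        findings = list(pool.map(sub_agent, competitors))
    return "\n".join(f"- {line}" for line in findings)
```

Swapping `sub_agent` for a function that calls a model is the only change needed to move from the stub to real research.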
OpenClaw deep dive (1:05:36)
Aakash: Claude Code is a session-based power tool. What are the limitations and when should PMs be thinking about using OpenClaw?
Mahesh: As we talked about the agentic loop, Peter, a developer from Austria, said this agentic loop is open, anybody can build on it. First, he connected to channels like WhatsApp, Signal, Slack, and hundreds more. He created a gateway which automatically opens a port and processes these messages. He takes all inputs, puts them into the agent SDK and the intelligence layer. All this is coming as one thing called OpenClaw, and it is open source.
OpenClaw has three things Claude Code did not have at that time. One, delegation. You can delegate work. It goes and does the work and when the work is done, it comes back to you through your channels. You need not be in a terminal going back and forth. Two, sandboxing. Instead of giving permissions on every file, install it on a machine. Install it on a Mac mini. That is why Mac minis are three weeks delayed now. Three, you can connect any model, even open source models. You are no longer tied to the limits everyone is hitting on Claude Code every day.
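The gateway and delegation idea (messages arrive from a channel, go to the agent, and the result returns on the same channel) can be sketched as a small dispatcher. This is not OpenClaw's actual API. `Message`, `Gateway`, and `register_channel` are hypothetical names, and the agent is any function from text to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Message:
    channel: str  # e.g. "slack" or "whatsapp"
    sender: str
    text: str


class Gateway:
    """Toy gateway: routes an incoming message to the agent and replies on the same channel."""

    def __init__(self, agent: Callable[[str], str]):
        self.agent = agent
        # channel name -> function that delivers (recipient, text) on that channel
        self.senders: Dict[str, Callable[[str, str], None]] = {}

    def register_channel(self, name: str, send: Callable[[str, str], None]) -> None:
        self.senders[name] = send

    def handle(self, message: Message) -> None:
        if message.channel not in self.senders:
            raise KeyError(f"no sender registered for channel {message.channel!r}")
        reply = self.agent(message.text)  # the delegated work happens here
        self.senders[message.channel](message.sender, reply)
```

Because the reply is pushed back through the registered channel, the person who delegated the work never needs a terminal session open, which is the point made above.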
Security and enterprise adoption (1:16:45)
Aakash: Google is not going to allow you to give company access to an OpenClaw. How should a PM at a big company mitigate security concerns?
Mahesh: OpenClaw is not a product, it is a pattern. And they will copy the pattern and offer it to you in a sandboxed way, inside their GCP or Gmail workspace. In GCP, if your Kubernetes cluster is down, this message will be sent to their sandbox VM which will be running something like OpenClaw. It will simulate your cluster, reproduce the problem, suggest a solution, try the solution, and come back to you. That pattern on that VM is fully controlled by Google.
The ability to sandbox these agents in a controlled way, that is an unsolved problem. And Google will solve it, OpenAI is solving it.
The 10-week builder PM roadmap (1:19:40)
Aakash: Can you put it all together? How do we organize this knowledge?
Mahesh: First two to three weeks, just understand the basics. Without that, it will become overwhelming once you reach the OpenClaw stage. Then get to Claude Code or Cowork and automate your world. Whatever you do now, agents should be doing. You should be building systems which allow agents to continuously learn and follow your patterns.
Third thing, spend another month understanding OpenClaw in and out. See how you can give one thing in your job to a machine. Give it permissions, create a separate world for it, and see if you can delegate work and get it done.
Once you have done these three things, read the newsletter, take any product and see, is it a variant of OpenClaw or the agent loop or is it starting from scratch? Then you can see what possibilities exist for your company, your feature, your product. That is nine to ten weeks of good work building with AI.
How AI PM interviews have changed (1:22:24)
Aakash: How has the PM interview changed with AI? What should PMs expect?
Mahesh: Let me tell you three things that are happening. One, especially for level five and level six AI roles, the idea of doing normal product sense is gone. You will be given a problem and asked to solve it either in a case study or during your interview. People are trying to see, do you understand where we stand, how the world is working, or are you stuck in the past?
Second, people are trying to assess, have you done system design work? They will ask you to design a system, where within the agentic loop would you redesign this system? And beyond that, great product sense, great taste, paying attention to detail, those are not going anywhere. But these are the two new things.
If I give you a job and you are not pulling out Claude Code or some builder tool, you are already out. If you ask to do a drawing in Figma, those days are done.
Agentic AI vs AI (1:25:29)
Aakash: What is the difference between agentic AI and AI specifically?
Mahesh: AI is this idea that you can find patterns. Machine learning helps you find patterns in data. AI helps you use those patterns and make money. Then agentic AI is the thing which allows you to actually take actions, do jobs, and finish work. Can I understand the world? Can I understand what is happening right now? Can I take actions like running bash commands or calling tools or MCP servers? Can I run my own evals? If I have done all three, then I am an agentic AI product.
If I just do sentiment analysis, that is more like an AI thing. But agentic AI is where we are relying on the model, giving it loosely connected tools, knowledge and memory, and trusting it to solve problems. That is where most of the excitement and money is today.
Compensation trajectory and why he left big tech (1:27:54)
Aakash: You spent 13 years in big tech. Can you share the total compensation trajectory?
Mahesh: It is pretty standard. You start with 120. AI worked very well for me. I spent all my life at Microsoft to grow it to 360, 400. Then I got a 70 percent bump when I joined Meta and then another 70 percent. After that, I pretty much doubled my salary every two years. My last comp was looking at 1.3, 1.4 million.
This is not you applying for jobs. This is them saying, we need you, tell us what you are getting, we will give you 30, 40 percent on top. And then you can say I need 100 percent and they do not say no. All my friends who were in Meta are working in Nvidia today and their total comp is looking at 2.5 million.
Aakash: So why did you leave 1.3 at Google?
Mahesh: These companies are going to throw a lot of money at you to keep you and then waste you. Large companies have not produced anything meaningful in AI. OpenAI was a small company that created ChatGPT. Lovable was a small company. Claude Code was a small team inside Anthropic. OpenClaw was one developer.
The company will never launch something like OpenClaw. Something like this will be killed. Maybe you get thrown out for trying something like this. For a two-page document, you have one page of approvals and that takes six weeks. In six weeks, a non-builder PM becomes a builder PM who can build anything.
This is the time to go build. This is the time to build your own world. I believe in that future and that is why I left and I have zero regrets.
Outro (1:35:24)
Aakash: What a way to end it. Mahesh, this is amazing. Your last episode hit 8,000 views in the first two weeks, and every month since it consistently gets 3,000 to 4,000 views because your content delivers. It is evergreen. This episode was a perfect demonstration of that, starting from first principles through to actions. Thank you so much.
Mahesh: Thanks a lot. It was always a pleasure. Stay in touch.
