
A
ChatGPT apps are a huge opportunity.
B
How would you describe ChatGPT apps?
A
You can have this kind of built in experience where you can interact with an application directly in your conversation.
B
I brought in my friend Colin Matthews. He is one of my go-to sources for technical topics on product management.
A
This is a really underrated way to get more distribution.
B
What's the case for other products? Regular people?
A
One you can build is like a spreadsheet or like a to do list app that gets pinned to the top.
B
What is the architecture of a ChatGPT app?
A
So underlying ChatGPT apps is this protocol called MCP, or Model Context Protocol. I tried podcasting for a bit. I built like probably four or five different SaaS apps this year. This is actually my own prototyping tool. Wow. I'm using Opus 4.5 for pretty much everything.
B
What might be some interesting examples that we can build live?
A
Okay, there's kind of two different ways to invoke apps. The first way is to just type out the name. The second way is...
B
There might have been some news pieces that you read about what the ChatGPT app store is, but nobody has broken it down in terms of what it means for a product builder. So that's what we're going to do today. Before we get into today's episode, if you can do me a quick favor and check that you're following on Apple and Spotify podcasts and subscribed on YouTube. These are free actions you can take that really help the show grow. And if you become an annual subscriber to my newsletter, did you know that you get access to over $28,000 of premium products? That's right. Mobbin, Arize, Relay.app, Dovetail, Linear, Magic Patterns, Deepsky, Reforge Build, and Descript. They are all free for an entire year if you become an annual subscriber to my newsletter. So go take advantage at buildle.akashg.com, and now into today's episode. Colin, thanks so much for being back on the pod.
A
Yeah, super excited to be here.
B
So what I wanted to do first was take a look at the announcement video for this, because this was announced about a month ago at OpenAI's developer conference. Most people probably watched this video, but they've already forgotten about it. So let's refresh your memory. All right, Colin, so everybody's seen the cool use cases. We saw an Expedia example, a Figma example, Booking.com, all these different apps that are built into it. How would you describe ChatGPT apps? What are they, and what is the ChatGPT app store?
A
Yeah, so ChatGPT apps are basically a way for companies to bring their own designs, their own way users should interact with them, directly into ChatGPT. So rather than, you know, maybe giving a text summary of something or recommending something from a web search, you can have this kind of built-in experience where you can interact with an application directly in your conversation.
B
Why does this really matter? Because I guess it's kind of hard to discover. That's what I feel like. When I gave the example of the iOS App Store, for instance, the discoverability was there. It was one of the few apps that people actually got pre-bundled with their new $1,000 phone. So they're highly likely to open it up at least once or twice. And then when they open it up, it's showcasing different apps. It feels like ChatGPT apps are kind of hidden. I haven't heard about them.
A
Yeah. So I would agree at the moment. And again, you know, we're recording this in 2025, so we expect it to change sometime soon. They are kind of hidden. And so there's a couple of companies that partnered with them initially, as you saw there. Some other ones are coming soon, like Uber is on that list as well, planning to release an app sometime in the near future. But yeah, they are kind of hidden. And there are definitely plans to bring together a full App Store experience. So very similar to what you have in like iOS or Android, you'd be able to browse apps, find ones that you like, and then download them and use them inside of ChatGPT. There's one other mode of discovery that maybe we'll get into a bit more later. One thing that ChatGPT promises is that if you put in a request that relates to an app, it might actually decide to surface that app to you. So for example, if I say, without installing the app, that I'm looking for a hotel, I might get the Expedia app surfaced inline, even though I didn't install it or ask for it to begin with.
B
Okay, that would be super cool. So they might be building some sort of tool calling system that automatically figures out, okay, here's a reliable app to help service this particular request. And I guess they've already done some of that. Like, because I've actually seen nowadays when I do search for a hotel that they'll pull in some Expedia search results. Although they don't always pull in from the app, it seems like they pull in from web search. So kind of how they're pulling in from web search, they'll start to pull in apps.
A
Yeah, exactly. And again, it gives companies a little bit more control, right? There's a very large kind of maybe panic around getting your content into ChatGPT, because people know that it converts well. Actually, there was some news I saw today that it's like a 26% increase in conversion when a user comes from an AI source, because they have higher intent.
B
Right, yeah, I see that too in my own sites: the LLM traffic is much, much smaller in volume compared to SEO, but it has really high conversion rates.
A
Yeah, exactly. So companies want to be present, but playing this game of whack-a-mole with ChatGPT web search is hard. And so now you have, from an enterprise perspective, a deterministic way to show up in the application, right? Like your app is going to show up, especially if the user has it installed. But even if they're trying to do something and it's a relevant application, it can show up in their chat and you'll be able to control what that experience is like. It'll be branded and you can even pop them back out. So a good example of this is Target. Target recently announced that they're building an app, and you can't finish a checkout with them inside of ChatGPT. You can't actually purchase items, but you can build a cart. So you can say like, hey, help me find holiday items or Christmas presents for my siblings. It'll build the cart for you and then you click out and it brings you into Target to complete the purchase. So I think that's another good example of building this kind of deterministic, catered experience that feels really great inside of ChatGPT rather than relying on web search to do the job.
B
Makes sense. So in that example at least, I highly grok it. Kind of like the Expedia example: okay, I'm offering some high-ticket service, and I might have been getting a lot of traffic from search in many cases. So I need to get into ChatGPT and access its 900 million weekly active users. So I understand that case. What's the case for other products? Regular people like you and me, like, what sort of ChatGPT apps have you been building?
A
Yeah, so very similar to the App Store. You're probably going to see the eventual Ubers, right, which didn't start at the very beginning. And then you'll see the apps that are like the Flashlight app. If you remember back to the beginning of iOS, that was an app you could download. And so I think we'll follow similar paradigms here. So for example, I've been messing around and building some apps, and one you can build is a spreadsheet. ChatGPT can use the spreadsheet as well as you, so you can kind of collaborate back and forth. It's kind of like what you'd expect out of the AI experience in Google Sheets, but it doesn't quite work correctly in Google Sheets. You can build a little spreadsheet app inside of ChatGPT, or like a to-do list app that gets pinned to the top. So if you want to complete multiple tasks with ChatGPT, maybe you have like three things you want to do, it can check off those tasks for you, so you get a visual indicator. So those are the little utility-type things that you could build, and then you could build more fully featured experiences, right? So things like you saw inside the demo, with apps that have maps, navigation, search, and integrations with whatever you want on the back end. And I guess the last thing I'll mention here is that there's actually no strict limitation in terms of what the apps can or can't do. It's just that they're so early that most companies are releasing very, very bare-bones versions of them to get into the marketplace. But I think we'll see a lot more complex apps than what exists today.
B
I love that you've been building these, you've even built a platform to build these so you really understand the technical details. What do we need to know about how ChatGPT apps are made and built?
A
Yeah, so underlying ChatGPT apps is this protocol called MCP, or Model Context Protocol. This was invented by Anthropic. It's about a year old. Basically, it's what allows AI agents, so things like ChatGPT and Claude as well as Gemini, Cursor, really anywhere you would be talking to an AI, to reach out to other things over the Internet, other tools, and use them for whatever purpose. So you can think about web search as an example of this, a common tool that would be built into a chat application. Or any other tool that you might want to think of, right? So things like booking a stay with Expedia or getting Figma to maybe do some design work for you, those could also be defined as tools that ChatGPT or Claude or any other AI chat could call over this protocol of MCP. So a quick diagram here. This one basically just shows us what it might look like to help book a short-term stay. So as the user, you would say something like, you know, I want to book a stay in New York for this time period, and ChatGPT is going to decide first, does this request need an app, or would it be beneficial to use an app? And if it does want to use an app, then what are the available tools that we can use in order to help facilitate this request? And so the first thing it's going to do is go ask for the list of tools that are currently available. And you can see that we have two tools available. We have book a listing and then browse listings. ChatGPT does cache this information, which basically just means that it holds onto it and doesn't refresh unless you force it to refresh. But there is this underlying need to know what tools are available before we actually go ahead and call the tool.
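The discovery step described above can be sketched in a few lines. This is a minimal illustration, not OpenAI's or any SDK's implementation: MCP messages are JSON-RPC 2.0 and `tools/list` is the method for listing tools, but the in-memory server, the tool names `browse_listings` and `book_listing`, their schemas, and the `CachingClient` class are all hypothetical.

```python
from typing import Any, Optional

# Hypothetical tool definitions for the short-term-stay example.
# The name/description/inputSchema shape follows MCP; the contents are made up.
TOOLS: list[dict[str, Any]] = [
    {
        "name": "browse_listings",
        "description": "Search available short-term stays by city and dates.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "check_in": {"type": "string"},
                "check_out": {"type": "string"},
            },
            "required": ["city"],
        },
    },
    {
        "name": "book_listing",
        "description": "Book a specific listing by its id.",
        "inputSchema": {
            "type": "object",
            "properties": {"listing_id": {"type": "string"}},
            "required": ["listing_id"],
        },
    },
]


class FakeMcpServer:
    """In-memory stand-in for a hosted MCP server (no network involved)."""

    def __init__(self) -> None:
        self.list_requests = 0  # counts how often the tool list was requested

    def handle(self, request: dict[str, Any]) -> dict[str, Any]:
        if request["method"] == "tools/list":
            self.list_requests += 1
            return {"jsonrpc": "2.0", "id": request["id"], "result": {"tools": TOOLS}}
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}


class CachingClient:
    """Asks for the tool list once and reuses it until a refresh is forced."""

    def __init__(self, server: FakeMcpServer) -> None:
        self._server = server
        self._tools: Optional[list[dict[str, Any]]] = None

    def list_tools(self, force_refresh: bool = False) -> list[dict[str, Any]]:
        if self._tools is None or force_refresh:
            response = self._server.handle(
                {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
            )
            self._tools = response["result"]["tools"]
        return self._tools


server = FakeMcpServer()
client = CachingClient(server)
tool_names = [tool["name"] for tool in client.list_tools()]
client.list_tools()                    # served from the cache
client.list_tools(force_refresh=True)  # goes back to the server
```

The cache is why a change to a tool definition may not show up in the chat client until the connection is refreshed.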
B
If you're enjoying this episode, Colin literally teaches a course on this. The next cohort starts January 30th. It is on Maven. You can use my code to get a special discount off this course. I highly recommend Colin's content and courses. You all seem to love what he's doing and I personally love reading it. I gained so many epiphanies just out of this one podcast recording. So think about what you could get if you were working with Colin extensively over a live cohort course. Check out his course. And now back to today's episode. Today's episode is brought to you by Vanta. As a founder, you're moving fast toward product-market fit, your next round, or your first big enterprise deal. But with AI accelerating how quickly startups build and ship, security expectations are higher earlier than ever. Getting security and compliance right can unlock growth, or stall it if you wait too long. With deep integrations and automated workflows built for fast-moving teams, Vanta gets you audit-ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. Fast-growing startups like LangChain, Writer, and Cursor trusted Vanta to build a scalable foundation from the start. So go to vanta.com/akash, that's V-A-N-T-A.com/akash, to save $1,000 and join over 10,000 ambitious companies already scaling with Vanta.
A
And the last thing is ChatGPT is going to decide which tool to use for this request. So the browse listings one might make sense to start, because we need to know what listings are available so we can show that back to the user. And so we could ask for New York on a specific date and we get back a list of listings from our MCP server. And then ChatGPT would kind of describe that information. So this is the bare-bones version of MCP, right? There's no actual UI or app involved here, but you could build this if you wanted to and just have it say, here are the top five short-term rentals in New York for that date. And it would literally just write it out as a description. The addition here on top of MCP is this thing that OpenAI kind of invented and is now being incorporated into the MCP spec, which is the idea of widgets, or these little interfaces. So in addition to the raw data, the listings, it can also return a position or URL for a widget that we want to return. And so the last thing that ChatGPT is going to say is, okay, now that I know that there's a UI element or a widget that goes with this, let me go get that code and then render the code inside the chat. And so that's how we end up with your app showing up inside the chat. And then it's still going to respond with something. So it can say like, here are the best options, with some small description, but really we're going to be interacting with the UI that you see here as the main interface rather than the text.
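As a rough sketch of the widget step just described: a tool result can carry both the raw data and a pointer to a UI template, and the host renders the template when one is present or falls back to plain text. Everything here is an assumption made for illustration, including the `widget_uri` field, the `ui://` address, the `__DATA__` placeholder, and the sample listings; the actual Apps SDK field names and rendering pipeline may differ.

```python
import json
from typing import Any

# Hypothetical widget templates the server would host, keyed by address.
WIDGET_TEMPLATES = {
    "ui://widget/listings.html": (
        "<div id='listings'></div>"
        "<script>window.toolOutput = __DATA__;</script>"
    ),
}


def browse_listings(city: str, with_widget: bool = True) -> dict[str, Any]:
    """Hypothetical tool: returns made-up listings plus an optional widget pointer."""
    listings = [
        {"id": "nyc-1", "title": "Midtown studio", "price_per_night": 180},
        {"id": "nyc-2", "title": "Brooklyn loft", "price_per_night": 220},
    ]
    result: dict[str, Any] = {
        "text": f"Found {len(listings)} stays in {city}.",
        "data": {"city": city, "listings": listings},
    }
    if with_widget:
        result["widget_uri"] = "ui://widget/listings.html"
    return result


def render_in_chat(result: dict[str, Any]) -> str:
    """Host-side step: fetch the widget code if there is one, else show text only."""
    template = WIDGET_TEMPLATES.get(result.get("widget_uri", ""))
    if template is None:
        return result["text"]  # bare-bones MCP: the model just describes the data
    # Inject the tool's data into the widget so the UI can draw it.
    return template.replace("__DATA__", json.dumps(result["data"]))


widget_html = render_in_chat(browse_listings("New York"))
text_only = render_in_chat(browse_listings("New York", with_widget=False))
```

Keeping the data and the widget pointer separate means the same tool still works in a client that cannot render widgets.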
B
Makes sense. So how do we build one of these ourselves?
A
Yeah, so there's, let's say, the easy way and the hard way. So as you mentioned, I've been working on a platform to make this a little bit easier. It's called Chippy. And we can go ahead and take a look at maybe an example really quick before we hop into building it. But basically what Chippy does is it spins up everything you would need in order to build a ChatGPT app for you. So it spins up an MCP server for you, and when you prompt it, it's going to basically be specialized at building tools. So not full-stack web applications, but literally just what you need to build a ChatGPT app. And then there's some nice UI/UX stuff that I've built into it to help you build the app. So this example here, this is a coffee guide. I kind of wanted something where I can look at a map and see maybe a good place to get coffee. As you can see on the left-hand side inside of Chippy, I just asked it to make me a quick location guide and this is what it decided to build. So on the right-hand side we can see the component that it built. This is a tool, by the way. I'll go through that in one second. But it has this little left-hand pane, and we can click through on this and it'll pull up the right side for me and I can get directions. So this is what it decided to build. The other nice thing about using this inside of Chippy is you can get a preview of what it's going to look like inside a chat experience. So I can say, where should I get coffee? And there's an LLM working in the background that'll actually call that tool and then throw it into the UI for us. And there we go. So this is actually a full-screen UI by default. We can see what that looks like there. And again, we can interact with it in what you can think of as a simulated ChatGPT, right? So it's a quick way to test what you've built to see if you like it or not.
B
Okay, makes sense. So just to play that back right, the easier way what you're doing is you're bundling together an MCP server, which if we recall from the diagram, that's like how ChatGPT is going to get connected. It's the universal USB C plug for LLMs to call tools like this. This is a tool and then it's got the right understanding of what needs to be built to create one of these tools. So it simplifies the tool process to basically just prompting with an LLM. That's the easy way. What's the hard way?
A
Yeah, so the hard way would be basically spinning up your own MCP server. That's the first thing, getting that hosted on the Internet somewhere, and then understanding how to write the code to build these tool definitions as well as to build the UI. And there is something called bundling that has to happen, which is when it translates your UI code into something that ChatGPT can actually understand and render. Because the code that you write doesn't just get downloaded and rendered in the same format inside of ChatGPT; it has to go through this little process of bundling. And so you'd have to also bundle your UI code. And then the last thing is just understanding what the options are. So, you know, how to interact with the full-screen version of apps versus maybe the picture-in-picture or inline, all the guidelines that OpenAI provides on how apps should be built. All that stuff is kind of built into this agent. But if you want to do it the hard way, you have to learn some of that stuff and then host it, build it, and then eventually connect it to ChatGPT as the last step.
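To give a feel for the "write the tool definitions yourself" part of the hard way, here is a minimal sketch of dispatching an MCP `tools/call` request to a Python function. The JSON-RPC envelope and the `content` / `isError` result shape follow the MCP specification; the `tool` decorator, the registry, and the `browse_listings` function are hypothetical, and a real server would also need hosting, transport, and the UI bundling step mentioned above.

```python
import json
from typing import Any, Callable

# Hypothetical registry mapping tool names to the functions that implement them.
TOOL_HANDLERS: dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function under a tool name."""
    def register(func: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_HANDLERS[name] = func
        return func
    return register


@tool("browse_listings")
def browse_listings(city: str) -> list[dict[str, Any]]:
    # Made-up data; a real tool would query a back end here.
    return [{"id": "nyc-1", "city": city, "title": "Midtown studio"}]


def handle_tools_call(request: dict[str, Any]) -> dict[str, Any]:
    """Route a tools/call request to the registered handler."""
    params = request["params"]
    handler = TOOL_HANDLERS.get(params["name"])
    if handler is None:
        # -32602 is the JSON-RPC 2.0 "Invalid params" error code.
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"}}
    output = handler(**params.get("arguments", {}))
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"content": [{"type": "text", "text": json.dumps(output)}],
                       "isError": False}}


ok_response = handle_tools_call({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "browse_listings", "arguments": {"city": "New York"}},
})
missing_response = handle_tools_call({
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "cancel_listing", "arguments": {}},
})
```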
B
Okay, so you'd probably be using like a Cursor or Claude Code sort of format, versus here you have more of like a UI, an AI prototyping interface, to build that.
A
Yeah, exactly. And you know, you can kind of get the experience of testing the tool that you built without going through all the steps of connecting it to ChatGPT. The reason I built this, actually is because I was working on a completely different app for ChatGPT, and it was such a pain to go through the iterations of like, every time I wanted to make a small UI tweak, I had to rebundle the code, make the change, go back into ChatGPT, update it, and then see if I liked it. Whereas here I can at least visually see it and then kind of play with it in the UI without having to go through that whole process every single time. Cool.
B
Awesome. So what might be some interesting examples that we can build live?
A
Yeah, sure. So obviously we have this one here. Maybe I'll just quickly spin this one up inside of ChatGPT. So in order to do this, there's one last step, which is just connecting. So if we go up to Test here, we'll get a little URL that's generated for us. I'll copy that and then we'll head over into ChatGPT here and we'll go into our connections in our settings, and you'll see here that I have a bunch of enabled apps because I play with these all the time. So you can see I have a few that are built elsewhere and then a few that are built by myself as well. But basically the last step here is click Create, paste in the URL in this MCP URL field, turn off the authentication, unless you really want authentication, and then also give it a name of some kind. So this one we'll call Coffee Map. I think I already have one, so I'm going to call this one Coffee Map 2. And then finally click this little button. So it's a little bit involved when you're testing. Obviously, for installing apps, the end consumer experience, it's a lot easier. Think about this more like the developer experience, right? Like I'm a developer, I want to build my own app. Those would be the steps that I go through to test it before I release it to everyone else. Obviously installing, you know, the Canva app isn't as involved. You just go in, click the button, click Install, and that's pretty much it.
B
Mm. Makes sense.
A
Yeah. And then we'll give this one a try. So just to show you, there's kind of two different ways to invoke apps. The first way is to just type out the name. So I say Coffee Map. You'll see that it pops up automatically, that it knows that there's an app and I want to use this app. The second way is to actually tag it manually. So if I go into my apps here, I can click Coffee Map and again it comes up. And the last way, theoretically, and we can give it a try afterwards, is if I just say something alluding to it, like, I want to get a coffee, where's a good location? ChatGPT may decide to use my app. And that's where a lot of the finesse comes in, in terms of getting it to be better: you want your app to show up on relevant queries. And so you're going to have to play with that and actually go through an eval process very similar to, you know, other AI tools.
B
It's an eval process. Say a little bit more about that. I guess I was thinking it was almost like another AEO process, like you need to somehow develop the reputation through queries over time, so that ChatGPT feels like you're a good tool amongst the millions of other tools trying to get called for this query.
A
Yeah, that might become the case. You know, there might even be ads for, for, you know, tools and stuff like that. But for now, really what it is, is like when you type something in, is ChatGPT gonna do a good job of calling your tool? And that's just based on like the very limited set of tools that even exist. Right. I mean, there's less than 20 right now different apps.
B
Okay. So can anyone get access to publish a public tool?
A
So right now there's no marketplace where you can publish them publicly. You kind of have to be one of the launch partners, so some of these large companies. But in the very near future, there'll be this kind of public marketplace where you can launch your own apps directly. Very similar to what we're doing here.
B
Okay, so right now, if you build one of these, you can't launch it.
A
Yeah, correct. Not to the public. I mean, you can always do what I'm doing here, right, which is give someone a URL that they can play with. But yeah, OpenAI said by the end of the year, so we're getting there. It's December now, so we'll see if that comes through or not.
B
We're basically learning how to build for a platform that's about to become available. And so you're kind of just on the bleeding edge of the distribution of this and making a bet that OpenAI will support it.
A
Yeah, exactly. Yeah, you got it.
B
Cool. Cool.
A
So, yeah, here is our little coffee map again. You know, it's a nice little demo application. Doesn't do too much, but yeah, why don't we go ahead and build something new? So we'll flip back over to Chippy here and yeah, any. Any thoughts on what might be interesting? We'll give it a try.
B
I feel like I want to do something in the healthcare space. Healthcare or legal? I feel like those two spaces are just infinite value for me on ChatGPT. It doesn't really have anything to do with product management, but I could imagine, let's say you're a healthcare product manager at a hospital and you want to be able to give your customers access to some information about your hospital system through ChatGPT. How would we think about it? What would be a good unit for an app for that product manager?
A
Yeah, so like, actually like kind of hospital reviews and surgeon reviews are a really big thing. There's actually like SaaS companies that help, you know, hospitals and surgeons manage this because it's related to revenue, obviously. Like, if you have really garbage reviews, you won't have as many customers. So maybe what we'll say is something like build a solution that helps hospitals manage and share their Google reviews, something like this. And I'm actually going to turn on plan mode so that we can kind of see what we get back before we kick it off, just so that, you know, we don't end up building something that's completely unrelated.
B
Cool. So plan mode is going to give us that thinking, reasoning model that gives us the plan first before it executes.
A
Yeah, exactly. And under the hood here, I'm using Opus 4.5 for pretty much everything. So that's a brand new model that just came out like last week. The nice thing about it is that it has this effort parameter that you can turn down. So you get a really high-quality model, but you can reduce the amount of time that it spends on a task. And so for things like this, I'm using a higher-quality model, so better thinking, but kind of low effort so that it doesn't spend forever spinning its own wheels. It gets back the response pretty quickly.
B
Yeah. Cool.
A
So here's what we have. It proposed three different tools to build. So one is viewing reviews, one is sharing reviews, and then one is review analytics. So we'd be able to see a dashboard of our reviews of some kind, a shareable card that we can share with other people, and then some summary stats and so on. What do you think about that? Does that sound good?
B
Yeah, I love it. Okay, this is what I was trying to figure out: what is the takeaway for PMs? One thing that's just kind of on my mind, though: is the PM really going to be building it? The PM is mainly probably going to create the spec for this. So they could create their prototype here.
A
Yeah, exactly. I think it's a little bit hard to really understand, if you're a PM, how you would spec this out without ever using one of these or even testing how they might work. So, yeah, I think using this as a prototyping tool is a great use case. And then in the long run, I think there's an opportunity for solo builders to also build apps and distribute them, the exact same way as the iOS App Store, right, where they'll be building their own apps. So that's the way I'm thinking about this platform: prototyping primarily for PMs, and then for solopreneurs or people who want to build their own apps, you could do the whole end-to-end of hosting your application on here as well.
B
Makes sense. So there's a really big opportunity here, I think, for anybody who wants, as a PM, to create a side project or a portfolio project to improve their AI PM credentials. But in terms of actually coding up the production version of your ChatGPT app, you're probably not going to be doing that. You're going to be creating the prototype here, and then your engineering team is going to take that and create the real version.
A
Yeah, exactly. And actually kind of funny. So a lot of the things that you would do in a normal AI project, as we mentioned, you have to do those here as well. So things like running evals on the prompts that are triggering your tools to make sure that the right kind of phrases are triggering the right tools. And you might have to tweak the tool descriptions a little bit to try to improve that. You know, this type of request should trigger this tool or even, like have it where someone writes a request and it doesn't trigger your tools at all because it's not relevant. So you need to kind of go through the very similar process of, like, what you might choose to do to improve a regular AI application or AI agent. You do in the same case here.
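The kind of trigger eval described above can be sketched as a small harness: a labeled set of prompts, each with the tool it should trigger (or none), scored against whatever decides the routing. In this sketch the router is a deliberately crude keyword matcher standing in for the model, and the tool names, keywords, and prompts are all hypothetical; in a real eval the routing decision would come from ChatGPT and the thing being tuned would be the tool descriptions.

```python
from typing import Callable, Optional

# (user prompt, tool we expect to be triggered; None means no tool should fire)
EVAL_CASES: list[tuple[str, Optional[str]]] = [
    ("How are my reviews doing?", "view_reviews"),
    ("I want to share a review", "share_review"),
    ("Show me summary stats for my ratings", "review_analytics"),
    ("What's the weather in Boston?", None),
    ("Post my best review on social media", "share_review"),
]

# Stand-in for tool descriptions: keywords that make each tool look relevant.
TOOL_KEYWORDS = {
    "share_review": ["share"],
    "review_analytics": ["stats", "analytics"],
    "view_reviews": ["review"],
}


def make_router(tool_keywords: dict[str, list[str]]) -> Callable[[str], Optional[str]]:
    """Build a toy router that picks the first tool whose keywords match."""
    def route(prompt: str) -> Optional[str]:
        text = prompt.lower()
        for tool_name, keywords in tool_keywords.items():
            if any(keyword in text for keyword in keywords):
                return tool_name
        return None
    return route


def run_eval(route: Callable[[str], Optional[str]]):
    """Return (accuracy, list of (prompt, expected, actual) failures)."""
    failures = []
    for prompt, expected in EVAL_CASES:
        actual = route(prompt)
        if actual != expected:
            failures.append((prompt, expected, actual))
    accuracy = 1 - len(failures) / len(EVAL_CASES)
    return accuracy, failures


baseline_accuracy, baseline_failures = run_eval(make_router(TOOL_KEYWORDS))

# "Tweak the description": teach the share tool that posting also means sharing.
tweaked = {**TOOL_KEYWORDS, "share_review": ["share", "post"]}
tweaked_accuracy, tweaked_failures = run_eval(make_router(tweaked))
```

The negative case matters as much as the positive ones: an app that fires on unrelated requests is as much a failure as one that never fires.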
B
Makes sense.
A
Cool. So it built us our three different tools. Again, I'm going to, just for fun, share a little bit of the behind the scenes here. So you can see it actually viewed some examples of code that I've built in the background. So this agent that I built, what it does is it can choose to look at relevant files to get inspiration for what it should be doing to build the thing that we've asked it for. And so it took a look at some technical stuff as well as a list example that covers some UI/UX for lists. And then it decided to build those different tools for us. So here's a little preview. I don't know if I love the UX. I have to see how it looks inline, when it actually has some data in it. It's hard to tell when there's no data in here. But anyway, we have view, we have share, nothing in there, and then we have review analytics. So we'll give this a try in a second. I'm wondering, though, if this has data that's going to be passed in by the model or if it needs better mock data. So I'm going to go ahead and ask that same question. We'll say, does this rely on ChatGPT passing in the data, or should we have mock data? And the nice thing about this is, because you can think of the agent as being like an expert in ChatGPT apps, it can answer questions like this one, where maybe you're unsure about the best pattern for how this should work. You can just ask the agent. It'll read through your code, take a look at how everything works, and then make a decision for you. So here it's saying that it already has some built-in mock data, and that if it's using the correct pattern, it should be passing in data from ChatGPT and then falling back to the mock data. So I'm just going to tell it that the mock data is not very good, if there's mock data.
And so I'm going to say please improve the mock data just to get a little bit more information in here so we can take a look at these components. Cool. And then, yeah, the only other thing I'll mention while we're waiting here, this will just take a second, is you can kind of see that the UX for this building experience is a little bit different. Right. So I can see each individual tool, I can also modify the parameters, but these would be the same parameters that ChatGPT would be using. Right. When it decides to call this tool, it's going to be passing in data for these. So like it would be deciding what the filter is, what the sort is, or any other parameters that are necessary. And so yeah, you can kind of mess around with the prompts directly in here or the parameters in order to get different like UX, different experiences that might be rendered inside of ChatGPT, depending what gets passed in.
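The pattern the agent describes, use the data the model passes in and fall back to mock data otherwise, together with the sort and filter parameters the model chooses, might look roughly like this. It is a sketch under stated assumptions: the function name `view_reviews`, the parameter names `sort_by` and `filter_by`, their allowed values, and the mock reviews are all invented for illustration and are not Chippy's generated code.

```python
from typing import Any, Optional

# Made-up fallback data so the component has something to render in a preview.
MOCK_REVIEWS: list[dict[str, Any]] = [
    {"author": "A. Patient", "rating": 5, "date": "2025-11-02", "text": "Great care team."},
    {"author": "B. Visitor", "rating": 2, "date": "2025-11-20", "text": "Long wait times."},
    {"author": "C. Family", "rating": 4, "date": "2025-10-15", "text": "Clean facility."},
]


def view_reviews(
    reviews: Optional[list[dict[str, Any]]] = None,
    sort_by: str = "newest",
    filter_by: str = "all",
) -> list[dict[str, Any]]:
    """Hypothetical tool body: prefer passed-in data, else fall back to mock data."""
    source = reviews if reviews else MOCK_REVIEWS

    if filter_by == "positive":
        selected = [r for r in source if r["rating"] >= 4]
    elif filter_by == "negative":
        selected = [r for r in source if r["rating"] <= 2]
    elif filter_by == "all":
        selected = list(source)
    else:
        raise ValueError(f"unknown filter_by: {filter_by}")

    if sort_by == "newest":
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        return sorted(selected, key=lambda r: r["date"], reverse=True)
    if sort_by == "highest":
        return sorted(selected, key=lambda r: r["rating"], reverse=True)
    raise ValueError(f"unknown sort_by: {sort_by}")


# No data passed in: the mock reviews are used, newest first.
preview = view_reviews()
# Data passed in by the model takes priority over the mock data.
live = view_reviews(reviews=[{"author": "D. Real", "rating": 3, "date": "2025-12-01", "text": "Fine."}])
top_positive = view_reviews(sort_by="highest", filter_by="positive")
```

Rejecting unknown parameter values explicitly makes it easier to spot, in the logs, when the model passes something the tool was not designed for.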
B
Okay, I'm keen to see this with the real data, because then I want to actually get to that teaser you gave us that really perked up my ears, which was the evals for the prompts calling it, because I think that part is really interesting.
A
Cool. So we'll hop over into ChatGPT and then we'll get this hooked up. So the first thing we need to do is just go back over to our settings, our apps and connectors, and then create a new connector here, or new app. We'll paste in the URL again, we'll turn off authentication for now, just to keep things simple, and we'll call this one Healthcare Reviews. Cool. And connect. So this will just take one second to connect and then we'll give it a try and we'll see if it works. And then after that, as mentioned, we'll go back and look at the logs. We'll create an eval really quick and see how that performs. So I'm going to spin up a new chat. You don't have to, but I just kind of don't like to see the old ones, and I'm going to tag it this time. So I say Healthcare Reviews. We'll say, how are my reviews doing for Saint Mercy Healthcare? And we'll see what happens here. I'm actually unsure. The tools are a little bit janky, in that that one didn't look like it had any mock data. But okay, so ChatGPT is actually generating the mock data and then filled it in. We saw it for a second there, and then it popped back out for some reason.
B
Yeah, I swear I saw it.
A
Yeah. So I think what happened maybe is that there was some underlying mock data that tried to override it there. We'd probably have to do a little bit of iteration on this one, but let's see if we can try to call one of the other ones. So let's say, I want to share a review, and this should call the other tool, maybe, hopefully. Right. So the tool that kind of generates... Yeah, there you go. Our reviews here. So you can see our Mercy Hospital. There's a little problem with the underlying data there. You saw it for a second and then it disappeared. We'd clean that up. But yeah, so you can see the different tool calls in action. And then one last thing I'll show you is if we go back over into the connector here, we can actually see those different tools directly inside the connector. So we have our review analytics tool, we have our share review tool, and we have our view reviews tool. So those are the three different tools that we've set up.
B
Oh, wow. Okay, cool.
A
So yeah, let's go ahead and look at the other side. So the logs and we'll kind of create a quick eval for this.
B
Yes, this is that full-stack AI product building, which is why I think this is a pretty cool portfolio, side, or learning project for a PM. In two prompts, we spun up something, then we tested it, and now we're already getting to the evals process. So it's simulating a lot of the things. And some people, you know, may not have access to building AI features in their current job, but they want to get that job building AI features. This is one way to simulate those learnings.
A
Yeah, exactly. You kind of do the whole end-to-end, but it's simplified. Right? You don't have to worry about as much complexity. But yeah, so let's take a look.
B
Or worry about time in between the steps, or somebody else doing something.
A
Yeah, for sure. So we head back over to our main view here. We do have an observability tab, and this will show us all the various tool calls that we have so far for our tools. You can see the hospital reviews manager was called twice, once for view reviews and once for analytics, and then our other ones from before with our coffee guide and so on. If we click on this one, we can see a little bit of information about what happened. We can see that in the input, the sort was newest and the filter was all. ChatGPT is the one who decided these. Just to be clear, those are parameters for the tool call, and ChatGPT decided, based on the user's request, that these were the relevant parameters. And then I also stored the user prompt here. What I typed in was "I want to share a review," and we have that tracked here. This is what gives us the ability to run an eval. We can ask: should "I want to share a review" result in a call to the view reviews tool or not? Is there a different tool call that would be better? We can also see some of the output data and so on. But from here, there are two options that you have. One is you can run some quick annotations. This is really for when you have a team of experts who you want to quickly label the data; you can do that.
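As a rough illustration of what one entry in that observability view carries, here is a minimal sketch in Python. The record shape, the field names, and the `view_reviews` identifier are assumptions made for illustration; only the prompt, the tool that was called, and the `sort`/`filter` values come from the demo.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCallLog:
    """One observed tool call (hypothetical record shape)."""
    app: str
    tool: str
    user_prompt: str
    # Parameters chosen by ChatGPT for this call, not typed by the user.
    arguments: dict[str, Any] = field(default_factory=dict)


# Values from the demo; the identifier spelling "view_reviews" is assumed.
log = ToolCallLog(
    app="Hospital Reviews Manager",
    tool="view_reviews",
    user_prompt="I want to share a review",
    arguments={"sort": "newest", "filter": "all"},
)


def summarize(entry: ToolCallLog) -> str:
    """Render a log entry as a single line, with arguments in sorted order."""
    args = ", ".join(f"{k}={v}" for k, v in sorted(entry.arguments.items()))
    return f"{entry.tool}({args}) <- {entry.user_prompt!r}"


print(summarize(log))
```

Keeping the user prompt next to the tool name and arguments is what makes the later eval step possible: each logged row can become a candidate test case.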
B
Are you looking to land your next product management job? I am accepting a group of just 30 product managers into a 12-week cohort led by me, where every Monday for 90 minutes I help you through your job search: creating your candidate-market fit, updating your LinkedIn, updating your base resume. You're going to get personalized feedback and one-on-one mentorship sessions with my co-teachers, Ankit Vermani, who is an AI PM at Atlassian and was a group product manager at Meta, and Prasad Reddy, who is a CPO and has been in product for over 26 years, as well as my other live instructor, Bart Jaworski, who's going to run another 90-minute session per week. So if you want coaching from me to land a PM job, this cohort is a no-brainer. It is a premium-priced product, more expensive than the average product out there, but the return is huge. Most people who join the cohort see a salary raise anywhere from $10,000 to $100,000 in the first year, so the ROI will be there within a year, and we guarantee two-plus interviews. If you don't get two interviews after completing the 12-week program and following all the steps, we will refund the money to you. So it's a no-brainer. Check it out at landpmjob.com. And now, back to today's episode. Before we dive deeper, let's talk about something every PM faces: getting alignment on product decisions. You know that feeling when you're trying to explain a user flow to engineering or justify a design choice to leadership and you're just describing it with your hands? That's where Mobbin comes in. Mobbin is the world's largest library of real-world mobile and web app designs from industry-leading apps like Airbnb, Uber and Pinterest. Instead of spending hours taking screenshots or hunting for inspiration, you can instantly find exactly how successful products handle onboarding, paywalls, checkout flows, whatever you're facing.
Over 1.7 million product builders use Mobbin to benchmark against best-in-class products and show their teams proven solutions. Whether you need to convince stakeholders there's a better way to handle user activation or research how top apps approach feature discovery, Mobbin gives you the visual proof to back up your product decisions. Check out Mobbin.com Akash, that's M-O-B-B-I-N.com A-K-A-S-H, and get 20% off your first year. Today's episode is brought to you by NayaOne. In tech, buying speed is survival. How fast you can get a product in front of customers decides if you will win. If it takes you nine months to buy one piece of tech, you're dead in the water. Right now, financial services are under pressure to get AI live, but in a regulated industry, the roadblocks are real. NayaOne changes that. The air-gapped, cloud-agnostic sandbox lets you find, test and validate new AI tools much faster: from months to weeks, from stuck to shipped. If you're ready to accelerate AI adoption, check out NayaOne at nayaone.com Akash, that's N-A-Y-A-O-N-E.com A-K-A-S-H.
A
And the second thing is to start building up your evals, or your golden set. So let's say, for example, that this was correct: we want this prompt to trigger this tool. I can add that to my set of evals. I just click this button here, and I have three different types of evals. This comes directly from OpenAI's guidance. There's a direct, an indirect, and a negative. Direct means that the user actually typed in the name of the product. So this would be like, "Canva, can you do X for me?" That'd be a direct request. Indirect would be that they typed in something that's relevant to the tool without naming it. So what we did here, "I want to share a review," is not naming the tool or naming the application. And then a negative eval would be where the user types in something completely unrelated, like, "I want to go shopping this weekend." If it called your tool, that'd be a bad thing, because you don't want that to happen. So in this case, we'd say that this one is an indirect: the user describes the outcome without naming the tool. And we'll go ahead and add that to our evals.
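The three eval types described here can be sketched as a small golden set in Python. This is an illustrative structure only, not the format used by the tool in the demo; the class names, the tool identifier `view_reviews`, and the direct prompt are hypothetical, while the indirect and negative prompts are the ones quoted in the conversation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvalKind(Enum):
    DIRECT = "direct"      # the prompt names the app or product
    INDIRECT = "indirect"  # the prompt describes the outcome without naming the app
    NEGATIVE = "negative"  # unrelated prompt; no tool should be called


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    kind: EvalKind
    expected_tool: Optional[str]  # None means "no tool call expected"


golden_set = [
    # Hypothetical direct prompt: it names the app explicitly.
    EvalCase("Healthcare Reviews, show me my latest reviews",
             EvalKind.DIRECT, "view_reviews"),
    # Logged in the demo and initially accepted as correct.
    EvalCase("I want to share a review", EvalKind.INDIRECT, "view_reviews"),
    # Unrelated request: calling any tool here would be a failure.
    EvalCase("I want to go shopping this weekend", EvalKind.NEGATIVE, None),
]


def is_well_formed(case: EvalCase) -> bool:
    """Negative cases must expect no tool; all other cases must name one."""
    return (case.expected_tool is None) == (case.kind is EvalKind.NEGATIVE)


print(all(is_well_formed(c) for c in golden_set))
```

Labeling each case with its kind makes it easy to report results per category, for example how often indirect prompts find the right tool versus how often negative prompts trigger one by mistake.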
B
Cool.
A
And then the last thing we have is the actual eval. So now we have one eval that we can run. We have this one set up as an indirect. The way that I've built this, there are two ways to run the evals. The first way, if we open it up here, is you can run it on auto. What this does is it literally sends the same prompt and the same set of tools to an LLM and asks the LLM to decide which tool to call. It can also decide to call no tools. So this is a very quick way to test a bunch of different prompts at the same time. You can basically run your whole eval set through auto, and you'll see what happens. For example, this one failed when we passed it back over to GPT-5, and we'll take a look at the reason why. Let me cancel out of here. Okay, so for "I want to share a review," we had the expected tool as View Reviews, because that's what happened inside of ChatGPT. But this is telling us that the correct tool to use probably would have been Share Review, which makes sense. We have these two different ones: one is called View, one is called Share. Inside of ChatGPT, what happened is it called this one, but what we would have wanted is for it to call Share Review. And the auto eval is picking up on that issue for us automatically. So this is a good example of, okay, in order to fix this, we have to go back into our tools and probably modify the description of the tools to be more accurate, so ChatGPT has a better idea about when to use this View Reviews tool, because it accidentally used it in the case where the prompt was "I want to share a review."
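The auto run described here boils down to a loop: send each prompt plus the tool list to a chooser, then compare the chosen tool with the expected one. The sketch below assumes a pluggable `choose_tool` function; in the demo that role is played by an LLM, whereas here a trivial keyword stub stands in so the example runs without any API. All names are hypothetical.

```python
from typing import Callable, Optional

# (prompt, expected tool); None means no tool should be called.
Case = tuple[str, Optional[str]]
Chooser = Callable[[str, dict[str, str]], Optional[str]]


def run_auto_evals(cases: list[Case], tools: dict[str, str],
                   choose_tool: Chooser) -> list[dict]:
    """Ask the chooser which tool it would call for each prompt and score it."""
    results = []
    for prompt, expected in cases:
        actual = choose_tool(prompt, tools)
        results.append({"prompt": prompt, "expected": expected,
                        "actual": actual, "passed": actual == expected})
    return results


def stub_chooser(prompt: str, tools: dict[str, str]) -> Optional[str]:
    """Stand-in for the LLM call: a naive keyword match, for illustration only."""
    text = prompt.lower()
    if "share" in text and "share_review" in tools:
        return "share_review"
    if "review" in text and "view_reviews" in tools:
        return "view_reviews"
    return None  # the chooser may decide to call no tool at all


tools = {
    "view_reviews": "Display hospital reviews.",
    "share_review": "Share a single review.",
}
cases = [
    ("I want to share a review", "view_reviews"),  # expectation recorded from the log
    ("I want to go shopping this weekend", None),
]

for row in run_auto_evals(cases, tools, stub_chooser):
    status = "PASS" if row["passed"] else "FAIL"
    print(status, repr(row["prompt"]), "expected:", row["expected"], "got:", row["actual"])
```

With this stub, the first case fails in the same way as in the demo: the recorded expectation was `view_reviews`, but the chooser picks `share_review`. A failure like this can mean either the tool description or the expectation itself needs fixing.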
B
Yep. So what does that look like? Maybe we can just look at the full cycle of improving performance.
A
Yeah, absolutely. So now that we've run the eval, I'll show you one more thing in here. That was an auto eval. The auto evals are a great way to get a quick directional input, but they're not necessarily going to match one-to-one with what ChatGPT provides. So if you want to manually run your evals, you could build an eval set and then go through and type each prompt into ChatGPT, like "I want to share a review," and just log what happened. That's what this is here for: to literally go through one at a time and log what happens with each one.
B
Nice.
A
Yeah. So let's say we want to make that change. What we do is we go back into our application, our hospital reviews manager, and we can either prompt our way through this or we can just edit it manually. In this case, I'm going to edit it manually. I'm going to go into the config and find view reviews. That's my tool call. And we can see that this is the description that my LLM, or my agent, decided to write for this, which is "display hospital reviews with filtering, sorting and sharing options." So what likely happened here is, because it has the word "sharing" in the tool description, when I said "I want to share a review," it decided to call this one by accident. So we'll just get rid of the word "sharing" to clean that up, and say "filtering and sorting options."
B
So we're improving the metadata. If we think about it from an SEO standpoint, someone types in a keyword, and the search engine uses the title and the metadata to match it. ChatGPT is kind of doing the same thing with these MCP tools that it has available to it. So we're trying to give it the right metadata here.
A
Yeah, exactly. And these descriptions can be pretty verbose. I mean, there are character limits, but you can put in things like examples of how to use the tool. For example, I built a spreadsheet tool before and it supported formulas, but I needed to tell ChatGPT what those formulas were so that it knew how to use those formulas inside the tool.
B
Right.
A
If it was going to be, you know, writing any data to that spreadsheet. So it's not just about necessarily SEO, it's really just like, how should ChatGPT use or behave with that tool?
B
Makes sense.
A
Yeah. So clean that up a little bit and then I'll probably just add something here. Like, this is intended to fetch existing reviews for the purpose of showing the user. It's probably not the best description, but trying to get more at the idea that like, this is not for sharing, this is for retrieving information.
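To make the description change concrete, here is a small Python sketch comparing the original and revised wording. The "before" text is the description quoted in the conversation; the "after" text paraphrases the edit just described. The term check is a hypothetical lint for illustration, not how ChatGPT actually selects tools.

```python
BEFORE = "Display hospital reviews with filtering, sorting and sharing options."
AFTER = (
    "Display hospital reviews with filtering and sorting options. "
    "Intended to fetch existing reviews for the purpose of showing them to the user."
)


def ambiguous_terms(description: str, other_tool_terms: set[str]) -> set[str]:
    """Return words in a description that belong to another tool's territory."""
    words = {w.strip(".,").lower() for w in description.split()}
    return words & other_tool_terms


# Hypothetical list of words that should only describe the share tool.
share_terms = {"share", "sharing"}

print(ambiguous_terms(BEFORE, share_terms))  # flags 'sharing'
print(ambiguous_terms(AFTER, share_terms))   # nothing flagged
```

A check like this only catches obvious word overlap. Whether the new description actually changes which tool gets called still has to be confirmed by rerunning the evals.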
B
Yep, makes sense. We're modifying that metadata to get it called at the right time, and then we're going to keep iterating on that. That's how we have this end-to-end cycle on evals, and we can run those manual evals, as you said. So is there another category of evals, then, about how effective the reviews were? If we pulled the right reviews, how would we write that category of evals?
A
Yeah. So that's less about "did the tool get called based on the prompt" and more about "did the user get the expected result" or "was the behavior good." Very similarly, if we go back over into our observability, we can take a look at the logs, and that's really the best way to get a good idea of what's happening. We can see again what the user requested, what tool got called, and then what some of the data was that got filled in. Using these logs, we can get an idea of what happened. Again, you have to bring more context when reading through these logs: what did you want to happen? And it's the same thing with any type of eval. You have to have an idea of what ground truth is. What do we want to occur when a user types something in? So you look at this data to get a feel for what did happen, but separately from that, you'll have to decide what you wanted to happen in each case.
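One simple way to turn "what did you want to happen" into something checkable is to write down the expected arguments for a prompt and diff them against what the log shows. The sketch below is an assumption-heavy illustration: the prompt and the `lowest_rated` sort value are hypothetical, and only the `sort`/`filter` parameter names and the `newest`/`all` values appear in the demo.

```python
from typing import Any


def diff_arguments(expected: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple]:
    """Return {key: (expected, actual)} for every argument that differs."""
    keys = expected.keys() | actual.keys()
    return {
        k: (expected.get(k), actual.get(k))
        for k in sorted(keys)
        if expected.get(k) != actual.get(k)
    }


# Ground truth written by a person for a hypothetical prompt,
# "Show me my worst reviews first".
expected_args = {"sort": "lowest_rated", "filter": "all"}

# What the log recorded for the tool call (values as seen in the demo log).
logged_args = {"sort": "newest", "filter": "all"}

mismatches = diff_arguments(expected_args, logged_args)
print(mismatches or "arguments match ground truth")
```

The comparison itself is trivial. The real work is deciding the ground truth for each prompt, which is the point being made in the conversation.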
B
This is kind of expanding my conception of what a good AI prototype is. You know, I think some people might have the tendency to want to ship the AI prototype where we were right at the beginning. Like, all right, two prompts in, good to go, let's ship it. We had our initial prompt, we changed it to add in some dummy data, good to go, let's ship it. But it seems like actually going through this eval process along the major categories of evals, and here the major categories were discoverability and then good results, then tweaking it and improving it, is going to help you really understand the corner and edge cases, like some of the things we used to have in a deeper PRD. It helps you understand what's going to move the needle on whether this feature succeeds or not.
A
Yeah. And a lot of these things, honestly, you can't really predict. Like, how is ChatGPT going to interpret the way that you wrote your tool description? You could spend a very long time trying to figure out what the best thing is, but really you should just test it, run evals against it, and see what works, rather than thinking about it for a long time. So I totally agree. Getting into the process of this type of iteration provides a lot more information than thinking about it, or writing it in a PRD and then handing it off to the engineering team, because eventually your team's going to go through this iteration anyway. It's just a matter of whether you're getting through some of that on your own quickly. You can obviously bring in your counterparts, but you have a mechanism to do it quickly versus the full handoff between teams, back and forth, back and forth, which can take weeks or months or even longer.
B
So I consider you one of the leading experts on AI prototyping. And since we're talking about it, I wanted to bring up this alternative view that I saw from Itamar Gilad a couple months ago, went pretty viral on LinkedIn where he talked about, well, what are all the other things a PM could be doing? Researching the market, talking to customers, talking to stakeholders, talking to partners, looking at user and business data, identifying opportunities and threats, setting goals, evaluating ideas. Sometimes I wonder, like, are we just endlessly expanding the PM role? What is the right way to think about the prioritization of this work that we've been going over so far versus some of the other work that Itamar has listed here?
A
Yeah, so to start, we'll skip the ChatGPT app side of things and just address this first. I think it's a skill, the same way that a PM who knows how to use Figma is probably more useful in certain contexts, such as talking to design stakeholders or even spinning something up really quick to show to a customer or someone like that. You're not dependent on other people to do every single touch point or element for you. So I wouldn't say that using Figma should be an extra line item in here. Using Figma is a skill that supports talking to customers and talking to stakeholders, you know what I mean? They're not independent things. Yes, these are the responsibilities of a PM, and then they have a way to do that. The same way that talking to customers involves some type of skill around interviewing, and they needed to learn that skill. Talking to stakeholders involves a lot of skill around stakeholder management and even managing up and stuff like that. I would say using some of these tools is complementary. So I personally wouldn't advocate for AI prototyping to be an extra line item, or vibe coding to be an extra line item, on here. I think these are tools that we can use to support these tasks. And to me, at least, it's obvious that if it's very difficult for you to visually communicate something, that's a great use case for AI prototyping. If I want to explain, "Hey, this is how I think our AI product should work," you know, I'm building some type of agent that's going to do some task, it could be hard for me to explain that to my stakeholders or to my customers. So spinning up a quick prototype in whatever prototyping tool you like is an easy way for me to start to have that conversation and improve the fidelity of the information that I'm sharing. I can be like, "This is kind of what I was thinking. Does this resonate?"
And so that's to me, like how I would kind of, I guess have a rebuttal to this is it's not an extra line item. Vibe coding is not something a PM should do for no purpose. It should be related to some reason that they're building that prototype.
B
So AI prototyping enhances some of the activities on this list is the way you should think about it. And you shouldn't necessarily think about not doing the stuff on this list. This stuff is important. But how can AI prototyping help you do some of this stuff better?
A
Yeah, exactly. And this isn't really new. Like PMs have been trying to brainstorm and communicate ideas forever. Right? So like Balsamiq was popular for a long time. It's the same thing, right? It's just like helping someone who's not a designer communicate something visually to get the idea out of their head and kind of like onto some form of paper that people can see. And so, yeah, I would say it's literally for the exact same purpose. And so again, if someone's vibe coding first, like if you're a PM and you don't have these other skills and the only thing you know how to do is vibe coding, I don't think that that will be like a way to be super successful in the long run. Maybe there's like some short term gain because it's popular at the moment, but these skills are super critical and Vibe coding can support some of those, or AI prototyping can support some of those.
B
Amazing. So I want to do some mind mapping together. What are the benefits for creating a ChatGPT app for your product? What would you put those major groups as?
A
Yeah, I think we'll classify this into two categories. I think there are some benefits from the perspective of learning how to build agents, basically, or build tools that agents are interacting with. So there's a career benefit, or a skills benefit, for an individual person.
B
Right.
A
A PM, a designer, an engineer. That's one classification. I think the main one is enterprise focused, to be honest. I think the vast majority of early adopters of this are actually going to be large enterprises, not small companies. And the main thing is getting clicks or views. It's basically growth. I want people to see my app, see my product. ChatGPT has hundreds of millions of active users per week, and the intent when a user comes from ChatGPT is higher than the intent when they come in from SEO or another channel. So every company on earth, given the proper tools to capitalize on that, will do so, I think. And so I think that's really the main benefit: what is the right form factor for us to get in front of customers, get in front of users, and help them interact with our brand and our company, so that we can pull them into our ecosystem?
B
All right, and then the next part of this mind map I want to understand is: who should be building a ChatGPT app? How would you create the major buckets? Or, if I'm a PM, how should I understand if I should be?
A
Yeah. So in a typical enterprise setting, if we think about the Canva app or any of the ones that exist today, I would expect it to be a pod, to be honest. You'll probably have a designer who needs to understand what the form factors are. And I think it's actually an exciting place for a designer, because it's very unique. You have these little micro apps that you can build, and each one can do a very small number of things. You can build more than one if you want to, and they can communicate back and forth. So understanding the form factor is pretty critical. Second to that would be the engineering team. How do we actually ship this thing? There are a couple of technical complexities. Authentication, for example, is very complicated compared to regular authentication, so you need to make sure you get that right. Engineering is going to figure out how we actually get this into the world, and how we support these different types of tool calls that are coming in. And then lastly would be the PM. As a PM, the reason I might choose to do this, the guiding light, is growth. I decide: is a ChatGPT app a good method for us to drive higher conversions from AI, basically AI search or AI chat? Maybe this is a priority that we have, that we want to capture more market from OpenAI, from that type of search. So the PM would prioritize this as something that is relevant, and then will also hopefully be involved in this process of building evals, shipping small incremental changes to the application, understanding how users are using that application, and then sharing back with anyone who cares about it internally what's happening with that application. So it's really its own form factor of software. It's not like it belongs to one persona or group. In my mind it would be a pod.
And then they're going to ship this together and each one should have some skill kind of around this new form factor.
B
So if you're a PM deciding whether this is an important opportunity, how do you decide that?
A
I think for now, first of all, definitely give it a try, so that you have some familiarity with what the options are: building full-screen applications, and how that differs from building just a quick inline card. The second thing I would say is really pay attention to what other people are doing. You know, larger companies like Target and Uber are coming into the space. When you interact with ChatGPT, is it pulling up Target for you? Is it pulling up Uber for you? Is it pulling up Coursera for you? If it is, you can see in real time what the benefit is of having these apps. And then lastly, the thing I would think about is: is this an opportunity to re-engage customers off of your product? For example, if X percent of your customers are using ChatGPT, they might not be logging into your application, like Coursera, but they could be using your micro app, and still getting benefit from your product and still feeling like they're connected to your product. So I think there might be value here around retention. You'd have to think a little bit more about how it affects retention, but it's something in that space of consumers or users interacting with your brand and your product without necessarily having to go directly into your app experience.
B
Makes sense. All right, I think I can play with that arrow infinitely. So we got a little bit of this package here of ChatGPT app. If you're building it for an existing product here, right on the right. Now I want to go to the other side on the left and talk about what are the good ChatGPT app ideas to build if you are a solopreneur or a side project person?
A
Yeah. To start with, I would think about unique ways that ChatGPT can interact with your application. We saw a brief demo of that here. You know, we had a little bit of a data issue in the background, but ChatGPT actually can fill in the data for you. It is the one who's deciding what to call your tool with. A good example of this, going back to what I referred to earlier, is a spreadsheet application where ChatGPT partners with me on it. So, for example, take a spreadsheet app that has financial modeling support. Some person drops some financial data into ChatGPT and says, "Hey, can you help me model this?" And it pulls up the spreadsheet app, puts in the relevant formulas, generates some nice charts and graphs, all that kind of stuff that's deterministic. So the user can go back and actually change the data in the spreadsheet and say, "Oh, you got that number wrong. Let me just quickly fix it." That'd be a small example of a utility or an application that is embedded with ChatGPT. It's not just showing you stuff. It's not a search tool. It actually has the ability to collaborate directly with ChatGPT in some form factor, like a spreadsheet or a task list or a whiteboard or whatever. You can imagine a ChatGPT that has memory of you, knows you really well, and has access to all the tools that you want to use. Then you don't need to hop into Miro or Google Sheets. You can do a lot of work very quickly directly with these embedded applications. So I think there's a lot of potential for these types of embedded apps to take over smaller use cases where ChatGPT falls short right now, but where it'd be useful for ChatGPT to help you with these types of tasks.
B
Okay, so maybe a domain where ChatGPT is interesting, like healthcare or legal or productivity or writing, but a use case within that domain that's neglected.
A
Yeah, exactly. And you can think about any example you can find out in the world where there is an AI company building a product for this. Another good example of this is Gamma.
B
Right.
A
So Gamma's a very large company building a presentation tool, an AI presentation tool. Theoretically, we could build a ChatGPT app that also makes presentations. You could provide a really great experience building presentations inside of ChatGPT. You maybe won't be as good as Gamma. I expect probably not. But you also don't have to be. You just have to be good enough that it is a good alternative, so that someone who's already inside ChatGPT goes, "Ah, yeah, this presentation is a good starting point." So really, the strength of the distribution with these embedded applications is what I think will win the day there.
B
Yeah, anything that might benefit from embedded distribution. What else? It seems like there was a lot of like e commerce examples, right?
A
Yeah. Well, I mean, there's a ton of work going on right now in terms of shopping. I think it'll be pretty common that companies like Target, or other consumer-facing companies, want to be in this space if people are searching for products inside of ChatGPT. They want you to be able to build a cart, they want you to be able to check out, because they want to be in front of you the same way that they're in front of you on Google or any other product. It'll be interesting to see if Amazon does this, because Amazon has their own LLM activities going on. But I think Amazon would be an obvious example. Can you imagine: you just hop into ChatGPT and say, "Hey, reorder me the Thursday order." It fills out your cart for you from whatever you ordered last Thursday, or whatever you currently get, and then you just buy it. There are a lot of good examples in the e-commerce space that I think would be consumer friendly.
B
And then we saw Figma and Canva in there. So I guess those are like if you have any sort of media or content creation tool, right?
A
Yeah, these are still early days. So like I think Canva is the best example from a functionality perspective so I'd encourage you to give it a try. But basically the idea here is you can use some like kind of mini version of the Canva app. Directly inside of ChatGPT. So it's more fully featured than just, like, showing you information. You can actually interact with the application, move stuff around. Like I said, a mini version of Canva. And again, you're reliant on ChatGPT to help you do that. So rather than me clicking through everything or all that, I'd use ChatGPT as an agent that understands Canva. And so in some ways, we're moving towards a future where ChatGPT is this operating system more. It's like the universal agent. And then these are all different applications that I can call or use as needed, rather than every company building their own agents. Yep.
B
And all this is built on MCP. So theoretically, if Claude or Gemini win, could they also plug into these? What about that?
A
Yeah, exactly. So this is probably the best part, the icing on the cake to some degree. This started as an OpenAI initiative, this idea of apps inside of chat, but using MCP, which Anthropic is responsible for. They pretty quickly amended the MCP protocol, or standard, so now Claude is actually, today, working on the same thing. You can see screenshots of it if you look around on Twitter or LinkedIn, of the team sharing how these apps are going to work inside of Claude. So you're not just building for one distribution channel. You're actually building for any distribution channel that supports MCP, which right now is primarily Claude and ChatGPT, but Lovable actually supports MCP. Cursor supports MCP. There's a lot of tooling that supports MCP already. Gemini does not, interestingly enough, but maybe they will at some point in time. It was just not the focus. But yeah, I think there's a potential future here where you can get yourself plugged into multiple different chat applications through one app that you've built on top of MCP.
B
Okay, so it currently isn't in Gemini. So that is one sort of downside here. But it is in Claude.
A
They're working on it. It's not released yet. Maybe by the time the podcast goes live, it will be. But yeah, there's basically like, you know, the engineering builds of it, they're working on it. It has been approved. Like, it's part of the MCP spec. It just needs to go through, like, the actual development process at this point in time.
B
Okay, is there anything else people need to know or add to this mind map to get a good understanding of ChatGPT app?
A
I think this is Pretty comprehensive. Obviously, as we talk today, it's still early days. I think maybe that's the last thing I'll mention is I don't want to try to hype it up too much. It's a cool form factor. It gives you a great experience in terms of testing and building AI apps without having all the infrastructure yourself. So it's a great way to learn. But obviously there's only less than a dozen companies that are currently partnered with ChatGPT. And so I think a lot of this is going to depend on OpenAI's ability to execute. Can they actually get this marketplace over the line? Do people start to use these apps? What does the discovery experience really feel like and look like? And so, yeah, just I guess with a grain of salt. I'm super excited about this space. Obviously I think there's a ton of potential, but it does depend on kind of a couple of things getting across the finish line. But I'd say we're like 70 to 80% of the way there and I would guess by March we'll know one way or the other if this is the case.
B
So I think this is what you're highlighting, that critical PM skill: is this an important opportunity? That's what you all need to think about for yourself in your unique situation. That is our masterclass on ChatGPT apps for PMs, hopefully the best guide on YouTube that you have seen yet. Colin, I want to talk a little bit about you, because you're one of the most interesting men in the PM content space. You just finished a year as a solopreneur. You were a PM for a long time. We all know you are very highly ranked on the Maven leaderboard, so that could be one thing you did with all your time, I imagine, and you would be financially fine. But you didn't really stop there. You did some experiments this year. You launched a podcast, you launched a couple of SaaS apps. This, obviously, Chippy, you've built. What is the pie chart of Colin's time and attention and focus these days?
A
Yeah, so as mentioned, one year, almost like maybe a week ago, two weeks ago, which is really exciting because I've actually wanted to work for myself for literally my entire career, so dream come true for me. But yeah, in terms of time and attention, I would say it's like maybe 40ish percent keeping the whole thing running. And I have some help with that, which is great. My wife actually helps me a ton with operations. I have a TA named Paulo who helps me with some of the core stuff. And so I have some support with that. And then the other like 60 ish, maybe 50% of my time is spent on new bets. And when I think about bets now, it's actually changed over time. So partially it's like, is this going to work commercially? And then the second part is like, do I actually want to do it? And sometimes I don't know if I want to do it until I try. So podcast is a good example of that. I tried podcasting for a bit. I think I was creating interesting things, but the thing I think about is, am I going to be in the top 5 to 10% of this thing? And if I'm not, I kind of drop it and I think of something else that I want to do. And so for podcasting, I don't think I'm going to be in the top 5 or 10%. Maybe I'll come back to it one day, but for now, it's not really a bet that I have the same amount of conviction about. And so, yeah, I tried podcasting for a bit. I've built probably four or five different SaaS apps this year. This is actually my second AI prototyping tool that I've built, like my own prototyping tool. This one obviously catered to this use case, but I built one earlier in the year as well. A lot of learning. Obviously, knowing how to build these tools takes a little bit of time. So learning how to build agents, learning how to run these systems and, yeah, I don't know, testing other stuff. To be honest, I spent time this year on a RAG app for a little while that I built from scratch. What else? I don't know. A bunch of different stuff. I just try different stuff.
Obviously I have a Substack, so it's been a little bit scattered, to be honest, which I don't like. But it is nice to try different things, fail quickly and then move on, rather than just like doing the one thing forever. And then I think, I know you didn't ask this, but my kind of end goal is to have a software product and continue teaching and kind of like balance those two things. But I'd love to have like a software product that's super cool, that's valuable, that people want to use. So that's what I'm trying to shoot for at this time.
B
Got it. So you're like a Marc Lou plus course instructor?
A
Sure, something like that, yeah. I just mess around with stuff a lot. I write about stuff when I find it interesting mostly, and then that's pretty much it. I do have to get better at marketing. This is an aside, but just being transparent, that's like one muscle that in the next year I'm hoping that I improve on because my marketing activities are very haphazard at the moment. And so I need to get more consistent at marketing and just showing up so people know I exist.
B
And what's your stack for building these SaaS apps? How do you build the app we just saw today?
A
Yeah, actually this might be really interesting for you all. I built the UX entirely on Replit, so with Gemini 3 and the new design mode they released recently. But in terms of actually shipping stuff, I use VS Code and Claude Code for the vast majority of codegen. I'm obviously using Git for version control. I use a database provider called Neon, I use a hosting platform called Render. And then there's lots of different libraries depending on what I'm doing. Right. So for the RAG app that I built, RAG is a whole world on its own that's kind of complicated and hard to optimize. And so I was using this vendor called Voyage for the embedding models and different vendors for different things. So you end up with this whole stack of random stuff that you learned about and try to build and then maybe it works, maybe it doesn't. But anyway, my core tools are like Claude Code, VS Code, GitHub and Render for building stuff.
B
Why VS Code, not Cursor?
A
Yeah, so I don't use Cursor. I use Claude Code predominantly. I use the Codex tool as well sometimes. And I just find like the integration inside of VS Code is a little bit nicer for those tools than it is inside of Cursor. Cursor has its own AI, obviously, and it tries to like, use that AI. And I don't want to. I want to use Claude Code or Codex. And so, yeah, I don't really use Cursor. I find Cursor at times kind of like tempts me back because they release new features. So for example, they released embedded websites. So like you can interact with your website, whatever you're building inside of Cursor, and the AI has some context on it and can actually debug for you. But more often than not, like, those things don't really move the needle. For me, what really moves the needle is like quality of code gen. That's like 95% of what I need and care about and all the other stuff is just bells and whistles. And so right now for me, Claude Code is like the highest quality code gen with the fastest pace and so that's like my daily driver.
B
Fascinating stuff, man. We're going to have to have you back again. Maybe you can show us this stack of how people can build stuff. I'm sure there's a million different other episode ideas we could come up with. You guys leave a comment below. Should we have Colin back on for a third episode? By the way, for those who don't know, he used to have our number one episode of all time when he did our top five AI prototyping tools. Since then we've managed to release some episodes that did better. Thank God for us as a podcasting team, but hopefully we can break the records with this one. Drop a comment below on what you liked about this episode and whether we should have Colin back on. Colin, thank you so much for dropping all this sauce.
A
Yeah, yeah. How did you hear?
B
All right everyone, see you later. I hope you enjoyed that episode. If you could take a moment to double check that you have followed on Apple and Spotify podcasts, subscribed on YouTube, left a rating or review on Apple or Spotify, and commented on YouTube. All these things will help the algorithm distribute the show to more and more people. As we distribute the show to more people, we can grow the show, improve the quality of the content and the production to get you better insights to stay ahead in your career. Finally, do check out my bundle at bundle.aakashg.com to get access to nine AI products for an entire year for free. This includes Dovetail, Mobbin, Linear, Reforge, Build, Descript, and many other amazing tools that will help you as an AI product manager or builder succeed. I'll see you in the next episode.
Episode: How to Build ChatGPT Apps (The Next App Store?) | Live Demo by Colin Matthews
Host: Aakash Gupta
Guest: Colin Matthews
Date: January 22, 2026
This episode dives deep into the emerging world of ChatGPT apps, their potential to create a new kind of app store, and how builders (especially product managers and solo founders) can get started. Aakash Gupta hosts expert builder and AI product leader Colin Matthews for a practical, in-depth tour — including a live demo of building and testing ChatGPT apps using Colin’s Chippy tool. Listeners will understand the “why” and “how” behind ChatGPT app development, discover the difference between easy and hard paths to launch, and get strategic advice for building in this fast-changing ecosystem.
“ChatGPT apps are basically a way for companies to bring in their own designs... directly into ChatGPT.”
— Colin, [03:24]
“You can build a little spreadsheet app inside of ChatGPT... or a to-do list app that gets pinned to the top.”
— Colin, [07:12]
“Every time I wanted to make a small UI tweak, I had to rebundle the code, make the change, go back into ChatGPT... It was such a pain.”
— Colin, [15:42]
“You need to go through the process of tuning your tool description... so that ChatGPT better understands when to call your tool.”
— Aakash, [36:32]
“For me, what moves the needle is the quality of code gen. That’s 95% of what I care about.”
— Colin, [60:48]
“There's a very large kind of maybe panic around getting your content into ChatGPT. Right. Because people know that it converts.”
— Colin, [05:25]
“If you have really garbage reviews, you won’t have as many customers.”
— Colin, [20:27] (on healthcare reviews use case)
“AI prototyping can support some of those [classic PM] responsibilities... It’s not an extra line item.”
— Colin, [42:51]
“We are basically learning how to build for a platform that's about to become available.”
— Aakash, [19:22]
“ChatGPT is this operating system more — it’s like the universal agent... these are all different applications that I can call or use as needed.”
— Colin, [52:12]
| Timestamp | Topic/Quote |
|-----------|-------------|
| 03:24 | What is a ChatGPT app? Core concept breakdown by Colin |
| 04:10–05:53 | Discoverability, partnerships, and deterministic in-chat experiences |
| 07:12 | Opportunities for product builders, solo and enterprise |
| 08:36–11:24 | Architecture & MCP/Widgets explained |
| 12:44–14:40 | Easy way (Chippy tool) vs. hard way to build apps |
| 15:42 | Friction of rapid iteration, why Chippy exists |
| 17:25–18:42 | Invocation, distribution, and eval process basics |
| 21:11 | Using Opus 4.5 model for high-quality prototyping |
| 23:51–26:15 | Evals, prompts, and app test cycles |
| 34:00–37:39 | Metadata, tweaking tool descriptions, and the iterative process |
| 40:58–43:06 | AI prototyping as a PM skill, not new, but an amplifier for core tasks |
| 45:17–47:02 | Who should build ChatGPT apps? PM, designer, engineering pod |
| 51:18–52:12 | E-commerce, content creation, collaboration: top use case directions |
| 53:07–54:16 | MCP as the cross-platform foundation, current limitations (no Gemini yet) |
| 54:37 | State of play: “It’s early, but promising” |
| 56:19–58:32 | Colin's solopreneur journey, project approach, and focus |
| 59:04–60:48 | Tech stack and tool preferences |