
A
Working with Claude is just such a delight. It just feels so steerable and I think the one thing it really has is intent understanding. When I wanted to dig deep it just does it and it's really enabled me to build a little ecosystem of my own tools around it.
B
I think environment setup and developer setup is such an underappreciated use case. One of the things that I know you really care about is effective planning and you've come up with a way that you do your planning that I think is pretty unique.
A
So I've played around with this tool to basically give Claude these JSON files, and there's a whole set of skills I've built around this that Claude Code can use to write these out, and then these actually end up generating nice-looking UI mockups. I will say this is a dev tool that was almost 100% prompted.
B
Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools.
C
Today I have CJ Hess at 10x.
B
And if you've seen him on X, he is building some of the most useful tools and flows for being a real AI engineer. We're going to get a sneak peek at his tool Flowy that he vibe coded for himself, and he's going to show us how he uses model-to-model comparison to make sure his code is great. Let's get to it.
C
This episode is brought to you by Orkes, the company behind open source Conductor, the platform powering complex workflows and process orchestration for modern enterprise apps and agentic workflows. Legacy business process automation tools are breaking down. Siloed low-code platforms, outdated process management systems, and disconnected API management tools weren't built for today's event-driven, AI-powered, cloud-native world. Orkes changes that. With Orkes Conductor, you get a modern orchestration layer that scales with high reliability, supports both visual and code-first development, and brings humans, AI, and systems together in real time. It's not just about tasks, it's about orchestrating everything: APIs, microservices, data pipelines, human-in-the-loop actions, and even autonomous agents. So build, test, and debug complex workflows with ease, add human approvals, automate backend processes, and orchestrate agentic workflows at enterprise scale, all while maintaining enterprise-grade security, compliance, and observability. Whether you're modernizing legacy systems or scaling next-gen AI-driven apps, Orkes helps you go from idea to production fast. Orkes: orchestrate the future of work. Learn more and start building at orkes.io. That's O-R-K-E-S.
B
CJ welcome to How I AI.
A
Thanks, Claire. It's good to be here.
B
So I've seen a lot of Claude and AI engineering power users and I still think you're like a super power user of some of these tools. And it's not just because you're creating real production code with what you're building, which is really nice to see and I think a subset of what we're seeing out of folks using these tools. You also build tools for yourself to make the process of AI engineering better and, and you share those tools with other people who then validate that they're actually helpful. So why are you so excited about in particular Claude code and what has it changed for you? As, as we were talking before the show, a quote unquote, real software engineer.
A
Like a lot of people, the Opus 4.5 moment was a big one. But I've been on Claude Code since, I don't know, maybe last May, and for me it was really about the harness they have. I see a lot of arguments about Codex and Claude Code, and I'd honestly argue GPT 5.2 is a smarter model. But working with Claude is just such a delight. In Claude Code, it just feels so steerable. And I think the one thing it really has is intent understanding. Maybe I'm not giving, you know, Opus and Cursor the shot they deserve here, but there's something about Claude Code: when I want to dig deep, it just does it. It seems to pick up on my intuition just from the prompts, and it's really enabled me to build a little ecosystem of my own tools around Claude Code, particularly with skills now, that just keep making it better and better for me, because it's Claude Code plus this system of skills and tools that I've built around it. So it's really hard for me to get out of it.
B
Yeah, what I love about this moment as a software engineer is, you know, back in the, in the olden days, you sort of had like your choice of like, what's going to be my IDE and am I going to use VIM and like what, you know, what are my set of approved tools as an engineer that I can use to make, you know, what linters are we using as a team, all this kind of stuff. Like there's stuff that you could do to customize your developer environment. But now you can really take it to the next level and you could have a totally different AI engineering workflow than your colleague sitting next to you. And it's totally fine because it's making you individually a lot more efficient and effective and you're building them yourselves for pretty cheap. So there's not that cost or that hurdle of evaluating new things in your stack.
A
Yeah, it's almost like you now have the brains of Claude to do some of the dirty work on the dev tooling. Before any of the newer-gen models that can really handle the agentic loop, I'd be sitting with a broken linter and just accepting it, having ignore comments everywhere. I just would give up. And now I feel like I can almost trust it to be like, what's wrong with this config? My IDE isn't matching what's in the project. Okay, we have to resolve this. It's just solving those chore problems that I feel like previously ended up being forever problems.
B
Yeah. And for the non-engineers watching or listening right now, I think environment setup and developer setup is such an underappreciated use case. Yesterday I onboarded a designer who has kind of sat out some of this AI stuff. She had literally not downloaded anything or used anything, and now she's on Cursor and Claude Code, Node's running, Homebrew's installed. And I was like, just ask Claude Code to do it. Say, help me understand this repo and get my computer set up to run it. And I said, then just tell it it can accept all tools, let it go, and come back to your laptop later. And it's pretty great. I mean, we're really, really spoiled right now. So let's dive into some of your actual workflows. One of the things that I know you really care about is effective planning, and you've come up with a way that you do your planning that I think is pretty unique.
A
Yeah. So there's kind of the classic plan. I'm going to swap over to Cursor here. I have in this repo just your classic plans folder, just throwing plans in here. And I really love this format. I think a lot of people are kind of converging on this: iterating on markdown, having one file where you're just working through the plan, reviewing the plan, and by the end of that you can almost feel confident just letting it write the code. But the one piece that I hated, and that I also found really valuable, was these ASCII flowcharts. So if you're just listening, it's all those boxes and arrows that Claude draws. This one actually looks pretty clean, but there's always this misalignment of that edge character. I don't know why we haven't figured that out yet. For things like UI mockups, or flowcharts of how navigation is going to work or how a certain system is going to work, I really like this visual way to think about things, but I really hate staring at these ASCII diagrams. Even things like Mermaid just didn't feel like exactly what I was going for. So I've played around with this tool to basically give Claude these JSON files. And there's a whole set of skills I've built around this that Claude Code can use to write these out. And then these actually end up generating nice-looking UI mockups. Not in super high fidelity or detail, but I can kind of guide it in the direction I need. And up here, this white text might be a little hard to see, but basically this is a flowchart of this tool, Flowy, and how it works.
B
So for, for the listeners, what I love about this is Flowy is a tool that you built. This isn't. You're saying like, oh, I was playing with this tool. It's like, no, you built this tool.
C
For your, for yourself.
A
This was my first experiment with a Ralph loop. I'm still not certain how confident I am in them, because I had to do a little bit of cleanup. But overall, I will say this is kind of a dev tool that was almost 100% prompted.
B
Yeah. And so what you said is, you know, I love plans, I love the idea, and I just have to take a minute again. I'm the oldest lady on the Internet. So way back in the day, two decades ago, when I was first doing product management and web design, we did so many flowcharts, so many user journey charts, and then so many wireframes and so many low-fidelity mocks, then high-fidelity mocks. And what I love about what you're building is you're building the AI-native version of that. That piece has not gone away for anybody. It hasn't gone away that you say, when you click this, it goes to this, and these are the steps and these are the branches and all that. And it hasn't gone away that you have to look at designs and say, yeah, this is kind of what I want. But now you can have AI create them. And at first you had AI create them in markdown. Very, very low fidelity. And I have to take a side journey: a year ago I was extremely delighted that it was making ASCII mockups, and now it's just not good enough.
A
Yeah, shifting expectations on these models.
B
Yeah, exactly. And so you've taken these markdown mockups that were useful and you said, now make them really useful by building this sub-application that can run them for you. And it's a combination of, it seems like, workflow diagrams and step-by-step mockups.
A
Yeah. Basically what I wanted was a JSON file it can render, one that can have nodes and edges like any flowchart, and then roughly be able to stack them, change the colors, and get us something that looks like this, where we have a couple of different screens and we have these things somewhere between a wireframe and a true mockup that can help me point the model in the right direction. The other big thing for me was iterating on this. I'm not going to go into that markdown file and try to write new shapes and combine them. So this is also an editor, and as you edit it, all these changes save to that JSON file. So you can then point Claude back at it and say, hey, I know you did this, but actually, let's say I want a step here, and I'm going to bring this up and add some edges. You can be designing in here almost like you're in Figma or Excalidraw or something. And then Claude can just read the file, and that's a more native way for it to understand what everything looks like.
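To make the "JSON file with nodes and edges" idea concrete, here is a minimal sketch in Python. The episode never shows Flowy's actual schema, so every field name here (`id`, `label`, `x`, `y`, `color`, `source`, `target`) is a hypothetical stand-in. It only illustrates the round trip described above: a diagram is written to JSON, edited, and read back with a basic sanity check.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Node:
    id: str
    label: str
    x: float
    y: float
    color: str = "#dbeafe"  # hypothetical default fill, not Flowy's


@dataclass
class Edge:
    source: str  # id of the node the arrow starts from
    target: str  # id of the node the arrow points to


@dataclass
class Diagram:
    name: str
    nodes: list[Node] = field(default_factory=list)
    edges: list[Edge] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the diagram so an editor or an agent can read it."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Diagram":
        """Parse a diagram and reject files whose edges point nowhere."""
        raw = json.loads(text)
        diagram = cls(
            name=raw["name"],
            nodes=[Node(**n) for n in raw["nodes"]],
            edges=[Edge(**e) for e in raw["edges"]],
        )
        diagram.validate()
        return diagram

    def validate(self) -> None:
        ids = {n.id for n in self.nodes}
        if len(ids) != len(self.nodes):
            raise ValueError("duplicate node id")
        for e in self.edges:
            if e.source not in ids or e.target not in ids:
                raise ValueError(f"edge {e.source}->{e.target} references a missing node")


diagram = Diagram(
    name="user-flow",
    nodes=[
        Node("tap", "Tap spin button", 0, 0),
        Node("spin", "Wheel spins", 0, 120),
        Node("tip", "Tip card appears", 0, 240),
    ],
    edges=[Edge("tap", "spin"), Edge("spin", "tip")],
)
round_tripped = Diagram.from_json(diagram.to_json())
```

Because the file is plain JSON, a human edit in a visual editor and an agent edit in the terminal both end up as ordinary diffs on the same artifact.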
B
And you mentioned Mermaid diagrams. So I have this question: one of the benefits of Mermaid diagrams is that it's a syntax that these models know well and can parse and actually reason about. Have you created a skill so Claude Code can understand and read this JSON? How did you train it to read your kind of proprietary dev tool and documentation?
A
Yeah. So right now there are two main skills I use, plus a third one that's just an overview, basically the high-level view of what the commands are and what a Flowy file would look like. Then I have one that's very specific about flowcharts and one that's about UI mockups. To make these, I basically sat in the repo of the tool itself, had a bunch of Explore subagents going, and then started to make the first UI mockups and flowcharts and started to guide it. Okay, you put these too close. We need a rule about spacing and how to think about spacing. And I've just incrementally been building that up. If I'm working with this and something goes wrong, an example here would be this white text on these pastel notes being hard to read, I would essentially hop into the place where I have these skills and say, here's what happened. Give me a suggestion on how to improve this skill so this doesn't happen again. And then iteratively just keep building that skill. The first flowcharts this thing made were shapes stacked on top of each other. It didn't make any sense. But it's come a long way without many changes to the underlying app. It's really just been about getting Claude to understand and know the skill. And I find that works better than something like Mermaid, just because I really feel the power of building my own dev tools now, and I really don't want to hit the constraints of Mermaid, if that makes sense. I want to be able to say, okay, I want a new feature in Flowy. I'm going to build it, I'm going to update the skills, and I can be confident that Claude can actually work with that and understand the new feature.
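A spacing rule like the one described above can also be checked mechanically. The sketch below is an illustration only: the `Box` fields and the 40-unit minimum gap are assumptions, not anything taken from Flowy or its skills. It flags pairs of boxes that overlap or sit closer together than the minimum.

```python
from itertools import combinations
from typing import NamedTuple


class Box(NamedTuple):
    id: str
    x: float  # left edge
    y: float  # top edge
    width: float
    height: float


def gap(a: Box, b: Box) -> float:
    """Empty space between two boxes; 0.0 if they touch or overlap."""
    dx = max(a.x - (b.x + b.width), b.x - (a.x + a.width), 0.0)
    dy = max(a.y - (b.y + b.height), b.y - (a.y + a.height), 0.0)
    return (dx * dx + dy * dy) ** 0.5


def too_close(boxes: list[Box], min_gap: float = 40.0) -> list[tuple[str, str]]:
    """Return id pairs whose gap is below min_gap (hypothetical threshold)."""
    return [(a.id, b.id) for a, b in combinations(boxes, 2) if gap(a, b) < min_gap]


boxes = [
    Box("a", 0, 0, 100, 50),
    Box("b", 0, 60, 100, 50),   # only 10 units below "a"
    Box("c", 300, 0, 100, 50),  # well clear of both
]
violations = too_close(boxes)
```

A check like this could be run after the agent writes a diagram, with the violations fed back as the "here's what happened" note when refining the skill.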
B
Yeah. One of the things that I've really observed in myself as an engineer is that as I access more and more of my dev tools through an MCP or config as code or any of these things, I start to realize it's very easy for me to extend what they've built and customize it to myself. I do think, of all the places, devtools is an interesting one, where one, your users are super cheap, two, they're capable of forking what you've built, and three, there's so much open source that I really do think there's going to be this trend towards build. When I ran these big product and engineering orgs, they used to ask me build versus buy, and I was like, oh my God, please just buy it. Please just take my credit card and buy it, and let's not waste our time. And now I've flipped to, of course we should build this. Until we hit some constraint, we should build it, and certainly individual engineers should. If something's useful, you should just build something yourself. At least for V1.
A
Yeah, it's almost not worth spending the extra money anymore. I mean I've seen, I feel like I'm seeing this pattern on Twitter, but it's everyone's posting some product, some ridiculous pricing tier and saying someone please vibe code this. You know, I feel like that's happening all across SaaS.
B
Yeah. So can you show us how you'd either create one of these Flowies or use one in Claude Code? How does this actually work?
A
What I was thinking is I have this tips and tricks section in this little like demo Claude code guide app. My whole background's in mobile development, so this was the easiest thing for me to spin up. But basically I kind of don't like these cards. I almost want this to be a little more fun. Let's say you want like a spinner wheel. It lands on something and then it shows you the tip. The development flow for me usually looks like hopping in here. I have some funny aliases, but I'm a fully bypass permissions guy. Kevin in my terminal actually routes to Claude with bypass permissions.
B
Okay, so you've named different permission scopes as aliases in your terminal. For our listeners, we have a very recent episode with John Lindquist, who actually shows how to set up those aliases for Claude Code. So definitely check out that episode if you want to set this up. I just have a classic CC, and then I'm going to make a CC scary. That'll be my dangerous one.
A
Yeah, I'm, I'm more and more in this Kevin mode today. I find that a lot of like projects where I'm you know, solely working on it or working within the team I'm on. We have all the like rules set up in Git that if I do something horrible it, it's okay. But there are definitely times like if I'm creating a PR every now and then. I still do it by hand, but I have a lot of skills that do a lot of those workflows. Run the pre flight checks and make sure we're all good before pushing it up. But besides that I'm, I'm kind of okay running dangerously bypass most of the time these days.
B
Great. So you go into Kevin AKA Claude code and what do you do?
A
So for this, my prompt would probably be something along the lines of: look at our previous plans and then explore the code base. I just want to re-anchor it a little bit, especially on a fresh chat. On the tips and tricks section, I want to create a spinning wheel, where a user presses a button, the wheel spins, and then it lands on one of the tips. After that, the tip should pop up in a card just below the spinner. Then the next step, and what I've been doing more and more, which is not how I initially started using this tool, is actually having it make the flowchart of how the code's going to work, a system diagram, anything like that. In this example, I'd actually want both the user flow and an animation timing sequence. I've found this to be super helpful with complex animations. So I would say: then use the Flowy flowchart skill to create an animation timing sequence diagram and a user flow diagram for the tips and tricks page. So we'll send off Claude. It's going to do a little bit of exploration. Yep, there it is. I actually really like these Explore subagents, and oftentimes I'll kick off three, four, five in parallel just to look at different places, especially if I'm in a larger code base, just gathering all the context around it. This is a small app, so I don't imagine this will take too long. Then Claude's going to load up this Flowy skill, write it out, and we should be able to look at that in the Flowy editor and then play around before we actually implement it.
B
While we're waiting for this to load, can we look at that Flowy skill just a little bit, just to see how you've structured it?
A
For sure. So first, I'll just show you the supporting files. Yep, this one's just a SKILL.md. This shows you how almost hands-off I am with some of these skill files, particularly the ones that I build myself.
B
Yeah, we have a. We have a Skill 101 episode and it's like it's a. It's. It's a markdown file in a folder.
A
It's a markdown file. And sometimes this might be a specific example, but with Flowy, it's very squishy. I would say I go in there, I change something quick, I say update the skill. And really the process of refinement is me using it and seeing what failed. So here I don't super care how this file is set up, as long as when I make an update afterward, it's performing better. I almost feel good letting the model manage what this looks like. So let's read through. It has a bunch of examples in here. Let me scroll up to the top. I'm sure there's some overview. Great. Again, classic overview. Hey, we're going to make flowcharts and architecture diagrams. They're going to render on this port. Here's where you're going to make them. It knows that the Flowy app looks for the flowy folder, it gives it some high-level guidance on what the metadata looks like and what you include for nodes and edges, and then it starts digging into the specifics. Right. So we have the different shapes, what a rough schema looks like. You've got your styles, you have icons that you can use, and then it starts to list out the properties. So I wouldn't say this is anything super crazy or even too long and detailed, but this encapsulates all the pieces that Claude needs to know. And you can almost see here how this skill grows as feature development happens. Recently I set up this whole semantic color system just to have somewhat consistent themes. Sometimes Claude likes to pick some crazy colors. And this section just popped into the skill. Right. So as I'm doing development on Flowy, part of every plan for code in Flowy is updating documentation and updating the related skills.
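As an illustration of what a semantic color system can buy you, the sketch below maps hypothetical role names to pastel fills and picks black or white text for each using the WCAG contrast-ratio formula. The role names and hex values are assumptions made up for this example, not Flowy's palette. It also shows why white text on a pastel fill is hard to read: the ratio falls well under the 4.5:1 level WCAG suggests for body text.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def readable_text_color(background: str) -> str:
    """Pick whichever of black or white contrasts more with the background."""
    return max(("#000000", "#ffffff"), key=lambda text: contrast_ratio(text, background))


# Hypothetical semantic roles mapped to pastel fills.
SEMANTIC_FILLS = {
    "info": "#dbeafe",
    "success": "#dcfce7",
    "warning": "#fef3c7",
    "danger": "#fee2e2",
}
TEXT_COLORS = {role: readable_text_color(fill) for role, fill in SEMANTIC_FILLS.items()}
```

Naming colors by role rather than by hex value gives the agent a small, fixed vocabulary to choose from, which is one way to rein in "crazy colors".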
B
Yep. And I find myself in this loop so frequently, very similar to you with skills, which is: I'm happy when the skill works, and when the skill doesn't work, I update the skill, and as long as the update got me what I want, I move on with my life, since the AI can read the markdown. A couple of things I want to call out, though, for folks that are writing skills or reading skills, if you scroll up real quick. Yeah. So I think there are a couple of things: what's the purpose of the skill? What's its name? Quick Start, I think, is really nice: you need these things in order to run this skill. Here's the schema or the template or the framework within which you're operating. Here's some customization of it. And then at the end, here are good examples of what works. And I think that's a pretty solid skill. The good thing is you don't have to know how to do that. You can just have Claude use a skill to write skills, or no skill at all, since it's pretty good at writing skills. And then you end up with something like this, which I think is really great. And I'm presuming you did this by building Flowy and then saying, okay, build me a skill to use this based on the code that exists in the repo.
A
Yeah, I have a meta skill that is all about making skills. One thing I will say it looks like it violated is I actually prefer a pre flight section sometimes after Quickstart just to give it like, hey, you have to make sure we're meeting all these requirements first. Quick start here is kind of doing that. But there are definitely some examples mainly in like git workflows where I really want those preflight checks. But absolutely. This is essentially managed by the agent and it's updated as we're doing development. So this is almost like living documentation. And there's docs for people and there's docs for agents and those just end up being skills.
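The preflight idea mentioned here, confirming requirements before a workflow runs, can be sketched as a small runner. This is an illustration under assumptions rather than the speaker's actual skill: the `Check` type and `run_preflight` helper are hypothetical names. The single concrete check relies on the standard `git status --porcelain` command, which prints nothing when the working tree is clean.

```python
import subprocess
from typing import Callable, NamedTuple


class Check(NamedTuple):
    name: str
    run: Callable[[], bool]  # returns True when the requirement is met


def working_tree_clean() -> bool:
    """True when `git status --porcelain` prints nothing (no uncommitted changes)."""
    result = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True
    )
    return result.returncode == 0 and result.stdout.strip() == ""


def run_preflight(checks: list[Check]) -> list[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for check in checks:
        try:
            passed = check.run()
        except Exception:
            passed = False  # a crashing check counts as a failure
        if not passed:
            failures.append(check.name)
    return failures


if __name__ == "__main__":
    failed = run_preflight([Check("working tree is clean", working_tree_clean)])
    print("preflight ok" if not failed else f"preflight failed: {failed}")
```

Returning the full list of failures, rather than stopping at the first one, lets an agent report everything that needs fixing in one pass.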
B
Yep.
C
Great.
B
Okay, so let's go back and see if it's made you a flowy.
A
Sweet. So looks like it made two. I usually like to zoom out and read the high level in the chat. This looks about what we want. If we hop back over to here, we can see we have these new. These two new ones, animation, timing and user flow. So these ones have been super helpful to me lately. Again, I'm not loving how this white is looking on this pastel note, but high level, we want the user to tap a wheel. The button is going to do a little scale animation and there's going to be some haptic feedback and then we're going to go through this spin animation, do a brief pause and then reveal the tip that it lands on. This is great. This is exactly what I'd want. Maybe I want the animation to be a little longer. I can actually come into here and.
B
You can tell color issues, you can.
A
Tell dark mode is new, but I can flip it real quick. Yeah, but if we hop down here, sometimes I even just put a note. That might be me being lazy and not adding certain features, but maybe I want this to actually be a 4 second animation instead of a 3 second one. I want this to be 4000 milliseconds and not 3000 milliseconds. I'll just throw in that note. I'll hop back to Claude: I left a note on the animation timing. Please take it into consideration and update that flowchart. While Claude is working on that, we can check out the user flow. But basically the goal there is to have this diagram right in here, which is a little small, but right in here, say that for this animation, we don't want it to be 3000 milliseconds, we want it to be 4000. On the user flow, again, we captured the behavior that we want. Again, it's not perfect. There are rough edges and bugs here, but we're going to go into this tab, we're going to tap Tips and Tricks. This is going to open up to this screen. They're going to tap, we're going to check the different states of currently spinning, and finally we're going to have this random target that we land on, and the card animates in. This is great. This is kind of exactly what I was looking for here. In a more complicated system, I often will start high level, then start making more granular ones. But for something like this, this seems to cover the needs we have. I will say I have no idea how it's going to handle the UI mockup, but the next step would be to prompt it to do that. So after it finishes this, I'd say something along the lines of: great. Based on those diagrams, please create UI mockups using the Flowy UI mockup skill. Reference other UI mockup Flowy JSON files in this repo.
D
Meet Rovo, your AI teammate, connecting knowledge, people, and workflows so teams can work smarter and move faster. It helps people find answers, make decisions, and automate work securely and with context through search, chat, agents, and studio. Rovo runs on the Teamwork Graph, Atlassian's intelligent layer that unifies data across your first and third party apps. So no knowledge gets left behind. And you always get personalized AI insights from day one. And the best news: it's already built into Jira, Confluence, and Jira Service Management paid subscriptions. So the power of Rovo is already at your fingertips. Know the feeling when AI turns from tool to teammate. If you Rovo, you know. Discover Rovo.
C
AI that knows your business.
D
Powered by Atlassian. Get started at rovo.com. That's R-O-V, as in victory, O, dot com.
B
You know, I think it's so cool. It's such a great example of build your own dev tool. You know, interact with your agent, Claude Code, how you want. Create a shared language between you and your AI agent. What I also really appreciate is Claude one-shotted your flow pretty close. It was like, yeah, that's what I want. And it probably could have done that, or would have done that really well, in a plan, in markdown. What I find though is my human brain is increasingly blind to code and markdown. Staring at it, the cognitive overhead of reading step by step, is this actually what I want, is hard when it's just text, even if it's accurate. And so even giving. Hold on. Side news, people. Quick breaking news. Polly the Claudebot just joined this podcast.
D
This laptop is closed.
B
Closed. She is not alive right now. I don't know where she.
A
I think Polly's gonna take over, so.
B
We're gonna boot Polly the Claudebot. Thank you for joining, Polly. This actually freaks me out. We will do a follow up on my sentient lobster. I guess it's the OpenClaw bot now, but.
A
Bouncer bounce around if you don't hear from us. Polly got us. It's all over.
C
Okay.
B
She might just be on. On the rest of the episode. I don't know how to help this.
A
Well, I guess. I hope. I hope Polly likes flowcharts.
B
She'll do show notes for us. But what I was saying is like, being able to read that markdown is one thing. Being able to look at a flowchart and just say, yep, this is exactly what I want is super helpful. So that's just one thing that I think is really nice about a tool like this is even if the content is. Is the same, the ability to change the form factor is. Is really useful.
A
Yeah, it's almost like I want to see it visually and Claude wants to see it as markdown so we can kind of speak in our own way. And I almost think there's like, this has yielded like a ton of random ideas for me, but I think this is like a whole new paradigm that I think dev tooling around AI has not super leaned into yet. But how you're going back and forth with an agent I think is going to look so much different by the end of this year than what we're doing right now, where it is, you know, a lot of markdown, a lot of prompting.
B
Yeah, I completely agree. And I think the question is going to be, you know, who's going to build that UI? Who's going to own it? Is it going to be just an open source thing that we all get on? Is it going to be an extension? Is Claude Code going to just generate these kinds of assets? Really exciting. I think what's kind of fun is this on-demand software idea, which is, you know, imagine Claude Code's like: we're not on the same page. I just added an app for you to visualize this real quick. Go to this URL and look at it. Does this look right? And then we'll just delete that app. So I think there are just some interesting ways this can manifest in the future. Okay, so has it created the UI yet? Oh, spinner mockup.
A
Okay, great. So it looks like Claude spun up a mockup here. This is actually better than I thought. I was almost thinking one of those circles with wedges as the spinner, and I know there are no shapes in Flowy that can support that, but it looks like Claude kind of worked around it and then built out this wheel. We have a couple of mockups to show the different states and the full flow between spinning it, waiting these four seconds for it to load, and then it actually loading in. Again, for this app, this looks great. I will say editing some of the UI stuff right now isn't the easiest thing, but if I were to come in here and change the title to "Claude tips and tricks", I could then do a similar thing, hopping back to Claude and saying, I made a change to the title on one mockup. Make it everywhere else. This kind of feels like when you prompt it and say, add two pixels of spacing there, and it's just a tiny diff. But definitely for dragging around boxes, it's helpful.
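The animation timing discussed in this episode (a button scale, the wheel spin, a brief pause, then the tip reveal, with the spin changed from 3000 to 4000 milliseconds) can be written down as a simple timeline. In the sketch below only the 3000 and 4000 millisecond spin durations come from the conversation. The other durations and all names are placeholder assumptions.

```python
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    duration_ms: int


class Scheduled(NamedTuple):
    name: str
    start_ms: int
    end_ms: int


def build_timeline(steps: list[Step]) -> list[Scheduled]:
    """Lay the steps end to end and compute each one's start and end time."""
    timeline = []
    clock = 0
    for step in steps:
        timeline.append(Scheduled(step.name, clock, clock + step.duration_ms))
        clock += step.duration_ms
    return timeline


def with_duration(steps: list[Step], name: str, duration_ms: int) -> list[Step]:
    """Return a copy of the steps with one step's duration replaced."""
    return [Step(s.name, duration_ms) if s.name == name else s for s in steps]


steps = [
    Step("button scale", 150),  # placeholder duration
    Step("wheel spin", 3000),   # original value mentioned in the episode
    Step("pause", 300),         # placeholder duration
    Step("reveal tip", 250),    # placeholder duration
]
# Apply the note left on the diagram: spin for 4000 ms instead of 3000 ms.
timeline = build_timeline(with_duration(steps, "wheel spin", 4000))
```

Changing one duration shifts every later step automatically, which is the kind of knock-on effect a timing diagram makes easy to see before any code is written.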
D
Our fingers get tired.
B
I can't copy and paste everywhere. No, what I was going to say is so funny is you're apologizing like, oh, some of the UI is broken. And we're in this world where you're like, yeah, my figma that I vibe coded where I can do mockups in a web browser. There's, like some rough edges on it. I spent, you know, two hours on it.
A
But yeah, it was an afternoon. It's not perfect yet, but it's so.
B
Much more than we were able to do before. Okay, so this is awesome. You're updating this, and then I'm presuming you would just point Claude to these assets and flows and say, let's make a plan and go.
A
Yeah, for something like this, I've basically been doing this thing more recently where I'm letting the agent do more and more to see where it surprises me. I think with any new change, even like the new Claude Code Tasks system they released the other day, I just really like to push the agents and see what they can do. So here I'm actually going to skip the plan and say: based on the flowcharts and the mockups, build this feature. And I'm going to keep it that simple. We've specified the behavior we want, we've specified how it should look. Claude here is even going to enter plan mode, and I'm actually going to take it right out of it. We're going to see if the just-build-it prompt worked here.
B
Perfect.
A
Great. Looks like Claude built this out. It even checked for any TypeScript issues, which is great. We're going to hop over here. We have a nice little spinner. It's looking pretty close to this mockup. I will say there is a limiting thing here, where shapes that are made in the mockup then dictate the shapes that are made in the UI, when sometimes we want something else. But just for this example, I think this is going to work out. We're gonna spin it. It's gonna spin. It landed on one of them, and we get the tip.
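The "random target that we land on" behavior comes down to a little angle arithmetic. The sketch below is not the code Claude generated in the episode. It assumes equal segments laid out clockwise from the top, a fixed pointer at the top, and a wheel that rotates clockwise, and all function names are hypothetical.

```python
import random


def rotation_for(target_index: int, segment_count: int, full_turns: int = 5) -> float:
    """Clockwise degrees that leave the target segment centered under a top pointer."""
    segment_angle = 360 / segment_count
    center = target_index * segment_angle + segment_angle / 2
    return full_turns * 360 + (360 - center)


def landed_index(rotation: float, segment_count: int) -> int:
    """Which segment sits under the top pointer after rotating clockwise."""
    segment_angle = 360 / segment_count
    under_pointer = (-rotation) % 360  # wheel angle now at the pointer
    return int(under_pointer // segment_angle)


def spin(tips: list[str], rng: random.Random) -> tuple[str, float]:
    """Pick a random tip and the rotation that lands the wheel on it."""
    index = rng.randrange(len(tips))
    return tips[index], rotation_for(index, len(tips))


tips = [f"Tip {n}" for n in range(1, 9)]  # placeholder tip text
tip, rotation = spin(tips, random.Random(7))
```

Choosing the target first and deriving the rotation from it keeps the animation and the revealed tip in agreement, instead of spinning by a random amount and working out afterwards where the wheel stopped.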
B
I love it. It's so good. Again, for anybody who is Internet elderly like me, it is just back to the original: make your workflow diagram, do your wireframes, polish the copy, and give your, quote unquote, engineer some detailed step-by-step specs. Don't make them think. And then, you know, it used to be: get it in a sprint, wait for somebody to prioritize it, cry a little bit, wait for the code, blah, blah, blah. And now it's like, no, just build it, and it's here. So this is such an awesome flow. And I want to recap really quickly what we covered. We covered markdown plans and the limitations of some of the visualizations in them. You created your own tool, Flowy, which does a combination of workflow diagrams and UI mockups using a JSON schema that you access through skills that you have developed over time using Claude Code. In your development process, you go into Claude Code and ask it to create a Flowy diagram and UI. You can talk, quote unquote, between the UI and Claude Code, because it's all just code as the underlying substrate between you two in terms of communication. And then once they are ready to go, you bypass plan mode. You're living dangerously, and you build it, and you get something that's really close. And we built this thing in, you know, just a few minutes. It's awesome.
A
Yeah, no, I mean, I think that flow, I will say a lot of times there's a markdown file involved, but for something like this, I feel like I can trust it at this point. Something like Opus 4.5 with this level of detail already has all it needs. This almost serves as the plan.
B
Now I have to call you out though, because you say you can trust it, and yet today you posted or recently you posted on X that you do occasionally use Codex to check Claude's work. You want to just talk us through that workflow. You don't even have to show it unless you want to.
A
For sure, I'll kick it off. I will say Codex takes its time, but over here I have another funny alias. My Codex setup is under Carl. If I kick off Carl, I often don't have any crazy, like, skills or prompts here. I almost want it to do a review more broadly and then describe the issues it's seeing. So I'm not running any specific skill or any specific prompt here, because I'm more concerned with, I guess, things that aren't clear rather than something that's like a logical bug. At this point, I feel like I'm mostly a QA person, and if there's something that's logically wrong, I've definitely found that I'll find it. Or if I have something in the docs in here, it'll find it. Codex always finds those types of things. But I almost want to look for the code smells, like, you know, is there just a cleaner way? So I usually just prompt it with: take a look at our current git diff and give me a report on the following. And there's kind of four buckets, I would say. One, for the plan or diagram artifacts we have, does the code accurately reflect them? Two, are there any general code smells? And three, if we were to do this again and take a different approach to refactor code around it to overall improve this code base, what approach would be best? I want it to find places where we could have done this better, because I find that Claude is very eager sometimes and maybe jams things in there without thinking about the bigger picture. And Codex, I don't think is much better when it's writing code, but when it reviews, it almost always is like, you've implemented this pattern, but it fits nicely if you just rebuild this system a little bit. And that just keeps your code base, like, away from all the vibe coding sins of having 10 format-date functions all over your code.
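The review prompt CJ describes here could be assembled along the following lines. This is only a minimal sketch, not his actual setup: the `ReviewBucket` type, the `buildReviewPrompt` helper, and the bucket titles are hypothetical names chosen for illustration, and the question text is paraphrased from what he says above. The optional flag mirrors the "be extra critical" nudge he mentions later in the conversation.

```typescript
// Hypothetical helper for illustration; not CJ's real "Carl" configuration.
interface ReviewBucket {
  title: string;    // short label for the bucket (assumed naming)
  question: string; // what the reviewing model should report on
}

// The three questions listed in the episode, paraphrased.
const REVIEW_BUCKETS: ReviewBucket[] = [
  {
    title: "Artifact fidelity",
    question: "For the plan or diagram artifacts we have, does the code accurately reflect them?",
  },
  {
    title: "Code smells",
    question: "Are there any general code smells?",
  },
  {
    title: "Better approach",
    question: "If we did this again with a different approach to improve the code base overall, what approach would be best?",
  },
];

// Builds a plain-text prompt: one header line plus one numbered line per bucket.
function buildReviewPrompt(buckets: ReviewBucket[], extraCritical = false): string {
  const lines: string[] = [
    "Take a look at our current git diff and give me a report on the following:",
  ];
  buckets.forEach((bucket, i) => {
    lines.push(`${i + 1}. ${bucket.title}: ${bucket.question}`);
  });
  if (extraCritical) {
    lines.push("Be extra critical.");
  }
  return lines.join("\n");
}

console.log(buildReviewPrompt(REVIEW_BUCKETS));
```

Keeping the buckets as data rather than a hard-coded string makes it easy to add or drop a review question without rewriting the whole prompt.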
B
Yeah. So I love this. I was going to say, like, twin stars, because one of the things that I do when I vibe code too close to the sun, which is I harness the power of Claude Code or whatever, and I just bite off a, like a feed, like a big, big old thing. And if you've ever done this with AI, you know, either Claude Code or Cursor or whatever, and you sort of have a general idea of a feature, but then you're specifying the requirements as you go, as you see it, you sometimes end up with a monster diff. And what I've done a lot with that is I say, okay, this is basically what I want. Now go write me a plan to reimplement this in a sane way and then let's completely rebuild it. And so you can do this, like, review it and tell me how you'd do it better. You can also say, like, this is a reference code base of kind of what I want to achieve. Let's go actually build a plan to build it in a more extensible, scalable way. And I found that to be a really useful flow as well.
A
Oh, I like that. It's almost like you're almost telling it like, hey, this isn't the real thing, hypothetically.
B
Yeah, it's kind of like code as spec, where it's like, now that code is so cheap to generate, you can say, generate a bunch of code. This isn't production. I'm fine throwing it away. Now go build, like, a clean version of it. So that is a version of this I think is useful. I also agree, though, that Codex is like kind of a really good curmudgeonly staff engineer that will look at your code and tell you what's wrong with it. So I like the model for this use case as well.
A
Every now and then I'll throw in like a be extra critical and then bringing that, bringing that prompt back to Opus, it gets a little sad. So I have to manage.
B
One of the things that I with, with the Google models, I also always used to say is they were like very smart, but clinically depressed. Like, they were so sad. Especially when you look at their reasoning. Sometimes I read it and it's like, oh, man, it's okay, man. We can be.
A
I can't get this to pass. It's not building.
B
So I want to look at this just for. Again. You said Codex can take its time, but it's going through and really checking if the feature aligns with the current code. It's identified some issues. useEffect, just haunting us from every corner of our apps. So that's a good one. And looking at some of the animations, which are probably pretty hard, just again, like, with our human eyes to parse and visualize and understand.
A
Great. Okay, Codex. I was actually surprised it took this long. So it's talking about the diagram. It's kind of going through and mentioning a mismatch. It's saying the wheel rotation adds some of the segment angles but the dots are defined at different angles. This makes the pointer land between the dots rather than on the dot, which I believe is correct. So it noticed essentially this discrepancy: we have a mockup that has the arrow landing on a dot, and over here in the app the arrow lands between the dots. So little things like that, particularly around the dots, checking the discrepancies, I really like when it finds. And then at the bottom we have this, like, if we refactored this again, let's pull some of these things out into components, let's make some constants. Kind of just some classic, you know, one-shotted vibe-coding tips. And oftentimes from here I'll actually just have Codex write it. Medium GPT 5.2 Codex, whatever the full model name is. I found it's fine at editing files and writing them. Previously, like, you know, when GPT-5 first came out and they were working on Codex, that would have taken like 15 minutes, so I'd hop back to Claude. But nowadays I would basically just say, great, please make those improvements. Maybe given more time I would think up a more thoughtful prompt, make a plan about this, all those things. But here I'll just kick it off.
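The kind of mismatch Codex flags here, where the pointer lands between dots instead of on one, tends to appear when the rotation and the dot positions are computed from separate formulas. The sketch below is an illustration under stated assumptions, not the code from the episode: the segment count, the pointer fixed at 0 degrees, and the names `dotAngle`, `rotationForDot`, and `landedAngle` are all hypothetical. It shows the "make some constants" suggestion applied so that both values derive from one shared definition.

```typescript
// Assumed wheel layout for illustration: 8 equal segments, pointer fixed at 0 degrees.
const SEGMENT_COUNT = 8;
const SEGMENT_ANGLE = 360 / SEGMENT_COUNT;

// Angle of dot `index` on the unrotated wheel, measured clockwise from the pointer.
function dotAngle(index: number): number {
  return index * SEGMENT_ANGLE;
}

// Clockwise rotation that brings dot `index` under the pointer after `fullSpins` turns.
// It is derived from dotAngle, so the two can never drift apart.
function rotationForDot(index: number, fullSpins: number): number {
  return fullSpins * 360 + ((360 - dotAngle(index)) % 360);
}

// Where dot `index` ends up, relative to the pointer, after rotating by `rotation` degrees.
function landedAngle(index: number, rotation: number): number {
  return (dotAngle(index) + rotation) % 360;
}

// Aligned case: the dot lands exactly under the pointer.
const aligned = landedAngle(3, rotationForDot(3, 5));

// Mismatched case: an extra half-segment offset (as if the rotation targeted
// segment centers) leaves the pointer halfway between two dots.
const misaligned = landedAngle(3, rotationForDot(3, 5) + SEGMENT_ANGLE / 2);

console.log({ aligned, misaligned });
```

With a single source of truth for the dot angle, a mockup change that moves the dots moves the landing position with it.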
B
Well, I mean you did spell it correctly so you did put some quality into, into this.
A
Yeah, I was, I was about to hit enter but okay.
B
So I think this is a really, really great flow and I would highly recommend it. You know, I think we're all trying to figure out, like, where does code review happen? There's also code review agents. There's also your CI/CD pipeline, which you said has a lot of guardrails around it, so nothing hits prod that's really terrible and is going to break the app. And I think this is just a great flow, especially for software engineers out there working on teams. This is such a great flow to say, hey designer, you gave me a spec, this is kind of what I'm going to build, are we good? If so, I'm going to go. And then same with this loop on kind of model-to-model evaluation, which is, if you're a more junior engineer, early career, and you're going to do your first couple PRs into a company, it's nice to get that preflight check from a smart model to just say, I thought about, oh, we could factor it this way, or I chose not to do this component that way. I think it's really useful. So this is a great, just solid software engineering flow. Love to see it. Okay, we're going to skip to lightning round questions. Thank you for showing us all the stuff that you're doing here. Let's talk about something fun. What are you most excited about right now in AI outside of all this coding stuff?
A
I'm very deep in the code world, but I really like Google released Genie 3 access the other day and I only. You only get like 60 seconds to play around in a world. But it's really fun and I can totally see, you know, five months from now, six months from now, if we can get a 10 minute version, I think they can go viral. I think a ton of people are going to have fun with them. I think that's like a big next step that isn't quite there but is super close.
B
Yeah, I. For those that don't know genie is this sort of like generate a explorable world. It sort of creates a video game style world that you can like walk through and look through for 60 seconds. I don't know if you're. Are you showing it? I don't think you're showing it right now.
A
Oh, let me pop to this tab.
B
I can pull it up too. We can pull it up.
A
I have a claw to primed.
B
Is it like.
A
I think this is poly. I. I didn't know Polly wears a leather jacket, but.
B
Okay, so you used Nano Banana to, like, create an image, and then from that image, you can create a world. It's kind of amazing.
A
Yeah. Really interesting how I did not expect it to take an image and then make it. Yeah, but they have this whole flow on Project Genie if you have the ultra. Yeah, I can't juggle all the account names but one of the high accounts at Google and it'll actually give you a prompt structure where you're describing the environment and then you describe your character. So I think for this I just said an animated lobster in the Matrix. I did not specify a leather jacket to be clear.
B
I guess in the Matrix they're all wearing leather jackets. So.
A
Yeah, maybe let's make him cooler. Make him cooler. Make the lobster be in a suit with sunglasses.
B
Oh, so it's an Agent Lost lobster.
A
Yeah, he can't, he can't be the good guy here. I will say their interface for this is really cool.
B
Yeah, it looks great. And I was playing it with my husband earlier and so for all the parents listening, one of the things we did, our kids are really into Greek mythology really into the Odyssey. We're reading the Iliad right now and my husband said, like, create, you know, a scene from the Trojan War, but no violence. No violence. So we could walk through what the camps looked like, but not have like Achilles, you know, on the ground and Hector, you know, all this stuff. So it's kind of cool.
A
That's really cool. Oh, yeah, this is. Yeah, he's backwards.
B
He's backwards. But that's okay.
A
Yeah, we'll just hop into Create the World. Let's hope Genie identifies these backwards and flips them around because this is. This is like Harry Potter when. What was the character that had the villain on the back of his head?
B
Yeah, yep. The guy, he is. Was it the one with the turban or is it.
A
Oh, man, we're running. I didn't, I didn't know he'd be running.
B
He's running forward, but his sunglasses are on backwards towards his tail. It is. So maybe he's not backwards. Maybe his clothes are backwards.
A
I think he's got two. Oh, he has a. He has a mustache, kind of.
B
This is where your GPUs and your brightest research minds are applying their effort. So we can have a two sided, slightly backwards matrix, Agent Genie, lobster run through.
A
Yeah, it's definitely got. I will say, when they first released this, they released the best batch of examples they had, but that doesn't mean it's not fun.
B
Okay, coming soon. C.J. is going to become a game dev and this is going to be a 3D game in which you race to stalk me and interrupt a live podcast by joining.
A
Yeah, the goal. The goal is to join the latest How I AI podcast.
B
This is amazing. Okay, we're going to wrap up with my final question for you that I ask every, every guest. This is a great example. When AI is not listening, it's not doing what you want. It is putting your lobster tail on backwards. What is your prompting technique? Are you a yeller? What do you do?
A
I used to be a yeller, and I don't know when it was. Maybe it was a Gemini thing where, you know, I'd yell and it would get sad, but I started to feel bad about it. So I've almost started thinking about it like it's, you know, in a lot of the coding workflows, a junior developer, or whatever task it might be, you know, it's an assistant, something like that. And I very often am like, good try. You did your best. Here's what you did. And I kind of explain that. And then I'll say, here's what I was going for. And particularly with Claude, occasionally I'm like, my bad on the miscommunication. Like, I gave you a bad prompt. This is on me. But here's what we're looking for. And then I do find that that works pretty well when I'm trying to steer it. But I can't claim there are zero times where I'm like, what the hell, just fix it. You know? And you hop in there.
B
You know what a lobster looks like, man, right?
A
I've seen so many Nano Banana lobsters on Twitter this week that I know it knows the face is not backwards. Perfect.
B
Well, CJ, this was awesome. I think just super practical, really useful. I think a bunch of people are going to go out there. Can people use your Flowy? Like, is there a way to pull it into their own repo?
A
So I've been working on that. I think maybe by this weekend. We'll see how sidetracked I get trying to set up an OpenClaw bot.
B
Don't do it man. I'm telling you.
A
Well now, now I'm kind of scared it's going to start taking over my computer, but I'm going to try and get it released this weekend. Basically a set of skills around it and kind of like a first version that people can use and try and you know, I would love any feedback around that. This has been a play toy for me that kind of turned into something useful. So definitely want to make it available to all the AI engineers out there.
C
Great.
B
Well, we'll link it in the show notes. Well, CJ, thank you for joining. Where can we find you, and then how can we be helpful to you?
A
Mainly Twitter. I do a combination of tech posts and also just random one off thoughts. My Twitter handle is SE J A Y and then Hess and then I think I have the same setup on LinkedIn but that's pretty much everything I've got online. Feel free to hop in there, leave comments on my articles, yell at me, whatever.
B
Perfect. Well, thanks for joining How I AI. This is great.
A
Awesome. Thanks Claire.
C
Thanks so much for watching.
B
If you enjoyed this show, please like and subscribe here on YouTube or even.
C
Better, leave us a comment with your thoughts.
B
You can also find this podcast on.
C
Apple Podcasts, Spotify or your favorite podcast app. Please consider leaving us a rating and review which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.
Episode: How to build your own AI developer tools with Claude Code | CJ Hess (Tenex)
Date: February 9, 2026
Host: Claire Vo
Guest: CJ Hess (Tenex)
In this episode, host Claire Vo sits down with CJ Hess from Tenex to explore building personal AI developer tools using Claude Code. CJ demonstrates his hands-on approach to crafting developer workflows and tools—especially his app "Flowy"—around AI, revealing how to deeply integrate models like Claude into daily engineering tasks. The episode is technical yet accessible, focusing on practical steps to streamline development, increase productivity, and develop custom workflows for modern AI-powered coding. Throughout, both share tips, workflows, and candid commentary on the state and future of AI engineering.
“Working with Claude is just such a delight… it just feels so steerable. And… when I want to dig deep, it just does it.” — CJ Hess (03:32)
Hyper-Custom Workflow: Modern AI tools let developers build hyper-personalized workflows; pre-AI, choices were limited to standardized IDEs and linters. Today, devs can assemble their own AI tool ecosystems for productivity (04:47–05:29).
“You could have a totally different AI engineering workflow than your colleague… It’s making you individually a lot more efficient and effective.” — Claire Vo (04:47)
AI for “Chore Problems”: Tasks like fixing linter configs are no longer “forever problems”; Claude can quickly resolve formerly tedious set-up and environment issues (05:29–06:14).
“Just ask Claude code to do it. Say like help me understand this repo and get my computer set up to run…” — Claire Vo (06:14)
Evolution from Markdown to Visualizations:
“I really like this visual way to think about things, but I really hate staring at these ASCII diagrams… So I’ve played around with this tool to basically give Claude these JSON files.” — CJ Hess (07:40)
Flowy in Practice:
“You can then point Claude back at it and say, ‘Hey, I know you did this, but actually let’s say I want a step here…’”—CJ Hess (11:26)
“And I find that works better than something like Mermaid… because I really feel the power of building my own dev tools now—and that I really don’t want to hit the constraints of Mermaid…” — CJ Hess (13:35)
“It’s almost not worth spending the extra money anymore…everyone’s posting some product, some ridiculous pricing tier and saying ‘someone please vibe code this.’” — CJ Hess (15:08)
CJ’s Process:
“It’s almost like I want to see it visually and Claude wants to see it as markdown so we can kind of speak in our own way…” — CJ Hess (30:26)
Bypassing Planning: CJ increasingly skips explicit “plan” steps and asks Claude to build features directly from diagrams and mockups (“just build it”) (33:30–34:25).
“I’m going to skip the plan and say, based on the flowcharts and the mockups, build this feature…” — CJ Hess (33:30)
Iterative Review: Once code is generated, CJ tests it for functionality and iterates as needed (34:25–36:51).
Quality Checks: CJ uses GPT Codex (aliased as “Carl”) to review Claude's code, checking for discrepancies, best practices, or refactoring suggestions (36:51–44:01).
“At this point, I feel like I’m mostly a QA person, and if there’s something that’s logically wrong… Codex always finds those types of things.” — CJ Hess (37:07)
Prompts for Review:
Twin-Model Approach: Codex serves as a critical “staff engineer,” offering fresh and sometimes tough feedback for further iteration (40:52–41:48).
“Codex is like kind of a really good curmudgeonly staff engineer that will look at your code and tell you what’s wrong with it…” — Claire Vo (41:12)
On the power of building your own dev tools:
“This is a dev tool that was almost 100% prompted.” — CJ Hess (09:09)
On shifting expectations:
“Yeah, shifting expectations on these models.” — CJ Hess (10:24)
On developer autonomy:
“Of course we should build this. Until we hit some constraint, we should build it…” — Claire Vo (14:10)
On AI model feedback personalities:
“With the Google models, I also always used to say is they were like very smart, but clinically depressed…” — Claire Vo (41:32)
(45:29–52:36)
Most Excited About in AI:
Prompting Style When AI “Misbehaves”:
“I used to be a yeller… but I started to feel bad about it. So I’m like, ‘Good try. You did your best… here’s what I was going for… this is on me.’” — CJ Hess (50:01)
Can Others Use Flowy?
This episode exemplifies the future of developer tooling: deeply custom, highly interactive, and symbiotic with AI agents. CJ’s approach—using Claude Code to generate plans, visuals, and production code while augmenting the process with custom “skills”—shows how engineers can embrace AI as a creative, collaborative partner and even build tools the AI itself helps maintain and extend. Both host and guest encourage listeners to take a hands-on, experimental approach: try building your own, iterate constantly, and use the feedback loop between humans and AI as both a productivity engine and source of delight.