Podcast Summary: Behind the Craft

Episode: Full Tutorial: Use AI Agents for Coding AND Product Management | Eno Reyes (Factory)

Host: Peter Yang
Guest: Eno Reyes, Co-founder of Factory
Date: February 15, 2026

Episode Overview

In this episode, Peter Yang welcomes Eno Reyes, co-founder of Factory, the company behind the AI coding agent Droid. Eno gives a live demo and a deep dive into how experienced engineers use AI agents for software development and product management. The discussion covers Factory’s enterprise-first approach, best practices for working with AI coding tools, the changing nature of product management roles, competing in a crowded AI agent market, and how tools like Droid open up codebase access and collaboration to all roles, not just developers.

Key Discussion Points & Insights

1. Why Factory Focused on Enterprises from Day One

Timestamps: 01:24–02:52

Factory differentiated itself by addressing the needs of large companies (10k+ employees) and their engineering VPs, focusing on security, enterprise controls, ROI analytics, and integrations.
Reyes notes the distinction: “We set out on a long journey to build these AI systems… There’s sort of a lot of layers that make us feel maybe like the full enterprise solution to software development agents.” (01:36)
Factory’s Droid agent offers full-stack integrations beyond just the terminal or IDE, doing cross-company codebase analysis and surfacing blockers for agent-driven development.

2. Live Demo: Building a Speed Reading Web App with Droid

Timestamps: 02:55–07:11, 08:01–11:59

Reyes shares an example project: a fast reading (speed read) web app.
The workflow:
- Use the terminal extension for Droid, set “high autonomy” (agent can act with little user intervention).
- Paste in a conversational transcript as the build spec—Droid interprets and generates the prototype, autonomously planning, building, testing, and QA’ing changes, including opening browsers, taking screenshots, and running tests.
- Factory’s autonomy modes let users control agent permissions flexibly (allow all/some commands).
- Reyes shows how Droid supports multiple models (Opus, GPT-5.2, etc.) for planning vs. execution.
Quote: “Where Droid shines the most is on things like basically long running tasks … we’ve done a lot around things like compaction or compression and prompt caching to make the experience feel really nice.” (04:54)
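The autonomy levels described above can be pictured as a simple approval gate. The Python sketch below is illustrative only: Autonomy, Risk, and needs_approval are hypothetical names, not Factory's API, and the four levels simply mirror Reyes's description in the demo (approve every action, allow read-only commands, allow reversible commands, allow everything).

```python
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical risk classes for a command the agent wants to run."""
    READ_ONLY = 0      # e.g. listing a directory
    REVERSIBLE = 1     # e.g. editing a file under version control
    IRREVERSIBLE = 2   # e.g. deleting data outside version control


class Autonomy(IntEnum):
    """Hypothetical autonomy levels; each auto-approves risks up to its value."""
    OFF = -1     # the user approves every action
    LOW = 0      # read-only commands run without asking
    MEDIUM = 1   # reversible commands also run without asking
    HIGH = 2     # everything runs without asking


def needs_approval(autonomy: Autonomy, risk: Risk) -> bool:
    """Return True when the user must confirm before the command runs."""
    return int(risk) > int(autonomy)


if __name__ == "__main__":
    # Show which risk classes still prompt the user at each level.
    for level in Autonomy:
        gated = [r.name for r in Risk if needs_approval(level, r)]
        print(f"{level.name:<6} asks before: {gated or 'nothing'}")
```

Ordering both enums on one numeric scale keeps the gate to a single comparison, which is one way to offer more than the two extremes (approve everything manually versus run everything) discussed in the episode.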

3. Spec vs. Plan: How Droid Structures Work

Timestamps: 08:01–10:45

Droid has a unique “spec mode” (different from ordinary planning mode).
- Spec: What should be built.
- Plan: How to build it.
- Droid queries the user for clarifications (e.g., input formats, features), then proposes and saves a spec that can be edited in VS Code.
Reyes highlights the importance of separating “what” from “how,” enabling better collaboration and reusable specs.
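To make the spec/plan split concrete, here is a minimal Python sketch. It assumes a spec is just a goal plus a feature list and that a plan is derived from it; Spec, Plan, and plan_from_spec are hypothetical names for illustration and do not describe how Droid stores or processes specs. The feature names come from the demo, where Reyes deletes "party mode" from the saved spec before the agent continues.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Spec:
    """The 'what': the goal and the features the user agreed to."""
    goal: str
    features: List[str] = field(default_factory=list)


@dataclass
class Plan:
    """The 'how': ordered steps the agent derives from a spec."""
    steps: List[str]


def plan_from_spec(spec: Spec) -> Plan:
    """Derive a plan from the spec; editing the spec changes the plan."""
    steps = [f"Scaffold app for: {spec.goal}"]
    steps += [f"Implement feature: {name}" for name in spec.features]
    steps.append("Run linters, type checks, and tests")
    return Plan(steps)


# The user edits the spec (the what); the agent re-derives the plan (the how).
spec = Spec("speed-reading web app", ["chunk mode", "local storage", "party mode"])
spec.features.remove("party mode")
plan = plan_from_spec(spec)

if __name__ == "__main__":
    for number, step in enumerate(plan.steps, start=1):
        print(number, step)
```

Because the plan is always recomputed from the spec, the spec stays the single document a person needs to review or reuse.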

4. Models, Costs, and Validation Best Practices

Timestamps: 10:45–12:08, 12:08–14:28

Droid enables model selection mid-session (e.g., plan with Opus 4.5, execute with GPT-5.2), optimizing for both capability and cost.
Extensive focus on validation:
- Automatic linting, type checking, QA (including visual validation with screenshots and console message reporting).
- Quote: “Agents are fundamentally bottlenecked by the ability to validate their own work … Our view is that the agent is the one that needs to validate its work to move to the next step. The quality of the output is way higher.” (13:23)
Workflow smoothness: Higher autonomy can be set or changed on the fly for convenience or control.
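The idea that the agent must validate its work before moving to the next step can be sketched as a gate around each step. The Python below is an assumption-laden illustration, not Factory's harness: validated_step, the two toy validators, and toy_revise are hypothetical stand-ins for real linters, type checkers, tests, and an agent's revision pass.

```python
from typing import Callable, List

# A validator inspects a draft and returns a list of problems (empty means pass).
Validator = Callable[[str], List[str]]
Reviser = Callable[[str, List[str]], str]


def validated_step(draft: str, validators: List[Validator],
                   revise: Reviser, max_attempts: int = 3) -> str:
    """Return a draft only once every validator passes; otherwise revise and retry."""
    problems: List[str] = []
    for _ in range(max_attempts):
        problems = [p for check in validators for p in check(draft)]
        if not problems:
            return draft  # safe to move on to the next step
        draft = revise(draft, problems)
    raise RuntimeError(
        f"validation failed {max_attempts} times; last problems: {problems}"
    )


def no_todo_markers(code: str) -> List[str]:
    """Toy stand-in for a linter."""
    return ["contains TODO"] if "TODO" in code else []


def ends_with_newline(code: str) -> List[str]:
    """Toy stand-in for a formatter check."""
    return [] if code.endswith("\n") else ["missing trailing newline"]


def toy_revise(code: str, problems: List[str]) -> str:
    """Toy stand-in for the agent fixing the reported problems."""
    if "contains TODO" in problems:
        code = code.replace("TODO", "pass")
    if "missing trailing newline" in problems:
        code += "\n"
    return code


if __name__ == "__main__":
    fixed = validated_step("def f():\n    TODO",
                           [no_todo_markers, ends_with_newline], toy_revise)
    print(repr(fixed))
```

The point of the design is that a failing check blocks progress instead of being reported at the end, which matches the "measure twice, cut once" framing Reyes uses later in the conversation.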

5. Design System Integration and Grounding

Timestamps: 15:02–16:29

Droid is adept at reading and grounding its UI in a codebase’s established design system—automatically applying branding, styles, and components without explicit “design system” skills.
Quote: “What a lot of people underestimate is that building stuff that’s in your design system… doing it well is actually fairly difficult. Droid can do that quite well.” (15:42)

6. How (and When) to Use Skills, Hooks, Subagents, and MCP

Timestamps: 16:29–18:44

Reyes demystifies Factory’s extensibility: skills, hooks, sub-agents, and MCP (Model Context Protocol) are all supported and power shared workflows.
Usage insights: Most enterprise users rely on a handful of customizable skills and integrations, and Factory enables org-wide management of these at scale.
Quote: “From enterprises… a couple of people focus in on making skills, MCPs, tools for their whole organization or big teams… It’s just easy to get everyone in your 10,000 person company outfitted with a skill that meaningfully changes dev productivity.” (18:11)
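One way to picture customizations managed at the user, team, and enterprise level is a layered lookup. The Python sketch below is hypothetical: resolve_skills, the skill names' file paths, and the rule that narrower scopes override broader ones are all assumptions made for illustration, since the episode does not describe how Factory resolves conflicts between levels.

```python
from typing import Dict

# Maps a skill name to the (hypothetical) location of its definition.
SkillSet = Dict[str, str]


def resolve_skills(enterprise: SkillSet, team: SkillSet, user: SkillSet) -> SkillSet:
    """Merge skill layers; narrower scopes override broader ones (an assumption)."""
    merged: SkillSet = {}
    for layer in (enterprise, team, user):  # broadest first, narrowest last
        merged.update(layer)
    return merged


# Illustrative data only; these paths are made up.
enterprise_skills = {
    "product-management": "enterprise/product-management.md",
    "changelog": "enterprise/changelog.md",
}
team_skills = {"changelog": "team/changelog.md"}
user_skills = {"writing": "user/writing.md"}

effective = resolve_skills(enterprise_skills, team_skills, user_skills)

if __name__ == "__main__":
    for name, source in sorted(effective.items()):
        print(f"{name}: {source}")
```

With a scheme like this, a skill published once at the enterprise layer reaches every user by default, while teams and individuals can still swap in their own version.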

7. Product Management Skill: A Deep Dive

Timestamps: 19:00–22:34

Reyes demos his Product Management skill (used for reviewing PRDs, specs, design docs, feature prioritization).
- Gathers live content from Notion (native integration), including product principles, core value prop, “11-star experience” framework (inspired by Airbnb), templates, etc.
- Ensures generated product docs reflect Factory’s philosophy (“Our five-star experience two years ago is now our baseline.”).
Quote: “This is probably one of my favorite skills that I have. … The structure of it ends up looking a lot more like the types of things that if you’ve been in the room at Factory for a year, you would say, instead of just what Opus 4.5 is randomly opining on.” (21:12, see also start of transcript)

8. Evolving Roles: Product Engineer vs. “Regular PM”

Timestamps: 22:34–25:05

Factory’s org model only hires “product engineers”—no traditional PMs. All roles are expected to drive most workflows with AI agents.
- Even sales/AE roles are power users of Droid (using skills for analysis, customer work, CRM, etc.).
- Quote: “What it means to be really any role has changed a lot… it’s quite clear for us.” (23:44)
AI agents as general productivity overlays—less about code, more about being able to articulate tasks and review or plan in natural language.

9. The Terminal as an Overlay, Not a Destination

Timestamps: 25:05–26:32

Reyes explains why the agent UX moves beyond the old “all-in-one” IDE-centric mindset.
- Agents should act as overlays, not full-screen destinations: always-on, able to operate across desktop apps, files, workflows.

10. Competing in a Crowded AI Agent Space

Timestamps: 27:07–29:48

Factory (40-person team) competes successfully against much bigger, well-funded teams (Claude Code, Cursor, etc.).
- Key: Focus on hard enterprise problems (hierarchical controls, customizable security, “air-gapped” operation in highly secure contexts).
- Success on benchmarks stems from real enterprise datasets and solving tough problems, not public leaderboard hill climbing.
- Quote: “There’s just so much to be explored in AI for software development… just opening Twitter and reading a couple workflows, you realize the variance… is so high.” (27:46)

11. AI Agents for Legacy Code, Democratizing Access

Timestamps: 29:48–31:43

Major value: agents shine at detail-oriented, unpleasant engineering work (e.g., legacy code refactoring) and onboarding by reading, contextualizing, and surfacing insights from gnarly codebases.
Extends utility to QA, ops, data, and other adjacent roles—“democratizing access to what used to be a very complex and hard to understand topic.”

12. Rethinking Codebase Access and Enterprise Education

Timestamps: 31:43–33:37

Reyes advocates for broad codebase read access: “A bunch of companies are going to pay a huge cost to design decisions like ‘we’re not a monorepo’ or ‘limiting codebase access to only these personas’… it won’t age well into the AI era.” (31:57)
Educating enterprises about activation and advanced user journeys is as crucial as product features themselves. Power users become internal evangelists, driving broad adoption.

13. Droid for All: Getting Started

Timestamps: 34:06–34:40

Droid is free to use (with credits for new users): “Just go to Factory AI. … One line and you’re in.” (34:16)

14. Final Thoughts & Where to Find Eno

Timestamps: 34:40–35:16

Host praises Factory’s focus and speed: “It all comes down to focus. I’m super impressed with your progress.”
Find Eno on Twitter: @enoreyes

Notable Quotes & Memorable Moments

“Agents are fundamentally bottlenecked by the ability to validate their own work.” (13:23, Eno Reyes)
“At the beginning [Droid] does this grounding step… looking at our CSS, different pages, and it’s using that to ground its UI.” (16:22, Eno Reyes)
“The hardest software problems is like… refactoring these gnarly legacy codebases, right? … Droids just do that stuff pretty well.” (29:56, Reyes)
“What it means to be any role has changed a lot … you definitely need a willingness to drive your workflows with AI if you want to work in any role at Factory.” (23:44, Reyes)
“I think enterprises should just give everyone access to the codebase. … This stuff is not going to age well [without it] into the AI era.” (31:43, Yang, Reyes)
“You can just get it digested [by the agent] for pretty cheap.” (31:40, Reyes)

Episode Timeline (Selected Timestamps)

01:24 – Factory’s enterprise focus
02:55 – Kicking off live build demo
03:47 – Demoing Droid autonomy modes and interface
08:01 – Spec vs. plan; editing and validation flow in Droid
11:04 – Model selection for plan/execution and cost management
13:23 – Validation as core agent differentiator
16:00 – Design system auto-integration; agent “grounding” step
19:00 – Deep dive: Product management skill
22:34 – Only hiring “product engineers”; the modern PM role
27:30 – Competing as a lean team
29:56 – Solving “the hard stuff” (legacy code)
31:43 – Rethinking access and onboarding in the AI era
34:16 – Getting started with Droid
34:57 – Where to find Eno

Conclusion

This episode offers a tactical and philosophical master class in deploying AI agents (like Droid) for both engineering and product management at scale. Eno Reyes provides actionable insights for enterprise teams and solopreneurs alike—emphasizing validation, adaptive workflows, custom skill-building, the blurring of traditional tech and product roles, and why the next generation of work will be driven by universally accessible agents rather than exclusive “power tools.” If you want to see what the near future of product leadership and software development looks like with AI at its core, this episode is a must-listen.

Find out more:

Factory AI: factory.ai
Eno Reyes on Twitter: @enoreyes

Transcript

[00:00]
Reyes
This is probably one of my favorite skills that I have. And what this does is it's basically when I'm doing things like reviewing PRDs, product specs, working on design docs, discussing feature prioritization, that PRD has our language, it has our ideas, our principles, and the structure of it ends up looking a lot more like the types of things that if you've been in the room at Factory for a year, you would say instead of just what like Opus 4.5 is randomly opining on.
[00:25]
Podcast Host
This is amazing, dude. Maybe you can share this with me privately or something. You only hire product engineers. Are you going to hire like a regular PM at some point?
[00:32]
Reyes
I think what regular PM means has totally changed. Like we have an AE who's the number three Droid user at the company. He's in sales. We're about to publish some interesting work about how Droid basically has passed the threshold of what we call like self improving.
[00:48]
Podcast Host
All right, welcome everyone. My guest today is co-founder of Factory, a popular AI coding agent called Droid that works with any terminal or IDE. Today he'll show us how great engineers actually work with AI agents by giving us a live demo. And we'll also talk about the crazy competitive AI coding space and what actually has product-market fit in that space. So welcome, sir.
[01:09]
Reyes
Hey, thanks so much for having me. I'm really excited to be here and there's nothing more I love chatting about than software development agents.
[01:16]
Podcast Host
All right, cool.
[01:17]
Reyes
Yeah.
[01:17]
Podcast Host
So maybe before we get into the demo, can you tell us a little about Factory and Droid, and kind of what makes it different from other AI coding tools?
[01:24]
Reyes
Yeah, totally. And you know, Factory, we've been around for actually a surprising length. We've been, we're about two and a half years old and when we first started, you know, the world was not really comfortable with the concept of letting an agent just, you know, call it YOLO mode or whatever on your computer. And so we said we really need to orient around building products that will work in enterprise environments and build products that made, you know, VPs of engineering of a 10,000-person organization feel comfortable. And so we set out on a long journey to build these AI systems. We call it Droid, the sort of core agent. Our product doesn't stop at the terminal or the IDE or the web or desktop. Of course we have those surfaces. But we also provide tooling that helps you analyze your entire company's codebases to determine what's stopping agents from being successful. We give you ROI analytics, we give you enterprise controls. So there's sort of a lot of layers that make us feel maybe like the full enterprise solution to software development agents.
[02:24]
Podcast Host
Got it. Yeah. It's smart that you guys focus on enterprise from day one because I question the product market on the consumer side. So smart.
[02:32]
Reyes
Yeah, yeah, totally. There's, I think, a lot of optionality that people have. Like there's a million different coding agents and some of them are hackable, some of them are not. I think the people who care about quality have found their way to us. But if you're like cost optimizing or something else, you know, you may just go with a subsidized plan or an open tool.
[02:53]
Podcast Host
Awesome. Let's build something live. What do you think we should build?
[02:56]
Reyes
No, I love this idea. And I was thinking that maybe we would start sharing my screen. I was thinking maybe we could like Granola and actually record some of our back and forth on maybe a prototype of some form, a web app. What do you think? Is there a specific direction you wanted to take this?
[03:12]
Podcast Host
Yeah, we can just build a simple web app and maybe you can use some best practices of using Droid. You can show us how things work.
[03:19]
Reyes
Yeah, totally. Then maybe what I'd suggest is, and I'm going to pop this open, hide this, share my screen, and show this Granola transcript that I have right here. What we can do is we can build out an app that showcases a simple, fast reading application. I don't know if you've seen this viral thing on Twitter where, you know, you have like a book and you upload a bunch of documents and then it lets you read really quickly, like speed read, basically. Sound interesting?
[03:46]
Podcast Host
Yeah, that sounds good. Yeah.
[03:48]
Reyes
Cool. So typically what I recommend to folks is when you want to use Droid, you open up either the terminal or IDE extension. Here I'm just going to use the terminal and I'm using Ghostty. I love how quick it is. And basically, you know, we have a very simple interface, not very simple, but a fairly simple interface that lets you type in. If you've ever used a terminal-based agent, you'll have all the bells and whistles. We support things like skills, MCP, you know, hooks, et cetera. You can select your model. And I think one of the cooler parts about Factory is that, you know, we support nearly every frontier model as well as different levels of what we call autonomy, which is basically how much do you want to give the agent the ability to operate in its environment? Do you want to approve every action it takes? Do you want to allow only read-only commands, reversible commands, or everything? I'm going to turn it on high autonomy here and I'm just going to paste this transcript. This is like the transcript from Granola that we just had where I suggested we do this. This is something we do all the time. Let's build a prototype for this in this directory, please.
[04:54]
Podcast Host
I love it, man.
[04:54]
Reyes
And I'm just going to paste it. And a couple things that I think are interesting about Droid: you're going to see it plan, read, list directories. We make this really simple for you to see what it is actually doing at a high level as it works. But I think when you actually go under the hood, I think that where Droid shines the most is on things like basically long-running tasks, when you want it to run for not just a minute or 10 minutes, but really like an hour. It's too hard to show in a quick podcast, but we've done a lot around things like compaction or compression and prompt caching to make the experience feel really nice.
[05:30]
Podcast Host
And dude, I just want to mention one thing, like the fact that I can just like, I think I use Tab or something to pick like allow all commands versus allow some commands. Like that's much better UI than like, you know, like I love Claude Code, but like the default experience in Claude Code where it asks you for permission for everything, it sucks, man. Like, like I don't, I don't have, I don't like sitting around trying to grant it permissions, you know.
[05:50]
Reyes
Totally. I think that there's actually like a real like security and risk thing of if you give people two options, like I have to approve everything manually or dangerously run YOLO mode. Here you can see Droid is actually opening the browser for me autonomously and it's jumped in and it's basically testing out. You can see it's taking screenshots and QA'ing the work that it just did. So it's going to determine, did I adequately test what the user's doing? I don't know. You can see here.
[06:20]
Podcast Host
Is that using Playwright, or is it just like some native thing that you built?
[06:24]
Reyes
This is using Chrome DevTools, but by the time that this podcast airs, we've actually made this native. So the Droid for everybody will be able to, you know, browse, interact and see. It's basically confirmed that it's done and it gave me a little alert and I can iterate. So I think most people who've used a tool like this are familiar with this workflow, but I think that once you actually jump in, a lot of the nice quality-of-life things like the ability to create skills, manage your skills in one place, an MCP registry that contains most of the major tools that you'll use like Linear, Notion, et cetera, like one click away, really just make for a much nicer experience when you're developing. So if you want a strong multi-model harness, I think Droid is basically the like leading option there.
[07:12]
Sponsor Voice
This episode is brought to you by Linear. When engineers use tools like Cursor, Claude Code, and Codex, a lot of work happens invisibly. Someone can go from a bug report in Slack to a shipped fix without creating any record of what happened outside of the code editor. And that's fine for speed, but it makes coordination harder as you scale. Linear integrates directly with the very best agent coding tools, like Cursor and Codex. That way anyone can see what an agent is working on and who assigned them to the task. You get the speed of agents without losing visibility across the team. Product teams at OpenAI, Ramp and Block are all using Linear to collaborate with AI agents. And I use Linear myself to run my creator business. So check it out at linear.app/agents. That's linear.app/agents. Now back to our episode.
[08:02]
Podcast Host
It's typical best practice to write a little plan or spec first before you do this thing. But in this case, I guess our spec is just a Granola conversation.
[08:12]
Reyes
Exactly. But what I recommend is we actually have something called spec mode. And maybe the nuance here is basically, and you can do this by just hitting Shift+Tab. The nuance here is that when you're in spec mode, I'm going to say let's make this a more fully fleshed out product. What you're going to see is that in spec mode, a lot of agents call this planning mode, where you get a plan for what to do. Our view is that a plan is a little different from a spec. Like a spec is like what should be built and a plan is how you build it. We think the agent should figure out the how. You shouldn't be in plan mode, you should be in spec mode where you define. Basically here it's asking me questions about what input sources it should be able to use. I'm going to say all of the above. What reading enhancement features would you like? Maybe chunk mode and local storage. Any additional features? I could type my own answer here and say let's definitely have a party mode button. So I've answered all of its questions and it's going to propose a specification and when it proposes the spec to me, I have a bunch of different options. Like I can choose to edit it, I can open this up, you'll see that this is saved as an actual document. And so if I choose to manually edit, it'll actually open VS Code for me. So that I can jump in here and look through the spec, read through it, edit it, and after I've edited, I'm just going to delete party mode. Let's go ahead. You'll see that Droid will pull that spec in, reread the changes I've made, and kick off a plan to go further.
[09:49]
Podcast Host
Got it, got it. And this is after it's already built the initial version, right? Or it's like.
[09:54]
Reyes
So it's kind of better. Yeah, exactly. So we basically just specced out like a whole new plan of how to, of how to work.
[10:01]
Podcast Host
Got it. Yeah. This is awesome.
[10:03]
Reyes
So as it's iterating, you're going to see it's changing stuff. So obviously it's not going to work. React's hot reload is obviously awesome because it's going to keep hot reloading, but the moment that it completes its work, you can see it's asking for permission as it operates. I'm actually going to shift it to high autonomy so it stops asking me permission and I'm going to just let it cook.
[10:25]
Podcast Host
Oh, so you can actually shift it like have while it's actually working?
[10:28]
Reyes
Yeah, I can shift in and out of spec mode, I can shift the autonomy levels. I can actually change the model mid-session. So if I want to start and plan in, for example, Opus, but then execute with GPT-5.2, these are all settings that you can turn on. Or if you just want to switch mid-session, you can.
[10:46]
Podcast Host
Got it. Yeah. I do think being able to pick the model is important. I guess I kind of get comp access to a lot of this stuff, so I don't think about cost. But if you're running an enterprise, the cost really matters. Right? Because Opus is really pretty expensive. You don't want to run it up for everything.
[11:04]
Reyes
Yep, totally. And I think that there's also a lot of things that people are discovering now, which is, for example, GPT-5.2 Codex is extremely diligent. It's very good at validating its own work and it will run for a long period of time, but it doesn't have the same like sort of high-level planning intelligence that, you know, fairly subjectively, although we have some evals to back this up, Opus 4.5 has. And so there's a great way to sort of get the best of both worlds in model-agnostic harnesses, because you can actually say, look, Opus will plan and GPT-5.2 will execute. And that combo actually outperforms either alone. So a lot of what we try to do is actually make decisions like these way easier for you by setting sensible defaults, giving you a really solid experience. And of course the cost thing matters a lot for people. So being able to switch to a cheaper model or a more expensive model tends to be like a pretty pleasant experience.
[12:00]
Podcast Host
So do you have any like high level tips too? Like how would a real engineer use this versus like, you know, like a vibe coder? Right.
[12:08]
Reyes
Like, you know, totally. I think that probably one of the things that's most optimized for real engineering scenarios is Droid has a lot of both, like system injections, prompting as well as harness-level modifications to really heavily encourage validation of its work. We use this word validation a lot, but our view is that agents are fundamentally bottlenecked by the ability to validate their own work. Like Chrome DevTools is a great example of sort of QA and validating that the change it made actually visibly makes sense. Code has tons of these validators. You have linters, unit tests, type checkers. I don't know if you can see that it's continuously building, running dev, you know, linting, type checking. In this flow right here, the Droid is working. We think that we basically have done this probably to a higher degree than most, which is a big benefit for the actual product experience that people have. So here it's going to open this up. You can see it taking control. We've added some of the things that we mentioned, the ability to add content, full screen, et cetera.
[13:18]
Podcast Host
Yeah. So I don't have to remember to do all this testing manually. You just do it for me. Each time I ask you to do something like build something new.
[13:24]
Reyes
Exactly. The Droid will actually take screenshots of your product. It will QA it for you. It'll click through, it'll list console messages like are there any errors that popped up in the console? This is a lot of stuff that we think, you know, as somebody who is in product or somebody who is in data science, or even just someone who's not a front end or full stack engineer. If you're building prototypes or you're building, you know, straight up end to end real work as a production engineer, obviously you can know these things and everyone knows it's good to do them. But when your agent is the one that sort of says, no, I actually need to validate my work to move to the next step. The quality of the output is way higher. So I think a lot of people sort of, when they say, like, Droid, like, subjectively feels really good, what they're actually pointing towards is this idea that we validate the work very rigorously. And it doesn't really come at that much of a cost of spend or tokens because it's sort of the measure twice, cut once thing. A lot of agents are measuring once, cutting once, measuring again, cutting again. And for us, it's like, just validate the work iteratively and you'll get a much higher result.
[14:28]
Podcast Host
Got it, dude. Let's check out the app, man. So what does this thing actually do?
[14:32]
Reyes
Yeah, yeah. So this is a speed reading app that basically lets you go through. And I think the idea is that it helps you maintain comprehension as it works. I've noticed that it's doing two word chunks. So what I actually want to do is I want to see if I can change it to one word chunk. And so the idea is you can sort of read this as it goes. You know, it.
[14:59]
Podcast Host
Got it, got it, got it. So it reads much faster than having, like a huge paragraph.
[15:03]
Reyes
Yeah, exactly. So it's just sort of like a play app that you'd have. But I think that the thing that's sort of fun about this when you full screen is. I don't know if you can tell, but this is actually like already stylized. We have like a public website where we've got a lot of content. One thing that I like is that Droid is really good at picking up your code base's existing styling. So, like, this is our brand colors. These are our sort of like similar components to our actual design system. You know, the modules have our borders, the font is ours. And I think that what a lot of people underestimate is that building stuff that's in your design system, doing it well is actually fairly difficult. And so if you want to have like vibe coded things that just 0 to 1 a random code base, that's fine. Droid is fairly good at that. But when you have an existing code base, like our factory public web here, and you want to make modifications to it, you want to build a new app, you want to keep consistency of your design system, Droid can do that quite well.
[16:00]
Podcast Host
And I didn't have to, like, you didn't build like a skill or something like design system skill. It just does it by reading the code?
[16:06]
Reyes
No. Yeah. Like if you look back, there's no skill being invoked. It totally could, though. If you wanted to have a skill for your design system, you could. But I think that that's actually what's cool about Droid, is that at the beginning it does this grounding step, right, where it's actually reading through, it's looking at different layouts, it's looking at our CSS, it's looking at different pages, and it's using that to sort of ground its UI.
[16:30]
Podcast Host
All right, dude, well, let me ask you this. I'm going to throw you a curveball. So there is like all kinds of crazy terms, right? There's like skills, there's hooks, there's sub-agents. This is like for someone who's new, it's just super confusing, man. When do you actually use all this other stuff? Or can you just go back and forth with AI and just build something?
[16:48]
Reyes
Yeah, totally.
[16:49]
Podcast Host
Stuff, yeah.
[16:50]
Reyes
I think that this is such a hotly contested debate. We have full support of all of them. Right? So sub-agents, skills, MCP, hooks, slash commands and like a global config that lets you manage all this stuff. What we've seen is that clearly skills and MCP have by far the highest usage. And I think that this answer changes based on who you are. If you're a solo developer, I think that there's a lot of opportunity for you to like sort of build your own custom workflow with these things. My personal opinion is that we get a lot of mileage by just having a couple of skills that matter for things like data engineering, for things like building repeatable components and integrations. And I have a skill and a lot of the people on our team have a skill for like writing and like language that matches their voice when they want to use it to generate content. In terms of MCP, there are a ton of them and obviously we have a registry for things like Linear, Notion, Axiom, Datadog, Sentry, et cetera. My view is that skills might be just a better way to manage integrations, context. And so if you can get a skill for a given capability, that might be better than MCP. And hooks I think are really good if you are the type of person that loves to make their tool super custom. But from enterprises, what we've seen is that enterprises will have like a couple of people focus in on making skills, MCPs, tools for their whole organization or for big, you know, teams in their org. And because Factory is the only offering that lets you actually, from an enterprise perspective, manage who has what customizations at the user, team, and enterprise level, I think that a lot of power users end up getting converted over to Factory, because it's just easy to get everyone in your 10,000-person company outfitted with a skill that meaningfully changes their dev productivity on a daily basis.
[18:45]
Podcast Host
Oh, okay, so there's like a permission system or something.
[18:48]
Reyes
Yeah, permissions. And also just shared access to a ton of different skills, tools, MCP at the enterprise level.
[18:55]
Podcast Host
Can you, and you can tell me no on this, but can you actually show us a skill? Like can you show me your writing skill or whatever skill you want to show?
[19:01]
Reyes
Yeah, totally. Like, I have a couple here that are live on my prod, like changelog, codecanvas, product management, writing factory blog posts. So if I were to go and
[19:15]
Podcast Host
actually let's look at the product management one because there's a bunch of PMs.
[19:19]
Reyes
Yeah, of course, yeah. Can you open my product management skill file in VS Code? I could probably do that myself, but I use Droid for everything, so it's much easier to just say to Droid like open that file, please. And so there we go. This is, I actually think that a lot. Like this is probably one of my favorite skills that I have. And what this does is it's basically when I'm doing things like reviewing PRDs, product specs, working on design docs, discussing feature prioritization. I'll zoom in so it's easier to read. And what I've done is we have a bunch of source of truth documents. So we have our product principles, we have a core value prop, what we call the 11-star experience, which is taken from Airbnb. This is an awesome framework for thinking about. Basically, Brian Chesky was like, five-star Airbnb experience. They roll out the red carpet. It's great. You get the Airbnb, they give you the keys, they give you a bunch of cool things to do. That's the five-star experience. What's six-star, what's eight? What's 11? And 11 is Elon Musk personally takes you on the rocket ship yacht and you go to Mars. And so what this framework does is it lets you say, where are we today? And what is the baseline expectation of an amazing experience in your product? That is the bar. Now, what comes after that? What comes when you break that bar? And what's cool about Factory is in the last two and a half years we have slowly moved like our original 11-star experience, or at least the seven-star that we had two years ago is now our five-star experience. So like it's just the baseline expectation of what wasn't even possible in the. Like, maybe at some point in the future this will work. Is now what we expect the average user to have in our product. So it's a really cool framework. Yeah. So you know, anyway, tons of docs: product positioning, how we build prioritization frameworks, templates. And what you do is you basically pull all these Notion docs together.
Factory has a native Notion integration, so you don't need the MCP, you just integrate it for your whole company and it handles permissions. So it'll pull all that data and then it'll use that for things like PRD reviews, guiding the language and has
[21:27]
Podcast Host
a couple of examples, but bro, which one? I guess it calls different notes because this is probably a lot of Notion docs, right? So does it call different Notion docs based on what you want to do, like build a PRD?
[21:40]
Reyes
Yeah, exactly. So basically what this is, is you can think of it as like a map, almost, of our most important documents, and these are like shared sources of truth. And if our company was purely people in GitHub, I would probably put these in Markdown in GitHub. But we have folks that use Notion. Like our AEs, our ops. You know, most of our product team is actually product engineers, so they're all engineers. But we pull all this stuff together, and then what happens is based on what you're working on. So if I say, like, I would like to write a PRD about this new thing, that PRD has our language. It has our ideas, our principles, and the structure of it ends up looking a lot more like the types of things that, if you've been in the room at Factory for a year, you would say, instead of just what, like, Opus 4.5 is randomly opining on.
[22:29]
Podcast Host
This is amazing, dude. Maybe you can share this with me privately or something. I can copy this thing so I can make.
[22:35]
Reyes
Oh yeah, for sure. I'd be happy to. We can maybe attach it to the video and, like, share it with anyone who's listening.
[22:41]
Podcast Host
Yeah, that would be amazing. I've always wanted to build a product management skill, and you mentioned one thing that's a little bit innocuous, but I think has a big impact. You mentioned that you only hire product engineers. So are you going to hire, like, a regular PM at some point, or, like, you want people with both engineering and PM?
[22:56]
Reyes
Yeah, well, I think it's funny because I think what "regular PM" means has totally changed. So my view here is, and it's compounded by the fact that what we build is a software development agent. So even our AEs, like, we have an AE who's the number three Droid user at the company. So he's in sales, right? But he is still the number three user of Droid. He does everything from Droid. He does customer research. He puts together skills for analyzing customer usage data to determine how he can help provide better experiences for his customers. He uses it to track his deal flow. He has Salesforce connectors, so everything in his life is operated by Droid. I think that a lot of people underestimate this aspect of software development agents, and I think it's because of maybe, like, the terminal-dominant UI. But, you know, our view at Factory is that software development agents are basically the next generation of general AI systems. And so it's no secret software development agents are advancing basically everybody's capabilities, not just software engineers'. Ask anyone at Cursor, Anthropic, or OpenAI, they'll all admit that most people at the company are using their software development agent for productivity gains. And so it's quite clear that for us, what it means to be in really any role has changed a lot. If you have no experience as a software engineer, you can still be in product at Factory. That's totally fine. But I think you definitely need to have a willingness to drive most of your workflows with AI if you want to work in any role at Factory.
[24:35]
Podcast Host
Dude, and I think your AE probably doesn't know how to read, like, code syntax and stuff like that, right?
[24:40]
Reyes
He's a bit of an outlier, so he definitely does. However, most of our AEs definitely do not. They're not in VS Code, they're not trying to live and operate their life via the terminal, but they still use
[24:50]
Podcast Host
Droid because I think it's just, like, a higher level of abstraction. It's almost like engineering. You need to understand some of the technical stuff, but you're basically trying to talk and plan this stuff out in English, right? It's not like you gotta know about for loops and while loops, you know, all this kind of crap anymore.
[25:06]
Reyes
Yeah, 100%, yeah. I think that it's funny because we were just talking about this internally. I think people think of the terminal as sort of this destination or place because they're used to thinking of the IDE as this destination. And what I mean by that is, like, the IDE contains this encompassing view of all the information that's helpful when you're coding, but that also builds walls around the IDE as a concept. Right? It may be easier for a software developer to operate inside those walls, but it really sucks you in. You open the IDE full screen, it's got all these crazy screens, debuggers, it's got 50 buttons. And this complexity is definitely intentional because it's a power tool, but it changes how you interact with it. And our view is that the terminal, or a native app for agents, is not necessarily a destination in and of itself. It's not your full screen. It's more of an overlay. It's this thing that lives on top of the rest of your computer. And, you know, something that you keep open all the time, something that has access to the file system, the apps, the desktop, I think that this is a better indicator of where the future is going. Like, these software development agents are just general computer-use agents. And so most people who work on computers could benefit from having a little overlay in the upper left-hand corner of their computer that they can talk to and basically ask to do nearly any task for them. And it should just work.
[26:33]
Podcast Host
Yeah, I mean, like, all white-collar work is done on computers and, you know, code is how computers work. So it kind of makes sense.
[26:41]
Reyes
Yeah, yeah. Like, software is sort of the physics of AI agents. And so it definitely behooves them to be good at manipulating their own physics, their own world. And I think that's also why software development agents have moved way faster than other fields, because they're also made of software. So the self-learning bootstrapping is very clear. We're about to publish some interesting work about how Droid basically passed the threshold of what we call self-improving.
[27:07]
Podcast Host
Wow. So let's talk about something which is you're a pretty small team, right? How many people are in the company?
[27:15]
Reyes
We're 40.
[27:16]
Podcast Host
Okay. And you're competing against Claude Code and Cursor, these super well-funded companies. And dude, I'm super impressed that you guys are number one on Terminal-Bench with a much smaller team. So how do you do it, man? Any secrets?
[27:30]
Reyes
No, totally. I mean, I think that there is a funny thing. All the resources in the world, as we all know, cannot necessarily purchase a product experience that's fully crafted for your ICP. I think that there's two angles here that are important. The first is that the Cursor team, the Anthropic team, the OpenAI team, I mean, incredible. You know, we work with them all the time, they're all awesome. Every time I've met all these folks, they're total class. So one thing is you have to hold two things in your head. There is a huge, well-funded, very smart group of people also building in this space. But at the same time, I think that there's just so much to be explored in AI for software development that effectively just opening Twitter, reading a couple people's workflows, you'll quickly realize the variance in what a good AI software development agent or a good workflow is, is so high that there's just so much to build. And so for us there's two things that really matter. The first is just a relentless focus on customer and ICP. So there are features in Droid that make no sense for a solo developer. Things like the enterprise hierarchical controls, some of how OTel works. Like, you can actually run Droid in the most air-gapped environment. You could run it in a submarine if you wanted to, as long as you had a GPU. I think that level of control, flexibility, customization doesn't really sell well to an individual developer. However, I think this is what has made us more capable in general: because we've built all these things, we have gotten access to customers that have incredibly difficult and very sophisticated software problems to solve. So we get to basically hill climb not only on public benchmarks, and in fact we actually don't really hill climb on public benchmarks. Like, our performance on Terminal-Bench is not because of Terminal-Bench, it's because of a separate dataset built of more realistic enterprise customer data.
And so that, I think, has been a huge boon for us, basically being able to work on much harder software problems. And if you solve those, a lot of the, they're not necessarily simpler, but maybe more straightforward problems, like full-stack development, you know, zero to one, etc., sort of come naturally.
[29:49]
Podcast Host
Yeah, and, like, the hardest software problems are, like, what? It's like refactoring and, like, you know, these gnarly legacy codebases, right? It's like all the shit that engineers
[29:56]
Reyes
don't want to do. It's that exactly. Like, I think that there is just so much crap involved in software, and no one wants to be the guy or the girl, like, refactoring a COBOL legacy codebase that's like 15 years old. Everyone who's touched it is either gone or, you know, not working on it anymore. And Droids just do that stuff pretty well.
[30:19]
Podcast Host
Yeah, dude. I think the best part about, you know, Droid and some of these other AI agents is, like, it's very detail-oriented in just reading and understanding the codebase. Because if you onboard a new human developer, it's going to take them a long time, man, to figure out what the hell is going on with the codebase. Especially if it's, like, a mess.
[30:36]
Reyes
Yeah. And I think that that's one of the coolest things we've seen. So there's a customer that we deployed at. We went from zero to 10,000 people in a couple months, basically. And one of the ways that we did this was just by enabling not just software engineers, but really everybody who wanted access, and saying to them, look, if you are someone who is anywhere near the software process, open this tool up and just start asking it some questions that you've been wondering about the world that you operate in. Like, there's so many people who, because it's very costly time-wise to, like, learn coding or learn big aspects of software, but their work is so consequential to the delivery of software: ops, product, QA, DevOps, like, you know, data science. They sort of know how some of the software engineering stuff works, or maybe they know pretty well, they just haven't invested time in learning. Droids just make this so much easier. So it does really feel like democratizing access to what used to be a very complex and hard-to-understand topic. You can now just get it digested for pretty cheap.
[31:43]
Podcast Host
Yeah, I think enterprises should just give everyone, like, access to the codebase. Like, maybe not write access, but at least, like, read access to just figure out what the hell's going on, because then you can ask a bunch of questions to Droid or these other agents instead of bothering other people.
[31:57]
Reyes
Yeah, I mean, a bunch of companies are going to pay a huge cost for design decisions like "we're not a monorepo" or "we're limiting codebase access to only these personas." This stuff is not going to scale well. It won't age well into the AI era. So it's a lot to think about if you're an engineering leader and probably
[32:15]
Podcast Host
like, because you're focused on enterprise, which I totally think is the right thing to do, probably a lot of what you're doing is just, like, educating. Because, you know, if you're trying to sell Droid to Anthropic or something, that makes sense. They understand what's going on. But a lot of these enterprises, like, you know, Accenture or whatever, they don't know any of this shit, right? You have to train them. You have to train them how this stuff actually works.
[32:37]
Reyes
Yeah, and that's a big part of it. Also, when you're building a product for enterprise, I think you have to be thinking about not just does the product work and is the user journey very clear, but also how does a user become a power user. Like, what is the activation, and then what is basically the secondary activation that happens post "I'm using the tool," when it's "now I'm really using the tool." And for us, we've seen there's, like, a usage-based activation of "I'm sending messages pretty frequently, so I clearly like the product." And then there's a customization activation, which is "I've uncovered skills and hooks and MCP and tools." Those power users in the enterprise very quickly become evangelists. They're sharing it with everybody. They're so excited to use Droid. They start bringing more and more of their work into Droid. And I think that that's the most fun, seeing enterprises that most would say are, like, quote-unquote legacy doing cooler stuff than what you see on Twitter. Which is fun.
[33:38]
Podcast Host
Yeah, it's kind of like the people who are using Claude Code to run their life. Except for an enterprise, right?
[33:44]
Reyes
Yeah, exactly. And the enterprise, at the very least, is actually better suited for this sort of stuff. It's still a huge pain to connect your Gmail to Claude Code or to Droid, but you can actually pretty easily connect Outlook and Excel and all this other stuff to Droid. So, yeah, it's a lot easier to operate your, like, work OS from Droid.
[34:07]
Podcast Host
Awesome, dude. So people are excited about Droid and, you know, building a product management skill. So Droid is free to use, right? Like, where do people go?
[34:17]
Reyes
Yeah, just go to Factory.ai. We've got the CLI link right there. And if you sign up, we can give you a bunch of free usage to get started. There are some really exciting things, depending on when this airs, around free usage that I think a lot of people are going to be excited about, and then we have a bunch of plans for all sorts of options. So it's really easy to get started. Just one line and you're in.
[34:40]
Podcast Host
All right, dude. Well, I mean, I have thoughts about a bunch of vibe coding companies out there, but I think, you guys, like, it all comes down to focus, man. And I'm, like, super impressed by the progress that you've made. So yeah, I definitely highly encourage everyone to give Droid a try. And also you have some really great talks out there. So, like, should people find you on Twitter, or where can people find you?
[34:58]
Reyes
Yeah, you can just, it's Eno Reyes, E-N-O R-E-Y-E-S, and you can see all my Twitter escapades.
[35:08]
Podcast Host
We got a speed read. You got to do the speed read thing with the Twitter APIs. You can just read all the rage bait tweets.
[35:17]
Reyes
Exactly. That's a great call. Just a daily dose of very fast rage.
[35:22]
Podcast Host
Yeah, and they just become a very demented person. But anyway, yeah, cool, man. All right, dude. Stay in touch, man.
[35:28]
Reyes
Yeah, thanks so much. Bye.
[35:31]
Podcast Host
I did.