Spec-driven development: The AI engineering workflow at Notion | Ryan Nystrom - How I AI


How I AI – Episode Summary
“Spec-driven development: The AI engineering workflow at Notion”
Host: Claire Vo | Guest: Ryan Nystrom (Notion, Engineering Manager)
Date: May 11, 2026

Episode Overview

In this episode, Claire Vo welcomes Ryan Nystrom, engineering manager at Notion, for a practical, inside look at how Notion leverages AI to radically speed up and improve core engineering workflows. The conversation covers how AI-driven agents automate everyday developer tasks, enable fully prepped meetings without manual effort, facilitate background code generation, and support spec-driven autonomous development at scale. The tone is conversational, hands-on, and full of energy, focusing on real, copyable workflows.

Key Discussion Points & Insights

1. The Impact of AI on Engineering Work

Personal Workflow Disruption
- Ryan describes how AI has transformed his habits after a decade of consistent engineering routines, creating “so much like joy and freshness and like newness” in his work.
- “I’ve changed IDEs, terminals, tools like whatever, like 10 plus times… The last year I've changed IDEs... But it's... so energizing for me.” (03:11, Ryan)
Cultural Shift
- Both Ryan and Claire note a universal feeling among engineers: AI makes work faster, more playful, and more satisfying.
- “Every How I AI guest has come on and said I'm having more fun, I'm working faster and everything is different.” (04:08, Claire)

2. Automating Team Operations with Notion AI

Automated Standup Preparation
- Ryan’s team uses a Notion AI custom agent (“Hot Potato”) to automatically compile detailed daily standup notes.
- The agent scans Slack, closed tasks, pull requests, and the previous day’s meeting notes, then assembles a contextual “pre-read” for discussion.
- KEY BENEFIT: No prep required—agendas and all relevant metrics arrive just before the meeting, letting engineers code until the last second.
- “I can basically work up until like the minute of our meeting without having done a bunch of like prep.” (08:28, Ryan)
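The lookback step can be pictured as a simple time-window filter. This is only a minimal sketch in Python: the real agent is configured through natural-language instructions, and the function names, the item shape (`source`, `at`, `text`), and the sample entries here are all invented for illustration.

```python
from datetime import datetime, timedelta, timezone


def recent_items(items, now, window_hours=24):
    """Keep only items whose timestamp falls inside the lookback window."""
    cutoff = now - timedelta(hours=window_hours)
    return [item for item in items if cutoff <= item["at"] <= now]


def group_by_source(items):
    """Group item text by the source it came from (Slack, tasks, PRs, ...)."""
    grouped = {}
    for item in items:
        grouped.setdefault(item["source"], []).append(item["text"])
    return grouped


# Made-up sample activity; a real run would read from Slack, tasks, and PRs.
now = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)
activity = [
    {"source": "slack", "at": now - timedelta(hours=3), "text": "Mock server fix landed"},
    {"source": "pull_requests", "at": now - timedelta(hours=30), "text": "Old cache PR"},
    {"source": "tasks", "at": now - timedelta(hours=20), "text": "Closed: shard jest tests"},
]

# The 30-hour-old pull request falls outside the window and is dropped.
pre_read_inputs = group_by_source(recent_items(activity, now))
```

Keeping the window explicit makes it easy to widen it, for example after a weekend, without touching the grouping logic.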
Meeting Quality & Engagement
- The new workflow eliminates “glassy-eyed” participation; instead, meetings focus on real problems, recent findings, and decisions.
- Claire notes: “There's a really big difference between a good stand up and a bad stand up... If you can have high bandwidth, high quality meetings with high frequency without the overhead, I think you can get better, more detailed work done collaboratively.” (09:00, Claire)
- The automation democratizes participation, surfacing contributions from quieter team members.
- “It sort of democratizes that sharing so that... brilliant and super talented [engineers]... this kind of raises everything up to the top.” (11:30, Ryan)
Reducing Burnout, Boosting Joy
- Removing meeting-prep drudgery allows managers and engineers more time for coding and creative work.
- “You're spending half of your time just compiling information, synthesizing it, writing reports, like doing all this stuff... it's so draining. And now I feel like I'm in a sweet spot...” (14:27, Ryan)

Notable Quote

“I get a lot of joy out of working with people... I love solving hard problems together and then I like building stuff.”
(15:02, Ryan Nystrom)

3. Behind the Scenes: Building the Custom AI Agent (“Hot Potato”)

Agent Operation & Design
- Scheduled to run daily, the agent applies subagents (fan-out/map-reduce style) to pull metrics, gather updates, and compile feedback.
- Ryan gives it contextual permissions—read-only on most databases, write on the meetings database.
- The agent posts fun, brief, slightly “quirky” Slack summaries to keep engagement high.
- Even setup was AI-driven: “I literally gave it a screenshot of the Honeycomb query... can you just update your instructions?” (19:48, Ryan)
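The fan-out/map-reduce shape described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Notion's agent is implemented: the four `fetch_*` functions are hypothetical stand-ins for subagents, and their return values are placeholders rather than real data.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for subagents. Each returns (section heading, bullets).
def fetch_ci_metric():
    return ("CI speed", ["(latest CI time would go here)"])


def fetch_channel_updates():
    return ("Progress", ["(updates found in the project channel)"])


def fetch_closed_tasks():
    return ("Changes", ["(tasks closed in the last 24 hours)"])


def fetch_previous_meeting():
    return ("Questions", ["(open questions from yesterday's meeting)"])


SUBAGENTS = [
    fetch_ci_metric,
    fetch_channel_updates,
    fetch_closed_tasks,
    fetch_previous_meeting,
]


def build_pre_read(subagents):
    # Map: run every subagent concurrently; pool.map keeps the input order.
    with ThreadPoolExecutor(max_workers=len(subagents)) as pool:
        results = list(pool.map(lambda agent: agent(), subagents))

    # Reduce: merge the partial results into one sectioned document.
    sections = {}
    for heading, bullets in results:
        sections.setdefault(heading, []).extend(bullets)

    lines = []
    for heading, bullets in sections.items():
        lines.append(heading)
        lines.extend(f"- {bullet}" for bullet in bullets)
    return "\n".join(lines)


pre_read = build_pre_read(SUBAGENTS)
```

Because each gatherer is independent, a slow source (a metrics query, say) does not hold up the others; only the final merge waits for everything.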
Iterative, Natural-Language Automation
- Adjustments are easy: “Just change the natural language, redo the order, change the trigger, give it more access to data and then it’s ready to go.” (20:41, Claire)
- It’s not about saving hours, but reclaiming and protecting mental focus.
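The permission setup from this section (view-only on shared databases, edit only on the meetings database) amounts to a small least-privilege table. A minimal Python sketch follows; the resource names and the `can` helper are hypothetical and do not reflect Notion's actual permission API.

```python
from enum import Enum


class Access(Enum):
    VIEW = 1
    EDIT = 2


# Hypothetical resource names. Only the meetings database is writable,
# because that is the one page the agent needs to update.
AGENT_GRANTS = {
    "tasks_db": Access.VIEW,
    "projects_db": Access.VIEW,
    "meetings_db": Access.EDIT,
}


def can(grants, resource, needed):
    """Return True if the grant on `resource` covers the `needed` access level."""
    granted = grants.get(resource)
    if granted is None:
        return False  # No grant at all: deny by default.
    return granted.value >= needed.value
```

For example, `can(AGENT_GRANTS, "tasks_db", Access.EDIT)` is `False`, which matches the stated goal of keeping the agent from modifying databases that the rest of the company relies on.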

Notable Quote

“The tedium that this removes for me is... like 20 minutes a day... but it’s like protecting my brain from like having to context shift... It’s soul sucking.”
(21:52, Ryan Nystrom)

4. Accelerating Code with Background Agents

From Spec to PR in Minutes
- Internal Notion workflow lets engineers @-mention the Codex agent in Notion tasks, triggering a VM (“Boxy”) that reads the prompt, builds the code, and files a PR/end-to-end preview—often within 10-20 minutes.
- Real example: Adding a “copy link” feature after a friend’s text suggestion—built and reviewed by the agent in about 20 minutes, including automated screenshots of UI verification and multi-step type fixes.

Notable Quotes

“I just described the task… and then this new thing we built, I can actually mention Codex from within our comments... it replies with a pull request link and a preview URL… built the entire thing.”
(25:07, Ryan Nystrom)

Changing Code Review Dynamics
- Honest, direct feedback—no more pretending expertise.
- “I literally don't know what I'm doing here. You need to explain this to me. Especially doing all this CI stuff... you gotta, you gotta like explain it like I'm a five year old.” (29:59, Ryan)
- Claire: “Imagine sending that to somebody. Oh yeah, win teammates. Be like, no, look it up.” (29:24, Claire)

5. Spec-Driven Development at Scale

Spec as Source of Truth
- Notion now stores detailed markdown specs in the repo—starting as spoken notes transcribed via Whisper, then codified and refined by Codex.
- The agent generates comprehensive, testable specs. Then, pointed at the spec file, Codex can "one shot" entire implementations, including automated verification.
- “I then opened up Codex again, pointed it at this spec file and I said, build it. And basically one-shotted this because the entire spec file is so comprehensive...” (34:37, Ryan)
Living Docs, Continuous Evolution
- Specs are versioned, serve as both change log and reference, and can be consumed by other teams (e.g., for documentation/marketing).
- Updates are made at the spec-level—“update the specs, don't update the code.” (39:45, Claire)
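One way to make a spec "testable" is to write its behaviors as checkboxes that a verification step can read. The Python sketch below assumes such a format; the episode does not show the actual spec layout, so the spec text, the checkbox convention, and the `unverified` helper are all hypothetical.

```python
import re

# Hypothetical spec text. The checkbox convention is an assumption.
SPEC = """\
# Copy link (hypothetical spec)

## Behavior
- [x] A "Copy link" action appears in the page menu
- [ ] Clicking it copies the page URL to the clipboard

## Verification
- [ ] End-to-end test covers the clipboard contents
"""

CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$")


def unverified(spec_text):
    """Return (section, criterion) pairs that are not yet checked off."""
    pending = []
    section = None
    for line in spec_text.splitlines():
        if line.startswith("## "):
            section = line[3:]
            continue
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            pending.append((section, match.group(2)))
    return pending


remaining = unverified(SPEC)
```

A check like this could run in CI so that an implementation is not considered done while any criterion in the spec remains unchecked, which keeps the spec rather than the code as the thing people edit.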

Notable Quote

“Our job as engineers [is] evolving into systems thinkers and architects... not even just necessarily writing like the spec and thinking about the behaviors, but most importantly, is like the verification loop.”
(36:56, Ryan Nystrom)

6. Tools & Prompts: Codex vs. Claude, Prompting Strategies

Codex Endurance
- Ryan prefers Codex over Claude for its ability to persist across long-running tasks, handling larger context windows without "losing the plot."
- “Anytime Claude Code like filled up its context window, it would just kind of lose the plot... Codex can grind for hours...” (40:09, Ryan)
Prompting Philosophy
- Don’t be shy—admit when lost and ask for direct explanations or push back for deeper justifications.
- “You’re wrong, you need to defend your argument... I just need the cited hard argument against it... and a lot of times... I don’t know what I’m doing...” (45:35, Ryan)

Memorable Moments & Quotes (with Timestamps)

“I literally don't know what I'm doing here. You gotta explain it. Like I'm a five year old.” (00:00 & 29:59, Ryan Nystrom)
“No more waiting for the meeting, no more waiting for review.” (00:27, Claire Vo)
“Give me the whole triangle.” (01:07, Ryan Nystrom & 23:45, 23:47, Claire Vo)
“We have this agent called ‘Hot Potato’... We’re going to make the potato like a rocket ship.” (16:32, Ryan)
“I used the agent to like set itself up because I was like, here's the query. I literally gave it a screenshot of the Honeycomb query... can you just like update your instructions?” (19:48, Ryan)
“I appreciate about this versus old era of more deterministic workflow style builders is the updates are so easy to make.” (20:41, Claire)
“It just feels like a more relaxed way to work. Yeah, we're happy.” (23:13, Claire)
“Spec is the source of truth. The spec as the change log, I think is a really interesting model.” (36:15, Claire)
“This is the era of the hard skill… how do you write code? How do you write automations?” (15:44, Claire)
“If you are not… spending time on making that [CI pipeline] fast… you're not going to get the benefits of AI that you could.” (44:26, Claire)

Timestamps for Major Segments

00:00 – Ryan’s workflow with AI prompts and using Codex end-to-end
03:11 – How AI changed his daily work, new energy and tools
04:36 – Context on running a fast, AI-driven team at Notion
08:28 – Automated meeting prep and standups with Notion AI custom agent (“Hot Potato”)
10:24 – How automated meetings democratize input and improve quality
13:23 – Burnout prevention through reduced prep; more coding time
16:32 – Deep dive: How the Notion agent is constructed and permissions managed
20:41 – Iterative, language-based automation workflows
25:07 – Live example: @-mentioning Codex to build and review a feature
29:59 – Code review as candid dialog; prompting for explanations like a five-year-old
34:37 – Spec-driven development: From yakking into Whisper to “one-shot” agent implementation
36:15 – Living specs as source of truth and asset across the business
40:09 – Why Codex is best for long-context, durable agent workflows
44:26 – DevEx and CI speed as the multiplier for AI agent effectiveness
45:35 – Prompting strategy: challenge the AI, demand justifications

Notable Takeaways

AI agents are now essential workflow multipliers, not just tools; they democratize meetings, reduce tedium, and allow hands-on technical leverage by managers.
Spec-driven, agent-executed development is moving coding towards system architecture and verification, with specs and prompts becoming the core artifacts.
Human involvement shifts from low-impact documentation and prep to decision-making, system design, and high-leverage feedback loops.
The right DevEx and CI speed are prerequisites for reaping outsized AI gains.
Real-world language and admitting uncertainty are strengths in prompt engineering.

For Listeners Who Missed It

This episode is a playbook for modern engineers and leaders eager to harness AI for maximized productivity, engagement, and strategic leverage. Ryan and Claire make the workflows tangible and prove that the future of software is already being built—one automated potato at a time.


Transcript

[00:00]
Ryan Nystrom
One line that I've been putting in my prompts lately is, I literally don't know what I'm doing here. You gotta explain it. Like I'm a five year old. I didn't start with writing code. I didn't start with anything. I just started with an empty markdown document. I actually just opened up Whisper and just started yapping about how this feature should work. I gave the yap session to Codex and was like, write a spec. I then opened up Codex again, pointed it at this spec file and I said, build it. And basically one shotted this.
[00:28]
Claire Vo
I've been in software engineering for 20 plus years. We were writing these documents and we were sitting in meetings with other engineers debating the merits of one implementation versus another. And now no more waiting for the meeting, no more waiting for review.
[00:42]
Ryan Nystrom
I'm not a CI expert, but I kind of know what I want. And so other folks were kind of like, can you just bring some of your like puppy dog energy to like CI and just see what we can do?
[00:53]
Claire Vo
Your AI, your agent is never going to complain when you ask it to do this five minutes before the meeting starts.
[00:59]
Ryan Nystrom
It is more relaxing and it's more fun and I feel like I'm getting more done. It's weird to have this like, win, win, win.
[01:05]
Claire Vo
They do the triangle and they're like, pick two. And you're like, no, I want to pick all three.
[01:08]
Ryan Nystrom
Give me the whole triangle.
[01:09]
Claire Vo
Give me the whole triangle. Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today we have Ryan Nystrom from Notion and he's going to show us as an engineering manager how you can never prep for a stand up again. We're also going to see how you can get a background agent to write code for a fix your friend texted you, and how spec driven development really works in a code base at scale. Let's get to it. This episode is brought to you by WorkOS. AI has already changed how we work. Tools are helping teams write better code, analyze customer data, and even handle support tickets automatically. But there's a catch. These tools only work well when they have deep access to company systems. Your copilot needs to see your entire code base. Your chatbot needs to search across internal docs. And for enterprise buyers, that raises serious security concerns. That's why these apps face intense IT scrutiny from day one. To pass, they need secure authentication, access controls, audit logs, the whole suite of enterprise features. Building all that from scratch? It's a massive lift. That's where WorkOS comes in. WorkOS gives you drop in APIs for enterprise features so your app can become enterprise ready and scale upmarket faster. Think of it like Stripe for enterprise features. OpenAI, Perplexity, and Cursor are already using WorkOS to move faster and meet enterprise demands. Join them and hundreds of other industry leaders at workos.com. Start building today. Ryan, welcome to How I AI. I am really excited because you're going to show us, I think start medium advanced mode on some AI coding stuff. And so before we jump in, how has AI just changed how you live your life at work?
[03:11]
Ryan Nystrom
I mean it's, it has completely upended the way I work. Like I, I've been doing this, I did a lot of like mobile iOS work in my past and I work the same way every day for like 12 plus years. And then the last year I've changed IDEs, terminals, tools like whatever, like 10 plus times. Um, so it's like really weird and scary to be like changing this stuff so much. But it's also I feel that like I've been doing this for a while and I'm feeling like so much like joy and freshness and like newness in this that like I wake up every day super excited to like tinker and build things and I'm also like working faster and harder than I feel like I ever have. But like in a, in a good way. I know people are like kind of freaked out about like ah, everything's changing, the pace is up. But like it's, it's really energizing for me.
[04:08]
Claire Vo
Well, you are not alone. I think every How I AI guest has come on and said I'm having more fun, I'm working faster and everything is different. And what I love about what you're going to show us is it's not just the set of tools which I think has changed or how we write code has changed. How you like run a team has changed. So. And I think for the better, you know. So I would love to see how you're using Notion AI to actually run teams differently.
[04:37]
Ryan Nystrom
Yeah, for, so for context I, I manage a team of like six, seven people. I'm like an engineering manager or tech, technical engineering tech lead manager or whatever we call it. I manage people and I write. Code is like my role which I love and I've run a bunch of different projects here at Notion. The, the one I'm going to show today we've, we've nicknamed Afterburner like kind of a quick backstory is I've been kind of vocal about our like devex CI for a while. I've just, I worked at places where it's really slow. I've worked at places where it's really fast. And I came here and I was like, we're kind of like in between, but I feel like we're like slower than we need to be. And eventually this like caught up to me and somebody was like, could you just come fix it and like come work on it? And I'm not like a infra expert, I'm not a CI expert, but I, I kind of know what I want. And I think also most importantly, the, the group that I manage and the, the org that I work in were a little notorious for being like really fast and like very, very AI pilled. And so other folks were kind of like, can you just bring some of your like puppy dog energy to like CI and just like see what we can do? And, and that's what we've done. So we had this really aggressive goal to cut our CI into like a quarter of what it is. We're on the path to doing that. But so what I want to show you is a little bit about how we run projects in Notion and how I'm using AI to kind of streamline those projects. So what we're looking at here is like our project hub called Afterburner. And so in here I've got all this documentation, I've got databases, we'll look at some meetings. I have like a automation that looks for any sort of like little wins if we knock off seconds from different jobs or whatever. And we keep it, we keep track of them in here. But what I want to show you is basically how we run our meetings. 
Our small group, we'd run a standup every single day and doing standups where everyone just like is kind of like dead eyed and going around being like, I did this, I shipped this change or you know, no updates for me, thanks is like painful and in my opinion like a huge waste of time. I want to like get to the meat. So we have this kind of like automated meeting template that shows up every single day that we run our meetings, which is basically every day and it starts blank. And then what I have set up behind the scenes is a custom agent. So we shipped this Notion AI custom agent stuff a little bit ago and this runs right after the meeting template gets generated and it looks through all of our like Slack conversation in the last 24 hours. Any tasks in Notion that we close, any pull requests that we've like merged, just like all sorts of contexts. Oh, and it looks at like yesterday's meeting transcript as well and then it compiles it basically like a pre read and this is one from a week or so ago. And it pulls metrics. It can show us like what our latest CI time is. It shows some of the things we've decided, shows like progress on different projects or different things that we're trying to make faster, bugs, feedback, open questions, anything that's of concern. And like I can basically work up until like the minute of our meeting without having done a bunch of like prep. And then we all get on a video call and we look at this screen and we're like, okay, here's what we need to talk about and we'll like hit each bullet. We have like meeting notes that we run down here within Notion and like all the context is basically like captured. All of the like agenda is like set up for me. So we spend the entire time talking about like problems, decisions, wins, findings, like what are we going to work on next? And it's less the like, oh, I did this thing.
[09:01]
Claire Vo
Yeah. What I want to call out for folks that are maybe listening and not watching is this is a very detailed meeting, like kind of like pre read slash status update, which if you're a, you know, TLM or an EM running good meetings unfortunately is part of, part of the job. And there's a really big difference between a good stand up and a bad stand up. And I think the, the ones you described, these like wrote like I wrote this PR today. I'm gonna start on this and then basically like a notes document that's very high level because some human is putting it together. It stinks. And you start to like do the other thing that I found. And maybe I'm curious if this has impacted how you work or just made it easier, which is like you start to have those meetings less because the updates aren't super rich. People don't feel like it's a good use of their time and I think you lose something by reducing the frequency. But if you can have high bandwidth, high quality meetings with high frequency without the overhead, I think you can get better, more detailed work done collaboratively. Especially if you're running like a remote team or not everybody's in the same room. So I'm curious, do you feel like just being able to get to this level of detail in your standup has improved how you can actually build this product and do this work?
[10:25]
Ryan Nystrom
Basically I've been in way too many meetings where I can tell everybody's eyes are glazed over, nobody's paying attention. And I like, I have run enough projects at this point where like, to me that's like pretty dangerous because like, everybody in that room has ideas, they have insight. But like, because we've made it like so non productive and like not engaging that, like, we're not, we're just like, we need. The point of the meeting is to like exchange ideas and information. And if we're like not doing that, like, literally what's the point? The whole, like, this could have been an email. Like, that's what the meme is from. Um, and so I, I found that this is basically a good conversation starter. Um, and a, A key thing to me that, that tells me that this is working is when I'm like, I'm running these meetings and kind of going through like bullet by bullet and then I see like, oh, somebody like fixed our mock server environment in our, our jest tests and we're seeing a test improvement by like up to 13%. Like, oh, I missed that. You know, like, that's, that's super cool. Like, tell, like, let's talk about it. And then from that maybe like, oh, there's additional headroom that we can make on this. And so all of a sudden, like, let's drill into it. And so just like, I also think it sort of democratizes that, like sharing so that you've got some people that I, I could talk for an entire 30-minute meeting without shutting up. And then other engineers are like, you know, kept keeping to themselves, but like brilliant and super talented and I think it's great. This kind of like raises everything up to the top and makes it very visible.
[12:06]
Claire Vo
I was, I was thinking last night, truly, this is a little bit of a sidebar, but I was thinking last night how people don't know this. It's very, very confusing. As Claire Vo the podcast host, I am such an introvert. I'm a crazy introvert. I will avoid humans with every fiber of my being. And I find like, AI as a proxy for me to get over my own anxiety communicating with the outside world. It's like so insane, but I really feel it whenever I like, message my OpenClaw and I'm like, hey, Polly, can you email so and so, blah, blah, blah, on my behalf? My anxiety is gone. But when I sit in front of my Gmail and start typing an email, my introvert starts showing and I'm like, I can't. I can't do. So I think this is a real point which is like you gotta pull information out of everybody.
[12:54]
Ryan Nystrom
Yeah.
[12:54]
Claire Vo
And there's like spikes and valleys of who is gonna come and show off their work, how detailed they're gonna be. It does not mean the work is not good. It does not mean they are not talented. They just have different skills in different environments. And so I love this call out of like we're gonna pull equal amounts of information out of everybody because the playing field is, is equal. I also want to go back to something you said which makes me think about burnout, which you said. I can like work up until this meeting starts.
[13:24]
Ryan Nystrom
Yeah.
[13:24]
Claire Vo
And I, I know so many people who feel like they're in meeting after meeting after meeting. One, because when they're not in meetings, they're preparing for meetings. So you're getting rid of that.
[13:35]
Ryan Nystrom
Yeah.
[13:36]
Claire Vo
And then two, I think managers being able to for example code up until the standup is such a burnout like protection mechanism, which is you would much rather your managers hands on the code, hands on doing work, filling a creative impulse than prepping for meetings. Sure, they're great at running the meetings, but let's do that in like this just in time mechanism where they're supported through these automations. And so I just imagine I used to feel so stressed as a leader and as a manager being like if I'm not in a meeting, I'm prepping for a meeting and then I'm in a meeting and if I can carve back more of my time to just do real things, that feels so much, so much, so much better. So I'm guessing you're having more fun just through the balance of the kind of work you're trying to do.
[14:27]
Ryan Nystrom
I'm. Yeah, I'm having so much more fun working this way. I have done the, like run a big engineering group where. Yes. You're spending half of your time just compiling information, synthesizing it, writing reports, like doing all this stuff. And I, I hated it. Like, it's so draining. And now I feel like I'm in a sweet spot where I can support and work with a team of like very talented individuals but also not have to. Yeah. Be like doing like paperwork the entire time. Like.
[15:02]
Claire Vo
Yeah.
[15:03]
Ryan Nystrom
I don't want to do like the tedium. I get a lot of joy out of working with people. It's like why I like managing. I like talking with them, having fun. I love like solving hard problems together and then I like building stuff. And yeah, I think that we're maybe at an inflection point where like maybe this is controversial or not, but like if you're like a line manager, like write code, you know, get in there and like stay, stay close. Maybe don't do the P0 Hero projects but like help your team fix bugs, like make optimizations, like whatever, you know, I. It's just, it's so easy now.
[15:45]
Claire Vo
I mean, I'm going to pull that thread all the way up which is like directors of engineering, VPs of engineering. Like CTOs. Yeah, CPOs. Write, write some code. Now is the time. And I say this all the time on this podcast. This is the era of the hard skill. This is not how do I get better at my. My soft skills and managing stakeholders. This is literally like how do you write code? How do you write automations? How do you learn these new tools? How do you understand what models do what for your own skills? That I think is super important. Okay, we could go on forever on this topic. I do want to show folks how you built this.
[16:20]
Ryan Nystrom
Yes.
[16:21]
Claire Vo
Because we haven't seen the. Actually haven't seen. We've had a couple of Notion folks on the podcast. We haven't seen Notion AI in action. And I just want to see kind of your thought process on how you build something like this out.
[16:32]
Ryan Nystrom
So I'm going to flip over to. This is our custom agent. Don't ask me why. We got this like potato theme for like the entire project. I think it was kind of something about like CI is just this like cobbled together, like mess. And so we're going to like make the potato like a rocket ship. I don't know. I don't even know if that makes any sense. But like now we're having fun with it and we have reactions and like agents and it has spun off into its own thing, which is fun. But this is our hot potato agent. So you can see I have this set up to run at 9am every single day. It's also set up for chat. It's set up if the agent's mentioned, but we never really use any of that. And probably the important part is the actual instructions. So in this instruction page, giving context on like what the purpose of this agent is, I am telling it to run. Yeah, look back at 24 hours. So basically telling it your job is to run every single day and I only want you to look back for the last like 24 hours of activity. I'm explicitly telling it to use subagents, which is kind of a sleeper feature in Notion AI. Like this exists, but we don't. We don't really push it to use it very often yet because it's one, it's very expensive and two, it can be kind of finicky sometimes. But I helped build it, so I know how this works. And then I ask it to kind of like, fan out and do like a map reduce where I'm saying, go use the Honeycomb MCP to figure out what the latest metric is. Look in our project channel and like, find updates, feedback, questions. I tell it where the task database is and how to look for tasks within this project and then how to find yesterday's meeting. And then I give it a template in the instructions, where I'm like, this is your format. I care about CI speed, decisions, progress, changes, bugs, questions, risks, a little bit of guidance on writing. And then when it's done, I have it post to Slack and I like, emphasize this. Like, I want it to be brief and fun. 
And sometimes it's really corny and then sometimes it's like, really good and it'll be very quirky. And just post, post this, like, link in our Slack channel and it's like, hey, here's your. Your pre read some little quibble about, like, whatever, you know, like, hey, you guys are not making enough progress. And then that's it. And then it's, we have our meeting note, it's updated. What I like the most about this one and show you some of our like, internal settings. So this is like, I give it access to all of these things and I'm like, you can only view all of this stuff because I don't want it going and like modifying our task database, our project database. Like, everybody at Notion uses those. But this meetings database in particular, I'm like, you know, you can edit content because this is the one you're gonna like, write and update the page. It can read from our Slack channels, respond to our project one. And then this was new to me, actually, when I set this agent up is our mcp. So we've had MCP and we have this other thing called workers, which is like kind of like writing code. I haven't used them very much within custom agents, but in this one in particular, I'm like, I know exactly where this metric is. It's in honeycomb. And so I like, just configured the MCP in Notion, by the way. I like, I like used the agent to like set itself up because I was like, here's the query. I literally gave it a screenshot of the Honeycomb query. And I was like, I don't know how this works. Can you just like update your instructions?
[20:09]
Claire Vo
I love that you screenshot it. You didn't even copy and paste it. You're like please OCR this screenshot.
[20:14]
Ryan Nystrom
Exactly. Too lazy. I'm like here it is, just take it, figure it out. And it, it kind of, it got it mostly like most of the way there I had to fiddle with it a little bit.
[20:25]
Claire Vo
But yeah, what, what I appreciate about this and again for anybody trying to just brainstorm workflows where AI can actually have a huge impact on your productivity at work or life, I just feel like write down what you would do if you had time. If you had time every morning at 9 you would sit down and you with your eyeballs would go through Slack, you would go through Honeycomb, you would ask people what's going on. You would look at GitHub and then you would compile it and then you would try to be very fun in Slack. Like it's, it's just a description of what you would do and it doesn't have to be that complicated. You can iterate so quickly on it. What I appreciate about this versus old era of more deterministic workflow style builders is the updates are so easy to make. Just change the natural language, redo the order, change the trigger, give it more access to data and then it's, it's ready, ready to go.
[21:25]
Ryan Nystrom
Yeah. And you know I think the other thing I, I've gotten hung up on when trying to think about like these automations and I, I've seen others do is that like you get this like you start giga braining it and you're like well how am I gonna save like five hours of work a day? And I, what you just said made me realize too that like the, the tedium that this removes for me is not like world changing but it's like 20 minutes a day and that's like 20 minutes I can spend doing other stuff and that like it's not even just about like saving that 20 minutes but it's like protecting my brain from like having to context shift about all this stuff and like ingest it and instead. Yeah, it's just, I know that the information will be there when I'm ready to like read it and I'm ready to like shift gears to this project rather than, yeah, I, I hate doing the like read this update, copy all this information, like put it into like an update board. Like it's soul sucking.
[22:26]
Claire Vo
I mean, you and I have been doing this for a while. I feel like 70% of my job at some point was just, what's going on? How do I massage it into a format appropriate for the audience at hand? And it's always the same information. It's just, what's the executive version of it, and what's the team version of it, and what's the full team version of it? My shoulders drop out of my ears when I realize we just don't have to do it anymore. And the other thing that I think people maybe underappreciate about AI and this just-in-time delivery is that your AI, your agent, is never going to complain when you ask it to do this five minutes before the meeting starts.
[23:11]
Ryan Nystrom
I know, it's so great.
[23:13]
Claire Vo
It's so great. It's just like when you have it, drop it, get it done and out of your brain, I think is just again, I go back to like burnout and enjoying your work and reducing toil and it just feels like a more relaxed way to work. Yeah, we're happy.
[23:33]
Ryan Nystrom
It's so funny because it is more relaxing and it's more fun and I feel like I'm getting more done. It's weird to have like this like, win, win, win.
[23:41]
Claire Vo
Yeah, you know, they, they do the triangle and they're like, pick two. And you're like, no, I want to pick all three.
[23:46]
Ryan Nystrom
Yeah. And I'm like, give me the whole triangle.
[23:48]
Claire Vo
Give me the whole triangle. Stamp that for the YouTube thumbnail: give me the whole triangle. All right. This episode is brought to you by Orkes, the company behind open source Conductor, which powers complex workflows and process orchestration for modern enterprise apps. In agentic workflows, legacy business process automation tools are breaking down. Siloed low-code platforms, outdated process management systems, and disconnected API management tools weren't built for today's AI-powered world. Orkes changes that. With Orkes Conductor you get a modern orchestration layer that scales with high reliability and brings humans, AI, and systems together in real time. It's not just about tasks, it's about orchestrating everything: APIs, microservices, data pipelines, human-in-the-loop actions, and even autonomous agents. So build, test, and debug complex workflows with ease, all while maintaining enterprise-grade security, compliance, and observability. Orkes: orchestrate the future of work. Learn more and start building at orkes.io. So we talked about how meetings happen. Love this. You do write code, though.
[24:57]
Ryan Nystrom
I do write code.
[24:58]
Claire Vo
Sometimes with your fingers and sometimes with your, and my, favorite harness of the moment. So let's go to how you get code done.
[25:08]
Ryan Nystrom
Yeah, I want to show a little bit of a new workflow that we have going on in Notion. I honestly don't think this is necessarily a big feature that we're going to ship, or we might ship some version of this feature, so this is all basically internal only at this point. One, I love Codex. I've been a Codex stan for, I don't know, six, seven months now. And we started building this Codex integration into Notion. Prior to this, I mean, obviously you're using the CLI to write your prompt. And then they created the Codex app, which is nice, but I'm still writing my prompt in this thing. So I started actually writing prompts in Notion pages, where I can be a little bit more freeform and structured. I'm not in the CLI, so I don't have to worry about hitting enter and then, oh, shit, I sent my prompt. I can actually write a document in Notion. But then of course I'm highlighting all of the text, copying it, going into my terminal, hitting paste, and letting it go. Fine, there's MCPs and other stuff that I could be using, but I'm a very simple person. It was too much stuff. So we built this thing. I think we're calling it Software Factory, but its internal project name is Boxy, because it's all these little VMs that we install Codex and Claude Code on. It's our little boxes, and now we can actually invoke them from tasks within Notion. And so this literally happened this morning. A friend of mine who's a Notion fan texted me. He's like, hey, I like the tab block that you built, which is this thing. I'll show you now: if I click on the dot-dot-dot thing next to a block, there's "copy link to block."
He was like, man, I really wish I could just copy a link to a tab and then send it to somebody. And I was like, oh, yeah, that sounds really easy. So I opened up this task, took some notes, and dropped in this screenshot showing where it could live. This is on the tab block: if I right-click on it, we get this little flyover menu. And I just described the task. I was like, yeah, let's put a copy link button here. I also noticed that hovering over the delete button didn't change it to red, so I was like, yeah, we should fix that. And I was like, I know one edge case: if you land on a URL where we're linking to this block and this tab, it's got to switch to the tab on fresh refreshes. So I kind of outlined all the cases. But you can see here, this is one, two, three paragraphs, like four sentences, and a screenshot. Not a lot. And then, this new thing that we built: I can actually mention Codex from within our comments, and this triggers our Boxy. This is all our internal dev tooling stuff. And then it got to work. I was looking at the timestamps earlier, and I think at 10:40, 10:51, it started the implementation, and then another 10 minutes later it replied with a pull request link and a preview URL, because we do the preview environment stuff. And it built the entire thing. If I switch over, here's the pull request. It was like, here's how I tested it. And this was actually the coolest part to me: it uploaded screenshots of it doing its own UI verification. There was a CI failure in it, so I replied down here, like, oh yeah, this part of this code change, I don't know what is going on here. This doesn't make sense. There's some type check things. And it was like, cool, here's why we did this change, and I fixed your types. And then there was a merge conflict.
[29:25]
Claire Vo
Let's also talk about just old world, new world, the emotions around code review. I love that you're like, "I don't get this." Not "I'm going to try to get it," or "I've done my best to investigate and I'm pretty sure this is wrong." Literally just, "I don't get it." Imagine sending that to somebody. Teammates would be like, no, look it up. That's one of my code review mechanisms as well: I'm just like, I don't think this is right.
[29:59]
Ryan Nystrom
Yeah, one line that I've been putting in my prompts lately is, I literally don't know what I'm doing here. You need to explain this to me. Especially doing all this CI stuff. I'm in over my head. You gotta explain it like I'm a five-year-old. And sometimes it's literally caveman style, like, "here things." And I'm like, I needed this.
[30:23]
Claire Vo
Well, what I appreciate about this, and we'll talk about it a little bit later, I think, is we're not getting any of these Claude Code warm fuzzies from Codex. I love it, but when it talks to me, I'm like, I feel real dumb when you're talking to me.
[30:36]
Ryan Nystrom
Codex, can you like, okay, okay, dumb, dumb. Here's what's going on.
[30:43]
Claire Vo
So I think that sort of "I don't understand what I'm doing, please explain it to me" is, I love it, but I think it is a Codex-specific experience. Claude Code shows up and is like, hey, buddy, guess what I made for you. Okay, so stepping back before we go too deep on the personalities of all these coding models: you have built this at-mention. And I think people miss this, but it's important, particularly for our engineering team and engineering leader listeners, to pay attention to: everybody I know that's smart has some background agent hooked up to a virtual machine that they kick off, so that you're not spinning up your local environment and doing all this stuff on your machine. We'll talk about this, but the velocity, plus your DevX, plus your CI, I think is a huge piece of AI adoption in engineering. So if you don't have a VM strategy and a background agent strategy in your large engineering org, time to get one. Time to do it. And we'll just ask Codex to build it. Be like, I don't know what I'm talking about, but Claire said this is a good idea. Please build. Make no mistakes. Okay, last one. We have a last use case you were gonna walk us through?
[32:06]
Ryan Nystrom
Yeah. So we recently rebuilt our entire agent harness. The quick TL;DR is, just like everybody else, we reached this point of tool and instruction fatigue where you had this bloated system prompt. So we're like, okay, we need to dumb this down. And we borrowed the skills and progressive disclosure concept from coding agents and brought that to Notion AI. And when we were doing this big rewrite, it had only been six months since we shipped the last rewrite, and this time we were asking ourselves, what would we do differently, seeing the state of the art with coding agents? And someone had this really great idea: let's not start with code. Let's just start with specs. And what we've ended up building is checked into our code base. We're looking at the Notion repo, and we have this agent specs subfolder. Within this subfolder we have all of these markdown documents. This is one that I worked on. We have this thing in our AI called ask mode, where we basically ban all the mutating tools so it can only read and answer questions. And so when I was building this, I didn't start with writing code. I just started with an empty markdown document. Actually, I just opened up Whisper and started yapping about how this feature should work. And then at the end of it, I gave the yap session to Codex and was like, here's our other spec library. Learn the format, take my information, write a spec. And then it spiked the first version. I did a couple revisions on it, and it ended up with this markdown document. Now, the markdown document is nice. But what we did with it next, I kind of think this is the future of software engineering: I opened up Codex again, pointed it at this spec file, and said, build it.
And it basically one-shotted this, because the entire spec file is so comprehensive, with code pointers, and down at the bottom we have verification. It was like, here is how you're going to verify all of this stuff works. And we've even built our own CLI tools so that you can run Notion AI from the CLI. Once it's done seeing that all the tests pass, it can actually just spin up Notion AI itself, send it queries, send it questions, enable ask mode, disable ask mode, and then see the transcripts and see what actually happens. I think the first shot of this took a couple hours, but I came back to, whatever, a couple thousand lines, did some code review, played with it myself, and I was like, it's right, it's done. And the other beauty of this is that it's in version control. So I can go to the past changes of this spec file and see how the spec has evolved, and I could go look through all of the code changes, which also have their own history. But this is now the source of truth for how this part of Notion AI works. And it's just in plain English that can then be verified and implemented by agents.
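[Editor's note] The ask mode behavior Ryan describes, banning every mutating tool so the agent can only read and answer, can be sketched as a filter over a tool registry. This is a minimal illustration, not Notion's implementation: the `Tool` type, the `mutating` flag, and all tool names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    """A tool the agent may call. All names here are hypothetical."""
    name: str
    mutating: bool  # True if the tool can change workspace state
    run: Callable[[str], str]


def tools_for_mode(tools: list[Tool], ask_mode: bool) -> list[Tool]:
    """Return the tools exposed to the agent; ask mode drops every mutating tool."""
    if not ask_mode:
        return list(tools)
    return [t for t in tools if not t.mutating]


# Hypothetical registry: two read-only tools, two mutating tools.
TOOLS = [
    Tool("search_pages", mutating=False, run=lambda q: f"results for {q}"),
    Tool("read_page", mutating=False, run=lambda page: f"contents of {page}"),
    Tool("update_page", mutating=True, run=lambda page: f"updated {page}"),
    Tool("delete_block", mutating=True, run=lambda block: f"deleted {block}"),
]

if __name__ == "__main__":
    print([t.name for t in tools_for_mode(TOOLS, ask_mode=True)])
```

Filtering at the registry level means the model never sees the mutating tools in ask mode, which is simpler to verify than asking it in the prompt not to use them.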
[35:34]
Claire Vo
Well, I think the other thing that people don't appreciate is, you know, taking this outside of the engineering flow is this plain English can be ingested by other parts of the business that need this information. So let's say you need to then release this feature via some sort of marketing. This is actually like a pretty good asset that explains how it works that can be translated into another thing in a way that like code itself is still a little intractable. And so this idea of spec driven development. But what I like about what you said I don't want people to miss is the way you make these updates is you update the spec and go, go look, make the update, change the thing.
[36:15]
Ryan Nystrom
Exactly.
[36:16]
Claire Vo
And so like the spec is the source of truth. The spec as the change log, I think is a really interesting model. And for people that aren't watching, it's very detailed, it's very technical. So it's not like there is not code in the spec, there's just not all the code in the spec. And I think that's a really kind of good hybrid model for, for experienced engineers to start to bridge into. What would it look like to have an agent do more of your coding work while you still do architecture work, while you still do design work, while you still make sure that the thing is going to scale.
[36:56]
Ryan Nystrom
Yeah, exactly. I view our job as engineers evolving into systems thinkers and architects. And it's not even just writing the spec and thinking about the behaviors; most importantly, it's the verification loop. How should it verify correctness of this feature or this change? And honestly, if it can't, or if the verification's a little hazy, that's the first thing you should be going and doing. Do you have a tool to let the agent run itself? That's one of the first things we did with this project. We were like, we should actually build a CLI so that I can tell Codex, send this prompt and see what happens. And now that we have that, we can take these specs and go deeper and deeper and deeper. So we're still doing engineering, but I'm not doing the plumbing work of wiring up this ask mode feature.
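[Editor's note] The verification loop Ryan describes, where the agent sends prompts to the system and inspects the transcripts, might look roughly like the sketch below. Everything here is an assumption for illustration: `fake_agent` stands in for whatever the real CLI invokes, and the transcript shape and tool names are made up.

```python
# Hypothetical names of tools that change state.
MUTATING_TOOLS = {"update_page", "delete_block"}


def fake_agent(prompt: str, ask_mode: bool) -> list[dict]:
    """Stand-in for running the real agent; returns a transcript of tool calls."""
    calls = [{"tool": "read_page", "input": prompt}]
    # Outside ask mode, a rename request triggers a mutating call.
    if not ask_mode and "rename" in prompt:
        calls.append({"tool": "update_page", "input": prompt})
    return calls


def verify_ask_mode(agent, prompts: list[str]) -> list[str]:
    """Return the prompts whose ask-mode transcript contains a mutating tool call."""
    failures = []
    for prompt in prompts:
        transcript = agent(prompt, ask_mode=True)
        if any(call["tool"] in MUTATING_TOOLS for call in transcript):
            failures.append(prompt)
    return failures


PROMPTS = ["summarize this page", "rename this page to Roadmap"]

if __name__ == "__main__":
    failures = verify_ask_mode(fake_agent, PROMPTS)
    print("ask mode verified" if not failures else f"failed on: {failures}")
```

The point of the sketch is that the check is mechanical: an agent that implements the spec can run it itself and keep iterating until the failure list is empty.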
[37:54]
Claire Vo
Well, and what I think is really funny is, I've been in software engineering for 20-plus years. We were writing these documents anyway. We were writing technical documents and spec documents anyway, and we were sitting in meetings with other engineers debating the merits of one implementation versus another, and then we still had to go write the code.
[38:16]
Ryan Nystrom
Yep.
[38:16]
Claire Vo
Really, it hasn't added work to go into this model. It's maybe shifted the emphasis of where the human attention goes. But these were documents that I was writing before, at least in every org that I've ever been in. And the other thing that I think has changed so much is that those docs then waited for review, and they waited for a meeting. And now, no more waiting for the meeting, no more waiting for review. Ship it, have a verification loop, debate it on the merits of it being live and working versus the theoretical merits of it sitting in a document waiting for everybody's calendar to open up for, you know, a live argument.
[38:59]
Ryan Nystrom
Yep. Yep. Couldn't agree more.
[39:01]
Claire Vo
Let's do it. You and I could talk all day about this. Just to recap for everybody, because I know I've got to get you out of here. Three use cases. One, never prep for a meeting again. Hook it up not only to your Slack but to your meeting notes, your GitHub, your telemetry, and build the best standup meeting so no one has to stand there glassy-eyed, being boring, giving updates. Second use case: background agents, at-mentioned from wherever you work, Notion being a great place to do that. Kicking off virtual machines, getting PRs done, just saying yes when your friend texts you, can you ship this feature? And then the last one: putting all your specs in your repo, using them as a source of truth for a more autonomous coding agent like Codex. Let it cook for a couple hours, review the code, and then when you update, update the specs, don't update the code. Did I get it right?
[39:53]
Ryan Nystrom
Nailed it.
[39:54]
Claire Vo
Okay, let's do a couple lightning round questions and I'll get you out of here. You and I love Codex. Why do you love Codex? I'll tell you why I love Codex, but you go first.
[40:03]
Ryan Nystrom
Okay. Well, so I first fell in love... Oh my God, did I just say that?
[40:09]
Claire Vo
You did.
[40:10]
Ryan Nystrom
I first fell in love with Codex because when I was evaluating both Claude Code and Codex, I found that anytime Claude Code filled up its context window, it would just lose the plot really quickly. And Codex, I don't know exactly what it's doing, if it's the model, if it's the compaction, if it's both, but it can grind for hours. And with the way that I work, both the systems and things that I work on, I like to be able to fire off a bunch of them at the same time and then go to a meeting or go do something else, or spend my time round-robin managing all of these agents. I'm not the person that is sitting with the browser open and some agent next to it and, iterate, look at the browser, iterate, look at the browser. The closer I can get to one-shotting solutions the better, because that frees me up to do other stuff. And I found that Codex was pretty good about that. I also just feel it's pretty simple. There's not a lot of bells and whistles. It's not too fancy. I'm happy with the addition of MCP and skills and some other stuff coming out. I also really love GPT-5.4. I think it's a great model. So all of those things together, it just really matches my working style and type of work.
[41:47]
Claire Vo
Yeah, I'll tell you why I love Codex, and I sent this to somebody. I said: worktrees everywhere. Ports 3000 through 3009, spoken for. We're just going across the board. I do think it's good at long-running tasks. I like its concept of projects, because I run a lot of different projects. I think it's just a very helpful mental model. And then it's great at code review, honestly.
[42:13]
Ryan Nystrom
Oh yeah, super good.
[42:14]
Claire Vo
It's just a really good code reviewer. It's a really good security reviewer. Tireless, uncomplaining with the most tedious of things. And so it's one of my daily drivers. I really, really like it. It's good. Okay, second question, because you and I also agree on this. This isn't even a question. This is a demand for you to share my point of view, which is: why bang on developer experience and CI speed right now?
[42:45]
Ryan Nystrom
A couple things. To me, CI was super important prior to agents because, in my opinion, I mean, it's not even an opinion, it's fact: the faster your CI completes, the faster you get signal, and the more comfortable your engineers will feel with making changes and doing things, because they know, I can make a change and get it pushed into dev or into production really quick, because I know my CI is fast. If it's slower, then you're building up these monster changes, and you're going to be even slower about judiciously reviewing every little tiny thing. I'm a learn-through-doing sort of person, and if I can close the iteration loop even tighter, then I'm going to be putting out a change, people are going to be using it, I will take that feedback, I will make another change, and I don't have to worry about it taking another day before the deploy train is ready to go. I want to crank really, really fast. And that was all pre-agent. Now that we're in agent land, it's that, but on steroids, because agents don't get tired. They can work on a VM. They can work while I'm sleeping. And if I've got a CI loop that takes an hour to run, your agent's just going to sit there and spin for an hour waiting for results to do something. If it takes three minutes to run, holy crap, how much more stuff are you as a human, and then especially your little swarm of agents, going to be able to get done? So, so much more. And so, super important.
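[Editor's note] Ryan's one-hour versus three-minute CI comparison can be put into rough numbers. The sketch below assumes an agent alternates between a fixed stretch of work and one full CI run; the 8-hour window and 10 minutes of work per iteration are illustrative assumptions, not figures from the episode.

```python
def iterations_per_window(window_min: int, work_min: int, ci_min: int) -> int:
    """Whole work-then-CI cycles that fit in a time window (all values in minutes)."""
    return window_min // (work_min + ci_min)


WINDOW_MIN = 8 * 60  # assumed: one agent running unattended for 8 hours
WORK_MIN = 10        # assumed: agent work between CI runs

slow = iterations_per_window(WINDOW_MIN, WORK_MIN, ci_min=60)
fast = iterations_per_window(WINDOW_MIN, WORK_MIN, ci_min=3)

if __name__ == "__main__":
    print(f"60-minute CI: {slow} iterations")  # 480 // 70 = 6
    print(f"3-minute CI: {fast} iterations")   # 480 // 13 = 36
```

Under these assumptions the same agent gets six times as many feedback cycles with the faster pipeline, and the gap multiplies across every agent running in parallel.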
[44:27]
Claire Vo
I agree. And we just had Steve from Stripe on, and they're doing like 1,300 agent PRs a week. You cannot do that if your CI is slow. You might as well be throwing all those PRs in the trash. And so I do think there is a true mathematical limit on your capacity to ship code to production that is a reflection of how fast your CI pipeline is. So again, engineering leaders, if you are not spending time on making that fast and you are tolerating something slow there, you're not going to get the benefits of AI that you could. And honestly, good for AI, good for humans. No engineer wants to sit waiting for their stuff to hit production. It's miserable on the engineer side. Lots of downsides. So allocate time to your pipeline, please.
[45:19]
Ryan Nystrom
Yep, Yep.
[45:20]
Claire Vo
Ryan and Claire say so. All right, last question. When AI is not listening, it's writing bad specs. It's not being funny in Slack. What is your prompting strategy? Do you yell?
[45:36]
Ryan Nystrom
Yeah, I can yell. I can be a little bit of a diva sometimes, if it really goes off the rails. The other prompting strategy that has saved my ass working on the CI stuff lately is, because even the best models, I feel like, can sometimes be a little sycophantic, I will just be like, you're wrong. You need to defend your argument. I want it to defend it in the way that I want, but really I just need to see the evidence that if I push counter to what it has done, it can back it up with good, pointed reasons, rather than just, "are you sure this change looks okay?" and it's like, "oh, boy, it's totally fine." I'm like, no, no, no. I need the cited, hard argument against it. Because a lot of times with the CI stuff, I'm like, I don't know what I'm doing. I know generally what I'm doing, but the specifics are a lot more nuanced, and I need to get this right. And that's been super helpful. And then when it goes off the rails, I could be such a diva to it.
[46:46]
Claire Vo
Escape, all caps "NO." That's what I do. Interrupt and steer the conversation. The little arrow button has never gotten so much work. Ryan, this was so much fun. Where can we find you and how can we be helpful?
[47:03]
Ryan Nystrom
You can find me on X. I'm out there shilling for Notion right now a lot, but that's basically where I spend my time. My DMs are open, so if you have Notion problems, I like to try and fix them. I'll send it to Boxy and we'll get it done.
[47:20]
Claire Vo
Perfect. Love it. Well, thanks for joining the podcast.
[47:23]
Ryan Nystrom
Thanks for having me, Claire.
[47:25]
Claire Vo
Thanks so much for watching. If you enjoyed this show, please like and subscribe here on YouTube or, even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.