
A
A common challenge in software development is creating and maintaining robust development environments. The rise of AI agents has amplified this complexity by adding new demands around permission controls, environment isolation, and resource management. Ona is a platform for AI-native software development and engineering agents. The platform combines autonomous agents with secure, standardized environments, with a focus on giving enterprises control, security, and productivity so they can scale AI-native engineering without scaling risk. Chris Weichel has more than two decades of experience spanning software engineering and human-computer interaction. He is currently the Chief Technology Officer at Ona, formerly Gitpod, where he leads the engineering team behind the company's cloud-native development platform. Chris joins the podcast with Kevin Ball to talk about Ona, the impact of coding with parallel agents, the future of IDEs, choosing agent-friendly languages, code review as a new bottleneck in the software development lifecycle, and much more. Kevin Ball, or KBall, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow KBall on Twitter or LinkedIn, or visit his website, kball.llc.
B
Chris, welcome to the show.
C
Kevin, thank you for having me.
B
Yeah, excited to dig in. Let's maybe start with you a little bit. Can you give us the TL;DR on your background, how you got to where you are today, and a little bit about Ona?
C
Yeah, so my name is Chris, I'm the CTO and co-founder of Ona. How we got here: I've been writing software basically since I could read. That's a very long time. I've been doing that professionally for more than 25 years at this point, and throughout my entire career I've essentially been in the dev tools and tooling space, a lot of it in automotive and large enterprises. At some point I did a PhD in human-computer interaction, on tooling for digital fabrication as it was. And Ona really, at this point, is the culmination of all these years of trying to solve a problem that I've seen show up repeatedly. And with AI we got a whole set of tools to help solve things that we couldn't even have dreamed of just a few years ago.
B
So let's talk a little bit about that problem set. What is the core problem driving Ona? What are you solving for folks?
C
Very fundamentally, it is reducing the time between having an idea and making it reality. I mean, that's the job of any good tool. But where we started, where we came from, is essentially Ona Environments. It's the idea that if you want to write software, certainly in a professional context, you'll be spending a lot of time setting up a dev environment. So instead of writing code, you'll be faffing with pip and Node and nvm and what have you, trying to set up your dev environment. That's the first thing that we tackled and something that we spent a lot of time solving: how to make that work well, and how to make that work well also for large organizations. And then, building on top of that, all the primitives that we built turn out to be extremely useful, if not necessary, to run agents. So Ona really is the mission control for software engineering agents.
B
Yeah, I remember running into you guys when you were purely doing the cloud environment, before agents were a thing, and already it was like, oh, yeah, that's brilliant, right? Go to my GitHub repo and not just fork the code, but get a dev environment that works right there. Let's spell out a little bit of the implications there for agents, because I know I try to run a lot of different agent things, but then I'm managing environments and copying .envrc files around and doing all this sort of thing. So what is it that you do for someone who's running an agentic environment?
C
So this plays out on different levels. First there is instantiating a dev environment for a particular project. That typically means, one, getting a set of compute resources. If it's your laptop, it's your laptop, and that only has so much RAM and so much disk to go with and so much network bandwidth. But then it's also the potentially different tools. If you're working on one and the same project, chances are you have a machine set up for that. But then it may be different configuration, you know, different keys or something, that you need to run multiple instances of that. It's definitely different working copies; Git worktrees are what we see a lot of folks use locally. All of that you now need to manage manually. And, you know, at this point there are heaps of tools trying to do that for you on your laptop, but very fundamentally, you're still bound to your laptop. So there's a resource limitation, and there's a limit to the variance that one single machine can handle. The other is that running it on one single machine really limits the autonomy you can grant that agent. There's a reason Claude Code calls it --dangerously-skip-permissions. They put that "dangerously" prefix there. Do I want an agent that believes it needs to reset the state to delete the root of my file system? Absolutely not. So the way we talk about this is Ona Environments, Ona Agent, and Ona Guardrails. Environments really are the development environments that I just spoke about; we can instantiate many, many, many of them, and humans and agents alike can interact with them. Then there is Ona Agent, which lives in these environments and does autonomous work for you. And then there's Guardrails, which gives you control over what Ona Agent can do. That's it in a nutshell.
B
That makes a lot of sense. I feel like I'm constantly walking some of these balancing acts locally, where I want an agent to be able to, for example, autonomously run tests, but those have to touch a database, and now I have to give it network access, and who the heck knows what it's doing with that and all these different pieces. So, okay, how does this work? If I want to run an agent with Ona, what type of configuration do I need to do? How do I get that set up?
C
The cool thing is that agents are very, very useful for getting stuff set up. The fundamental primitive is essentially a compute box: if we strip it down all the way, it's essentially a dev container in a VM that has code checked out already. In practice, the way this looks is you come to Ona, you select the repository you want to work with, and you put in a prompt of what it is you want to do. We obviously give you a good set of defaults. One of them is how you can configure your dev environment, and then it's going to go and do that. So really the baseline is what you're used to with most AI tools, where you can put something in a prompt box and it's going to go and do that thing. The difference here is that you can do five of them in parallel, and the fans of your laptop don't spin up. And if one of them goes astray, you just delete it and create a new one.
B
That makes sense. Okay, so let's look through a little bit of those implications then. It's now trivial for me to spin up agents, which is useful. I find personally, on my laptop, which I'm still on, I start capping myself at, I don't know, three or four; that's about how many I can keep track of. So if it's now essentially free to spin up additional agents, and they each get their own environment, and they're not messing with each other or with each other's branches, how do you deal with that level of stuff going on?
C
Yeah, with my HCI background, the thing I find really, really interesting is how we as humans come into this. If we take a step back and really look at software engineering as an industry and as a trade, the way we've operated for the longest time, certainly since I can remember, is this deep, mono-focused work where you focus on one thing at a time. We value this sort of deep flow state. I mean, heck, we even tie our identities to this: having that environment perfectly tuned to your own liking, your keyboard shortcuts, and your IDE is your home, your sanctuary almost. That's how we've been operating for a very long time. And so it feels a bit odd, and for me also felt a bit odd, that now I'm asked to give this up. Agents get more and more autonomous, and the only way we can turn this autonomy into productivity is by doing multiple things at the same time, which is the opposite of this deep focus work. As a result, the interfaces that we interact with need to change. The IDE is built for a world where every line is artisanally handcrafted, save for autocomplete and code generation. And now we have these machines that can write code for us, many in parallel, exactly as you say. So now we need interfaces that help us find joy and flow in this parallelism. That's a new class of interfaces, and we all haven't figured this out yet. We're still trying to understand what this is, and there are many ideas being brought forth, and we obviously have our own take and spin on that. What I will say is that the second key trait we observe in software engineers is that we all have a mind that leans towards addiction, and we're gamblers at heart. We're like, okay, this next change is going to fix my test. This next change is going to make it work. Just one more change, and all of a sudden it's 2 o'clock at night again. That's how we work.
B
I'm in this picture and I don't like it.
C
Exactly. It's a bit like playing a one-armed bandit, a slot machine in a way. What agents do is they've made it incredibly cheap to pull that lever, and you can play five of them at once. So not only can you actually find flow in this parallelism with the right interfaces, it's even addictive.
B
Oh, it's deeply addictive. Like, the ease of building your own tooling has just gone through the roof. Right. So it's like, oh, yeah, why not spend another five minutes building myself another tool?
C
I know, right? Like, there's also a lot of interesting implications for startups there because I would argue that so many startups were founded from, I have this problem. I'm sure others have this problem. Let's see if I can go and sell it. But now you don't need to offset the cost of producing the tool anymore because it becomes so cheap to produce it. Chances are you're not going to bother generalizing it and try and find someone to sell it to because it already solves your problem.
B
So let's maybe go one step deeper. So you highlighted there's a set of interface changes that need to be made. Are there also sort of mindset shifts that need to happen?
C
I think so. What we observe, also in our own team, is that there is a sort of mindset gradient that aligns with seniority. This is my, you know, N of less than 50, so take that with a grain of salt. But the more junior someone is, the more they identify with and love the game of writing the code itself. It's less about the overall problem, the business problem they're trying to solve. It's more about, okay, can I write this code? Can I figure out a really elegant way to express that? Or something like that. And the more senior folks are, again, small sample size, the more they value solving the problem. And the more you value solving the problem, the less relevant the way you solve it is. Obviously you don't want to impose a lot of debt on yourself and get called out in the middle of the night because your code's not great. But in reality, it doesn't have to be you writing the code. And so the mind shift that needs to happen is that your identity as a software engineer is now more about solving the problem, less about writing the code.
B
And what do you think are the user interfaces, coming back to the HCI, that are going to elevate that problem domain over the code domain? Because if you think about our generations of tooling, they are focused around code. You have the IDE, you have the code review tools, you have all these different pieces that are down at that low level of granularity.
C
Absolutely. My hot take is that code and the languages we have today will be with us for a very long time. Every once in a while you hear folks going, yeah, there'll be AI-first programming languages that humans don't necessarily understand. Maybe that is so. But if you look at the history of programming languages, the trend points the other way. We've come from languages very close to the machine, and we've subsequently added more abstraction and moved away from the machine level. I struggle to see why we would now take a hard turn. And there's so, so, so much code out there in programming languages that date many tens of years back. So I think we're going to have that with us. And with that, we're also going to have those tools with us. Those tools themselves aren't going to go away. Some predict the end of the IDE. I don't actually believe that to be true. I think we will have IDEs going forward, and the way they look will need to change. But fundamentally, interfaces towards tools will exist. An important consequence of that is that whatever system we have that lets us interact with agents needs to take that into account. Many times, it's just so much quicker to go and change that hex code to the color I want than to try and tell an agent how to do it. So why should I be limited to a prompt box when there's a so much richer interface and system and ecology out there that would let me do that?
B
Yeah. I think to your point, it comes down to precision. What level of precision are you wanting to engage in? Like, the equivalent of a scalpel is editing the code directly. Right here I can say exactly what I want it to be, whereas when I'm interacting with an agent, it feels like I'm operating at a much higher level of abstraction, which sometimes is fine, but means the details get quite fuzzy.
C
Absolutely. I think there's a great analogy. It's a bit like trying to trim your hedges. You're not going to use a scalpel to do that, you're going to use a hedge trimmer to do that. That's the agent. But if you're trying to sort out those two, three pesky bits that you didn't quite catch, trying to use a hedge trimmer is a pain. The scalpel might just be the right tool for the job. So there is a different degree of, let's say, engagement with the problem depending on what it is you're looking to solve.
B
So let's come back a little bit to putting these things in the cloud and how that plays into it. One of the things I've seen in the IDE version of this is that when you have a hammer, everything looks like a nail. I've seen people who, once they get into the "oh, I'm prompting an agent" mode, take a change that would have taken them 10 seconds before, because they understood the code, and say, I'm going to tell this agent to do it and keep going. And I feel like that becomes even more the case the further you get away from "this is my code." It's off in the cloud somewhere. My agent's handling it. So how do you facilitate those layers of tool use, going from "I've got my hedge trimmer" to "sometimes I need a scalpel," in an environment that is sort of set up for ephemerality, for just having an agent go and do it?
C
I think this comes down to separating the interaction, and also the automation that surrounds it, from where compute lives. Just because my tools and my source code now live in a VM somewhere on AWS or GCP doesn't mean the way I engage with that code and those tools needs to be different. As an engineer, when I connect to one of these environments, it doesn't feel much different. If I use a desktop IDE, say Cursor, VS Code, JetBrains, what have you, and connect to that environment, it feels very much like it's local, except that I have much, much more bandwidth available. So the way you interact with it really doesn't differ much, except that now you can have an agent do work even when your laptop is closed. You're no longer bound to the life cycle of that one machine that right now lives on some shoddy Starbucks Wi-Fi that might be disconnected at any moment. So really what it does is lift you off the limitations of a laptop, but it doesn't very fundamentally change how you engage with the code.
B
That makes sense. Well, and it reminds me: I was trying to figure out (I'm still running everything on my laptop, to be fair; I'm not on Ona yet, though who knows after this conversation) whether I can set up Tailscale on my laptop and my phone so I can nudge my agents along while I'm out on my run or what have you. In Ona, what would I need to do to, say, drive an agent from my run?
C
Literally, you would go to ona.com, you would log in, and you would talk to your agent right there and then. And that works very well from the phone. We put a lot of effort into the mobile experience. The story I like to tell about this is that I have a four-month-old son, and I spend a lot of evenings with him falling asleep on my left arm. Then I'm sat there and I'm frankly too scared to put him down, so he'll just be sitting there. It means I can't use a laptop in that time, but I can use my phone. And so I've spent plenty a night with my son on one arm and my phone in the other hand, frankly being quite productive. A lot of ideas that come in the depth of night otherwise would have been maybe a note in some tool. Now they're a prototype, and the next morning when I log in again, I actually have working code instead of some half stumbled words.
B
In reflection, agents are great for turning half stumbled words into working code. Okay, so let's go back a little bit to implications then. Right, so you can now generate all this code as you're holding your son asleep, or I can generate it on my run. The cost of software development is now substantially lower in a lot of ways. What happens?
C
Hmm. There are interesting implications of all of this. One, I'm not yet sure that the cost of software, if you look at the entire SDLC, is actually lower. I think the cost of software production, of change production, is lower, but that has downstream effects that I don't think we fully understand yet. The one that's immediately obvious is that we're now turning code review into a hotspot, because we're producing so, so, so much code. And reviewing is still very hard and is still a very manual, human-attention-intensive activity. I mean, the cynical take on this would be that we were all promised we'd now get to do the creative stuff, and in reality we've been reduced to line workers who now, you know, just review code, which isn't the most enjoyable thing to do. So I think what's fundamentally happening is that the economics and the way we scale software production are changing. But I don't think we fully understand the entire SDLC and its effects yet.
B
I've definitely been feeling that hotspot on the code review side. It definitely feels like that is one of the bottlenecks. What else do you think changes in terms of the life cycle from have an idea to this thing is actually production ready?
C
I think this depends a whole lot on the context in which it happens. As a weekend warrior, as someone who's building tools for myself, the cost of producing something really has gone down so dramatically that sometimes, instead of trying to search for prior art, I'll literally just prompt an agent and get it done. It's cheaper to build something tailor-made that needs to work once or twice for myself than to go and try and adapt something out there. In a business, especially in a large organization, especially in a regulated industry, I'm not sure that's true. Again, we've brought the cost of production down. I'm not yet sure we've brought the cost of deployment and operations down.
B
So that's fascinating, because it continues a shift that we've been seeing in the industry for a while, which is that the cost of getting started has dropped dramatically. This started with things like SaaS, right? Or hosted environments. I no longer have to outlay $5 million to set up my own server environment. I can get started with 20 bucks a month on Amazon or what have you. Continuing that, starting becomes almost free. Maybe I have my ChatGPT subscription for $20; now my agent can build my software, I can deploy it on Amazon for another $20, and I'm off to the races. But to your point, scaling, dealing with security, dealing with privacy, all of those are just as expensive, if not more so.
C
Yeah, it's interesting to see whether we're introducing new breaking points or step functions in the sort of cost-utility function over scale. It does feel like that slope is getting flatter: it's easier to get started and it's easier to scale. I think the effects of that acceleration, though, just aren't equally distributed, if that makes sense. There's a factor here that's a function of organizational size and the impact of the compliance, regulations, and standards you need to follow. That said, a lot of our customers are in regulated industries: finance, pharma, massive organizations, Fortune 500 companies. And what we see is that they get massive benefits from agents. It's largely fueled by the sort of emergent complexity that exists in these systems. One use case that we see a lot is folks using agents to understand their own systems. Simply because there's so much complexity and so few people who can hold it all in their head, having an agent who can crawl through that is immensely useful. So even if bringing something into actual production downstream might still be quite involved in that kind of environment, we're seeing a lot of use cases that aren't necessarily the production of code, where agents are immensely valuable and still bring the cost of the overall process down.
B
Yeah, that makes sense. So that maybe actually goes into a domain that might be interesting to explore, which is the different ways people interact with AI agents. Obviously this is a shift the entire industry is going through, though there are still trailing adopters. Not everybody is into the agentic world. But what are the different ways that you have found people are wanting to engage with these? You just mentioned one, which is: explain my code base to me, explain this system to me. What else shows up?
C
Yeah, first, I think the point you made there is really quite astute. One observation that we repeatedly have is that there are almost two worlds. It's not quite binary, but it's definitely a spectrum. There is the top 1%, the folks listening to this podcast, who are really hooked in, who know their stuff, West Coast, East Coast, that kind of thing. And then there's the rest of the world, many of whom frankly still believe that tab-tab autocomplete is the pinnacle of AI in software engineering today. And don't get me wrong, tab-tab autocomplete is an amazing killer feature. But we've moved a step further than that at this point. So I think the old adage, "the future is already here, it's just not equally distributed," is more true now than it's been before. I think we as an industry generally underestimate how many folks are on the laggard side, which is fantastic. It means there's so much potential still.
B
100%. There's tremendous opportunity purely in scaling out what already exists.
C
In terms of how we've seen people interact with agents, we see it roughly along three broad categories. One is sort of inquiry-style work, the one we just touched on, which is, hey, tell me how this code base works. There's a special version of this which is essentially onboarding, where AI becomes the onboarding buddy for a new engineer who joins a team or an organization. There are also checks for compliance, for example, hey, how compliant are we with this and that coding guideline? So generally, inquiry. We ourselves use it a lot for design work, where we write design docs together with Ona Agent. Essentially we have slash commands, global slash commands, so we have our own for a design doc. And the great thing is that it is very rooted and grounded in the actual code base while you still have an interactive conversation. It's almost like an interactive rubber duck that can also read your code base and helps you write good design docs. So that's one class, the inquiry side. The second one is essentially your classic code change group, which first and foremost encompasses all the toil: updating libraries to mitigate CVEs, migrating to a new version of this and that, adjusting to a new standard you want to drive, whatever it is; generally sort of lift-and-shift type work. That's something that we see a lot, and that's really a lot of the work that's happening. And then of course there's new feature work. There's also getting things off the ground. And then the third class is really trying things out. The cost of experimentation has gone down so much, especially if you can scale out your, dare I say, agentic resources horizontally, which you can with Ona Agent, of course. You can have so many environments, so many agents running in there. So the cost of doing these explorations has gone down, and you can explore multiple paths at the same time. We prototype a lot more using Ona and Ona Agent than we do using Figma. We still use Figma, of course, but so many of the ideas that go in there really have been created through Ona Agent before. So those are the three classes we see a lot: inquiry, actual production of change, and prototyping and ideation.
B
So let's dive into each one of those a little bit. So for inquiry, I think this is potentially a really nice entree for folks who are on that trailing edge, because you don't have to trust the agent yet to write code. You can say, hey, like, explain this to me, and then you can go and actually confirm the thing that was explained, or find this for me, or those sorts of things. Do you have any best practices you all have come to internally? For example, if you're doing that design doc, how are you prompting the thing? What are you loading as context? How do you approach this thing?
C
So the thing that we load is, one, a template for the design doc that we want out of it. It's a very simple kind of template. Then very explicit instructions to engage in a conversational style. We literally ask it: ask me three rounds of questions, three questions each. And with additional instructions: whenever you're not sure, ask. So we heavily try to counteract this excessive confidence that a lot of models have. And with that, it really becomes literally a conversation. What's also working really well is using something like Wispr Flow or superwhisper to have those conversations. So no one's typing that anymore. It's literally just talking to a microphone. And so the way the flow looks is you use that slash command that contains the template and the instructions, also: hey, go and look at the APIs, go and look at the database schema. We have a set of engineering principles where we, for example, value consistency and integrity and pragmatism, change over excessive abstraction. And those are part of the prompt too. So you load all that, you run Wispr Flow, and you, put it that way, brain-barf into a microphone for five minutes straight. You put all that in, and then the thing is going to churn on it. It's going to investigate the API and schema and what have you, and it comes back with a set of questions. You do that for a bunch of rounds, and the outcome is an 80 to 90% version of a design doc.
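As a rough illustration of the flow Chris describes, the instructions behind such a slash command could be assembled like this. This is a minimal sketch, not Ona's actual slash-command format: the function name `build_design_doc_prompt`, its parameters, and the exact wording of the prompt are assumptions made for the example.

```python
from typing import List


def build_design_doc_prompt(
    template: str,
    principles: List[str],
    rounds: int = 3,
    questions_per_round: int = 3,
) -> str:
    """Assemble instruction text for a design-doc conversation.

    Hypothetical helper: the layout of the prompt is an assumption,
    only the ingredients (template, question rounds, "ask when unsure",
    look at APIs and schema, engineering principles) come from the talk.
    """
    principle_lines = "\n".join(f"- {p}" for p in principles)
    return (
        "Help me write a design doc using the template below.\n"
        f"Ask me {rounds} rounds of questions, "
        f"{questions_per_round} questions each.\n"
        "Whenever you are not sure, ask instead of assuming.\n"
        "Before asking, look at the APIs and the database schema "
        "in this repository.\n\n"
        f"Engineering principles:\n{principle_lines}\n\n"
        f"Template:\n{template}\n"
    )


if __name__ == "__main__":
    # Example template and principles; both are placeholders.
    print(
        build_design_doc_prompt(
            template="## Problem\n## Proposal\n## Alternatives",
            principles=["consistency", "integrity", "pragmatism"],
        )
    )
```

The defaults of three rounds and three questions mirror the numbers mentioned in the conversation, which Chris later describes as arbitrary, so they are kept as parameters.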
B
There's a few different pieces of that that I think are worth highlighting. Right. So you're not starting from scratch. You actually have a tremendous amount of things that you've put in there. I'm guessing you've actually engineered and iterated on those as well, on the template, on the guidelines that you give it, on how many, you know, the ways in which you guide it towards this.
C
Surprisingly little, to be honest. The template is the same template we've been using since AI was a research field with narrow applications. And the prompt itself, the three rounds of three questions, is entirely arbitrary. We frankly haven't iterated much on, okay, use different numbers, or any of that. And that's just because it's been working well enough.
B
Yeah, fair enough. So from that design doc, then, do you just hand that to another agent and say, build it? Like, what does that look like?
C
Another agent or the same agent. You could literally go: great, you have the design doc, amazing, please put that into Notion. We have full MCP support, so it's going to put it into Notion straight away. Or if it's in a Linear issue and linked there, then you go, hey, here's a Linear issue, go do your thing. That, more often than not, is a really, really helpful starting point. Fundamentally, what we find is that when trying to get 100% there with the agent, there are diminishing returns. It's the hedge trimmer versus the scalpel, or, in the metaphor that we commonly use, the difference between highway and city driving. Making miles, highway driving, I don't need to be at the steering wheel. It's fine if my car does the driving. But if I'm in a narrow city where there are a lot of things happening and where I really need to make sure I get to the right number or the right address, I probably want to be at the steering wheel. I mean, there are also fantastic organizations that have solved that.
B
You know, obviously. I was in a Waymo the other day. It's absolutely science fiction. It's amazing.
C
It's incredible. Here, like, you know, one mental model that we found very helpful is actually a metric that Waymo optimized for in the early days (I don't know if they still do): time between disengagements. It's the time between the car disengaging and the human having to take over the steering wheel. Seconds between disengagements is essentially lane assist. Minutes to hours is the backseat of a Waymo. And in software engineering we are going through the exact same transition. Lane assist is your tab-tab autocomplete: Copilot, Cursor. And the backseat of a Waymo is where we're heading with agents.
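To make the metric concrete, here is a small sketch of how "time between disengagements" could be computed for an agent session. It assumes you have a list of timestamps at which a human had to take over; how those timestamps are collected, and the function name, are assumptions for illustration and not something described in the conversation.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def mean_time_between_disengagements(
    takeovers: List[datetime],
) -> Optional[timedelta]:
    """Average gap between consecutive human takeovers.

    Returns None when there are fewer than two takeovers, because
    no gap can be measured in that case.
    """
    if len(takeovers) < 2:
        return None
    ordered = sorted(takeovers)
    # Pair each takeover with the next one and measure the gap.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


if __name__ == "__main__":
    # Hypothetical session: the human stepped in three times.
    session = [
        datetime(2025, 1, 1, 9, 0),
        datetime(2025, 1, 1, 9, 10),
        datetime(2025, 1, 1, 9, 40),
    ]
    print(mean_time_between_disengagements(session))
```

In the terms used above, an average measured in seconds corresponds to lane-assist style autocomplete, while minutes to hours corresponds to the backseat of a Waymo.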
B
Yeah, I think that's a really nice metaphor. The question then becomes, when should a human get involved? What level of engagement do you want? And then additionally, if your agent has gone off for a three hour journey, how do you boot your brain up on what are all the changes that it made?
C
Yep. And this is where the interfaces come in, right? This is where we need to build systems that make that context switch really, really easy. The way we do that in Ona is that, for one, we have feedforward: very early in the process we, similar to Claude Code, essentially produce a set of todos, and they guide the agent and the human. To the human, it's really, really helpful because it gives you trust that the agent's going to do the right thing. And throughout the interface, we've been very careful in how we display these and how we use them to tell you, okay, this needs my attention now, or in half an hour. So there's that feedforward. There's also the feedback: what has it done, the summary at the end, but also the ability to jump into a full IDE right there in the same context that gives you a diff of the changes that have happened. So we've designed, and we continue to iterate on, the conversation itself in a bid to make it easier to re-enter and switch back and reconstitute that state in your brain. At the same time, what we find incredibly effective is literally reviewing the code changes since the last time you looked. And for that, quite frankly, Git is very effective. We don't have to reinvent the wheel. Literally looking at Git diffs in VS Code, right next to the conversation, embedded in the same browser tab, works really, really well.
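The "review what changed since I last looked" idea maps directly onto plain Git. The sketch below builds a `git diff` command from the last commit you reviewed up to the current `HEAD`. The helper name and the idea of remembering a "last reviewed" commit hash yourself are assumptions for illustration; this is not how Ona implements it.

```python
import subprocess
from typing import List, Optional


def review_diff_command(
    last_reviewed: str,
    paths: Optional[List[str]] = None,
    stat_only: bool = False,
) -> List[str]:
    """Build a git command showing changes since `last_reviewed`.

    `last_reviewed` is any commit-ish (hash, tag, branch) that you
    recorded the last time you looked at the agent's work.
    """
    cmd = ["git", "diff"]
    if stat_only:
        # --stat prints a per-file summary instead of the full patch.
        cmd.append("--stat")
    cmd.append(f"{last_reviewed}..HEAD")
    if paths:
        # "--" separates revisions from file paths.
        cmd += ["--", *paths]
    return cmd


def show_changes_since(last_reviewed: str) -> None:
    """Run the diff in the current repository and print it to stdout."""
    subprocess.run(review_diff_command(last_reviewed), check=True)
```

Calling `review_diff_command("abc123", stat_only=True)` first gives a quick per-file overview before you read the full patch, which helps when the agent has been working unattended for a while.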
B
Well, having the conversation there is really helpful, and the history, and understanding it. Do you link between the two? So it's like, oh, here's what it was thinking when it made this diff, and here, like, can you see that sequence?
C
Yeah, you can go from the conversation right into the file. So if, for example, it edited a file, you can click on that and it's going to open it up in the IDE and bring up that section right there, and so you get the full context of how it looks now. We have experimented with historical context, but it turns out it's more confusing than helpful, because if you want to see the evolution, it's easier to download the entire picture for a small enough change, right? Like if we're talking 20,000 lines, I mean, if you're making 20,000 line changes, then, you know, good luck.
B
But this does highlight, right? As you increase the time between disengagements, most likely that's because you're making larger change sets. So how do you navigate that? Right. Suddenly, are you getting benefits or are you deferring the cost to. Now I have a 20,000 line PR to review, which, by the way, I've reviewed far too many of those since we got into this agentic world.
C
I merged one today. But I don't think that there's a one-size-fits-all answer. For example, the PR I merged today essentially introduced three new database entities, including their API. Our API is very consistent. It's essentially CRUD on entities. And so that kind of code is very reviewable, especially when structured into good commits. So, you know, a 20k change might be okay. Now, every once in a while, fundamentally, I think what you're doing is exactly what you point out: you're pushing the cost downstream, because it's now the poor sap, or the poor folks, who have to review that change that have to bear the burden. What this points to, I think, is agents are tools. At the end of the day, they're very close to magic, but they are tools. And much like any other tool, you need to learn how to wield it. I mean, a very long time ago, writing implements were very close to magic, and you needed to learn how to use them. And the same thing is true for agents. And part of what it takes to learn how to use an agent, I think, is understanding what size of problem it's capable of handling and how you decompose the thing you want to achieve to fit that size. And that's a skill. It's a learned skill. It's not something you wake up one day and you have it. It's something that requires experimentation. And here we come back to the benefit of running agents in a system like ONA, because the cost of that experimentation is just much lower. You can try different sizes of this decomposition in parallel and observe which one works and which one doesn't. And the one that didn't work, you just throw away, no harm done.
B
Yeah, I love that. And I think that is a really key thing here, which is this is changing the way we do software development. And there are skills to learn associated with that. It's not a drop into your IDE and suddenly you're 30% or 100% or whatever faster. You have to shift the way you're thinking about developing software.
C
Absolutely. And we can see that in our own statistics. For one, we see a very strong correlation between PR throughput, so PRs merged, and ONA contributions. We closely track how much of our own code is produced through ONA, and we see a very direct correlation. And the effect that we see is that the more senior the folks, the more likely they are to be able to break this down and to adapt to this way of working. That's sort of what we see right now. It goes to your point: it's a tool that you need to learn how to use, and those are the skills that are needed, precisely. The other thing that you need to learn is good specification. A key challenge that we see, also with our customers, is underspecification. Agents are fantastic, but they're not magic. They can't read your mind, for better or worse. And so you need to learn how to prompt them well. I mean, the old thing is still true, though it's less true now. I think we all remember the early days, like a year ago: the arcane kind of prompting and prompting libraries and whatnot, and all the tips and tricks and all-caps "you must blah, blah, blah" and all that fun stuff. That's obviously much less true today. But still, there is an acquired skill to prompting well, you know, and to decomposing a problem.
B
Well, and I think some of what you've already described of your process of start with a design doc, and sure, the agent can help with that, but that gives you an artifact where you can look at decomposition, you can look at approach, you can look at all these different things and kind of have a feedback loop before you set this thing loose for an hour and a half to generate 20,000 lines of who knows what.
C
Absolutely. The other thing that we find is also with change sets of any size, but specifically, the larger they get, the more sort of deterministic control mechanisms you have that help the agent understand if it's doing the right thing, the more likely you'll get a good outcome.
B
This is a really good point and I think also connects to your time between disengagements. Right? So the more you have a feedback loop, the more you are able to validate the outcome of the thing, the better it can do things. And if it's a human in the loop for all of your validation, you can't let it go very far before you validate. But if you can deterministically validate a whole swath of things, suddenly it can spin on its own. So what are the guardrails that you put in place for that?
C
So we have a bunch of mechanisms where, for example, we hook into our CI system and we have prompts, and it's literally just a prompt, where the agent behind the scenes will do a sleep 360 or something. It's going to sleep a while, wait until the CI system has done a bunch, and then go and look at the logs. It's a bit like how a human would operate. Like, I don't necessarily get a ping when my CI system is done, but I'll check back every once in a while. And we found that the same very simplistic mechanisms also work for agents. The other thing is that really good engineering practice now, I think, really pays off. So having a well set up linter, using a language that favors standardization. And it's not a secret: we use Go. And we use Go for good reason. It's a very opinionated language. Not everyone agrees with the opinion, but everyone likes that there is one, and this turns out to be very, very helpful. Like, consistency turns out to be very, very helpful. The mental model that I've come to apply here is that agents, in a way, are like a jet engine that you can strap onto your plane, and either your airframe is rigid enough to withstand the acceleration and velocity, and then you're going to go very far, very fast, or you're going to come undone in midair.
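The "sleep, then check back" loop described here might look roughly like the following. This is a minimal sketch and not ONA's implementation: `fetch_status` is a hypothetical stand-in for whatever call queries the CI system, and the status strings and `max_checks` limit are assumptions. Only the 360-second interval comes from the conversation.

```python
import time


def wait_for_ci(fetch_status, interval_seconds=360, max_checks=10, sleep=time.sleep):
    """Poll a CI run until it finishes, sleeping between checks.

    `fetch_status` is a hypothetical callable returning "pending",
    "passed", or "failed"; a real agent would query its CI system here.
    """
    for _ in range(max_checks):
        status = fetch_status()
        if status != "pending":
            return status  # finished: the caller can now go read the logs
        sleep(interval_seconds)
    return "timed_out"


# Simulated run: two pending checks, then a failure. Sleeps are recorded
# in a list instead of actually waiting.
statuses = iter(["pending", "pending", "failed"])
naps = []
result = wait_for_ci(lambda: next(statuses), sleep=naps.append)
```

Passing `sleep` as a parameter keeps the loop testable without waiting six minutes per check.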
B
I kind of love that. I have a similar thing where I say these things just speed up everything. So if you have sloppy practices, they're going to speed up the slop. If you're accumulating tech debt, they will speed up the pace at which you do that. And if you're doing good practices, it'll speed that up. I'd actually love to dig into the language choice piece a bit, because we too are doing a lot of development in Go, and there's differing opinions on the team about it, but I have definitely observed that it seems to be a language in which the LLMs stay on the rails much better than many other languages. So I'm curious, what have you seen in terms of the characteristics of a language, beyond just being very opinionated, which Go is? What are the characteristics that lead to good agentic coding?
C
Start with the simple things. Using white space and indentation for structure is just generally a bad idea. Sorry, Python. So having something that is more structured and that helps an LLM infer structure is helpful. Obviously there are AST-based toolings that help with that, but for simple text-edit-based modification we find that these more C-esque languages work very well. Also being idiomatic: having only two ways to shoot yourself in the foot is generally more successful than having five different ways of doing that. So we come back to the consistency piece and, you know, consistency can be enforced through the language, but it can also be enforced through your practice. A well-configured ESLint goes a long way. In terms of languages, the other thing I'd say, well, we've observed it many a time and there's no surprise here: there's a very clear correlation between public training data and the quality of the code changes. So if you're writing COBOL or FORTRAN at a bank, I'm afraid the frontier models aren't going to do that great a job at editing your code. If you're writing Java, if you're writing, you know, any of the big languages, you're going to have a much, much better time.
B
Yeah, I think that makes sense. I have a hypothesis I want to bounce off you as well, which is that I think Go's dependencies, the way that it imports things, and the fact that it doesn't have a full-featured object inheritance model are also helpful, because in my experience LLMs are very linear thinkers, and being able to linearly look from here to where the code that defined it is, and not have to climb an inheritance hierarchy or anything like that, seems to be related to reliability.
C
I agree with that. I think it comes down to what kind of tools you give the agent. And one of the things that really surprised me when Claude Code first came out was that there weren't any specialized tools. It's all grep and file read. It goes an awful long way. And if you look at the generation of systems that came before that, which invested so heavily in RAG and semantic indexing and got very involved in trying to build up a higher value model, and then Claude Code comes along and goes like, nah, you don't really need all that. The flip side of that, I think, is that where you do need it is exactly for those nonlinear languages, and probably for code bases that are large enough that this reasonably simplistic way of navigating the code base doesn't work anymore.
B
I do wonder, coming back to this: we talked about how this is changing the cost curve, and we talked about it changing the cost curve for different types of deployment or businesses, but I also wonder if it changes your cost curve for code bases. Does this push us more towards smaller code bases, or towards, you know, separated code bases versus monorepos? Like, where do you see that side of the industry playing out?
C
So we already see that, and we'll see more of that, where code bases get specifically adapted to work well with agents. You know, things like AGENTS.md and having rules in place. Also, I think we'll see values like consistency become more and more important, because not only are LLMs linear thinkers, they're also very good at generalizing within the things that they've seen. And so if they see the same thing, they're more likely to produce something that looks like that. I haven't thought enough about where this is going, like, whether it's going to push us toward a mono versus multi repo. I personally have always had a very strong preference for monorepo because, in the heydays of microservices, we dabbled a bit in that, and boy, did we take many left turns that I wish we hadn't. I do feel like there are a lot of large organizations that famously run on monorepos and do that very, very well. If you have infrastructure that can handle it, if you have a system that lets you instantiate and clone that monorepo and work with it well, something like ONA, then this is a really good way of going about things, because you have it all in one place and you resolve an entire layer of lookup and navigation.
B
100%. I had a problem split across two repos. I was trying to get an agent to solve it, and I was like, how do I tell it? Oh, now I need tooling to say, okay, these are the repos you got to pull, and you got to do it this way. Monorepo resolves that up to some level of scale. But to your point, then the agent has to be able to explore everything.
C
Or at least navigate it. It has to be able to understand where it needs to look. And every day, almost, I'm actually amazed by how well the combination of find and grep actually works.
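To make the "find and grep" style of navigation concrete, here is a minimal sketch of the kind of plain text search an agent can lean on instead of a semantic index. It is an illustration only, not any agent's actual tool; the `grep` helper, the `*.go` default, and the sample file are assumptions.

```python
import re
import tempfile
from pathlib import Path


def grep(root, pattern, glob="*.go"):
    """Yield (relative path, line number, line) for matching lines under root."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                yield path.relative_to(root).as_posix(), number, line.strip()


# Hypothetical repository with a single Go file, searched for a definition.
with tempfile.TemporaryDirectory() as root:
    package_dir = Path(root, "store")
    package_dir.mkdir()
    Path(package_dir, "user.go").write_text("package store\n\nfunc CreateUser() {}\n")
    hits = list(grep(root, r"func CreateUser"))
```

Each hit gives a file and line number, which is enough for a follow-up file read at the right location.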
B
Fair enough. On that subject of adapting code bases: you mentioned AGENTS.md, you mentioned linting rules, consistency. What other things do you think are useful adaptations for a code base to make it agent ready?
C
Well-configured tools. So coming back to the deterministic validation that agents need to do: having standardized development environments, so that everyone uses the same set of tools and hence everyone's agent uses the same kind of validation, I think greatly helps in overall lifting code quality. If everyone uses a different version of, you know, whatever it is, Node or lint or Java, what have you, you're going to have so much more variance. Fundamentally, it's a game of reducing variance. Generally, I think that the overall theme is driving standardization across your team or teams. This is my own take, but I've seen several cycles of this. As an industry, every five to seven years we go through cycles on the spectrum from "engineers are the kingmakers, let them do what they want, they know best" to very rigid and centralized: "we're now going to do it all one way, and this is my way, and this is how we're going to do it, and you all have to comply." And we pendulate between those. Right now we're pendulating towards standardization. We've had this phase of letting everyone do what they need to do, and all of a sudden you end up with thousands of repos, and it's like, that guy over there uses Erlang for some reason, when we're a Java shop. So right now we're pendulating back, and I think this comes at the right time. I would argue that agents are an accelerant to that. And so in terms of what you need to do to your code base to set it up for agents: lean into that, lean into the standardization, and find tools that help you drive the standardization across your organization, that help you also lift this beyond the repo. You know, if you're coming from this world where you have a thousand different repositories, each of which does its own thing, what you need now is a layer that helps you standardize across them. And it's not config files in the repo.
B
Well, one of the through lines I'm seeing through this conversation is we need to lift the level at which we're thinking about these things, right? You are not designing this line of code, this function anymore. You're designing the rules that the agents are using to modify all of this. You're not designing your one repo anymore because each one has to be different. You're designing, like, what is the system that these autonomous things are going to go off and do? So there's a whole layer there of those decisions, and there's a whole layer of tooling that we need to deal with that. And so I kind of want to come back to, right, you're building this agentic building block. You built the IDE for the cloud, for the last generation of IDE. What is the development environment when you lift up one level of abstraction?
C
I think there's a version of this as it stands today, and there's a version of where we're going. The version as it stands today is being able to engage with the software that you're interacting with at the right level, and being able to go down levels of abstraction as needed, and also degrees of concentration as needed, if that makes sense. An analogy here is, even today, if you're writing embedded software, maybe you're writing that in Rust, but parts are still in C and parts may still be assembly. Or the Linux kernel, for that matter, you know, same thing. So I think the same thing is true here, where a good part of your specification now is English. I don't think English is a programming language, but you're going to use English to describe a lot of your problems. That's one level of abstraction. And then you go down into cursory glances at code, and then you go down into very deep engagement with code. And so this interaction that you have with your software, I think, needs to live on these different levels. That's where we're at today. There's also the idea that ONA is built around, or at least the interaction with ONA is built around: where we're going. Right now, essentially all agents focus on the production of change, the direct interaction with code, if we're honest. And then of course there are some who focus on adjacencies, such as code review. We all at this point reasonably understand that this is a hotspot, something we need to look into. Hot take: I don't believe putting a bunch of comments on a pull request is the end-all of that. We'll see, and we already are seeing, agents permeating more and more of the SDLC. Also on the right of the commit, like post-deployment, we see agents come in to help with operational work, and we see agents come in on the PM side of things and on the planning side. And I think we'll see that more and more.
Obviously, there are some large players who try and play the entire SDLC, but in reality it's still reasonably disjoint. And I think what we're going to see over time is some form of consolidation, whatever that time is. I mean, I'm not going to get too concrete here, but we will inevitably see that. And with that, the level of abstraction for the entire SDLC will lift. Okay, I'm going to make this more concrete. I'll say five to 10 years.
B
So you don't think programmers are going to be out of a job in the next six months?
C
Oh, no. Jevons paradox is so real. Like, we're making it cheaper to produce software, so we're going to produce more of it. Whenever we've made it cheaper to do anything, we just did more of it. And the same thing is true here.
B
Yeah, absolutely. I think there's something. I'm really excited to see what that 5 to 10 year looks like, because to your point, all of the layer of tooling that we're looking at right now is about producing code pretty much and evaluating that produced code and doing that sort of thing. But we are still as humans having to make most of the decisions behind that code. And as more code is created, there's more decisions to be made. And so how do we design an environment that elevates those decisions and makes it easier to make those decisions that are key and necessary while abstracting away the lower levels? Unless you have to pull out your scalpel.
C
Absolutely. I think the, you know, what language do you want to use to specify the problem you need to solve? And there are obviously entire professions whose job it is to translate between these different languages. Translating between: here's the business problem that we need to solve, and here are the hypotheses of how we're going to solve it. And eventually that translates into code many, many hops later. And so the question that we'll collectively need to solve is: what is the right language? What is the right form of expressing intent and expressing the problem we need to solve? One way we see that show up already today is that the more classic ways of limiting what you can do on a machine in a particular environment find their limits. So concretely, we see all the different agents having some kind of deny list or allow list mechanism where you can specify, in this or that syntax, what the agent's allowed to do or not. We have the same thing. You can specify that the agent is not allowed to run AWS because you don't want it to drop your production database. Not that any agent would ever do that.
B
But I have definitely talked to people who've dropped production databases from their agents. Gemini, I hear, is particularly prone to getting rid of databases, or was.
C
Let me reset that state. Yeah, you shouldn't have to drop that database. You're absolutely right. So we too have these deny lists. Now, they work today, they're okay, but they're not good enough. Fundamentally, the thing I want to avoid is it dropping my production database, not it running the AWS CLI, right? The level of abstraction for how policies look, especially in a world where we have actors that don't care about getting fired, is going to be a really, really interesting challenge. I think this is one of the first places where we're actually going to see this right level of abstraction show up.
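The gap between a command-level deny list and the outcome it is meant to prevent can be shown with a small sketch. This is not ONA's policy mechanism or syntax; `DENIED_EXECUTABLES`, `is_allowed`, and the example commands are hypothetical.

```python
import shlex

# Hypothetical deny list keyed on the executable name only.
DENIED_EXECUTABLES = {"aws"}


def is_allowed(command, denied=DENIED_EXECUTABLES):
    """Return False when the command is empty or its executable is denied."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] not in denied


# A harmless read-only call is blocked because it uses the denied tool...
blocks_harmless = not is_allowed("aws s3 ls")
# ...while a destructive statement through another tool passes the check.
allows_destructive = is_allowed("psql -c 'DROP DATABASE prod'")
```

The check reasons about which tool is run rather than what the command does, which is the mismatch in abstraction level described above.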
B
Well, you highlight something important there too. When you say actors that don't care about getting fired. Fundamentally there still need to be humans responsible for the outcomes of this.
C
I guess it's a bit like self driving cars, no?
B
If I commit some code that an agent wrote and it brings down production, who's responsible?
C
Probably you. But who says it's you committing it? You know, if you look at the economics of it, I would argue it's reasonably inevitable that we'll have agents who commit code and agents who review code. There's regulatory work that tries to stem that tide, where we're mandating: if you generate the code, humans have to review it, or if you write the code by hand, AI can review it. Fun stuff like that. But give it time. The economic reality of it all will mean that sooner or later we'll see code that no human has ever seen. And then who do you hold responsible?
B
That is a great question.
C
Self-driving cars might just have the answer. And as a company that builds an agent, I don't necessarily want us to go their way. But the way this works for self-driving cars is the manufacturer of the car. I'm sorry, Tesla, but if you're building a self-driving car and it kills someone or it gets into an accident, you need to prove that your tech, or, there's some level of scrutiny that you're now subject to. And I think we'll see similar things happening in agentic software engineering.
B
That's fascinating. So essentially we get to a place where ONA or whatever other agentic provider you're working with has responsibility, liability, something along those lines for what is ending up in production somewhere else.
C
We might end up there. Personally, again, I obviously hope we don't, but we might. I think a key difference, at least for now, where this metaphor falls down, is that automotive is and has been subject to so much more regulation than software production. We do also see a trend that, again, personally I don't particularly like: specifically in Europe, we see an increasing trend towards regulating the production of software and equating it to the production of physical goods, with all the regulatory downsides that come with that. I would not be completely surprised if we saw that elsewhere as well.
B
Yeah, well. And it does come back a little bit to this question of what level of abstraction are we operating at? Where is there a human involved such that responsibility can be applied? Is the human making the decision about what the specification is? But then the agent creator is responsible for creating that with fidelity. If it meets the specification, then the agent maker's off the hook and it's the person who wrote the spec that's responsible. It's a fascinating problem.
C
Absolutely. I like what you raise here, where there's a shared responsibility model. And it's not the first time we've landed on that idea. Maybe this is what we end up with. The model that we're engaging in today, where it's essentially one human that's responsible end to end, also has very clear limitations because maybe you're responsible for it, but how can you make good on that responsibility when agents are producing 20k lines of code?
B
Yeah, well, and that's. I think one of the things that people are running into in terms of adoption.
C
Right.
B
Is we need to shift our mindset. I'm reminded of the shift in how we manage hosting and servers and things like that, right? You used to have your server that you carefully curated, your pet. It had a name, all these things. And we moved to: servers are commodities, they're cattle. You get some compute, you get some memory, you rent a GPU from here, all those different things, and they're managed in some sort of infrastructure-as-code setup. You don't think about the special details of one. Are we moving to code as cattle?
C
I think we are. Yeah. I like the analogy. Code as cattle also sounds good. There's an interesting element of that also, going full circle. As software engineers, many of us have tied our identity to our ability to craft code instead of solving problems. You know, we come back to that. There's a pride in the craftsmanship, for good reason. For good reason. But I wonder at what point, and maybe we're already past that point, that is a luxury that, at least for a business, is no longer economically viable.
B
I mean, to switch metaphors to a different area. I don't think I own any handcrafted furniture. There are still people who make it, there are still people who buy it, but the vast majority out there is manufactured.
C
I'll provide the counterpoint. Every piece of furniture in this room I built. But the.
B
Well, and I still write software sometimes, right?
C
Exactly.
B
For me.
C
And my house has plenty of IKEA furniture. You know, there is the reality of: if I need a bed for my son, am I going to go and build this? Well, maybe, but in reality I'm going to go to IKEA and buy it. I think the same is true for software. Like, there's a big difference between writing software on the weekend because you enjoy the process of writing. But as a business, are you going to equip your office by having your employees build furniture? Absolutely not. And the same trend I think we'll see in software.
B
And that is a massive mental shift for software engineering.
C
It's interesting because this same shift I think we see in adjacent roles too. Fundamentally what's happening is a consolidation of skills onto ever fewer people. We're noticing that with that increase of level of abstraction, we can now skip some of this really complicated lossy process that is communication and consolidate things in singular brains and with that become more effective. So, you know, the traditional product manager role now consolidating into, you know, design engineer, product engineer, whatever they're called, member of technical staff. Like, we see that consolidation happening elsewhere and it's only natural as the level of abstraction rises. And so as engineers, I think the thing that we really need to re identify with is solving problems, not the hammer with which we solve it.
B
I love that. Well, we're getting close to the end of our time. Is there anything we haven't talked about today that you think would be important for us to cover?
C
I think we've covered it all. The one thing I'll add is at this point, the models that we use today are the worst models we'll ever use. And if those were all the models we'd ever have, which clearly they're not, you know, Sonnet 4.5 just dropped. Is it a light year change over Sonnet 4? No. Is it a good change? Probably. You know, first vibe checks look promising. Like, it does more, it does things better. And the same was true for GPT-5. And we're going to see an improvement in models. But even if that stopped today, the way we write software has fundamentally changed, and there is simply no going back. And I probably don't have to tell this to the audience of this podcast, but as engineers, really the best thing anyone can do right now is to embrace this reality that we find ourselves in, and fighting it is inevitable. It's like trying to fight the Internet. It's not a fad. It's not going away.
Software Engineering Daily | October 23, 2025
Host: Kevin Ball (K. Ball)
Guest: Chris Weichel, CTO & Co-Founder of ONA (formerly Gitpod)
This episode explores how autonomous AI agents are transforming the way engineers develop, manage, and ship code—particularly in enterprise and cloud-native environments. Chris Weichel shares insights from building ONA, a platform for AI-native software development, and discusses the evolving developer experience, changes to IDEs, the advantages and risks of agent-driven workflows, industry cultural shifts, code review bottlenecks, and the future of software engineering as AI accelerates both productivity and complexity.
“Reducing the time between having an idea and making it reality—that’s the job of any good tool.”
—Chris Weichel [02:54]
By moving dev environments off individual laptops and into ephemeral, cloud-instantiated VMs, ONA enables resource isolation, parallel agent execution, and fast, safe experimentation ([04:18], [06:25]).
ONA introduces three pillars:
Parallelism as a new workflow:
“It’s a bit like playing a one-armed bandit... agents made it incredibly cheap to pull that lever, and you can play five of them at once.”
—Chris [09:52]
Old Paradigm: Deep, singular focus; the IDE as a craftsman’s sanctuary.
New Paradigm: Find joy and flow in orchestration and oversight of multiple agent and human-driven changes ([07:50], [10:23]).
Interfaces must evolve:
“The only way we can turn this autonomy into productivity is by doing multiple things at the same time, which is the opposite of this deep focus work. The interfaces... need to change.”
—Chris [07:50]
“We were all promised we’d get to do the creative stuff and in reality we’ve been reduced to line workers who now just review code.”
—Chris [18:15]
Chris outlines three main usage patterns ([23:58]):
“Trying to get 100% there with the agent—there’s diminishing returns... It’s the hedge trimming versus scalpel.”
—Chris [29:03]
“Agents are like a jet engine you can strap to your plane... your airframe has to be rigid enough, or you come undone in midair.”
—Chris [39:18]
“The overall theme is driving standardization across your team or teams... I would argue that agents are an accelerant to that.”
—Chris [45:13]
“As engineers, I think the thing we really need to re-identify with is solving problems, not the hammer with which we solve it.”
—Chris [59:58]
“Fighting it is inevitable. It’s like trying to fight the Internet. It’s not a fad. It’s not going away.”
—Chris [60:06]
On the Flow of Parallelism and Addiction:
“The ease of building your own tooling has just gone through the roof. Right. So it's like, oh, yeah, why not spend another five minutes building myself another tool?”
—K. Ball [10:13]
Metaphor for the Productivity Shift:
“Highway driving — I don't need to be at the steering wheel; city driving — probably I want to be at the steering wheel. ... Lane assist is your tap-tap-autocomplete ... and the backseat of a Waymo is where we're heading with agents.”
—Chris [29:03], [30:04]
On Code Review as Bottleneck:
“Code review becomes a hotspot because we’re producing so, so much code. And reviewing is still very hard and still a very manual, human attention-intensive activity.”
—Chris [18:15]
On Increasing Specialization and Consolidation:
“There is simply no going back. As engineers, really the best thing anyone can do right now is to embrace this reality that we find ourselves in.”
—Chris [60:06]
Chris Weichel and K. Ball paint a compelling picture of a near future where agentic, cloud-native tools and practices are the norm, human roles are elevated toward intent and orchestration, and software craftsmanship shifts from code-level pride to problem-solving at scale. The era of AI-enhanced productivity is not just coming—it has arrived, and engineers are urged to adapt, experiment, and embrace their new toolkit.