![[AIEWF Preview] Containing Agent Chaos — Solomon Hykes — Latent Space: The AI Engineer Podcast cover](https://substackcdn.com/feed/podcast/1084089/post/186632797/dcdef1f295d7b1f29b3701afd5b2428a.jpg)
A
Hey everyone. Welcome to a Latent Space Lightning Pod. This is Alessio, partner and CTO at Decibel, and I'm joined by my co-host Swyx, founder of Smol AI.
B
Hello, hello. And today I'm so happy to have Solomon Hykes join us. Solomon, you're most famously the creator of Docker.
C
Hi, thanks for having me.
B
You started Dagger six years ago and I think originally it was pitched as some sort of infrastructure provisioning thing. I'm sorry, I'm probably totally mangling this in front of you. How do you introduce Dagger today?
C
So Dagger is, I think, six. Yeah, six years, I guess, sounds right. It's a workflow engine. It's an automation tool for software teams that want to deliver software faster, more efficiently. And it takes all these workflows that are usually semi-automated with artisanal scripts, you know, your builds, your tests, your kind of end-to-end pipelines, and it turns them into robust modular workflows that you can drive with code. And it all runs in containers. So it's highly portable, highly isolated. You can run them locally or in CI, which saves a lot of time. And we're an open source platform. We've got a very active and engaged open source community, mostly made of platform engineers, you know, those systems engineers that actually design the factory and run it and enable the developers on the team to be more productive. So that's our core community.
B
Yeah, in some ways. Yeah, sorry, go ahead.
A
I was going to say, just to make that clear: these are both pre-development, so spinning up an environment for people, and then also between when you're done writing the code and getting ready for production. Are those basically the two entry points?
C
We started mostly post-development, so anything that happens after you've saved and you're ready to see what happens next, you know, build, test, you want to take that live. So there's that delivery loop, right? We've been focusing on making that delivery loop more efficient because it's really terrible in most places, and there's a lot of inefficiencies that could be cleaned up in a lot of ways. It's a, you know, the cobbler has no shoes situation. Those platform engineers, they spend all their energy and their significant experience helping developers get the best possible tooling and the best possible experience, but for their own tooling, it's sort of like, okay, we gotta cobble this together with Bash scripts and YAML. So we've been focusing on that post-development, although recently we're getting pulled into the dev loop, in part because of this crazy change that's sweeping the whole market, right, with agents.
B
Yeah. So I think let's kind of run right into that. Obviously, there's a lot more of context on Dagger and Docker that you've done prior, but we are an AI podcast, so why not? Let's just go right into it. A few months ago, you messaged me and you were like, I think this is the biggest idea I've had since Docker itself. And I was like, what? Like, Docker is very big. What is your context going into the AI builders? In some ways, I've always said, basically, like, to us, dev tools people, it is just more of the same. Everything that you wanted, you just need 100x more. So how did you approach this?
C
Yeah, we didn't think of ourselves as an AI company. We got pulled into it by our community, our users. Because although Dagger is primarily used to build CI/CD pipelines, historically, we've never thought of ourselves as a CI/CD company. We like to build platforms from first principles, and then we encourage our community to go and apply it. So what we have is an engine for automating workflows, making them reliable, portable, and giving them very, very clean environments to run in. And the environment is key. And of course, we use containers because that's what we know. And container tech is still today underutilized, misunderstood. It can do so much more. So our community pulled us into this AI space because of agents, because these platform engineers started messing with agent loops. They want to insert LLMs into their workflows. And now it's becoming more popular to run agents in the context of your CI, to automate more parts of your delivery. And then they started showing us that everybody wants to use these coding agents now, you know, so if you're a developer, increasingly, your job is not going to be to actually develop, but to manage and enable these coding agents. And we're at the very beginning of this. You've got this one agent in your IDE helping you, but now you want more than one, right? You want a team of agents sort of doing the work for you. And that transition from a team of one to a team of multiple coders, that's basically what our community of platform engineers deals with. So what we're witnessing is developers becoming platform engineers. They have to learn how to enable others to be productive. These others, of course, are AIs, and to do that, they have to give them environments to work in. And you can see the problem when you see someone livestreaming their vibe coding.
B
Right?
C
Everyone's vibe coding, but you can kind of only make one set of changes at a time. Everyone's sort of messing with this dev environment that really isn't cleanly isolated. So what we're doing is we're taking this technology that we invented for CI/CD and bringing it into the coding agent's environment, and giving your agent basically a perfectly isolated, reusable and portable environment, so that you're not completely locked into this one app connected to this one model running on this one cloud infra provider. Right. You want the environment where the agent does its work to be decoupled, to be its own thing that you can manage and look at and then move to another platform if you want. Right. That's sort of what happened with Docker in the previous wave, when everyone was adopting cloud technology. Everyone was building these big platforms and they had everything, but they were highly fragmented. They tried to kind of cobble everything together like a monolith. So you didn't have this portable environment that you can carry around with you. You were trapped in one big platform. And the same thing's happening now. So we want to use our experience from the past to enable this new generation of developers to really unlock the potential of these coding agents. That's what I'm excited about.
B
Yeah, I think the scope of what people do for coding agents today is maybe kind of a single VM, let's call it. Especially the coding agents from the big labs, they don't even have Internet access. They have pre-installed libraries. It's very, very limited, and it could be so much better if there were standards for it. I think obviously the standard needs to be open source, but there's a question of the design constraints that you're coding for. So for example, Gitpod is another market participant I've seen, where they're like, yeah, we really need to isolate this kind of almost like a mini VPC. You need the networking sorted out and you need storage, everything. Right? Just all the fundamental units of compute. So I guess, what is the container here? What is that concept here?
C
The unit is, I think, still actual containers as the base layer. That's my first insight here: you don't have to reinvent the core technology that already exists, but you've got to rethink the tooling. Because the tooling as it stands, I mean, honestly, it's a little frustrating for me. Because we busted our ass on Docker and we built this whole ecosystem and we invented the Dockerfile and Docker Compose and the format and the CLI, and then, you know, we kind of messed up on the follow-through, and at some point Docker basically stopped innovating and I left, and the ecosystem continued around containers. But it didn't actually pick up where we left off. Everything towards applying container tech to development kind of stopped. And you know, there's still a lot of interesting experiments, but it never had the unity of purpose that this initial movement had. And so you have fragmentation, right? No standard really emerged beyond Dockerfile, Docker Compose, and that's it. All the effort went into infrastructure, you know, Kubernetes, scalable storage, scalable networking; that kept moving a lot. And if you go to KubeCon or any of those events, you see all the infra people applying containers to solving that problem. But for development, for dev environments, it really hasn't moved that much. Use containers would be my first recommendation. But then, yeah, I just think you gotta design a solution from first principles. It's just a difficult design problem, and I'm very excited because there's an opportunity to go through this design process again from first principles. So yeah, I'm aware of Gitpod, of course. You know, there's dev containers as a standard in IDEs and there's a million other products out there, but none of them have convinced the majority of humans to develop in them with their tool. It's very fragmented. In fact, most people don't develop in a container still. Right. 
They just develop on their laptop. That's just a failure of tooling. So yeah, I think it's anyone's game how to apply container technology to the perfect developer experience. But the criteria are: it has to be well isolated. Like you should be able to have a bunch of agents working in parallel and they don't mess each other's work up. It has to be portable. It should not be locked to a model or a cloud provider. It should also not be locked to an IDE. Like, that's crazy. If you think a whole team is going to standardize on the same IDE forever, you're deluded. It should be fully observable. You should be able to see everything that happens in that environment end to end. Everything from what the model's doing and thinking and saying, all the way to what the tools are actually running and what the state of the environment is. You should be able to see everything in one place. And you need a strong multiplayer element. You need agents and humans to both be able to interact with that environment, so that you can tell an agent, do this. And when the agent says, I did it, you know, you can go and verify: okay, did you do it? Give me the keyboard for a second. You need all those things. And right now I'm not seeing us heading in that direction. I'm seeing us heading in the direction of highly integrated, very vertical, end-to-end monoliths, you know, and I don't want to name names, but it's definitely a market trend. Like, if you're selling an IDE right now and people are asking for more customization for the coding agent's environment, you're going to add some proprietary way to customize the environment. You're going to add your own observability solution. And when people ask for a way to have the agent work in the background, you're gonna add your own hosting solution to run the agent in the background. That's a monolith. That's what fragmentation looks like. 
And that's exactly what we were up against in the early days of cloud that led to the creation of Docker. Right. So we're gonna need some sort of a standard, not on everything, just on the environment. Right. Just that little piece that connects all the other pieces. It's not the most powerful piece, but it's the linchpin that connects everything else. You know, the environment in which the agent does work, that should be independent.
A
What do you think is the biggest limitation today? So I use Docker to develop, and I force the coding agent in the IDE to run commands in the Docker environment. If I had to give feedback on it, it's like, one, the agent cannot make changes while inside the container and propagate them back to the Dockerfile and Docker Compose. So it kind of spends all the cycles, and then all that work is kind of lost. And there's no AI-native interface. I'm basically just doing docker compose exec and running the command in the container, instead of having a more native way to do it. How do you think about that changing with the new Dagger approach?
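As a rough illustration of the workaround Alessio describes, here is a minimal Python sketch that wraps an agent's shell command so it runs inside a Compose service rather than on the host. The helper name `compose_exec_argv` and the service name `app` are hypothetical; the only assumption about the real tool is the `docker compose exec` subcommand with its `-T` (no pseudo-TTY) and `--workdir` options.

```python
import shlex
from typing import List, Optional


def compose_exec_argv(service: str, command: str,
                      workdir: Optional[str] = None) -> List[str]:
    """Build the argv that runs `command` inside a running Compose service.

    Hypothetical helper for illustration; it only builds the command line
    and does not execute anything.
    """
    # -T disables pseudo-TTY allocation, which suits a non-interactive agent.
    argv = ["docker", "compose", "exec", "-T"]
    if workdir is not None:
        argv += ["--workdir", workdir]
    # Run the command through a shell inside the container so that pipes
    # and redirects written by the agent behave as expected.
    argv += [service, "sh", "-c", command]
    return argv


if __name__ == "__main__":
    # "app" is an assumed service name from a docker-compose.yml.
    print(shlex.join(compose_exec_argv("app", "pytest -q", workdir="/src")))
```

Note that this only redirects execution. Anything the agent installs or changes inside the container is not written back to the Dockerfile or Compose file, which is the gap raised in the question.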
C
I mean, yeah, you just gotta rethink. It's just a design process. I mean, Dockerfile was something we designed as a stopgap prototype, thinking, oh, we'll clean this up later, in 2013. You know, it's been more than 10 years. Compose was a clone of a clone that we acquired into the team and stitched on top. And then as soon as we launched it, you have to understand, there was so much excitement and demand, people didn't want it to move. You know, you would build platforms on top and then you'd be like, don't change the syntax. And so that stuff has been frozen in time. And I commend Docker for maintaining it and keeping it alive and just doing the hard work of maintaining it. But it's not agent native, it never will be. So you just gotta go through a good engineering and design process of understanding how people develop with agents, understand what's the best UX that they need, and try to design that on top of technology and components that are as standard as possible. You know, you don't want to reinvent the wheel where it's not necessary. But we've got a bunch of standards to work with. We've got the container tech, I mean, it's there, it's universal. We've got Git. We've got the LLM, you know, the OpenAI API spec, right, and its derivatives. And now we've got MCP. So that's a pretty good set of standards. We can work with that. But yeah, it's got to be a new UX, in my opinion. And I mean, not to plug Dagger, but obviously Dagger is our vehicle for going through this design process. So, you know, if you want to see my particular opinions on how things should be designed and how you want to balance, for example, simplicity versus modularity, then, in my case, look at Dagger. But yeah, it's normal that you can't just tape existing tools as is onto new workflows and hope it'll be perfect. That's totally normal.
B
Something I bring up in my work history a lot. And I think, you know, I also worked in workflow orchestration at Temporal. Yeah, and this migration from, let's say, a custom language, like a Dockerfile or an AWS Step Functions type thing, into more of a programming language like TypeScript or Go, that is the exact same journey I took.
C
Yeah, I mean, it's really hard to find the right balance. I think of it as LEGO, because it's a hard problem to solve. These workflows and these environments, because no two workflows are the same, no two dev environments are the same. It's like factory design. Every great product has its own factory that's unique. No one goes and buys a factory at the factory store. You design it and you build it alongside your products. And that's what these things are. It's like a factory. And it's really hard to provide tooling for that space, because if you make it too complicated and too customizable for no reason, then you're wasting people's time. It's just, why buy this? I'm just gonna do everything myself from scratch. But if you try to streamline and simplify too much, then you're restricting choice and then it becomes useless. You know, like, well, this doesn't fit in my factory because my factory uses this system and I can't reconfigure it. So it's a really hard area of design and engineering. And I mean, to me, the GOAT in this space is LEGO, you know, the actual LEGO. And it's used as an analogy so much that it loses its meaning. But if you really think about what LEGO really is, it was in reality very difficult to design and engineer the LEGO brick, because it had to be just right. And it's designed as one component, but it's designed with a much larger system in mind. Right? So there are sort of two layers of design. That's what's so hard. And that's why LEGO is genius and has stood the test of time. That's what everyone in this space should be trying to do: build a better LEGO system that is worth adopting. It's expensive to adopt a new standard in your stack. It's one more of everything to worry about, right? So it's got to justify the cost by actually saving you something. Effort, money, whatever. That's what LEGO does. 
When I play, I like to play with LEGO because, you know, I can assemble things quickly, etc. So, yeah, that's the challenge. In the case of Temporal, I mean, I think these systems are very well positioned also for running agentic systems, right? Because an agent always has some sort of loop that's triggered by events. It's asynchronous. You gotta run that somewhere.
B
Yeah, I would say Temporal's focus is more sort of runtime applications, and then other systems are more focused on CI/CD. Honestly, you could use one for the other.
C
But yeah, they're converging, because your CI/CD will soon be nothing more than runtime infrastructure for your workflows. And all those workflows will become agentic. Right? It's going to be either workflows running LLMs or LLMs running workflows, all the way down. And CI is just, it's also events, a job dispatcher and compute. So I think these things will converge, with coding agents being the domain of application where everything meets. Right? Because when you're running a coding agent, you're running an agent. So it's a runtime problem, but it's a very specific area of application. Like, it's not real time. You don't have to worry about voice and video and things like that. You worry a lot about the artifacts that are being produced: are they repeatable? Can I trace how they were created? Was this binary created by an agent, by a model that went rogue? You know, things like that.
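To make the "events, a job dispatcher and compute" framing concrete, here is a minimal Python sketch. It is an illustration only, not how Dagger or any CI product is implemented; the `Event` and `Dispatcher` names and the `push` event kind are hypothetical, and plain function calls stand in for the compute that a real system would hand to runners or containers.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Event:
    kind: str      # hypothetical event name, e.g. "push"
    payload: str   # e.g. a commit identifier


Job = Callable[[Event], str]


class Dispatcher:
    """Routes queued events to the jobs registered for their kind."""

    def __init__(self) -> None:
        self._jobs: Dict[str, List[Job]] = {}
        self._queue: Deque[Event] = deque()

    def on(self, kind: str, job: Job) -> None:
        # Register a job to run whenever an event of this kind arrives.
        self._jobs.setdefault(kind, []).append(job)

    def emit(self, event: Event) -> None:
        # Events are queued, not handled inline, so producers stay decoupled.
        self._queue.append(event)

    def run(self) -> List[str]:
        # The "compute" step: drain the queue in arrival order and run
        # each matching job in registration order.
        results: List[str] = []
        while self._queue:
            event = self._queue.popleft()
            for job in self._jobs.get(event.kind, []):
                results.append(job(event))
        return results


if __name__ == "__main__":
    ci = Dispatcher()
    ci.on("push", lambda e: f"build {e.payload}")
    ci.on("push", lambda e: f"test {e.payload}")
    ci.emit(Event("push", "abc123"))
    print(ci.run())
```

In this framing an agent loop is just another job: it is triggered by an event, runs asynchronously somewhere, and may emit further events when it finishes.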
B
I think one element that I see a lot for these kinds of, like we've talked to Bolt and like there's all this universe of, let's call it ephemeral apps that people are making or vibe coded apps, single use apps even. Just because it's so easy to create an app now that you're like, I just, it doesn't matter. So the speed and the setup and the tear down is pretty significant. And then obviously the resource usage and cost, these are all things that I'm hearing the founders in these companies all trying to solve for and all finding the current set of tools lacking because we just don't have the, we cannot subdivide things that small or that cheaply. We can't start things up that fast. But the users always want this. That's why they're doing the shortcuts. That's why they're not using containers, because they're like, I don't know how to do this in containers.
C
Yeah, and there's a lot of duct tape. I mean, if you want to use containers yourself, you've got to duct tape a bunch of tools together. And it's not just containers, it's also the file system isolation. You know, everyone's playing with Git worktrees to try and get multiple instances of the agent working in parallel. That's the same problem, right? It's not just containers, it's something container based: containers for isolated execution, worktrees for isolated files, and you've got to connect it all together somehow. I do think I'm seeing a lot of vendors, of course, think about that and try to find the right solution for their customers. I think one mistake everyone's making, again, I'm making a historical parallel with the rise of PaaS. Like everyone had only you can do this. Hopefully I'm not the only one left. But I think it's going to be very apparent very quickly that when someone sells a commercial cloud product, a cloud-centric product in the AI space, like, they're hosting your stuff, right? You log in and they have your data, they have your traces, they run the model or they proxy to the model, whatever. Every solution they come up with will be very infra-centric, right? They're going to think of another hosted feature, another hosted service. So when they think environments, they think, how can I run these VMs on my infrastructure as fast and cheaply as possible? My scale is going to be X. I have this many customers, the pricing is X, and that's excellent. But developers also want to run stuff themselves locally, and they don't really want that to be an afterthought. But right now it is an afterthought, and it's the same mistake CI/CD vendors made. Like, CI/CD has no good local story. 
There are some open source projects that try to simulate, say, GitHub Actions locally, and it must be a nightmare to maintain such a project because compatibility is just so hard. But my point is just local execution. It's not everything, but it's a good test. Whatever solution you're imagining, does it support local execution? Will developers be able to run it locally and enjoy it? If the answer is no, you're solving part of the problem, but you're not fully solving the problem of standardizing dev environments for coding agents. It's not going to stand the test of time, because it cannot be ubiquitous. It'll be a great commercial solution, you'll make lots of money with it, but it's not going to be ubiquitous.
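The "containers for isolated execution, worktrees for isolated files" combination mentioned in this answer can be sketched locally in Python as follows. This is an assumption-laden illustration of the duct-tape approach, not Dagger's design: the helper name, the `agent/<id>` branch scheme, the sibling-directory layout, and the `/work` mount point are all hypothetical. It relies only on `git worktree add -b` and `docker run --rm -v -w`.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def agent_sandbox_commands(repo: str, agent_id: str, image: str,
                           command: str) -> Tuple[List[str], List[str]]:
    """Return (make_worktree, run_in_container) argv lists for one agent.

    Hypothetical helper: it only builds command lines, it runs nothing.
    """
    repo_path = PurePosixPath(repo)
    # Each agent gets its own checkout next to the main repository...
    worktree = repo_path.parent / f"{repo_path.name}-{agent_id}"
    # ...on its own branch, so parallel agents never share files.
    branch = f"agent/{agent_id}"
    make_worktree = ["git", "-C", repo, "worktree", "add",
                     "-b", branch, str(worktree)]
    # The container mounts only that agent's worktree and is removed on exit.
    run_in_container = ["docker", "run", "--rm",
                        "-v", f"{worktree}:/work", "-w", "/work",
                        image, "sh", "-c", command]
    return make_worktree, run_in_container


if __name__ == "__main__":
    for agent in ("a1", "a2"):
        for argv in agent_sandbox_commands("/home/dev/app", agent,
                                           "python:3.12", "pytest -q"):
            print(" ".join(argv))
```

One caveat with this particular pairing: a worktree's `.git` entry is a pointer back to the main repository, so Git commands run inside the container would also need the main repository's `.git` directory mounted. That kind of glue is the duct tape the answer refers to.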
B
So I'm going to ask a really hard question, maybe, but what does it take for one of the big clouds or big labs to adopt Dagger? So for example, right, I've had this exact same call, same problems, with the Microsoft team, and they're pushing dev containers, and obviously dev containers are not enough, but whatever, you know, they want to build around it, they want to extend it. And I'm like, okay, but everyone's working on their own thing. When do we get some consolidation in this space?
C
Yeah, well, who knows? I mean, honestly, everyone should just give it their best shot and design the best solution. Honestly, I think with open source, to some degree, well, it depends on the area, the domain. But I think for this problem, scale doesn't really help as much. Like, if you're going for foundational models, to take the extreme example, sure, let the best design win, but also, you know, I know the 10 names of who's going to win. It's going to be one of those 10; startups don't really stand a chance. This is very different, because it's not about scale, it's about developer experience, and it's about designing the interfaces in a way that actually helps people be more productive. So there's a lot of leverage for small teams to just design the best possible solution, and then you have to convince others to adopt it and build on it. So community and ecosystem are extremely important. If you're large like Microsoft, you have an advantage on that. Obviously there's a Microsoft ecosystem, it's massive, but you still need to have a solution. Right? So we're talking to Microsoft, we're talking to a whole bunch of people, but in this particular problem that we're targeting, you know, standardizing the dev environments for agents, you don't really need permission as a startup. You just go ahead and do it and build momentum. You know, there's less gatekeeping, I would say.
B
Yeah, awesome. I think like the other thing that does come to mind sometimes is that there are different layers in infrastructure. I think you're very focused on, I don't know how to call virtual hardware. Is that like a term that resonates, like virtualization? No, like that. When I look at, let's say what I can do with Dagger, it doesn't for example, have auth or billing. Right, the higher level still infrastructure.
C
Yeah, yeah, yeah.
B
But there are recombinations of things that are more sort of business facing than just the hard metal facing.
C
Yeah, I think what you're describing is a consequence of the LEGO approach. We're focusing on a platform problem: how do you create a modular system that can run anywhere, that you can connect to anything, and that allows you to compose the ideal environment, the ideal workflow, and it can be ubiquitous because it can be integrated into pretty much any existing system. In order to do that, you can't go ahead and design a complete end-to-end solution. Right. That's the price to pay. So Dagger will always be a component of a bigger platform. It will never be the complete end-to-end platform, because in order to do that, you have to force your customer to adopt your authentication and your UI and your storage and your networking, et cetera, et cetera. And we're doing the opposite. We're telling these platform teams, hey, we will adapt to the stack that you have. And so for example, we started this conversation with CI/CD because that's our traditional use case. Dagger makes your CI/CD better, but it does not actually replace your legacy CI platform. It runs on top of it, and it allows you to simplify it and turn it into basically dumb runner infrastructure. But it's still important infrastructure. So we don't come in and seek to replace what you have. We try to integrate with it and make the overall system better. And so as a result it's just a different shape of a product. Right. And we definitely don't solve auth. You know, you gotta pick a balance to help solve auth, but whatever you do, we'll integrate with it.
A
Yeah. Yeah.
B
Cool. I think that's plenty for an intro to where Dagger's at and, you know, the pull that you guys are seeing. I'm looking forward to your talk. To be honest, I think I haven't seen a good Solomon Hykes AI talk.
C
Well, I don't know that anyone has. Yeah. So hopefully I'll see one next week.
B
Yeah, I know, I know. You know, let's, let's, you know, feel free to lean on me to prep.
C
And all that, but thank you.
B
I'm excited, and I think when I saw the workshop submission come in, you guys had a really good, like, let's build coding agents, and here's the infra things. That really tied it together for me. That was like, okay, this is where Dagger is going. This is how, ultimately, I need the LLMs to start generating their own infrastructure.
C
Right?
B
Like generative infrastructure almost. I feel like that's like prone to a lot of spend. Yeah, it's going to happen.
C
Controlling the environment is, is a good like, okay, how much control, how much freedom really do I want to give this agent, you know, I mean, we're.
B
All past the stage where like these things are just running wild on the Internet, so why not? But like, you know, to give them my AWS account and just go nuts, like. Yeah, that's tricky.
C
But, you know, just so you know, depending on how things go, you know, there's like a 50% chance that for my keynote there will be some like brand new original stuff to show and talk about that is not yet out, like on top of what we're showing in the workshop and everything. So it depends if it's ready in time. But we may have like even more, even more fresh stuff to talk about.
B
All right, well, we'll let you get back to it. Thanks so much for your time.
C
Thank you. Yeah, thanks, guys. I'll see you next week.
Date: June 3, 2025
Guest: Solomon Hykes (Founder of Docker, Dagger)
This episode of Latent Space dives deep with Solomon Hykes, creator of Docker and founder of Dagger, on the crucial topic of infrastructure and workflow design for AI coding agents. The discussion spotlights how the transition to AI-driven software development is straining and reshaping conventional developer environments, and how Dagger aims to provide standardization and modularity that could “contain the chaos” of running teams of coding agents. The hosts, Alessio (partner and CTO at Decibel) and Swyx (founder of Smol AI), guide a conversation exploring the state and the urgent needs of the AI engineering toolchain, especially for AI agents, and Solomon's vision for how to design reliable, composable environments that empower both human and machine developers.
[00:20–02:37]
“It takes all these workflows that are usually semi automated with artisanal scripts, you know, your builds, your tests, your kind of end to end pipelines and it turns them into robust modular workflows that you can drive with code. And it all runs in containers.” — Solomon, [00:32]
[02:37–06:23]
“If you're a developer, increasingly, your job is not going to be to actually develop, but to manage and enable these coding agents.” — Solomon, [04:10]
[06:23–11:23]
“It has to be well isolated. Like you should be able to have a bunch of agents working in parallel and they don't mess each other's work up. It has to be portable... observable... you need a strong multiplayer element. Agents and humans [should] both be able to interact with that environment.” — Solomon, [10:00]
[11:23–14:02]
“Dockerfile was something we designed as a stopgap prototype thinking, oh, we'll clean this up later in 2013... it's not agent native, it never will be.” — Solomon, [11:59]
[14:02–16:38]
“I think of it as LEGO because... no two dev environments are the same. It's like factory design. Every great product has its own factory that's unique.” — Solomon, [14:20]
[16:38–17:37]
“Your CI/CD will soon be nothing more than runtime infrastructure for your workflows. And all those workflows will become agentic.” — Solomon, [16:45]
[17:37–20:59]
“Local execution... it's a good test. Whatever solution you're imagining, does it support local execution? Will developers be able to run it locally and enjoy it? If the answer is no, you're not... fully solving the problem.” — Solomon, [19:55]
[20:59–22:55]
“You don't really need permission as a startup. You just go ahead and do it and build momentum. You know, there's less gatekeeping, I would say.” — Solomon, [22:43]
[22:55–25:13]
[25:13–26:51]
"It takes all these workflows that are usually semi automated with artisanal scripts... and it turns them into robust modular workflows that you can drive with code. And it all runs in containers."
"If you're a developer, increasingly, your job is not going to be to actually develop, but to manage and enable these coding agents."
“It has to be well isolated. Like you should be able to have a bunch of agents working in parallel and they don't mess each other's work up. It has to be portable... observable... you need a strong multiplayer element. Agents and humans [should] both be able to interact with that environment.”
"Dockerfile was something we designed as a stopgap prototype thinking, oh, we'll clean this up later in 2013... it's not agent native, it never will be."
“I think of it as LEGO because... no two dev environments are the same. It's like factory design. Every great product has its own factory that's unique.”
"Your CI/CD will soon be nothing more than runtime infrastructure for your workflows. And all those workflows will become agentic."
“Local execution... it's a good test. Whatever solution you're imagining, does it support local execution?... If the answer is no, you're not, you're, you're solving part of the problem, but you're not fully solving the problem of standardizing dev environments for coding agents.”
"You don't really need permission as a startup. You just go ahead and do it and build momentum. You know, there's less gatekeeping, I would say."
This episode is an essential listen for anyone navigating the intersection of AI agents, developer tooling, and infrastructure. Solomon Hykes provides a historical perspective on how developer environments have stagnated post-Docker, lays out a vision for how infrastructure must evolve to empower agentic workflows, and candidly assesses the state of today's tools ("frozen in time," "not agent native"). The conversation repeatedly returns to the need for a modular, open, portable “LEGO brick” standard for environments—one that will let both humans and AI agents collaborate productively, and that lowers friction for both cloud-native and local-first development. The discussion is frank, energetic, and prophetic, offering direct advice to both the AI builder community and the cloud vendors shaping the next wave of software.
For developers, platform engineers, and AI toolmakers, the core call-to-action is clear:
Rethink how you manage agent environments. Don’t accept today’s fragmentation and lock-in. Embrace open standards, prioritize local and portable execution, and help build the “LEGO” for the age of AI agents.