
A
Modern software development is more complex than ever. Teams work across different operating systems, chip architectures, and cloud environments, each with its own dependency quirks and version mismatches. Ensuring that code runs reproducibly across these environments has become a major challenge that's made even harder by growing concerns around software supply chain security. Nix is a powerful open source package manager that builds software in controlled, declarative environments where dependencies are explicitly defined and reproduced. Its functional approach has made it a gold standard for reproducible builds, but it can also be difficult to learn and adopt. Flox is a company that builds on top of Nix with increased supply chain security and abstractions that streamline the developer experience. Michael Stahnke is the VP of Engineering at Flox and formerly worked at companies including Caterpillar, Puppet, and CircleCI. He joins the podcast with Kevin Ball to talk about Flox building on top of Nix, how reproducibility underpins software security, the concept of secure by construction, how deterministic environments are reshaping both human and AI-driven development, and much more. Kevin Ball, or K. Ball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow K. Ball on Twitter or LinkedIn or visit his website, K Ball LLC.
B
Michael, welcome to the show.
C
All right, thanks for having me.
B
Yeah, I'm excited to get to dig in, so let's maybe start a little bit with you. Can you introduce yourself, your background, and how you got to Flox?
C
Sure, yeah. My name is Michael Stahnke. I'm currently the VP of Engineering at Flox and I've been involved in packaging and automation for most of my career. And so kind of what Flox does was a culmination of a lot of previous experiences. I had worked at big enterprise at Caterpillar, the construction company, running data centers and system administration, doing a lot of automation there. Eventually I left that company to go work with some friends who founded a company called Puppet.
B
Loved that back in the day.
C
Yeah, it was pretty early on there. I really enjoyed it. Let's not have this bespoke automation, let's have a framework for this and really leverage it at scale when you're working with thousands of servers per administrator, things like that. Really enjoyed that. Did a lot of packaging, built a lot of things, and ended up doing a lot of porting and packaging, I guess. And packaging has been my passion for software the entire time. I love putting bits into nice, orderly things that other people can consume. I founded a package repository called EPEL, which was the Extra Packages for Enterprise Linux on top of Red Hat Enterprise Linux. In 2005, it was me and six other people that basically got together and decided we were all doing this within our own companies. Why are we not sharing this? It's not differentiating work, it's not competitive work. So let's just make it open source and go do it. And that was super cool. So that was kind of what got me into Puppet. They wanted me to repackage a bunch of stuff and they're like, do you know how to build Debian packages? I'm like, no, but I'm pretty sure I can figure it out. And I did. And then, do you know how to build AIX packages? And I was like, actually, yes, because of Caterpillar, but things like that. And so I did a lot of packaging and CI system building to validate all that. You couldn't run a cloud CI system. They barely existed at the time. And if they did, they certainly didn't have AIX, HP-UX, Cisco switches, Juniper switches, things like that. So we had to build all of that ourselves. And eventually CircleCI was kind of watching what we were doing, and they called and said, hey, do you want to run platform engineering at CircleCI? And I kind of answered yes, eventually. And we had the right conversations and that was awesome. 
One of the things I really wanted to do when I went to CircleCI was have a SaaS experience where instead of waiting for somebody to adopt the latest version and upgrade and maybe they have two change windows a year where they make a change on their version of Puppet or whatever. It was just. We can just ship. It's awesome. And we were shipping hundreds and hundreds of times a week. And that was super cool. And then eventually it was okay. I've learned a lot at this company. Let's go someplace earlier and go kind of figure out, how do you build a business from the ground up? And that's what I'm doing with Flox. And I guess that's how I got here.
B
So let's talk a little bit about what Flox does to sort of set the scene, and then we can dive deep into some of these pieces.
C
Yeah. So Flox is kind of an SDLC product overall. And people say, what does that mean? And I'll explain that for a moment. We have a couple of foundational principles we really want to have, and that's reproducibility and a secure software supply chain. But that starts with developers. You don't bolt that on at the end and do a scan and say, hey, what's in this thing? That's what the SBOM is. Or that's what the software supply chain is all about. We want to secure it by construction. And so that starts with an awesome developer experience, because you want developers to work a certain way and have a consistent way of working. What we really optimize for is cross-platform reproducibility. And so with Flox you create this thing that we call a Flox environment, which is somewhat analogous to a Docker container, but not identical in a lot of ways. But I would say mentally, if that's kind of where you want to map it for a little bit, it'll work for a little bit. But you make an environment that you want to go develop in. And so maybe you put your tools in there, you put your language ecosystem tools, maybe you have some other stuff. And then that environment is reproducible. And so anything that we put in that environment, we lock it for Linux and Mac on x86 and ARM. And so if you're on an M1 Mac developing all day, but your colleague is on a Linux x86 laptop, you can use the exact same environment and it will materialize with the exact same version. And to me that's really powerful because it means that, you know, it's not, oh, when I brew install this, I get this version, and when I apt install it on this Ubuntu box I get this other version. And a lot of times it's fine, but in a lot of cases it's not.
B
I mean, you just described one of my pain points at work right now. I've got the guy who's got his home-built box with his custom distro and all these other things, and then here I'm working on a MacBook Pro because I also have to do exec stuff. Right, right.
C
That's exactly the kind of problem. It was one of the first problems that we really wanted to set out to solve: well, why don't we just have a cross-platform package manager? And so we leveraged this thing called Nix underneath. And if you're familiar with Nix, it's this giant open source project. It's one of the most active open source projects in the world, but it's also pretty complicated. And when I first saw Nix, I was like, oh wow, a bunch of Haskell people got together and decided packaging just wasn't complicated enough. And so eventually I kind of learned more about Nix and the power of a functional programming language to deliver packaging, but also deliver a whole bunch of other things. And as we learned about it, as the Nix community kind of adopted, it was, well, this is really cool. But there was a pattern that we kept seeing: at any company that tried to adopt Nix, there was kind of this single point of failure. It's like, well, the Nix person left the company, or the Nix person got promoted into a different department, and now we don't know what to do. And it was like, okay, well, so maybe this technology is a little too complicated or a little too academic for the average business to go adopt and use. And there are some that are very successful with Nix in the raw form, but not most. And so we decided that we were going to build some units of work that were more approachable for an enterprise. And that's what Flox really is. It's units of work on top of Nix that are reproducible, that can be shared for developers, that can have optimized CI, and then have a runtime where you have a complete, secure-by-construction bill of materials all the way through.
B
Okay, so I want to break down each of those pieces a little bit. I am not personally super familiar with Nix and I suspect a lot of the audience is not. So since you're building on top of that and making it more accessible, let's start with: what does Nix actually enable for folks in particular? You mentioned pieces of it, but let's spell it out.
C
Yeah, I guess we'll talk about it more from principles probably than actual implementation. Just because, one, I'm not super familiar with all the implementation. I can use it very powerfully, but I don't contribute to making significant changes within Nix itself. There's this thing called the Nix store that is the most important thing for the whole packaging system. It lives under /nix on a computer running Nix, and what you do is you put all the software there, and all the software goes in there with linkage against everything else. So it's a rolling release, in that every day there's new stuff merged, more often than every day. And that's one of the things that I would say businesses have trouble with, a rolling release, because that's like Gentoo or Tumbleweed or, you know, things like that. That's not a common thing for RHEL or Ubuntu. And so there's a rolling release going on. But you put all this software in /nix, and then you can think of that as like a package warehouse, and then you build a view on top of that when you want to use certain software and enable it, whether you put that into your PATH or you put that into other environment variables. And so imagine you have this giant database of software and it's like, okay, right now I need coreutils and I need Python, and I'm going to go grab those out of this data warehouse and those are what I'm going to put in PATH. Even though there might be five different versions of Python in that data warehouse, and there might be 17 different versions of coreutils or whatever, I'm going to select the ones I want and that's what's going to be enabled in my environment right now. And that's the way that Nix works. You can have that loaded into a shell, you can have it loaded into a developer shell, things like that, and when you exit that, it's gone. 
So you haven't polluted your environment, you haven't, you know, overridden the system version of Python or the system versions of coreutils or whatever. These are completely side by side. And I think that's glorious.
B
Giving you a system-wide form of a virtual environment, similar to what somebody in Python might be familiar with. Okay, interesting. Yeah.
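The "package warehouse plus view" idea described above can be illustrated with a toy model. This is a minimal sketch, not how Nix is implemented: the `STORE` table, its paths, and the `activate` helper are all hypothetical, made up for illustration. It shows several versions living side by side while an activated shell sees only the selected ones, and the system environment is left unchanged.

```python
import os

# Hypothetical, simplified model of a package store: every (name, version)
# pair lives in its own directory, so versions never overwrite each other.
# These paths are invented for illustration and are not real store paths.
STORE = {
    ("python", "3.11"): "/nix/store/aaaa-python-3.11/bin",
    ("python", "3.12"): "/nix/store/bbbb-python-3.12/bin",
    ("coreutils", "9.4"): "/nix/store/cccc-coreutils-9.4/bin",
    ("coreutils", "9.5"): "/nix/store/dddd-coreutils-9.5/bin",
}


def build_path(selection, base_path):
    """Return a PATH string with the selected store entries placed in front."""
    chosen = [STORE[(name, version)] for name, version in selection]
    return os.pathsep.join(chosen + [base_path])


def activate(selection, env):
    """Return a copy of env with PATH extended; the original env is untouched."""
    new_env = dict(env)
    new_env["PATH"] = build_path(selection, env.get("PATH", ""))
    return new_env


system_env = {"PATH": "/usr/bin"}
shell_env = activate([("python", "3.12"), ("coreutils", "9.4")], system_env)

print(shell_env["PATH"])   # selected store entries first, then the system PATH
print(system_env["PATH"])  # still just the system PATH
```

Because `activate` returns a new mapping instead of mutating the old one, "exiting" the shell in this model is simply discarding `shell_env`.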
C
And the other thing is, with the way that Nix and Flox work, you kind of have a superset of what that language package manager is. So in a package.json, if you're writing JavaScript, you can specify any library that's on npmjs.org, and it just works. Or you can, you know, point at a GitHub repository that's got a package.json. The problem is eventually some of those have native bindings, like do they build against ImageMagick, do they build against libxml or whatever. And when they do, the only way you find that out is you try to install it and it tries to build and then it fails. And then you have to go read the error message. And you're like, oh, this needs libxml, and you go install it.
B
Yep, yep.
C
Whereas with a tool like Flox, you can say, actually we want all of these npm libraries and libxml and a compiler and make. And that's part of the environment to start with. Therefore I know it works on your machine, because I've already included everything that you would need to have that build succeed. I think that's an advantage.
B
So, yeah, yeah, yeah. It's reminding me of what Vagrant was trying to do back in the day as well, of just having essentially the dev environment be part of what's shipped in the code repo. Of just like, this is all the pieces you need. Just in that case, it was do vagrant up. I don't know what the equivalent Nix command is, but everything will happen.
C
Yeah, yeah. In a lot of cases, you have a Nix definitions file, maybe like a default.nix or a flake.nix. And then you spawn a shell that has all those properties, basically a shell that has those libraries available, those tools available, things like that.
B
So conceptually, this sounds beautifully simple. And yet you said the core problem people run into is, oh, the Nix guy left. Which to me is like, okay, I've played that as well. Like, oh, we have one sysadmin who understands how everything is played out, whatever, and nobody else understands. So what makes the configuration complex? What are the things that you are then having to package over with Flox?
C
Yeah. Nix fundamentally starts with a build. The very first thing you define is, how do you build this project? And you're like, I don't have a project. What does that mean? Like, when you first start. Yeah, and you start with a build file, basically, and you have to say, well, here's my dependencies or here's what I need. And that's pretty foreign to the average person that is writing software in an enterprise. Usually the first thing they're doing is getting Python installed. And then I go and write some Python files, and then I do a Python run or run my Python unit tests or whatever. And then I think about delivering that in a payload of some kind, whether that's a container or whether that's, you know, we scp a big zip file out to something and run it or whatever. So starting with the build, to most people, feels like you're starting in the middle of the SDLC and not starting at the beginning. And so that was one of the very first things we had to do, was like, okay, we actually have to modify the way that Nix has these opinions so that when people want to start working, they start working, instead of being like, why do I have to start at step four to go back to step one? So that was one of the very first things we did, where you just do a flox init and it kind of sets up the scaffolding for all those things. And then you can do a flox search, a flox install, a flox upgrade, kind of more like Homebrew semantics versus these Nix things, which are all like nix-shell -p, and then you have like a flake command with this weird URI you have to attach to it. It's a very odd syntax, is probably what I would say. And even some of the experimental features that you have to enable under the hood to make things work. You can tell that a UX designer was not very involved in the creation.
B
It's reminding me of. Often I'm somebody who tends to straddle back end and front end, but often when you're interacting with one side or the other, you sort of realize that people have not done the work of translating their mental model to the mental model of the person who wants to connect with this thing.
C
Right.
B
And so, okay, what I'm hearing is Nix was built with a very, in some ways, low-level mental model and did not necessarily do the work of bridging that to your average developer who wants to just get started on their box.
C
Yes, I would say it was a very academic, focused project in how does Linux distro assembly happen? How do you have side by side versions of software? How do you functionally do a lot of things where you don't have side effects within the system? And that's a really cool principle, but most people aren't trying to do that daily at work. And so we were saying, well, how can we leverage these principles without really making you think about all of those powers? But be like, in the end, you know what you do get a complete bill of materials with every dependency that you've ever installed, because we have a complete bookkeeping of what happened there. And so there's benefits there if you can work through it. And so we're trying to give you the benefits with opinionated workflows rather than please assemble all these parts yourself and figure it out.
B
Absolutely. I'm excited to move into that. One more question about Nix before we do. So I think one of the big innovations that Homebrew which you mentioned had was it was, I believe, possibly the first package manager that was all in user space. Didn't require you to have root access in your box to put things in place.
C
Didn't.
B
Didn't give you nearly as many levers to F up your box if you managed to do this the wrong way. So is nix operating also in user space or you need to have system privileges or how does that all work?
C
You basically need to have system privileges to install it. The same with Homebrew. Because if you're writing into /usr/local or whatever, usually you have to be able to create that. But I guess it's now /opt/homebrew in the modern M1 or Apple Silicon world. But after that you are running in user space, and you're basically only operating in this Nix store area where everything is read-only by default. Because these things shouldn't change, they're immutable. Because it's functional, you don't want side effects. And so if you do have side effects, they need to happen elsewhere. They might happen in your environment, they might happen in your work directory where you are making a configuration change. But in the Nix store it's read-only. So your security vector, like your attack surface, it's changed. It changes the shape and size of it.
B
Yeah, well, and reduces the volume of foot guns that you end up with.
C
Right. And there are ways that you can override that with settings and all that. But by default I would say you actually end up in a quite secure multi-user package management system that is still run in user space.
B
Nice. Okay, so going into Flox a little bit, you mentioned some of the pieces here around having an opinionated kind of init and different standard workflows, but what is the mental model that you are creating? Kind of that molding from the low-level Nix model to what a developer actually expects.
C
Basically we want you to create an environment that you work within. And so if you have a project, say I'm going to go build my next n-tier app. It's got a database, it's got a caching layer like Redis, it's got Next.js on the front end or whatever. I can do flox init, I can go search for, okay, what Postgres versions are available. I'm going to install Postgres, you know, 16 or whatever. I'm going to go grab Redis. I'm going to grab a Redis library for Node. I'm going to install Node, or I can install Bun or, you know, whatever your favorite JS runtime is these days. There's too many of them. And then I'm going to go just run my Next.js thing. And so it might be that I do flox search for the different packages, then flox install. Once those are installed, on the back end there's a lock file that's written with every dependency that comes in. Like when you grab Node, what does it depend on? What does it link against, all the way down to libc, which is fascinating because it's a deep recursive tree. But it also means that there is literally nothing in that closure of software that is not needed. And so that's how you can have the thinnest footprint in terms of what you actually require, and also the lowest attack surface. And so that's one of the things that we're super interested in from just a security and compliance point of view. So you build that environment, you install the software you want, and then you might run a Next.js init, or however you start a Next.js project. I don't even remember off the top of my head. I don't write a lot of Next. And you can run it with your normal npm tools if you want to, and just do it on top of that, because a lot of those have really good locking on top of it. And then you could say, okay, we've done that, now I've built my application, I want to deploy it. Well, from there you have a couple different options. 
And one of the reasons we built these options is because not everybody's going to adopt a whole new SDLC overnight. And so you have to kind of meet people where they are, like, where do you want the advantages of this? And where are you saying you're comfortable with how you're currently operating? So it might be you have a great operating environment or a great developer environment. People on Mac, on Linux, they have compatible software, exact same versions. Beautiful. And then you go to CI and maybe you press a container from that, you just export it and you say, hey, I want to take what you have and put it in a container. And that's how I'm going to go run the rest of my workload. Okay, great, go do that. Or you could say, actually, I want to package the thing that I just built as its own package, because I want to run these packages out on my system. Maybe you want to run bare metal because you want access to an AI model or something like that that's giant. You don't want to ship around a container every day. So you say, I want to package this up. And we have two packaging methods. We have one where, if you know Nix, you can write a Nix definition file and get it completely pure with everything optimized to the nth degree, complete reproducibility, all of that. A lot of people think that's really difficult. And so we built an entire thing within Flox where, if you can tell us what commands you run on the command line to build your project, whether that's npm build or whatever, you type those things and you copy the output into a specific environment variable, and we will make a package for that. We just end up building a package for you on the back end. You publish that up to our FloxHub system, which is like the central clearinghouse of all these different environments and packages. And it's got RBAC on it and all those kinds of things, but you could have that deployed and then you can run that environment. 
And coming up at KubeCon, we'll show you that you can run these environments natively through Kubernetes without even needing a container. And one of the really cool things there is I don't have to ship these layers back and forth. I don't have to ship a bunch of content. The only thing that happens is the content I need ends up where it needs to run, because that's where we materialize the environment. What we ship around are environment definitions, not the payload. And so we're shipping small amounts of metadata, bytes at a time, versus these layers going back and forth on a Docker image. And it's not that Docker's done a bad thing, but in your normal workflow you have a 10 gig image. The very first thing you do after you build it is you ship it up to a registry just to pull it back down where you want to run it. So now I've done 20 gigs of round trip.
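The lock-file point in this answer, that every dependency is recorded "all the way down to libc" and that nothing outside the closure is included, amounts to computing a transitive closure over a dependency graph. Here is a minimal sketch under stated assumptions: the `DEPENDS_ON` graph is invented for illustration and does not reflect what Node or ImageMagick really link against, and `closure` is a hypothetical helper, not a Flox or Nix API.

```python
# Hypothetical dependency graph, invented for illustration only.
# Real package graphs are far larger and differ in their details.
DEPENDS_ON = {
    "node": ["openssl", "zlib", "libc"],
    "openssl": ["libc"],
    "zlib": ["libc"],
    "libc": [],
    "imagemagick": ["libpng", "libc"],
    "libpng": ["zlib", "libc"],
}


def closure(roots):
    """Return every package reachable from roots, including the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue  # already recorded; avoids revisiting shared dependencies
        seen.add(pkg)
        stack.extend(DEPENDS_ON[pkg])
    return seen


node_closure = closure(["node"])
print(sorted(node_closure))
```

In this toy graph, asking only for `node` never pulls in `imagemagick` or `libpng`, which is the "nothing in there that's not needed" property in miniature.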
B
Well, and this is where that functional mindset really gains you a lot, because you know that it's reproducible, you know that it's going to be idempotent, kind of do it. So a couple questions I want to dive into on a couple of those pieces. So first off, starting in just the dev environment before we get to deployment, you talked about, okay, I have a lock file, it goes all the way down to my versions of libc. How does that interact with the multi platform aspect of it of like I'm on a Mac versus this person's on Linux? Like the build chains are very different.
C
In some cases they absolutely are. And so we do calculations in our catalog, which is where you go get the packages. We have built basically an inference engine on top of Nix packages and that catalog that does a lot of the calculations for that. And I would say we know which builds are guaranteed to build, which ones are cached on nixos.org, or cache.nixos.org. We have all of that inventory. Then we do a resolve request where it's like, okay, you're looking for this version of Go and this version of Node. Okay, can we resolve that to have them be at the same time? And if we can't, because they're not contemporaneous with each other, like say the version of Go is from two years ago and the version of Node is from today. Well, on a rolling release you have to pick a point where you're going to land. Okay, so how do you solve that? Well, we solve that by asking, do you want to break that into two different pointers? Because we can do that.
B
That's interesting.
C
Okay, and so we can break that into, okay, I'm going to have one closure that is that version of Go from two years ago and I'm going to have a second closure that is this modern version of Node today. And then there's a third closure that encompasses both of those together and that's how they move in a group. And so we have package groups and things like that. But the way that we do that cross-platform is, as we calculate that resolution of, okay, you want this older version of Go, we go and look and say, is this available on Mac? Like if you're on Linux x86, is this available on Mac ARM? Is this available on Mac x86? Is this available on Linux ARM and Linux x86? And we do calculations based on how many of the constraints we can satisfy. And at some point you may drop below a threshold and we may say constraints are too tight, we're unable to resolve this. And we'll tell you that. Or what you can say is, actually I don't care about Mac x86 anymore because everybody in my shop is on Apple Silicon. Okay, take that one out. And maybe that relaxes the constraints enough that now we have a resolve that can happen appropriately.
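The cross-platform check described here, and the way dropping a target system can relax the constraints, can be sketched as a small availability lookup. This is only an illustration under assumptions: the `AVAILABLE` table and its version numbers are invented, and `missing_systems` is a hypothetical helper, not the actual Flox resolver, which by this account weighs many more constraints.

```python
# Hypothetical availability table: which versions of a package exist for
# each target system. The data is invented for illustration.
AVAILABLE = {
    "go": {
        "x86_64-linux": {"1.19", "1.21", "1.22"},
        "aarch64-linux": {"1.19", "1.21", "1.22"},
        "aarch64-darwin": {"1.19", "1.21", "1.22"},
        "x86_64-darwin": {"1.21", "1.22"},
    },
}

ALL_SYSTEMS = ["x86_64-linux", "aarch64-linux", "aarch64-darwin", "x86_64-darwin"]


def missing_systems(package, version, systems):
    """Return the systems on which the requested version is not available."""
    per_system = AVAILABLE[package]
    return [s for s in systems if version not in per_system.get(s, set())]


# Strict request: the older Go must exist on all four systems.
strict = missing_systems("go", "1.19", ALL_SYSTEMS)

# Relaxed request: nobody in the shop uses Intel Macs, so drop that system.
apple_silicon_shop = [s for s in ALL_SYSTEMS if s != "x86_64-darwin"]
relaxed = missing_systems("go", "1.19", apple_silicon_shop)

print(strict)   # systems that block the strict request
print(relaxed)  # empty list means the request can be satisfied
```

An empty result means the version can be locked for every requested system; a non-empty one identifies which constraint to relax.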
B
Got it.
C
But there is a lot of logic in that, I will say. That's where the bread and butter of a lot of stuff is going on.
B
So I kind of want to think about it, right? So like as you start to dive down that and you can say, okay, we can resolve at the same level of Go. And Go is depending on, for example, I don't actually know what it depends on underneath, but it's like a set of C related libraries.
C
And yeah, it's very few, very minimal.
B
Yeah, Go is a great one then because it's going to be easy. And actually I love using Go because it is so easy to package and ship anywhere. Yep. What's an example that actually would have a lot of different dependencies that we should do?
C
I mean, Ruby or Python are both pretty entangled.
B
So let's use Python, because probably slightly more people are familiar with the challenges there. So I have this version of Python. I know that it exists on both my Apple Silicon Mac and the Intel x86 box that I'm shipping to or whatever it is, and it depends on n different system libraries. And it's version this on this side for that version of Python, but it's version something else on the Intel side. I guess the question is, do you ship a lock file per environment? When you say you lock all the way down to the lowest level of system dependencies, how is that closed over when you've got a bunch of different environments?
C
Well, we're also providing those dependencies. So we're not necessarily dependent on, are you running Tacoma or are you running some other version of macOS? We're dependent on the fact that we included the libc that you needed all the way down, and so you're getting that from the Nix catalog.
B
Because you control the whole thing. It's not, I'm looking to see what you've got. Got it, got it.
C
In nearly every case. There are some cases on Mac specifically where you want to link into the Mac frameworks, like MFC or whatever they call the Apple Darwin frameworks. I don't know what they actually are named anymore because they have changed names like four times. But if you're going to write something that involves Xcode, for example, we're not going to ship you Xcode because we can't. That's not a license thing that we're allowed to do. And so we're going to link out to a system Xcode if we can find it. But those are, I would say, a small minority of cases. And there are some libraries that are just never available on Mac or never available on Linux, you know, there are some utilities. Again, most of the time we can figure that out and we'll actually only lock it on the systems that it's valid for. But sometimes there's an odd edge case or something. Most of the time we're giving you the thing you want to type on the command line and all of the libraries that are linked to it underneath and all of the things that are important for that to be able to run properly. And that means you have the exact same version of those libraries on all your platforms.
B
Okay, awesome. So now looking a little bit more at the deployment side of things, one of the things that is potentially interesting here is looking at, let's use the postgres example that you've got, right? So you're saying locally, I say, I want this version of postgres Nix. You have your packaged version, you've locked it, you send it to me, whatever. And then I say, okay, I want to deploy, but I actually want to use AWS's Cloud Postgres or whoever it is. Like, what is the like, decomposition for deployment look like?
C
So you have a few different options there, and every time I say there's a lot of options, I'm like, do we need to be more opinionated on something? We might. But what you would do is maybe take the packages of the things you want and make a runtime environment that is smaller than your development environment. So by default, every environment in Flox has two modes, developer and runtime, where it's basically, do you enable the compilers, do you enable the libraries and all that stuff, or do you not? For instance, there's a package called Almonds that is written in Python. If you run Almonds in development mode, you will also have the Python that runs Almonds available in PATH. If you run it in runtime mode, we're not going to put Python in PATH, because you probably don't want to grab that Python if you're just going to type python on the command line. So that's the difference of runtime mode versus developer mode. But you might make a separate environment that only has the web content and the caching layer, but doesn't have that Postgres, because you know that you're just going to have a connection string that goes out to RDS. And so that's something you could do: I'm just going to have a smaller environment that uses the exact same packages, so I still know it's reproducible one for one because I have the exact same packages in these environments. I've just made a smaller environment that I'm going to deploy at runtime. That's an option. Another option would just be to change the connection string, and you still have Postgres sitting there and you're just not using it.
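The "smaller runtime environment that uses the exact same packages" option can be sketched as taking a subset of a locked development environment. This is a toy illustration only: the `DEV_LOCK` contents and version numbers are invented, and `subset_environment` is a hypothetical helper, not a Flox command or file format.

```python
# Hypothetical locked development environment: package name -> pinned version.
# Names and versions are invented for illustration.
DEV_LOCK = {
    "nodejs": "22.11.0",
    "redis": "7.4.1",
    "postgresql": "16.4",
}


def subset_environment(lock, keep):
    """Build a smaller environment that reuses the pins from an existing lock."""
    missing = [name for name in keep if name not in lock]
    if missing:
        raise KeyError(f"not in lock: {missing}")
    return {name: lock[name] for name in keep}


# Runtime talks to a managed database, so Postgres is left out.
RUNTIME_LOCK = subset_environment(DEV_LOCK, ["nodejs", "redis"])

print(RUNTIME_LOCK)
```

Because the runtime environment copies its pins from the development lock instead of resolving them again, the packages that remain are identical in both environments.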
B
All right, so I think I'm understanding the picture now. Let's move a little bit into modern-day software development. There's a lot of changes going on, and I noticed that phrases like agentic coding or agentic development environments showed up in Flox's messaging. And we as engineers are all trying to figure out how we best use these tools. So how do you think about what makes for a good agentic development environment, and what does that actually connect to with regards to Flox?
C
I guess there's the marketing-forward answer and maybe the cynical engineering answer, and I kind of have a little bit of belief in both. Ultimately, agents are built to model humans, and so if humans do better with consistency, I would think that agents would as well. So from a lot of those perspectives, it's: how much of my entire workload can I keep at a deterministic level? Because I'm now running these probabilistic workloads on top of things. And that is nuts to anybody that came from computer science. All we've been seeking is determinism for years and years and years, and now we're like, what if determinism isn't the answer? My brain exploded and I am now trying to reconcile that. So, coming from a background where determinism and idempotency are fundamental principles of what we've built over the years, I'm like, okay, can I keep 80% of my stack deterministic? Because how many variables do I need to have in play at once is really the question. And so if I have a consistent environment that I'm working in, both in development and at runtime, hopefully those agents are more successful, because either they're not having to account for variance as much in different parts of the environment, or they're spending all their context window on the stuff that I'm asking them to do versus the "Oh, this didn't work. Let me run this unit test again. Oh, let me go grab this. Oh, you didn't install ImageMagick when you were trying to compile this Node thing?" Like, no, let's have all that ready and burn my tokens on the things that I'm actually trying to accomplish. I think that's more how I feel about agentic coding right now. But ask me again in four months and I bet I'll have a different answer.
B
So, yeah, I think the determinism and nondeterminism is fascinating, and maybe we can dig in a little bit more there, because I think there are a few different angles. I think about it like, why have agentic workloads taken off in coding and not other places? I think it's because humans are also nondeterministic. But we've been trying to drive nondeterministic human coders towards a deterministic output as long as the industry's been around. So we've got all these tools for how you harness our random chaos into something useful.
C
Yeah, well, I think most of computer programming has been mapping a human mind onto a machine and training your brain to think like the machine. And you get to these agents, and these agents have taken a step more toward thinking a little bit more like a person, or at least they behave a little bit more like a thinking person. And so now you may not have to think the exact same way you did the entire time when you were working with these traditional, classic computer problems. But we're also now seeing software being developed more with agents in mind: how do you make this agentic, and all these adjectives. And in some cases it's like, well, actually this is more human friendly as well. So you get this weird balance of, are we meeting in the middle in some cases? But I still want to have the fewest number of variables in play, always. That's better for the machine, that's better for the human. So however I can do that, totally.
B
Well, and I think a thing that I've definitely seen is that the better we follow software engineering best practices, the better the agents do with it also. And I can definitely see that. So thinking about putting Flox in play here: if someone is really embracing the agentic workflow, are they running Flox? Is that setting up the environment that the agent's acting in? Are you actually exposing Flox itself and the core CLI pieces to the agents, and they're setting up the environment? How are you actually using, or seeing people use, Flox in these environments?
C
There are two patterns that have emerged right now, and I'm trying to figure out if I have a preference for one of them. I don't know that I do yet. One is you create the environment that you want. If you know you're going to build an application with Node and Redis and Postgres or whatever, you get all that stuff installed, and then you launch Cursor or whatever within that environment, like maybe `cursor .`, and you open up that environment as it's been activated, and now all those tools are available to the agent. You just say, hey, I want to go build this. And it finds python on the PATH or it finds node on the PATH and it just goes and uses it. It doesn't even think about it after that. It doesn't know that you're in this great reproducible environment with excellent software supply chain tracking. It's just like, I don't know, I typed `which python`, it showed up, I'm good. And that's totally fine. There's another way where you start with nothing, and you launch your IDE or your agent or whatever, and you flip in our MCP server, and you just say, go build a Go project. And it's like, okay, I'm going to go search the Flox catalog for versions of Go available. Which one's current? Cool. I'm going to go get libraries. And we have hints and Cursor rules files or CLAUDE.md files or whatever it is that you need, so that they know how to work with these Flox environments and use Flox commands for everything, and they're not falling back into system programming, like, "Oh, let me go brew install this thing." It's like, no, no, we've given you rules that say we want you to do it all with Flox. And the issue with that is sometimes context windows get resized. It works, I would say, 95% of the time. But there are times where you're like, why did it go do that? And that's the problem with these things.
B
I feel like that is the story of everything with these agents. Like, it works most of the time, but sometimes you're like, what?
C
Well, my favorite thing is I can ask it why. That's actually one of the cooler things you can do with an AI agent. "Why did you go do that?" And it'll be like, "Oh, because of this and this." You're like, "Huh, I care more about this set of trade-offs." It's like, "Oh, well, if that's what you care about, we would go do it this other way." I'm like, "Okay, let's go down that path." And then: "You are absolutely correct."
B
So yeah, somebody at some point took "You're absolutely right" and called that a YAR. And so now every time, I imagine Claude in this pirate voice: "Yar."
C
Yes.
B
So one of the things that we touched on pre-show that I want to bring in here was thinking about engineering efficiency. I think one of the big things with the world that you've been in, all of this package management and stuff like that, is that done right, it helps so much with efficiency. I've had to dive into this much more as my career progressed. At the first job I had, there was a build engineer. He just made everything work. I didn't have to worry about it at all, and it was incredible. I just worked on my thing and he had all the build engineering, all of that. And over time you become that person, or you become the one who has to manage the infra or do all those different things. But I think doing package management well really supports efficiency. Is that changing now, as we move into this agentic future? Are there other things that we need to be thinking about?
C
I mean, I'm sure there are other things you need to be thinking about. I'm not going to speak in absolutes generally, but overall, if you're really good at something, I feel like the AI stuff is amplifying it. If you're really bad, the AI stuff is amplifying it. So it's kind of, what are your practices already? Because you're getting more of them. And from a packaging perspective, if you go back to the State of DevOps reports from 2016, 2017, 2018 (these were things I was semi-involved in; 2018 and on, I was very involved), one of the things that leads to success the most is reducing the number of variables. So we're back to this consistency play, we're back to this repeatability, reproducibility play. More successful teams have fewer ways of doing something. They don't run on 17 different OSes, they run on one. They don't run on five different programming languages, they have two; maybe they have a statically typed and a dynamically typed language that they standardize on, or things like that. And so the fewer variables you have in play, the more likely you are to be successful. That actually correlates pretty highly with who's in the elite tier. And I think that's the same thing that would happen with AI. If you can teach your agents that you have one way of doing something, now it goes and does that. Like, we have one way that we connect over to Postgres so that we're guaranteed to have failover when RDS has a hiccup or whatever. Okay, great. You have this one way that you always use, because you've written your own client library, or you imported one and overrode it, or whatever you do, and it never has to think about that again. And so now it's on to the next set of problems. And I think that's the exact same way a human would be. If you have a really good set of libraries, they're going to be like, well, I'm just going to include the one that the platform team gave me, because I don't have to think about it anymore, and they move on to the next part of the business logic. So I think what you're seeing with a lot of the engineering performance is that if you already had good practices, AI takes advantage of it. It's basically everything you're already doing: what if it just happens way faster? If you're kind of fumbling and you're not sure what you're doing, that just happens faster. Whereas if you know what outcomes you're looking for and how to measure them, that also can happen faster. I think that's my quick summary on that. Does that work?
B
Yeah, no, I think that makes a lot of sense. And it's very similar to a thing that I've noticed. So, yeah, it just amplifies whatever's going on.
C
Right. I mean, I will say that a lot of the agentic development stuff kind of makes you think much more about specification: what are you really trying to build? And then validation: did I actually get the thing that I was asking for? And I think validation has always been the hardest problem in software engineering, in my opinion. QA has been criminally underpaid for the entirety of its existence. And it's the hardest problem, knowing whether this thing is actually correct. Like, sure, it compiles, I can run it, but is it solving the user need? That's a really hard thing to know. And then specifying: I'm trying to build this, but what is "this," and how well defined is it? Is it okay if I use this connection string versus this type? Is it okay if I have a model that pulls in 17 different dependencies, or only one, or whatever's going on? So what I've found is that people that have a little bit more product management or product engineering background are a lot more successful with those things than the people that don't at this point. And so that's been a fascinating turn as well.
B
Has it changed how you at Flox are operating? A little.
C
We're starting to evolve on that. I think the tools are moving faster than the humans are at this point, which is a fun thing to consume. But we are definitely trying to use agents to speed up certain types of development work. And there is definitely skepticism. I have found that your skepticism correlates directly with your seniority. The more senior you are, the more skeptical you generally are of AI, and the more junior you are, the less you are. Which is fascinating to me, because more senior people generally write less code. They're actually spending more time in architecture and all that. But they're the ones that are more skeptical of it. It's an interesting set of trade-offs that I've observed. I will say that I've written more code in the last year than I probably have in the last five combined because of being able to use agents. I don't have too much problem specifying, and I spend more time on my workflow for how I get specifications in a way that I think is consumable and iterable, versus just the single super prompt. Like, come on, that's not going to work. At least not currently. Context windows are too small for that. But the way that we operate, I would say we have several areas where agents are doing a lot of development. We have several where they're doing some of the fast part of the code review, maybe the initial parts of the code review. We have some that are looking at tests and seeing whether they're flaky or not, things like that. But not every developer is spending all day just writing prompts or anything like that. We're definitely not there, but we're trying to keep our eye out for when that is happening: is it actually giving us an advantage in any spot, or is it actually costing us twice as much now because it writes bad code that we have to go fix? And in some cases it does. We're still learning our way through it. I guess that's probably the simple way to say it.
B
Well, and it's interesting because like your primary audience that you're selling to is developers who are also going through all of these exact challenges as they're going through. So has this changed at all the way that you're thinking about Flox's like product roadmap and what you're trying to do to serve people?
C
Certainly. That's a clear answer of yes, in that one of our major partnership announcements a month and a half ago was with Nvidia, and it was so that we can redistribute CUDA within Flox and so that you can have a native library of this, because generally they didn't used to allow that redistribution. It was kind of a non-free piece of software and things like that. And so we have a distribution right, so that when you want to have a fully functional PyTorch environment to go do your model training or whatever, you can have CUDA in there. That was one of the first things we jumped on. We were like, oh my gosh, we can go partner with Nvidia, we can go get CUDA available, we can make these workbench environments for CUDA or TensorFlow or whatever your favorite ML tools are. That was a really exciting partnership, and I don't think we would have done that had agentic coding not been as important as it is. And then you start looking at, well, what about MCP servers? The idea of an MCP server was made public in November. By March, I had people asking me every day, why do you not have an MCP server? I've never seen something move that fast. And so we have an MCP server. We didn't have it in March, I'll say that. But we do have an MCP server, and we're still adding things to it all the time, because the specification gets updated, or you go use a different MCP and think, well, I like the way that worked, and then you go borrow from that and implement it. And we're actually using a lot of the agents to help us write the MCP server, because you know what knows a lot about how AI works? AI.
B
Though it's shockingly bad at writing good Python, which you wouldn't expect. Like, they use Python for these things. You'd expect them to be good at it.
C
Well, the thing is, with the way it trains, it takes the entire compendium of knowledge out there, and you have to ask yourself, well, how much Python's out there? A lot. How much of it is good? But it trained on all of it, because it can't tell.
B
The density of quality is a little lower in Python.
C
Yeah, that's actually one of the reasons I like working with Go with agents more than anything: idiomatic Go doesn't look that different whether it's generated or whether I wrote it. The variable names might not be the same, but outside of that, the structures and the flow through the code are usually quite similar, and I find that to be a little bit beautiful.
B
I have also, and I've actually talked to multiple people on this show who have found that Go is possibly the best language with regards to agentic generation. There are multiple factors to it and we don't need to dive into those. But yeah, it's great to hear you say that as well. It feels like I'm getting lots of nudges for my to be high there.
C
Yeah, yeah. And what's been fascinating, though, is watching it improve over time. Flox is written in Rust. Again, we're pretty security minded. We want to make sure we don't have buffer overflows and all that. When I first started working with AI and Rust, it was terrible. It was just awful. And now it is pretty decent. It's pretty good a lot of times, and sometimes it's quite good. Even with our core Rust team: I turned in a patch that was written by Cursor and they were like, "Oh, we had a helper method for this and a helper method for this. Can you flip that?" But my code review was actually pretty simple, and there were a few things where they were more educating me about how the code worked, rather than saying that the AI did it wrong. It was more, let's just have a back and forth about the behavior here, and the stuff got merged. If I'd tried that six months ago or a year ago, I don't think that would have happened at all. So these models are moving quickly and improving. Even if you don't want to write Go all day, you're still probably going to be okay.
B
Yeah. So you talked some about how it's already influencing you, and the push to MCP has been fascinating. Actually, that might be worth diving into a little bit. I feel like writing a good MCP server is actually not trivial, because you don't want to just dump everything into the context and blow out your context windows. So what have you all learned through the process of this race to expose everything in MCP?
C
I think we started with, okay, how do you find software? If you're trying to build an environment, can we give you interfaces into our catalog to be like, okay, I'm looking for Python. What versions of Python can I browse from this catalog? And we might have something back to 2.7, all the way up to whatever came out last week. So there's maybe several dozen versions, and it can go select and say, okay, I want this version, which is usually fairly recent but might not be the latest, because it doesn't know the latest one even exists. But our MCP server is going to say, okay, to find software, here's what you need to know. It'll have a built-in prompt for finding software: you run these commands, you look at it this way, here's the output parsing, throw the JSON flag at it so you can parse it easier, things like that. And then it might be, well, for running software, here's the instruction set. And so you don't have to load the finding-software and the running-software instruction sets at the same time. I would rather you drop one of those out of the context window so that you're minimizing the overall fill. Overall we've added tools, we've added several things. But I would say our primary engineer on this is just playing with other people's MCP servers and asking, what do I like about this one? The Supabase MCP server, for example, people seem to really like. So it's like, well, let's go play with that one and see why they like it. What's good about it? Or the Postgres one, a lot of people seem to like. And then you go to others and it's like, well, this one has an MCP server, but I don't think it's really doing much for me. And so we play with that for a couple minutes and we throw it away.
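The idea of not loading the finding-software and running-software instructions at the same time can be sketched in a few lines of Python. This is not the Flox MCP server and does not use any MCP library; the instruction text, the `INSTRUCTION_SETS` table, and both helper functions are hypothetical, and character count stands in as a crude proxy for context usage.

```python
# Hypothetical instruction sets, keyed by the task the agent is doing.
INSTRUCTION_SETS = {
    "find_software": (
        "Search the catalog, list the available versions, and request "
        "JSON output so the result is easier to parse."
    ),
    "run_software": (
        "Activate the environment, then start the services it defines."
    ),
}

def instructions_for(task: str) -> str:
    """Return only the instruction set for the requested task."""
    try:
        return INSTRUCTION_SETS[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None

def context_cost(tasks: list[str]) -> int:
    """Crude proxy for context usage: characters of loaded instructions."""
    return sum(len(instructions_for(task)) for task in tasks)
```

Serving one set per task keeps `context_cost(["find_software"])` below the cost of loading both sets up front, which is the trade-off being described.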
One of the other things that we're trying to do with Flox and MCP is, okay, if you want to run an MCP server (some people are writing them and some people are running them), how do you restrict it? Because security has a new model with this. Before, you had transport layer and storage layer encryption, security, SSL, that kind of stuff. That's classic. We know that. Then you have identity: security of API keys, authentication, non-human identities, which we still haven't gotten right in any way. But we're going to move past it. And then you have morals and ethics, where it's like, okay, are you allowed to use this tool? Are you allowed to use blackmail? How are you allowed to get the things you want? We have no idea how to handle that security. And at Flox, we don't yet either. I'll just be honest. I have no idea how to tell an agent "don't do blackmail" other than to say "don't do blackmail." But if it decides it wants to do it anyway, it might just do it. I don't know. So one thing we can do is say, okay, this MCP server, in the default build, has a network sniffer built in. Well, we could ship you a build without this network sniffer, and therefore you have one less security vector that you have to go worry about, or something like that. So you can say, here are the tools that are available to the MCP server. We're working with somebody that is writing other MCP servers and distributing them, and they really like that they can have different options of, I want to run this with fewer tools underneath. And in fact it won't even be on PATH, so you can't just go grab it. That's what they like. You've taken the toy away from the baby so they can't hurt themselves with it.
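The "it won't even be on PATH" point can be illustrated with a small Python sketch. It assumes a POSIX-like system and uses only the standard library; the tool names `search-catalog` and `network-sniffer` and both helpers are hypothetical stand-ins, and this says nothing about how Flox itself builds or restricts environments.

```python
import os
import shutil
import stat
import tempfile

def make_tool_dir(tool_names):
    """Create a temp directory holding stub executables for the allowed tools."""
    tool_dir = tempfile.mkdtemp(prefix="allowed-tools-")
    for name in tool_names:
        path = os.path.join(tool_dir, name)
        with open(path, "w") as f:
            f.write("#!/bin/sh\nexit 0\n")
        # Mark the stub as executable so PATH lookup will accept it.
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return tool_dir

def visible_tools(candidates, search_path):
    """Return the candidates that resolve on the given PATH string."""
    return [name for name in candidates if shutil.which(name, path=search_path)]

# Only the allowed tool is placed on the restricted PATH.
allowed_dir = make_tool_dir(["search-catalog"])
found = visible_tools(["search-catalog", "network-sniffer"], allowed_dir)
```

A process launched with `PATH` set to `allowed_dir` would resolve only `search-catalog`. As noted later in the conversation, this kind of restriction is an approximation of security rather than a sandbox on its own.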
B
I mean, I think that speaks to one of the really interesting things about what you guys do, in terms of being able to really constrict what any particular thing has access to. I've talked with folks who are saying pretty soon you're only ever going to want to run Cursor or any of these other things inside a Docker container, because you don't want it to be able to go and access everything. Especially if you're using an MCP server, you don't know, could that be hijacked, all these different potential things. So is that a use case you're seeing people start to use Nix for, in terms of how I lock down the environment that this particular thing is going to be running in?
C
You certainly can. Of course, it depends. With Nix, there's a thing called a pure activation, which basically dumps the environment before you get into this one. So if you don't include coreutils, you don't get ls in this, and that kind of thing. And so you have to be really explicit. Do you even have a shell? Things like that. It will grab a shell if you don't have one. But it's stuff at that level. And so from there, there is no editor, you can't go call vi. Or there is no curl, you can't go out to the network, or whatever. And you can even activate those environments in a way that says the network's turned off. We have sandboxing modes, things like that. So there are ways to do it. But some of them are great approximations for security. They're there, but they're not totally bulletproof, and there are others that are excellent. So it just depends on exactly what you're doing. But then also, you can start to ask, well, can I take advantage of cgroups within the Linux kernel, the way that Docker does? Can I do that with a Flox environment? And the answer is yes, with Kubernetes. We sit on top of containerd and we have a shim that we put in there. So you can totally do that. We give you basically the same isolation that a container would, because we're running via containerd without a container. Which is kind of fun.
B
Yeah.
C
And so we have a lot of those same safety affordances that people have come to trust and learn with Kubernetes. Those are all APIs that people are familiar with and used to, and extension points and all that. So we were like, well, I don't need to reinvent all of that. If there's an ecosystem that people have agreed upon, why don't we just extend the ecosystem a little bit? Put our tool sets in there, and if people like it, great. Or if they want to do hybrid things where they have a pod that has five containers and one Flox environment, cool, that's fine. Go do it.
B
That's fascinating. So with the Kubernetes piece: at first I was thinking about this as, okay, this is useful in terms of managing a single environment, but now we're starting to talk about things like orchestration and how you navigate deploying complex sets of interacting services and things like that. Is that a space that Flox is targeting?
C
Kind of. The Flox environment can do a lot, and we're not constrained by some of the same dogma that I would say containers in the CNCF have built around Kubernetes. Everybody says, if you're running more than one process in a container, you're doing it wrong. That's not a thing that we have. In a Flox environment, you want to run multiple processes? Rock on, man. Go do it. Sometimes applications need more than one thing running to be successful. And that's basically why a pod was built, because they needed a unit to talk about all these different running processes. Well, we can just give you a Flox environment that can run multiple processes if that's what you need to do. Sometimes you just want your Redis and your Postgres to both start, instead of having them be separate containers that you have to figure out how to manage. I'm not going to say one's totally right or wrong, but we almost got religious about this, and I just don't think there's a lot of value there. So we're challenging some of the orthodoxy of running things within Kubernetes a little bit. But it's pretty cool. We've worked with some prominent people in the CNCF and they're like, this is pretty cool. So no one's saying this is the worst idea ever. They're like, we haven't looked at it from this point of view for a while. But when you get to orchestration, then it's like, well, do you want one copy of this environment? Do you want it in a DaemonSet that's running across all of your nodes and workers? Do you want this thing to fail over? Do you want a minimum number of pods that are deploying this environment? Right now we're giving orchestration primarily to Kubernetes, because that's what most people are running. I don't need to necessarily reinvent something. I'm not going to succeed if I do. See Mesos, see Nomad, a few others that are successful in pockets but certainly don't have the market share that Kubernetes does. And so for us, it was a little more that if we're going to get into the runtime, let's at least play in that space. Now, you can also just run our environments on metal. They have a service manager, and you can just run them without an orchestration suite at all if you want to. In a lot of cases that is very simple, which means you can use strace for debugging and you don't have to go launch a debug pod or anything like that. And that's kind of beautiful in a lot of cases. So one of the things that I really like about it is we can get as simple as you need or as complex as you need. You want to get into Kubernetes, need a service mesh and all that? Rock on. Or if you say, I don't need all that complexity right now, I just need to run a web server and a database, we can do that really easily within the environment.
B
No, I think that's really important, being able to scale up and down in terms of complexity and configuration. That's a thing we didn't talk about as much, but I'd like to dive into it just for a minute here. As we talked about in the beginning, Nix exposes a ton of knobs and has a ton of power, but it's a little bit sharp around the edges and hard to work with. With Flox, you're doing these opinionated stacks on top of that, which allow people to do what they want to do very quickly. Now if you start a project in Flox and you get to a point where you say, hey, I actually need to turn some knobs, I need to go down there, I'm willing to do the learning curve, how easy is it to incrementally extend and take advantage of all those underlying pieces in Nix? Do they play nicely with your stacks? Are you able to swap things out? Or is it all or nothing?
C
It's definitely not all or nothing. We have a few of what we usually call exit ramps, or extension points, where you can go back into pure Nix if you want to. For instance, you might want to define a build definition with pure Nix because you really understand the primitives for the build system. We don't expose most of those to you. There are hundreds, and they're awesome in a lot of cases. If you really know what you're doing, I would recommend doing that in a lot of cases. If you don't, we'll give you the fast, easy way forward and it'll be mostly correct. That's one. The other is if you want to use Nix-based ecosystem tools that are exposed as flakes. The really simple, not quite correct definition of a flake in Nix: if you've ever used Ubuntu, they have a PPA, which is a personal package archive, which is just this random person's little apt repository for their tooling. That's kind of what a flake is. It's the definition of this one piece of software, or maybe a few pieces of software, that is in a separate package repository. That is not exactly correct, but as a mental model it is close enough. And so you can go out and get a flake. Now, that resolution did not happen with our catalog, so we can't say, hey, this is going to be guaranteed to work at the same time as these other pieces of software you have and all that. You've gone through a little bit of an escape hatch to do that. In a lot of cases it's going to work fine. Especially if your tool isn't linking back to other things in the environment, if it's standalone or whatever, you'll be fine. So we offer that. Some of the other things we expose in Flox would be very familiar to a Nix user, where they might have a shell.nix.
And we have our Flox manifest, which is written in TOML. It has all your package definitions, and all those imperative commands I talked about, search, install, upgrade, they're editing that TOML file. But you can go in there and edit it yourself and say, actually, when you load and activate this environment, here are the environment variables that I want to be active. Or here are the shell aliases that everybody should use. Say we have "project up" and it just starts everything that we want it to and pulls in a fixture for the database or whatever during development. You can define those aliases and have that all happen. Or you can have an MOTD-style thing: when you activate the environment, it says, hey, your application is now serving on port 8000. Things like that. Those are all things that basically we translated and passed through from Nix into the way that we operate. And some of it we've built on a little more. Nix kind of assumes you're running bash. We have made it so that if you're a fish user or a zsh user, you're still good. Those are things we've added. Yeah, cool.
B
Well, looking forward, what's next? What's coming down the pike for Flox over the next, I mean, do we dare scan out six months? Or in the current world, do we only go out a few weeks?
C
I mean, I feel safe for about a few weeks, but I'll pontificate all you want. You know, we're really trying to get into the runtime and production side of things right now. We spent, I would say, the better part of two years on this developer experience, and I think we've got it quite nice. There was a day I realized that if I didn't work at Flox anymore, I would still use it for my personal development. I just love the way that it operates and works. And that was a pretty cool day.
B
That's a good milestone.
C
Yeah, it was a great milestone. You know, and then we get some prospects and customers and they're like, I've used Nix enough to know that I want to use Flox. I'm like, that's a great quote, I'm putting that on the website. Things like that. But as we're looking forward, we're looking at the runtime a lot more, so production environments. Kubernetes is the first thing; it's launching at KubeCon North America here in 2025. We're really excited about that, and we want to see what the adoption looks like. It may be that that steers us into investing in that area and making it, you know, have more bells and whistles and more feature sets. We're also really looking at agentic runtimes and trying to make sure that we give them the control points that they need. Everybody's kind of re-looking at the way they develop software right now because of all these agents. And if that's what you're looking at, it's a great time to look at Flox and be like, well, if I'm changing the way I'm doing this anyway, is this a good time for Flox to enter here? And so can I give you control points? Because the code coming out of these agents, I think we all kind of need to assume it's hostile. We don't know that it is, but we don't know that it isn't. And since we don't know that it's not hostile, I'm going to assume it is. Which means that if I'm not going to have control on the development side, I need to put the control on the operator side. So that's the SRE, the DevOps person, you know, whoever's responsible for production. It could be the developers just wearing a different hat at that time. And so we need to be able to put a lot more control points on the runtime control plane. That's where I think Flox is going to be spending a lot of time, being like, okay, where are the safety nets, where are the gates, where are the checkpoints, where are the monitors? Because we don't know what's really happening here.
B
That's fascinating. Yeah. I think we are trying to figure out how we build stability again, determinism on top of this non-deterministic substrate. Right. Yeah, I agree. I think having more abilities to sort of hook into what's actually going on out there matters because of the amount of code being generated. There are probably teams that are still reading it all, but they're rare.
C
I do think several teams are reading it, and I think that might actually be doing them a disservice. If you're actually reading it all and trying to review it all, did you review the compiler when it wrote assembly or machine language? You probably didn't. And so did you trust it? I don't think we're at that trust level yet. But I'm also figuring out, how will we know if we get there? And so there's definitely, you know, some knobs.
B
And what tools do we need to get right? Yeah, absolutely. Awesome. So we're coming to the end of our time. Is there anything we haven't talked about yet today, or that's come up and we sort of moved beyond it, that you think would be worth covering before we wrap?
C
One of the key benefits of reproducibility, which, you know, I will hit on over and over again. I'm a release engineer by trade. What I love is never doing the same thing twice, never doing the same rebuild. But a lot of people, in their traditional workflow, write something on their laptop, run some tests, then submit it to the blessed CI system, you know, CircleCI, GitHub Actions, whatever, and it runs CI there. Maybe they get different failures there because they're running on a Mac locally and on Linux there, or whatever. One of the things that we're really looking into and looking forward to is this: if you've defined all of your inputs, which is what software is available, your software bill of materials, your supply chain, and you've run tests, and we know that the artifact is the exact same artifact because it's mathematically provable, why do I need to run tests again on the blessed system? If I've run tests locally and I have the artifact locally, and I can say, well, the inputs are the same, we've hashed them all, the output's the same, we've hashed it all, I don't need to run tests again, because we already have a receipt that says you ran them all and they worked. So now I can skip that entire part of the CI system. And maybe that's only some subset of tests, because maybe developers only run unit tests locally. They don't run all the integration tests or, you know, whatever.
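The receipt idea described above can be sketched in Python. This is a minimal illustration under stated assumptions, not something Flox ships: `ReceiptStore`, `receipt_key`, and `suites_to_run` are hypothetical names, the store is an in-memory set, and a real system would also need to establish trust in the receipts (for example by signing them), which is out of scope here.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content hash used to identify an input or an artifact."""
    return hashlib.sha256(data).hexdigest()


def receipt_key(input_hashes, artifact_hash, suite):
    """Derive one key from the hashed inputs, the hashed artifact, and the suite name."""
    payload = json.dumps(
        {"inputs": sorted(input_hashes), "artifact": artifact_hash, "suite": suite},
        sort_keys=True,
    )
    return digest(payload.encode("utf-8"))


class ReceiptStore:
    """Hypothetical in-memory record of test suites that already passed."""

    def __init__(self):
        self._passed = set()

    def record_pass(self, input_hashes, artifact_hash, suite):
        self._passed.add(receipt_key(input_hashes, artifact_hash, suite))

    def has_pass(self, input_hashes, artifact_hash, suite):
        return receipt_key(input_hashes, artifact_hash, suite) in self._passed


def suites_to_run(store, input_hashes, artifact_hash, suites):
    """Return only the suites with no passing receipt for this exact build."""
    return [s for s in suites if not store.has_pass(input_hashes, artifact_hash, s)]


if __name__ == "__main__":
    store = ReceiptStore()
    inputs = [digest(b"lockfile contents"), digest(b"source tree")]
    artifact = digest(b"built artifact")

    # The developer ran unit tests locally against this exact artifact.
    store.record_pass(inputs, artifact, "unit")

    # CI sees the same hashes, so only the integration suite remains.
    print(suites_to_run(store, inputs, artifact, ["unit", "integration"]))
```

Because the key covers every input hash and the artifact hash, changing any input produces a different key, and the earlier receipt no longer applies to the new build.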
B
I'm wondering, yeah, as you start saying this, can you map the dependency graph as well? So you're saying, which tests do I need to run based on not only what has changed, but also what tests I've run locally and all of that? Right.
C
We are not doing that today, for what it's worth. But I think that's the thing that I'm starting to get pretty excited about as I get back into this determinism. I'm like, I can maybe cut parts of my CI bill, but I can also just cut time off the wall clock, which is actually what developers are looking for. They don't want to have to get up and get a coffee every time they submit something to CI to see if it works. They want to sometimes, but not every time.
B
No, we want to go and get a coffee while Cursor works away on it, and then we've already drank our coffee, we're done. We want to ship it.
C
You got a good point there. But, yeah. So there's just a lot of things you can do with determinism, like when you know the artifact you're dealing with is the same on this system and on that system, and we've already calculated it all. If you can start to put your tests inside that artifact and stuff like that, now you have proof that it's all working the same. And I think that's really exciting. So I guess that's the one thing we needed to talk about, that reproducibility end of it. Like, what do you really get for reproducibility?
B
Yeah, awesome.
Podcast: Software Engineering Daily
Date: January 8, 2026
Guest: Michael Stahnke, VP of Engineering at Flox
This episode dives deep into software reproducibility, the challenges of dependency management across heterogeneous environments, and how Flox leverages Nix to create secure, deterministic software systems. Michael Stahnke shares his experiences in packaging, automation, and CI infrastructure, and discusses how Flox builds abstractions on Nix to make reproducible builds accessible, secure, and developer-friendly. The conversation also explores the rising importance of “agentic” (AI-augmented) software workflows and the implication of these technologies for the future of engineering teams.
Imperative commands (flox init, flox search, flox install, etc.) resembling tools like Homebrew. [15:07–18:33, 23:38–24:49]
This summary captures the core discussions, memorable quotes, and the evolution of both tooling and engineering culture covered in the episode. For those interested in reproducible software, secure supply chains, cross-platform devops, or the emerging intersection of AI and engineering practice, this conversation is essential listening.