
A
Deploying and managing cloud workloads is a complex task that requires developers to handle infrastructure, scaling, CI/CD pipelines, and database hosting. Configuring and maintaining Kubernetes, ensuring smooth deployments, and integrating various services efficiently is a common challenge. Will Stewart is the co-founder and CEO of Northflank, a platform focused on streamlining application deployment and management. In this episode, he joins the show to talk about the contemporary challenges and solutions around workload deployment. This episode is hosted by Sean Falconer. Check the show notes for more information on Sean's work and where to find him.
B
Will, welcome to the show.
A
Hi Sean, great to be here.
B
Yeah, thanks for doing this. I'm looking forward to it. I was digging into your background a little bit and it seems like outside of some time that you spent in university, most of your career has been sort of like founding companies. I'm kind of curious what has been that interest in being a founder and how did you go from where you grew up to essentially now founding and running a major infrastructure company?
A
That's absolutely true. Most of my professional life I've been a founder. Whilst at school I was always working on side projects: deploying game servers, and trying to build a game server hosting platform whilst at university. Then myself and my co-founder realized that this was potentially an opportunity for us to pursue full time. During my university years we learned how to program, iterating on things that were being open sourced, like Rancher and Mesos, and we were able to really start pushing the boundaries with container deployment as a side project. Then we started to think: do we want to get full-time jobs, or could we build a company, could we build a platform? We looked at whether we could join an accelerator program, and there was an accelerator in Europe called The Family that really put us on a journey of building a startup and meeting the right people. Transparently, I was in a small city in the north of England called Lincoln. There isn't much of a technology hub there. It's the home of the Magna Carta and the birthplace of the tank in World War I, but it doesn't have a core software engineering history. So being able to join this accelerator and learn what building a startup was all about was great for me. My co-founder Fred went to university in Switzerland and has always been interested in side projects and building. At university I studied IT management for business, which gave me optionality, starting with a little bit of business and a little bit of computer science. During my third year I joined a digital agency in the UK called Clock as a placement student. That was a great opportunity for me to see how a real business is run: how you deal with customers, how you build multiple projects concurrently, how you hire engineers and have engineers working on projects.
So I was able to see how a business was run. During that time I also saw how tough infrastructure was. If you've got lots of different customers, lots of different requirements, lots of staging environments, and you're trying to get releases to production, things break. I was seeing that and thought, well, the project my co-founder and I were building in our spare time for deploying game servers could also be used to deploy any software. We were thinking about Dockerfiles, and ultimately a microservice is very similar to a game server. That's pretty much what we've had our heads in ever since. So that's how we got into it, almost by accident.
B
What were some of those challenges you ran into working in those IT departments, where you saw people essentially trying to run distributed systems, and where were you able to identify the opportunity with some of the stuff that you were doing in gaming?
A
So it also stems from when we tried to deploy our own backend for this game server hosting platform. I always say that game servers are a gateway drug to cloud infrastructure and Kubernetes, because I could spin up a Mesos cluster and deploy containers fairly simply, but it was challenging to get the control plane running. And then there was no CI/CD, there was no developer experience, there was no way to push to version control and have it build and deploy straight to prod. That was not a thing. Heroku had that. Heroku had this great experience where you could build and deploy, but at the time it wasn't necessarily for complex applications. It was more simplistic. In the role of a placement software engineer, you're tasked with spinning up staging environments, but there was a queue for staging, or when you had a new customer you had to spin up brand new infrastructure, and it would take weeks. And it was like, well, this could be done in minutes or hours. Maybe that's where our expertise in trying to build this developer experience could be quite helpful. Interestingly enough, the company I did my placement with actually leverages our platform in production now. So I've managed to come full circle, learning the trade and now trying to solve some of those problems for them directly. But if we step back a little bit, platforms like Heroku and Cloud Foundry solved this developer experience, and in today's world a lot of those learnings have been forgotten and teams are having to build internal platforms from scratch. This is what I was seeing firsthand. We set out to solve that problem, so that you could get a great developer experience inside your own cloud account, leveraging things like Kubernetes and containers, and then building foundational technology on top so that teams could self-serve and deploy complex workloads to production.
B
You mentioned there that it would take potentially weeks to spin up these new environments. Why weeks? Is that an organizational problem, developer experience problem or technology problem? Maybe like a combination of the three.
A
So I think there's a mindset where people think in infrastructure and not workloads. I've got to provision EC2, I've got to have all my Terraform, it's got to go to this team, it's got to be approved, it's got to be tested, it's got to be manually provisioned. Maybe there was some bare metal, maybe there was some public cloud. Then there are different workflows for different customers, and a different team uses a different cloud provider. If you think about what we're actually trying to do here, it's deploy a microservice and a database and a cron job. Most infrastructure stacks boil down to three types of workloads. So how do you provide a common primitive so that teams can say, I need Postgres, I need a Java Spring Boot application, how do I deploy that consistently for all environments? That's how we've been thinking about it at Northflank: a high-level abstraction of infrastructure on Kubernetes. We're now six years in, and finding the right balance of control and ease of use is, I think, the core problem that internal platform teams and DevOps teams are struggling with. How do we give a self-service developer experience whilst also allowing deployment of complex workloads? That fine balance is where a lot of teams are struggling today, whether that be on EKS, GKE, AKS, or even ECS and Cloud Run. There's a lot of cloud infrastructure and there are so many more tools now than 10 years ago, but ironically it's as complex, if not more complex, than it was 10 years ago to ship a container. I think there's a core issue here: teams are focusing too much on building the factory that delivers a product rather than building the products themselves. We need to change how teams think about their infrastructure. Why are we investing so much in building a factory to host our core product, when building custom infrastructure is non-differentiated and we should be focusing more on building business logic?
B
How do you think we ended up in this position? In some ways it reminds me of the modern data stack in the data engineering world. It makes for these pretty pictures with all these logos and different complex pipelines, but the reality is that no one wants to have to stitch together 15 different products just to move data from an S3 bucket down to Snowflake or something like that. So how have we gotten to this place where we've made shipping a container progressively more complicated?
A
So I think there's a combination of reasons. One is that deploying infrastructure is inherently complex. You've got to think about performance, security, disaster recovery. You want to give more control to developers, but you don't want to give them too much control. It's a continuous, ever-moving whack-a-mole of requirements. And how teams have solved that is by stitching tools together, because ultimately that's where the market was going with open source technology and Kubernetes winning the enterprise war. Kubernetes is now the default way enterprises deploy software. It's very easy to start a Kubernetes cluster, it's very easy to run a helm install, but it's very hard to then integrate all of these easy things into one consistent, unified platform that tens of thousands of engineers consume. As soon as a team needs the A to Z of requirements, service mesh, disaster recovery, logging, metrics, you start having a hodgepodge of 15 or 20 different open source tools that then start to break. They don't glue well together. I think there's a case for a more high-level approach, where some of these core technologies are automated at a higher level so that every team isn't building its own data services, its own CI/CD pipelines, and its own version of what teams now call internal developer portals or platforms. For example, we can't even agree on what IDP means. Some teams think it's a portal, some think it's a platform. And I think sometimes teams are focusing too much on the tooling. Ultimately the company has an objective: build business logic, sell to customers. We have this phrase: customers don't pay you to write YAML. I think that's more true than ever with stretched budgets, stressed teams, and more requirements to deliver more with less. And a lot of people are spending a lot of time building the exact same internal platform across teams.
Every single organization with more than 50 software engineers is building an internal platform.
B
And what about, you mentioned Heroku? There are other PaaS systems that have come about later than Heroku. Heroku was groundbreaking when it came out; it was such a transformation for what you could do. It's had its own stumbling blocks along the way in terms of not necessarily keeping up with everything that's happened on the infrastructure side, and there have also been other, newer takes on that idea. But a lot of people who start with those platforms run into the graduation problem, where they reach a certain scale and need to move the workloads they're running on those platforms over to the services available in AWS or GCP or whatever, and manage that themselves. So how do you avoid running into that graduation problem?
A
So fundamentally Heroku was awesome. To some degree it still is awesome. The very large acquisition obviously maybe stifled some growth there, but they're still a very successful business. A billion dollars in revenue is a wonderful business. But of course there are restrictions. As soon as an organization tries to scale, they bring this in house, as you just said. The graduation problem is something that we've been thinking deeply about at Northflank, and it really stems from the VPC. Any company at any serious scale wants to run its workloads in its own VPC, whether that be public or private cloud. At the other end of the spectrum, Cloud Foundry and Pivotal Cloud Foundry found huge success, and then Kubernetes came along and sort of spoiled the story. So at one end you have Heroku and at the other end you have Cloud Foundry. Northflank is trying to solve this problem in the middle, which is that teams need a self-service developer platform. If you squint, Cloud Foundry and Heroku look fairly similar: an engineering team self-serves and deploys a container to production, with CI/CD, logs and metrics built in. It's a fairly simple concept but very hard to implement well. In Northflank's case, Kubernetes has allowed us to have this. We leverage Kubernetes as an operating system, which means that we can run our platform in any cloud, whether it be AWS, GCP, Azure, Oracle, on-prem OpenShift or on-prem Rancher. We're trying to solve the consistency problem, where you can take this runtime and run it anywhere. That's where we work with enterprises to run on their own hardware, so that they don't have the graduation problem, because they have more flexibility. Kubernetes has given us the ability to support more complex stateful workloads, high-availability data services, preview environments and production release flows.
All of this core technology now works consistently across all clouds, and that means, as we've found at least, that this negates the graduation problem, because teams feel: it's running in my infrastructure, I have more control, I have control over data residency, I can meet compliance requirements, and I can fine-tune my infrastructure exactly how I want. But I also have a great self-service developer experience. Finding that balance is what we're focused on.
B
Can you walk me through that process? What's it like, as someone using Northflank, to actually be building on it? If I have to essentially host this myself, how do I get away from running into the problems of, essentially, I have to host this myself and now I need to manage it?
A
Exactly. So, this easy journey. When I was deploying Mesos as a teenager, as you do, there was a product that D2iQ built, Mesosphere, and the installer process was very manual, and that's always stuck with me. How do you install a control plane in any cloud in under 30 minutes? With Northflank, a user can sign up and then do a cross-account link to their AWS, GCP or Azure account. We've separated control plane and runtime, so Northflank is able to spin up an EKS cluster or GKE cluster, fully managed, inside your cloud account. Northflank's control plane then allows you to connect your GitHub, Bitbucket or GitLab and define your workloads. You could have 100 microservices; we have teams deploying 1,000 microservices on Northflank into their EKS clusters. So you define the workloads, you define the environments, and then a manifest is generated and applied to your EKS cluster. Within 30 minutes you could have an EKS cluster created inside your AWS account and your workloads deployed to production, without ever having to know what a Kubernetes cluster is or write a line of YAML.
B
So the code deployment is essentially going to be based on a standard flow: I commit to GitHub, and then it's going to take over from there, based on something like a GitHub Action or however I'm doing CI/CD?
A
Exactly. So you connect your version control and we listen to the webhooks. You push your Spring Boot application, your .NET, your Java, your JavaScript, Node.js, any application that supports Docker or Buildpacks. You push, and Northflank does the CI/CD, does the deploy, does the release, logs, metrics, disaster recovery. We can even do things like per-PR backend previews. So if a team opens a new pull request, we can make a replica of production and run it on spot instances to reduce costs, while giving every developer their own preview environment. One thing we hear time and time again across many engineering teams is that they're still stuck with two static dev environments, and they want every engineer to have a replica of production for preview that spins down during the night to save on costs. So that's generally the experience: you push to get a preview environment, merge to staging, have UAT and some acceptance testing, and when it passes you merge to main and it deploys to production.
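The push-to-preview, merge-to-staging, merge-to-main flow described here can be sketched as a small routing function. This is only an illustration of the idea, not Northflank's implementation: the event shape, the `target_environment` name and the `preview-pr-<number>` naming scheme are all assumptions made for the example.

```python
from typing import Dict


def target_environment(event: Dict[str, object]) -> str:
    """Map a version-control event to the environment it should deploy to.

    The event fields ("type", "number", "branch") are hypothetical and
    only loosely modeled on what a Git webhook payload might carry.
    """
    if event["type"] == "pull_request":
        # Every pull request gets its own short-lived preview environment.
        return f"preview-pr-{event['number']}"
    if event["type"] == "push":
        branch = event["branch"]
        if branch == "staging":
            return "staging"
        if branch == "main":
            return "production"
    raise ValueError(f"no environment mapped for event: {event}")


events = [
    {"type": "pull_request", "number": 42},
    {"type": "push", "branch": "staging"},
    {"type": "push", "branch": "main"},
]
targets = [target_environment(e) for e in events]
print(targets)
```

Keeping the mapping in one pure function makes it easy to see, and to test, which events can ever reach production.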
B
Could this help me with local development as well, where I can essentially have some sort of environment that I'm running my code against that is closer to mimicking actual production?
A
So we've seen a number of teams leverage Northflank in different ways. We haven't necessarily designed a local-first experience, just because running Docker Compose locally is quite a great experience, so we haven't invested too much time there. But we do have things like Northflank forward, so you can proxy remote containers running in cluster and have them available locally. If you need to access staging securely, or some database or service running remotely, you can do that. One thing we're also building is bring your own Kubernetes. So in theory you could import a local cluster running on your machine and have Northflank running on it, which we do have some people doing, but we haven't necessarily designed for dev first. We actually call Northflank a post-commit platform: once you've pushed to version control, Northflank handles everything. We haven't necessarily designed for a pre-commit experience.
B
And then are both the control plane and the runtime running within my cloud?
A
So currently, by default, Northflank is a traditional PaaS where you sign up, add a card, and the control plane and runtime are in Northflank's infrastructure, where we provide secure multi-tenancy. Then as organizations scale, they need the runtime in their cloud, so it's control plane in our infrastructure and runtime in their infrastructure. And we even have customers that require the control plane in their infrastructure too. That's something we're working on, which we call self-deployable control plane, where you get a managed experience of running the Northflank control plane in your infrastructure. Because transparently, that's exactly what the largest enterprises in the world need, and that's what we've been working on.
B
That trend, where you've got the SaaS solution, then the version where some portion of it runs within someone's own cloud environment via a dedicated VPC, and then even a version where I can deploy the whole thing and host it within my cloud, seems to be where a lot of both data infrastructure and infrastructure as a service needs to go. If you look at my day job at Confluent, we have all those versions, especially with the acquisition of WarpStream. Customers want to bring their own cloud, and you essentially need to have a solution for that. Do you see that as the future that everyone who operates in this space has to kind of build against?
A
100%. This is the most essential thing that most SaaS organizations need to understand right now: enterprises can't buy your software if they can't deploy it themselves. If you haven't separated control plane and runtime, you'd better start thinking about it. What we've started to think about at Northflank is how we leverage this opportunity, and we call this BYOC as a service. If every single engineering team has to build a managed, self-hosted control plane, that's a huge problem, because realistically teams want to build one product, not two. So this is where we're helping some of our customers provide a one-click managed experience to run Northflank BYOC plus their software inside a customer's cloud account. We're finding like-minded individuals who understand that enterprise customers want to run software inside their own cloud. They've made huge GCP and AWS resource commitments, they've made huge billion-dollar investments in on-prem hardware. It's got to be used for something, and it's got to be used for running the SaaS application they really want to run.
B
Are you talking about in this case where like one of your customers who's building Some sort of SaaS application can leverage your technology to offer sort of bring your own cloud out of the box without having to think about building all that stuff themselves?
A
Exactly. So it could be that I've just built a new SaaS product and I want to have a multi-tenant offering. Northflank gives you the ability to provide secure multi-tenancy. Just like our PaaS, it's basically Northflank as a service, where we provide the secure multi-tenancy so you can run containers with great security on Kubernetes and inside a customer's cloud account. And then again, that SaaS provider may need to run their software securely in their customer's account. So essentially, for Northflank, it's our customers' customers, and we're doing that through the Northflank API with BYOC as a service.
B
Yeah. Can you break down the control plane? Like what is the control plane actually responsible for?
A
So a lot of teams think about infrastructure as code; we like to think of infrastructure as data. Ultimately the job of the control plane is to provide this spec, a DSL abstraction of cloud infrastructure. Instead of thinking of Helm charts, Northflank is thinking about a JSON data structure. What's a workload, what's a database, what's a cron job? Teams define those workloads in Northflank, and that's stored in our control plane. Our control plane is then listening and subscribing to all of the Kubernetes clusters. It's observing when things start failing and tries to auto-heal. It's listening to Kubernetes events, logs and metrics, and it's essentially acting as a controller, detecting drift between the spec stored in our infrastructure and what's running in cluster, and applying changes to correct it. Then we're just applying new manifests as users make changes, either via GitOps or through the UI or API. It's essentially our job to generate the Kubernetes manifests and apply those to the cluster.
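The controller behaviour described here, comparing a stored spec with what is running and correcting the difference, can be sketched in a few lines. This is a generic illustration of drift detection under simplifying assumptions, not Northflank's code: the `detect_drift` name and the shape of the spec dictionaries are hypothetical, and a real controller would watch the Kubernetes API rather than compare two in-memory dictionaries.

```python
from typing import Any, Dict, List, Tuple

# A spec maps a workload name to its desired (or observed) settings.
Spec = Dict[str, Dict[str, Any]]


def detect_drift(desired: Spec, observed: Spec) -> List[Tuple[str, str]]:
    """Return the (action, workload) pairs needed to make observed match desired."""
    actions: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))   # missing from the cluster
        elif observed[name] != spec:
            actions.append(("update", name))   # running, but out of date
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))   # running, but no longer wanted
    return sorted(actions)


desired = {
    "api": {"image": "api:v2", "replicas": 3},
    "worker": {"image": "worker:v1", "replicas": 1},
}
observed = {
    "api": {"image": "api:v1", "replicas": 3},
    "legacy": {"image": "legacy:v1", "replicas": 1},
}
plan = detect_drift(desired, observed)
print(plan)
```

Treating the spec as plain data is what makes this comparison cheap: the controller only has to diff two structures and then apply the resulting plan.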
B
What about the runtime? What is it doing?
A
So the runtime is essentially just running containers, configuring the service mesh, and making sure that Cilium and the containers are all running healthily. We don't actually have an agent running on the cluster. We do everything through the Kubernetes API, which gives us so much freedom to leverage Kubernetes on any cloud provider, because the cloud providers are incentivized to have a consistent experience. To be compliant, they have to provide a consistent Kubernetes experience for their customers. That means that when we build a feature for AWS, it works on GCP. When we build a feature for Rancher, it works on OpenShift. Our job is to provide that runtime through the Kubernetes API. So the runtime ultimately is just the user workloads.
B
How does someone who's doing infrastructure as code today, using Terraform or whatever, fit in with using Northflank?
A
This is something that we're continuously battling. Our stance is that you don't need Terraform. Our job is to eliminate your Terraform. Let's think in a high-level primitive. There's no point thinking in lower-level infrastructure primitives anymore, because it's now a solved problem. If you start thinking about your workloads, infrastructure as code trends from HCL to a JSON sort of way of thinking. Maybe some of our customers would prefer us to provide a Terraform plugin, and that's something we have been looking at, or even driving Terraform through the Northflank DSL. But currently we're finding enough customers that just say, please help me, I don't want to have GitHub Actions, Terraform, Helm charts, Kubernetes manifests, and then some Pulumi as well. People want simplicity. And that's where our pushback is: you don't need your Terraform.
B
What are the primitives that are defined within the DSL?
A
So the primitives are things like a cluster, which defines the structure of your Kubernetes cluster in AWS, GCP or Azure, and things like a deployment service, which is an abstraction of a Kubernetes deployment, where you define your existing image. Let's say you use GitHub Actions or CircleCI and you use ECR or something like the GitHub Container Registry. You can import your existing images and define how many replicas you want, the secrets you want to inject, whether you want to enable autoscaling, whether you need health checks, defining those configuration options directly on the workload. Then things like networking: how many ports do you want to expose, do you want to make it publicly available on a network load balancer? All of these settings are configured directly on a single object. Where traditionally in Kubernetes you'd have to define five to seven different manifests to achieve that, a user is just defining a single configuration. For example, in fewer than 130 lines of JSON I can deploy a highly available, zonally redundant Mongo, Postgres and Redis plus three microservices into any Kubernetes cluster, whereas in Terraform and Helm charts that would be 1,000 to 3,000 lines, and no one wants to go and write that.
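To make the "single object versus several manifests" point concrete, here is a minimal sketch of how one workload definition could be expanded into multiple Kubernetes resources. The input field names (`name`, `image`, `replicas`, `ports`, `public`, `autoscaling`) and the `expand_workload` function are invented for illustration and are not Northflank's actual schema; the generated objects use the standard Deployment, Service and HorizontalPodAutoscaler kinds.

```python
import json
from typing import Any, Dict, List


def expand_workload(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Expand one high-level workload spec into Kubernetes manifests."""
    name = spec["name"]
    labels = {"app": name}
    manifests: List[Dict[str, Any]] = [{
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": spec["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": name,
                    "image": spec["image"],
                    "ports": [{"containerPort": p} for p in spec["ports"]],
                }]},
            },
        },
    }]
    if spec["ports"]:
        # Networking settings on the workload become a Service.
        manifests.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": name},
            "spec": {
                "type": "LoadBalancer" if spec.get("public") else "ClusterIP",
                "selector": labels,
                "ports": [{"name": f"port-{p}", "port": p, "targetPort": p}
                          for p in spec["ports"]],
            },
        })
    autoscaling = spec.get("autoscaling")
    if autoscaling:
        # Autoscaling settings on the workload become an HPA.
        manifests.append({
            "apiVersion": "autoscaling/v2",
            "kind": "HorizontalPodAutoscaler",
            "metadata": {"name": name},
            "spec": {
                "scaleTargetRef": {"apiVersion": "apps/v1",
                                   "kind": "Deployment", "name": name},
                "minReplicas": autoscaling["min"],
                "maxReplicas": autoscaling["max"],
            },
        })
    return manifests


workload = {
    "name": "web",
    "image": "nginx:1.27",
    "replicas": 2,
    "ports": [80],
    "public": True,
    "autoscaling": {"min": 2, "max": 6},
}
manifests = expand_workload(workload)
kinds = [m["kind"] for m in manifests]
print(json.dumps(kinds))
```

Secrets, health checks and ingress would each add further manifests in the same way, which is where the five-to-seven figure mentioned above comes from.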
B
In terms of defining a database, how does that work? If I want to use a specific type of database, is the set of database types restricted to what's supported by Northflank, or does it not matter?
A
So we have a number of options, and we try to think about this very carefully, because stateful workloads are critical to a business. You can't lose the data, and if there is data loss, how do you recover it? Currently we offer six managed stateful offerings: Postgres, Redis, MongoDB, RabbitMQ, MySQL and MinIO. Those are the most popular open source databases and what we find most common. Where we don't have an offering, or teams want to use Atlas or RDS, they can. It's essentially a connection string. It's a secret. So if you want to leverage those offerings, you use a Northflank secret group to inject those securely at runtime. How we build our stateful workloads is essentially an operator pattern in Kubernetes: we have an operator that can apply these stateful workloads in cluster. So really it's up to the customer. Do they want to run stateful databases in Kubernetes, which we think is great for cost efficiency and simplicity? Some customers don't agree with that vision and want to use RDS and Cloud SQL, and if they do, that's okay, we don't care. Then on the side, teams want to run other stateful workloads in Kubernetes for which we're not able to provide managed services yet. It's on our roadmap, but in the meantime they can run Helm charts. They can leverage what we call Bring Your Own Add-on in Northflank, where they bring their own tidied-up Helm charts that provide data services in cluster, and they're free to go crazy with what data services they want to run.
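From the application's side, "it's essentially a connection string, it's a secret" usually means reading one environment variable at startup. The sketch below assumes the secret is injected as an environment variable called `DATABASE_URL`; that variable name, the `database_settings` helper and the example URL are assumptions for illustration, not names taken from Northflank.

```python
import os
from typing import Any, Dict, Mapping
from urllib.parse import urlsplit


def database_settings(env: Mapping[str, str] = os.environ,
                      key: str = "DATABASE_URL") -> Dict[str, Any]:
    """Parse an injected connection string into its parts.

    The application never hard-codes where the database lives, so the
    same container works against an in-cluster database or an external
    one such as RDS or Atlas.
    """
    url = urlsplit(env[key])
    return {
        "scheme": url.scheme,
        "host": url.hostname,
        "port": url.port,
        "user": url.username,
        "database": url.path.lstrip("/"),
    }


# Stand-in for the environment a platform would inject at runtime.
example_env = {
    "DATABASE_URL": "postgresql://app:s3cret@db.example.internal:5432/orders",
}
settings = database_settings(example_env)
print(settings["host"], settings["port"], settings["database"])
```

Passing the environment in as a parameter keeps the function testable without touching the real process environment.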
B
So in that case, where you're actually extending this with your own Helm charts, to tie this to programming languages in some ways: you're providing the abstract classes, these primitives that I can operate with and build at that level, but if I need to, I can essentially extend those classes and do my own bespoke work to really tailor this to my needs?
A
That's exactly right. One phrase that we've used repeatedly is: how do we find the right abstractions for Kubernetes? Because ultimately it's won the enterprise war. It's going to be here for a long time. I know a number of startups in our space have not gone the Kubernetes route, and I think that's a mistake. For us to deliver the BYOC model and this vision of running in your VPC and having this cloud abstraction, Kubernetes is the landing target for today. In the future it may not be. If another orchestrator comes along in three or six years' time, Northflank will be ready to transition to it. The first version of Northflank was running on Mesos, and we transitioned to Kubernetes when it came along. So our job is to provide the right abstractions at a workload level, and then we can swap the underlying orchestrator in and out when a new one comes along. And you're absolutely right that it's our job to enable you to deploy your complex workloads. That means if we don't do something, someone's blocked from deploying. So how do we unblock them? Through Bring Your Own Add-on, bringing your own Helm charts, and then exposing as many of the features Kubernetes offers directly in product, ironically without the user realizing it's Kubernetes behind the scenes.
B
And then if I'm running certain parts of my code essentially at different endpoints as microservices, how do I map that code base essentially to the microservice deployment.
A
So in Northflank we have a couple of primitives called build services and deployment services. Think of a build service as a repo link: this is my repository, and it's going to produce a Docker image, an OCI image. Then you have to build the relationships for how code gets from build to deploy, and in Northflank that looks like a pipeline. These deployment targets all have networking configurations and DNS names, and are exposed by the service mesh. When we demo Northflank to prospects or customers, we show them a small demo of deploying NGINX, because that's the simplest container to deploy. How would you get a container running NGINX on Northflank? We show some of the configuration: this is where you enter the Docker Hub address, and then immediately we scan the manifest to detect the port and whether it should be publicly available. Then we pre-fill the networking configuration for port 80, expose that publicly, and allow you to configure the resources. So in under 30 seconds you can go from an existing image to networking configuration and a deployment running in cluster.
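The port pre-fill step in the NGINX demo can be sketched as follows. Container image configurations commonly carry an `ExposedPorts` map with keys such as `80/tcp`; the function below reads that map and proposes a networking configuration. The `prefill_ports` name and the rule that ports 80 and 443 default to public are assumptions made for this example, not a description of how Northflank decides.

```python
from typing import Any, Dict, List


def prefill_ports(image_config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Suggest networking settings from an image's declared exposed ports."""
    exposed = image_config.get("config", {}).get("ExposedPorts") or {}
    suggestions: List[Dict[str, Any]] = []
    # Keys look like "80/tcp"; sort numerically so the output is stable.
    for entry in sorted(exposed, key=lambda e: int(e.split("/")[0])):
        number, _, protocol = entry.partition("/")
        port = int(number)
        suggestions.append({
            "port": port,
            "protocol": (protocol or "tcp").upper(),
            # Assumption: common web ports are suggested as publicly exposed.
            "public": port in (80, 443),
        })
    return suggestions


# Shape loosely modeled on the config section of an NGINX-style image.
nginx_like = {"config": {"ExposedPorts": {"80/tcp": {}}}}
prefilled = prefill_ports(nginx_like)
print(prefilled)
```

The suggestion is only a default: the user would still confirm or change the port and its visibility before deploying.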
B
So a lot of your inspiration and journey came from running game servers. But what is the difference between deploying and running game servers and doing the same for enterprise applications, enterprise software?
A
So one is obviously revenue focused, whereas deploying game servers was more for enjoyment. The difference is if our software goes down, our customers lose money. And that's something we try and ingrain in all of our engineers: we're building critical-path software. If a game server goes down, some people can't play an online game. They can't go and compete in the evening. But ultimately, if a database goes down, there are serious ramifications. If a backup doesn't work and a customer loses their data, that's really important accounting information or critical business logic that has gone awry. And I think it's more about understanding the customers and what their demands and requirements are. Before, we were deploying Mesos for fun, and now it's serious: we've raised some money, we've got some customers. It's our goal to handle this with care because it is important.
B
How does that change the way that you go about actually building the software? Do you have to essentially spin up new teams that have different responsibilities, have people focused on security, for example, and change the way that you actually test?
A
I'd say that one interesting fact at Northflank is we haven't had to deprecate a single feature in six years. We've taken maintaining our platform very seriously: if you release a feature into the world in a cloud infrastructure product, you've got to do so with some credibility, because a customer needs to know that it's not going to go away. And how we do that is to think very carefully about how we integrate primitives into our code. Let's release a feature, but it's got to be well integrated into the overall vision of the product. We have enterprise customers that sign up and go, "We need this feature to sign this contract." And we don't go, "Sure, we'll do it." We go, "Why? Why? Why?" Then it's: how do we take those requirements and build them consistently into the vision? How do we build that quickly as an MVP to have this customer sign a contract? And then how do we build it for all of our customers, so that we can expose it through a feature flag to our enterprise customers? I think it's quite common across most companies, but that's just how we think about it.
B
So since you're dealing with these enterprise applications and you're building this abstraction layer, you want to be careful about introducing new types of primitives. Does that limit how quickly you can adapt to new technology that's coming online?
A
So yes and no. For example, our deployment target is currently Kubernetes, so we didn't really jump into WASM or serverless containers, because we were trying to achieve a vision of serverless containers without WASM. So we didn't necessarily jump into that. We could have integrated Cloudflare Workers, we could have integrated Fastly's Compute@Edge, which are great products with a great developer experience. But we didn't need to go down that route, because we're trying to provide an abstraction for which that primitive wouldn't necessarily be required. But then on the other side, currently GPUs are hot. Some of our customers started requesting, "Hey, it's great that I can run all of my CPU workloads, but what about all my GPU workloads?" And we thought, well, in a cloud platform like Northflank, it should just be another primitive. So then we worked quickly to allow our customers to run GPU workloads in their cloud account. And it's funny, because a lot of companies are building these GPU automation platforms, but they have no solution to microservices, they have no solution to databases, they have no solution to cron jobs. And when an organization is thinking about deploying their infrastructure, they've got five types of applications now, with GPU being an extra one. And that's where Northflank can fit in quite nicely, because we have a consistent story for deploying all of your infrastructure in the same VPC. Yeah.
B
And ultimately, even if you're building, you're doing inference, for example, to build AI applications, there's some application part of this that's not going to be running on GPUs. So if you're doing that with a specific, I guess, GPU platform provider, then you're going to also be doing something different. So it kind of goes back to the same problem that we talked about at the beginning, where ultimately no one wants to have to stitch together like 15 different solutions to deploy their software.
A
This is exactly right. And we just posted a case study with a company called Weights. They have two engineers working on this full time. They come from Pinterest, highly skilled engineers. They could have done this themselves. They started deploying their CPU and GPU workloads with Northflank, and now they're servicing millions of users and there's just two of them. I'm just completely in awe of them, that they can scale such an awesome application and deploy so much infrastructure with just two of them. And I compare that with some teams where they've got 25 DevOps engineers and they handle about 100 RPS. I've started to ask about that when I speak to some prospects. I say, "What's your RPS?" And some of them say 26. I say, "How many engineers do you have?" And the answer can be 50 or 30.
B
But they need two engineers for each. Kind of building on that, with companies needing so many DevOps engineers to reach minimal scale, how do you see the role of DevOps evolving in the next few years if technology like Northflank and other players in the market essentially take off?
A
So I think that the Linux system administrator became DevOps, DevOps became the platform engineer, and ultimately the goal is to provide a consistent and stable way to deploy. Ultimately their job is to provide developer experience to application engineers. And the concept of "you build it, you run it" was hot with Heroku and Cloud Foundry and in recent times has gone away. But I think with limited resources, organizations just don't have the resources to spin up huge teams to do things that aren't directly moving the needle for the business. DevOps is going to be seen as a cost center, not a relief. And teams are going to have to adapt to that, because ultimately if you're not driving the critical business logic of the business, there'll be questions there. So I think that platforms like Northflank can reinforce DevOps teams to provide more value to their business faster, rather than spending three years building an internal developer platform, which makes no sense: it's too expensive, it takes too long, and then every company that does that has to re-platform every five years anyway. So all of that effort was almost for nothing.
B
How do you see, I guess, the future of cloud infrastructure management changing as well?
A
So I think with platforms like Northflank, and there will be others, it sort of makes cloud infrastructure a commodity. I think we all know the three winners in public cloud spending, but there's also a lot of private cloud spending as well. There are still many hundreds of billions of dollars being invested in private cloud. I think that the cost of cloud infrastructure has stopped coming down. It's actually now, in theory, probably going up. We've been used to cloud infrastructure becoming cheaper, and that's not happening anymore. With GPUs, you spend $66,000 for 8 H100s on GCP on demand at the moment. That's completely insane. And with platforms like Northflank, infrastructure becomes more like a commodity. I think that starts to drive the cost down. I think that the cloud provider becomes less relevant because ultimately they're providing... it's like a water company. It doesn't really matter who provides you your water or electricity or gas. You just get some hardware, and then it's what you run on top of that that matters. And that's where I think the interface through which developers consume cloud infrastructure will be more important.
B
So what's next for Northflank?
A
Currently we're building self-deployable control plane automation, where an enterprise can run the Northflank control plane themselves. That's really quite an essential part of our mission going forward. Then self-service GPUs: we have a great bring your own cloud offering for GPUs, but many of our customers are demanding access to spot GPUs in seconds, which is something we're working on very diligently at the moment. And then just providing more enterprise features. We come across organizations still using Cloud Foundry day in, day out, with huge deployments of millions of containers, and they're stuck, because they then have to go and rebuild from scratch the wonders of Cloud Foundry, or turn to an app platform or some homegrown OpenShift solution. Our opportunity is to try and capture this huge amount of demand and provide an out-of-the-box solution that hopefully gets through the difficult procurement and legal challenges at many enterprises. That's what we're trying to solve this year.
B
Does the bring your own cloud form of deployment help with some of those procurement challenges?
A
I'd say that for startups, 100%, but then many startups don't have the same legal challenges that some of these enterprises have. I think the self-deployable control plane is the biggest unlock. When you go to KubeCon or any event, almost every developer in the room is talking about air-gapped. I think in startups, in SF and SaaS startups, no one's ever talking about air-gapped. But when you're boots on the ground, listening to the organizations that need to consume all this software, it's air-gapped, air-gapped, air-gapped, air-gapped. So I think there's a disconnect between what people need in the enterprise and what people are delivering. So I think that's quite important. But the bring your own cloud offering is a huge unlock for Northflank and for startups. The ability to self-serve and deploy to production in 30 minutes is huge for us. So we're investing in our enterprise story and also in how we unlock startups to deploy to production with fewer resources.
B
Awesome. Well Will, thanks so much for being here.
A
Thank you so much and have a great day.
B
Yeah, cheers.
A
Bye bye.
Date: August 21, 2025
Host: Shawn Falconer
Guest: Will Stewart (Co-founder & CEO, Northflank)
This episode dives into the contemporary challenges of deploying complex software workloads in the cloud, focusing on developer experience, infrastructure abstraction, platform engineering, and the evolution of internal tools. Shawn Falconer speaks with Will Stewart about how his startup, Northflank, aims to streamline application deployment across clouds and on-prem environments, avoiding the notorious "graduation problem" faced by Heroku-like platforms. The discussion covers technological trends, enterprise needs, abstractions over Kubernetes, developer self-service, and the future of DevOps and platform engineering.
Timestamps: [00:53]–[03:41]
"I was in a small city in the north of England called Lincoln... being able to join this accelerator and learn what building a startup was all about was great." [01:45]
Timestamps: [03:41]–[07:33]
"Game servers are a gateway drug to cloud infrastructure and Kubernetes because I could spin up a Mesos cluster and deploy containers fairly simply. But it was challenging to get the control plane running." [03:58]
Timestamps: [07:33]–[09:57]
"It's very easy to start a kubernetes cluster, it's very easy to run a helm install, but it's very hard to then integrate all of these very easy things into one consistent unified platform." [08:33]
Timestamps: [09:57]–[12:54]
"Northflank is trying to solve this problem in the middle... teams need a self-service developer platform that if you squint Cloud Foundry and Heroku look fairly similar." [11:19]
Timestamps: [12:54]–[16:17]
"Within 30 minutes you could have an EKS cluster created inside your AWS account and your workloads deployed to production without ever having to know what a Kubernetes cluster is or write a line of YAML." [13:57]
"We actually call Northflank a post commit platform. So when you've pushed to version control, Northflank handles everything..." [15:52]
Timestamps: [17:49]–[19:27]
"Enterprise can't buy your software if they can't deploy it themselves. And if you haven't separated control plane and runtime, you better start thinking about it." [18:25]
Timestamps: [20:11]–[25:45]
"Our control plane is listening and subscribing to all of the Kubernetes clusters. It's observing when things start failing, tries to auto heal..." [20:52]
"Our job is to provide the right abstractions at a workload level and then we will be able to change in and out the ultimate underlying Orchestrator when it comes along." [26:27]
Timestamps: [24:03]–[25:45]
Timestamps: [26:08]–[27:09]
Timestamps: [29:00]–[31:43]
"If our software goes down, our customers lose money. And that's something we try and ingrain in all of our engineers..." [29:39]
Timestamps: [31:59]–[33:41]
Timestamps: [34:25]–[35:55]
"DevOps is going to be seen as a cost center, not a relief... If you're not driving the critical business logic of the business, there'll be questions there." [35:20]
Timestamps: [35:59]–[37:04]
"The cloud provider becomes less relevant because ultimately... it's what you run on top of that that matters." [36:39]
Timestamps: [37:04]–[39:04]
"Customers don't pay you to write YAML."
— Will Stewart, [08:53]
“If our software goes down, our customers lose money. And that’s something we try and ingrain in all of our engineers...”
— Will Stewart, [29:39]
“There are a lot of cloud infrastructure [tools], there are so many tools now in the last 10 years but ironically it's if not more complex than it was 10 years ago to ship a container.”
— Will Stewart, [06:48]
“DevOps is going to be seen as a cost center, not a relief. And teams are going to have to adapt to that.”
— Will Stewart, [35:20]
Will Stewart provides an insightful, candid view into the evolution of cloud workload deployment, touching on both the technical and organizational challenges. Northflank positions itself as an enabling abstraction over the flaky, piecemeal world of open-source infra and custom internal platforms, focusing instead on developer experience, cloud agnosticism, and scalable operations. The episode is valuable for anyone interested in DevOps, platform engineering, cloud infrastructure, or SaaS business dynamics.