
Engineering teams often build microservices as their systems grow, but over time this can lead to a fragmented ecosystem with scattered data access patterns, duplicated business logic, and an uneven developer experience.
A
Are you passionate about software development and the tech industry? Software Engineering Daily is looking for a new podcast host to grow its hosting team. In this role, you'll help shape the show's editorial direction and interview engineers, founders, hackers and tech leaders. Podcasting experience is a plus, but not required. Curiosity, great communication skills and a genuine interest in the craft of building software are what matter most. If this sounds like you, reach out at editor@softwareengineeringdaily.com.
Engineering teams often build microservices as their systems grow, but over time this can lead to a fragmented ecosystem with scattered data access patterns, duplicated business logic, and an uneven developer experience. A unified data graph with a consistent execution layer helps address these challenges by centralizing schema, simplifying how teams compose functionality, and reducing operational overhead while preserving performance and reliability.
Viaduct is Airbnb's open source, data-oriented service mesh and GraphQL platform. Built around a single, highly connected central schema, it has played a major role in scaling Airbnb's engineering organization. Adam Miskovich is a principal software engineer at Airbnb who worked on Viaduct. He joins the podcast with Gregor Vand to talk about how Viaduct originated inside Airbnb, the architectural principles that shaped it, the challenges of scaling GraphQL to millions of queries per second, and why the team decided to open source the platform. They also discuss the future of backend development in an AI-driven world and how unified data layers may influence the next generation of engineering systems.
Gregor Vand is a security-focused technologist, having previously been a CTO across cybersecurity, cyber insurance and general software engineering companies. He is based in Singapore and can be found via his profile at vand.hk or on LinkedIn.
B
Hello and welcome to Software Engineering Daily. My guest today is Adam Miskovich.
C
Hey, how's it going? Nice to be here.
B
Yeah, great to have you here. Today we're going to be talking about Viaduct, which came out of Airbnb, so we're going to be understanding what happened there. But Adam, I'd love you to talk to us a bit about, first of all, your journey to Airbnb, and then where did Viaduct come from and how did that come about?
C
Yeah, absolutely. So I have been a software engineer for, gosh, it's pushing 20 years or something professionally these days, and I actually took a little bit of a non-traditional path to where I'm at at Airbnb, working in big tech. I have done a lot of work at a lot of small companies. I ran an interactive agency in Baltimore, Maryland for a while, building web and mobile apps and interactive installations for folks. I worked at a company called Expo, which some listeners might be familiar with, doing React Native tooling, and then eventually ended up at Airbnb. So I went from small company to big company instead of big company to small company, like I think a lot of folks do. So it's definitely been a learning experience working in big tech. At this point, I've been at Airbnb close to eight years. So I've seen it grow from a 500-person engineering organization to 3,000. And the company around us has grown a lot as well. So, yeah, it's definitely been an interesting journey.
B
That's kind of interesting. It's actually quite similar to myself. I was at an agency for a long time, which was by most standards a small company. And yeah, I'm now in a bigger company, but that's only 150 people. It still feels big to me.
C
I think what's cool about working at an interactive agency is you get exposed to a lot of different stuff, right? And it forces you to be a generalist. So, you know, I was actually hired at Airbnb originally as a front end engineer and had done a lot of front end development prior to Airbnb. And at this point, like I said, I've been there about eight years, and I haven't done front end development for about seven and a half. So that kind of takes us into the GraphQL story and Viaduct and all of those types of things. But yeah, it's been cool to work in big tech and bring that generalist experience into it.
B
Yeah, for sure. And I think as you call out, it will, as we'll hear, that will lend itself really well to kind of, I'm sure why Viaduct came about and what it is today. Because, yeah, the more problems you're exposed to, the more realization that you have to actually understand so many different business types and requirements and so on and so forth. And inside Big Tech, you're effectively working on different businesses, if you want to look at it that way. So, yeah, how did Viaduct come about? What's the story there?
C
Sure. So about the second day after I joined Airbnb, I was pulled into a working group. It was called the GraphQL Working Group. A bunch of engineers at Airbnb had been thinking about using GraphQL. We were in this interesting spot. We were starting to do microservices and move out of our Ruby on Rails monolith. The iOS, Android and web engineers especially really wanted strongly typed APIs. There were some experiments with GraphQL in the Ruby space, right? We had this thing that I think was open sourced at some point called graphist, which was this very opinionated framework in Ruby on Rails for building our API endpoints. It actually wasn't GraphQL, although you could imagine GraphQL layered on top of it.
So I was pulled into this working group, and there were probably 20 or so folks in there, from backend, from front end, whatever it might be, thinking about how we adopt GraphQL in this new microservice world that we were moving into. And there was already an opinion that was relatively pervasive, and then became a bit more pervasive, around how to structure our microservices at Airbnb. So you can imagine, we came from this big Ruby on Rails monolith that had millions of lines of code in it, and we started to carve out chunks like lots of folks do, right? We had a service for listings, some of our search components were pulled out, that type of thing. But we were pretty early at that point in building a bunch of services. The way we were thinking about structuring the microservice architecture, or SOA, as you'll probably hear me call it a bunch during this conversation, was to have presentation services, derived data services and data services. So data services at the bottom, pretty straightforward.
They essentially wrap databases. We use Thrift at Airbnb, so they provide a Thrift API over core data. You have the presentation services up top that, I would say, started as ports of controller logic from an MVC system inside of Ruby on Rails. And so they were very tightly scoped. They exposed just RESTful or RPC-over-JSON types of endpoints, right? And then you have the middle tier, which is all the random derived data services that pull data from X, Y and Z places and munge it together. So the GraphQL story was really focused at the beginning around this presentation service layer, and there was a lot of trepidation at first, because we had already started to build these presentation services and we had a couple of big ones at that point, right? We needed to figure out how to let people continue to build these things, because that's what we said we were going to do. But how do we put GraphQL on top of them? So our first general approach was very far from Viaduct: just convert the Thrift schema to GraphQL and stitch all of the presentation service schemas together into one GraphQL schema and one GraphQL endpoint. And we definitely weren't the only people to take that approach. And mind you, this is essentially seven years ago. Right?
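As an illustration of the automatic Thrift-to-GraphQL conversion described here, the following is a minimal Python sketch. It is not Airbnb's converter: the dictionary-based struct format, the function names, and the choice to expose 64-bit integers as `String` (GraphQL's built-in `Int` is 32-bit) are all assumptions made for the example.

```python
# Assumed scalar mapping for illustration only; a real converter would
# cover the full Thrift IDL (maps, sets, enums, optional fields, ...).
THRIFT_TO_GRAPHQL = {
    "string": "String",
    "bool": "Boolean",
    "i32": "Int",
    "i64": "String",  # GraphQL Int is 32-bit, so this sketch exposes i64 as String
    "double": "Float",
}


def thrift_type_to_graphql(thrift_type: str) -> str:
    """Map a Thrift type name to a GraphQL type name."""
    if thrift_type.startswith("list<") and thrift_type.endswith(">"):
        inner = thrift_type[len("list<"):-1]
        return f"[{thrift_type_to_graphql(inner)}]"
    # Anything that is not a known scalar is assumed to be another struct.
    return THRIFT_TO_GRAPHQL.get(thrift_type, thrift_type)


def struct_to_sdl(name: str, fields: dict[str, str]) -> str:
    """Render one struct (field name -> Thrift type) as a GraphQL type."""
    lines = [f"type {name} {{"]
    for field_name, thrift_type in fields.items():
        lines.append(f"  {field_name}: {thrift_type_to_graphql(thrift_type)}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical struct, not a real Airbnb definition.
listing_sdl = struct_to_sdl(
    "Listing",
    {"id": "i64", "title": "string", "photoUrls": "list<string>", "host": "User"},
)
print(listing_sdl)
```

Because the mapping is mechanical, service owners would not need to write any GraphQL by hand, which is the property the conversation returns to later.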
B
I was just going to say, I mean, because you say this was like day two working group.
C
It really was like it was day two. Yeah.
B
So I mean, this was around when quite a few companies, I guess were starting to, well, probably some had maybe done it beforehand, but this was quite a big time for bigger companies to say, hey, we actually think GraphQL is where we should be going with our APIs.
C
Right. Actually, let me set the stage a little bit. This is pre Apollo Federation, pre all of that stuff. And remember, GraphQL just had the 10-year anniversary of its open sourcing back in September. So this is a few years into GraphQL. There was no Apollo Federation. Apollo did exist, but it was early. They were trying to figure out what their business model was going to be, how to bring it to enterprises, that type of thing. The client space outside of Relay was still pretty nascent. GraphQL.js was mature-ish, but not used in a lot of places. And so everyone was trying to think about how to bring big companies and enterprises into GraphQL. And yeah, the schema stitching idea was pretty popular at that point. That was how you did it if you didn't want to have one monolithic GraphQL server. So that's the space that we were working in. Like I said, we were maybe not the first, but on the forefront of what folks were doing then. So we took these Thrift APIs, we converted them to GraphQL and we stitched them all together, and the schema we ended up with was what I like to call service-oriented GraphQL. It was not an entity graph in the way that we did it. It was literally service foo as a top-level field and then the endpoints for service foo underneath. And we always knew that we were going to transition away from that somehow. But what this gave us was the ability to get clients using GraphQL across iOS, Android and web, get that stack in there, figure out the codegen situation on the clients, all of those types of things.
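To make "service foo as a top-level field and then the endpoints underneath" concrete, here is a small Python sketch that stitches per-service endpoint lists into one service-oriented schema. The service names, endpoint names, return types and the `stitch_service_oriented` helper are hypothetical, and field arguments are omitted for brevity.

```python
def stitch_service_oriented(services: dict[str, dict[str, str]]) -> str:
    """Build one schema in which every service is a top-level Query field.

    `services` maps a service name to its endpoints (endpoint -> return type).
    """
    blocks = []
    query_fields = []
    for service_name, endpoints in services.items():
        type_name = service_name[0].upper() + service_name[1:] + "Service"
        fields = "\n".join(
            f"  {endpoint}: {return_type}"
            for endpoint, return_type in endpoints.items()
        )
        blocks.append(f"type {type_name} {{\n{fields}\n}}")
        query_fields.append(f"  {service_name}: {type_name}")
    blocks.append("type Query {\n" + "\n".join(query_fields) + "\n}")
    return "\n\n".join(blocks)


schema = stitch_service_oriented({
    "listings": {"getListing": "ListingResponse"},
    "trips": {"getCurrentTrip": "TripResponse"},
})
print(schema)
```

A client query against this shape mirrors the service layout (`listings { getListing { ... } }`) rather than navigating entities such as a user and that user's trips, which is the difference from an entity graph that the speaker draws.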
And we could do that pretty quickly. You can imagine, since these endpoints were the same kind of shape as our REST endpoints, writing client-side code against them was relatively straightforward. And then you could easily migrate from the old version to this new typed version.
B
Given this was largely all internal? As in, internal APIs?
C
Yeah, it's all internal APIs.
B
Yeah. I'm sort of sidebarring here, but how was that, I guess communicated, especially back then, like, hey, we're moving to GraphQL. That's quite a big shift for probably the number of services that we're talking about here. So were you the person having to deliver this news, so to speak, or how was that done?
C
Yeah, it was me and a few other folks that were working on this. I was kind of the main backend guy at this time. But to your point about going and telling people, the whole idea here was that the conversion from Thrift to GraphQL was automatic. So for the most part, the backend folks didn't really need to know. It was kind of interesting. We were protecting the backend people back then. It's very different now, I wouldn't say we do that, but back then it was like: client engineers want to do some crazy stuff, protect the backend people, let them focus on their microservice migration. And so that's what we did. We really made it as trivial as possible on the backend side. And then on the client side, people were much more eager to adopt the new tools, so it wasn't really a struggle to get folks to adopt them. They wanted it, especially on the iOS and Android side. But then web was a fast follow there.
B
Yeah.
C
And we had a couple champions in each kind of client platform that really wanted to see it succeed.
B
For sure.
C
That was the origins of GraphQL at Airbnb and how GraphQL itself really got its foothold. But then around, let's see, was it the summer of 2019? I believe so. So basically this new GraphQL stuff had been around in Airbnb for about a year. Spring and summer 2019, we hired a new CTO. He comes in and he is like, what's going on with our data at Airbnb? How are we doing this? And we had a fragmented data store situation. Offline data was crazy. We were having trouble with our data pipelines back then, and we were thinking about IPOing at that point. And so it's like, well, we've got to get our core data to make some sense, so that we can use it for financial reporting and all of the things that are required there.
So we spun up this working group. You'll notice a trend of working groups. We spun up this working group called the Data Architecture Working Group, brought in an outside consultant named Raymie Stata and really started to think about the whole stack end to end. Whether it was the online side, APIs, whatever, or the offline side, nothing was off limits to think about changing. We had about eight people in this group, a bunch of different disciplines from all over the company, and we sat in a room for what seemed like months at a time trying to figure out what we were going to do. A bunch of things came out of that. A bunch of improvements to the offline world: it's much, much better now. It took a lot of years, but it got there. And a bunch of changes on the online data store side. So born out of this Data Architecture Working Group was a rethink of our core online data system, which is called UDS, which stands for Unified Data Store. That's a big project that's been going on for a while, and at this point a lot of data has migrated to it.
And then on the API side, this idea of what became Viaduct, or what we call a data-oriented service mesh, but a more simplified way to say it is just a unified data access layer, really emerged as a key need to try to simplify how we build APIs and how we expose data to clients. So we shipped the very first version of Viaduct, which really wasn't anything super crazy. It was really just a GraphQL server that was separate from the service-oriented GraphQL. We figured out how to stitch it in, and we shipped the first version of it really early. I mean, a couple of months into the project, we shipped it. And I think it was for the Trips product at the time. So imagine you go to Airbnb and, in the upper right-hand corner of your home screen, view your current trip. That was probably the first Viaduct-powered feature back then. That and maybe wishlists. So, yeah, those were the origins.
B
Very cool.
Sidebar question, but since we're probably going to use the word a lot: I did grow up in a country that has a lot of viaducts, but maybe you could just explain, what is a viaduct? And why did that become the name?
C
Somebody will yell at me if I try to give some exact definition, but essentially it's a bridge.
B
A bridge where water, like, flows over it, basically.
C
No, that's an aqueduct.
B
Oh, there we go.
C
Wow. Yes. A viaduct is a bridge over a span of, you know, it could be over water, but usually it carries, you know, trains or cars or something like that.
B
Then I've learned something today. I called everything a viaduct. Okay.
C
Does tend to have arches, though, much like an aqueduct. So I don't know. Hopefully podcast listeners don't go crazy on me. Maybe there's some, like, technical definition of a viaduct. But anyway, the general idea, Right. The reason why we call it viaduct, we're kind of traditionally pretty bad at naming things. At Airbnb, we always put air in front of them. But with Viaduct, you know, it's really just. It's a connector, it's a bridge. It's trying to connect things together. And we had a lot of services, and that number of services grew tremendously over time. And. Yeah, that was the idea there.
B
Yeah. Okay, so let's get into it. I guess there was a sort of philosophy shift, and there's a great blog post, which we'll link to, that you wrote about all of this about two months ago. There are three guiding principles to that philosophy shift: a central schema, hosted business logic, and what's called the re-entrant API. And we'll obviously get into those. But you said in that blog post, "From the beginning, we've been encouraging teams to host their business logic directly in Viaduct. This runs counter to what many consider to be best practices in GraphQL." So was that from the beginning beginning, or is that something that's come through in time?
C
It's definitely from the beginning beginning. So, like I said, we had built the original GraphQL system very much integrated with our microservice architecture. But at that moment in Airbnb's engineering journey, there was a bit of microservice fatigue. And while we have a lot more tooling nowadays that helps with microservice development at Airbnb, in those early days it was tough to have quick iteration and to understand your dependencies. And also, we had this really opinionated framework for how you actually write business logic in the microservice world. So I think a lot of people found it really tough to be iterative. And so when we built Viaduct, the idea was, yeah, we'll figure out how to scale it. It's like, do things that don't scale.
B
Right.
C
We'll figure out how to scale it at some point, but for now, let's just write code in there, and we'll have some opinion on how to write the code so that it doesn't go completely insane. But at the end of the day, we wanted to build a platform. We had many early ideas of how it could be Airbnb's, to use a buzzword that a lot of people hate, serverless platform. And I'd say that decision has had its ups and downs. I think we have a lot of interesting things in the pipeline with some of the work that we've done with Viaduct Modern, which is what we open sourced, but it definitely ended up being a pretty core tenet from the beginning.
B
Yeah. I mean, just to kind of call out for anyone. Well, we've touched on Apollo already, but a global schema approach, that's actually unlike Apollo or like, GraphQL modules. So that's kind of the. The differentiator here, right?
C
That's right. I think there are some interesting similarities with Apollo and Apollo Federation. At the end of the day, Apollo talks a lot about, and has a lot of success with, this concept of a supergraph. And in their case, yes, it is a supergraph made up of subgraphs that are hosted in services. But Viaduct is not necessarily that much different when it comes to this general idea of having a unified graph and vending essentially one schema to your clients. In our case, our subgraphs are not independent services. They're what we call Viaduct tenants. So even though you're writing business logic in Viaduct, it's not a complete free-for-all. There is a rhyme and reason to how we organize the Viaduct monolithic system at the moment, which is that we have these things called tenant modules. Tenant modules have schema, they have code, there are opinions on how you write your schema, there are opinions on how you write the code. And they can essentially be packaged up. There's a little bit of nuance there. But if you look in the code base, they kind of look like little services, honestly. It's just that we host them in one larger platform.
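The idea of tenant modules that each bring schema and code into one hosted platform can be sketched as follows. This is a Python illustration under stated assumptions, not Viaduct's tenant API (Viaduct itself is written in Kotlin): `TenantModule`, `CentralSchema`, the tenant names and the one-owner-per-field rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TenantModule:
    """A hypothetical tenant: a named bundle of schema plus resolver code."""
    name: str
    schema: dict[str, list[str]]  # GraphQL type name -> fields this tenant adds
    resolvers: dict[tuple[str, str], Callable] = field(default_factory=dict)


class CentralSchema:
    """One shared graph assembled from every registered tenant module."""

    def __init__(self) -> None:
        self.owners: dict[tuple[str, str], str] = {}  # (type, field) -> tenant
        self.resolvers: dict[tuple[str, str], Callable] = {}

    def register(self, tenant: TenantModule) -> None:
        for type_name, field_names in tenant.schema.items():
            for field_name in field_names:
                key = (type_name, field_name)
                if key in self.owners:
                    # Assumed rule for this sketch: a field has exactly one owner.
                    raise ValueError(
                        f"{type_name}.{field_name} is already owned by {self.owners[key]}"
                    )
                self.owners[key] = tenant.name
        self.resolvers.update(tenant.resolvers)


central = CentralSchema()
central.register(TenantModule("identity", {"User": ["firstName", "lastName"]}))
# A second tenant extends the same User type with its own field.
central.register(TenantModule("trips", {"User": ["currentTrip"], "Trip": ["city"]}))
print(central.owners[("User", "currentTrip")])
```

The point of the sketch is the shape: clients see a single `User` type, while ownership of its fields stays with separate modules that "look like little services" inside one platform.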
B
Got it. And then this term, reentrancy. Could you talk to us about that? In the blog post you mention that logic hosted on Viaduct composes with other logic hosted on Viaduct by issuing GraphQL fragments and queries.
C
Yep. So this is one of the things that I do think is relatively unique to Viaduct. The whole idea is that as you build out this unified graph and you get more and more data into it, the less you need to go elsewhere to get some data, and the more you can use the data that you already have in the graph to build features. The canonical example I always give is very trivial, but I think it illustrates the point. Let's say you want to implement a field that returns the user's full name. Typically you store the first name and the last name separately. So you have your first name, you have your last name, and then you have the full name. You've got to combine those two things together. There's a bunch of ways you could do that. If you imagine that all this user data is stored in some separate service, you could query for the user up front, take the first name and last name, and then have some logic that combines those things together. But in Viaduct, what we encourage is that, for that full name field, you declare data dependencies from within the graph. So you're saying that for the full name field, I need the first name and I need the last name from the user entity. And Viaduct will know how to fetch those things, compute them if necessary, and give that data to the resolver. We've had that general idea since almost day one in Viaduct. The API has definitely shifted, but we had that idea from day one. It turns out it scales really well as the graph grows, and our graph is really big. I think we have 25,000 types and hundreds of thousands of fields or something like that. So it's huge. And we have the majority of Airbnb's online data exposed in this graph.
You can often find what you need without going to some service, especially if you're operating higher up in the layers, building presentational type of features. You can really write entire features and applications without ever making a direct service call and instead calling back into the graph using this reentrancy approach.
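The full-name example can be sketched in a few lines of Python. This is an illustrative toy, not Viaduct's resolver API: the `resolver` decorator, the `requires` parameter, the `resolve_field` function and the in-memory `USER_ROWS` table are all assumptions made for the example.

```python
from typing import Callable

# (type, field) -> (fields it depends on, resolver function)
RESOLVERS: dict[tuple[str, str], tuple[list[str], Callable[[dict], object]]] = {}

# Stand-in for a data service; in practice this would be a remote call.
USER_ROWS = {"u1": {"firstName": "Ada", "lastName": "Lovelace"}}


def resolver(type_name: str, field_name: str, requires: tuple[str, ...] = ()):
    """Register a resolver together with the graph fields it depends on."""
    def decorator(fn):
        RESOLVERS[(type_name, field_name)] = (list(requires), fn)
        return fn
    return decorator


def resolve_field(type_name: str, field_name: str, object_id: str):
    """Resolve a field, first resolving its declared dependencies via the graph."""
    requires, fn = RESOLVERS[(type_name, field_name)]
    deps = {dep: resolve_field(type_name, dep, object_id) for dep in requires}
    return fn({"id": object_id, **deps})


@resolver("User", "firstName")
def first_name(ctx):
    return USER_ROWS[ctx["id"]]["firstName"]


@resolver("User", "lastName")
def last_name(ctx):
    return USER_ROWS[ctx["id"]]["lastName"]


# fullName never touches USER_ROWS: it only declares what it needs from the graph.
@resolver("User", "fullName", requires=("firstName", "lastName"))
def full_name(ctx):
    return f"{ctx['firstName']} {ctx['lastName']}"


print(resolve_field("User", "fullName", "u1"))  # Ada Lovelace
```

If the storage behind `firstName` later moves to a different service, only that one resolver changes; `fullName` keeps working because it depends on the graph, not on the service.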
B
Awesome. That's a really good example. I like that. Very simple, but very effective, I think, for understanding what we're talking about here. You touched on the idea that there's a sort of modern version, and we're going to get there. Just before we do, maybe we could still talk about the general technical architecture, which again you touched on in the blog post, and the three parts to this: the tenant API, the execution engine and hosted application code. Could we just walk through those three, and how they come together to make Viaduct what it is?
C
So at its core, Viaduct is really an opinionated GraphQL server. It certainly grew from those origins. It was really just GraphQL Java from the very beginning, with stuff built on top of it. But we've taken a much more principled approach as we've started to rebuild pieces of it. So the modern way that Viaduct looks is those three layers, as you mentioned: an engine, this tenant API and runtime, and then the code itself. And the reason to structure it like that really comes down to this: we wanted an engine that was really lean, as performant as possible, that might be implemented in some other language or framework one day, and that can focus on the things that are core to the Viaduct execution model. That's high-performance execution of selection sets, which is a GraphQL concept. Our concurrency model, so how we are doing parallel fetches. Batching, which is very much tied into that, so figuring out how to avoid the N+1 problem. Maybe caching in certain cases, at the very least intra-request caching, and things like that. And then the tenant API and runtime sits on top of the engine, and it's what provides that strongly typed interface. For those that haven't looked at the open source project, Viaduct is built in Kotlin. We're a big JVM shop at Airbnb. We use a lot of Java, but also a lot of Kotlin. Everyone that writes code in Viaduct at Airbnb writes it in Kotlin. And when you're actually writing code inside of Viaduct, you want that strong typing, right? But the engine doesn't care so much about the strong typing. It's just schlepping data around. So that boundary between where typed data lives and where the engine can deal with more raw data, so to speak, actually ends up being quite important.
Because if you're going to build this multi-tenant system, you want to avoid sending typed information back through the entire system. If you want to, say, deploy a tenant individually from another tenant, you need that kind of boundary. And so we actually draw an analogy a lot. We have engine space and tenant space, kind of like kernel space and user space. Like I said, it's not a perfect analogy, but it keeps us grounded in the fact that we need to keep those things separate and we need to create a very strict boundary between those two layers of the system. And then you have the hosted code, which is what actually uses the tenant API and tenant runtime.
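One way to picture the engine-space and tenant-space boundary is the Python sketch below: the "engine" only ever moves plain dictionaries, while a thin "tenant runtime" adapter converts to and from typed objects at the edge. All names here (`engine_execute`, `to_raw_resolver`, `User`, `Greeting`) are hypothetical, and the real system is Kotlin with generated classes rather than Python dataclasses.

```python
from dataclasses import asdict, dataclass
from typing import Callable


# --- Engine space: only untyped data (plain dicts) crosses this layer. ---
def engine_execute(raw_resolver: Callable[[dict], dict], raw_parent: dict) -> dict:
    """Run a resolver without knowing anything about tenant types."""
    return raw_resolver(raw_parent)


# --- Tenant space: application code sees strongly typed values only. ---
@dataclass(frozen=True)
class User:
    first_name: str
    last_name: str


@dataclass(frozen=True)
class Greeting:
    text: str


def greet(user: User) -> Greeting:
    return Greeting(text=f"Hello, {user.first_name}!")


# --- Tenant runtime: the adapter that sits exactly on the boundary. ---
def to_raw_resolver(typed_fn: Callable, input_type: type) -> Callable[[dict], dict]:
    """Wrap typed tenant code so the engine can call it with raw data."""
    def raw(raw_parent: dict) -> dict:
        typed_result = typed_fn(input_type(**raw_parent))  # raw -> typed
        return asdict(typed_result)                        # typed -> raw
    return raw


result = engine_execute(
    to_raw_resolver(greet, User),
    {"first_name": "Ada", "last_name": "Lovelace"},
)
print(result)  # {'text': 'Hello, Ada!'}
```

Since `engine_execute` never imports `User` or `Greeting`, a tenant's types could change or ship separately without the engine being rebuilt, which is the property the kernel-space and user-space analogy is pointing at.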
B
So let's talk about the modernization journey, if you like. There's, I guess what's known now as Classic Viaduct and Modern Viaduct. Is that right?
C
Yeah, it's at least known internally.
B
Yeah, I think it was in the blog post as well.
C
But yeah, yeah, yeah, we've talked about it publicly, but I think to the outside world, Classic Viaduct doesn't mean a whole lot because they've never seen it.
B
Gotcha. There we go. Yeah, yeah.
C
But, yeah, actually most of Airbnb, as of the recording of this podcast, is still running on Classic Viaduct. We're rolling out Modern Viaduct as we speak, and we're going to continue rolling it out in earnest over the following year, in 2026. But most of what Airbnb is running right now is still the classic product, especially the classic API. Again, this is why that engine and tenant split is important, because we're actually running the modern engine but the classic API inside of Airbnb. And so there's this shim layer that we're building and that we maintain, with the eventual goal to push everyone to the modern API. So, anyway, why modernize? What are we doing here? So Viaduct, like I said, started out as a working group. We had a small team. That thing lived as a working group for a long time. We didn't have a real team, but it found a lot of organic adoption inside of Airbnb, which means.
B
It's successful, basically, because if it doesn't.
C
Exactly, it was successful. But I think it was also a victim of its own success. If you look at how the codebase evolved, it's one feature on top of another feature on top of another feature that we added because some customer came to us and was like, we need this. And we're like, okay, we'll help you. And I think in many ways it's fine. It got us to where we're at. But it definitely has some scaling problems, whether it's runtime performance or build time and developer loop performance. To be completely honest, there have been a lot of struggles, and we've worked around a lot of them. But what we realized when we started to think about Viaduct Modern a couple of years ago was that there are certain problems that are just insurmountable in the old architecture. Just to compare and contrast it with the architecture that we were just talking about, with that clear engine and tenant split: the old architecture had none of that. In fact, there was a lot of overlap. It was very unclear where the engine stopped and where the tenant API began. And we did that on purpose. I think it sounded like a good idea back then. What you can imagine it looked like is that we codegen'd domain models and then attached implementation to those generated domain models, and it worked really well. It actually had some pretty ergonomic properties: just go override this method in this class, and then you've implemented your thing. But like I said, with scaling that, and figuring out how we offer what we're really trying to offer with Modern, which is more tenant isolation and more tenant autonomy while still retaining the leverage that we have working in this opinionated, centralized platform,
We just kind of realized that it's just we were not going to be able to get that with the classic system. And one other thing to mention that I think will sound very familiar to anybody who has evolved a very large software project. It also became hard for the platform team itself to work on it, and that slows down velocity, et cetera.
B
It's like, I mean, this is a maybe slightly nuanced example, but given you've worked in an agency, the last thing an agency ends up updating is its own website, basically.
C
That's exactly right. Yep. Absolutely.
B
So it's that kind of problem where unless there is a team that has been put together, dedicated to this tooling, which is, I guess, how you could maybe look at it, in some ways, that tooling just kind of only gets updated. As you, I think, alluded to, sometimes it's a customer that is. There's sort of money on the table that says, like, we need this thing. So a lot of people go, oh, well, then now we can put time on this thing, or bandwidth is made available. But it sounds like, as with so many companies and projects, it was going to be challenging for that to happen. So that's kind of leading into the modernization and skipping ahead a little bit here, the open sourcing as well. But yeah, so I'll let you go back to the story.
C
Yeah, sure, sure. All very good points. Hindsight's 20/20, and we could have done some of this stuff sooner, but I think it ended up, like I said, kind of a victim of its own success. And with that, we spent a lot of time on reliability.
B
Right.
C
Like keeping it alive. Because at a certain point, it became a very, very critical dependency for Airbnb. At this point, something like 80% of all of our API traffic runs through Viaduct. So it can't get much more critical than that. And so the reliability aspect of Viaduct really took the front seat while some of the developer experience pieces took a backseat, much to the chagrin, I think, of a lot of developers at Airbnb. Just as a shout out to them: hey, I see you. I think it's been tricky. However, we've kept it alive. And I would say over the last couple of years, we've been able to start to balance improving some things about developer experience while also keeping it alive, and building this modern API, modern runtime and engine to really set us up for a much, much better future. Because it really ended up being in a situation where Viaduct is a little too big to fail at Airbnb, and at the end of the day, people have to use it. That's how it is at this moment. I think this is how a lot of big company tech evolves. Even if it's not the perfect solution, it is the solution that you have. And so I see my job, our team's job, our organization's job as: well, at this point, we just want to make it awesome, because I think it can be really awesome. And I think that's the direction Viaduct Modern is pushing us in. I think that the benefits that Viaduct provides and has provided to Airbnb are real, and they're ones that we want to continue to capitalize on and not go back to "just create whatever service you want to create and figure out how to string it all together." I firmly believe that there is a better world than that, and I think Viaduct is that world.
B
Awesome. Yeah, I believe that blog post does touch on real-world Airbnb performance uplifts, so I do encourage anyone to go and take a read. Just looking at time, I want to make sure we talk about what Modern Viaduct is, and then I also want to hear about the open-sourcing journey. We've had a couple of other companies on in the last six months who have gone from closed source to open source. This isn't quite the same, because it's, I guess, a new framework rather than an updated one. But looking at the modern architecture, I believe a lot of it is about strong abstraction boundaries between what we've touched on: the engine, the tenant API, and the application code. Could you speak to us a bit about that? I believe it's going from a dynamic engine API to a statically typed tenant API with Kotlin classes. That sounds pretty important and, I'm imagining, hopefully cleans up a lot of the problems that built up with Classic.
C
So, yeah, we kind of touched on it earlier. Those strict boundaries I was talking about really end up becoming critical if you want to optimize performance and also provide a simpler mental model to people writing code inside the system. And when I say people writing code, I mean both platform developers and tenant developers, right? At the end of the day, if you build a leaky abstraction, the folks who aren't on the platform team end up having to understand that abstraction far too deeply, and that creates a ton of confusion.
B
That's a great article, by the way: leaky abstractions. Yeah, it was funny. I was re-listening to a very old Software Engineering Daily episode and the article was brought up, so I went back and reread it. We're talking years and years ago.
C
One thing I've learned in my career is how deeply important good abstractions are and how hard they are to come up with. And so, not to belabor the evolution point, but I think sometimes you have to go through a bunch of different iterations of a thing, even a large thing, in order to figure out what the right abstraction is. Then you can think about some of these things from first principles. So anyway, with Modern, the main benefit we're trying to give folks is a much simpler mental model. Everything is a resolver. We have this flowchart that we've used internally and in some external presentations. If you think about what building stuff looked like in the Classic API, it's this big flowchart of: you choose this thing, then this thing, then this thing, then this thing. Ultimately, what we wanted was for everything to be a resolver. We build the reentrancy capability into that resolver concept, and that's how you write your code. We can always provide other, smaller abstractions for, let's say, fetching data from a service. We can always have utilities to make those types of things easier, simpler, with less boilerplate. But the core concept of the system is very, very simple: everything's a resolver. The other thing that's pretty core to Modern is a fundamental concept that we call async memoization. The insight is that when you're building a GraphQL server, just because of the way GraphQL execution works, which is kind of this depth-first, parallel way of traversing and executing the query, and then you layer this reentrancy concept on top of it, you can end up doing a lot of duplicate work. And so take that first name, last name, full name concept I was mentioning before.
Well, if you imagine the user type, you've got first name, last name, and full name. Full name depends on first name and last name. There might be some other field in that type that also depends on first name or also depends on last name. Why would you want to execute that resolver again? So if you're within the same entity, executing this node in the graph, and you execute that same field with the same arguments again within the same request, you're not going to execute the actual resolver again. That's a pretty fundamental concept, and I've never really seen another GraphQL server do that. And it turns out it has a lot of performance wins. That's what we really notice, again, scaling Viaduct, scaling GraphQL. We have a massive schema; I already said that before. We have massive queries. I mean, we have queries that query a hundred thousand fields or 300,000 fields in one single query. They're returning megabytes of information. Now, you could say, don't do that. Well, easier said than done, right? Sometimes, especially as the platform team, we're looking at these edge cases and at how we scale the platform to support them, versus simply telling people no, because sometimes they do actually have very legitimate use cases. So a lot of our work around performance is looking at how you really scale GraphQL execution to that size, to that level of complexity. And it ends up being a pretty non-trivial graph problem. There's some computer science-y stuff in there, which is pretty interesting. So Viaduct Modern aims to solve those things in a bunch of different ways. And like I said, it makes it easier to optimize and gives us that stable kernel of the engine while we continue to evolve the API on top.
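A minimal sketch of the async memoization idea described above, in Python for brevity (Viaduct itself runs on the JVM). It assumes a per-request cache keyed by entity, field, and arguments that stores the in-flight task rather than the finished value, so concurrent callers share one execution. `RequestScope`, `resolve`, and the resolver functions are hypothetical names for illustration, not Viaduct's API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Hashable

# Hypothetical resolver signature: (scope, node_id) -> awaitable value.
Resolver = Callable[["RequestScope", str], Awaitable[Any]]

USERS = {"u1": {"first": "Ada", "last": "Lovelace"}}


class RequestScope:
    """Per-request memo of field resolutions, keyed by (node, field, args)."""

    def __init__(self, resolvers: Dict[str, Resolver]) -> None:
        self._resolvers = resolvers
        self._tasks = {}   # (node_id, field, args) -> asyncio.Task
        self.calls = {}    # field -> number of times its resolver body ran

    def resolve(self, node_id: str, field: str, args: Hashable = ()):
        key = (node_id, field, args)
        task = self._tasks.get(key)
        if task is None:
            # Cache the task itself so a second caller awaits the same
            # in-flight work instead of starting the resolver again.
            task = asyncio.create_task(self._run(node_id, field))
            self._tasks[key] = task
        return task

    async def _run(self, node_id: str, field: str) -> Any:
        self.calls[field] = self.calls.get(field, 0) + 1
        return await self._resolvers[field](self, node_id)


async def first_name(scope: RequestScope, node_id: str) -> str:
    await asyncio.sleep(0)  # stand-in for a call to a backing service
    return USERS[node_id]["first"]


async def last_name(scope: RequestScope, node_id: str) -> str:
    await asyncio.sleep(0)
    return USERS[node_id]["last"]


async def full_name(scope: RequestScope, node_id: str) -> str:
    # Reentrant: this resolver asks the scope for two other fields.
    first, last = await asyncio.gather(
        scope.resolve(node_id, "firstName"), scope.resolve(node_id, "lastName")
    )
    return f"{first} {last}"


async def initials(scope: RequestScope, node_id: str) -> str:
    first, last = await asyncio.gather(
        scope.resolve(node_id, "firstName"), scope.resolve(node_id, "lastName")
    )
    return first[0] + last[0]


RESOLVERS = {
    "firstName": first_name,
    "lastName": last_name,
    "fullName": full_name,
    "initials": initials,
}


async def run_query():
    scope = RequestScope(RESOLVERS)  # one scope per request
    values = await asyncio.gather(
        scope.resolve("u1", "firstName"),
        scope.resolve("u1", "fullName"),
        scope.resolve("u1", "initials"),
    )
    return values, scope.calls
```

Running `asyncio.run(run_query())` resolves `firstName` three times logically (once directly, once for `fullName`, once for `initials`), but the resolver body runs only once per field. Because the cache lives on the request scope, nothing is shared between requests.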
And then, like you mentioned, the strong typing. We had strong typing before, but it was done in a different way. I kind of alluded to it: you override the domain models themselves. In this case, you don't do that. You're just given value classes that contain the data that you need. We can generate a lot fewer of those than we were generating before, which helps us with the build time problem, that type of thing. And again, a lot of the problems I'm talking about only happen when you scale to the size we're talking about at Airbnb. There are a few other companies similarly sized to Airbnb that have similar problems to us; I know from talking to them. But most APIs or GraphQL servers, or really any API server, just don't have these types of problems. That's definitely been a learning, and it's one of the reasons why a lot of work has gone into building this system.
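To make the "value classes that contain the data that you need" idea concrete, here is a small sketch in Python (Viaduct itself uses generated Kotlin classes). It assumes a resolver declares which fields it depends on and receives an immutable object holding only those fields, instead of subclassing a full domain model. `FullNameInput`, `build_input`, and `resolve_full_name` are hypothetical names, not Viaduct's generated types.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FullNameInput:
    """Immutable value object with only the fields the resolver asked for."""
    first_name: str
    last_name: str


def build_input(record: dict) -> FullNameInput:
    # Stand-in for what an engine would do: copy just the declared
    # fields out of the underlying record and ignore everything else.
    return FullNameInput(
        first_name=record["firstName"],
        last_name=record["lastName"],
    )


def resolve_full_name(obj: FullNameInput) -> str:
    # The resolver is a plain function over typed data; it cannot reach
    # fields it did not declare, and it cannot mutate its input.
    return f"{obj.first_name} {obj.last_name}"


record = {"firstName": "Ada", "lastName": "Lovelace", "email": "ada@example.com"}
full_name = resolve_full_name(build_input(record))  # "Ada Lovelace"
```

One design consequence of this shape is that the typed surface a resolver sees is small and explicit, which is in line with the point above about generating fewer classes.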
B
Yeah, exactly. Just to overemphasize it, the massive scale that this is designed to handle is such a huge piece of it. Classic has been battle-tested through Airbnb, and then obviously there are some learnings there that are flowing through to Modern. So if people are thinking, "I have a massive GraphQL footprint (or problem)," maybe Viaduct is something they should be looking at. And that leads quite nicely into the open sourcing of this. Was that a given when you were thinking about Modern, or how did that come about?
C
It was pretty early in the Modern story. Our CTO Ari has been really supportive of open sourcing stuff at Airbnb. We have a lot of open source projects at Airbnb, some definitely non-trivial ones, and I think he was always pretty passionate about sharing our work with the world. And it's not just the typical corporate open source thing of helping our tech brand and stuff like that. It's also an accountability mechanism, almost. It's a way to share your work with the world, get validation on those ideas, and be able to talk about them openly: to share deeply what we are doing, why we're doing it, and how we operate at the scale we operate at. That was the way he pitched it to me and others early on. And so, yeah, we were pretty much focused on open sourcing it from the get-go. We separated out the Viaduct Modern stuff in our monorepo to make sure that we were not taking on internal dependencies and things of that nature. So we did have a bit of flexibility there, and I think it would have been a bit trickier if we had open sourced Classic directly. That being said, it's definitely been non-trivial to break Viaduct out of the Airbnb bubble. And the real value I think Airbnb has gotten from it, from a technical perspective, is that it has strengthened the abstractions we are talking about here, because we can't be lazy if we're going to open source this thing, which of course we have at this point. We couldn't just sit back and depend on other things, or not think about how to support both potential open source users and Airbnb.
B
And if I'm understanding correctly, Classic and Modern are different. How is that working internally, shifting things across? Or is it just that as new services come online, they take Modern rather than Classic?
C
Yeah, what we're starting to do is, for new use cases, we're having people start to use the Modern API. Like I mentioned, we very clearly split the Modern engine from the API. So inside Airbnb we are running the Modern engine and, like I said, shimming the old API on top of it. But yeah, right now it's very much: as new use cases come online, use Modern. We'll never force our engineers to rewrite their code to Modern. Instead, we will likely use AI tools and things like that. There's a whole diatribe I could go on about how we're doing AI-based migrations at Airbnb, and I won't go on that big diatribe, but I think that is going to be a pretty critical way of actually getting rid of Classic in the codebase over the next couple of years.
B
Got it.
C
And just to give listeners a little more concreteness on the scale: I won't say exact numbers, but we're talking multiple millions of lines of code in the Viaduct codebase, as tenant modules. And we're talking a bit over a million GraphQL operations per second served by the system. So it's definitely pretty hefty scale.
B
Yeah. Before we look at future stuff: the DevEx of getting started with Viaduct Modern. Anything to call out there? Is it all pretty straightforward from the repo?
C
Yeah, it's all pretty straightforward. The thing to call out is that what you see in open source right now is not exactly the developer experience we have around it internally. If you imagine embedding it in some server, you're not just going to get Airbnb Viaduct scale for free. There is a lot of infrastructure work that we have not yet open sourced. We hopefully will open source more of it over time. But there's glue, is what I'm getting at, that makes Viaduct operate at Airbnb scale. That being said, what is there is truly the core of Viaduct and what we run internally. So it's kind of what you see is what you get in that case.
B
Awesome. Well, yeah, that's just github.com/airbnb/viaduct, so obviously anyone curious or wanting to give it a spin can head over there. As we look ahead, we have managed to get pretty far in an episode without once saying AI, so this is a change. But it would be good to hear your thoughts around GraphQL and AI. Maybe it relates to Viaduct, or maybe it's just about where GraphQL is going from here. What are your thoughts there?
C
Yeah, these are all kind of early thoughts, I suppose, but I'll give some opinions. With the way AI is going, and I'm an avid user of all the new fancy agentic tools, every day, to code, I think it's pretty clear that software engineering is going to change a lot. It's already changed a lot, but it's going to change a lot, a lot, in the next, say, three to five years even. And it's not going to go away. There are still going to be engineers, and we're still going to code. We might not write a lot of code by hand, but we're still going to be responsible for code. And in that world, I think that having really strong patterns for how to build stuff is going to be really, really important, especially in enterprise scenarios: large companies with more than, say, 500 or a thousand engineers working on a thing. I mean, anyone can go vibe code some backend and spin something up and have it do something, right? And don't get me wrong, that can actually take you really, really far. I'm a big proponent of staying in the monolith. Don't do anything crazy. Just put everything on one server or whatever and scale your startup that way. But once you get to a certain size, and again, it's not just about QPS or something like that, it's really this concept of programming in the large. You have to work with a lot of people and you have to build a lot of things that all have to work together. That's really where a system like Viaduct, or I suppose other systems that are like Viaduct, can play a major role. So I see a pretty clear future for simplifying architectures in large companies. Services don't die, but maybe there's a considerable collapse in how many services there actually are.
We figure out how to scale these things, to scale the number of people working on a single service, a lot better than we've done in the past. And I guess not just scale people, but agents. We teach AI how to help us with our operations. We're doing some self-healing types of things, making sure that AI understands our deployments, the observability systems, all those types of components. So I think that's why this move toward building these really stable managed platforms is actually going to benefit a lot of folks in this AI world. Because our agents just want to write code. They don't want to set up all the infrastructure, and that's the type of thing that's not going away. I forget who I was listening to on Software Engineering Daily. It might have been, maybe, I can't remember. Well, anyway, whoever it was, they made a great point.
B
You're a listener, which is always great. So.
C
Oh, it was the Ona folks, who used to be Gitpod, right? They're building autonomous agents. And it's funny, one of my other lives at Airbnb, when I took a break from GraphQL for a little bit, was building our internal remote development product. So the stuff that Gitpod, and now Ona, has worked on is near and dear to my heart. But they made a great point, which is that you spin up agents and they write a bunch of code, but at the end of the day you still have to deploy the code, you still have to observe the code, and you still have to do something with the code when the service dies and breaks and you have an incident. And while AI can help with all of those things, managing that entire software development lifecycle is not going to be something that a single AI agent is able to do for a long time. I mean, I won't say never, but it's going to take a long time for us to get there. And so you need these managed platforms as critical pieces of infrastructure in large companies. And if you're a small company, you'd be using the Vercels, the Fly.ios, and whatnot to build your backends on top of. You just need it, because I don't think you can do it without it.
B
Honestly, I think that's a really good point. I think at the moment we're probably still relying, as we should be, on a lot of human in the loop. And taking APIs as an example, if whichever model you're working with is able to grok the API, and I'm not saying Grok has to be that service, I just mean literally grok, well, then fantastic. But there probably has to be a more systematic way for these models to understand how and where they should go to fetch data, and to fetch it safely, and so on and so forth.
C
Exactly, yes. And so that brings me to the GraphQL point. There have been some folks saying GraphQL is too complicated and nobody needs it except for the absolute largest companies. And I think there's some truth to a lot of what the critics say about GraphQL. However, maybe you don't like the GraphQL protocol or whatever, but consider the general idea: you have a strongly typed, data-oriented schema that represents all of your core business data, and you expose it through a query language or protocol that both humans and machines know how to query easily, and it's flexible. The flexibility is really the important piece here, because everyone could say, well, you can just do that with RPC or REST or whatever. But it's the flexibility GraphQL gives you to build queries that represent exactly what you want, when you want it. I don't know, I'm biased obviously, but I think there could be a bit of a resurgence in GraphQL. And as somebody who maintains a GraphQL-oriented platform, I think the easier we can make it to write scalable backends using that technology, the more it'll benefit folks in the AI world. And to your point about accessing data, think about backend engineers, especially those working at large companies with complicated systems. Are you going to teach your AI about the nuances of every single service and every single API that every single service in your company vends and responds with? Or could you teach it about a unified graph of all of your data, where all the relationships are already encoded? That seems pretty powerful, for sure.
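The flexibility point above, asking one connected graph for exactly the fields you want, can be illustrated with a toy sketch in Python. This is not GraphQL itself and not Viaduct: `project`, `GRAPH`, and `QUERY` are made-up names, and the "query" is just a nested dict standing in for a GraphQL selection set.

```python
from typing import Any, Dict, Optional

# A selection maps a field name to None (leaf) or a nested selection.
Selection = Dict[str, Optional[dict]]

# Toy in-memory "graph" with a user and their related listings.
GRAPH = {
    "user": {
        "id": "u1",
        "firstName": "Ada",
        "lastName": "Lovelace",
        "email": "ada@example.com",
        "listings": [
            {"id": "l1", "title": "Loft", "city": "London", "price": 120},
            {"id": "l2", "title": "Cabin", "city": "Oslo", "price": 90},
        ],
    }
}


def project(node: Any, selection: Selection) -> Any:
    """Return only the selected fields, following nested selections."""
    if isinstance(node, list):
        # Apply the same selection to every item of a list field.
        return [project(item, selection) for item in node]
    result = {}
    for field, sub in selection.items():
        value = node[field]
        result[field] = value if sub is None else project(value, sub)
    return result


# Roughly: { user { firstName listings { title } } }
QUERY = {"user": {"firstName": None, "listings": {"title": None}}}

response = project(GRAPH, QUERY)
# {"user": {"firstName": "Ada", "listings": [{"title": "Loft"}, {"title": "Cabin"}]}}
```

The caller, whether a person or an agent, only needs to know the shape of the graph and its relationships; a different selection over the same graph yields a different response without a new endpoint.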
B
No, I think that's a really good point. I mean, it's not like we haven't seen technology resurgences of late. Postgres seems like the obvious one that sort of kept getting overlooked for a good decade, I would say. I happened to work with it for a long time.
C
Yeah, I'm a big Postgres fan as well.
B
Yeah. But then MongoDB came along, NoSQL, et cetera. But I think that's a great place to leave it. Highlighting where all this fits into the landscape today is great. And as we called out, if you want to check out the Viaduct repo, that's github.com/airbnb/viaduct. Otherwise, are you on X, Twitter, or anywhere you talk to developers? Where can people find you?
C
Yeah, I'm on X at skevy. I'm pretty much everywhere on the Internet as skevy, so if you see a skevy, it's probably me. And yeah, please, if folks find even the idea of Viaduct interesting, even if maybe you can't use it directly for whatever reason, because it's a Java thing and you work in a TypeScript shop or whatever, Viaduct also has a Discord now. I think it's linked in the README. If you come to Discord, or you start a GitHub issue or discussion, we'd love to talk to you about scaling a big data-oriented service mesh at your company.
B
Awesome. Well Adam, thank you so much for coming on. Learned a ton today. Maybe in a couple of years we'll be talking again and let's see how the AI side of this has all played out as well.
C
Yeah, absolutely. Hope so. It'll be a very interesting next couple of years, that's for sure.
B
For sure. All right, thank you so much again.
C
Thanks a lot, Gregor.
Podcast: Software Engineering Daily
Host: Gregor Vand
Guest: Adam Miskiewicz (Principal Software Engineer at Airbnb)
Date: February 5, 2026
This episode explores Viaduct, Airbnb’s open-source, large-scale GraphQL platform. Adam Miskiewicz discusses its journey from concept to critical internal infrastructure, the architectural principles behind its development, their transition from "classic" to "modern" Viaduct, and the technical and organizational lessons from scaling GraphQL at Airbnb. The conversation also touches on open-sourcing the framework and the evolving landscape of backend engineering in an AI-driven world.
Early Days: Moving from Monolith to Microservices ([05:20])
Internal Communications and Migration ([11:21])
What’s a Viaduct? ([17:01])
Philosophy and Key Principles ([18:00])
Unified Schema vs. Federation ([20:16])
Classic Viaduct ([28:20])
Modern Viaduct ([35:45])
Operational Scale ([45:05])
DevEx for Viaduct Modern ([45:41])
How AI Will Change Backend Development ([47:01])
GraphQL’s Place in an AI World ([51:49])
The conversation was technical yet accessible, blending organizational storytelling with practical engineering insights. Adam’s candid reflections on scaling pains, platform evolution, and the future of backend engineering offer value for both enterprise engineers and tech architects considering GraphQL or unified data platforms at scale.
Summary prepared for Software Engineering Daily listeners and those interested in open-source infrastructure, large-scale GraphQL, and the future of backend systems in an AI-driven era.