
Narrator
The modern Internet is a vast web of independent networks bound together by billions of routing decisions made every second. It's an architecture so reliable, we mostly take it for granted. But behind the scenes, it represents one of humanity's greatest engineering achievements. Today's Internet is also dramatically more complex and capable than in its early years. Eric Seidel is a network engineer at Cloudflare, where he focuses on automating global network infrastructure. He joins the show to discuss his unique journey into tech, the fundamentals of how the Internet works, the Border Gateway Protocol, peering versus transit, Cloudflare's architecture, networking in China, and much more. Gregor Vand is a security-focused technologist, having previously been a CTO across cybersecurity, cyber insurance and general software engineering companies. He is based in Singapore and can be found via his profile at Vand HK or on LinkedIn.
Gregor Vand
Hello and welcome to Software Engineering Daily. My guest today is Eric Seidel. So welcome Eric.
Eric Seidel
Hello. Nice to be here.
Gregor Vand
Yeah, great to have you here, Eric. We met very briefly a couple of years ago, back in Singapore. I'm still in Singapore, you're now in Austin. You were then and still are working for Cloudflare as a network engineer. And we're going to be hearing all about how Cloudflare works as well as just kind of general principles of how the Internet works today, which is going to be very interesting. But as per usual, we'd love to just hear a little bit about your background leading up to Cloudflare. I gather you have spent a lot of time in Asia, but also China specifically, which is obviously very interesting from an Internet perspective. So yeah, can you just give us a brief history?
Eric Seidel
I think, like a lot of people in the tech industry, and I've found I'm not unique in this, I have what you might call a non-traditional, non-standard background, a very varied background. I came into the tech industry a little later. It's not my first go at tech or understanding tech or working with tech. I mean, I've been doing tech-related jobs since I was a student worker back in university in the late 90s, early 2000s, and then I kind of drifted away from computing and the Internet altogether and towards a lot of language learning. I was a classics student, Latin, Greek. I mean, I was training to be a classics teacher at one point, actually. And then I ended up going to China for a while, teaching English there and learning the Chinese language. I spent a while there learning Chinese because I just really enjoyed the whole thing of learning languages. And then after a few years in China, I kind of drifted back towards tech, and in March 2020, less than a year after I came back to America, I found myself at Cloudflare, and I've been working at Cloudflare ever since. I first started as a customer support engineer at Cloudflare, very much networking focused, helping customers with their networking-related issues and how to use Cloudflare's networking-related services, in particular our Magic Transit product. And then from there I kind of migrated into engineering as a network engineer. And since then I've actually become a systems engineer on the networking team, where I focus more full time on just automating networks, or automating our network, I should say.
Gregor Vand
I guess through that time in China, were you sort of keeping up with tech or doing any kind of network related things?
Eric Seidel
That's how I developed my interest in networking, actually. I started doing a lot of networking-related things in China. I ended up building up my own ASN in China and running it with public IP addresses and stuff while I was there, things like that. That's really where I got into networking to begin with.
Gregor Vand
Yeah. Because I think most of the audience are probably aware, but the great Chinese firewall. So just networking in China, when it comes especially to anything that needs to leave the country or come into the country from a network perspective is just a whole different kind of ball game, right?
Eric Seidel
Yeah, I think we might be getting to this later in the podcast. But yeah, networking in China, mainland China, I should say, works very differently from the rest of the world. And that very much influences how we deliver our products in China versus RoW, rest of world, as we say.
Gregor Vand
Got it. Okay. Yeah. And as you say. Yeah, we will get to that in more detail later on. Awesome. So let's start kind of with the fundamentals. So we might term it sort of just Internet fundamentals and sort of where Cloudflare fits into all this. So yeah, where do we start with this?
Eric Seidel
I mean, at the most basic level, the Internet is just a big collection of networks. We've got tons of different networks that make it up. I made a previous reference to an ASN, an autonomous system number. The Internet is divided into what are called autonomous systems, which are kind of freestanding, independent networks in their own right, each identified by a unique number called an ASN. And they all connect to each other, they all peer with each other, and altogether, in aggregate, that makes up the Internet. Now, once you get a little lower, what you have is different tiers of networks. At the very top you have what are called tier one providers. These are very big networks. Telia is an example. GTT is an example. Nippon Telegraph and Telephone, NTT, in Japan is a big example. And they are what we call transit provider networks. And though they might have a certain focus on certain regions, like NTT might be more APAC focused, GTT I think might be more EMEA focused, Europe focused, that area, and Cogent might be more North America focused initially, they run big networks that span the globe. And they're called tier ones because they are networks that can reach the entire Internet without having to pay any money. They basically engage in settlement-free peering with each other. It's to their mutual benefit. Like, yeah, we all connect to each other and exchange our customers' data with each other. And by connecting to each other and exchanging our customers' data with each other, that's the Internet. And in addition you have all sorts of other networks. You have smaller regional ISPs that connect to one or more tier one transit providers. They might have downstream customers of their own too. Their customers connect to them, and they then connect to the transit providers. Then you have tech companies. Cloudflare is an example of one. Google and AWS are examples of others. We're big content provider networks. We connect to tier one networks too, for transit. And in addition we do lots of interconnections with all sorts of other networks on the Internet to broaden our connection base and reduce latency, to just reduce the general cost and increase the efficiency of delivery to customers.
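The hierarchy described in this answer can be sketched in a few lines of Python. This is only an illustrative model, not how any real network tracks its relationships: the `AutonomousSystem` class, the network names, and the ASNs (taken from the range reserved for documentation) are all assumptions, and "has no transit provider" is used as a simplified stand-in for the tier one definition.

```python
from dataclasses import dataclass, field


@dataclass
class AutonomousSystem:
    asn: int
    name: str
    # ASNs this network pays for transit (hypothetical data)
    providers: set[int] = field(default_factory=set)
    # ASNs this network peers with settlement-free (hypothetical data)
    peers: set[int] = field(default_factory=set)


def is_transit_free(network: AutonomousSystem) -> bool:
    """Simplified tier one test: the network pays nobody for transit."""
    return not network.providers


networks = {
    64496: AutonomousSystem(64496, "global-carrier-a", peers={64497}),
    64497: AutonomousSystem(64497, "global-carrier-b", peers={64496}),
    64500: AutonomousSystem(64500, "regional-isp", providers={64496}, peers={64501}),
    64501: AutonomousSystem(64501, "content-network", providers={64496, 64497}, peers={64500}),
    64502: AutonomousSystem(64502, "office-network", providers={64500}),
}

tier_one = sorted(n.asn for n in networks.values() if is_transit_free(n))
print(tier_one)
```

In this toy data only the two global carriers are transit-free; the regional ISP, the content network, and the office network all pay at least one upstream.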
Gregor Vand
And so we've talked about tier one. And then, I guess, if we look at edge routers, edge routing, I believe they use the BGP routing protocol. Could you maybe just explain what that even is, and how does that sort of then fit into things?
Eric Seidel
So I find the best way to explain BGP, like with a lot of things, is to start with the problem that it solves. You mentioned edge routers. We have all these networks, and these networks have the concept of an edge and a core. The core of the network is all their own network, right deep in the bowels of their network. If you're an AT&T customer in America, like I am, I'm deep in the bowels of their network. My local Internet connection connects to something deep in their core. The edge, also called the border, is where the network is actually connecting to other networks, where traffic is leaving and entering their network from other networks. And the routers that do that, we call them edge routers. Now, when we have these networks connecting, when their edge routers are connecting to routers on other networks, they have to basically provide a roadmap to the other routers. We all know the Internet is made up of IP addresses and IP prefixes. Take the most basic example, 192.168.1.0/24. That's of course a private IP network, but it works. Public IP networks are numbered and work the same way in that respect. Take your public IP address, or my public AT&T IP address. It's part of an AT&T prefix. Now, how do other networks know that I'm on AT&T? How do they know, yeah, I have to send my traffic to AT&T? And not only to AT&T, but maybe it's best to send it to this AT&T router in this region. How do they know that? That's where BGP comes in. BGP stands for Border Gateway Protocol. It runs on these edge routers. And what you basically do is you establish BGP between these edge routers, AT&T edge routers, Cloudflare edge routers, whatever they might be, if they have a direct connection to each other. If the AT&T router has, say, a direct connection to a Cloudflare edge router, like a direct point-to-point fiber optic Ethernet circuit, then these two networks will run a BGP session over that point-to-point connection and they will exchange their prefixes. In the Cloudflare case, we tell the AT&T router: these are our prefixes that you can reach here. And the AT&T router will say: these are all of our prefixes, and you can reach these prefixes through me, through that same session. And that kind of provides the roadmap, and it's done for all of these point-to-point connections. We generally have one session per connection, with the exception of IXPs, where we have multiple sessions. Each point-to-point, it's one BGP session. And that's what BGP is basically doing: it's building the roadmap of the Internet. It's telling everyone where the IP addresses are.
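The exchange Eric describes can be sketched roughly as follows. This is a toy model under stated assumptions, not real BGP: the `EdgeRouter` class, `establish_session`, and the router names are hypothetical, the prefixes come from the ranges reserved for documentation, and real BGP carries path attributes and applies policy that are left out here. The lookup uses longest-prefix match, which is how a more specific announcement can steer traffic toward a particular router in a particular region.

```python
import ipaddress


class EdgeRouter:
    """Toy edge router: announces its own prefixes and records what neighbors announce."""

    def __init__(self, name, own_prefixes):
        self.name = name
        self.own_prefixes = [ipaddress.ip_network(p) for p in own_prefixes]
        self.routes = {}  # learned prefix -> name of the neighbor that announced it

    def receive(self, neighbor_name, prefixes):
        for prefix in prefixes:
            self.routes[prefix] = neighbor_name

    def next_hop(self, address):
        """Longest-prefix match over learned routes; None if nothing covers the address."""
        addr = ipaddress.ip_address(address)
        matches = [p for p in self.routes if addr in p]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda p: p.prefixlen)]


def establish_session(a, b):
    """Each side tells the other which prefixes can be reached through it."""
    a.receive(b.name, b.own_prefixes)
    b.receive(a.name, a.own_prefixes)


isp = EdgeRouter("isp-edge", ["198.51.100.0/24"])
cdn = EdgeRouter("cdn-edge", ["203.0.113.0/24"])
establish_session(isp, cdn)

# A second, regional ISP router announces a more specific slice of the same prefix.
isp_west = EdgeRouter("isp-edge-west", ["198.51.100.128/25"])
establish_session(isp_west, cdn)

print(cdn.next_hop("198.51.100.7"))    # covered only by the /24
print(cdn.next_hop("198.51.100.200"))  # covered by both; the /25 is more specific
print(cdn.next_hop("192.0.2.1"))       # nobody announced this prefix
```

The last lookup returns `None` because no session announced a covering prefix, which is the gap that a transit provider's full table would fill.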
Gregor Vand
Got it. And this might be quite a basic question for some of our listeners, but not for me. Anyway, how often do these change? Because if this protocol is all about exchanging the roadmap, how often do they change? Do they change?
Eric Seidel
You have to look at it in terms of individual sessions versus aggregate. Individual sessions, maybe not much. All in all, in aggregate, it's changing all the time. Look at the Internet as a whole. The IPv4 Internet right now has, I think, over a million unique prefixes. The IPv6 Internet has, I think, about 200,000 unique prefixes. That's changing all the time in aggregate. And it also can depend how far up the chain you are. If you're a small network, like a small office network, maybe, and you're a customer of AT&T, and you've got a BGP session with AT&T, or not even AT&T, just your regional ISP, your session with your regional ISP might not change that much. It might be very stable. Once you get to your regional ISP, the sessions with each of their customers will be very stable. But some of their customers may be changing, which means their sessions with their upstreams might be a little less stable, a little more dynamic. Because every time one of their customers makes a change, it gets propagated to their upstream, and their upstream sessions see a change. Once you get to the tier ones, like Telia, and AT&T is a tier one too, Telia, GTT, NTT, all these, they're having a lot of changes. It's very dynamic, because anytime not just one of their customers, but a customer of a customer of a customer of a customer makes a change, that'll get propagated up to one of their sessions. So they've got sessions where, yeah, there are lots of changes. It can be very dynamic. So it really depends where you are on the Internet. The closer you get to the tier ones, the heart of the Internet, if you will, what we call the default-free zone, the DFZ, the more dynamic it gets.
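The "more dynamic the further up you go" point is just counting, and a small sketch makes it concrete. The topology and names below are hypothetical, and the model assumes every customer change is passed upstream unchanged, which real networks soften with aggregation and filtering.

```python
from collections import Counter

# child network -> the upstream it announces to (hypothetical topology)
UPSTREAM = {
    "office-1": "regional-isp-a",
    "office-2": "regional-isp-a",
    "office-3": "regional-isp-b",
    "regional-isp-a": "tier-one",
    "regional-isp-b": "tier-one",
}


def updates_seen(changes):
    """Count how many route changes each upstream network hears about."""
    seen = Counter()
    for origin in changes:
        node = origin
        # Walk up the chain: every network above the origin sees this change.
        while node in UPSTREAM:
            node = UPSTREAM[node]
            seen[node] += 1
    return seen


seen = updates_seen(["office-1", "office-2", "office-3", "office-1"])
for network, count in sorted(seen.items()):
    print(network, count)
```

Each office only ever sees its own changes, each regional ISP sees the changes of its own customers, and the tier one at the top sees all of them.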
Gregor Vand
So many questions I could be asking, but I know we're always going to run out of time if I ask everything. But one sidebar question here: at tier one, AT&T sounds like the main US player. Does that mean that they're just sitting on a massive asset that at some point someone else could come and buy? Could another tier one be created in some way?
Eric Seidel
The basic standard of tier one, at least the way I understand it, and I think that's the standard way, is that a tier one is a network that can reach anywhere on the Internet without having to pay another network.
Gregor Vand
Okay, that's a good definition.
Eric Seidel
Yeah, that's basically what it is. And AT&T, from what I understand, if they're still tier one, and I understand they are, can reach any other part of the Internet without paying. And the reason they can do that is because they have such a large network, such a large customer base. And, you know, I'm not really the finances person or the business decisions person, but it's reached a point where, yeah, other networks have all made the business decision that this is a network that it's worth doing settlement-free peering with.
Gregor Vand
Okay, well, that kind of actually leads us pretty nicely then into peering versus transit. And you did touch on it briefly earlier, but yeah, let's talk about that. And obviously we're going to get into what that means from a basic financial standpoint as well, I guess.
Eric Seidel
So we have our edge routers. They're connected with lots of other networks over fiber optic Ethernet circuits and those BGP sessions I mentioned. A lot of them are peers. We do settlement-free peering with them. What that means is we send them our prefixes, and they send us their prefixes and maybe the prefixes of their direct customers, which means we can use that connection to reach their network and other networks that are their customers. Here's what we cannot do through such a session. What if there's some other network out there that's not their customer at all, it's a customer of some other network? We can't reach it through them. They're only going to provide us with reachability to their network and their customer networks. Say we've got some other network way out there, and we're not directly connected to them, and that other network is not a customer of any of our peer networks. How do we reach them? That's where transit comes in. Basically, when you're buying a transit service, you are buying a connection to the entire Internet. You pay them and they give you what we call a full table, or a full view of the Internet. They won't just send you maybe a thousand prefixes of just their network and their customers. They will send you a million IPv4 prefixes containing every other network on the Internet. They will send you all 200,000 IPv6 prefixes of the whole Internet. And they have given you that full view of the Internet, so you can reach any other network on the Internet through them, if you so choose.
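The difference between what a peering session and a transit session hand you can be sketched as a reachability check. This is an illustration under assumptions: the session names and the `reachable_via` helper are hypothetical, the prefixes are documentation ranges, and three prefixes stand in for a full table that in reality runs to roughly a million IPv4 entries.

```python
import ipaddress

# Prefixes learned on each session (hypothetical data).
sessions = {
    # A peer sends only its own prefixes and those of its direct customers.
    "peer-x": ["198.51.100.0/24", "203.0.113.0/25"],
    # A transit provider sends a full table; three prefixes stand in for it here.
    "transit-y": ["198.51.100.0/24", "203.0.113.0/25", "192.0.2.0/24"],
}


def reachable_via(sessions, address):
    """Return the names of the sessions that announced a prefix covering the address."""
    addr = ipaddress.ip_address(address)
    return [
        name
        for name, prefixes in sessions.items()
        if any(addr in ipaddress.ip_network(p) for p in prefixes)
    ]


print(reachable_via(sessions, "198.51.100.5"))  # the peer's own network
print(reachable_via(sessions, "192.0.2.9"))     # some third network, not the peer's customer
```

The first address is reachable over either session, so the free peering link can carry it. The second is only reachable over the paid transit session, which is exactly the case transit exists for.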
Gregor Vand
In Cloudflare's context, this is sort of table stakes, because for Cloudflare to provide, and maybe you could do a very brief history of Cloudflare in a second, the product provided initially and then the massive suite of products now, Cloudflare has to assume it can know where to go across the entire Internet to be able to provide its services effectively.
Eric Seidel
Yeah, we have to make sure, for all of our data centers, the basic requirement is, yeah, we have to have a fully functional connection to the Internet that has a full view of the Internet.
Gregor Vand
Yeah.
Gregor Vand
So let's move on to the global scale, the architecture of Cloudflare. I mean, I definitely didn't prime you to be a history expert on Cloudflare today, but maybe you could just from any basic information, where did Cloudflare start? Where is it kind of now maybe sort of from a product perspective or just product suite, I guess. And then we can talk more about how that's actually then manifested at a technical level through architecture.
Eric Seidel
Sure. I'll focus on the part of Cloudflare I know best, based not only on my time on the networking team, but previously in customer support. The aspect of Cloudflare I'm most familiar with is the security aspect. We provide a lot of security services, and I guess you could say the classic way we do it, which is kind of the OG way of doing it, Cloudflare-wise, is we provide the CDN edge, where basically we sit in front of our customers' origin servers, like their web servers. Our edge servers are in between, and their customers will not directly connect to their web servers. They'll connect to our edge servers, and then we might provide services like caching, like we'll cache some of that for them. We provide a lot of security services for them to make sure, yeah, if there's a DDoS attack, not only do we strive to absorb it, we strive to filter it as well: as much bad traffic as possible, drop it; as much good traffic as possible, make sure the customer gets it, or make sure it's served by cache or whatever means. That's the OG. Like I said, from there I worked a lot in customer support on a product called Magic Transit, which is kind of expanding that philosophy to layer 3. Basically, networks with their own ASNs, like I mentioned, can put themselves behind our metals, where we use a PNI or GRE tunnels or some other means to connect from our edge servers to their edge routers, and we sit in front of them. Before any of their layer 3 traffic reaches them, it goes through our metals first, and our metals filter it and then send it to them. That's another product, and that's kind of where I came into Cloudflare. So, at least from the perspective I see in Cloudflare, we are very much a security-oriented company providing security services to our customers, in addition to CDN services and even edge compute services as well. And speaking of edge compute, in addition to our Workers product, we're moving more and more into the AI realm as well.
Gregor Vand
Yeah. So let's talk about then the sort of pure infrastructure that's required to drive this. I believe just a couple of numbers off the top of the charts if you like. I mean this was what two years ago at least was over 300 global points of presence. And I think it'd be interesting to understand what those are. In over 100 countries, 44 million HTTP requests a second on average. These are just sort of numbers to help the audience, maybe just get a vague sense of scale here. And again, this is two years ago, so it's only probably gone up since then.
Eric Seidel
We've gotten bigger. I'm sorry, I can't provide you with the exact numbers.
Gregor Vand
No, no, it's fine.
Eric Seidel
We're getting bigger. It gets bigger all the time.
Gregor Vand
Yeah. So let's talk about point of presence. What does that even kind of mean and what are they handling?
Eric Seidel
So that goes back to that edge networking concept. We have these networks, and they want to connect to each other, right? Where do they connect to each other? The point of presence. It's like a data center someplace where, yeah, we've got routers there, we've got infrastructure there, the other network has routers there, they have infrastructure there, and we can connect to each other there. And in the case of our points of presence, it's not just edge routers. We'll have our fleet, we'll have some of our servers there as well. The thing about our servers, the way we've got them configured, is we generally like to have them run the whole Cloudflare stack. Basically, any of our Cloudflare servers should be able to handle any of our customers' needs and services.
Gregor Vand
Got it. So how does this then flow through to data center design, I guess, if you like? I believe there's this sort of concept of traditional versus multi-colo PoP, or MCP, which is a different MCP to what maybe a lot of our audience are used to hearing about. So yeah, let's talk about those.
Eric Seidel
Yeah, so the way I like to conceptualize it, to understand how we approach it, is you kind of get an idea of the Internet topology itself. It's just like human topology. You've got areas where population density is very low, relatively low, like rural areas. And then you have areas, more urban, where population density is very high. Internet interconnectivity on the edge is kind of like that as well. You have areas, and Austin is an example, even though it's a big urban area, where there's not as much interconnectivity happening. Not many networks connect to each other, peer with each other, in Austin. Where a lot of the peering, a lot of the Internet connectivity happens, closest to me, is DFW, the Dallas-Fort Worth area. There you have lots of networks coming together at lots of data centers in that area, all connecting to each other and peering with each other. Now, the Internet almost works the same as the city metaphor. If you've got all these networks connecting to each other, guess what? You've got lots of companies, including us, including all sorts of other companies, that want to park their servers and infrastructure there. They want to park their infrastructure close to where the connections are happening. Because the closer you are to where you're connecting to other networks, the quicker it is to get your data onto the customer's network and to the customer. If you're in some area where you're far away from where interconnectivity is happening, then you end up at the core of your provider's network. You have to go through your provider's network before you reach the customer's network. Whereas at the edge, yeah, it's easier to just immediately start going to the customer's network. And that's why data centers and points of presence will follow those contours. So, DFW: lots of data centers, lots of Internet connectivity. We have a big point of presence there with lots of servers. One of our biggest, IAD, Ashburn in Northern Virginia, that's another huge area where there's just tons of network interconnectivity, which leads to tons of data centers, tons of infrastructure, everyone parked where all the networks are meeting each other to get to the customers quicker.
Gregor Vand
So I mean, I seem to remember this from a talk you gave a couple of years ago. This was a bit of an eye-opening moment for me, I think, to understand why we were in Singapore at that point in time, and I still am. Singapore is a listed region, for example, when you go to AWS or GCP. And I think I was always wondering why all the regions are effectively kind of the same across different products and across different providers. They can't always offer all the regions, if you want to call it that, but a lot of them do still follow. Southeast one is Singapore, and Southeast one is the same in GCP, AWS, et cetera. And this is kind of, I guess, what you're explaining here, which is that it's pretty much the same building, just where all the providers kind of have their infrastructure.
Eric Seidel
Or if not, same building close enough that you can get metropolitan network connectivity in the form of dark fiber or waves connections relatively inexpensively.
Gregor Vand
Yeah. And it does still require agreements between everyone to say I'll connect to you, you connect to me. And that kind of all helps.
Eric Seidel
Yeah, it's still based on that basic system and it comes together like where this happens. It's most efficient if there are certain points on a map where it all happens. Yeah.
Gregor Vand
I guess if we then bring back in the edge router aspect to this, and just so I'm getting this clear, edge routers are not the opposite of a data center, but they are the bits that do sit outside of these sort of mass transit areas. Is that correct?
Eric Seidel
Well, I mean they're the ones that are actually doing the connectivity. So we try to get them close to the most convenient places to connect them to as many other networks as possible.
Gregor Vand
Okay. And just in terms of redundancy, we did actually have an episode a couple of months ago talking about subsea cables, and how they get cut or damaged and what can cut or damage them. And we were focusing a bit more on how they actually get repaired. We had someone on from one of the big news outlets who'd gone and been on one of these boats and seen what work is needed to actually repair these fiber optic cables. But with either those or any other kind of problems, I guess, because we're still talking about a lot of cabling for most of this, how do you factor in redundancy, and how do you even think about that?
Eric Seidel
Yeah, I mean, you pointed out a tricky question. Subsea cables can be tricky because they carry a ton of capacity. There are only so many of them, and when they get knocked out, it can sometimes take a long time to fix them because they're under the ocean. You have to go out there with boats and things like that to fix them. So we do run a global backbone network. And the thing about it is it does span the globe, it circumscribes the entire globe, basically. We try to make sure we have a lot of diversity. We have backbone links going through multiple different subsea cable networks. And then we have lots of transit providers as well. If we lose some backbone capacity because of a subsea cable problem, we can move to transit. Where the trickiness comes in is that sometimes, with some of these bigger subsea cables, not only does it impact us, not only does it take out some of our capacity, it takes out some of the capacity of our transit providers, because they're running over the same subsea cable. Other big networks as well, content provider networks and such where some of our customers are based, get hit by the same thing. Because subsea cables have so many different networks running through them, it can be really tricky, and it does at times lead to real capacity issues and difficulties that impact the whole Internet, impact lots of networks, and really are felt until the cables are fully repaired.
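The shared-risk problem in this answer, where a backbone link and its supposed transit fallback ride the same subsea cable, can be sketched as a simple capacity tally. Everything here is hypothetical: the link names, the cable names, and the capacity figures are made up for illustration and say nothing about any real network.

```python
# (link name, kind, capacity in Gbps, subsea cable it rides on) -- all hypothetical
LINKS = [
    ("backbone-1", "backbone", 400, "cable-north"),
    ("backbone-2", "backbone", 400, "cable-south"),
    ("transit-a", "transit", 200, "cable-north"),
    ("transit-b", "transit", 200, "cable-east"),
]


def surviving_capacity(links, cut_cable):
    """Total capacity per link kind that does not ride on the cut cable."""
    totals = {}
    for _name, kind, gbps, cable in links:
        if cable != cut_cable:
            totals[kind] = totals.get(kind, 0) + gbps
    return totals


print(surviving_capacity(LINKS, "cable-north"))
print(surviving_capacity(LINKS, "cable-east"))
```

Cutting `cable-north` halves both the backbone and the transit capacity at once, because `transit-a` shares that cable with `backbone-1`. Cutting `cable-east` only touches transit. Diversity on paper (two backbone links, two transit providers) only helps when the underlying cables differ.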
Gregor Vand
So let's move on to a sort of, I guess, mini case study just around China specifically. We touched on this at the beginning, that you had already spent a lot of time there. I don't think you ever worked for Cloudflare in China, but obviously there's the knowledge that you can bring to the table in that respect. So how does the China network part of Cloudflare work? Because it has to be quite different.
Eric Seidel
So again, starts with a problem. The way that networking normally works here, the way we handle it is it's not something that really knows national boundaries. By that I mean we operate a global network with like not just our edge colors, but our backbone network. When I talk about our backbone network, like the big connections we have between our own data centers, our own network, we operate this big global backbone network and we're not the only ones. Like all those transit providers, those tier ones I mentioned, they all operate those big global networks. And like other content providers like aws, Google, et cetera, they also operate like big global globe spanning networks. And when we manage these networks and when we move traffic through them, our general approaches, unless there's special exception, and those except special perceptions are few and far between, we're not thinking about national boundaries. What we've got is a whole system with lots of different paths. We can choose to send our data. Like, okay, we've got a path, we've got data center A and data center D, right? And we have a path via data center B or data center C to go from data center A to D. But say the path via data center B or point of presence B, whatever you want to call it, you know, starts to congest, then we might shed some traffic, move it to the path via data center C. If you notice we're not really thinking about oh, what country is it in, like what countries are passing through now. We're just thinking like, yeah, we've got this globe spanning network, what's the best path? Like it's all treated basically equally without concern for national boundaries. Unless there's like some special case about how data has to be treated. China's a different story. Mainland China, I should say Hong Kong itself still is very much. It's connected to the rest of the world network where we've got data centers there, we've got our backbone running through there. 
And same with other networks, but mainland China is one. You've got three major networks there. You've got China Telecom, China Unicom, China Mobile, and if you want to get in and out of China, unless like it's a special case, you're going through their networks, like you're not running your own network in there, you're not running your own backbone network there, you're not transiting your own traffic through there, anything like that. Like you're going through like those big three networks. And that in itself creates a special case because again, we've moved away from just operating our own globe spanning network that grows into China. No, we're like stopping and saying, okay, at this point we kind of have to go into like the Chinese provider networks. And that doesn't just apply for us, of course, that applies generally. And then of course you do have the firewall of China, the great firewall of China I think they call it. That involves a lot of packet inspection. Now the thing you have to understand about our edge routers is at a basic level, they're kind of dumb, they're not very smart thing, like they're not very complicated things. They basically learn each other's prefixes, then they install it onto a hardware route table, all hardware ASICs. And basically their only job is like they're forwarders. They get an IP packet, they see the destination IP address, or in the case of, in the backbone network when we've already got it encapsulated in mpls, they see the MPLS label. And based on that IP address or MPLS label, they say, okay, I need to send it to this next hop, I need to send it to this next hop router. And that's all it's doing, right? There's no real packet inspection there, it's just kind of mindlessly forwarding traffic to wherever it's told, programmed to forward the traffic. But when you've got something like, you know, a fight like the great firewall of China, you're adding a lot of overhead to that. 
All of a sudden, at or near the Chinese edge, it's more than just mindlessly forwarding packets to wherever without caring what's in them. No, you're starting to look at what's inside the packets, what protocols they're using, maybe what DNS name the packet was originally destined for, things like that. And that in itself adds so much more overhead. Whereas we can do more with less, because we're not doing the filtering and inspecting, they're doing less with more, because they need so much infrastructure just to do that. And the result of that together, always having to go through those big three networks plus the filtering and inspection they do at the edge, leads to a situation where, if you're in China, and I've experienced this for myself, connecting to the outside Internet, the rest of the world, is not the smoothest experience. When I'm in America and I connect to stuff in Europe, I might feel a little bit of latency, but generally it's okay. When you're in China going through that, even when the Great Firewall of China is not blocking it, even when they're saying, this is fine, we don't want to filter that, even then there can be a lot of loss, a lot of connectivity issues because of overloads and things like that. It can be generally not the most enjoyable experience. Now, we've got lots of customers who are global companies. They use Cloudflare to serve their customers all over the world, and Cloudflare works great for them, right? But then the exception is, well, we've got these customers in China. Our stuff, the Cloudflare edge, is outside of China. Our Chinese users are having a bad experience having to send their traffic outside of China to connect to us.
And the answer we give them is, okay, we're going to put Cloudflare infrastructure in China. Now, when you put Cloudflare infrastructure in China, the way we do it, and I think it's the fairly standard way, is you partner with local providers, local networks. We have our local partner, and we basically host our infrastructure on their network. And I should add at this point: I'm on the network engineering team now, and I don't really do anything with China Network, because we do not manage the network. We've got Cloudflare servers and stuff there, and our SRE team, for example, will work with our partner's SREs to solve any problems with the Cloudflare stack. But when it comes to the actual networking, that's not ours. That is our partner's network, and our partner's networking team handles it for us. And they have the China-specific experience to know how to handle that well.
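The "dumb forwarder" behavior described in this answer, where a router looks only at the destination IP address or the MPLS label and picks a next hop, can be sketched in a few lines of Python. This is a toy illustration and not Cloudflare's implementation: the `Forwarder` class, the router names, and the documentation-range prefixes are all hypothetical, and a real router does this lookup in hardware ASICs rather than in software. The sketch assumes the usual longest-prefix-match rule for IP lookups.

```python
import ipaddress


class Forwarder:
    """Toy forwarding table: destination prefix or MPLS label -> next hop."""

    def __init__(self):
        self.routes = {}  # ip_network -> next-hop name (hypothetical names)
        self.labels = {}  # MPLS label (int) -> next-hop name

    def learn_prefix(self, prefix, next_hop):
        """Install a prefix learned from a peer into the route table."""
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def learn_label(self, label, next_hop):
        """Install an MPLS label binding used on the backbone."""
        self.labels[label] = next_hop

    def forward_ip(self, dst):
        """Return the next hop for the most specific prefix covering dst."""
        addr = ipaddress.ip_address(dst)
        matches = [
            net for net in self.routes
            if net.version == addr.version and addr in net
        ]
        if not matches:
            return None  # no route: the packet would be dropped
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]

    def forward_label(self, label):
        """Return the next hop for an MPLS label, or None if unknown."""
        return self.labels.get(label)


fib = Forwarder()
fib.learn_prefix("0.0.0.0/0", "transit")
fib.learn_prefix("203.0.113.0/24", "router-b")
fib.learn_prefix("203.0.113.128/25", "router-c")
fib.learn_label(1001, "router-d")

print(fib.forward_ip("203.0.113.200"))  # most specific match is the /25
print(fib.forward_label(1001))
```

Note that nothing in the lookup reads the packet payload, which is the contrast being drawn with inspection-heavy filtering.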
Gregor Vand
And just to clarify, to my understanding, anything that runs on this partnership model is really just an enterprise product. This is not something that your average developer would have any interaction with.
Eric Seidel
Yeah, we do offer our free tier services. You can use Cloudflare as a free user, and we welcome that; we really get a lot from our free users. They're a really big part of our network, but not China Network. China Network is an enterprise product. You basically have to buy it and sign a contract for it. So yeah, we're dealing with enterprise people who can afford enterprise services.
Gregor Vand
Yeah, exactly, where the need is absolutely there. Likewise, I've had a lot of experience with this, probably over 10 years ago at this point. As you say, it's not even just the Great Firewall of China, it's the latency involved with the ins and the outs, which was just painful to deal with. So anyway, I think that's been a great explanation of what goes on there. And I think it's really great that throughout this whole episode you're always framing it as, well, what's the problem, before we even talk about what we do. Here's a problem that most developers are at least aware of, which is DDoS. I'm sure most developers have either experienced it as "I can't access a website," or literally had it happen to their own service, or at the very least read about it as the reason something went down. Cloudflare, quite frankly, handles so many of these attacks these days. I was almost analogizing it in my head: Cloudflare is a bit of a government service at this point, in the sense that if Cloudflare went away, I think we would have a much worse Internet, at least in the Western world, if you want to call it that. So let's talk about how Cloudflare mitigates DDoS. I believe we're going to talk about Anycast. I remember you mentioning Anycast in the presentation, so I'd love to go back there.
Eric Seidel
Sure. So first, let's talk a little more about what a DDoS attack looks like right now. DDoS attacks are normally launched by what are called botnets. Botnets are basically malware. A lot of the malware we run into is viruses that, to a greater or lesser extent, take over your computer, or maybe not even your computer: it can be your mobile phone, or even something as dumb as a smart fridge. If it's connected to the Internet and it has a CPU in it, there's a chance it can become part of a botnet. So basically the malware implants itself in your computer or whatever device it might be, and it'll usually try to do it without you even being aware of it, because it doesn't want you removing it, of course. The malicious software maintains a connection with a command and control. Then whoever owns that botnet, so to speak, can use that command and control to give orders to, I think, up to hundreds of thousands of devices. It might be closer to millions now; I'm not sure what the exact numbers are anymore, but last I heard it was in the hundreds of thousands of compromised devices spread throughout the whole world. Each of those devices can use whatever Internet connection it has to send whatever it can at a target: send garbage, some attack traffic, to this IP address. And all these compromised systems, over 100,000 of them, send a bunch of attack traffic to the same IP address at the same time. In aggregate, that can be overwhelming. You can be talking multiple terabits per second. I think I've seen over 100 terabits.
I think, yeah, maybe 100 terabits per second or more of traffic, just going at a certain IP address. Now this is where Anycast comes in. Traditionally, with unicast, it's one IP address for one computer, right? There's only one computer in the world that has that IP address. Anycast is different. The idea behind Anycast is that we've got thousands of servers, in the case of Cloudflare's fleet well over 10,000, spread throughout the world across hundreds of data centers, that each have that IP address, in data centers that are each advertising that same prefix to all of our peers. When you have over 10,000 servers spread out through the entire world behind hundreds of edge routers, each with dozens, maybe hundreds of peer networks, all advertising that prefix and that IP address out, it means that when this botnet attacks, its strength, its distributed nature, almost becomes a weakness. Because of the way the Internet works, it's all spread out. It's not aggregated, it's not hierarchical. It's just best path, whatever the best path is. All of these hundreds of thousands of devices will be taking different paths, ending up at different Cloudflare data centers depending on where they are, and ending up on different Cloudflare metals. So instead of one data center getting 100 terabits per second of traffic, or not even that, because it overloads, what you get is hundreds of data centers each getting maybe under a terabit a second of traffic. Maybe not quite that even; that's the ideal. Some bigger data centers might be getting a few terabits and some might be getting significantly less, and then lots of metals are just getting on the order of gigabits per second of traffic.
So with Anycast we're basically taking advantage of the architecture of the Internet, just the topology of the Internet, to naturally disaggregate and unfocus that attack. It spreads out into smaller, bite-sized chunks that we can absorb. And that's key, because as an anti-DDoS network, if we want to be able to filter our customers' traffic and make sure they get all the good traffic we can give them and as little of the bad traffic as we can, we ourselves need to be able to absorb that attack. If we can't absorb it, if our network is getting overloaded, then we can't filter it for our customer, because we're already losing it. But we've gotten to the point where, well, I've done network edge on call, where whenever we get congested we get a page and we have to fix it. I've been on call before when we've had 100 terabit per second attacks and I never got a page. I didn't even know it happened, because we were able to absorb it without any congestion events at any data center. There have been other times where we see just one link, one PNI, congest, and it's like, well, what's going on here? And I'll put that aside, because sometimes you need to back out. When something looks weird, I say, okay, let's look at the forest rather than the trees. I look at our global DDoS metrics, and I'm like, oh wait, this is a big attack. Okay, we're absorbing it just fine everywhere except this one link somewhere. And that's the only reason I knew it happened at all.
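The dispersal effect described above can be illustrated with a small back-of-the-envelope model in Python. It is a sketch under loud assumptions: the region names, the data center codes, the bot counts, and the per-bot rates are invented for illustration, and the region-to-data-center mapping is a stand-in for what BGP best-path selection does in reality (which follows routing policy, not simple geography).

```python
from collections import Counter

# Hypothetical stand-in for BGP best path: each source region lands on
# one anycast data center. Real catchments depend on routing, not geography.
CATCHMENT = {
    "us-east": "IAD",
    "us-west": "SJC",
    "europe": "FRA",
    "asia": "SIN",
    "south-america": "GRU",
}


def anycast_load(bots, catchment):
    """Sum attack traffic per data center when every site announces the prefix.

    bots is a list of (region, mbps) pairs, one per compromised device.
    """
    load = Counter()
    for region, mbps in bots:
        load[catchment[region]] += mbps
    return dict(load)


def unicast_load(bots, target):
    """With unicast, all attack traffic converges on the single target site."""
    return {target: sum(mbps for _, mbps in bots)}


# 1,000 bots per region, each sending 10 Mbps (made-up numbers).
bots = [(region, 10) for region in CATCHMENT for _ in range(1000)]

print(unicast_load(bots, "IAD"))      # everything lands on one site
print(anycast_load(bots, CATCHMENT))  # the same attack, split across sites
```

In this toy model the total traffic is identical in both cases; only the worst-hit site changes, which is the property that lets each location absorb its own share before filtering.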
Gregor Vand
Yeah, that's fascinating. Just to read some of that back, I'm almost thinking water is quite a good analogy here. Say you pour water in the top of a container, and this container has an outlet at the end of it. The outlet is where the water's trying to get to; that's the site somebody is trying to target. The problem is they've got all these rivers in the container, and a lot of them flow through what is effectively the Cloudflare architecture. So the water is dispersed across all these different rivers before it even has a chance of getting to the end point. Let's talk about the DDoS filtering pipeline, just briefly: what happens during that process? The data is naturally hitting this one address, effectively, from the Anycast principle, but it's then filtering through and being filtered. Is it the same filtering process that goes on across all the data, or are there any differences, I guess?
Eric Seidel
I mean, the other side of Anycast is that, in order for it to work, you basically have all of our, say, 10,000-plus servers with that IP address assigned to them, handling requests destined for that IP address. They basically have to be configured identically. They have to be running the same services. Say www.acme.com is our customer: it has to be served from all of those servers, and it has to be the same. You don't want an experience where, depending on what server you hit, you get a different website. You definitely don't want that, unless the customer themselves does something special to configure it, and some customers will create region-specific things like that. But basically the idea is, yeah, we need a system that can propagate all of this out globally, keep all of those servers in sync, and make sure they're serving the same thing.
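One simple way to picture the "all servers must serve the same thing" requirement is a fingerprint comparison: hash the desired configuration and flag any server whose local copy hashes differently. The Python sketch below is only an illustration of that idea. The function names, the server names, and the shape of the configuration are hypothetical, and the transcript does not describe how Cloudflare actually propagates or verifies configuration.

```python
import hashlib
import json


def config_fingerprint(config):
    """Hash a config dict in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def out_of_sync(server_configs, desired):
    """Return the sorted names of servers whose config differs from desired."""
    want = config_fingerprint(desired)
    return sorted(
        name
        for name, config in server_configs.items()
        if config_fingerprint(config) != want
    )


# Hypothetical desired state for one customer hostname.
desired = {"hostname": "www.acme.com", "origin": "192.0.2.10", "cache": True}

# Hypothetical fleet snapshot: metal-3 still has a stale origin.
fleet = {
    "metal-1": {"hostname": "www.acme.com", "origin": "192.0.2.10", "cache": True},
    "metal-2": {"cache": True, "origin": "192.0.2.10", "hostname": "www.acme.com"},
    "metal-3": {"hostname": "www.acme.com", "origin": "192.0.2.99", "cache": True},
}

print(out_of_sync(fleet, desired))
```

Sorting the keys before hashing means two servers holding the same settings in a different order still compare as equal.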
Gregor Vand
Yeah. Looking ahead, IPv6 is obviously a big transition that's in the works and will, I guess, be propagated over the next few months, et cetera. How is that affecting how any of this works? Or, at a general level in your day to day, how are you having to think about that transition?
Eric Seidel
So, I mean, we're well into the transition. From our perspective, I wouldn't even consider us still in the process of transitioning. We run a dual stack network, IPv4 and IPv6, and they're aligned: our IPv6 network follows the same topology as our IPv4. And in many ways, for a lot of our internal services, we're already IPv6 first. We support IPv4 because a lot of our customers, a lot of eyeballs, are still on IPv4 networks, so we serve them with IPv4. We sell to plenty of customers whose infrastructure is IPv4, so we connect to them with IPv4. From my experience, a lot of the DDoS attacks still seem to be more IPv4 heavy than IPv6, at least last I checked; the last time I handled a major DDoS attack was, I guess, six or seven months ago. But internally we're fully dual stack, and we're fully a go on IPv6.
Gregor Vand
Got it. And going back to DDoS just for a second, because you mentioned the latest one you had to handle. I did see that Cloudflare puts out a very good report. I believe it's quarterly. No, I think it's yearly, maybe, but they talk about it in quarters, in terms of the DDoS attacks that Cloudflare has witnessed. And basically Q2 this year was way up on last year, and Q1 this year had an exceptional number. There are some insights in that article, but as someone who actually deals with it, do you have any insights as to why this is only increasing?
Eric Seidel
I mean, I don't have any particular insights. It just seems to be following the same general pattern: they've been getting bigger and bigger as time goes by. As more devices are connected to the Internet, there are just more devices that can be compromised.
Gregor Vand
Yeah. And interestingly, I believe in that blog post, among the people who had experienced this, something like 63% of the respondents believed it was actually a competitor initiating it. And they talked about specific industries like gaming and gambling.
Eric Seidel
That's the thing. Yeah, gaming is a thing. And that's not a new thing, by the way. Back when I was in customer support at Cloudflare, there were cases of DDoS attacks where the suspicion was, and I'm not in a position to confirm or deny it, that some competitor had hired a botnet herder. Bot herder, I think, is one slang term for them. They hired out a botnet to launch an attack. And that's one reason we provide a product called Spectrum. It's a CDN, it's an edge network product, it's a proxy, but it's not an HTTP/HTTPS proxy. It's a generic proxy for any arbitrary layer 4 protocol on TCP or UDP. One problem it's fixing is for customers who are gaming networks and want to put their game servers behind Cloudflare, but their game server protocols are not HTTP. That's where Spectrum comes in: whatever their gaming protocol is, they can use Spectrum to proxy it, and that helps protect them from that attacking-game-networks phenomenon.
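To make "a generic proxy for any arbitrary layer 4 protocol" concrete, here is a minimal TCP relay in Python using `asyncio`. It is a sketch of the general technique only and makes no claim about how Spectrum is built: the function names and the addresses in the usage comment are hypothetical, and it omits everything that makes a real product useful (filtering, rate limiting, UDP support, timeouts). The point it shows is that the relay copies bytes in both directions without parsing them, so it works for non-HTTP protocols such as a game server's.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break  # the sending side closed its half of the connection
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # a peer reset the connection; just stop relaying
    finally:
        writer.close()


def make_proxy_handler(origin_host, origin_port):
    """Build a connection handler that relays each client to the origin."""

    async def handle(client_reader, client_writer):
        try:
            origin_reader, origin_writer = await asyncio.open_connection(
                origin_host, origin_port
            )
        except OSError:
            client_writer.close()  # origin unreachable: drop the client
            return
        # Relay both directions at once; payload bytes are never inspected.
        await asyncio.gather(
            pipe(client_reader, origin_writer),
            pipe(origin_reader, client_writer),
        )

    return handle


async def serve_proxy(listen_host, listen_port, origin_host, origin_port):
    """Listen for TCP clients and relay them to the origin until cancelled."""
    server = await asyncio.start_server(
        make_proxy_handler(origin_host, origin_port), listen_host, listen_port
    )
    async with server:
        await server.serve_forever()


# Example (hypothetical addresses): relay port 7777 to a game server.
# asyncio.run(serve_proxy("0.0.0.0", 7777, "192.0.2.10", 7777))
```

A design note: because the relay terminates the client's TCP connection itself, the origin only ever sees connections from the proxy, which is what lets a proxy layer absorb or drop unwanted traffic before it reaches the game server.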
Gregor Vand
So yeah, for anyone interested, I've just double checked: it is a quarterly report. It's Cloudflare's 2025 Q2 DDoS threat report, and it's on the Cloudflare blog, which is a great read as well. Eric, great to have you on. I often ask guests this question just before we head out. It's a pretty simple question, but I always get a range of answers: what do you know now that you might tell the Eric who was coming out of college and going off to China, for example? I think it's interesting here especially because, as you say, you left tech for a while, but you obviously stayed quite interested in the network side, and then you came back to tech in a big way. So what would you tell yourself?
Eric Seidel
So one thing I've learned that I wish I'd known when I was younger is that when you're in an engineering role, regardless of what company you work for, and it's not a good employer, bad employer thing, regardless of what industry you're in, burnout is a real thing. When I was young I took a sort of blasé attitude to burnout. I didn't really take it seriously. I'm young, I'm strong, I can do this, I can do lots of all-nighters. And, well, before I was even able to enter the industry, right out of university, I had already burnt myself out. I spent some time studying classics, doing the Latin and Greek thing, and then time in China learning Chinese. It was only much later that I came back to tech, and by the time I came back, I had learned that you need work-life balance. What I've learned now is to have a much better work-life balance. I'm still a very hard worker, I still push really hard, but the old days, when it was so much more unbalanced towards 100% tech, 100% of the time, are behind me. Looking back, I'm glad I've lived my life the way I have. I really enjoyed all my time outside of tech, learning all those other things. But if I did it again, I probably would have taken a more work-life-balanced approach from the get-go and not gone through that odyssey of burnout, then recovery, then working my way back to tech from my excursion, my travels outside of tech.
Gregor Vand
I think that's a great call out. And the burnout thing is, as you say, real. I would also agree that as you get older, you at least start to understand it more for yourself, because burnout can mean slightly different things to different people. It's not just, oh, are you working 12 to 18 hour days or whatever. It's actually often about so many other factors that play into it as well. So yeah, I love that answer. I think it's great advice for many people. And fun fact, my grandfather was a Latin teacher, so I'm also quite into languages. I'm terrible at most of them, but I love just trying to learn them. I tried to learn a bit of Cantonese, which is a very tonal language and very difficult.
Eric Seidel
That's on the hard end of the Chinese languages.
Gregor Vand
That is on the hard end. Yeah. No shock that I didn't exactly become fluent, but I was able to make myself understood, so that's always a fun one. Well, thank you so much, Eric. We've learned a ton today. Is there anywhere people can find you, whether it's LinkedIn or anything like that, that you want to mention?
Eric Seidel
Sure. I've just got a LinkedIn, at ErikJSeidel. I think it's just LinkedIn, ErikJSeidel: E R I K J S E I D E L.
Gregor Vand
Okay, awesome. Well, again, thanks so much for coming on, and if I'm in Austin, I'll come say hi as well.
Eric Seidel
I'd be glad to have you here, show you around.
Gregor Vand
Thanks so much.
Podcast: Software Engineering Daily
Episode Date: November 6, 2025
Host: Gregor Vand
Guest: Erik Seidel, Network Engineer at Cloudflare
This episode explores the fundamental architecture underpinning the Internet, with a particular focus on real-world networking, global infrastructure, routing protocols, and the practical challenges faced by high-scale providers like Cloudflare. Erik Seidel, an experienced network engineer with a unique background including time in China, shares insights on topics including BGP, peering versus transit, redundancy, regional nuances (especially China), Cloudflare’s infrastructure, DDoS mitigation, and more.
The episode demystifies the Internet’s global network architecture, from the basics of BGP and peering to the specific complexities of operating in China and the technical underpinnings of DDoS protection. Erik Seidel’s hands-on insights and propensity to “start with the problem” make this a valuable listen for any engineer seeking a real-world understanding of Internet-scale networking.