A
You're listening to the CyberWire Network, powered by N2K. Hello everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in our rapidly evolving cyberspace. Thanks for joining us.
B
Basically, a data leak happened in September of this year, so a few months ago, which exposed an unprecedented amount of very specific detail on how the Great Firewall actually works. And of course, when such a data leak is exposed to the public, it's always worth having a look. And my research team was kind of chomping at the bit. They're like, can we look at this? And we said yeah, of course, with the requisite precautions. Sometimes these dumps might be booby-trapped or otherwise, but this appears to be genuine and a treasure trove of information about something that's generally been kept very secret.
A
That's Daniel Schwabe, head of investigations and CISO at DomainTools. The research we're discussing today is titled Inside the Great Firewall.
B
There's not a whole lot of public information about the Great Firewall and how it does its thing. A lot of research has been done just trying to figure it out empirically, but in this particular situation, over 500 gigabytes of internal data about the infrastructure and how it's organized, et cetera, was released, and we dug into the data in order to write about it.
A
Well, can you give us some insights how you start digging into a data set that is that large? How do you go about it?
B
Yeah, that can certainly be overwhelming. We first took a high-level look at, like, okay, what files are there? There were diagrams and text specifications. So you cluster those into kind of one category, and then whenever there are particular outlines of human interaction, like this is who controls it, et cetera, you put them in a different bucket, and then you start going through them. You will have to do a little bit of keyword searching. We intentionally didn't use any LLM tools because we didn't want to further proliferate information. But we have some of our own tools we can feed information to and do a quick analysis to kind of home in on the large chunks: the human part of it, the technical design, and then potentially what that could actually mean in terms of the real world.
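The triage described here, one bucket for diagrams and specifications and another for material about who controls what, can be sketched in a few lines. This is only an illustration under stated assumptions: the file suffixes, the keyword list, the bucket names, and the function names `bucket_for` and `triage` are all hypothetical and are not taken from the research team's actual tooling.

```python
from pathlib import PurePosixPath

# Hypothetical rules: suffixes treated as design material, and keywords
# treated as hints that a document describes people or lines of control.
DESIGN_SUFFIXES = {".png", ".svg", ".vsdx", ".txt", ".md", ".pdf"}
PEOPLE_KEYWORDS = ("operator", "administrator", "reports to", "approved by")


def bucket_for(path: str, text: str = "") -> str:
    """Assign one file to a coarse bucket using keywords first, then suffix."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in PEOPLE_KEYWORDS):
        return "people-and-control"
    if PurePosixPath(path).suffix.lower() in DESIGN_SUFFIXES:
        return "diagrams-and-specs"
    return "unsorted"


def triage(files: dict) -> dict:
    """Group a {path: extracted_text} mapping into {bucket: [paths]}."""
    buckets = {}
    for path, text in files.items():
        buckets.setdefault(bucket_for(path, text), []).append(path)
    return buckets


if __name__ == "__main__":
    sample = {
        "net/topology.svg": "",
        "docs/gateway_spec.txt": "packet pipeline stages",
        "org/roster.txt": "Regional administrator: approved by central office",
        "misc/blob.bin": "",
    }
    for bucket, paths in triage(sample).items():
        print(bucket, paths)
```

Checking keywords before file type is a deliberate choice in this sketch, so that a plain-text file about personnel lands in the people bucket rather than with the specifications.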
A
Well, let's dig in here. From your research, how do you describe the overall architecture of the Great Firewall?
B
I'm actually, this might be controversial, but I'm actually quite impressed. I used to work for an organization that basically, while it wasn't an ISP, ran a carrier-grade network. So I've struggled with how to do security on a hundred-gigabit link, and that is no small feat. Granted, my experience was probably about 10 or so years ago, so technology has come a long way. But even back then, trying to do any kind of, like, just anti-malware inspection of the traffic in real time was a huge undertaking and cost millions and millions of dollars in equipment to be able to do properly. Now in a state-sponsored situation such as in China, the funding is less of an issue, but the sheer scale of the infrastructure is quite impressive. The fact that they figured out how to build this, if you will, digital wall that any connection sourced from the mainland in China has to go through, but also to map it out where there is central control, which of course is important if you want to block certain types of information from leaving the country. So you have to have a central point of command and control, but then it can still be spread out to regions, and it gives regional governments some level of insight and blocking ability as well. And the fact that they managed to design this at a scale that large, and it's fairly effective, I'm actually quite impressed.
A
Yeah, the report talks about things like the Traffic Secure Gateway and deep packet inspection. Can we dig into some of those details and how they work?
B
Yes, a lot of the technology being used is what's been used as regular cybersecurity best practice for years. So basically what deep packet inspection means is, the way the Internet is designed, when information is transmitted from one point to another, it gets chunked into little datagrams called packets. And the idea is if one of them or two of them get lost on the way, you can either ask for retransmission, or it's not important because you can make it up from the context. So that gives you these small packets that travel over the Internet and contain the information. Now, deep packet inspection essentially means that in real time, you intercept this particular packet, you peek inside, and you glean what information might be included inside, whether there's, like, a malware hash, or whether particular destinations and sources are talking to each other. So all very interesting, but you have to do that at speed so as not to slow the Internet down significantly, where somebody might become suspicious or, if it's a customer, complain, like, why is my connection so slow? Doing that at scale is important, but it gives you an idea of what two points on the Internet are talking about to each other. Of course, there are things like encryption that make this a lot harder, but there are other techniques you can use. Even if the HTTPS connection is encrypted, you can still get an idea of what the source is trying to reach on the outside Internet and make blocking decisions based on that.
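To make "peek inside the packet" concrete, here is a minimal sketch of the idea: read the IPv4 and TCP headers to see who is talking to whom, then look at the payload for a rule match. It is a toy under stated assumptions, not a description of the Great Firewall's implementation. The keyword rule, the function names, and the hand-built demo packet are all hypothetical, and a real inspection system would also have to reassemble streams and keep up with line rate.

```python
import socket
import struct

# Hypothetical rule set used only for this illustration.
BLOCKED_KEYWORDS = (b"forbidden-topic",)


def inspect_packet(packet: bytes) -> dict:
    """Parse a raw IPv4 packet and, for TCP, scan the payload for keywords."""
    if packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ip_header_len = (packet[0] & 0x0F) * 4      # IHL is counted in 32-bit words
    protocol = packet[9]
    result = {
        "src": socket.inet_ntoa(packet[12:16]),
        "dst": socket.inet_ntoa(packet[16:20]),
        "verdict": "pass",
    }
    if protocol != 6:                            # 6 is TCP; ignore anything else
        return result
    tcp = packet[ip_header_len:]
    result["sport"], result["dport"] = struct.unpack("!HH", tcp[:4])
    tcp_header_len = (tcp[12] >> 4) * 4          # data offset, also in words
    payload = tcp[tcp_header_len:]
    if any(keyword in payload for keyword in BLOCKED_KEYWORDS):
        result["verdict"] = "block"
    return result


def make_demo_packet(src, dst, sport, dport, payload: bytes) -> bytes:
    """Build a minimal IPv4/TCP packet (checksums left at zero) for testing."""
    tcp = struct.pack("!HHIIBBHHH", sport, dport, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
    total_len = 20 + len(tcp) + len(payload)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0, 64, 6, 0,
                     socket.inet_aton(src), socket.inet_aton(dst))
    return ip + tcp + payload


if __name__ == "__main__":
    pkt = make_demo_packet("10.0.0.5", "203.0.113.7", 51000, 80,
                           b"GET /forbidden-topic HTTP/1.1\r\n")
    print(inspect_packet(pkt))
```

The sketch only works on cleartext payloads. Once the payload is encrypted, the header fields remain readable but the keyword scan stops matching, which is the limitation the guest raises next.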
A
Yeah, in the research, you all talk about this notion of fingerprinting encrypted traffic. Unpack that for us. You talk about invisible identifiers. Why does that matter?
B
Yeah, so of course, privacy on the Internet is certainly important to me, and I think a lot of people have started caring about that a lot more as of late. And so basically, the Internet wasn't really designed with encryption in mind. In the early days, everything was transmitted in clear text, so you wouldn't really be concerned about somebody maliciously intercepting your traffic to see what was going on. Well, later on we added some of those layers, and one of them that's very popular is the secure HTTP protocol, HTTPS, which uses TLS encryption, Transport Layer Security. Basically, you connect with a web server, you exchange some pieces of information, and an encrypted tunnel between your computer and the web server is created where all the data with that particular site is being exchanged. But outside observers would not be able to tell who it is that you're talking to and what information you're exchanging. Extracting the information inside the encrypted tunnel is much harder because the cryptography is pretty strong. And so doing that on the fly is still not trivial. There are entities around this world that probably can do it, but at scale it is very difficult. So what you still want to know, for a particular user on your network that you might be concerned with or have other thoughts about, is who they're talking to on the outside. Part of what TLS encryption, the protocol, introduced is the ability to obfuscate what virtual server you're talking to. And what I mean by that is, on the Internet, you might have a web server that has an IP address, but it could answer for multiple domains. So for example, we have DomainTools.com, but it could also answer for something like DomainTools.net, et cetera. So from a strictly network connection standpoint, all you're seeing is this computer reached out to this IP address, but we don't know what domain is being loaded that might be associated with that. 
And so there are techniques you can use to do a deobfuscation, essentially. By fingerprinting certain sites, by looking at the data that the browser sends, et cetera, you might be able to glean which specific website it is out of the potential dozens that could be present on a particular IP address, which then gives you a good idea of what this particular user might be up to.
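One widely documented example of "the data that the browser sends" is the Server Name Indication (SNI) extension, which a TLS client typically includes unencrypted in its ClientHello unless Encrypted Client Hello is in use. The sketch below pulls that hostname out of a ClientHello record. It is an illustration, not something taken from the leaked material: the function names are hypothetical, the parser handles only a single unfragmented record, and `build_client_hello` produces a stripped-down message purely so the parser has something to read.

```python
import struct


def extract_sni(record: bytes):
    """Return the SNI hostname from a TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16 or record[5] != 0x01:   # handshake / ClientHello
            return None
        pos = 5 + 4                                  # record + handshake headers
        pos += 2 + 32                                # client version + random
        pos += 1 + record[pos]                       # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len                        # cipher suites
        pos += 1 + record[pos]                       # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:                        # server_name extension
                # Layout: list length (2), name type (1), name length (2), name
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if record[pos + 2] == 0:             # name type 0 is host_name
                    return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                                  # truncated or malformed
    return None


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (demo only)."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni = struct.pack("!H", len(entry)) + entry
    extensions = struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"        # version, random, no session
            + struct.pack("!H", 2) + b"\x13\x01"     # one cipher suite
            + b"\x01\x00"                            # null compression
            + struct.pack("!H", len(extensions)) + extensions)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


if __name__ == "__main__":
    print(extract_sni(build_client_hello("domaintools.com")))
```

With the hostname in hand, an observer can tell which of the sites sharing one IP address a client asked for without decrypting anything inside the tunnel.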
A
So we're talking about looking at metadata then.
B
Yes, got it.
A
Now, one of the things that your research highlights is that this is not a static thing, that this system has adaptive capabilities. Can you explain that to us?
B
Yes. I mean, anything at that size and scale has to be modular. You can't rely on basically a single technology here. If there's a failure or something, then the whole Internet goes down for a particular country. That wouldn't be very practical. I mean, for better or worse, the Internet drives commerce around the world. And as we've seen here in the States from recent cloud provider outages, if one of them goes offline for a few hours in a day, a large part of the population is having a bad day. So the ability to sustain a functioning Internet is the highest priority. So fault tolerance to a degree has to be there. And so the way it seems to be designed, based on the information in the dump, is that the modularization of it means that certain parts could potentially be instructed to take one action where another part is completely unaffected. Or if there might be a regional protest movement or something, the administration of that particular region could say, we're going to block any and all mentioning of the following keywords, et cetera, but that might not necessarily be applied globally to the entire thing. So a different part of the country might not even be aware this is happening, because otherwise that might give an idea. You know, if you want to control information specifically within the country from one point to the other, you also have to be concerned about what entities within your network say to each other: hey, something's going on over here. And having this modular design that's pushed pretty far to the edge, down to the regional government, with the ability to affect blocking there, is very central to the strategy that they're employing.
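The split between central policy and regional blocking can be pictured with a small model: one rule set that applies everywhere, plus per-region rule sets that only affect traffic in that region. This is a conceptual sketch only. The class `PolicyNode`, the function `decide`, the region names, and the keywords are all hypothetical and do not reflect the actual design documented in the leak.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyNode:
    """A set of blocked keywords managed by one authority (central or regional)."""
    name: str
    blocked_keywords: set = field(default_factory=set)

    def blocks(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword in lowered for keyword in self.blocked_keywords)


def decide(text: str, region: str, central: PolicyNode, regional: dict) -> str:
    """Apply central rules everywhere, then only the rules of the given region."""
    if central.blocks(text):
        return "blocked-central"
    node = regional.get(region)
    if node is not None and node.blocks(text):
        return "blocked-regional"
    return "allowed"


if __name__ == "__main__":
    central = PolicyNode("central", {"topic-a"})
    regional = {"region-1": PolicyNode("region-1", {"local-protest"})}
    message = "Updates on the local-protest downtown"
    print(decide(message, "region-1", central, regional))   # regional rule applies
    print(decide(message, "region-2", central, regional))   # other region unaffected
```

Because each regional node holds its own rules, adding a keyword in one region changes nothing elsewhere, which mirrors the point that other parts of the country might not even be aware a block is in place.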
A
We'll be right back.
A
Another thing the investigation mentions is you all refer to it as a state industrial censorship complex with vendors and telecom carriers and regional nodes and central policy hubs. What part do these various folks play and how significant is that for the maintenance and evolution of the system?
B
Yeah, it's an excellent question. From what we can glean from the data in the dump, basically any entity that provides Internet access to end users within the country is by hook or by crook conscripted into helping this effort. Like, there's no opting out. You want to do business in China as an Internet service provider, you agree to participate in this scheme. That's the only way it works. Same thing with mobile providers. They're still, in a way, Internet service providers, even though they provide telephony as well. But basically, a large part of the population accesses the Internet from mobile devices. So wherever that gets routed before it hits the open Internet has to be in there as well. And so Internet service providers play a key role. Manufacturers of hardware that helps to route the Internet, transmit the traffic, et cetera, those all ideally have to be optimized for that purpose. And there are a number of manufacturers in the country that, it appears based on the information that was leaked, are actively cooperating and building hardware specifically that is beneficial to the type of network inspection at high rates that is needed to sustain this operation. So now we've got Internet service providers, we've got hardware manufacturers, various different entities that are in the chain of bringing Internet access to an end user wherever it may go. And because of the power of the state, and you're not going to do business in China without explicit approval of the state apparatus, they can exercise this control over the various pieces in order to make this all work. If we were to try to do something even remotely close, let's say in the United States, because ISPs are independent entities, it would be very difficult to compel them to do so. Same with hardware manufacturers. They all have regular customers who would probably object vigorously to a hardware manufacturer basically building in a better way to sniff the traffic. 
It's been attempted various different ways, but unless you have the full control end to end over the infrastructure, it would be almost impossible to pull off. But based on the information in the dump, it sure appears like they've done a pretty good job at getting that all working.
A
So help me understand here. Are there global providers of these sorts of things, hardware and services, and are they making custom versions for the Chinese market?
B
So to my knowledge, it's focused on the actual Chinese manufacturers. You know, Huawei is one of them. Of course that's been in the news off and on over the years, but there are several others. I don't believe that there are, you know, manufacturers based outside China that do very specific modifications for the country in order to be able to sell there. You may need to take some notes from the regime, but there are also a lot of companies who just simply opt to not sell in the market because they don't want to be forced to introduce potential backdoors or additional hardware in things. That's not to say that you couldn't buy particular hardware on the open market and then modify it for your own purposes after the fact, but at scale, it would basically require the manufacturer to cooperate. And there's enough of the technology know-how within the country that they can lean on their domestic manufacturers pretty strongly without having to involve, you know, foreign companies.
A
Well, given this information, how does this affect countermeasures, you know, things like VPNs or proxies or those sorts of circumvention tools?
B
Do they work? Yes, yes and no. It certainly used to be much more of a cat-and-mouse game, because with anything that large, there are going to be potential small loopholes or flaws in the design that you can exploit given enough time. And so certain VPNs, a certain way of tunneling, et cetera, has been possible. And if it gets detected and figured out how it's done, then it gets blocked. So you kind of keep moving. However, the specific technical details that were released in this data dump will actually help individuals or entities who want to enable more unfiltered access for people in the country. They might be able to use that to do an even better job at circumventing things, because the specific technical details of how VPNs are detected, and how certain activity or patterns are detected that then cause downstream blocking or being flagged for further review or something, have been made public in the dump and could absolutely be used as a blueprint on how to do a better job circumventing. We haven't seen much of that yet, but it's only been a couple of months, so I suspect it's coming.
A
Suppose I'm on an enterprise security team or maybe a global Threat Intelligence team. Is there anything in this data dump that helps inform how I might work with or monitor traffic from China?
B
Yes, I would definitely think so. It depends on the level of sophistication of the entity and also their threat model. But there's enough technical information in there that would give you a pretty good idea, especially if you're seeing web connections coming from mainland China, what those look like, they're all going through the great firewall. So it gives you a better idea about is something going through the firewall, or did somebody find a temporary way to basically circumvent it or get around it? Because the pattern and the fingerprint of stuff that's coming in are likely just slightly different enough that with this additional information of what to look for, you might be able to tell the one activity from the other.
A
You mentioned at the outset that you were impressed by what you saw in this information. How so? How did it surprise you?
B
Just the sheer scale. I mean, we knew the thing existed. And there's been some research, external research, that's been done on it, just by probing the various defenses, et cetera. There was never any specific information. Everything was basically assumptions based on observations, et cetera. But to actually have the documentation, and that it appears to be legitimate is important to say, to have the documentation and see things like, yep, I thought this is how they were going to do this. Oh, no, this is completely different than maybe I would have thought up. Now, I'm not a network engineer, so I'm not saying my design would have been the world's greatest, but I've been doing this for 25 years. I've seen enough designs where I'm like, yeah, the faster the traffic, the bigger the bandwidth, the much more challenging this becomes. And so, I guess, me being impressed was about how to actually force this into being at the scale that it is, and it working as reasonably well as it appears to be. That's the impressive part.
A
Yeah. What do you suppose this does to the future here? I mean, with this information being revealed, certainly, I would imagine the powers that be in China aren't happy about this. Do you suspect that there'll be any sort of pivoting here, or is this a system, you know, a battleship that's hard to turn on a dime?
B
Yeah, I think it's probably somewhere in the middle. Absolutely. It's a big operation that, you know, just to completely start from scratch and throw away all of the old paradigms, that's not going to work. Or if so, it would take a really long time and a big investment. I would certainly feel very concerned for whomever leaked that data. I know there's a hacktivist group that took credit for it and they certainly published it, but just looking at the specific data contained in the dump, this almost had to have been somebody with pretty good access on the inside. In my professional opinion, this wasn't like a smash-and-grab hack, where they found an open file share somewhere and downloaded this information. Whoops, it wasn't properly locked down. I don't believe so. This appears to be some kind of inside job or possibly a disgruntled employee somewhere in the machine that had access to enough of this information. It could be that it was aggregated on some system that got compromised and it wasn't really meant to be leaked. But again, given the specificity and the combination of the files in the data leak, it sure smells like it was somebody with extreme internal knowledge and access to be able to pull all these files together. I would be very concerned for that person and I hope they're going to be okay. I think there will be some evaluation of current techniques. We also are not 100% certain how current the information is. Some of it appears to be very current because it talks about stuff that can be placed in the timeline. But it's also possible that there are additional technologies already being deployed that were not captured by the information in the leak. So it's going to be interesting to see what potential countermeasures the operators of the Great Firewall might be taking as a result of this. 
To my knowledge, we haven't seen anything very obvious, but this is also something you'd probably want to do low and slow so as to not give away that you're already taking countermeasures.
A
Our thanks to Daniel Schwabe from DomainTools for joining us. The research is titled Inside the Great Firewall. We'll have a link in the show notes. And that's Research Saturday, brought to you by N2K CyberWire. We'd love to know what you think of this podcast. Your feedback ensures we deliver the insights that keep you a step ahead in the rapidly changing world of cybersecurity. If you like our show, please share a rating and review in your favorite podcast app. Please also fill out the survey in the show notes or send an email to cyberwire@n2k.com. This episode was produced by Liz Stokes. We're mixed by Elliott Peltzman and Tré Hester. Our executive producer is Jennifer Eiben. Peter Kilpe is our publisher, and I'm Dave Bittner. Thanks for listening. We'll see you back here next time.
Date: December 13, 2025
Host: Dave Bittner (A)
Guest: Daniel Schwabe (B), Head of Investigations and CISO at DomainTools
Main Theme:
A detailed analysis of a massive internal data leak revealing unprecedented, intricate details of China’s Great Firewall—its architecture, operational mechanisms, vendors, and the implications for both censorship and circumvention.
This Research Saturday episode centers on a data leak of more than 500 GB exposing internal engineering and administrative details of the Great Firewall of China. Dave Bittner speaks with Daniel Schwabe from DomainTools, whose team investigated the leaked documents and examined China’s internet censorship apparatus, from technical architecture (deep packet inspection, regional control nodes) to real-world implications for global cybersecurity, circumvention efforts, and enterprise threat intelligence.
This episode delivers a revealing look at the operational reality of China’s Great Firewall, as exposed by a massive data leak. Daniel Schwabe and his research team dissected the data, uncovering the Firewall's impressive technical scale, compulsory collaboration from ISPs and hardware vendors, and advanced traffic inspection techniques. The leak opens new opportunities for both circumvention tools and for enterprises to better understand and manage cross-border data flows. The origin of the leak suggests insider involvement, and while rapid shifts by the Chinese state are unlikely, experts anticipate gradual enhancements to firewall defenses. Schwabe’s takeaway: the Firewall remains a remarkably sophisticated, formidable system—one whose secrets are now, at least in part, public knowledge.