Podcast Summary: Inside China’s Great Firewall with Jackson Sippe
Podcast: Software Engineering Daily
Date: February 19, 2026
Guest: Jackson Sippe (PhD Researcher, University of Colorado Boulder)
Host: Gregor Vand
Overview
This episode explores the architecture, evolution, and recent breakthroughs in understanding China’s Great Firewall (GFW)—one of the world’s most sophisticated systems for internet censorship. Jackson Sippe, a leading researcher in internet censorship, joins host Gregor Vand to demystify how the GFW detects and blocks traffic, particularly focusing on a novel technique that disrupted encrypted proxies from 2021–2023. The conversation also delves into the cat-and-mouse landscape of circumvention tools, technical and political dynamics, and the global export of censorship technology.
Main Themes
- How the Great Firewall Works: Technical explanation of China's censorship mechanisms and their evolution.
- Groundbreaking Research on GFW’s 2021–2023 Blocking Event: Discovery of a “popcount” algorithm that blocked fully encrypted proxy protocols.
- Reverse Engineering the GFW: Tools and methodologies used by researchers, including risks and practical experimental setups.
- Techniques for Circumvention: Strategies to bypass GFW, including data obfuscation, byte pre-pending, and protocol mimicry.
- Impact, Collateral Damage, and Global Implications: False positives, collateral blocking, commercial ramifications, and the export of similar systems to other countries.
Key Discussion Points & Insights
1. What Is the Great Firewall? (02:59–04:42)
- China’s GFW is a suite of technical mechanisms for internet censorship, affecting DNS, TLS, QUIC, and more.
- The GFW is not limited to passive blocking: it can also mount offensive actions, such as the 2015 “Great Cannon” attack, which injected JavaScript into web traffic to direct a DDoS attack at GitHub.
- The term “GFW” is now also applied to other countries' censorship systems (e.g., “Iran's GFW”).
Memorable Quote:
"One of my favorite facets of the GFW is a tool called the Great Cannon..."
— Jackson Sippe (03:36)
2. Why Researching the GFW is Hard (05:13–06:25)
- Outside researchers lack “ground truth” due to black-box nature.
- Major challenge: Obtaining and maintaining vantage points (e.g., cloud servers in China) for experiments.
Quote:
"We can... only speculate. Right. It takes a number of experiments to really determine whether or not what we have observed is what we think it is or just some other effect of the network."
— Jackson Sippe (05:57)
3. The 2021–2023 Encrypted Proxy "Watershed" Event (06:47–11:13)
- In November 2021, Chinese users suddenly lost access to previously reliable encrypted proxies like Shadowsocks and V2Ray.
- Caused widespread confusion—affected all major circumvention tools.
Quote:
"Users all of a sudden just couldn't access their proxies. And it wasn't just one particular implementation, but it was widespread."
— Jackson Sippe (09:38)
4. Groundbreaking Research Methodology (11:37–16:10)
- Six-month setup with VPSs in Chinese cloud providers and university “sink” servers abroad to simulate, trigger, and observe censorship events.
- Researchers sent varied, controlled TCP payloads to deduce how the GFW identifies encrypted protocols.
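As a rough illustration of what "varied, controlled payloads" can mean, the sketch below builds byte strings with an exact number of set bits, so that one property is changed at a time between probes. The function names, the 200-byte size, and the chosen targets are assumptions for illustration; the episode summary does not describe the researchers' actual tooling, and the sketch deliberately sends nothing over the network.

```python
import random


def payload_with_set_bits(n_bytes: int, set_bits: int, seed: int = 0) -> bytes:
    """Return n_bytes bytes with exactly `set_bits` one-bits at random positions."""
    total_bits = n_bytes * 8
    if not 0 <= set_bits <= total_bits:
        raise ValueError("set_bits must be between 0 and n_bytes * 8")
    rng = random.Random(seed)  # seeded so a probe can be reproduced later
    buf = bytearray(n_bytes)
    for pos in rng.sample(range(total_bits), set_bits):
        buf[pos // 8] |= 1 << (pos % 8)
    return bytes(buf)


def bits_per_byte(payload: bytes) -> float:
    """Average number of set bits per byte."""
    return sum(bin(b).count("1") for b in payload) / len(payload)


# Hypothetical sweep: same length, different set-bit densities.
SIZE = 200
probes = {t: payload_with_set_bits(SIZE, round(t * SIZE)) for t in (2.0, 4.0, 6.0)}
```

Sending such payloads from a vantage point inside China to a sink server outside, and recording which ones trigger blocking, is the kind of experiment that can narrow down which feature a classifier keys on.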
5. Discovery: Popcount-Based Blocking Algorithm (19:16–22:13)
- The GFW used a simple yet effective “popcount” (Hamming weight) calculation: it counted the 1 bits in the payload.
- If the average number of set bits per byte fell between 3.4 and 4.6, the traffic was flagged as encrypted and blocked. Uniformly random data averages 4 set bits per byte, so fully encrypted payloads land inside that window.
- This let the GFW classify high-entropy, likely encrypted connections cheaply.
Key Segment:
"If that value shows up between 3.4 and 4.6, you can say, okay, this is clearly an encrypted payload and we're going to block it."
— Jackson Sippe (21:36)
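The check described in this segment can be sketched in a few lines. This is a minimal illustration, not the GFW's code: the 3.4 and 4.6 bounds come from the episode, while the function names and the treatment of the exact boundary values (excluded here) are assumptions.

```python
LOW, HIGH = 3.4, 4.6  # bounds quoted in the episode, in set bits per byte


def bits_per_byte(payload: bytes) -> float:
    """Average popcount (number of 1 bits) per byte of the payload."""
    if not payload:
        raise ValueError("payload must not be empty")
    return sum(bin(b).count("1") for b in payload) / len(payload)


def looks_fully_encrypted(payload: bytes) -> bool:
    """True if the payload's bit density falls inside the flagged window.

    Assumption: boundary values themselves are treated as not flagged.
    """
    return LOW < bits_per_byte(payload) < HIGH


# Every byte value once: averages exactly 4 set bits per byte, like random data.
uniform = bytes(range(256))
print(bits_per_byte(uniform), looks_fully_encrypted(uniform))
```

All-zero or all-one payloads sit far outside the window, which is why such a cheap statistic separates random-looking ciphertext from most structured data.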
6. Bypass Rules: Protocol Fingerprinting & ASCII Exemptions (22:13–24:30)
- Before running the popcount test, the GFW applied quick exemption rules:
  - Allow traffic starting with long ASCII sequences.
  - Permit packets matching certain protocol fingerprints (e.g., TLS headers).
- 80%+ of traffic could be exempted upfront, reducing computational strain and minimizing false positives.
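The exemption rules above might look something like the following pre-filter. It is a sketch under stated assumptions: the prefix length `MIN_ASCII_PREFIX` is a hypothetical value, and the TLS check only tests the standard record header layout (content type 0x16 for a handshake, followed by a 0x03 major version byte). How the GFW actually matches fingerprints is not specified in this summary.

```python
MIN_ASCII_PREFIX = 6  # hypothetical threshold for a "long" ASCII run


def starts_with_printable_ascii(payload: bytes, n: int = MIN_ASCII_PREFIX) -> bool:
    """True if the first n bytes are all printable ASCII (0x20 to 0x7E)."""
    return len(payload) >= n and all(0x20 <= b <= 0x7E for b in payload[:n])


def looks_like_tls_handshake(payload: bytes) -> bool:
    """True if the payload begins like a TLS handshake record header."""
    return (
        len(payload) >= 3
        and payload[0] == 0x16   # record content type: handshake
        and payload[1] == 0x03   # major version byte used by TLS 1.0 to 1.3
        and payload[2] <= 0x04   # minor version byte
    )


def is_exempt(payload: bytes) -> bool:
    """Cheap checks that run before any bit counting is needed."""
    return starts_with_printable_ascii(payload) or looks_like_tls_handshake(payload)
```

Because both checks inspect only the first few bytes, most connections can be cleared without examining the rest of the payload, which fits the point about reducing computational strain.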
7. Collateral Damage and False Positives (24:30–26:46)
- About 0.6% of normal traffic would be falsely blocked—mainly torrent-related traffic.
- The mechanism was tuned to minimize impact but still produced notable collateral damage.
Quote:
"We found that the large majority of the traffic that would have gotten blocked here at CU... belonged to torrent services."
— Jackson Sippe (25:37)
8. Circumvention Techniques: Popcount Manipulation, TLS Headers, and Obfuscation (27:05–33:47)
- Popcount Manipulation: Padding encrypted traffic so that the payload's average number of set bits per byte falls outside the 3.4–4.6 blocking range.
  - Resulted in about 17% overhead, considered acceptable for circumvention needs.
  - Implemented in shadowsocks-rust and shadowsocks-android.
- Protocol Mimicry: Prepending TLS headers or ASCII patterns to encrypted payloads as a quick fix.
- Active Probing & Defenses: The GFW often probes suspected proxy servers; countermeasures include uniform server responses and hiding proxies behind camouflaged applications (e.g., the Chromium-based front end in NaiveProxy).
Notable Moment:
"Can we just add those four bytes to the start of a fully encrypted or fully random payload? The answer was yes. And we were like, whoa."
— Jackson Sippe (31:50)
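To make the popcount-manipulation idea concrete, here is a toy sketch that appends all-ones bytes until the average bit density reaches 4.6 per byte. It is an assumption-laden illustration, not the scheme shipped in any Shadowsocks implementation: a real tool would also need to tell the receiver how much padding to strip and would avoid an obvious run of 0xFF bytes.

```python
def popcount(data: bytes) -> int:
    """Total number of 1 bits in data."""
    return sum(bin(b).count("1") for b in data)


def pad_above_window(ciphertext: bytes) -> bytes:
    """Append 0xFF bytes until the payload averages at least 4.6 set bits per byte.

    Integer arithmetic (values scaled by 10) avoids float rounding at the boundary.
    Payloads already at or below 3.4 bits per byte are returned unchanged.
    """
    n, ones = len(ciphertext), popcount(ciphertext)
    if n == 0 or ones * 10 <= 34 * n:
        return ciphertext
    pad = 0
    while (ones + 8 * pad) * 10 < 46 * (n + pad):
        pad += 1
    return ciphertext + b"\xff" * pad


# A 1000-byte payload averaging exactly 4 set bits per byte, like random data.
payload = bytes([0x0F]) * 1000
padded = pad_above_window(payload)
print(len(padded) - len(payload))  # number of padding bytes added
```

For a payload averaging 4 set bits per byte, the padding fraction k/n must satisfy (4 + 8k/n) / (1 + k/n) ≥ 4.6, which gives k/n ≥ 0.6/3.4 ≈ 17.6%. That is in the same range as the roughly 17% overhead mentioned above, although the summary does not say how that figure was derived.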
9. Responsible Disclosure & Sharing with Circumvention Developers (36:11–37:51)
- Researchers coordinate closely with proxy tool developers for rapid fixes.
- No attempt to "disclose" to the GFW operators.
- Past experiences show Chinese authorities silently patch vulnerabilities.
10. GFW Disables Dynamic Blocking in March 2023: Possible Reasons & Political Context (39:02–42:36)
- Dynamic blocking ceased March 15, 2023—possibly due to political events like Xi Jinping's re-election or computational/resource considerations.
- Similar censorship ramp-ups observed around sensitive events globally.
Quote:
"It's... one of those parts where we can speculate... but the facts are we don't know."
— Jackson Sippe (39:22)
11. Current State as of 2025 (42:05–42:36)
- The popcount-based blocking is not active, but the GFW remains powerful, using other methods.
12. Technical Architecture: On-Path vs. In-Path Blocking (42:36–44:22)
- In-path devices, which sit directly in the traffic path and can drop or tamper with packets, are suspected for the 2021–2023 event.
- On-path devices, which only receive a copy of the traffic, are used for most ongoing censorship.
13. Machine Learning in the GFW (44:32–45:37)
- Recent leaks (from Geedge Networks and MESA Lab) confirm machine learning use for fingerprinting and possibly in active attacks, but not yet for real-time passive blocking.
14. China’s International Network Bottleneck (“The Great Bottleneck”) (47:01–50:11)
- Persistent high latency when accessing international sites, especially downloads.
- Root cause is likely limited international infrastructure—not purely for censorship, but also to promote domestic internet services and platforms.
Quote:
"This isn't something that's unique to China... US trying to block TikTok unless it gets sold to an American entity is really the same thing."
— Jackson Sippe (50:20)
15. Hong Kong, Proxies, and Dark Fiber (50:39–52:32)
- Hong Kong sits outside the GFW—often used as a first hop or bottleneck bypass.
- Anecdotal reports of "grey market" fiber connections sold for unfiltered access.
16. Global Export of Censorship Tech (52:32–55:16)
- Leaked information from Geedge Networks confirms sales of GFW-like technology to Ethiopia, Kazakhstan, Myanmar, and Pakistan.
- Iran and Russia run their own distinct censorship regimes; Russia’s is notable for its decentralized, patchwork enforcement.
17. What’s Next? Trends and Predictions (55:16–56:23)
- Jackson predicts censorship will get easier for authorities (due to proximity of filtering infrastructure to end-users) and harder to circumvent.
- Growth of middleboxes/NATs expected, making circumvention costlier and more difficult.
18. Jackson’s Ongoing Research (56:38–57:21)
- Focused on leveraging the leaked GFW/Geedge datasets for further ground-truthing.
- Investigating the growing role of machine learning in censorship and traffic analysis.
Notable Quotes with Timestamps
- "One of my favorite facets of the GFW is a tool called the Great Cannon..." — Jackson Sippe [03:36]
- "We can... only speculate. Right. It takes a number of experiments to really determine whether or not what we have observed is what we think it is or just some other effect of the network." — Jackson Sippe [05:57]
- "Users all of a sudden just couldn't access their proxies. And it wasn't just one particular implementation, but it was widespread." — Jackson Sippe [09:38]
- "If that value shows up between 3.4 and 4.6, you can say, okay, this is clearly an encrypted payload and we're going to block it." — Jackson Sippe [21:36]
- "Can we just add those four bytes to the start of a fully encrypted or fully random payload? The answer was yes. And we were like, whoa." — Jackson Sippe [31:50]
- "It's... one of those parts where we can speculate... but the facts are we don't know." — Jackson Sippe [39:22]
- "This isn't something that's unique to China... US trying to block TikTok unless it gets sold to an American entity is really the same thing." — Jackson Sippe [50:20]
Important Timestamps and Segments
| Timestamp | Segment |
|-----------|---------|
| 02:59 | What is the GFW? |
| 06:47 | The November 2021 blocking event explained |
| 11:37 | Research setup and challenges |
| 19:16 | Popcount blocking algorithm detailed |
| 22:13 | ASCII/protocol fingerprint exemptions |
| 24:30 | False positive analysis |
| 27:05 | Circumvention: Popcount manipulation and protocol masking |
| 31:36 | Quick fix: TLS header prepending |
| 39:02 | Political context: Why was dynamic blocking turned off? |
| 47:01 | The Great Bottleneck (international latency in China) |
| 50:39 | Hong Kong and the bottleneck bypass |
| 52:32 | Export of censorship tech to other countries |
| 55:16 | Future trends in censorship and circumvention |
Episode Tone
- Technical, investigative, pragmatic, and at times speculative.
The conversation is rigorous and evidence-driven, but remains candid about the uncertainties and cat-and-mouse nature of internet censorship.
Conclusion
This episode offers a rare, technical, and accessible window into how China’s Great Firewall operates and adapts. It highlights significant advances in reverse-engineering national censorship tools, the rapid evolution of circumvention tactics, and the broader implications of digital control systems—both in China and globally. For software engineers, policymakers, or anyone interested in internet freedom, Jackson Sippe’s insights provide a timely, inside look at an ever-changing digital frontier.
