Disclaimer: This article analyzes network traffic obfuscation and traversal technologies solely from a technical research and cybersecurity education perspective. Please comply with the laws and regulations of your country or region. In mainland China, establishing or using channels for international networking without authorization is illegal.
Because the GFW (Great Firewall) restricts access to many overseas services, users resort to various proxy protocols to bypass it. These protocols have evolved over many years alongside continuous upgrades to the GFW, producing a diverse array of proxy solutions locked in a cat-and-mouse game with it.
GFW Technology Overview
Before discussing how to bypass the GFW, it’s necessary to understand how the GFW operates. The GFW primarily employs the following techniques to restrict user access:
- DNS Poisoning: Tampering with DNS responses so that users cannot resolve certain domain names.
For example, when a user visits https://google.com, the DNS query is sent via UDP to port 53 of the DNS server. As this packet crosses the backbone network, the GFW detects the query and injects a forged response carrying a fake IP address, which arrives before the genuine answer and prevents access to Google.
Currently, almost all well-known blocked websites suffer from DNS poisoning.
- IP Blocking: Blocking certain IP addresses outright so that users cannot establish connections to them.
For instance, some VPN servers may be identified by the GFW after operating for a period and have their IP addresses blocked, cutting off user connections.
- Deep Packet Inspection (DPI): Analyzing packet contents to identify and block traffic of specific protocols.
This mainly includes:
  - Keyword Blocking: Detecting sensitive keywords in packets and blocking traffic that contains them. For example, if the inspected packet contains a sensitive string such as Host: facebook.com in its plaintext headers, the GFW sends an RST packet to terminate the connection.
  - Protocol Fingerprinting: Analyzing protocol characteristics to identify and block traffic of specific protocols.
For example, OpenVPN uses specific handshake methods and encryption algorithms, producing fixed byte sequences. The GFW can identify and block OpenVPN traffic based on these signatures.
- Active Probing: The GFW actively sends probe requests to suspicious servers to confirm whether they are proxy servers.
For example, if the GFW detects a large amount of encrypted traffic from an IP address, suspecting it to be a proxy server, it may impersonate a client and send specific probe packets. If the server responds according to the proxy protocol, the GFW adds it to the blocklist.
- Traffic Pattern Analysis: Using machine learning and other techniques to analyze packet size distribution, sending frequency, handshake latency, and other features to identify proxy traffic.
- Whitelist Mode: Last year, places such as Quanzhou began piloting a whitelist mode that blocks all outbound traffic by default and lets only vetted traffic pass. This makes proxying significantly harder.
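As a rough illustration of the keyword-blocking idea from the DPI item above, the sketch below checks the Host header of a plaintext HTTP request against a blocklist. It is a toy model, not a description of how the GFW is implemented; the names `BLOCKED_KEYWORDS`, `extract_host`, and `should_reset` are hypothetical and exist only for this example.

```python
from typing import Optional

# Hypothetical blocklist, for illustration only.
BLOCKED_KEYWORDS = ("facebook.com",)


def extract_host(payload: bytes) -> Optional[str]:
    """Return the Host header of a plaintext HTTP request, or None if absent."""
    # Only the header section (before the first blank line) is inspected.
    head = payload.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", errors="replace").lower()
    return None


def should_reset(payload: bytes) -> bool:
    """Return True when the Host header contains a blocked keyword."""
    host = extract_host(payload)
    return host is not None and any(k in host for k in BLOCKED_KEYWORDS)
```

A check like this only works on unencrypted payloads, which is why the remaining techniques in the list (fingerprinting, probing, traffic analysis) matter once traffic is encrypted.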
Common Proxy Protocols
Traditional VPN
VPN is a technology that connects to a remote server via an encrypted tunnel. All user traffic is transmitted through this encrypted tunnel, hiding the real IP address and protecting data privacy. Technically, packets are encapsulated in an outer encrypted protocol, sent to the VPN server, which then decapsulates and forwards them to the target website. Return packets from the website are first sent to the VPN server, encrypted, and tunneled back to the user.
[Figure: how a traditional VPN works]
We provided a more technical introduction in our previous article VPN Tunneling.
Traditional VPNs were designed for security, not disguise, so they use fixed, recognizable handshake headers when establishing connections. These characteristics are so obvious that traditional VPN protocols (such as PPTP, L2TP/IPsec, and OpenVPN) are easily identified and blocked by the GFW’s DPI.
OpenVPN in particular is so widely used that the GFW has accumulated extensive fingerprints for it and can easily identify and block its traffic.
Shadowsocks (SS)
SS was developed by Clowwindy in 2012. It’s a lightweight proxy protocol modeled on SOCKS5 that encrypts the data in transit. The client and server are configured in advance with the same password and cipher (such as AES or ChaCha20) and encrypt everything they exchange, making the traffic look like random noise. Thus, even if the GFW inspects the packets with DPI, it sees gibberish and cannot read the original packet headers for pattern matching.
[Figure: how SS works]
One of SS’s most practical contributions was popularizing PAC (Proxy Auto-Config) files in its clients, which let users choose which traffic goes through the proxy and which connects directly. Users no longer need to detour all traffic overseas as they did with earlier VPNs.
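The split-routing decision described above can be sketched as follows. Real PAC files are JavaScript and define a `FindProxyForURL` function; this Python version only mirrors the logic. The suffix list and the local SOCKS5 address `127.0.0.1:1080` are assumptions made for the example.

```python
# Hypothetical rule list: domain suffixes that should use the proxy.
PROXIED_SUFFIXES = ("google.com", "facebook.com")
# Hypothetical local SOCKS5 listener, written in PAC return-value syntax.
PROXY_DIRECTIVE = "SOCKS5 127.0.0.1:1080"


def find_proxy_for_host(host: str) -> str:
    """Return a PAC-style routing decision for one hostname."""
    host = host.lower().rstrip(".")
    for suffix in PROXIED_SUFFIXES:
        # Match the domain itself or any subdomain, but not "notgoogle.com".
        if host == suffix or host.endswith("." + suffix):
            return PROXY_DIRECTIVE
    return "DIRECT"
```

For instance, `find_proxy_for_host("www.google.com")` selects the proxy, while an unlisted domain falls through to a direct connection.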
However, as the saying goes, for every measure, there’s a countermeasure. After updates, the GFW no longer relies solely on DPI but has added traffic analysis methods. For SS, there are two major issues:
- High-Entropy Detection: Even without a protocol header, completely random data is itself an anomaly on the internet. The GFW inspects the first packet of a connection; if its entropy is high, the connection is marked as suspicious and, in some cases, blocked outright.
- Replay Attack: The GFW records encrypted traffic and later sends it verbatim to the server. By observing the server’s reaction (a TCP reset or its response time), the GFW can infer whether the server is running SS and block the IP accordingly.
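To make the first-packet entropy check in the list above concrete, here is a minimal sketch that computes the Shannon entropy of a byte string in bits per byte. The `looks_random` helper and its threshold of 7.0 are illustrative assumptions, not values taken from any measurement of the GFW.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(first_packet: bytes, threshold: float = 7.0) -> bool:
    """Flag a first packet whose byte distribution is close to uniform.

    The threshold is an arbitrary illustrative value, not a measured one.
    """
    return shannon_entropy(first_packet) >= threshold
```

A plaintext HTTP request scores well below the threshold, while a long run of ciphertext approaches 8 bits per byte. Short packets cannot reach high values at all, since the entropy of n bytes is bounded by log2(n), so a real classifier would presumably combine this with other features.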
For details, see GFW Report - How Shadowsocks is Detected and Blocked.
Currently, SS is almost completely unusable, as the GFW’s blocking methods are highly mature. Only a few dedicated line providers still use the SS protocol.
On August 22, 2015, Clowwindy announced that, under pressure from Chinese police, he was withdrawing from Shadowsocks development and deleting the source code from his GitHub page.
ShadowsocksR (SSR)
The author’s withdrawal did not stop SS’s development; on the contrary, the SS community became even more active. In 2015, breakwa11 created a fork called ShadowsocksR (SSR) and open-sourced it on GitHub.
Compared to SS directly encrypting and sending packets, SSR uses obfs plugins to add an extra layer of camouflage. For example, with the http_simple plugin, it adds a fake HTTP request header, making the GFW mistake it for normal HTTP traffic rather than completely random data.
However, SSR’s methods are somewhat crude: DPI with contextual analysis can easily tell that these HTTP requests do not match normal browser behavior, and so detect the traffic.
On July 24, 2017, breakwa11 released a closed-source tool for passively detecting SS traffic, sparking controversy.
On July 27, 2017, breakwa11 announced he was doxxed by ESU TV, deleted all code, and stopped the SSR project.
VMess
VMess is the core protocol in the V2Ray project. Whereas SSR merely adds a disguise layer, a typical VMess deployment wraps the protocol in real TLS, so the connection is genuine HTTPS traffic on the wire.
When establishing a connection with VMess:
- TLS encryption is used, making the traffic appear as normal HTTPS traffic.
- Upon reaching the proxy server, the server’s Nginx matches the path.
- If it matches a predetermined path, the decrypted traffic is reverse-proxied to the V2Ray process for handling.
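The path-matching step in the list above amounts to a small routing decision made after TLS has been terminated. The sketch below shows that decision in isolation; the path `/ray` and the backend labels are hypothetical, and a real deployment would express this as a reverse-proxy rule in the web server configuration rather than in Python.

```python
# Hypothetical secret path shared by the client and the reverse proxy.
PROXY_PATH = "/ray"


def pick_backend(request_path: str) -> str:
    """Decide which backend handles a decrypted HTTPS request."""
    # Ignore any query string when matching the path.
    path = request_path.split("?", 1)[0]
    if path == PROXY_PATH:
        return "v2ray"    # reverse-proxied to the local V2Ray listener
    return "website"      # served as an ordinary web page
```

Because every other path returns normal web content, a visitor who does not know the path sees only an ordinary site.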
[Figure: how VMess works]
Even better, this setup has many extensions for further traffic disguise:
- VMess itself is just an encryption protocol; its outer layer can use not only TLS but also WebSocket, HTTP/2, etc., for encapsulation, making traffic more diverse and harder to identify.
- By using CDN services, the GFW cannot directly block the proxy server’s IP, as traffic goes through CDN nodes first.
VMess’s design makes it excellent against GFW blocking, becoming one of the most popular proxy protocols. However, it still has some potential weaknesses:
- High Performance Overhead: Multiple layers of encryption and encapsulation add processing cost and can increase latency.
- TLS Fingerprinting: Although TLS is used, the GFW can still identify VMess traffic by analyzing the JA3 fingerprint of the TLS handshake. With default configurations in particular, the fingerprint is quite distinctive.
- SNI Blocking: If the SNI (Server Name Indication) is configured improperly, the GFW may identify and block traffic via the SNI field.
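For readers unfamiliar with the JA3 fingerprint mentioned in the list above, the sketch below follows the publicly documented JA3 recipe: five Client Hello fields are written as decimal values, joined with `-` inside a field and `,` between fields, and the result is hashed with MD5. The function names are my own, and the field values in the usage note are made-up inputs rather than a capture from any real client.

```python
import hashlib
from typing import Iterable

# GREASE values (RFC 8701) are skipped so they do not change the fingerprint.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def _join(values: Iterable[int]) -> str:
    """Join decimal values with '-', dropping GREASE placeholders."""
    return "-".join(str(v) for v in values if v not in GREASE)


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma-separated JA3 string from Client Hello fields."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """Return the MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

Calling `ja3_string(771, [0x0A0A, 4865, 4866], [0, 10, 11], [29, 23], [0])` yields `771,4865-4866,0-10-11,29-23,0`. Since the order of cipher suites and extensions is part of the string, a proxy client whose TLS library orders them differently from a browser produces a different hash, which is what makes default configurations stand out.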
Moreover, for ordinary users, using VMess is quite cumbersome, requiring configuration of multiple components (like Nginx, V2Ray), and some technical expertise to maintain and update.
Trojan
Trojan was developed and open-sourced by the TrojanGfw project in 2019, aiming to provide a simple and effective proxy solution. Unlike VMess’s pursuit of comprehensive features, Trojan focuses on simplified design, disguising proxy traffic as normal HTTPS traffic to evade GFW detection.
Trojan’s core working principle is as follows:
- The Trojan server listens on port 443 for all incoming HTTPS traffic. It does not handle encryption directly but relies on TLS for data protection.
- After the TLS connection is established, the client’s first packet begins with a credential derived from a preset password (in the published protocol, the hex-encoded SHA-224 hash of the password). If it matches, the server treats the connection as proxy traffic and forwards the payload to the target address.
- If the first packet does not contain the password, the server treats the connection as a normal HTTPS request, handing it to the local web server (like Nginx) for processing, returning real webpage content.
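The proxy-or-fallback decision in the steps above can be sketched as below. It assumes the first-packet layout given in the published Trojan protocol description: 56 bytes holding the hex-encoded SHA-224 hash of the password, followed by CRLF. The function names and return labels are hypothetical, and the rest of the request (command, address, port) is not parsed here.

```python
import hashlib
import hmac

HASH_LEN = 56  # length of a hex-encoded SHA-224 digest


def expected_token(password: str) -> bytes:
    """Return the hex SHA-224 digest a valid client sends first."""
    return hashlib.sha224(password.encode("utf-8")).hexdigest().encode("ascii")


def classify_first_packet(first_packet: bytes, password: str) -> str:
    """Return "proxy" for an authenticated client, otherwise "fallback"."""
    token = first_packet[:HASH_LEN]
    terminator = first_packet[HASH_LEN:HASH_LEN + 2]
    if (
        len(token) == HASH_LEN
        and terminator == b"\r\n"
        # Constant-time comparison avoids leaking how many bytes matched.
        and hmac.compare_digest(token, expected_token(password))
    ):
        return "proxy"
    # Anything else, including probes, is handed to the real web server.
    return "fallback"
```

An ordinary browser request such as `GET / HTTP/1.1` does not start with the expected digest, so it takes the fallback branch and receives a normal web page.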
This design makes Trojan resistant to active probing. The GFW’s active probes send various probe packets, but Trojan treats all non-proxy traffic as normal HTTPS, returning web pages or error responses, avoiding exposing the server identity.
However, Trojan requires a valid domain and certificate; if SNI is blocked, it may still fail. Users also need some technical knowledge to configure TLS and web servers.
VLESS / Xray
VLESS is a lightweight protocol introduced in the V2Ray project in 2020 as VMess’s successor. It drops some of VMess’s heavier features, such as built-in encryption and time-dependent authentication, and keeps only a UUID to identify the user. Its design philosophy separates the proxy protocol from encryption and transport: confidentiality is delegated to an outer layer, typically TLS, and users can flexibly choose the transport (such as TCP, WebSocket, or HTTP/2) to suit different network environments.
VLESS’s workflow is roughly as follows:
- The client establishes a transport layer connection with the server (such as TCP or WebSocket).
- The client sends a short request header containing its UUID and the target address.
- The server validates the UUID and relays data to the target, relying on the outer layer for encryption.
This separation makes VLESS more modular and extensible, but it also exposes a problem: most proxied traffic is itself HTTPS, so running it inside an outer TLS tunnel produces TLS in TLS. The double encryption adds noticeable overhead and leaves distinctive traffic patterns that the GFW’s DPI and fingerprint analysis can pick up.
To solve this, the Xray project (a fork of V2Ray) introduced the XTLS (Xray TLS) protocol. XTLS is not simple TLS encapsulation but an optimized transport method:
- Direct Splicing: Once the proxied connection is itself carrying TLS, XTLS passes those already-encrypted records through directly instead of encrypting them a second time, greatly reducing CPU usage and latency.
- Fingerprint Camouflage: The client can imitate the TLS handshake parameters, cipher suites, and extensions of common browsers, bringing the traffic fingerprint closer to normal HTTPS traffic and reducing the risk of JA3 or other fingerprint identification.
- Compatibility: XTLS supports fallback mechanisms; if a connection doesn’t match proxy traffic, it is automatically forwarded to another service, such as a real web server.
Xray, as an enhanced version of V2Ray, not only supports VLESS and XTLS but also integrates advanced protocols like REALITY, becoming a powerful tool against the GFW. Its advantages lie in high performance and anti-detection capabilities, but configuration is relatively complex, requiring users to understand various transport options.
REALITY
REALITY is an innovative technology in the Xray project, released by XTLS/REALITY in 2022. It no longer tries to disguise itself as a normal website server but chooses to impersonate CDN nodes of major internet companies, such as Apple, Google, or Microsoft’s edge servers. This “body-snatching” strategy makes the traffic appear completely legitimate in the eyes of the GFW.
REALITY’s working principle is based on the TLS handshake’s Client Hello message:
- Client Handshake: The client sends a Client Hello carrying the SNI (Server Name Indication) of a real, well-known site, with authentication data known only to the REALITY server hidden inside the handshake fields, so it looks like a visit to that site.
- Server Verification: The REALITY server checks the hidden authentication data. Connections that fail the check, including the GFW’s probes, are forwarded untouched to the real site, so a prober sees that site’s genuine certificate and responses.
- Traffic Injection: Once the client is verified, the server completes the handshake itself and carries proxy data over the connection, using XTLS for encrypted transmission and forwarding.
This design brings significant advantages:
- No Domain Required: There is no need to register and configure a domain or obtain a certificate; the server borrows the TLS identity of an existing major site.
- Anti-Blocking: The GFW cannot easily block by SNI or certificate, because doing so would also disrupt normal users’ access to services like Apple.
REALITY represents the latest advancement in proxy technology, making proxy traffic extremely difficult for the GFW to detect or block.
Hysteria
Unlike the TCP-based protocols mentioned above, apernet/hysteria is based on UDP and mimics the operation of QUIC (Quick UDP Internet Connections). QUIC is a transport protocol originally designed at Google and later standardized by the IETF; it provides low-latency, reliable connections and is the foundation of HTTP/3.
Hysteria’s core innovation lies in its congestion control algorithm:
- Aggressive Congestion Control: Traditional TCP reduces its sending rate upon detecting packet loss to avoid network congestion. Hysteria does the opposite: upon detecting packet loss, it sends faster to compensate and seize more bandwidth. This counterintuitive method excels in high-loss, low-quality networks, significantly boosting transmission speed and stability.
- QUIC Camouflage: Hysteria uses QUIC’s handshake and data format, making traffic appear as normal QUIC traffic and improving its disguise.
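The contrast between the two reactions to packet loss described above can be shown with a little arithmetic. The sketch below is a simplified illustration, not Hysteria’s actual implementation: `aimd_rate_after_loss` models a classic Reno-style halving, while `compensated_rate` raises the sending rate so that the delivered throughput stays near a configured target. The floor on the acknowledged fraction (`min_ack_rate`) is an assumed safeguard that keeps the rate from growing without bound.

```python
def aimd_rate_after_loss(rate_mbps: float) -> float:
    """Classic loss-based reaction: halve the sending rate."""
    return rate_mbps / 2


def compensated_rate(
    target_mbps: float, loss_rate: float, min_ack_rate: float = 0.8
) -> float:
    """Raise the sending rate so delivered throughput stays near the target.

    If a fraction loss_rate of packets is lost, sending at
    target / (1 - loss_rate) delivers roughly the target rate.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    # Clamp the acknowledged fraction so heavy loss cannot inflate the rate
    # beyond target / min_ack_rate.
    ack_rate = max(1.0 - loss_rate, min_ack_rate)
    return target_mbps / ack_rate
```

With a 100 Mbps target and 10% loss, the compensating sender moves to about 111 Mbps, whereas the halving sender drops to 50 Mbps. This is the behavior that helps on lossy links and that can look abusive to network operators.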
Advantages include:
- In weak network environments (such as satellite links or mobile networks), Hysteria’s speed often surpasses TCP protocols.
- UDP traffic is harder to deep-inspect with DPI, and QUIC’s encryption provides extra protection.
- QUIC’s 0-RTT handshake reduces connection establishment time.
However, Hysteria also faces challenges:
- Many ISPs impose quality of service restrictions on UDP traffic, prioritizing it below TCP, potentially leading to unstable speeds.
- Aggressive packet sending may be mistaken for DDoS attacks, causing server IPs to be blocked.
Later projects like tuic-protocol/tuic improved upon Hysteria by introducing smarter packet loss backoff strategies to avoid excessive sending while maintaining performance. These UDP protocols bring new possibilities to proxy technology, especially when countering the GFW’s traffic shaping.
However, according to the GFW Report team’s paper published at USENIX Security ’25, Exposing and Circumventing SNI-based QUIC Censorship of the Great Firewall of China, the GFW has begun blocking and interfering with QUIC traffic, so Hysteria’s future remains to be seen.
NaïveProxy
klzgrad/naiveproxy adopts a different counter-strategy: instead of creating new protocols, it strives to perfectly imitate real browser HTTPS traffic. Developed based on Chromium’s network stack, it ensures the generated traffic is byte-level identical to that produced by the Chrome browser.
The advantage of this method is that the traffic fingerprint is indistinguishable from normal browser traffic, making the GFW’s DPI and fingerprint analysis nearly impossible to differentiate. NaïveProxy represents the pinnacle of “faking it till you make it,” but also highlights the limits of proxy technology: when the disguise is perfect enough, detection becomes extremely difficult.
Common Proxy Tools
As proxy protocols evolved, corresponding client tools have proliferated. These tools not only implement protocol support but also provide user-friendly interfaces, rule management, and automation features.
V2Ray and Xray
The rise of the VMess protocol made V2Ray almost synonymous with proxy tools from 2018 to 2020. After V2Ray’s original author was reportedly detained, the community continued development under V2Fly’s leadership, keeping the project alive. However, with the emergence of XTLS technology, developer disagreements intensified, leading to the birth of Xray. Xray not only maintains compatibility with V2Ray but also introduces REALITY and optimized XTLS, becoming a cutting-edge tool against the GFW.
V2Ray/Xray’s advantage lies in high customizability, supporting multiple protocols and transport methods, but the learning curve is steep, suitable for technical users.
Sing-Box
With protocol diversification, SagerNet/sing-box emerged. It supports almost all mainstream protocols (Trojan, VLESS, REALITY, Hysteria, etc.) and provides a unified configuration format. Sing-Box emphasizes performance and security and is written in Go with low resource usage. It also supports a rule engine for intelligent routing based on domain, IP, or application.
Sing-Box’s emergence simplifies user choices, eliminating the need to switch tools for different protocols.
Clash Series
Clash has been one of the most popular proxy tools since 2019, renowned for its concise YAML configuration and rich rule sets. Clash supports multiple protocols and provides a Web UI and API for easy management and monitoring.
However, in 2023, the original author Dreamacro suddenly deleted the repo and disappeared, interrupting the project. The community responded quickly with forks. Ultimately, Clash.Meta, since renamed Mihomo, took over and continued development. Mihomo particularly focuses on community governance and long-term maintenance, becoming the new core of the Clash ecosystem.
The Clash series is known for ease of use and feature richness, suitable for ordinary users.
Other Tools
- Shadowsocks Clients: Such as Shadowsocks-Qt5, ShadowsocksR. Although the protocol is outdated, some users still use them.
- NaïveProxy Clients: The official naiveproxy client runs as a local proxy that the browser is pointed at.
- Hysteria Clients: Official CLI and third-party GUIs, like hysteria-gui.
Conclusion
On September 20, 1987, at 20:55, the China Institute of Computer Applications under the Ministry of Ordnance Industry successfully sent China’s first email, marking the successful connection of China to the international computer network. The content of this email was:
Across the Great Wall we can reach every corner in the world.
Now, it seems quite ironic.
The history of proxy technology is essentially a history of the cat-and-mouse game with the GFW. Whenever the GFW upgrades its blocking methods, proxy protocols and tools evolve new countermeasures accordingly.
During this period, many notorious figures and organizations have aided and abetted the blocking. From “Father of the Firewall” Fang Binxing to Cisco, Qihoo 360, Venustech, and the Institute of Information Engineering at the Chinese Academy of Sciences’ MESA Lab, they have enhanced the GFW’s blocking capabilities through technical support, intelligence sharing, and other means, posing huge challenges to the proxy community and severely hindering the free flow of information.
However, it is these challenges that have spurred innovation in proxy technology. Each blocking upgrade prompts developers to seek new breakthroughs, driving the continuous evolution of protocols and tools. From traditional VPNs to SS, SSR, then VMess, Trojan, VLESS, REALITY, and Hysteria, each technological innovation reflects the proxy community’s determination and wisdom in combating censorship.