
GreenTunnel: anti-censorship utility designed to bypass deep packet inspection - hieudang9
https://github.com/SadeghHayeri/GreenTunnel
======
snvzz
This is a nice workaround for those stuck under censorship regimes such as the
UK, South Korea, Turkey, India or China.

Now, Encrypted DNS (thanks to DNS over TLS/HTTPS) and HTTPS (thanks to Let's
Encrypt and HSTS) are getting deployed somewhat widely.

The next step is encrypted SNI[0], which will make it that much harder to do
any meaningful DPI, for censorship or anything else.

[0]:
[https://en.wikipedia.org/wiki/Server_Name_Indication#Securit...](https://en.wikipedia.org/wiki/Server_Name_Indication#Security_implications_\(ESNI\))

~~~
nimbius
There are two edges to this sword.

DoH also means breaking stuff like pihole and other ad filtering. It means you
trust companies like google who base their revenue off ads, or cloudflare who
have censored content numerous times in the past, to serve you DNS.

It's also kind of pointless if the state knows you're using it outside of a
tunnel... they can just watch your _next_ packets to see where you decided to
go.

~~~
godelski
> cloudflare who have censored content numerous times in the past

Besides Stormfront[0], what else did they censor?

[0]
[https://en.wikipedia.org/wiki/Stormfront_%28website%29](https://en.wikipedia.org/wiki/Stormfront_%28website%29)

~~~
abiogenesis
I wouldn't call that censoring either. They just refused to provide any
services to them.

~~~
natmaka
Indeed, "deplatforming" isn't equivalent to "censoring".

------
travisgriggs
Why hasn't this become the modern Right to Bear Arms? The root of the second
amendment was trying to ensure that one class of citizenry did not have tools
at their hands to force another class of citizenry to comply. It maintained a
balance. The right to encrypt and keep your data private should be a modern
equivalent of the right to bear arms.

~~~
ploika
Bear in mind that as a non American, invoking a right to bear arms actively
turns me off whatever you're trying to sell me. Too much damage has been done
to my home country by violent groups who took up arms against the state and
each other.

~~~
101404
Just look at the past 100 years of human history and the damage that has been
done because citizens had NO arms to defend their rights against a despotic
government. The deaths are counted in millions.

Insurgent groups will buy a 5 dollar AK-47 regardless of legality. This is about
lawful citizens being able to defend their rights against unlawful
governments. Or at least to raise the costs an armed group has to pay for an
attempt to take power.

As another non American, I am still undecided on the issue, but I tend to be
in favor of that 2nd amendment.

------
m_a_g
It is working perfectly for Turkcell Superonline, Turkey. Unfortunately,
anti-censorship tools are crucial for us these days. Thank you for your work.

~~~
degski
Going voting is crucial.

------
segfaultbuserr
> _GET / HTTP/1.0_

> _Host: www.youtube.com_

> _We send it in 2 parts: first comes GET / HTTP/1.0 \n Host: www.you and
> second sends as tube.com \n .... In this example, ISP cannot find blocked
> word YouTube in packets and you can bypass it!_

If you tell anyone from China that _this_ is how you bypass (HTTP) "deep
packet inspection", it would sound incredibly naive. I'm not criticizing
here, and thanks for developing an anti-censorship tool, but my point is that
any DPI that can be bypassed this way is simply outdated; it's far from the
state-of-the-art threats we are facing worldwide.

What China does today is what your ISP/government is going to do tomorrow,
when they upgrade the equipment. A history lesson from China can give
developers in other countries insight into where this cat-and-mouse game is
heading...

> paulryanrogers: So basically it just does two things: carefully chunking
> HTTP header packets and encrypted DNS? Not sure this will work for very
> long.

Of course it will not. I'll explain why.

\---

Literally, the same technique was used in China during the early days of Great
Firewall, around 2007. At that time, the "censorship stack" was simple,
basically, it had...

* A brute-force IP blocking list

This is a constantly updated list of IP addresses of "unwanted" web servers,
such as YouTube or Facebook. They are distributed via the BGP protocol, just
like how normal routing information is distributed. Once your server enters
this blacklist, nothing can be done. Not all unwanted websites enter the list
due to its computational/storage costs.

* A DNS poisoning infrastructure

A list of "unwanted" domain names is maintained. These domain names are
entered into the national DNS root server as records with bogus IP addresses.
It was used more widely than the IP blocklist, since it has zero cost to
operate, but it can only block websites on the list, and it takes time for the
censor to become aware of a target's existence.

* A naive keyword filtering system.

All outgoing international traffic is mirrored for inspection. A keyword
inspection system attempts to match the URLs in HTTP requests against a
blacklist of unwanted keywords. Rumors said the string matching was performed
by hardware in ASIC/FPGA, allowing enormous throughput.

* A TCP reset attack system

Once an unwanted TCP connection is identified by the keyword inspection
system, the TCP reset attack system fires a bogus RST packet at your computer.
Fooled by the packet, your operating system will voluntarily work against you
and terminate the connection, saving the censors' CPU time. The keyword
filtering system paired with the reset attack was the preferred way to carry
out censorship.

That's all. The principle of operation was simple and easy to understand. So
what were the options for bypassing it? There were a lot. To begin with, if an
IP address was on the blocklist, you could do nothing about it directly. But in
the earliest days, reaching a blocked site was as simple as finding a random
HTTP proxy server. Later, the inspection system was upgraded to match HTTP
proxy requests. Then, you could simply play some magic tricks with your HTTP
requests, like the example at the beginning, so that your request wouldn't
trigger a match. Around the same time, in-browser web proxy tools became
popular; they were PHP scripts running on a web server that fetched pages.
However, they became useless when the keyword matching system was upgraded to
match the content of the entire page, not simply the requests (remember, few
sites had HTTPS). At this point, all plaintext proxy techniques and HTTP
request "massaging" techniques were officially dead.

Some naive rot13-like techniques were later implemented in some web proxies,
and HTTPS web proxies were also a thing, but they saw limited use.

* New: A complete keyword filtering system - Inspect all HTML pages (Was: A naive keyword filtering system)

Another target to attack was the DNS poisoning system. Sometimes all you
needed was a correct IP address, since not all IPs were included in the
blocklist due to the costs. Initially, all one needed to do was change one's
nameserver to 8.8.8.8. However, countermeasures were quickly deployed. A
simple countermeasure was rerouting 8.8.8.8 to the ISP's nameserver, which
continued feeding the same bogus records to you. Nevertheless, there were
always alternative resolvers to use. So the system was upgraded to provide a
DNS spoofing infrastructure - the instant an outgoing DNS packet was detected,
the spoofing system would immediately answer with a bogus packet. The real
packet would arrive a hundred milliseconds later, but it would be too late:
your OS had already accepted the bogus result.
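
The race described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a model of any real resolver or of the firewall itself: the names `Answer`, `naive_resolve` and `filtering_resolve`, the addresses (taken from documentation ranges) and the delays are all hypothetical. It shows why "first answer wins" favors the injector, and what a filter that waits briefly and discards known-bogus answers would do differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    ip: str        # address carried in the DNS answer
    delay_ms: int  # time after the query at which the answer arrives


def naive_resolve(answers):
    """Accept whichever answer arrives first (first answer wins)."""
    return min(answers, key=lambda a: a.delay_ms).ip


def filtering_resolve(answers, known_bogus, window_ms=500):
    """Wait for a short window, then return the earliest answer that is
    not on a list of addresses known to be injected (hypothetical list)."""
    in_window = sorted(
        (a for a in answers if a.delay_ms <= window_ms),
        key=lambda a: a.delay_ms,
    )
    for answer in in_window:
        if answer.ip not in known_bogus:
            return answer.ip
    return None  # nothing trustworthy arrived in time


# Illustrative values only.
injected = Answer(ip="203.0.113.7", delay_ms=5)
genuine = Answer(ip="198.51.100.20", delay_ms=100)
answers = [genuine, injected]

print(naive_resolve(answers))
print(filtering_resolve(answers, known_bogus={"203.0.113.7"}))
```

The filter only helps if the list of bogus addresses is accurate and the genuine answer actually arrives, which is one reason such tools had limited reach.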

And ironically, even if DNSSEC had been widely supported (it was not), it
couldn't have done anything but return a SERVFAIL, since DNSSEC can only check
whether a result is authentic; dropping the bogus packet and accepting the
true one was outside the capabilities of a standard DNSSEC implementation.

* New: A Real-time DNS Spoofing System

Better tools were developed later that acted like a transparent resolver
between the upstream resolver and your computer, identifying the bogus results
and dropping them, but their use was limited. Also, at this point, the IP
blocklist had been greatly expanded. Even if a correct IP could be obtained,
it was still inaccessible. Around 2008 or so, a special open source project
was launched by developers in China - an /etc/hosts list. Whenever someone
found a Facebook IP address that was not in the blocklist yet, they sent
patches to the project. There were also shell scripts to keep your list
up-to-date.

An /etc/hosts list was useful, but its usefulness was limited. First, it was
only a matter of time before a new IP address was blocked. Also, a working IP
address was still subject to the same keyword filtering system.

* New: Expanded IP Blocklist.

Some people also realized that the firewall was only able to terminate a
connection by fooling the operating system. Soon, iptables rules for blocking
RST packets appeared in technical blogs. By ignoring all RST packets, one
essentially gained immunity at the expense of network stability, as legitimate
RSTs were also ignored. Soon, the censors responded by upgrading the reset
attack system so that RST packets were sent in both directions - even if you
ignored the RST, the server on the other side would still terminate the
connection. Also, the RST was now "latched on" for a limited time: once the
first RST was triggered, the target remained inaccessible for several minutes.

* New: Bidirectional TCP Reset Attack

* New: "Latched-On" Reset Attack

Once HTTPS was enabled, it was impossible to perform keyword inspection on the
HTML pages. At the time, the censor sometimes still wished to allow partial
access, triggering the block only when a match was detected. This strategy
cannot be applied to HTTPS, since the content is all encrypted. Some people
realized that some popular websites, such as Wikipedia, supported HTTPS but did
not enable it by default. The Great Firewall responded by implementing an HTTPS
certificate matching subsystem in the keyword matching system: when a
particular certificate was matched, you were greeted by a TCP RST packet (this
system was removed later when HTTPS saw widespread use).

* New: Certificate-Based HTTPS Blocking System

At this point, around 2010, the only reliable way to browse the web was using
a fully encrypted proxy, such as SSH dynamic port forwarding or a VPN, which
required purchasing a VPS from a hosting provider. SSH was more popular due to
its ease of use - all one needed was to find an SSH server and run "ssh -D
1337", so that port 1337 would be a SOCKS5 proxy provided by OpenSSH. OpenVPN
was reserved for heavy web users, since it was more difficult to set up but had
better performance.

From the beginning to the 2010s, anyone who was using a VPN or SSH could enjoy
reliable web browsing (only disturbed from time to time by overloaded
international bandwidth). However, the good days came to an end when the Great
Firewall implemented a real-time traffic classifier, which was first applied to
SSH. It observed the SSH packets in real time and attempted to identify whether
overlay proxy traffic was being carried on top of it. The blocking mechanism
was enhanced as well: now it was able to dynamically insert null route entries
when it decided that the communication with a server was unwanted. The IP
blocking system was also improved: now it was able to collect unwanted IP
addresses at a faster rate with the help of the traffic classifier. If you
used SSH as a proxy, after a while the connection would be identified and all
packets dropped; repeated offenses would earn you a permanent IP block. For
VPNs, the firewall implemented a real-time classifier to detect OpenVPN's TLS
handshakes. When a handshake was detected, a RST packet was sent (or, if you
used UDP, all packets were dropped). Repeated offenses would earn you a
permanent IP block as well.

* New: Real-Time Traffic Classifier

* New: Real-Time IP Blocking

* New: Actively Updated IP Blocklist using Classifiers as Feedback

Traffic classifiers would later be expanded to cover HTTPS-in-HTTPS as well,
so a naive HTTPS proxy wouldn't work. They possibly have other features too;
it's a mystery.

BTW, after Google exited from China, the HTTPS version was immediately
blocked, and for HTTP, a ridiculous keyword blocklist was enforced. It
generated a huge number of false-positive RSTs for harmless words, apparently
a deliberate decision, preferring false positives over false negatives.
Eventually, all Google services were permanently blocked. The IP blocking
became extensive: major websites were completely blocked, and unblocked
sites were the exception. For most people, the arrival of widely-used HTTPS
came too late to be useful, since the IPs were blocked. And as mentioned, SSH
and VPNs were classified and blocked as well.

This was when a new generation of proxy tools started to gain popularity,

~~~
segfaultbuserr
Shadowsocks being the most well-known example. From a cryptographic
perspective, it was a big step backwards. Since Diffie-Hellman handshakes were
subject to traffic classifiers, these tools only used symmetric encryption
with fixed keys. Their encryption protocols were ad hoc and not
cryptographically robust. While it was a matter of fact that nobody could
break a simple AES-CBC encryption, nobody would trust these tools with
confidential data either (for example, AEAD was unsupported for many years).
But since the goal was bypassing censorship, not secrecy, they became
extremely popular. It was not seen as a major issue, since the widespread use
of HTTPS offered robust secrecy. DNS encryption was still essential (usually
a SOCKS5 interface was provided by these tools; SOCKS5 can be configured
to pass the original domain name to the proxy, and the proxy can resolve the
name inside its encrypted connection), but it became less useful when used on
its own, since the IP blocklist was huge by that time.
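
To make the remark about SOCKS5 and domain names concrete, here is a minimal sketch of the CONNECT request a client sends after method negotiation when it wants the proxy, rather than the local machine, to resolve the name. The byte layout follows the SOCKS5 specification (RFC 1928); the helper name `socks5_connect_request` and the example host and port are assumptions made for illustration, and this is not taken from any particular tool's source.

```python
import struct


def socks5_connect_request(hostname: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that carries the hostname itself
    (address type 0x03), so no local DNS lookup is needed."""
    name = hostname.encode("idna")
    if not 1 <= len(name) <= 255:
        raise ValueError("hostname must encode to 1..255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    # then a length-prefixed name and the port in network byte order.
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])
    return header + name + struct.pack("!H", port)


# Illustrative target only.
request = socks5_connect_request("www.youtube.com", 443)
print(request.hex())
```

Because the name travels inside the (encrypted) proxy connection, a DNS spoofing system on the local network never sees a query for it.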

The landscape of the Internet has changed dramatically since 2013 as well. The
universal adoption of HTTPS eventually rendered all keyword-based inspection
useless. A few sites were considered too large to block, including Amazon AWS
and GitHub. One side of the battle started becoming a game of mutually assured
destruction - either allowing people to exploit a large platform to
publish uncensored material, or blocking the platform altogether and causing
economic damage. I am confident that the MAD game will continue to play out.
However, Russia's response to AWS domain fronting showed this strategy could
fail if major platforms don't want to cooperate, which is a bit worrying, at
least. But anyway, encrypting SNI should be the next step.

But I digress. Back to Shadowsocks et al.: since the state was eliminated
(pun intended), all one could see was encrypted raw TCP packets, and there was
no reliable way for the firewall to classify Shadowsocks-like tools for many
years (until recently, possibly by exploiting cryptography-related issues,
but we are not sure how successful it is). But the censorship system started
getting weirder and weirder - sometimes connections broke without any
apparent reason at all, sometimes the data rate was extremely low, sometimes a
few IPs were blocked mysteriously, and so on, but life kept going on. There
were several possible hypotheses. One was that the traffic classifiers were
gaining more and more functionality, and occasionally they could hit
something. Another was that TCP RSTs were sent in a probabilistic manner to
suspected endpoints to degrade reliability. The only thing that could be
confirmed was the significantly increased use of QoS by the ISPs, so that all
unknown protocols would be classified as "low priority", degrading the
reliability of all anti-censorship tools. At this point, bad connectivity and
censorship were indistinguishable.

It's safe to say that, at this point, nobody really understands how the Great
Firewall of China works anymore. This is the end of our story.

For simplicity, I skipped many less-used techniques, such as Tor's domain
fronting, CDN-based circumvention, obfs4proxy, which features Diffie-Hellman
keys indistinguishable from random strings, and possibly others. I'm
well aware of them. But it's expected that, unless everything is encrypted and
every infoleak is plugged (then, we will start playing the mutually assured
destruction game), all these tools are playing is an endless cat-and-mouse
game.

Developers of anti-censorship tools need to consider countermeasures based on
what China is currently doing, so that when the same techniques are
implemented by their own ISPs in the future, they are prepared to act.

~~~
d4mi3n
Fantastic breakdown on the recent history of censorship in China, thanks for
sharing it.

You mentioned that for many of these efforts bypassing censorship trumped
secrecy concerns. Is this still the case?

If I were a citizen regularly bypassing censorship of an authoritarian
government, I’d be concerned for my safety if it was well documented that I
regularly accessed censored material.

~~~
SZJX
From what I gather, the regime doesn't really intend to arrest anybody who
simply regularly _accesses_ western websites. Some big corps also have their
special VPN channels to access foreign websites so that they can do business
normally. Hell, even the foreign ministry spokesperson posts regularly on
Twitter. What they want is to stop this floodgate of information being opened
to the masses; that's when things could get problematic.

People are arrested for _producing_ things that are deemed potentially
destabilizing for the regime/country, but nobody as far as I know ever got
arrested for _accessing_ blocked materials.

Of course, if you are also actively producing content it would be much wiser
to camouflage your identity much better, if you can. That's when the secrecy
becomes a major concern.

------
1996
A simple countermeasure at the ISP level: a buffer to merge 'www.you' to
'tube.com'

A far greater danger is DPI that uses statistical analysis to detect possible
tunnels. You want your traffic to be as close as possible to normal traffic.
There is no perfect solution there. The current best is generating valid
images with a hidden data payload (to download), and generating pseudo text
posted on public forums or email (to upload), while limiting the
download/upload ratio by downloading random content if necessary, as most
people download far more than they upload.

It works best when using "known" websites like Gmail (draft folder) or
Facebook (Messenger), as all the traffic goes to a whitelisted host and looks
like regular usage.

~~~
gruez
>A simple countermeasure at the ISP level: a buffer to merge 'www.you' to
'tube.com'

addressed here:
[https://news.ycombinator.com/item?id=22656122](https://news.ycombinator.com/item?id=22656122)

>A far greater danger is DPI that uses statistical analysis to detect possible
tunnels.

That's only an issue if there's a blanket ban on tunneling/proxies. While it's
a problem in authoritarian regimes (e.g. China, Kazakhstan), it's not an issue
in most Western countries. I haven't heard of any Western countries banning
VPNs (yet).

>The current best is [...]

The timing information would still be suspicious. Most people aren't
constantly checking their gmail/facebook multiple times a second, but normal
browsing would generate packets with that frequency. It's really only
undetectable if you're sending/receiving messages (eg. IM or email). A better
candidate might be multiplayer game traffic. They provide a consistent stream
of bits[1] to hide data in. If you're willing to set your tunnel's bandwidth
to a few kilobytes a second (throttling if there's too much data, sending
decoy packets if there's too little), it'd be very hard to detect any
anomalies.
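
The constant-rate idea above (throttle when there is too much data, send decoys when there is too little) can be sketched as a simple traffic shaper. This is a hedged illustration, not code from any existing tunnel: `FRAME_SIZE`, `enqueue`, `shape` and the two-byte header layout are all made up for the example, and a real tunnel would also encrypt each frame so that decoys and data look alike on the wire.

```python
from collections import deque

FRAME_SIZE = 64  # bytes per frame on the wire; illustrative value


def enqueue(queue: deque, data: bytes) -> None:
    """Cut outgoing data into chunks that fit a frame after its 2-byte header."""
    step = FRAME_SIZE - 2
    for i in range(0, len(data), step):
        queue.append(data[i:i + step])


def shape(queue: deque, frames_per_tick: int) -> list:
    """Emit exactly frames_per_tick fixed-size frames for one time slot.

    Real data is used while the queue has any (excess data simply waits for
    a later tick, which is the throttling); otherwise a decoy frame is sent.
    """
    frames = []
    for _ in range(frames_per_tick):
        if queue:
            chunk = queue.popleft()
            frame = bytes([1, len(chunk)]) + chunk  # flag 1 = real payload
        else:
            frame = bytes([0, 0])                   # flag 0 = decoy
        frames.append(frame.ljust(FRAME_SIZE, b"\x00"))
    return frames


queue = deque()
enqueue(queue, b"x" * 150)           # more than one tick can carry
first_tick = shape(queue, 2)         # two real frames, one chunk left queued
second_tick = shape(queue, 2)        # one real frame, then one decoy
print([len(f) for f in first_tick + second_tick])
```

Every tick produces the same number of same-sized frames regardless of how much real data is waiting, which is what removes the volume and timing signal; the price is a hard cap on throughput.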

[1] random search:
[https://youtu.be/8Kvj5TZNNJ4?t=1080](https://youtu.be/8Kvj5TZNNJ4?t=1080)

~~~
1996
> Defeating chunking would require additional memory + compute power on the
> DPI boxes, which I suspect ISPs don't want to bear.

It depends. ISPs may be willing to spend more if they gain more or are forced
by governments to do so.

Even as is, the proposed method is still too easy to defeat, especially with
IP bans: if the ISP really doesn't want to let youtube.com work, all the A and
AAAA records will be blacklisted.

> The timing information would still be suspicious. [...] It's really only
> undetectable if you're sending/receiving messages

Indeed, so the suggestion was to use the draft folder and FB messenger.

A better method would rotate the whitelisted websites, like using mostly Gmail
for 20 minutes, then Facebook for 1h, etc., and of course only "on demand" so
that traffic does not occur 24/7.

For multiplayer games, the audio channel already provides a very simple way
to stream more than a few kb per second.

------
paulryanrogers
So basically it just does two things: carefully chunking HTTP header packets
and encrypted DNS?

Not sure this will work for very long.

~~~
gruez
>Not sure this will work for very long.

Maybe if it gets popular. Defeating chunking would require additional memory +
compute power on the DPI boxes, which I suspect ISPs don't want to bear.

~~~
jaimex2
I work in the DPI field and have maintained a few DPI firewalls.

Most DPI that I know of will defeat this bypass technique. I'm not sure the
author has even tested whether it works.

DPI firewalls already have to support aggregating packets. It's pretty common
to need more information beyond the initial packet. It's not really any more
memory intensive either, you're just reading byte by byte and keeping what you
need.
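
As a rough illustration of "reading byte by byte and keeping what you need", here is a minimal sketch of a keyword matcher that survives a keyword being split across segments. It is not how any specific DPI product works; the class name `StreamMatcher` is hypothetical, and the sketch assumes segments are fed in order for a single flow.

```python
class StreamMatcher:
    """Find a keyword in a byte stream that arrives in arbitrary pieces."""

    def __init__(self, keyword: bytes):
        if not keyword:
            raise ValueError("keyword must not be empty")
        self.keyword = keyword
        self.tail = b""  # at most len(keyword) - 1 bytes carried over

    def feed(self, segment: bytes) -> bool:
        """Return True if the keyword ends somewhere inside this segment."""
        window = self.tail + segment
        found = self.keyword in window
        keep = len(self.keyword) - 1
        # Keep only the bytes that could still be the start of a match.
        self.tail = window[-keep:] if keep else b""
        return found


matcher = StreamMatcher(b"youtube.com")
print(matcher.feed(b"GET / HTTP/1.0\r\nHost: www.you"))  # keyword incomplete so far
print(matcher.feed(b"tube.com\r\n\r\n"))                 # completed by this segment
```

The per-flow state here is only ten bytes for an eleven-byte keyword, which is consistent with the point that reassembly-aware matching need not be memory intensive.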

Heck, most DPI firewalls support checking that something in the outbound
packets is also in the inbound packets, i.e. checking whether a connection is
performing IKE.

------
kburman
Works with You Broadband, India. Thanks a lot man!

------
__sy__
This is great if there isn't a blanket ban on VPNs, but unfortunately, it
won't work in China. I've had next to zero luck keeping my own VPN tunnels
open for more than a couple of days at a time when behind the Great Firewall.

~~~
hrdwdmrbl
Check out v2ray. You'll need to have your own domain, server, and a cloudflare
account, but in terms of speed it is unmatched. Unfortunately many of the best
tutorials are in Chinese.

~~~
bscphil
Looks interesting. From
[https://www.v2ray.com/en/index.html](https://www.v2ray.com/en/index.html) it
seems that it's "just" a VPN protocol / software that can tunnel over TLS. I
assume the point of using your own server + Cloudflare is that it breaks IP
based blocking of most VPN providers. I guess just your own server without
Cloudflare would work fine for a while, but they probably have heuristics for
a lot of encrypted traffic sent to a single unknown server?

The remaining question for me is about the TLS part of all this. Does China
not have agreements with most external services about stripping TLS such that
a lot of TLS traffic would be suspect? Or do they not mandate their citizens
to use a Government provided root cert that would allow them to "securely"
MITM connections? That would be how I'd do it if I were an authoritarian
government.

If not, then what's their plan for the future? I could see a Firewall kind of
mostly working for now on a combination of DNS, IP, and SNI filtering, but
_all three_ are going away in the near term: DNS with DNS-over-HTTPS, SNI with
eSNI, and IP blocking has already become less plausible through routine use of
proxies like Cloudflare.

~~~
tgragnato
[https://www.scmp.com/news/china/politics/article/3030563/big...](https://www.scmp.com/news/china/politics/article/3030563/big-
data-expert-takes-over-chinas-new-cybersecurity-chief)

They want to make the networks transparent to the government, and apply
machine learning for understanding the data and warnings the monitoring system
will provide.

You either provide decryption keys, or your traffic will be dropped.

~~~
bscphil
Yeah, that's what I figured would happen next. It's honestly very difficult to
defend against an adversary that nakedly aggressive. It's like trying to
browse the Internet privately on your computer at your desk at the major IT
firm you work at.

------
jogundas
Is that a SOCKS proxy, or just an HTTP proxy? The GitHub readme does not make
that clear.

------
benbristow
Doesn't work against Virgin Media UK

~~~
Nextgrid
Virgin is censoring things now?

~~~
buckminster
All the big UK ISPs do. This is targeted at CP but who knows what else gets
covered.

~~~
Nextgrid
The Virgin Media list above has some very benign content blocked (movies,
sports and even a Nintendo cheats/hacks site).

This is concerning and I'm thankful I left that terrible (for a lot of other
reasons) company long ago (my current ISP doesn't seem to be blocking anything
from their list).

------
oedmarap
It's really good to see more tools in the privacy space, resulting in more
options and fallbacks for the end-user.

I think the author should also try to market this as much as possible outside
of the HN crowd, since this seems targeted at non-tech users — I could be
wrong but my reasoning is that HN users who care about privacy would prefer a
combination of a VPN and DoH to defend against traffic & DNS inspection,
respectively.

~~~
yjftsjthsd-h
It really depends on your threat model, but if you have a full tunnel VPN,
then encrypted DNS is a lot less important.

------
dontdieych
It's working very well against South Korea, KT(ISP)

------
Thorentis
> We send it in 2 parts: first comes GET / HTTP/1.0 \n Host: www.you and
> second sends as tube.com

How does this even work? "www.you" won't return a valid HTTP response and
"tube.com" won't either. How can you fetch the content at "youtube.com" by
splitting the domain name in half? Won't you get two completely wrong
responses that don't fit together?

~~~
detaro
It's split across two network packets. It's still one request for the web
server.
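
A small sketch may help show why the split is harmless for the server but hides the keyword from a filter that looks at each packet in isolation. The helper `split_inside` is hypothetical and is not GreenTunnel's actual code; it only demonstrates the byte-level idea from the quoted README example.

```python
def split_inside(request: bytes, keyword: bytes) -> tuple:
    """Cut the request in the middle of the first occurrence of keyword."""
    start = request.find(keyword)
    if start < 0:
        raise ValueError("keyword not present in request")
    cut = start + len(keyword) // 2
    return request[:cut], request[cut:]


request = b"GET / HTTP/1.0\r\nHost: www.youtube.com\r\n\r\n"
first, second = split_inside(request, b"youtube")

# A filter inspecting each piece on its own finds the keyword in neither.
print(b"youtube" in first, b"youtube" in second)

# TCP hands the server one contiguous byte stream, so after reassembly the
# server sees exactly the original request.
print(first + second == request)
```

In practice the two pieces would be written with separate `send()` calls on the socket; whether they really leave the machine as separate TCP segments depends on the operating system and on options such as `TCP_NODELAY`, which this sketch does not cover.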

~~~
est
It might also add another fake packet that confuses the DPI.

------
the_resistence
Need some more insight into proper use. I couldn't get it to work in mainland
China.

------
hrdwdmrbl
How does this differ from Jigsaw's Outline, Shadowsocks or V2Ray?

------
terrycody
Has anyone tested whether this tool works in China or not?

------
david_draco
Why not use Tor?

~~~
hrdwdmrbl
Really depends on your use-case. Tor is great but easily detectable and thus
blockable, and though its speed has gotten much better, it isn't as fast as
other options. But again, it really depends on what your goal is.

~~~
jaimex2
Tor has obfuscation options, they work well.

The default endpoint list is usually blocked though, as it's published to the
clients, so you have to request an off-list endpoint.

