
Deep packet inspection is dead, and here's why (2017) - ogig
https://security.ias.edu/deep-packet-inspection-dead-and-heres-why
======
rocqua
I'm worried about this development.

On the one hand, ubiquitous encryption is simply required for security on the
internet. Things like Let's Encrypt and warnings on plain HTTP are great improvements.

On the other hand, the owner of a network has some right to look into the
packets on that network. Especially if the owner of the network also owns the
end-points of that traffic. My main use case here isn't corporate networks;
snooping there makes me uncomfortable.

Really, my issue is stuff on my own network. I want to see what my TV sends
home. Same with an Amazon Echo, or really any IoT thing. Yet, if they all use
SSL and don't allow me to add a root CA, I can't look at what they send.

A user has no control over an Amazon Echo. You can't modify the software
because the bootloader is locked down. You can't inspect the traffic because
it is SSL cert-pinned. Amazon can push updates to it at any time. All a user
gets to do is decide whether it is turned on, and whether it gets a network
connection.

Really, what I would want to see is the option to install a CA cert on any
device I own. At the same time, that is a terrible idea. Every 14-year-old
with Google is going to find some Stack Overflow answer that'll tell them to
MitM their TV to do some simple thing.

~~~
ryukafalz
The implication here is that you can't trust the devices on your network. IMO
that's itself a problem; rather than weakening encryption to enable network
owners to analyze traffic on their network (which also harms dissidents who
need secure network access), I would prefer a push for more trustworthy
devices.

The devices we own should be acting in our own best interest; we shouldn't
need to treat them as adversaries.

~~~
rocqua
> The implication here is that you can't trust the devices on your network.
> IMO that's itself a problem;

A large part of trust is auditability. And a large part of auditing a device
is looking at its communication. What we are kind of running into is the
security implications of a debug-interface.

Your point about dissidents needing secure access is a strong one. Any way to
MitM a connection for audit purposes can be repurposed into a MitM for
surveillance given enough coercion. However, I don't think this applies to a
network connection between my TV and Samsung.

Once a surveillor coerces me to give them access, I can disconnect my TV from
the internet. The same goes for an Amazon Echo, or a juice press with WiFi. It
is different with WhatsApp, Facebook and internet banking. These are much more
essential, so it is more important that they are hard to MitM even if the
user gives consent.

~~~
ryukafalz
>A large part of trust is auditability. And a large part of auditing a device
is looking at its communication. What we are kind of running into is the
security implications of a debug-interface.

Sure, but you can also look at the device's communication if you have
administrative access to it yourself, rather than trying to MITM it. I suppose
this could be a vulnerability itself if done poorly, but users have
administrative access to their PCs; why should other devices they own be any
different?

>These are much more essential, so it is more important that it is hard to
MitM them even if the user gives consent.

If the user gives consent, they should be allowed to inspect the connection.
The device's UI can make it _very clear_ that they're about to do something
dangerous, but it's their device and should be their choice.

------
helen___keller
I think a more correct title would be "Deep packet inspection should be dead,
and here's why"

Schools, financial institutions, and more will pay big bucks to web gateway
vendors who will help them deploy man-in-the-middle attacks on their own
machines, employ blacklists or whitelists (even on Google search terms, not
just at the DNS level), scan traffic for SSNs, and so on. It's not a dead
market (quite the opposite: startups like Zscaler are fetching unicorn
valuations).

It also encourages terrifying but legal behavior for employers like monitoring
which subreddits you read or what kind of YouTube videos you watch or how much
time you spend slacking off at work.

The arms race between security and exploitation isn't likely to stop, and I
have no confidence that corporations with sensitive data will willingly take a
privacy-granting approach when vendors promise them unmatched security by
decrypting traffic.

I think the two viable approaches are educating the public that your work
machine is not private, or looking for lawmakers to step in (but let's be
real, that option is unlikely).

During my time working for one of these web gateway vendors, I became highly
sensitive to what browsing happened on my primary operating system (which had
company certificates installed), and what went on my development VM (which I
set up myself without corporate certificates).

~~~
dillz
My workplace has such a MitM gateway, where every host has a company root CA
installed and every SSL certificate we receive in the browser is a swapped-out
one. Fair enough.

However, the huge problem is that employees are completely left in the dark
about this privacy invasion... only the tech-savvy ones notice and understand
it.
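For anyone who wants to check their own machine: the giveaway is the issuer on the certificates you're served. A minimal stdlib-only sketch (the corporate CA name in the test data is invented):

```python
import socket
import ssl

def issuer_org(cert_dict):
    """Extract the issuer's organizationName from the dict format returned
    by ssl.SSLSocket.getpeercert()."""
    for rdn in cert_dict.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None

def fetch_issuer(host, port=443):
    """Connect to a host and report which CA signed the certificate we were
    actually shown. Behind a TLS-intercepting gateway, this names the
    corporate CA for every site instead of a public CA like Let's Encrypt."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_org(tls.getpeercert())
```

On a clean network `fetch_issuer("www.example.com")` names a public CA; on an intercepted one, every host comes back signed by the same company CA.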

------
mabbo
A few years ago, one of the best managers I ever worked for left to become the
CTO of a company doing pattern analysis of network traffic, rather than Deep
Packet Inspection. The premise was that most of the internet traffic on your
network follows the same typical patterns, but nefarious traffic doesn't. Drop
their system into the network and voila, you can start to find the weird
things going on that seem out of the ordinary.

At the time, I thought that it seemed a bit heavy-handed: just use DPI and
you'll get the same results. This article is making me think he was very
prescient in the matter.

~~~
m-app
This is exactly what has been researched at multiple security companies and
productized by Cisco under "Encrypted Traffic Analytics". This is based on
research from 2016 that can be found on arXiv:
[https://arxiv.org/abs/1607.01639](https://arxiv.org/abs/1607.01639)

> We conclude that malware's usage of TLS is distinct from benign usage in an
> enterprise setting, and that these differences can be effectively used in
> rules and machine learning classifiers.

Disclaimer: I work for Cisco
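For a rough idea of what such classifiers consume, here is a toy sketch of flow-metadata features (packet-length summary statistics plus payload byte entropy). The paper's real feature set is richer, and the classifier training step is omitted:

```python
import math
from collections import Counter

def flow_features(packet_lengths, payload):
    """Toy feature vector over an encrypted flow: no decryption needed,
    only observable metadata. A real system would feed many such vectors,
    labeled benign/malware, into a machine learning classifier."""
    n = len(packet_lengths)
    mean = sum(packet_lengths) / n
    var = sum((length - mean) ** 2 for length in packet_lengths) / n
    # Shannon entropy of the byte-value distribution of the payload.
    counts = Counter(payload)
    total = len(payload)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {"num_packets": n, "mean_len": mean,
            "std_len": var ** 0.5, "byte_entropy": entropy}
```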

~~~
vlovich123
Neat paper, but as soon as this becomes more widespread, malware authors are
going to adapt to hide as regular traffic, so the analysis will get more and
more complex until it's no longer useful, with malware traffic looking
indistinguishable from real traffic.

This is a fundamental evolutionary cat & mouse game that's impossible to win:
antibiotics & bacteria, toxins in prey & toxicity resistance in predators,
etc.

~~~
ap0phenia
& autoimmune diseases...

------
lpcvoid
The author suggests towards the end to analyze DNS queries, but that's well
on its way [1] to being encrypted too (finally).

[1]
[https://wiki.mozilla.org/Trusted_Recursive_Resolver](https://wiki.mozilla.org/Trusted_Recursive_Resolver)

~~~
dstjean
In a corporate environment, managed devices can be configured to force the use
of specific DNS settings. The same type of implementation (MITM) could be used
to analyse the requests.

That being said, this is at the OS level. An app such as Firefox could still
override those settings or provide its own implementation.
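A sketch of why plaintext DNS is so attractive to middleboxes: the queried name sits in the clear in every packet. A minimal builder/parser pair for the RFC 1035 wire format (no name compression, question section only):

```python
import struct

def build_query(hostname, qid=0x1234):
    """Build a minimal plaintext DNS A-record query (RFC 1035 wire format)."""
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)  # RD=1, 1 question
    qname = b"".join(bytes([len(p)]) + p.encode() for p in hostname.split("."))
    return header + qname + b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, IN

def queried_name(packet):
    """Read the hostname out of a plaintext DNS query -- exactly what an
    on-path box sees, and what DNS-over-HTTPS hides from it."""
    labels, i = [], 12  # question section starts after the 12-byte header
    while packet[i]:
        n = packet[i]
        labels.append(packet[i + 1:i + 1 + n].decode())
        i += 1 + n
    return ".".join(labels)
```

With DoH, that whole exchange rides inside TLS to the resolver, so `queried_name` has nothing to parse.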

~~~
jopsen
If your IT department is your adversary you should get a new job. Or at least
use a personal device for personal matters :)

~~~
dstjean
I don't think NOT performing packet inspection due to privacy concerns is a
good idea. (Good security controls should exist over its administration.)

One reason why organizations use packet inspection is to protect their staff,
customers and vendors from malicious actors who could cause data breaches
leading to huge privacy issues.

Privacy over security? The right balance must be found.

~~~
pjc50
Of course, this means the packet inspection host and the organisation's
internal CA are now _great_ targets to attack. This approach puts all the eggs
in a single central basket.

~~~
dstjean
IMO relying 100% on the end devices to protect themselves is too risky.
Layered security seems to work best. Also, I'd rather heavily monitor/secure
two appliances/systems than thousands of end devices.

~~~
zrm
> Layered security seems to work best. Also I prefer to heavily monitor/secure
> two appliances/systems than heavily monitor thousands of end devices

It can't be both at once. Either you have multiple layers because both the
appliance and the endpoints are independently secure and the attacker has to
compromise both, or you don't monitor/secure the individual endpoints and the
appliances become a single layer / single point of compromise.

And if the appliances can see all the plaintext of everything then they're a
single point of compromise even if the endpoints are otherwise secure, because
the attacker can still read all the secrets through the man-in-the-middlebox.

What works is to leave each thing to what it's good at. The endpoints are good
at inspecting the plaintext, because they inherently have to have it anyway
and they have the context to understand what it's supposed to look like. So
you don't end up interfering with a newer, more secure protocol because the
middlebox doesn't understand it. And plaintext is sensitive data so the fewer
things that have access to it the fewer things you can compromise to get
access to it.

What middleboxes are really good at is certain types of access control, e.g.
blacklisting malicious IP addresses for outgoing connections, or whitelisting
source and destination addresses and ports for incoming connections. They keep
your local IP cameras off the internet even if the cameras "should" be secure
on their own.
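That kind of no-decryption control is simple to sketch: a toy first-match egress filter of the sort a middlebox enforces (the addresses and rules here are invented for illustration):

```python
def egress_allowed(src, dst, port, rules):
    """First-match egress filter: decide whether a connection may leave the
    network, based only on addresses and ports -- no decryption involved."""
    for rule in rules:
        if (rule["src"] in (src, "*") and
                rule["dst"] in (dst, "*") and
                rule["port"] in (port, "*")):
            return rule["action"] == "allow"
    return False  # default deny

# Let the IP camera talk to the local NVR and nothing else;
# everything else on the LAN may make outbound HTTPS connections.
RULES = [
    {"src": "10.0.0.50", "dst": "10.0.0.10", "port": 554, "action": "allow"},
    {"src": "10.0.0.50", "dst": "*", "port": "*", "action": "deny"},
    {"src": "*", "dst": "*", "port": 443, "action": "allow"},
]
```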

------
kijin
Deep packet inspection seems to be alive and well, even outside of corporate
networks.

My ISP uses the User-Agent header in outgoing requests to guess how many
computing devices I have at home, and tries to charge money if it's more than
an undisclosed limit. This of course only works for plain HTTP, but there are
still enough unencrypted sites out there that my ISP has an opportunity to
intercept a request at least a couple of times a day.
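The device-counting trick is trivial to sketch: count distinct User-Agent headers in plaintext HTTP requests (the headers below are invented for illustration). HTTPS removes this signal entirely:

```python
def distinct_user_agents(requests):
    """Count distinct User-Agent headers seen in plaintext HTTP requests --
    a rough proxy for how many devices sit behind one home connection.
    Only works on unencrypted HTTP; TLS hides the headers from the ISP."""
    agents = set()
    for raw in requests:
        for line in raw.split("\r\n"):
            if line.lower().startswith("user-agent:"):
                agents.add(line.split(":", 1)[1].strip())
    return len(agents)
```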

Meanwhile, my country is just beginning to roll out a system that detects the
SNI hostname in encrypted connections, in order to block illegal sites that
hide behind Cloudflare. Fortunately they can't spoof certificates on the
public internet, so users just get a connection error. Too bad Cloudflare
supports ESNI now ;)
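The SNI field such a system matches on sits unencrypted in the TLS ClientHello. A rough sketch that builds a minimal ClientHello and reads the hostname back out the way a filtering box would (field layout per RFC 6066; ESNI/ECH encrypts exactly this field):

```python
import os
import struct

def client_hello(hostname):
    """Build a minimal TLS ClientHello whose only extension is SNI."""
    name = hostname.encode()
    # ext type=0 (server_name), ext len, name-list len, type=host_name, name len
    sni = struct.pack(">HHHBH", 0, len(name) + 5, len(name) + 3, 0, len(name)) + name
    body = (b"\x03\x03" + os.urandom(32)          # client_version + random
            + b"\x00"                             # empty session id
            + b"\x00\x02\x13\x01"                 # one cipher suite
            + b"\x01\x00"                         # null compression
            + struct.pack(">H", len(sni)) + sni)  # extensions block
    hs = b"\x01" + len(body).to_bytes(3, "big")   # handshake: client_hello
    return b"\x16\x03\x01" + struct.pack(">H", len(hs) + len(body)) + hs + body

def sni_hostname(record):
    """Pull the plaintext SNI hostname out of a ClientHello -- the field a
    censoring middlebox matches on without any decryption."""
    i = 5 + 4 + 2 + 32                                  # record hdr, hs hdr, version, random
    i += 1 + record[i]                                  # session id
    i += 2 + int.from_bytes(record[i:i + 2], "big")     # cipher suites
    i += 1 + record[i]                                  # compression methods
    end = i + 2 + int.from_bytes(record[i:i + 2], "big")
    i += 2
    while i < end:                                      # walk the extensions
        ext_type = int.from_bytes(record[i:i + 2], "big")
        ext_len = int.from_bytes(record[i + 2:i + 4], "big")
        if ext_type == 0:                               # server_name
            name_len = int.from_bytes(record[i + 7:i + 9], "big")
            return record[i + 9:i + 9 + name_len].decode()
        i += 4 + ext_len
    return None
```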

~~~
rcarmo
Where do you live (if you can share the country name, of course)?

~~~
kijin
South Korea.

Nobody is under any threat of prosecution for talking about our ridiculous
censorship regime, and the surveillance side of the program is probably no
worse than in any other developed country.

Which isn't much of a compliment, but at least we're not China-level evil --
just incompetent. DPI for blocking SNI hostnames is a particularly annoying
way to waste taxpayers' money. It's almost as if they timed it to coincide
with wide availability of DoH and ESNI!

------
adrianratnapala
This sort of development seems good, not exactly from a moral point of view,
but from the point of view of long-term reliability of the internet.

The IP protocols have some expectation of end-to-end packet delivery. Over
time we found ways in which networks could be kept "working" with this
requirement relaxed. Except that what was known to "work" was just whatever
had been tested by the manufacturers of various middle-boxes, making change
and development of new ways of solving problems harder than it should be.

The less visibility middle-boxes have into what the traffic is, the less
they are able to selectively screw things up, and the internet will be more
reliable for it.

------
anonymousisme
It's not dead. Encryption has (unjustifiably) pushed the enterprise to install
fake catchall certificates on proxies so they can snoop plain-text traffic.
(Why anyone would ever think this is a good idea is beyond me.)

~~~
jandrese
How else are you going to catch APT (Advanced Persistent Threat) data
exfiltration/control channel traffic?

Assumption 1: Machines on your network are already compromised and fully owned
by a sophisticated and extremely difficult to detect rootkit. This is true of
every large business. There is always that guy who will click on any link or
open the document from what appears to be their co-worker.

Assumption 2: APT tries to disguise their traffic as ordinary web traffic,
because anything else is suspicious.

Assumption 3: You have massive legal liabilities if your data is exfiltrated.

Being able to do DPI and pattern matching on all TLS traffic (and firewall off
anything you can't DPI) is pretty much mandatory.

~~~
zrm
> There is always that guy who will click on any link or open the document
> from what appears to be their co-worker.

Which is another reason why DPI is ineffective. The smart malware will
identify when its connection is presenting a custom root certificate rather
than the expected one and not proceed with its suspicious activities (if not
deploy some kind of steganography). Then the same "that guy" will plug his
personal phone into his computer, and now the malware has an unmonitored
cellular data connection to the outside on a machine that's also connected to
the internal network. Or a compromised laptop will hook up to the WiFi of the
company on the adjacent floor or the coffee shop next door, or the user
connects it to the coffee shop WiFi when they're in the coffee shop.

In theory you can build a Faraday cage around your space and then strip-search
employees for digital devices at the door, but if your data is _that_
important then you probably ought to just not be connected to the internet at
all.

~~~
jandrese
I'd argue that those examples are a higher bar to hurdle than failing to
recognize a spear phishing attack, and can be mitigated by solutions like
always-on VPN.

And if the malware doesn't work because it has certificate pinning, well,
that's a win too. It's not a 100% solution, but you can significantly raise
the bar on your attackers.

~~~
zrm
> I'd argue that those examples are a higher bar to hurdle than failing to
> recognize a spear phishing attack, and can be mitigated by solutions like
> always-on VPN.

The theory behind TLS MITM is that it's an extraordinary and dangerous method
that could be justified if sufficiently effective. If there are a dozen common
ways to route around it, the risk is more than the benefit.

A VPN can't fix it because a compromised endpoint would be able to choose
which traffic it sends over the VPN. Whereas if the endpoint isn't fully
compromised then you could be doing whatever scanning is being done by the
middlebox on the endpoint itself, without centralizing on a single point of
compromise for the entire network.

> And if the malware doesn't work because it has certificate pinning, well,
> that's a win too.

It may not make an outside connection, but that doesn't mean it doesn't work.
It could still infect every machine on your internal network. What's the
chance that none of them are ever in range of a public WiFi?

This is before even getting to the issue of steganography. Information theory
says that if your legitimate communications contain zero entropy then they can
be encoded into zero bits, i.e. you don't need a network connection at all,
but if they contain nonzero bits of entropy then an attacker can encode that
much arbitrary data into the stream and still be indistinguishable from
legitimate data.
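A toy illustration of that encoding argument: if an observer considers k cover items equally plausible, each choice smuggles log2(k) hidden bits while looking like ordinary traffic (the cover URLs below are invented):

```python
import math

# Four equally plausible "cover" requests; picking one leaks 2 hidden bits.
COVERS = ["/news", "/sports", "/weather", "/tech"]

def encode_covert(bits, covers=COVERS):
    """Encode a hidden bitstring as a sequence of innocuous-looking choices.
    Each choice carries log2(len(covers)) bits; a short final chunk is
    zero-padded."""
    k = int(math.log2(len(covers)))
    return [covers[int(bits[i:i + k].ljust(k, "0"), 2)]
            for i in range(0, len(bits), k)]

def decode_covert(seq, covers=COVERS):
    """Recover the hidden bits from the observed sequence of choices."""
    k = int(math.log2(len(covers)))
    index = {c: j for j, c in enumerate(covers)}
    return "".join(format(index[c], "0{}b".format(k)) for c in seq)
```

To anyone watching, the encoded sequence is just a plausible browsing pattern; only the endpoints assign it a second meaning.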

So the whole thing is inherently a cat and mouse game. A lazy attacker may use
a data pattern that isn't found in the legitimate data and then a middlebox
vendor may find it and use that to distinguish their traffic, but as soon as
they do the attacker can stop using it. The longer the game is played, the
better the attackers get at making their data indistinguishable and the fewer
remaining undiscovered ways to distinguish it that it's possible to find. In
the limit there are none left: the data is completely identical to legitimate
data with the same amount of entropy, and the attacker is only assigning
different meaning to it at the endpoints.

------
rcarmo
There was a pre-2010 burst of interest in DPI in the carrier world, back when
they thought it would be feasible to bill different kinds of traffic
separately (i.e., beyond zero-rating traffic to their walled gardens).

That led to an arms race among core networking vendors to push out all sorts
of traffic sniffing and policing with insane degrees of intrusion that made me
quite uneasy (I worked in core network planning), and it's been a relief to
finally see Let's Encrypt take hold and TLS become de rigueur.

I do have some qualms about the way lawful interception can be abused (in
general) and occasionally ponder how far those vendors may have progressed in
MITM, though - carriers and exchange points are not as secure as they should
be (in sometimes surprising ways), and back then finding bugs in carrier
equipment was relatively frequent.

I wonder what it's like now that most of it is actually Linux VMs running
someplace in their ancient datacenters.

~~~
Scoundreller
The other interest from carriers is protecting their media interests.

Slowing torrents or streaming video directly helps maintain their "golden age
of double dipping": running data over lines paid for by audio/video
infrastructure.

~~~
rcarmo
The "policing" bit was actually about doing that. Strategies varied, from
smooth shaping to randomly dropping packets to force TCP window resets and
drastically lower throughput.

------
jimmychangas
Not related to the core of the article, but it taught me that I can pipe
random gibberish (such as tcpdump output) to the audio output, and I'm finding
it amazing.

------
shaklee3
Luca Deri, the author of nDPI, gave an excellent talk on this topic at the
DPDK summit in December. The techniques they have to use now to apply
heuristics to HTTPS are really cool:

[https://www.youtube.com/watch?v=4Vp8-UONhmM&t=0s&index=17&li...](https://www.youtube.com/watch?v=4Vp8-UONhmM&t=0s&index=17&list=PLo97Rhbj4ceISWDa6OxsbEx2jBPaymJWL)

------
75dvtwin
I think (and hope) that the next big thing (after HTTPS) will be VPNs by
default (and independent of the internet service provider).

By default, nobody, and I mean nobody, needs to know one's home IP address,
period. And nobody needs to know what sites a person visits, or when.

So not only should DPI go away, but also IP-address-based
blacklisting/whitelisting, tracking/advertising and so on.

------
jordan314
This sent me on a spiral of checking for MITM connections on my machine. You
can compare the fingerprints of known sites with this list on this site:
[https://www.grc.com/fingerprints.htm](https://www.grc.com/fingerprints.htm)
Though I think the Facebook one is wrong (the one I see starts with BD 25 8C
for SHA-1).
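If you want to run the comparison yourself, a stdlib-only sketch of computing the colon-separated fingerprint from the PEM a server hands you:

```python
import base64
import hashlib
import ssl
import textwrap

def fingerprint(pem_cert, algo="sha256"):
    """Colon-separated hex fingerprint of a certificate, in the form sites
    like grc.com/fingerprints.htm display. Feed it what the server actually
    sent, e.g. ssl.get_server_certificate(("www.example.com", 443)); if the
    result differs from the published one, something is in the middle."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)  # strip PEM armor, get raw DER
    digest = hashlib.new(algo, der).hexdigest().upper()
    return ":".join(textwrap.wrap(digest, 2))
```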

------
chrischen
Cool, so how would I use these to circumvent the Great Firewall of China with
my SOCKS tunnel?

------
xer
DPI is just one tool in the toolbox. It's never gonna die.

------
mimixco
TL;DR = Because encryption.

~~~
suff
TL;FFS = Except for SSL inspection software.

~~~
mimixco
Not SSL, encrypted content. Deep packet inspection won't help you if there's
only encrypted data inside the packet.

------
suff
The author is dead wrong. Products exist today that perform DPI on SSL
streams:
[https://www.a10networks.com/resources/articles/ssl-inspection-decryption-cisco-asa-firepower](https://www.a10networks.com/resources/articles/ssl-inspection-decryption-cisco-asa-firepower)

~~~
Florin_Andrei
The author does mention that's doable if you break the SSL tunnel. They also
mention some ethical issues with doing that.

------
yholio
Breaking TLS so you can do deep packet inspection is like a lifeguard throwing
people in the water during winter so he can save them.

~~~
hunter2_
Or a lifeguard blowing their whistle at folks who specifically used the "no
lifeguard on duty" beach so they could swim out far.

------
bawana
I did not realize that Squid could provide false certificates on the fly. The
whole business of invalid certificates made people nervous about some sites.
Now someone can sit in a Starbucks with a Squid proxy in the middle and
harvest everything, regardless of SSL encryption. Looking at the little lock
in the URL bar means nothing against a MITM running Squid. Will a VPN protect
me by encrypting everything from my machine, so that a Squid in the middle
will be thwarted?

~~~
nickthemagicman
The whole point of certificates is for the browser to check with a cert
authority. How does Squid circumvent the certificate authority?

I think a VPN may not be safe if the local machine has to negotiate encryption
with the VPN server. Squid seems like it couldn't intercept that.

~~~
detaro
It doesn't. In enterprise environments that use something like this, the
sysadmins install their own CA certificate on all machines. You can't just
MITM random machines at a coffeeshop.

------
drieddust
Not a very informative article. All it manages to say is that deep packet
inspection does not work with encrypted traffic. I think the author is not
aware of transparent deep packet inspection of SSL traffic. Here is one such
product doing it:

[https://www.sonicwall.com/en-us/products/firewalls/security-services/dpi-ssl](https://www.sonicwall.com/en-us/products/firewalls/security-services/dpi-ssl)

~~~
corebit
That’s just a run-of-the-mill MITM privacy violator

~~~
SideburnsOfDoom
Yes, but I don't think the article explains why it is, or will be, "dead".
Companies that have them don't want to give them up. What compelling reason
would make them? "Employee privacy at work" is not one, not with the level of
perceived threat from malware downloads and trojaned NPM packages.

