
The hidden cost of QUIC and TOU (2016) - luu
https://www.snellman.net/blog/archive/2016-12-01-quic-tou/
======
dylz
Here's the problem: other companies ruined it for the good ISPs. They
destroyed users' trust by screwing with end-user transfers, tampering with
TLS, tampering with connections, modifying resources in transit, MITMing in
horrific fashion, invading privacy, and breaking protocols for profit.

Arguably there would be a lot less insane push to make everything a blackbox
if companies didn't keep doing dumb crap for an extra buck. See also the
TLSv1.3 middlebox arguments.

I certainly enjoy the fact that when using an AT&T hotspot I can't reach
certain websites over TLS; that Verizon would append advertising tracking
headers to requests on their end; that on Sprint, if I wget the same image
multiple times, it comes back at a different quality and with a different
md5 each time; that all of the above do DPI and degrade web traffic no
matter what port it runs on; that TCP connections randomly fail if I try to
run non-web traffic over port 80; and that long-lived TCP connections are
randomly killed.

~~~
xyzzy123
The problem I have with the TLS 1.3 proposal is that it’s gonna stop me
having control over the things that go out of my own network :(

I realise that this is a multifaceted debate, but it’s going to destroy the
utility of anything I can’t install a certificate on, and also force me to
install certs on everything I own before I can comfortably let those things
on the network. It’s gonna screw a lot of enterprise use cases as well.

I’m not sure people fully get that the privacy extended to, say, “dissidents
in Syria” is also going to apply to HP printers on their own networks trying
to figure out whether to show “toner low” dialogs.

Personally I don’t think the “hey, the dissidents” value is worth it, since
those people are pretty screwed anyway - filtering at scale can still work
out what you’re connecting to (IPs, latency, response size patterns, blah
blah) - but it really messes up anyone (person or enterprise) who wants to
use this stuff but also know what’s going on.

~~~
tialaramex
The threat you're worried about doesn't require that an IETF Working Group
spend years defining a new protocol, whether that's QUIC or TLS 1.3 itself.
Any bozo could roll their own Noise-based encrypted protocol and it wouldn't
be decrypted by whatever edge "security" you think is protecting you.

Worse, chances are that if you presently believe you have "control over the
things that go out of my own network" and that TLS 1.3 would hurt it, you're
relying on "Next Generation Firewall" type technologies which are hopelessly
broken.

If you go stare at the TLS 1.3 "compatibility" changes in later drafts
(particularly Draft 28 IIRC) you'll see that it's basically the equivalent of
wearing a boiler suit with an embroidered "OTIS Lifts" logo to get waved
through the gate check without needing a pass. Except the boiler suit says
"TLS 1.2 Session Resumption". It didn't require the IETF to do this,
presumably Bad Guys have been doing it for years without writing a document
explaining how.

The recurring theme in people's TLS 1.3 horror stories is that they were
being eaten by cannibals all along, but TLS 1.3 asked them why they couldn't
feel their legs...

Example: Palo Alto and Cisco both shipped products that trip the TLS 1.3
downgrade detection feature. They were told about this months ago, but of
course they waited until the last moment (indeed for PAN they still haven't
shipped a fix for some supported versions) because it's just a compatibility
problem...

Except, it's not - the only way to trip that downgrade detection "by mistake"
is to not choose random numbers where the TLS 1.0, 1.1 and 1.2 standards all
say that it's imperative to use random numbers. If those numbers are instead
copied from somewhere predictable (which they are in affected Cisco and Palo
Alto systems) then much of the security of your TLS connections through these
"security" devices was illusory.

------
olliej
The other fun bit is the “encryption” of WebSockets (outside of TLS, etc).

Basically (hand waving here) each client-to-server websocket frame carries a
random 32-bit masking key, and the payload is essentially just XORed with
that key.
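
A minimal sketch of that masking step (illustrative only; real frames also
carry opcode, flags, and length fields):

    # WebSocket payload masking per RFC 6455: each client-to-server frame
    # carries a random 32-bit key, and every payload byte is XORed with the
    # corresponding key byte. Applying the mask twice round-trips the data.
    import os

    def mask(payload: bytes, key: bytes) -> bytes:
        return bytes(b ^ key[i % 4] for i, b in enumerate(payload))

    key = os.urandom(4)                    # fresh 32-bit masking key per frame
    masked = mask(b"hello middlebox", key)
    assert mask(masked, key) == b"hello middlebox"  # XOR is its own inverse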

This isn’t needed for the security of the user or the server. It’s because
so many middleware boxes are so poorly built that you could make them crash
and/or get code execution if you had sufficient control over enough of the
right payload bytes. Java applets exposed the exact same problem a decade
before websockets existed, and the fact that the middleware boxes were still
broken enough for this nonsense to be required in the websocket spec should
tell you everything you need to know.

~~~
jorangreef
And now, even WebSockets over TLS require servers to waste CPU on XOR
unmasking.

~~~
jefftk
XOR masking is ridiculously cheap compared to doing anything over the network
or running js in general.

~~~
adrianN
Everything is slow because of a million "don't worry, it's cheap compared to
the rest" decisions accumulating over time.

~~~
jefftk
I think "why is everything slow" is actually a really important question, but
I'm not sure it's primarily lots of "this is cheap compared to X" decisions. A
big driver here I think is layers of abstractions and inner platforms. I check
my email in by browser now, I used to use mutt. Webmail on top of JS on top of
browser on top of OS (with several of those systems having their own layers
and places that add latency) is great from a perspective of checking email
from anywhere but it's a lot slower than C on the OS directly.

I recently got into doing a lot of live music for dances on my mac, and I've
ended up writing everything directly against CoreMIDI in C. Sure, it would
be more convenient to write in Python, but latency in music is even more
painful than elsewhere.

------
nealmueller
The author is a middlebox employee (IPS, IDS, firewalls, NAT, WAN
optimizers, LBs). Middlebox people want unencrypted transport headers,
because they literally profit from unencrypted headers. :) Everyone else,
including users, site operators, and software engineers writing network
software, prefers that middleboxes not be able to see or tamper with
transport headers (for privacy, for avoiding bugs, and for being able to
evolve software).

From the original article: "What's wrong with encrypted transport headers? One
possible argument is that middleboxes actually serve a critical function in
the network, and crippling them isn't a great idea. Do you really want a world
where firewalls are unviable? But I work on middleboxes, so of course I'd say
that. (Disclaimer: these are my own opinions, not my employer's)."

(Credit for this observation goes to my friend NC.)

~~~
zamadatix
Transparent proxying was the wrong way to implement IPS, IDS, FW, NAT, LB,
and WAN optimization. For the cases where you have a reason to be in the
middle, these services should have been explicit proxies from the start.

------
jsnell
There's been an attempt to fix this over the last two years, as the QUIC
standardization talks really got going. A bunch of operator people expressed
a need for some sort of in-path measurability, while the privacy people
expressed the need to avoid any sort of session linkage or other forms of
information leakage. The most viable compromise proposal seems to be the
spin bit[0], which gives RTT measurements, but it's not yet agreed whether a
version of it will make it into the first release.

[0] https://quicwg.org/base-drafts/draft-ietf-quic-spin-exp.html
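
To make concrete how the spin bit exposes RTT: the bit flips once per round
trip, so a passive on-path observer can estimate RTT as the time between
transitions. A minimal sketch (the trace format of (timestamp, spin) pairs
is made up for illustration):

    # Estimate RTT from spin-bit transitions observed in one direction of a
    # single QUIC flow. Consecutive transitions are roughly one RTT apart.
    def rtt_samples(packets):
        last_spin = None
        last_edge = None
        for ts, spin in packets:
            if last_spin is not None and spin != last_spin:
                if last_edge is not None:
                    yield ts - last_edge  # one spin period ~= one RTT
                last_edge = ts
            last_spin = spin

    trace = [(0.000, 0), (0.025, 0), (0.050, 1), (0.075, 1), (0.100, 0)]
    print([round(s, 3) for s in rtt_samples(trace)])  # [0.05] -> ~50 ms RTT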

~~~
pdkl95
> The most viable compromise proposal seems to be the spin bit

That's utterly useless. If the "spin bit" becomes widely used, I intend to
write a trivial patch that sets it randomly on each packet.

If packets could be dropped unless the bit is set to specific values at
specific times, too much session information is leaking to middleboxen. More
likely, the state of the bit doesn't matter so setting it randomly will
discourage wasting a bit in the protocol with this kind of nonsense in the
future.

------
quickben
When advertisement companies push for an L4 redesign... This is going to be
a fun one to watch play out.

------
vinay_ys
This argument is broken. An ISP engineer debugging need only look at IP
packet drops in their part of the network. If their end clients are asking
them to debug an issue, then they should debug it at the endpoints, not in
the middle. This attitude is what led them to stick more and more buffers in
the middle and add other shaping devices in the name of problem solving or
adding value, and it ended up ossifying protocols. I think endpoints would
evolve better protocols to solve their own problems if the ISPs stopped
putting hacks (like deep buffers) in their networks.

~~~
jsnell
> This argument is broken. An ISP engineer debugging need only look at IP
> packet drops in their part of the network.

That's just not true in practice, pretty much on any level.

First, you need to look at a lot more things than just packet drops (e.g.
reordering, queue buildup, corruption).

Second, even getting full visibility into your own network is highly non-
trivial since nobody has active probes on every link. My experience is that
arranging for packet captures from an arbitrary point in the network could
take a week. And if you guessed wrong about which node was at fault, you'd
need to do it again in a binary search pattern.

Third, you absolutely do need to know about things other than your own
network. Otherwise you don't even know _whether there is a problem you can
solve_. If the bottleneck is in the server, or the client, or the transit
links, there is no point in debugging the core or the access network.

> If their end clients are asking them to debug an issue then they should
> debug it at the end points, not in the middle.

The endpoints are not going to be available. Do you think that YouTube is
going to give an operator some kind of server access, or even insight into
the traffic? Do you really want to see a world where a customer with a
complaint needs to first root their phone and install packet capturing
software?

What you're really saying here is that no problem should ever be debugged, and
we should just hope that the network doesn't break. And hope is not much of a
strategy.

------
denormalfloat
Alright, so existing debugging tools don't work with QUIC. We will need to
make some new tools that can expose the information we need. If the hops
between the source and destination are willing to expose the info (and we
can assume they are, as the author has), then we can figure out which
packets go in and never come back out.

Instead of saying "this won't work because ____", why don't people say "it
would work if we could ____"? Someone (or some company) needs to improve the
Internet, and it seems like the world just harangues them for the effort.

------
ncmncm
When you tunnel your protocol in UDP, and control both ends, you can get
overwhelmingly better flow control than TCP, which cannot trust the other end,
so must rely on packet drops to get a reliable signal.

When you can trust the other end, rate of change of packet transfer time
(delta packet delay) reveals congestion exactly -- i.e., increasing time means
you had better slow down, decreasing, you can go faster.

Only problem is, the receiver has the signal, but the sender needs it, and the
useful lifetime is less than the packet delay. So, you need a predictor on the
sender, fed by corrections from the receiver. This is control theory applied
to network flow.
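
A minimal sketch of that control loop (toy gain and bounds, not a tuned
controller; it assumes the receiver reports delay samples back to the
sender):

    # Toy delay-gradient rate controller: rising packet delay means queues
    # are building, so back off; falling delay means capacity is free, so
    # speed up. The gain and bounds are illustrative, not tuned values.
    def adjust_rate(rate, prev_delay, delay, gain=0.5,
                    min_rate=1.0, max_rate=1e6):
        gradient = (delay - prev_delay) / max(prev_delay, 1e-9)
        rate *= 1.0 - gain * gradient  # positive gradient -> slow down
        return max(min_rate, min(max_rate, rate))

    rate = 1000.0  # packets/sec
    delays = [0.050, 0.052, 0.055, 0.053, 0.050]  # receiver feedback, seconds
    for prev, cur in zip(delays, delays[1:]):
        rate = adjust_rate(rate, prev, cur)
        print(f"delay {cur * 1000:.0f} ms -> rate {rate:.0f} pkt/s")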

This is how all of Hollywood sends reels around the basin to effects houses,
and completed movies to digital projection theaters.

------
tinus_hn
It would be nice if these protocols could be easily decrypted using a key
available on the client. Other than that, tough luck for the ‘transparent’
proxies and friends.

~~~
tialaramex
Of necessity both client and server have the session keys.

In principle it would be possible for the client to lack keys needed for the
server to read data sent by the client, and vice versa, but in practice this
is never done.

Under Forward Secrecy, a middlebox must learn fresh session keys for every
connection or it can't decrypt it. Both clients (e.g. Firefox) and servers
(e.g. using Java or OpenSSL) have facilities to dump the session keys out
somewhere, and this is adequate for debugging, although obviously you will
need to acquire new skills if you're used to being able to get stuff done
with a paperclip and a copy of tcpdump. At scale this gets hard; arguably
that's fine, because a minute ago we said we wanted this for "debugging",
but people who got their foot in the door with a "debugging" argument often
actually want to decrypt everything, always, and so they're unhappy about
this.
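
As a concrete illustration of the key-dumping approach: Python 3.8+ can
write session keys in the same NSS key log format that Firefox emits via
SSLKEYLOGFILE, and Wireshark can use that file to decrypt a capture of the
connection (host and log path here are placeholders):

    # Dump TLS session keys for debugging; point Wireshark's
    # "(Pre)-Master-Secret log filename" setting at the same file.
    import socket
    import ssl

    ctx = ssl.create_default_context()
    ctx.keylog_filename = "/tmp/tls-keys.log"  # NSS key log format

    with socket.create_connection(("example.com", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
            tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n"
                        b"Connection: close\r\n\r\n")
            print(tls.recv(4096)[:80])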

If you don't want Forward Secrecy, you have two options. Firstly, when the
specification says to think of a random number for the key exchange
protocol, you can always pick the same number, or a number chosen in some
predictable fashion; the middlebox can know this number (or the method for
predicting it) and then it can snoop as normal. This works in TLS 1.3.
Obviously it weakens your security (if bad guys learn how to predict the
numbers, you are screwed), but that's your choice.

Secondly you could use a key exchange process that doesn't have any Forward
Secrecy by design, such as the RSA key exchange from SSL that's grandfathered
into TLS 1.0 through 1.2. In this case you just give the server's private RSA
key to the middleboxes and they can decrypt everything.

As you may notice in all the above scenarios, this is very bad for your
security. But if "debugging" is really the problem that's almost certainly
acceptable to you.

~~~
rocqua
Dumping out those session keys comes with major logistical and security
problems. All of a sudden, debugging requires

1) Updates on every client you might want to debug

2) Securely transporting the session keys from those clients to the person
debugging.

Those are some massive challenges. An alternative is to always MitM all your
devices. This comes with obvious downsides. Moreover, I could see providers
doing cert-pinning that isn't overridden by user-installed certificates.
That would make it literally impossible to MitM your own devices.

This kind of cert-pinning really scares me, because it takes away any
possibility of inspecting your own network communications.

------
sly010
Arguably you wouldn't have to debug bad connections if the middleboxes just
did what they were supposed to.

------
cm2187
Actually, it raises an important point I hadn't thought about. By moving the
transport protocol from layer 4 to layer 5, Google is taking it out of the
hands of the OS. I understand why they're doing it (it's easier than getting
all 3 or 4 major OSes to implement it), but there is another cost in terms
of interoperability. It means that every piece of software consuming QUIC
needs its own implementation of the protocol. It means every language in
which you write that software needs to have libraries available implementing
QUIC, or the libraries you consume must themselves have a QUIC
implementation. That's introducing a not-insignificant inefficiency if QUIC
becomes prevalent.

~~~
tialaramex
The operating system can offer QUIC on top of UDP the same way it offers UDP
on top of IP. BSD sockets aren't necessarily the ideal way to use QUIC but
there's no reason you couldn't use them.

Researchers have even repeatedly built Linux protocol modules that do TLS,
either all of it, or the encrypted record layer (so the bulk but not the
tricky negotiation decisions at the start). There's just a new TLS protocol
you ask for instead of TCP and then the kernel handles encryption and so on.
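
Purely as a hypothetical sketch of what that could look like through the
familiar sockets API (IPPROTO_QUIC is invented here for illustration; no
mainline OS ships such a constant today, so this won't actually run):

    # Hypothetical: a kernel QUIC implementation exposed through ordinary
    # BSD sockets, by analogy with TCP and UDP. IPPROTO_QUIC is a
    # placeholder, not a real or assigned protocol number.
    import socket

    IPPROTO_QUIC = 262  # placeholder value

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_QUIC)
    s.connect(("example.com", 443))  # kernel would handle handshake + crypto
    s.sendall(b"application bytes")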

------
nicenewsbc
There is no combination between them.

------
ocdtrekkie
You know what a lot of middleboxes do? They block ads/malware. Shocking that
the two largest ad companies are trying to push standards which break things
that block their ads.

~~~
icebraining
That's news to me; I've never heard of widespread use of middleboxes for
blocking ads. Which boxes are these, and how do you know that there are a
lot of them deployed?

~~~
ocdtrekkie
So, an example of a middlebox that's exceedingly common is a "web security
gateway", which is your average web filter and logger in a corporate
environment. It logs employee web activity, blocks access to adult websites,
and maintains its own malware definitions to try to block malicious content
as well. It's quite common for these to also block domains used by ads and
popups by default. When these sorts of devices are configured to inspect
HTTPS, this adds significant additional complexity: network PCs need to be
configured to trust a certificate from the box for all domains, and the box
intercepts, decrypts, and re-encrypts all traffic.

Of course, the same type of technology a corporation might use to manage their
network could be used by a state actor or a hostile ISP.

~~~
brazzledazzle
If you’re only using it to block domains you can do that at the DNS level with
the added bonus of it being more efficient.

~~~
vetinari
This goes out of the window with apps doing their own host resolution with
DoH.

