
Every Byte of a TLS Connection Explained and Reproduced - aberoham
https://tls.ulfheim.net/
======
syncsynchalt
Author here - I was going to publish this today but it leaked out ahead of
time. Enjoy!

EDIT: I'm putting a CDN in place.

~~~
zzzeek
very nice! First thing I noticed is that at
[https://github.com/syncsynchalt/illustrated-tls/blob/master/...](https://github.com/syncsynchalt/illustrated-tls/blob/master/site/index.html)
the content and samples are hardcoded into HTML - might it be nice if this
were generated from some kind of JSON file or similar, so that the approach
you have here could be generalized to support any network protocol someone
might want to annotate with descriptions?

~~~
syncsynchalt
Yes, if I do this again I will definitely generalize it into a content
generator.

(Isn't that what every site turns into eventually, a custom CMS?)

As it is it's all tcpflow, hexdump, and vim.

------
toast0
This is wonderful!

I may make a version where the bytes used for lengths are highlighted, since
it feels like so many bytes are lengths. Look at the SNI extension, which has
three 16-bit lengths. I know why they're there, but SNI probably shouldn't be
a list; and even if it were a list, an extension that consists solely of a
list already has the extension's own length, so you shouldn't need two more
bytes for the list's length. And if we recast SNI as just a type and a string,
the string is clearly going to take up the rest of the extension length, so it
doesn't need a two-byte length either.

~~~
userbinator
Moreover, from a security perspective since we're talking about TLS,
"overspecified" and/or "redundant" lengths are just begging to be made
inconsistent and a source of vulnerabilities.

~~~
the_clarence
Usually if you mess something up there, your implementation just doesn't work.

~~~
wool_gather
Vulnerabilities often arise from implementation bugs, no?

~~~
the_clarence
Yeah, but not every implementation detail is equally likely to lead to the
same vulnerabilities. Having said that, Heartbleed was a length issue.

------
bungie4
Beautiful.

I recently had to implement two-way auth over TLS 1.2, and this would have
saved much hair pulling. (And who does two-way auth over an MPLS connection?
Turns out: us.)

~~~
pimlottc
I feel your pain. Two-way TLS is a funny thing: it's supported by the
standards and even most implementations, but its actual use is minuscule
compared to "normal" one-way TLS - so much so that it's hard to find
documentation even acknowledging that two-way TLS exists, let alone how to use
it.

And don't get me started about the hassles of obtaining signed certificates
that are actually usable for client auth...

~~~
tialaramex
> And don't get me started about the hassles of obtaining signed certificates
> that are actually usable for client auth...

What sort of clients were you authenticating? The Web PKI needs to be trusted
by random people from the whole world, but most mutually authenticated systems
have a relatively small number of clients which are known to the server
operator out-of-band. So probably the Web PKI is not the right choice. Instead
you (the server operator or some neutral facilitator if it's a group of
providers operating services for the same clients) should operate a CA for
this purpose, not piggyback on the Web PKI.

One reason not to use the Web PKI if you aren't actually part of the public
Internet is that we, to put it bluntly, don't give a shit about people who do
that. Running a PKI is expensive (not just in dollar terms, it needs a bunch
of smart, motivated people who are morally upright or it's worthless), and
this one is ours, so it obeys our rules.

If you have your own PKI (or just one CA) you set the rules. Fifty year
certificates for 1024-bit RSA? Why not. A current passport photograph baked
into every certificate? Sure. Want the issuer to mint the keys and keep a
copy? Do as you please. All those things are prohibited in the Web PKI.

~~~
richardwhiuk
Ignoring the Web PKI defaults, though, is probably a silly idea - e.g.
long-lived certificates with rubbish hash algorithms, huge certificates, and
issuer-kept keys are all really bad ideas in almost any scenario.

------
gbajson
Have any of you seen _such a beautiful_ explanation for other protocols (TCP,
the 4-way handshake)?

~~~
Xeanort
You can usually use Wireshark to find out the meaning of each byte of an
internet connection (and images); it works great.

~~~
tialaramex
However, Wireshark can mislead you if you don't understand what you're looking
at.

It's a bit like having a low-level debugger. If you're happy with low-level C,
and you're looking at a debugger that says variable 'k' - which you know is a
uint8_t looping from 0 to 5 - currently has the value 65, you should say to
yourself: "Hmm, I bet the compiler used the same place to store variable 'c',
which is a single byte from a text string, so this is just the capital letter
A and the compiler has realised it doesn't need to store the value of k even
if it's technically in scope..." rather than "OMG, my loop variable somehow
massively exceeded its expected range, maybe cosmic rays have damaged the
RAM".

With TLS, for example, if you give it a whole TLS 1.2 session, Wireshark will
say "oh, this is TLS 1.2". Fine. But if you show it only a TLS 1.2 connection
that failed, Wireshark will say "oh, this is TLS 1.0". Why? Well, the
low-level protocol has been bodged over the years because of crappy
middleboxes, so Wireshark doesn't actually know for sure, and rather than say
"I don't know yet, I need to see more of the connection" it says TLS 1.0.

This can be a problem because you'll get amateurs saying "Our system can't
talk to your server because you only do TLS 1.0" and you say "No. You are
wrong" and they say "Look, here's a Wireshark trace" - and sure enough
Wireshark is telling them it's TLS 1.0, because their system disconnected
early (e.g. because they disabled all the crypto algorithms you allow), so
Wireshark wasn't sure and labelled it TLS 1.0 rather than TLS 1.2.

This is going to happen again with TLS 1.3. TLS 1.3 deliberately says "Hi, I'm
TLS 1.2" (middleboxes again), and so that's what Wireshark will report (until
you get a newer version that knows to look inside the supported_versions
extension field for the real version). So you can bet that amateurs are going
to say "Your service only does TLS 1.2" when actually their connection failed
for some reason and they don't understand how to read the Wireshark trace.
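A toy illustration of the trap (my own sketch, not Wireshark's actual logic): a naive dissector that trusts the record-layer version bytes will call a TLS 1.3 ClientHello "TLS 1.0", because those bytes stay frozen at old values for middlebox compatibility.

```python
RECORD_VERSIONS = {0x0301: "TLS 1.0", 0x0302: "TLS 1.1",
                   0x0303: "TLS 1.2", 0x0304: "TLS 1.3"}

def naive_version(record: bytes) -> str:
    """What a naive dissector reports from the record header alone:
    bytes 1-2 of the 5-byte record header are the 'version'."""
    return RECORD_VERSIONS[record[1] << 8 | record[2]]

# A TLS 1.3 ClientHello record typically starts 0x16 0x03 0x01
# (handshake, "TLS 1.0") for compatibility reasons:
tls13_hello_header = bytes([0x16, 0x03, 0x01, 0x00, 0xc0])
print(naive_version(tls13_hello_header))  # TLS 1.0
```

The real negotiated version only becomes clear later (or, for 1.3, from the supported_versions extension), which is exactly the information a truncated capture doesn't have.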

------
rohansingh
Beautiful. A few years ago, we were implementing some TLS handshake stuff at
Spotify ([https://github.com/spotify/ssh-agent-tls](https://github.com/spotify/ssh-agent-tls))
to be able to use your SSH key as a client-side cert for HTTPS connections.

The first couple drafts took forever to figure out, and I got a bunch of stuff
wrong. This guide would've saved a ton of time back then.

------
kodablah
There's a lot of practicality I learned from reading the Go source and
implementing my own (e.g. the compression methods field is always an array
containing the single value 0).

Now do a version with DTLS (one more message type, a couple more fields on
existing types, and logic concerning retries). Also, now do a TLS 1.3 one.

~~~
syncsynchalt
Yep, reading the Go source is what inspired me to do this; it's still very
compact and readable.

I originally used an AEAD cipher but found it was impossible to demonstrate on
the command line (openssl enc refuses to do AEAD because it can't confirm the
authentication in a streaming context).

A friend asked me to demonstrate ALPN in this but as I looked into it I found
it was distracting, as there's already so much going on and any new feature
required digression. Maybe next time!

As for 1.3 my next project was going to be implementing it rather than
documenting it. Just a throwaway implementation, nothing you'd want to use.

~~~
kodablah
> As for 1.3 my next project was going to be implementing it rather than
> documenting it. Just a throwaway implementation, nothing you'd want to use.

Since you read the Go source, you might like [0]. I will say I personally
think Go could have done better: it's too compact, too hidden, too
disorganized, too underdocumented, and too inflexible/non-extensible. I began
to pick some of it apart for a DTLS impl I started at [1], but put it on
temporary hold yesterday due to other work obligations.

0 - [https://github.com/cloudflare/tls-tris](https://github.com/cloudflare/tls-tris)

1 - [https://github.com/cretz/go-dtls](https://github.com/cretz/go-dtls)

~~~
syncsynchalt
I've also had [https://github.com/h2o/picotls](https://github.com/h2o/picotls)
suggested.

------
quadyeast
I have no need to look at that page at length ... but I am. This is very well
made and interesting.

------
blazespin
I remember implementing SSL in Java when I worked for Netscape back in the
day. Nostalgia rush looking at this. Great stuff!

------
kalimatas
The irony is: I get "Your connection is not secure" when trying to open the
link.

~~~
syncsynchalt
I've uncapped MaxRequestWorkers while I wait for the Cloudflare queue, it
should work now (I believe you were seeing an error related to TLS timeout).

------
phaedrus
I love the way you present the data (where clicking on the hex values of the
bytes gives their explanation). At work I deal with software that communicates
via RS-232 serial packets; I should make something that turns logs of the
packet data into HTML files with an interface like this!

------
amenghra
[http://www.networksorcery.com/enp/Protocol.htm](http://www.networksorcery.com/enp/Protocol.htm)
has a bunch of network diagrams for various protocols.

This website lays things out really nicely, would love to have more protocols
:)

------
Obi_Juan_Kenobi
So the record header has 2 bytes for the payload size, and the handshake
header has 3. Am I correct in thinking that the 3-byte length in the handshake
header is superfluous? Isn't the payload size limited to 2^16 bytes?
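For what it's worth, here's a sketch of both headers in Python (example bytes invented). One commonly cited reason for the 24-bit handshake length is that a single handshake message may be fragmented across several records, so it can be longer than any one record's payload:

```python
import struct

def parse_record_header(b: bytes):
    """5-byte record header: type, version (2 bytes), length (2 bytes;
    the protocol further caps record payloads well below 2^16)."""
    rtype, major, minor, length = struct.unpack("!BBBH", b[:5])
    return rtype, (major, minor), length

def parse_handshake_header(b: bytes):
    """4-byte handshake header: type, then a 24-bit length. A handshake
    message (e.g. a large certificate chain) may span several records."""
    return b[0], int.from_bytes(b[1:4], "big")

# record: handshake (0x16), TLS 1.2, 64-byte payload
rec = bytes([0x16, 0x03, 0x03, 0x00, 0x40])
# handshake: ClientHello (0x01), 60 bytes
hs = bytes([0x01, 0x00, 0x00, 0x3c])
print(parse_record_header(rec))    # (22, (3, 3), 64)
print(parse_handshake_header(hs))  # (1, 60)
```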

~~~
zamadatix
I think this is explained by the optional compression which is handled at the
record layer.

------
max23_
Good work on the illustration.

Using Diffie-Hellman to generate a shared key with each party's private key
and the other party's public key is the part that amazed me most when I was
trying to understand the handshake back then.

------
userbinator
_The parties had agreed on a cipher suite using ECDHE, meaning the keypairs
will be based on a selected Elliptic Curve, Diffie-Hellman will be used, and
the keypairs will be Ephemeral rather than using the public /private key from
the certificate._

I think it's important to mention that even with ephemeral cipher suites, the
server's ephemeral public key is signed using the server's certificate private
key and verified by the client, since otherwise one would be able to MITM the
key exchange.

~~~
syncsynchalt
It's in the Server Key Exchange record (from memory), but I probably didn't
explain it well.

------
pcunite
Nice, could have really used this a few years ago when I was making my own
HTTP server implementation. Could someone make a PDF of this? Any C++ and C#
examples of this?

~~~
syncsynchalt
I might post a PDF later (as the content becomes final), or make a printable
view. In the meantime try this:

    
    
        # in your javascript console, paste this:
        [].forEach.call(document.querySelectorAll(".record, .calculation"), function(el){el.classList.add("selected", "annotate")});
        [].forEach.call(document.querySelectorAll("codesample"), function(el){el.classList.add("show")});
    

Then you can print the page to PDF.

------
umanwizard
As a crypto ignoramus: why is the random data from each side necessary? Why
can't things just be encrypted with the PreMasterSecret directly?

~~~
tialaramex
Involving random data gives everybody who gets to pick the random data (so in
TLS that's both client and server) a freshness guarantee.

Because the other party needed to know you'd picked this particular random
data to make the keys, the messages from them encrypted with those keys
couldn't possibly have been pre-recorded / replayed.

In the ephemeral Diffie-Hellman modes both parties contribute to the key
anyway, so this isn't as important, but with old-school RSA key exchange the
random values are the only thing preventing replay attacks.

TLS 1.3 capable servers also scribble "DOWNGRD" in part of the random field if
a client message says it can't do TLS 1.3. If a TLS 1.3 client sees that
unusual "random" choice it knows bad guys tampered with the connection
(attempted a downgrade attack). If bad guys just change the values, they won't
match between client and server and the connection aborts. Older clients think
nothing of the unusual random value and carry on as before.
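The check is simple enough to sketch (my own toy version; the real rule is in RFC 8446 §4.1.3, where an eighth byte distinguishes how far the downgrade went):

```python
# Sentinel values a TLS 1.3-capable server writes into the last 8 bytes
# of server_random when it negotiates an older version:
DOWNGRADE_TLS12 = b"DOWNGRD\x01"  # negotiated TLS 1.2
DOWNGRADE_TLS11 = b"DOWNGRD\x00"  # negotiated TLS 1.1 or below

def looks_downgraded(server_random: bytes) -> bool:
    """The check a TLS 1.3 client makes when it offered 1.3 but the
    server picked an older version."""
    return server_random[-8:] in (DOWNGRADE_TLS12, DOWNGRADE_TLS11)

print(looks_downgraded(bytes(24) + DOWNGRADE_TLS12))  # True
print(looks_downgraded(bytes(32)))                    # False
```

Since the sentinel sits inside data that both sides sign into the handshake transcript, an attacker can't strip it out without the connection failing.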

~~~
userbinator
_TLS 1.3 capable servers also scribble "DOWNGRD" in part of the random field
if a client message says it can't do TLS 1.3. If a TLS 1.3 client sees that
unusual "random" choice it knows bad guys tampered with the connection
(attempted a downgrade attack). If bad guys just change the values, they won't
match between client and server and the connection aborts. Older clients think
nothing of the unusual random value and carry on as before._

I haven't looked at the spec in detail, but does this mean that random
generation has to specifically exclude that "sentinel value", lest it
accidentally occur?

~~~
umanwizard
The probability of a particular seven bytes occurring by chance is less than
one in a billion billion billion billion.

~~~
jwilk
256⁷ ≈ 7.2E16, which is much less than billion⁴ = 1E36.

Still, the likelihood of this happening by chance is miniscule.
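(Quick check in Python:)

```python
print(256 ** 7)             # 72057594037927936, i.e. about 7.2e16
print(256 ** 7 < 10 ** 36)  # True: far smaller than a billion to the fourth
```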

~~~
tialaramex
The actual feature uses an 8-byte value; it's just that the DOWNGRD part (the
first 7 bytes) is intuitively easy to follow, so why spell it all out in
hexadecimal or whatever.

So it's one in 2^64 random connections.

Also, the client isn't even checking for a possible downgrade if it got the
protocol version it wanted (if I wanted TLS 1.3 and I got TLS 1.3, that is not
a downgrade). So if "one in every 16 billion billion connections fails" is
unacceptable, upgrade your servers and the problem vanishes.

------
kapad
There's a typo in the server keys calculation. It should read "server" and not
"client" IMO.

------
DontSueMeBro
This is very nice. I think the green-on-green and blue-on-blue highlights are
a little hard to see.

~~~
syncsynchalt
Agreed - the color choice is one of the things I never really felt happy with
(I'm not a frontend guy).

------
based2
[https://www.reddit.com/r/netsec/comments/9nk764/the_illustra...](https://www.reddit.com/r/netsec/comments/9nk764/the_illustrated_tls_connection_every_byte/)

------
ct520
As someone who frequently deals with large-scale integration failures, I
greatly appreciate this site. Thanks syncsynchalt!
------
lxe
I love this. Excellent work. Would be cool to see step-by-step niblets of a
client/server implementation of each step.

------
jugg1es
So does this happen for every single request? Or is this based on a single
session?

~~~
sgwae
This is what happens under the hood of every ping request.

~~~
deaps
A ping is very much different. A ping is (typically) simply an ICMP Echo
Request (not TCP, thus no TLS, etc). The receiving device, if it accepts echo
requests and is configured to reply, responds with an ICMP Echo Reply - or
some device in the middle (or the receiving device itself) could respond with
an ICMP Unreachable or some other response, or quite simply drop the ICMP Echo
Request entirely and silently.

_Edited an incorrect UDP reference out based on the comment below._
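For the curious, an ICMP Echo Request is small enough to build by hand; here's a rough stdlib-only Python sketch (packet construction only - actually sending it would need a raw socket and privileges):

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """RFC 792 checksum: one's-complement of the one's-complement sum
    of the data taken as 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def echo_request(ident: int, seq: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP Echo Request: type 8, code 0, checksum, id, seq."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)  # checksum = 0 first
    cksum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, cksum, ident, seq) + payload

pkt = echo_request(0x1234, 1)
print(pkt[0])  # 8 (Echo Request; the reply comes back as type 0)
```

A handy property of the checksum: recomputing it over the finished packet (checksum field included) yields 0, which is how receivers verify it.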

~~~
gear54rus
If I'm not mistaken, it's not UDP, it's ICMP, like you said.

~~~
deaps
Ahh yeah - good call. Totally different protocol. I guess ICMP more closely
resembles UDP at the end of the day, but you're absolutely right. I edited out
the incorrect UDP reference so that a person reading for the first time will
not get misled. Thanks!

~~~
e12e
Which is also why some poorly configured network devices and firewalls will
eat pings - if, for example, they whitelist the TCP and UDP protocols and drop
everything else (yes, that's a bad idea).

[https://security.stackexchange.com/questions/22711/is-it-a-b...](https://security.stackexchange.com/questions/22711/is-it-a-bad-idea-for-a-firewall-to-block-icmp)

------
sizzzzlerz
Very fine work. The data flow and annotation are clear and concise. Good job.

------
taco_emoji
Site's timing out and google cache is giving me a 404...

------
sigi45
Cool! Would you be able to add an ASCII view beside it?

~~~
kodablah
Not trying to come off trite, but the RFC [0] has a simple ASCII diagram of
the message flow and the structures that follow are fairly easy to read.
Granted you have to hop to a couple of other RFCs to understand extensions and
maybe even real world impls to understand some changes (e.g. no time in the
random block of client hello), but it's worth perusing if you're interested.

0 -
[https://tools.ietf.org/html/rfc5246#section-7.3](https://tools.ietf.org/html/rfc5246#section-7.3)

~~~
syncsynchalt
I think they mean a side-by-side ASCII interpretation like that supplied by
`hexdump -C` - where any printable character is printed as itself and every
other byte is printed as a "."

It gives a nice strings-like view of raw data.
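Something like this, for example (a quick hand-rolled Python approximation of `hexdump -C`, not the real tool's exact output format):

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Offsets, hex bytes, and an ASCII column where any byte outside
    the printable range shows up as '.'"""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        asciipart = "".join(chr(b) if 0x20 <= b < 0x7f else "."
                            for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3}} |{asciipart}|")
    return "\n".join(lines)

# a record header followed by some text falls out nicely:
print(hexdump(b"\x16\x03\x01\x00\xa5hello"))
```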

------
rqs
Explained in detail, and the web page is well-built.

Thank you.

------
rajeshamara
Yes very useful. Thanks for your effort.

------
mirages
HN hug of death? The page is timing out.

------
nanidin
tls.ulfheim.net - "This site is blocked due to a security threat."

ulfheim.net - no problem

At least according to my megacorp threat filter. I have never actually seen
something blocked before, and it's a shame because the page would be great to
share with my team.

~~~
syncsynchalt
Wonder why that would be, tls.ulfheim.net is a CNAME to ulfheim.net, they're
just different apache vhosts. Same cert (using SAN).

The only things I can think of:

      - it doesn't like the hostname (tls?)
      - the hostname is new and has no reputation
      - too much h4cking content

------
thejoeflow
Kind of ironic that their website is using an invalid security certificate?

~~~
tialaramex
The site is currently presenting a Let's Encrypt certificate issued just under
24 hours ago. There have historically always (for reasonable definitions of
always) been valid certs for this site, although of course I can't tell from
here that they were always properly installed.

Likely explanations for your experience:

1. Your clock is wrong. If your system currently thinks this is Thursday 11
October, for example, that's a problem, 'cos this is Friday 12 October.

2. There's some subtle configuration error on their server (seems unlikely, as
it looks to be just a generic AWS setup) that results in the wrong certificate
being presented.

3. Your OS or browser trust store lacks the root CA "DST Root CA X3" operated
by IdenTrust. If you didn't deliberately choose to do this, you should
investigate, as most likely you aren't getting important security updates.

All three causes can often be diagnosed by closely examining the detailed
error reported in a browser, e.g. SEC_ERROR_EXPIRED_CERTIFICATE.

------
int0x80
Thanks, very useful.

------
truth_seeker
Thank you for sharing.

------
parrik
nice (y)

------
hathym
nice work!

~~~
henrychongs
Very detailed explanation. Useful even for a refresher on the mechanics. Well
done!

