
Third-Party Audit of Rustls - dochtman
http://jbp.io/2020/06/14/rustls-audit.html
======
dochtman
Some quotes from the report's conclusions:

"the team of auditors considered the general code quality really good and can
attest to a solid impression left consistently by all scope items."

"was consistently well-documented and readable, demonstrating that security
processes are ingrained in the development and documentation processes"

"Both from a design point of view as from an implementation perspective the
entire scope can be considered of exceptionally high standard (...) no
directly exploitable weaknesses could be identified."

"It appears to have been developed with all previously known issue-types in
mind; furthermore, its missing support for insecure or outdated protocols and
primitives indicates a security-conscious development approach."

"the developers of rustls have an extensive knowledge on how to correctly
implement the TLS stack whilst avoiding the common pitfalls that surround the
TLS ecosystem. This knowledge has translated reliably into an implementation
of exceptional quality."

"The developer’s intent to provide a high-quality TLS implementation is very
clear and this goal can be considered as achieved successfully. (...) Cure53
had the rare pleasure of being incredibly impressed with the presented
software."

------
bsder
I especially liked the "We couldn't reason about this. So we rewrote it."

Thankyouthankyouthankyouthankyou.

I have this discussion _all the time_ and I can't get people to see the issue.

~~~
lmkg
> There are two ways of constructing a software design: One way is to make it
> so simple that there are _obviously_ no deficiencies, and the other way is
> to make it so complicated that there are no _obvious_ deficiencies. The
> first method is far more difficult.

-Tony Hoare, from his Turing Award lecture

------
wahern
> TLS-01-004 - Data Truncation in DER Encoding Implementation (low). This
> finding rightly points out a function in rustls that produces incorrect
> output when applied to an X.501 Name that is larger than 64KB. While that’s
> an exceedingly unlikely case, and the bug does not cause unsafe operation
> (but perhaps connection failure), the function has been corrected to produce
> valid output for all inputs.

The ISO 7816 smartcard standard uses this same ASN.1-based scheme for command
and reply encoding. I found a memmove overflow and a separate off-by-one bug
in Yubico's PC/SC library, and an off-by-one bug in the Yubikey 4 firmware,
all related to their handling of this encoding.

It's possible that the Yubikey 4 bug could have been leveraged into an
information leak, a buffer overflow, or some other exploit of the firmware.[1]
The firmware was calculating a command reply length based on what the proper,
compact length of the object should be, not the _actual_ length of the stored
object with malformed internal encoding, leading to corrupt reply chains. But
I never explored it. I'm not an exploit developer, and didn't have time to go
down that rabbit hole. I only discovered the bugs tracking down a bug in my
own PC/SC library. It turned out I was generating an overlong encoding ({0x82
0x00 0xff} instead of {0x81 0xff}) for an object, tickling the Yubikey 4
issue. Yubico quickly fixed their library, but the engineer resolving the
ticket downplayed the firmware issue, which seemed far more concerning as it
suggested bug-prone buffer management techniques in the firmware.[2] (Indeed,
it suggested that the firmware reused the buggy library code.) I haven't gone
back to see what happened with the firmware since then.
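The overlong-length bug class above is easy to pin down. In BER/DER, an ASN.1 length below 128 is one short-form byte; larger lengths use a long form with a length-of-length prefix, and DER requires the minimal form, so {0x82 0x00 0xff} is illegal where {0x81 0xff} is required. A minimal sketch of a compliant length encoder (my own illustration, not Yubico's or rustls' actual code):

```rust
// Minimal-form (DER) encoding of an ASN.1 length field.
// DER forbids overlong encodings such as {0x82 0x00 0xff};
// 0xff must encode as {0x81 0xff}, and lengths below 0x80 as one byte.
fn der_encode_len(len: usize) -> Vec<u8> {
    if len < 0x80 {
        vec![len as u8] // short form: single byte
    } else {
        // long form: 0x80 | count of length bytes, then the length
        // itself in big-endian with no leading zero bytes
        let bytes: Vec<u8> = len
            .to_be_bytes()
            .iter()
            .copied()
            .skip_while(|&b| b == 0)
            .collect();
        let mut out = vec![0x80 | bytes.len() as u8];
        out.extend(bytes);
        out
    }
}

fn main() {
    assert_eq!(der_encode_len(0x7f), vec![0x7f]);       // short form
    assert_eq!(der_encode_len(0xff), vec![0x81, 0xff]); // minimal long form
    assert_eq!(der_encode_len(0x1234), vec![0x82, 0x12, 0x34]);
}
```

A decoder hardened against this bug class would do the inverse check: reject any long-form length that could have been written more compactly.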

Anyhow, paying attention to small encoding and decoding issues like these is
important--the details always matter. "This couldn't possibly lead to an
exploit" are famous last words.

[1] Hopefully not a leak of the key itself as I presume it was inaccessible to
the firmware, but who knows--it's a black box.

[2] IIRC, their argument was garbage in, garbage out (GIGO), which is a
dangerous misapplication of that principle. A malformed object should never
result in malformed, corrupt responses by other layers of the protocol stack.
In fact, the GIGO principle usually implies the opposite--that the behavior of
other layers of the stack is invariant to bad data, not conditioned by it.

EDIT: Looking at my Git logs, it seems my code was never buggy. My library
couldn't read back some certificates generated by another, third, project and
written to the card using the Yubico SDK. That third project was generating
the overlong encoding, and in the process of tracking down the root of the
issue I stumbled upon the Yubico bugs. IIRC, that third project never read
back the certificate, as it had a very convoluted scheme for 2FA and
client-side SSH X.509 certificates that relied heavily on the bespoke behavior
of its client-side Go daemon.

------
mivvy
This is really exciting! Many popular Rust libraries support switching between
rustls and openssl for their TLS needs and it’s much easier to do cross-
platform builds with rustls.

------
fluffything
> After spending thirty days on the scope in late May and early June of 2020,
> the team of auditors considered the general code quality really good and can
> attest to a solid impression left consistently by all scope items.
> Naturally, this is partially thanks to the usage of Rust as the preferred
> language for the entire implementation of the rustls project.

------
packetized
The finding in TLS-01-003 is surprising to me, mostly because it presupposes a
lack of sophistication among users of this library who are also using X.509
NameConstraints. From RFC 5280:

> For example, a name constraint for "class C" subnet 192.0.2.0 is
> represented as the octets C0 00 02 00 FF FF FF 00, representing the CIDR
> notation 192.0.2.0/24 (mask 255.255.255.0).

As they mention in the findings:

> Typically, subnet masks should be contiguous and the presence of a
> non-contiguous mask might indicate a typo (such as 225.255.255.0 vs.
> 255.255.255.0), or potentially an attempt to bypass an access control
> scheme. Therefore, it is recommended to treat certificates containing
> non-contiguous subnet masks in their name constraints as invalid.

This seems to run counter to the intent in the RFC. By allowing for a four-
octet subnet mask, instead of simply an integer representing a contiguous
CIDR prefix length, the RFC authors may have intended that more complex
IP-based NameConstraints could be constructed. This certainly would make a huge
difference for something like an intermediate (CA:TRUE), where it becomes much
more economical to specify a sparse mask for a highly templated network. Think
certs for network equipment or VoIP phones with regular, repeatable IP
addressing across many locations/networks. E.g., a VoIP provisioning system
that has an intermediate issuing CA with the following NameConstraints:
IP:172.16.0.0/255.255.1.239.

If any change comes from this specific finding, I would hope that it's simply
a flag to allow or disallow the use of discontiguous masks. I do understand
that this is specific to WebPKI; having said that, if a client is implemented
using rustls (with these recommendations enabled) and it happens across a
perfectly valid certificate issued by an intermediate with a discontiguous
mask in the NameConstraints, presumably it would fail or otherwise break. And
yes, I have previously configured precisely this in an intermediate CA.
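For what it's worth, the contiguity check the auditors recommend is a one-liner over the four mask octets. A sketch of what such a check might look like (a hypothetical helper, not webpki's actual code):

```rust
/// True iff a 4-octet IPv4 subnet mask is contiguous, i.e. all
/// 1-bits precede all 0-bits (255.255.255.0 yes, 255.255.1.239 no).
fn mask_is_contiguous(mask: [u8; 4]) -> bool {
    let m = !u32::from_be_bytes(mask);
    // A contiguous mask inverts to the form 0...01...1, which holds
    // exactly when m & (m + 1) == 0. wrapping_add keeps the all-zero
    // mask 0.0.0.0 (m == u32::MAX, i.e. a /0) correct.
    m & m.wrapping_add(1) == 0
}

fn main() {
    assert!(mask_is_contiguous([255, 255, 255, 0]));  // /24
    assert!(!mask_is_contiguous([225, 255, 255, 0])); // typo from the report
    assert!(!mask_is_contiguous([255, 255, 1, 239])); // sparse mask above
}
```

A flag like the one proposed could simply gate this check on or off, accepting sparse masks only when the application opts in.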

~~~
JoshTriplett
Leaving aside the question of whether other software will work reliably with
non-contiguous subnet masks (which led to this recommendation), in general,
most software does _not_ deal well with NameConstraints. Some libraries ignore
it, some libraries fail hard if a constraint exists, and in general, I'd
expect a certificate chain involving NameConstraints to be poorly supported at
best and insecure at worst.

I _wish_ that NameConstraints were better supported, to make it easier to
support intermediate CAs; for instance, prove you own example.com and you
could then have a CA restricted to *.example.com. But right now, that just
doesn't seem feasible.

~~~
briansmith
I spent significant effort writing the name constraint implementation in
mozilla::pkix (used by Firefox) and by webpki (used by Rustls) to fix this
situation. The result is summarized by Netflix:

> The result was that every browser (except for Firefox, which showed a 100%
> pass rate) and every HTTPS client (such as Java, Node.JS, and Python)
> allowed some sort of Name Constraint bypass.

- [https://netflixtechblog.com/bettertls-c9915cd255c0](https://netflixtechblog.com/bettertls-c9915cd255c0)

Since then, Google Chrome has implemented and deployed a new certificate
processing library on some (most?) platforms it supports, and I bet they have
similar or better name constraint support. I believe Apple has also improved
their implementation.

> I wish that NameConstraints were better supported, to make it easier to
> support intermediate CAs; for instance, prove you own example.com and you
> could then have a CA restricted to *.example.com. But right now, that just
> doesn't seem feasible.

Since the aforementioned improvements have shipped in production browsers, it
is much more practical, from a technical standpoint, to do that. The real
problem now is browsers' CA policies. As I understand it, they do not want you
to be able to get your own name-constrained intermediate CA certificate. The
CA that issues you the intermediate CA certificate would be on the hook, with
the consequence of being removed from the root CA programs, if you issued a
malformed certificate. And there are other issues with the policy. I hope
there are improvements to the browsers' CA policies to make this practical,
but I wouldn't hold my breath.

~~~
JoshTriplett
Thank you for working on this!

> Since the aforementioned improvements have shipped in production browsers,
> it is much more practical, from a technical standpoint, to do that.

What about non-browser HTTPS/TLS libraries? Even if browsers support it
perfectly, this doesn't seem like something CAs should start deploying if
widely used libraries have problems with it. And based on the test results
from BetterTLS, it looks like widely used libraries still have problems with
it.

Also, the BetterTLS article gives an example of a NameConstraints for
"192.168.0.0/16"; I don't think that's something a public CA could reasonably
issue, for a variety of reasons (conflicts with routers, IoT devices, and many
other things). We need some reasonable solution for "local network" devices,
and in particular for devices where a user may be able to get hold of the
private key, but in the meantime I don't think publicly valid IP-address
certificates make sense.

> The CA that issues you the intermediate CA certificate would be on the hook,
> with the consequence of being removed from the root CA programs, if you
> issued a malformed certificate.

Given that today, such an intermediate CA could be used for arbitrary MITM,
that's entirely understandable. If there are additional constraints that an
intermediate certificate needs to have, we need to have enforcement mechanisms
to support such policies, so that an intermediate CA _can't_ create
certificates that may lead to MITM or similar attacks.

- Root CA policies for proving access to *.yoursite.example such that
you can get an intermediate CA.

- Root CA policies for the valid duration of intermediate CA certificates (no
longer than X, no longer than the proven ownership of your domain...). This
already constrains the valid duration of certificates underneath the
intermediate CA.

- Policies for requiring Certificate Transparency (e.g. any such intermediate
certificate still has to use CT logs, such that individual certificates must
appear in the CT logs to be considered valid). This _could_ be done by policy
in browsers, such that any CA opting into issuing intermediate CA certificates
must use CT for _everything_ ; that's where we're going for all CAs anyway.

- Do we have mechanisms to ensure that an intermediate CA can't issue another
intermediate CA certificate?

- Policies for wildcards underneath intermediate CAs. If you have an
intermediate CA for *.yoursite.example, under what circumstances can you
issue a wildcard certificate underneath it, and for what domains?

- Under what circumstances should you be able to get an intermediate
certificate for something like *.yoursite.internal? Should you, or should
all names chain up to a domain you prove ownership of (e.g.
*.internal.yoursite.example)? There's potential value in internal-only
domain names, and this would reduce the scope of attacks (preventing the
intermediate CA from issuing certificates for your public site), but ownership
issues might become tricky.

- Related: Today, if you want to MITM example.com, you need to get a root CA
to sign your certificate. If example.com obtains an intermediate CA, it's
potentially easier for an adversary to somehow obtain a certificate for
example.com or www.example.com signed by that intermediary, depending on
example.com's infrastructure. (It's also potentially easier for an adversarial
jurisdiction to force example.com to surreptitiously mint a certificate; CT
would help there, but this still seems like substantially more attack
surface.) Is there some way we can make that less likely and more difficult as
an exploit vector?

- Policies for the duration of certificates underneath intermediate CAs.

- Policies for handling revocation of certificates underneath intermediate
CAs.

- Policies for the use of dedicated HSMs for intermediate CAs. Should we make
that a policy requirement?

- Eventually, when NameConstraints becomes not only usable but best practice
for intranets, browsers and libraries could start defaulting to only loading
certificates from the system certificate store if they have NameConstraints
that meet some appropriate policy.

------
ReactiveJelly
"There were two informational and two minor-severity findings."

They found a few small issues, but nothing horrible.

------
est31
I wonder, with the deprecation of TLS 1.0 and 1.1 support being planned,
whether Firefox can adopt rustls in place of NSS.

~~~
dathinab
It's probably still missing a lot of features that NSS supports and Firefox
needs for backward compatibility and enterprise use cases. As far as I know,
NSS does slightly more than "just" TLS.

Also, rustls not supporting certain legacy/less secure settings can be a
problem. Generally it's a good idea, but sometimes you have to handle things
like the "communicate with embedded hardware which can only do that probably
backdoored elliptic curve in hardware and is too slow to do other curves in
software" situation.

Though for a very large share of use cases this doesn't matter, and the
rustls feature set has everything you need.

