Third-Party Audit of Rustls (jbp.io)
109 points by dochtman on June 14, 2020 | 15 comments


Some quotes from the report's conclusions:

"the team of auditors considered the general code quality really good and can attest to a solid impression left consistently by all scope items."

"was consistently well-documented and readable, demonstrating that security processes are ingrained in the development and documentation processes"

"Both from a design point of view as from an implementation perspective the entire scope can be considered of exceptionally high standard (...) no directly exploitable weaknesses could be identified."

"It appears to have been developed with all previously known issue-types in mind; furthermore, its missing support for insecure or outdated protocols and primitives indicates a security-conscious development approach."

"the developers of rustls have an extensive knowledge on how to correctly implement the TLS stack whilst avoiding the common pitfalls that surround the TLS ecosystem. This knowledge has translated reliably into an implementation of exceptional quality."

"The developer’s intent to provide a high-quality TLS implementation is very clear and this goal can be considered as achieved successfully. (...) Cure53 had the rare pleasure of being incredibly impressed with the presented software."


I especially liked this: "We couldn't reason about this. So we rewrote it."

Thankyouthankyouthankyouthankyou.

I have this discussion all the time and I can't get people to see the issue.


> There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.

-Tony Hoare, from his Turing Award lecture


> TLS-01-004 - Data Truncation in DER Encoding Implementation (low). This finding rightly points out a function in rustls that produces incorrect output when applied to an X.501 Name that is larger than 64KB. While that’s an exceedingly unlikely case, and the bug does not cause unsafe operation (but perhaps connection failure), the function has been corrected to produce valid output for all inputs.

The ISO 7816 smartcard standard uses this same ASN.1-based scheme for command and reply encoding. I found a memmove overflow and a separate off-by-one bug in Yubico's PC/SC library, and an off-by-one bug in the Yubikey 4 firmware, all related to their handling of this encoding.

It's possible that the Yubikey 4 bug could have been leveraged into an information leak, a buffer overflow, or some other exploit of the firmware.[1] The firmware was calculating a command reply length based on what the proper, compact length of the object should be, not the actual length of the stored object with malformed internal encoding, leading to corrupt reply chains. But I never explored it. I'm not an exploit developer, and didn't have time to go down that rabbit hole. I only discovered the bugs tracking down a bug in my own PC/SC library. It turned out I was generating an overlong encoding ({0x82 0x00 0xff} instead of {0x81 0xff}) for an object, tickling the Yubikey 4 issue. Yubico quickly fixed their library, but the engineer resolving the ticket downplayed the firmware issue, which seemed far more concerning as it suggested bug-prone buffer management techniques in the firmware.[2] (Indeed, it suggested that the firmware reused the buggy library code.) I haven't gone back to see what happened with the firmware since then.

Anyhow, paying attention to small encoding and decoding issues like these is important--the details always matter. "This couldn't possibly lead to an exploit" are famous last words.

[1] Hopefully not a leak of the key itself as I presume it was inaccessible to the firmware, but who knows--it's a black box.

[2] IIRC, their argument was garbage in, garbage out (GIGO), which is a dangerous misapplication of that principle. A malformed object should never result in malformed, corrupt responses by other layers of the protocol stack. In fact, the GIGO principle usually implies the opposite--that the behavior of other layers of the stack is invariant to bad data, not conditioned by it.

EDIT: Looking at my Git logs it seems my code was never buggy. My library couldn't read back some certificates generated by another, third, project and written to the card using the Yubico SDK. That third project was generating the overlong encoding, and in the process of tracking down the root of the issue I stumbled upon the Yubico bugs. IIRC, that third project was never reading back the certificate as they had a very convoluted scheme for 2FA and client-side SSH X.509 certificates that relied heavily on the bespoke behavior of their client-side Go daemon.
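
For anyone who hasn't fought with this encoding: here is a rough Rust sketch of the DER definite-length rule involved, covering both the minimal-form requirement behind {0x81 0xff} vs. {0x82 0x00 0xff} and the three-octet lengths at issue in TLS-01-004. Purely illustrative; the function name is made up, and this is not the rustls/webpki or Yubico code.

  // Illustrative only: minimal DER definite-length encoding, as used both in
  // X.509 structures and in ISO 7816 command/response TLVs.
  fn push_der_length(out: &mut Vec<u8>, len: usize) {
      if len < 0x80 {
          // Short form: a single octet with the high bit clear.
          out.push(len as u8);
      } else {
          // Long form: 0x80 | number-of-length-octets, then the length in
          // big-endian with leading zero octets stripped. That stripping is
          // what makes {0x82, 0x00, 0xFF} illegal DER for 255; the minimal
          // encoding is {0x81, 0xFF}.
          let bytes = len.to_be_bytes();
          let first = bytes.iter().position(|&b| b != 0).unwrap_or(bytes.len() - 1);
          out.push(0x80 | (bytes.len() - first) as u8);
          out.extend_from_slice(&bytes[first..]);
      }
  }

  fn main() {
      let mut v = Vec::new();
      push_der_length(&mut v, 0xFF);
      assert_eq!(v, [0x81, 0xFF]);

      // The TLS-01-004 case: lengths of 64 KiB and up need three length octets.
      v.clear();
      push_der_length(&mut v, 0x1_0000);
      assert_eq!(v, [0x83, 0x01, 0x00, 0x00]);
  }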


This is really exciting! Many popular Rust libraries support switching between rustls and openssl for their TLS needs and it’s much easier to do cross-platform builds with rustls.


> After spending thirty days on the scope in late May and early June of 2020, the team of auditors considered the general code quality really good and can attest to a solid impression left consistently by all scope items. Naturally, this is partially thanks to the usage of Rust as the preferred language for the entire implementation of the rustls project.


The finding in TLS-01-003 is surprising to me, mostly because it presupposes a lack of sophistication among users of this library who are also using X.509 NameConstraints. From RFC5280:

  For example, a name constraint for "class C" subnet 192.0.2.0 is represented as the octets C0 00 02 00 FF FF FF 00, representing the CIDR notation 192.0.2.0/24 (mask 255.255.255.0).
As they mention in the findings:

  Typically, subnet masks should be contiguous and the presence of a non-contiguous mask might indicate a typo (such as 225.255.255.0 vs. 255.255.255.0), or potentially an attempt to bypass an access control scheme. Therefore, it is recommended to treat certificates containing non-contiguous subnet masks in their name constraints as invalid.
This seems to run counter to the intent in the RFC. By allowing for a four-octet subnet mask, instead of simply an int to represent a contiguous CIDR mask, the RFC authors may have intended that more complex IP-based NameConstraints could be constructed. This certainly would make a huge difference for something like an intermediate (CA:TRUE), where it becomes much more economical to specify a sparse mask for a highly templated network. Think certs for network equipment or VoIP phones with regular, repeatable IP addressing across many locations/networks. E.g., a VoIP provisioning system that has an intermediate issuing CA with the following NameConstraints: IP:172.16.0.0/255.255.1.239.
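
To make the recommendation concrete, here's a rough Rust sketch of the contiguity check Cure53 seems to be suggesting. The octet layout comes from RFC 5280, but the function itself is invented for illustration and is not webpki's actual code.

  // An RFC 5280 IPv4 name constraint is eight octets: four address bytes
  // followed by four mask bytes.
  fn parse_ipv4_constraint(octets: &[u8]) -> Result<([u8; 4], [u8; 4]), &'static str> {
      if octets.len() != 8 {
          return Err("IPv4 name constraint must be exactly 8 octets");
      }
      let mut addr = [0u8; 4];
      let mut mask = [0u8; 4];
      addr.copy_from_slice(&octets[..4]);
      mask.copy_from_slice(&octets[4..]);

      // Contiguous means a run of ones followed by a run of zeros,
      // e.g. 255.255.255.0. A sparse mask like 255.255.1.239 fails here.
      let m = u32::from_be_bytes(mask);
      if m.leading_ones() + m.trailing_zeros() != 32 {
          return Err("non-contiguous subnet mask");
      }
      Ok((addr, mask))
  }

  fn main() {
      // The RFC 5280 example: 192.0.2.0 with mask 255.255.255.0.
      assert!(parse_ipv4_constraint(&[0xC0, 0x00, 0x02, 0x00, 0xFF, 0xFF, 0xFF, 0x00]).is_ok());
      // The VoIP-style sparse mask from above would be rejected.
      assert!(parse_ipv4_constraint(&[172, 16, 0, 0, 255, 255, 1, 239]).is_err());
  }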

If any change comes from this specific finding, I would hope that it's simply a flag to allow or disallow the use of discontiguous masks. I do understand that this is specific to WebPKI; having said that, if a client is implemented using rustls (with these recommendations enabled) and it happens across a perfectly valid certificate issued by an intermediate with a discontiguous mask in the NameConstraints, presumably it would fail or otherwise break. And yes, I have previously configured precisely this in an intermediate CA.


Leaving aside the question of whether other software will work reliably with non-contiguous subnet masks (which led to this recommendation), in general, most software does not deal well with NameConstraints. Some libraries ignore it, some libraries fail hard if a constraint exists, and in general, I'd expect a certificate chain involving NameConstraints to be poorly supported at best and insecure at worst.

I wish that NameConstraints were better supported, to make it easier to support intermediate CAs; for instance, prove you own example.com and you could then have a CA restricted to *.example.com. But right now, that just doesn't seem feasible.


I spent significant effort writing the name constraint implementation in mozilla::pkix (used by Firefox) and in webpki (used by Rustls) to fix this situation. The result is summarized by Netflix:

> The result was that every browser (except for Firefox, which showed a 100% pass rate) and every HTTPS client (such as Java, Node.JS, and Python) allowed some sort of Name Constraint bypass.

- https://netflixtechblog.com/bettertls-c9915cd255c0

Since then, Google Chrome has implemented and deployed a new certificate processing library on some (most?) platforms it supports, and I bet they have similar or better name constraint support. I believe Apple has also improved their implementation.

> I wish that NameConstraints were better supported, to make it easier to support intermediate CAs; for instance, prove you own example.com and you could then have a CA restricted to *.example.com. But right now, that just doesn't seem feasible.

Since the aforementioned improvements have shipped in production browsers, it is much more practical, from a technical standpoint, to do that. The real problem now is browsers' CA policies. As I understand it, they do not want you to be able to get your own name-constrained intermediate CA certificate. The CA that issues you the intermediate CA certificate would be on the hook, with the consequence of being removed from the root CA programs, if you issued a malformed certificate. And there are other issues with the policy. I hope there are improvements to the browsers' CA policies to make this practical, but I wouldn't hold my breath.


Thank you for working on this!

> Since the aforementioned improvements have shipped in production browsers, it is much more practical, from a technical standpoint, to do that.

What about non-browser HTTPS/TLS libraries? Even if browsers support it perfectly, this doesn't seem like something CAs should start deploying if widely used libraries have problems with it. And based on the test results from BetterTLS, it looks like widely used libraries still have problems with it.

Also, the BetterTLS article gives an example of a NameConstraints for "192.168.0.0/16"; I don't think that's something a public CA could reasonably issue, for a variety of reasons (conflicts with routers, IoT devices, and many other things). We need some reasonable solution for "local network" devices, and in particular for devices where a user may be able to get hold of the private key, but in the meantime I don't think publicly valid IP-address certificates make sense.

> The CA that issues you the intermediate CA certificate would be on the hook, with the consequence of being removed from the root CA programs, if you issued a malformed certificate.

Given that today, such an intermediate CA could be used for arbitrary MITM, that's entirely understandable. If there are additional constraints that an intermediate certificate needs to have, we need to have enforcement mechanisms to support such policies, so that an intermediate CA can't create certificates that may lead to MITM or similar attacks.

- Root CA policies for proving access to *.yoursite.example such that you can get an intermediate CA.

- Root CA policies for the valid duration of intermediate CA certificates (no longer than X, no longer than the proven ownership of your domain...). This already constrains the valid duration of certificates underneath the intermediate CA.

- Policies for requiring Certificate Transparency (e.g. any such intermediate certificate still has to use CT logs, such that individual certificates must appear in the CT logs to be considered valid). This could be done by policy in browsers, such that any CA opting into issuing intermediate CA certificates must use CT for everything; that's where we're going for all CAs anyway.

- Do we have mechanisms to ensure that an intermediate CA can't issue another intermediate CA certificate? (See the sketch after this list.)

- Policies for wildcards underneath intermediate CAs. If you have an intermediate CA for *.yoursite.example, under what circumstances can you issue a wildcard certificate underneath it, and for what domains?

- Under what circumstances should you be able to get an intermediate certificate for something like *.yoursite.internal? Should you, or should all names chain up to a domain you prove ownership of (e.g. *.internal.yoursite.example)? There's potential value in internal-only domain names, and this would reduce the scope of attacks (preventing the intermediate CA from issuing certificates for your public site), but ownership issues might become tricky.

- Related: Today, if you want to MITM example.com, you need to get a root CA to sign your certificate. If example.com obtains an intermediate CA, it's potentially easier for an adversary to somehow obtain a certificate for example.com or www.example.com signed by that intermediary, depending on example.com's infrastructure. (It's also potentially easier for an adversarial jurisdiction to force example.com to surreptitiously mint a certificate; CT would help there, but this still seems like substantially more attack surface.) Is there some way we can make that less likely and more difficult as an exploit vector?

- Policies for the duration of certificates underneath intermediate CAs.

- Policies for handling revocation of certificates underneath intermediate CAs.

- Policies for the use of dedicated HSMs for intermediate CAs. Should we make that a policy requirement?

- Eventually, when NameConstraints becomes not only usable but best-practice for intranets, browsers and libraries could start defaulting to only loading certificates from the system certificate store iff they have NameConstraints that meet some appropriate policy.
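
On the earlier bullet about an intermediate issuing further intermediates: X.509 already has a knob for that, the pathLenConstraint field of BasicConstraints (RFC 5280, section 4.2.1.9), which conforming validators are required to enforce. A rough sketch of the check, with invented types rather than rustls/webpki's real ones (the RFC's self-issued-certificate exception is ignored for brevity):

  // Invented types for illustration; not rustls or webpki internals.
  struct CaCert {
      is_ca: bool,                     // BasicConstraints cA flag
      path_len_constraint: Option<u8>, // BasicConstraints pathLenConstraint
  }

  /// `intermediates` is ordered from the root's child down to the issuer of
  /// the end-entity certificate (the end-entity itself is not included and
  /// does not count against the constraint).
  fn check_path_len(intermediates: &[CaCert]) -> Result<(), &'static str> {
      for (i, ca) in intermediates.iter().enumerate() {
          if !ca.is_ca {
              return Err("non-CA certificate used as an issuer");
          }
          // How many further intermediate CAs sit below this one.
          let below = intermediates.len() - 1 - i;
          if let Some(max) = ca.path_len_constraint {
              if below > max as usize {
                  // pathLenConstraint = 0 means: may issue end-entity
                  // certificates, but no further intermediates.
                  return Err("pathLenConstraint exceeded");
              }
          }
      }
      Ok(())
  }

  fn main() {
      // A name-constrained customer intermediate issued with
      // pathLenConstraint = 0 cannot itself mint another intermediate.
      let chain = vec![
          CaCert { is_ca: true, path_len_constraint: Some(0) },
          CaCert { is_ca: true, path_len_constraint: Some(0) },
      ];
      assert!(check_path_len(&chain).is_err());
  }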


I'm not completely certain, but my guess is a lot of software and middleware doesn't work well with non-contiguous subnet masks.

It's one of those (many) things where the RFC intends to support this in theory, but in practice it's close to unusable and better to block for most application use cases.

For example, many valid email addresses are unusable in practice, and it's recommended not to support them because today they are basically only used to intentionally cause problems (I mean email addresses that use quoting, like `"a b"@example.com`; you should always support internationalized mail addresses, except maybe as a mail provider, where it depends on your target customers).


I commented on this at https://github.com/briansmith/webpki/issues/130. This is nothing to be concerned about, for quite a few reasons. I don't expect any compatibility issues with being stricter since Google Chrome's new X.509 processing does enforce the stricter syntax.


"There were two informational and two minor-severity findings."

They found a few small issues but nothing horrible.


I wonder, with the deprecation of TLS 1.0 and 1.1 support being planned, whether Firefox could adopt rustls in place of NSS.


It's probably still missing a lot of features that NSS supports and Firefox needs for backward compatibility and enterprise use cases. As far as I know, NSS does slightly more than "just" TLS.

Also, rustls not supporting certain legacy/less secure settings can be a problem. Generally it's a good idea, but sometimes you have to deal with things like the "communicate with embedded hardware which can only do that probably-backdoored elliptic curve in hardware and is too slow to do other curves in software" situation.

Though for a very large share of use cases this doesn't matter, and rustls's feature set has everything you need.





