A Rust-based TLS library outperformed OpenSSL in almost every category (zdnet.com)
214 points by tene 35 days ago | 80 comments



Hacker News readers would probably do well to read the actual post: https://jbp.io/2019/07/01/rustls-vs-openssl-performance.html

It also comes with four separate posts that dig into the details, figuring out why the discrepancy exists.

Additionally, rustls is a tls library built on top of ring. ring is a project that is porting BoringSSL to Rust + assembly, bit by bit. BoringSSL was forked from OpenSSL. So there's common code ancestry here, especially around the primitives.

I have heard some people criticize these benchmarks because "those APIs aren't really used" or something; I don't fully understand this criticism myself, but they saw them as being "not real enough."


A small nit that bugs me in the original post

> The difference between AES128 and AES256 appears to be approximately 26%, rather than the 40% one might expect (AES128 does 10 rounds, AES256 does 14 rounds). There may be some limiting factor that prevents AES128 running at the expected speed, or perhaps this CPU has extra area dedicated to making AES256 faster.

The hardware in the CPU performs a single AES round; it does not dedicate more hardware to AES256. AES-GCM is not just AES: the difference is due to the GCM overhead, which is the same for 128 and 256.
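A back-of-the-envelope sketch (with a made-up overhead number) shows how a gap well under 40% can fall out of a fixed GCM cost rather than extra AES-256 hardware:

    // Illustrative only: per-block cost modelled as "AES rounds + a fixed
    // GHASH overhead", expressed in round-equivalents. The overhead value
    // is made up for the sake of the example.
    fn main() {
        let ghash_overhead = 5.0; // hypothetical fixed per-block GHASH cost
        let aes128_gcm = 10.0 + ghash_overhead;
        let aes256_gcm = 14.0 + ghash_overhead;
        // Prints ~1.27 (~27% slower), close to the observed ~26%, with no
        // extra AES-256 hardware needed to explain it.
        println!("AES-256-GCM / AES-128-GCM ratio: {:.2}", aes256_gcm / aes128_gcm);
    }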


This is always a risk with blogging: you say something shortsighted or silly and don't realise until someone says "eh?" in a public forum. I've removed that paragraph; thank you for the correction.


Did you find where they talked about what configuration options were given to OpenSSL? The benefit and curse of its complexity is that OpenSSL can run on everything, so the configuration options used are very important. I struggle to see this as a reproducible benchmark, since 'default options' for OpenSSL turn into a whole lot of autoconfiguration depending on the host OS detected, and I'm not sure what version of Debian they used for this test or how it was configured (stock or otherwise).

What would also have been nice is packet dumps from each, to see if there was anything interesting there as well. If rustls is, as you say, based on BoringSSL, it would be interesting to know whether it does something different at the packet-handling level that OpenSSL does not.


All I know is what's in the post; you're right that "Debian" is not exactly enough info...


How is the "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384" cipher suite test "not used"? That's just BS; a 22% performance improvement is pretty big.

And a 42% performance increase in TLS 1.3 handshakes is even bigger.


I'd expect very few clients to negotiate using SHA-384. It's widely viewed as overkill compared to SHA-256, and it hurts performance.

I'm not saying it invalidates the benchmark results or anything—I just wanted to address your "not used" question.


SHA-384 can actually be faster than SHA-256 in some cases. The reason is that SHA-256 uses 64 rounds with 32-bit words, while SHA-384 uses 80 rounds with 64-bit words. So each block processed by SHA-384 is twice as big but takes less than twice as many rounds.
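A quick rounds-per-byte comparison makes the point (these are just the algorithm parameters; real throughput depends on the implementation):

    // SHA-256: 64 rounds per 64-byte block. SHA-384 (which reuses SHA-512's
    // compression function): 80 rounds per 128-byte block.
    fn main() {
        let sha256 = 64.0 / 64.0;  // 1.0 rounds per byte
        let sha384 = 80.0 / 128.0; // 0.625 rounds per byte
        println!("SHA-256: {} rounds/byte", sha256);
        println!("SHA-384: {} rounds/byte", sha384);
    }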


Oh yes, definitely. But SHA-384's output is 128 bits wider, which uses more bandwidth. Although for the AEAD ciphersuites, it doesn't make much of a difference, admittedly.


There is the not-so-widely-used SHA-512/256 for that case: the speed of the 64-bit variants with a 256-bit output.


Isn't SHA-256 vulnerable to length extension attacks?


Yes, but so is SHA-384 and that is not relevant for the TLS context.


SHA-384 is not vulnerable to length extension attacks, precisely because it is truncated: the output is not the full internal state.

The speed advantage of SHA-512 and the advantage of truncation are why some more exotic variants like SHA-512/256 (SHA-512 truncated to 256 bits) are used in newer protocols.


Instead of rewriting BoringSSL bit by bit wouldn't it be better to just write a modern and much simpler Noise library?


Very interesting!

Rust normally prevents a lot of memory safety problems, but those are not the only kind of security problem in a crypto system. Obviously the library has to correctly implement the spec (and expectations of users), which just using Rust doesn't guarantee. More importantly, a crypto library also has to avoid leaking crypto data through various side channels. Just using Rust doesn't help with that; "Crypto Coding" at https://github.com/veorq/cryptocoding lists some rules to help address that.

Anyone know if rustls has worked on those issues as well? I notice that rustls uses ring, which asserts on https://github.com/briansmith/ring that "Most of the C and assembly language code in ring comes from BoringSSL, and BoringSSL is derived from OpenSSL. ring merges changes from BoringSSL regularly."

It doesn't matter if it's fast if it doesn't do things correctly...!


Inevitably some crypto primitives must be written in assembly language in order to ensure timing attacks and the like are not possible. As far as I know, there's no support in Rust or C for requiring such constraints from the code generation.

Still, ensuring the code around the primitives is safe in various ways is valuable.


You could write those in Vale[0], but at that point, just use miTLS[1] to have a completely verified TLS implementation.

[0] https://github.com/project-everest/vale

[1] https://mitls.org/


Brian Smith is well-regarded as an expert in this field. He knows what he's doing.


Just trusting a name and not the product has led to some bad outcomes before.


Having worked with Brian while integrating ring into Trust-DNS, I concur with the assessment from pcwalton.

But yes, the name isn't the only thing: the discourse around the development of ring, through the issue discussions on GitHub, is high quality and very conservative in what is accepted. The library is high quality and has decent test coverage, with a diligent review process. All of that stems from Brian's effort in maintaining an amazing project.


Can you provide an example of a name trusted among cryptographers, a specific library, and a bad outcome? I agree with Patrick Walton: Brian Smith is an extremely credible engineer in this particular domain.

I would venture, tentatively, the hypothesis that the exact opposite thing is true, and that trusting well-established subject matter experts has been a pretty good strategy in cryptographic security. There's a reason everyone uses Sodium/Nacl.


I think the subject is somewhat ambiguous, because many users are more likely to recognize library author names than cryptographers' names, and they imagine those authors are cryptographers in their own right.


Not a single one.


I am unfamiliar with any of the code, so take my statement with a grain of salt.

Sometimes crypto code must be written in weird ways to fend off attacks.

For example, to prevent timing attacks, code must sometimes be written to take constant time. This basically means that some loops must not exit early (such as when a key is wrong). Instead they must note the failure, then continue looping and return failure afterwards.

This means some code has to be written to always take worst-case time.
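As a sketch of the idea (not taken from any particular library, and ignoring what an optimising compiler might do to it): compare every byte and accumulate the result instead of returning at the first mismatch.

    // Constant-time-style comparison: always touches every byte and never
    // branches on secret data, so the timing doesn't reveal where the first
    // mismatch is. Real libraries also have to defend against the compiler
    // optimising this back into an early exit.
    fn ct_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() {
            return false; // lengths are usually public, so branching here is fine
        }
        let mut diff: u8 = 0;
        for (x, y) in a.iter().zip(b.iter()) {
            diff |= x ^ y; // note the failure, keep looping
        }
        diff == 0
    }

    fn main() {
        assert!(ct_eq(b"secret-mac", b"secret-mac"));
        assert!(!ct_eq(b"secret-mac", b"secret-mad"));
    }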

But if worst-case time in one library is better than in another library, there can definitely be improvements.


Yeah, but not surprising: OpenSSL is generally considered to be rather poorly written, from reinventing their own memory allocators (poorly), to layers upon layers of performance-sucking wrappers, to just bizarre and convoluted design decisions.

I'd be far more interested in seeing comparisons to, eg, BearSSL.


Note well: that was the prevailing (and probably correct) opinion a few years ago, but OpenSSL's quality has by most accounts improved dramatically over the last few years. I wouldn't talk someone out of using BearSSL, but if you're writing code that has to depend on a C-language TLS library, OpenSSL is a sane first choice.


I’ve reviewed OpenSSL code somewhat recently and agree that it’s vastly improved.

Though, if you want something based on Rustls with C bindings (I have no involvement in this project) there is: https://github.com/mesalock-linux/mesalink

That’s OpenSSL compatible bindings into Rustls.


The thing is, OpenSSL runs on so many platforms, which may or may not guarantee anything except a barely working malloc/free, so it makes sense to ship a custom allocator.


Which ones?


I wonder if the performance gains are due to BoringSSL being faster than OpenSSL?


I linked to the in-depth explanations below; it has stuff like

> The difference between OpenSSL and rustls appears to be thanks to an extra copy in the main data-path in OpenSSL.

and

> It appears OpenSSL recalculates and then discards the local public key when processing the server’s key share extension. This likely explains the larger performance deficit in this test.

in it. So I doubt that it's the result of differences in the primitives.


X.509 parsing has no allocations, and the call depth is significantly less than in OpenSSL.


That’s very cool!


Impressive. I know Firefox uses NSS (its own TLS library) and not OpenSSL, but this article makes me wonder if Mozilla would consider using rustls in FF as Rust handles more and more of the FF code. Maybe it says something that NSS has few users apart from FF despite the obvious advantage of being maintained by a major organization.

Interestingly, the crypto code in rustls is by ring, a Rust crypto library. According to the ring github page, "Most of the C and assembly language code in ring comes from BoringSSL, and BoringSSL is derived from OpenSSL."


I'd much rather see Firefox use miTLS[0]. They already use HACL[1] (miTLS and HACL are part of Project Everest[2]) in NSS[3], so I don't think it would be too big of a leap. Rust is really cool, but if you want to ensure that the implementation adheres to the spec, a language with formal verification is much better. That said, miTLS isn't finished yet, so of course it's going to take a while until anyone can actually use it.

[0] https://mitls.org/

[1] https://github.com/project-everest/hacl-star

[2] https://project-everest.github.io/

[3] https://blog.mozilla.org/security/2017/09/13/verified-crypto...


It looks like it's written in F#; that would make it harder to integrate with Mozilla's C++/Rust codebase. Not impossible, surely, but possibly not worth it, since in general an attacker has far more interesting avenues of attack than the SSL implementation; especially since it hasn't, AFAIK, had any high-profile vulnerabilities.



From the linked project page:

> The stable version of miTLS including the new 0.9 release are written in F# and specified in F7.

Excuse my confusion.


Chromium uses NSS too, at least on some platforms. I've also seen it in embedded Linux contexts; it's more common than you think it is.


The implication here is that it does better because it's written in rust. I doubt that's the case. I think it's faster because it's new, smaller (i.e. doesn't have to deal w/backwards-compatibility with previous versions or old protocols). More importantly, it uses the ring library for crypto, which is over half asm according to its github. Clearly, there's a lot of optimization there.

Regardless of the reason why, I'm very excited to see a better alternative to openssl. More competition will hopefully make all the projects better, and I wish the rustls guys the best of luck.


I've suspected for a long time that Rust would prove to be a faster language than C++ once it was done, but only as the code scales up. On a line-by-line basis, the two languages are comparable, so microbenchmarks will never show it, but Rust will tend to very, very strongly encourage better architectures that require you to do less copying just to be safe. It's really hard to measure "architectural affordances", though... heck, it's hard to even define them, let alone benchmark them.

OpenSSL has been optimized extensively, so I consider the fact that a hobbyist project with a fraction of the resources is able to beat it by that much to be significant in a way that, say, beating "some guy's github json parser" isn't. If I were going to temper my excitement, though, what I'd want is to make sure the Rust implementation isn't unknowingly trading away important security considerations to get that speed, such as whatever constant-time operations may be necessary or guarantees about when the key will be destroyed or whatever. I have no concrete knowledge about that in this specific case, but in a security context it is something I'd want to check in general. OpenSSL has had its stumbles, and the architecture is... debatable, especially when binding to it in a non-C language... but it has also been reviewed a lot, and battle-tested, and a lot of other things this little library just hasn't been yet. (Not because of any deficiency of the library, author, or language; some things can only come from time and scale.)


Rust already beat C in the benchmarks game

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


That boxplot shows both the C and the C++ programs ahead of the Rust programs, so why do you say "Rust already beat C" ?


That's not how I read the title. I think it's impressive that it's faster despite being written in Rust.

Maybe Rust also makes it easier to write fast code (I couldn't say one way or another), but it's great that you at least don't need to sacrifice performance on the altar of safety.


I'm generally positive on rewriting OpenSSL in Rust, but I agree the comparisons aren't completely scientific or necessarily more important than correctness. First, you should compare performance using the same implementations of the cryptographic primitives, as these are relatively easily fungible; we also don't know whether, for example, OpenSSL's optimised primitives were used. Secondly, the Rust language only excludes memory bugs; it doesn't exclude errors in the implementation of the TLS protocol or incorrect usage of cryptographic primitives, which can be just as catastrophic for security. These have been prevalent in OpenSSL and are somewhat harder to prevent a priori. For all we know these issues are worse for rustls than OpenSSL. This is where formally verified implementations would be useful.


> Secondly, the Rust language only excludes memory bugs; it doesn't exclude errors in the implementation of the TLS protocol or incorrect usage of cryptographic primitives, which can be just as catastrophic for security.

I linked to a talk about the project elsewhere and it's worth noting that the author of rustls leverages a lot of rust techniques that ensure certain correctness attributes at a semantic level, not just memory safety.

In particular, TLS libraries have long suffered from dealing with the complex composite state machines required by the protocol[0]. Rust makes the expression of safe state machines pretty easy (the talk demonstrates how).

[0] https://www.mitls.org/pages/attacks/SMACK
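For a toy sketch of the pattern (simplified, not rustls's actual types): states and messages become enums, the compiler forces every (state, message) pair to be handled, and invalid transitions can't slip through silently.

    // Toy typed state machine; states and messages are simplified and are
    // not rustls's real ones.
    enum ClientState {
        ExpectServerHello,
        ExpectFinished,
        Established,
    }

    enum Message {
        ServerHello,
        Finished,
        ApplicationData,
    }

    fn handle(state: ClientState, msg: Message) -> Result<ClientState, &'static str> {
        match (state, msg) {
            (ClientState::ExpectServerHello, Message::ServerHello) => {
                Ok(ClientState::ExpectFinished)
            }
            (ClientState::ExpectFinished, Message::Finished) => Ok(ClientState::Established),
            (ClientState::Established, Message::ApplicationData) => Ok(ClientState::Established),
            // Every other (state, message) combination is rejected explicitly.
            _ => Err("unexpected message for this state"),
        }
    }

    fn main() {
        assert!(handle(ClientState::ExpectServerHello, Message::ServerHello).is_ok());
        assert!(handle(ClientState::Established, Message::ServerHello).is_err());
    }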


Crypto libraries are a domain where the advantages of Rust are very important. You need really good performance, and you need really good safety guarantees. Rust is relatively unique in being able to deliver both of those as well as it does.


There are probably a lot of reasons why it's faster, and at least some of those are about rust, and some definitely aren't.

https://www.youtube.com/watch?v=aHMRFZkXq4Y

Here's a good talk about the project. Some of the patterns the author uses are sort of hard to do in other languages (such as pattern matching on composite states).


People do good things with rust. Therefore rust is a good programming language.

Could someone have done this particular good thing in another language? Absolutely. But that's not very relevant.


> Could someone have done this particular good thing in another language? Absolutely

While this is true for many other tasks, I don't think there exists any other programming language that fits so well for safe and efficient protocol implementation (where experimentation speed is not that important)


Given enough time and resources someone could do equivalent work in assembly. The point of Rust is that it should take you less time and resources to do so.


How much of these 'lol x written in Rust is better than y' is simply a matter of newer code generically improving on older poorly written software?


We're not talking about taking a venerable tool that's been sitting around for years and rewriting it though. We're talking about an actively-maintained and extremely highly-used library. I'm sure it's still bogged down by tech debt, but I'm also sure that any easy performance wins have already been taken.


A lot of time is spent on the assembler versions of the crypto, which is why they get used by various other projects like ring, BoringSSL, and the Linux kernel.

Less or no time is spent on improving the performance of the SSL library itself. From the article, it seems the SSL library might have easy fixes.


From the article it sounds like there are some simplistic reasons why OpenSSL may be slower, such as making an extra copy of the plaintext, but that doesn't necessarily mean that fixing it is easy.


Highly used, yes. But there are only two full-time developers on this, and funding was flaky. There was a giant discussion about this during the Heartbleed vulnerability.


That's true, but performance improvements don't require full-time developers. Many companies are directly impacted by OpenSSL performance (maybe more so in the past, when a big question about HTTPS adoption was the performance overhead of adding SSL to all requests, whereas today it's generally considered not a problem), which means companies have an incentive to submit patches to fix identifiable performance issues.


Even if that was all that is there, "x is better than y and written in a language that makes some kinds of bugs much less likely" is still great news for all current users of y. I read "rust-based" as an endorsement of the software, not of the language.


Plenty of them. Is there something to suggest that applies in this case?


Almost all of them but HN loves jumping on the bandwagon.


Thread from a couple weeks ago: https://news.ycombinator.com/item?id=20352296


If you're coding in Rust, need SSL, and are cross-compiling, using rustls over OpenSSL leads to simpler builds. This helped me simplify the Rust -> AWS Lambda flow on a recent project.


It would be cool if Firefox could switch to Rustls, but I can imagine that it will take a lot of time.


I'd prefer it to be a bit more battle-hardened before going into a widely used browser.


Agree. I'd rather have safe/secure over speed in this case.


I would be interested in a comparison with BoringSSL.


[flagged]


There was an article from Microsoft on why they're adopting Rust that was posted a little bit ago. The gist of it is that even with a skilled staff, 70% of their CVEs were due to memory errors.


Right - the whole point of having computers / machines is to reliably and efficiently do things that humans are bad at doing at scale. I honestly don't understand people who are excited about programming and also excited about doing things by hand that computers can do better and faster. Like, what is the appeal of programming to them?


Enjoying debugging their code? :)


The article was not from Microsoft but from some rag like ZDNet (I think it was also ZDNet, in which case they're currently submarine-ing Rust).

70% of CVEs are C/C++, because those bugs need to be fixed and important things run on C/C++.

It is an entirely useless metric.


ZDNet reported on a presentation given at a security conference by Microsoft.

> 70% of CVEs are C/C++, because those bugs need to be fixed and important things run on C/C++.

You misunderstood the statistic; the statistic was not "70% of CVEs come from C and C++", the statistic was "70% of bugs come from memory safety issues." They did not classify by language.


I kind of wanted to downvote you for being non-constructive, but instead, I'll ask you a question to try to make this a concrete discussion. Can you give a concrete example of "no language can even come close to doing what C can"?

(Not a statement about how C is flexible or powerful or portable or anything, a specific actual thing in C code.)


I'm a fan of Rust (although I haven't waded into it much yet), and a heavy user of C, and the one particular thing I find missing so far is a properly Rust-ified abstraction of QSBR-style RCU, with the full power and flexibility that can be achieved with liburcu-qsbr in C.

For C it's just some header files and a library and you get some amazing performance magic out of it. It's the key missing feature that makes me leery of going down the road of converting some existing C daemon code over to Rust.

It's hard to do low-level QSBR as a primitive for scalable threading in languages other than C/C++ unless you bake the abstraction in pretty deeply. I think it's probably possible to bring it to Rust in a meaningful way, but probably non-trivial to integrate with all the other cool things going on in the language for concurrency and the borrow checker, etc.


I'm not familiar with the implementation details of QSBR, and I've never used liburcu-qsbr, but what I've read about it tonight sounds a lot like the im[0] crate in terms of behaviour.

The behaviour in im is a data structure where readers get a reference that pins data in memory, so they have a consistent snapshot to read from. Any mutations made while there are outstanding read references will write to a new allocation for the part that changed, so the two references can share most of the data through structural sharing. When there are no outstanding read references, mutations happen in-place.

Have I correctly understood QSBR if I think it's similar behaviour to this? How does the implementation differ?

[0] https://docs.rs/im/13.0.0/im/
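For what it's worth, here's a std-only sketch of what I mean by that "readers hold a snapshot, writers publish a new version" behaviour; it's not im's or liburcu's actual API, and it skips both the structural sharing and the mutate-in-place-when-unshared optimisation.

    use std::sync::{Arc, Mutex};

    // Readers clone the Arc and keep an immutable snapshot alive for as long
    // as they hold it; writers build a new value and swap the pointer in.
    // Real RCU/QSBR avoids the lock and the full copy.
    struct Shared<T> {
        current: Mutex<Arc<T>>,
    }

    impl<T> Shared<T> {
        fn new(value: T) -> Self {
            Shared { current: Mutex::new(Arc::new(value)) }
        }

        fn snapshot(&self) -> Arc<T> {
            self.current.lock().unwrap().clone() // cheap: just bumps a refcount
        }

        fn update(&self, f: impl FnOnce(&T) -> T) {
            let mut guard = self.current.lock().unwrap();
            let next = Arc::new(f(&**guard)); // copy-on-write: old snapshots unaffected
            *guard = next;
        }
    }

    fn main() {
        let shared = Shared::new(vec![1, 2, 3]);
        let old = shared.snapshot();
        shared.update(|v| { let mut v = v.clone(); v.push(4); v });
        assert_eq!(*old, vec![1, 2, 3]); // reader's snapshot is unchanged
        assert_eq!(*shared.snapshot(), vec![1, 2, 3, 4]);
    }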


I’m not super familiar, but from a cursory google, it looks like crossbeam is the rust equivalent?


A valid point could be compiler availability. Almost every platform imaginable has a C compiler. Not a language feature though, just a sign of popularity.


Right, that's not an instance of "... language can even come close to doing what C can," absent an argument about how architectures that LLVM doesn't support are easy to write custom C compilers for but not LLVM backends. LLVM just happens not to support that many architectures today.

Also there have been a few attempts at a C backend for LLVM, and code in Rust that is compiled through C will generally retain its safety and speed properties.


Were hardware AES and DH ever used? With hardware AES, it becomes hard to get a statistically significant difference between crypto libraries.
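For anyone curious about their own machine, here's a small illustrative check for the relevant x86 instructions (AES-NI for the block cipher, PCLMULQDQ for GCM's GHASH), using std's runtime feature detection:

    // x86/x86_64 only; elsewhere it just reports that the check doesn't apply.
    fn main() {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            println!("AES-NI:    {}", is_x86_feature_detected!("aes"));
            println!("PCLMULQDQ: {}", is_x86_feature_detected!("pclmulqdq"));
        }
        #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
        println!("not an x86 target");
    }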

Unless you use a purpose-made low-CPU-usage cipher like Salsa20, hardware crypto is just so much faster than software. It is like comparing software H.264 implementations in 2019 and trying to make big news out of it.

Salsa20 and its derivatives look to me like a very promising cipher suite, but as long as there is no hardware support for it, it will not find its niche even in resource-constrained settings, since even $5 smartphone SoCs nowadays can run hardware AES faster than any software crypto.

Another point worth noting is that Salsa is still a relatively new and untried cipher, without much real-world track record other than its use in Chrome. Poly1305 is an even less tried construct than GMAC, with a harder-to-imagine hardware implementation (it is fair to say that most MACs are hard nuts to crack for hardware implementation because they use tons of LUTs).

Second, Salsa is the work of just one man plus a few collaborators. Knowing the crypto community, this itself may be a reason for pushback against it.

P.S. Question to people in crypto: why not use some derivative of Salsa20 as the MAC instead of Poly1305 in the Salsa-Poly combo?


> Second, Salsa is the work of just one man plus a few collaborators. Knowing the crypto community, this itself may be a reason for pushback against it.

Daniel J. Bernstein, to be precise, also the author of djbdns, qmail, NaCl, Curve25519, ... some of which he solo-authored. So I'd take the quoted skepticism with a huge grain of salt.


A lot of the benchmarking focused on handshaking, which is asymmetric cryptographic performance. As far as I know acceleration for asymmetric algorithms specifically is uncommon, although general SIMD instructions can improve performance.



