
Go crypto: bridging the performance gap - jgrahamc
https://blog.cloudflare.com/go-crypto-bridging-the-performance-gap/
======
travjones
It's great to see that companies can turn a profit AND contribute to open
source projects that benefit the web as a whole. Thanks, Cloudflare.

------
Intermernet
I didn't know the name Vlad Krasnov (I'm not that familiar with the crypto
world), but from looking at these [1], [2], it seems that he would know what
he's talking about from both a crypto and performance viewpoint.

> attempting to make them part of the official Go build for the good of the
> community

Would be interesting to hear what Adam Langley thinks about this, but I
couldn't find anything recent on the golang-dev list.

[1]: [http://dblp.uni-trier.de/pers/hd/k/Krasnov:Vlad](http://dblp.uni-trier.de/pers/hd/k/Krasnov:Vlad)

[2]: [http://stackoverflow.com/users/1516766/vlad-krasnov](http://stackoverflow.com/users/1516766/vlad-krasnov)

~~~
stock_toaster
I found this:
[https://github.com/cloudflare/go/issues/5](https://github.com/cloudflare/go/issues/5)

------
CyberShadow
I don't understand, why reimplement this if it is already implemented with
adequate performance/quality in OpenSSL? I thought Go didn't have issues
calling C code?

~~~
andrewchambers
Go has moving memory so it generally means calling C requires a bunch of
copying. This is slow and annoying.

~~~
robmccoll
Are you sure that's true? Perhaps they have reserved the ability to do that in
the future (can you point to a doc on that?), but right now I'm fairly certain
that Go does no such thing. The only copies necessary when calling C code are
for strings, since they aren't guaranteed / required to be null-terminated byte
arrays in Go. And cgo has C ABI compatibility, so function call overhead is
nonexistent.

~~~
ominous_prime
Function call overhead in cgo is considerable. You're right that it's not from
copying, but the runtime scheduler still has to coordinate the blocking call,
and the stack needs to be switched out to the C stack, kind of like a context
switch.

~~~
robmccoll
True, just benchmarked it and found the function call overhead to be about
1.85834729e-7 seconds (185 ns). Which isn't much, but the pure C version would
obviously be single nanoseconds for the handful of instructions needed
depending on the function call.
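
For reference, a per-call figure like the one above can be obtained with the standard `testing.Benchmark` helper. This is only a sketch of the measurement setup, not the benchmark robmccoll ran: `noop` and `nsPerCall` are hypothetical names, and the pure Go `noop` stands in for the cgo function that would be called in a real comparison.

```go
package main

import (
	"fmt"
	"testing"
)

// noop is a hypothetical placeholder for the function whose call cost is
// being measured. To measure cgo, a C function reached through import "C"
// would be called here instead.
//
//go:noinline
func noop() {}

// nsPerCall reports the average cost of one call to f in nanoseconds.
// The figure includes loop overhead and the indirect call through f.
func nsPerCall(f func()) float64 {
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			f()
		}
	})
	return float64(res.T.Nanoseconds()) / float64(res.N)
}

func main() {
	fmt.Printf("%.2f ns per call\n", nsPerCall(noop))
}
```

Running the same harness against a cgo wrapper and against a plain Go function gives the two numbers being compared in this sub-thread.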

~~~
fdej
185 ns is worse than function call overhead in Python (about 100 ns on this
machine)

~~~
robmccoll
Python calling Python, Python calling out to a native module, or Python
calling through ctypes?

~~~
fdej
Python calling a no-op Python function def foo(): pass

------
4ad
I will happily look over these once they are part of upstream Go, or I will
happily review them once they are proposed upstream, but I will take a pass on
their "special fork of Go".

------
biftek
I'm not sure what's involved with getting assembly working in Go (my
experience has only been with servers/clients and CLIs), but could this not
have been implemented as a standalone package? What's the benefit of a fork
in this case?

~~~
jzelinskie
I'm also wondering the answer to this question. To my understanding, go-crypto
is actually maintained separately from the Go stdlib by the Go developers[0].
So why isn't this just a fork of crypto? Why fork the entire language? Why
can't this be upstreamed?

[0]: [https://github.com/golang/crypto](https://github.com/golang/crypto)

~~~
sdevlin
This is a library of supplemental crypto algorithms. You won't see things like
AES, SHA-2, or the NIST curves in here. Those things are part of the standard
library in Go.
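
To illustrate the point, here is a minimal sketch that uses only standard library packages for SHA-256 and the NIST P-256 curve. The helper name `signAndVerify` is made up for this example; the packages and calls are the stock `crypto/*` ones.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// signAndVerify (a hypothetical helper) signs msg with a fresh P-256 key
// and reports whether the signature verifies.
func signAndVerify(msg []byte) (bool, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, err
	}
	digest := sha256.Sum256(msg) // SHA-2 also ships with the standard library
	r, s, err := ecdsa.Sign(rand.Reader, priv, digest[:])
	if err != nil {
		return false, err
	}
	return ecdsa.Verify(&priv.PublicKey, digest[:], r, s), nil
}

func main() {
	ok, err := signAndVerify([]byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println("signature valid:", ok)
}
```

Nothing here imports golang.org/x/crypto, which is why speeding up these primitives means touching the Go tree itself.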

------
ashearer
Nice work! The article benchmarks a > 20X speedup for AES-128-GCM, for
performance described in the text as "on par" with OpenSSL. It would be
helpful for reference to have an OpenSSL column added to the benchmark table.
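
For anyone who wants to reproduce a rough AES-128-GCM number on their own machine, a sketch along these lines should work with the standard library alone. The helper names, the 8 KiB message size, and the iteration count are arbitrary choices for this example, not the article's benchmark setup.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"time"
)

// newGCM builds an AES-GCM AEAD; a 16-byte key selects AES-128.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// roundTrip seals msg under a random nonce and opens it again.
func roundTrip(aead cipher.AEAD, msg []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	sealed := aead.Seal(nil, nonce, msg, nil)
	return aead.Open(nil, nonce, sealed, nil)
}

func main() {
	key := make([]byte, 16)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	aead, err := newGCM(key)
	if err != nil {
		panic(err)
	}
	msg := bytes.Repeat([]byte{0x42}, 8192)
	out, err := roundTrip(aead, msg)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", bytes.Equal(out, msg))

	// Rough throughput. The fixed nonce is tolerable only because the
	// output is thrown away; never reuse a nonce with real data.
	const iters = 10000
	nonce := make([]byte, aead.NonceSize())
	dst := make([]byte, 0, len(msg)+aead.Overhead())
	start := time.Now()
	for i := 0; i < iters; i++ {
		dst = aead.Seal(dst[:0], nonce, msg, nil)
	}
	mbps := float64(iters*len(msg)) / time.Since(start).Seconds() / 1e6
	fmt.Printf("%.0f MB/s\n", mbps)
}
```

Comparing the printed figure with `openssl speed -evp aes-128-gcm` on the same machine gives an approximation of the missing column.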

------
michaelt
Cloudflare's Universal SSL is pretty great - I used to host my static website
on S3, but that means no SSL or IPv6. Putting everything through Cloudflare
sorts both those out.

The original announcement [1] mentioned they were planning support for adding
in the HSTS header - as jgrahamc is here responding to comments, I'd be
interested to hear how far they've got with that :)

[1] [https://blog.cloudflare.com/introducing-universal-ssl/](https://blog.cloudflare.com/introducing-universal-ssl/)

~~~
jgrahamc
We added it during "Week of SSL":
[https://blog.cloudflare.com/enforce-web-policy-with-hypertex...](https://blog.cloudflare.com/enforce-web-policy-with-hypertext-strict-transport-security-hsts/)

------
thomasahle
"Let's just rewrite all the crypto ourselves! In assembly!"

Sorry to be a buzzkill, but that sounds like a recipe for disaster.

~~~
4ad
_All_ the low-level crypto is written in assembly. And not only for speed, but
for ensuring properties like constant-time execution, etc. It's the same for
OpenSSL, and the same for commercial crypto libraries. Go is not at all
different here.

The difference is that the higher-level crypto is written in Go, not in C; Go
is memory safe, much more strongly-typed in general, and with run-time bounds
checking which eliminate buffer overflows.

The bugs are almost never in the low-level algorithms, they are in the higher-
level components.
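
A small sketch of the run-time bounds checking mentioned above: an out-of-range index in Go panics instead of reading neighbouring memory. `readAt` is a made-up helper for this example; real code would normally check the index rather than recover from the panic.

```go
package main

import "fmt"

// readAt (a hypothetical helper) returns buf[i]. An out-of-range index
// triggers a run-time panic, which is converted into an error here; the
// read never reaches memory outside buf.
func readAt(buf []byte, i int) (b byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return buf[i], nil
}

func main() {
	buf := []byte{1, 2, 3}
	fmt.Println(readAt(buf, 1)) // in range
	fmt.Println(readAt(buf, 7)) // out of range: reported as an error
}
```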

~~~
alfiedotwtf
There was a post on HN a few weeks ago arguing that "ensuring properties like
constant-time execution" isn't possible, as instruction timings don't take
into account things like caching, pipelining, task switching, microcode
optimisations etc.

~~~
sdevlin
When people describe crypto code as "constant-time", they typically just mean
its execution time is data-independent.
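
A minimal sketch of that distinction, using a byte comparison as the example. `leakyEqual` and `constantTimeEqual` are names invented here; `subtle.ConstantTimeCompare` is the standard library call.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// leakyEqual returns at the first mismatching byte, so its running time
// depends on how long a prefix of the two inputs agrees.
func leakyEqual(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// constantTimeEqual examines every byte regardless of where the inputs
// differ, so for equal-length inputs its timing does not depend on the data.
func constantTimeEqual(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}

func main() {
	secret := []byte("0123456789abcdef")
	guess := []byte("0123456789abcdeX")
	fmt.Println(leakyEqual(secret, guess), constantTimeEqual(secret, guess))
}
```

Both functions return the same answers; only the second avoids leaking the position of the first difference through its timing.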

------
ufo
> Given the many vulnerabilities related to the use of AES-CBC with HMAC

What are they talking about here? Are there any important ones if you MAC
after encryption? The only vulnerabilities I know of are when you MAC before
you encrypt.

~~~
tptacek
They're talking about the TLS CBC constructions, not AES-CBC and HMAC in the
abstract.

------
donpark
Wish Cloudflare spread some of that Go asm love to Ed25519.

~~~
higherpurpose
Indeed. Couldn't they have supported it for a small part of users like they're
doing with ChaCha20-Poly1305?

~~~
donpark
ChaCha20-Poly1305's 3-4x performance gain over AES-GCM made it worthwhile for
them where Ed25519's benefits are not as clear cut to non-proponents. Too bad.

------
tav
Nice work! Do you guys also have a similar implementation of ChaCha20-Poly1305
for Go? If so, any chance you could share that too?

~~~
jgrahamc
That work has been done for OpenSSL, I guess we could add for Go:
[https://blog.cloudflare.com/do-the-chacha-better-mobile-perf...](https://blog.cloudflare.com/do-the-chacha-better-mobile-performance-with-cryptography/)

~~~
tav
That would be pretty awesome! Also, do you happen to know if your current
patch is safe to apply on top of OpenSSL 1.0.2a?

------
jvermillard
Strange they don't mention AES-CCM at all. I'm using it a lot in IoT
applications (DTLS). Is it not frequent in the "web" world?

------
atlbeer
Off-topic but, they probably shouldn't be using a Southeast train image in
this blog post as they are notoriously bad for delays and slow service in the
UK[1] :)

[1] [http://www.which.co.uk/home-and-garden/leisure/reviews-ns/be...](http://www.which.co.uk/home-and-garden/leisure/reviews-ns/best-and-worst-uk-train-companies/best-and-worst-trains-for-delays-/)

~~~
jgrahamc
I added that. As someone who lives in London I'm well aware of how bad their
service is. It was a deliberate ploy to make you spend more time on the blog
post by getting you to shake your head about the image and then read the post.

------
aikah
I'm curious whether the Go team will merge Go crypto into the core or not. It
will show how open they are to third party contributions.

~~~
4ad
Your comment is significantly out of place. The Go team is extremely open to
quality external contributors. 463 people have contributed to Go so far,
(obviously) most of them outside the core Go team. The Windows port was
exclusively done and maintained by contributors. I have done the arm64 Go
compiler, which is upstream now, the Solaris port, and I am now doing the
sparc64 compiler. Many 3rd party contributors do many things every day.

The Go project is an extremely open project. More than half of the people who
have direct commit access are external contributors.

~~~
coldtea
> _Your comment is significantly out of place. The Go team is extremely open
> to quality external contributors._

I think his comment is still valid. Adopting something major like this is not
the same as accepting bugfixes from hundreds of people, or ports to a
different architecture.

Even more different would be accepting some code for the standard library
whose API wasn't designed by the core team.

From what I've seen the core team is quite opinionated and micro-managing
things.

~~~
DannyBee
Well, for starters, to accept it, the licensing would need to be changed
slightly.

Parts have

    
    
      +// Copyright 2015 The Go Authors. All rights reserved.
      +// Use of this source code is governed by a BSD-style
      +// license that can be found in the LICENSE file.
      +
      +// Copyright 2015 Intel Corporation
      +// Copyright 2015 CloudFlare, Inc.
      +
      +// This file contains constant-time, 64-bit assembly implementation of
      +// P256. The optimizations performed here are described in detail in:
      +//   S.Gueron and V.Krasnov, "Fast prime field elliptic-curve cryptography with
      +//                            256-bit primes"
    

The additional copyright notices would not be okay. They would cause
_everyone_ who uses this library to have to reproduce not just the standard go
copyright notices, but _those_ , too.

~~~
twotwotwo
They seem OK to me. A quick grep of my Go tree turns up lots of third-party
copyright notices. It's OK as long as their code was released under licenses
compatible with Go's.

These additional copyright notices could be removed by CloudFlare and Intel
going in the AUTHORS file (which defines "The Go Authors"), presumably after
they and Google do any required paperwork. Red Hat, Dropbox, and Fastly are in
AUTHORS. But that needn't be a condition of integrating their code as long as
it's licensed properly.

The paper citation doesn't appear to be copyright-related, and others like it
are sprinkled around the codebase, e.g., package sort's source cites some
papers on efficient sorting.

~~~
DannyBee
"They seem OK to me. A quick grep of my Go tree turns up lots of third-party
copyright notices. It's OK as long as their code was released under licenses
compatible with Go's."

Okay, let me rephrase: "they aren't okay". It's actually my job to make these
decisions and tell teams what is and what isn't okay :)

The other issues you mention are in the process of being fixed.

"These additional copyright notices could be removed by CloudFlare and Intel
going in the AUTHORS file (which defines "The Go Authors")"

Yes, they could, but that requires agreement from more than just Cloudflare.
This is code Intel donated to OpenSSL, not to Go, so it's simply not as
trivial as Cloudflare saying "sure, here's some code". Intel has to agree to
have their copyright notice changed, etc.

"The paper citation doesn't appear to be copyright-related, and others like it
are sprinkled around the codebase, e.g., package sort's source cites some
papers on efficient sorting."

I have no care in the world about this part.

~~~
twotwotwo
I get that Google always _really_ really wants a CLA since it does things the
BSD license doesn't (patent grant!). I also agree that, practically speaking,
legal stuff is absolutely part of the process of getting the asm crypto stuff
merged in. And I know you're qualified.

But I read the initial comment as saying, specifically, no third-party
copyright notices, ever.

I have code up under Go's license with more than one set of copyright
notices
([https://github.com/twotwotwo/sorts](https://github.com/twotwotwo/sorts)).
From a grep, Go 1.4 has ~1,026 non-"The Go Authors" copyright notices in ~271
files in ~35 dirs (Lucent and other Plan 9 copyright holders, Sun,
individuals, MPEG and yacc authors).

If there is stuff I should read/learn to have any hope of understanding what's
OK (to keep my own stuff clean, and generally), it would help me to know.

