Why Constant-Time Crypto? (bearssl.org)
117 points by libeclipse 5 months ago | 23 comments

Having spent a bunch of time wrapping crypto libraries for Python, I regularly saw the author's name (Thomas Pornin) providing in-depth answers on the Cryptography Stack Exchange: https://crypto.stackexchange.com/users/28/thomas-pornin.

It is encouraging to see BearSSL at beta stage. I believe the last time I saw it, Thomas still considered it to be alpha. The promise of a small, MIT-licensed crypto library, written by a solid cryptographer who understands implementation details is nice. I’ve used libtomcrypt in the past, but as a non-expert, I’ve always wondered how resilient and well-written the code base is.

libtomcrypt was started by Tom St Denis as he was learning programming, at around 15 years old. The exact history could be extracted by reading the Usenet sci.crypt group archives. In its early days it suffered the horrible security woes you would expect given those premises (e.g. https://groups.google.com/d/msg/sci.crypt/j3EW_pt6pKA/u8PYkf...). To his credit, Tom St Denis was always very honest and prompt to fix the issues. Fairly recently, it still had very bad issues (e.g. https://github.com/libtom/libtomcrypt/issues/74)

I've got no idea what to expect now, but I've never seen it used in anything I'd consider serious. To many people, it's forever tainted. The lack of CVEs in libtomcrypt is not an indication of quality, just an indication of lack of usage. In any case, its lack of support for PKCS#11 means it's not suitable for use cases where keys must be securely stored.

Looks like @pornin has invested a lot more effort since BearSSL's initial release nearly a year ago; previous HN comments at:


A little off-topic, but I wonder if the barcode-ish design in the header is a puzzle with an interesting hidden message. It's not uncommon for cryptographers to do things like this.

It's a Code 39 ("3 of 9") barcode (https://en.wikipedia.org/wiki/Code_39) that says "BearSSL"

Best I could come up with was "868686692969"...

The bar code reads: 86 86 86 69 29 69.

> Curve25519 allows for much simpler code (for instance, there is no need for validation, since all sequences of 32 bytes are by definition valid), and the “natural” Montgomery ladder is constant-time.

It's worth noting that while no validation is strictly necessary, some inputs can force the output to be all-zero[1]. That should be checked.

[1] RFC 7748 (https://www.ietf.org/rfc/rfc7748.txt), p. 14
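
For X25519, that check amounts to verifying that the 32-byte shared secret is not all zeros, and the check itself can be done in constant time. A minimal sketch (the function name and interface are illustrative, not BearSSL's API):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Return 1 if the 32-byte X25519 output is all zeros (which a
     * low-order peer point can force), 0 otherwise. The check itself
     * uses no secret-dependent branch.
     */
    static int
    x25519_output_is_zero(const uint8_t out[32])
    {
        uint32_t acc = 0;
        for (size_t i = 0; i < 32; i++) {
            acc |= out[i];
        }
        /* (acc | -acc) has its top bit set iff acc != 0. */
        return (int)(1U ^ ((acc | (0U - acc)) >> 31));
    }

The caller would then reject the shared secret whenever this returns 1.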

It would only matter to cryptographic code if an implementation somehow used floating-point, but the source-level conversion from floating-point to unsigned integer can also leak information in execution time when translated to x86 code:


Other implementations purely in hardware or purely in software would be constant-time, but x86, which only offers an instruction for conversion to a signed integer, leads to a mixed solution in which the short sequence of instructions generated by compilers contains a conditional branch on the value being converted.
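
Concretely, since the hardware only converts to a signed integer, a double-to-uint32_t cast on 32-bit x86 gets compiled into roughly the following branching pattern (simplified; the exact sequence varies by compiler and flags):

    #include <stdint.h>

    /* Roughly what a compiler has to do for (uint32_t)d on 32-bit x86:
     * values that don't fit in int32_t take a compensating path, so there
     * is a conditional branch on the value being converted. */
    static uint32_t
    double_to_u32(double d)
    {
        if (d >= 2147483648.0) {
            return (uint32_t)(int32_t)(d - 2147483648.0) + 0x80000000u;
        } else {
            return (uint32_t)(int32_t)d;
        }
    }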

I had another idea for making constant time programs. The idea is that you create a virtual machine without loops or branch instructions, in which each instruction is tested so that the amount of time it takes is independent of the instruction's input or output. You then compile your C code to this instruction set, and execute it on the VM. A program running on this VM, no matter how it's produced, will execute for the same amount of time.

Why not do this for crypto?
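
As a rough sketch of the idea (instruction set and encoding made up for illustration): the VM executes a straight-line program of branch-free instructions, so the only control flow is the public loop over the program itself, and run time depends only on the program length:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical branch-free instruction set: no jumps, no loops. */
    enum op { OP_XOR, OP_AND, OP_OR, OP_ADD, OP_MUX };

    struct insn {
        enum op op;
        uint8_t dst, a, b, c;   /* register indices */
    };

    /*
     * Execute a straight-line program. The switch branches only on the
     * opcode, which is part of the (public) program, never on register
     * contents, so timing is independent of the data being processed.
     */
    static void
    run(const struct insn *prog, size_t n, uint32_t reg[16])
    {
        for (size_t i = 0; i < n; i++) {
            const struct insn *p = &prog[i];
            uint32_t a = reg[p->a], b = reg[p->b], c = reg[p->c];
            uint32_t r = 0;
            switch (p->op) {
            case OP_XOR: r = a ^ b; break;
            case OP_AND: r = a & b; break;
            case OP_OR:  r = a | b; break;
            case OP_ADD: r = a + b; break;
            case OP_MUX: r = (a & b) | (~a & c); break; /* a: all-0 or all-1 mask */
            }
            reg[p->dst] = r;
        }
    }

The catch is that a C "if" has to become computing both sides and selecting with OP_MUX, and loops have to be unrolled to their worst-case count, which is where compiling arbitrary C gets awkward.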

I'd do it the other way round. A VM is a bit of a pain and you can't target a constant-time-only VM with a regular C compiler. Instead I'd produce a restricted C compiler that only accepted constant time programs, and mapped directly to a limited set of machine instructions. Regardless of what you do, you'd have to produce a whitelist of constant-time machine instructions, so you might as well emit them directly rather than relying on a general-purpose optimising compiler to do the right thing.

You can then emit an object file containing the functions you're interested in and link it to the rest of your non-constant-time program.

"Instead I'd produce a restricted C compiler that only accepted constant time programs, and mapped directly to a limited set of machine instructions."

This has more potential just because it could be licensed in the embedded sector to reduce work on things such as WCET analysis for hard real-time. That money and uptake could feed back into the compiler's development for things such as constant-time-preserving optimizations. I haven't seen one of these, though. They typically do timing analysis manually (i.e. with a specific tool), since they have to assess the algorithms themselves, interactions with other components, and so on. The neatest thing I found looking into that was a formally-verified WCET analysis built on top of the CompCert compiler.

Oh wait, just found something:


"Instead I'd produce a restricted C compiler that only accepted constant time programs, and mapped directly to a limited set of machine instructions."

How would the compiler know if the code it's given is constant-time? It's actually quite a hard property to verify, due to pointer aliasing and whatnot.

Another angle is to have a compiler that always outputs constant-time code. It's always possible, at least for crypto code, according to Peter Schwabe; it's just probably not going to be very fast code...

Because as explained in the article, elementary things like memory access a[i] are not constant time with respect to i?
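
The usual workaround is to read every element and select the wanted one with a mask, so that the addresses touched never depend on i; a sketch (illustrative only, indices assumed to fit in 32 bits):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Constant-time equivalent of "return a[i]": touch every element and
     * keep only the one whose index matches. Cost is proportional to len,
     * and the memory addresses accessed do not depend on i.
     */
    static uint32_t
    ct_lookup(const uint32_t *a, size_t len, size_t i)
    {
        uint32_t r = 0;
        for (size_t j = 0; j < len; j++) {
            uint32_t diff = (uint32_t)(j ^ i);
            /* eq is 1 when j == i, 0 otherwise; mask is all-ones or all-zeros. */
            uint32_t eq = 1U ^ ((diff | (0U - diff)) >> 31);
            uint32_t mask = 0U - eq;
            r |= a[j] & mask;
        }
        return r;
    }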

Because an arbitrary C program will have non-constant branches (so may become exponentially long in your proposed instruction set) and loops (not representable at all in your proposed instruction set)?

If what you are saying is that a language could hypothetically statically verify that the execution of a function was constant time with respect to certain inputs, that sounds like it would work and that the only obstacle is engineering effort. But compiling to such a language doesn't sound very practical to me.

I think someone wrote an applicative functor instance in Haskell that would accomplish something like this. I forget if it was constant space or time or what (I can't find the link, unfortunately).

All I remember at the time was thinking I had no idea why anyone would want this. But then someone (on either hn or proggit) mentioned that there were crypto reasons why it would be useful.

So I suspect that compiling isn't too bad ... however, programming with it probably is annoying.

EDIT: Actually, it looks like it was top comment in this thread: https://news.ycombinator.com/item?id=7557089

Kind of a neat idea at any rate.

Silly question here: why not get a reasonable worst-case time measure one-off, then yield the thread until total time elapsed for said crypto op is up?

First of all, that technique is not very accurate: calling sleep() is not guaranteed to return after a specific amount of time.

But more importantly, that only covers the basic timing attack side channel and not even that very well. Cache and branch predictor attacks are still possible (and these are a bigger threat when you're using a virtual machine that may be shared with your adversary). Not to mention all the side channels (power consumption, etc) that are visible if your hardware is compromised.

And if you do end up yielding a thread, the context switch overhead will be measurable. It can take hundreds of microseconds to fully recover from a context switch (TLBs and cache misses, etc), which will certainly affect the timing. It would almost certainly be possible to detect if the encryption method yielded the thread or not based on timing alone (e.g. encrypting next block would consume more time after a thread yield).

There are no easy ways out; crypto code should be free of side channels: constant time, no data-dependent memory accesses (cache attack), no data-dependent branches (branch predictor attack), etc. This requires the crypto implementor to be proficient not only in the mathematics of cryptography but also in the details of the computer architecture they're running on.
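
Even something as small as checking a MAC shows the branch requirement: a naive memcmp() returns at the first mismatch, so the comparison has to accumulate all differences and test once at the end. A sketch (not taken from any particular library):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Compare two equal-length buffers without early exit or
     * data-dependent branches. Returns 1 if equal, 0 otherwise.
     */
    static int
    ct_equal(const uint8_t *a, const uint8_t *b, size_t len)
    {
        uint32_t acc = 0;
        for (size_t i = 0; i < len; i++) {
            acc |= (uint32_t)(a[i] ^ b[i]);
        }
        return (int)(1U ^ ((acc | (0U - acc)) >> 31));
    }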

Making the overall computation take a fixed amount of time does not plug indirect leaks. In particular, if the computation makes memory accesses at addresses that depend on secret values, then this will populate the memory caches for these addresses, and evict from the caches whatever data was using the same slots. The attacker may work out, after the computation, which previous data elements were evicted from cache, thereby leaking secret information. Crucially, the cryptographic computation itself took a fixed amount of time, but leakage occurs nonetheless.

This kind of leak still counts as a "timing attack" because the test for cache eviction is based on the time it takes to next access the relevant element (and this occurs _after_ the cryptographic computation, from other code). Notably, this can still be done remotely, possibly from a large distance, thanks to the efficiency of modern networking (this is what makes timing attacks special among side-channel leaks: power analysis, electro-magnetic emissions, sound... require the attacker to be in the physical vicinity of the target system, while timing attacks may be performed from hundreds of miles away). Demonstrations have been made with network access (over ethernet or optic fibre), and, in another kind of setup, when the attacker can run his own code on the same hardware, even with logical isolation (another process, or even in another VM that happens to run on some other cores of the same machine).

Thus, "true" constant-time code will make no memory access at an address that depends on secret data (but the data contents, of course, may be secret). Similarly, it won't make conditional jumps that depend on secret data (because of the cache accesses for loading the code, and also because of the jump prediction cache, both having been successfully exploited in lab demonstrations).

I would imagine that you can detect that in various ways - how fast the server is able to serve concurrent requests, or if you have more local access how much power is being drawn, and so on.

I think the goal is not necessarily constant time, but rather to do the same things independently of the bits of the secrets (symmetric keys, private keys, etc.). As a result, you get constant time.

But you get much more, because if someone analyzes physical conditions (CPU/memory usage, temperature, time itself), they can't gather information about the secrets.

That's going to vary based on processor, memory, load, etc.

Something I really dislike about OpenSSL is how many generated header files it requires. Does BearSSL manage to implement everything in standard C?

What is everything? The TLS and X.509 code is implemented in a Forth dialect compiled to C.
