
Show HN: Monocypher: a small, easy to use crypto library - loup-vaillant
http://loup-vaillant.fr/projects/monocypher/
======
niftich
This project seems to be the culmination of a rhetorical question that the
author posed a while ago: if the advice is 'don't roll your own crypto' [1],
then how does any crypto ever come into being? [2]

So perhaps the marketing is off: I don't view this as a "small, secure,
auditable, easy to use crypto library"; I view this as a practical and ongoing
study in investigating how to make small, secure, auditable, easy-to-use
crypto libraries. To be fair, the page mentions that it's "not ready for
production" and needs external review a quarter of the way down, but if the
point is to prove that a sufficiently dedicated individual can learn the
material, endure some criticism, and produce quality software following the
example of well-acclaimed experts, then _own_ this point, bare your
disclaimers in the beginning, and make it more prominent.

And once you phrase it that way -- which is arguably consistent with the
author's previous posts (and HN comments) on the topic, it's suddenly a lot
more palatable, and regardless of its ultimate outcome, I can only applaud.

[1] I use an interpretation of "don't roll your own crypto" that's slightly
less circular than the author's. In my opinion, it's shorthand for _' defer to
the opinion of experts, use ready-made constructs when possible, and if not,
then exercise caution when hooking crypto primitives together in unproven
ways'_. Originally posted here: [3]

[2] [http://loup-vaillant.fr/articles/rolling-your-own-crypto](http://loup-vaillant.fr/articles/rolling-your-own-crypto)

[3]
[https://news.ycombinator.com/item?id=12400040](https://news.ycombinator.com/item?id=12400040)

~~~
loup-vaillant
I'm fairly confident I can write a bug free crypto library. Monocypher is
probably already bug-free.

On the other hand, I don't see how I can _ensure_ my library is bug-free
without external help.

"Probably bug free" is not good enough. I need over 99.9% certainty, and I'm
currently pretty far from that.

~~~
vessenes
I would be careful with statements like this, largely because the question of
what 'bug free' means is real.

Do you mean that the crypto algorithms work as advertised? e.g. would DES
after public knowledge of differential cryptanalysis be okay?

Do you mean that you have implemented algorithms exactly as specified?

Do you mean that you have implemented algorithms exactly as specified that are
also constant time so that they resist timing attacks?

Do you mean that you have implemented algorithms exactly as specified that are
not vulnerable to cache eviction attacks on shared hardware?

Do you mean that you have implemented algorithms exactly as specified that
resist bit-flip attacks or SDR attacks, or audio attacks, or other Van Eck
style listening?

I'm actually not trying to be down on your work. It's hard work, and the world
needs more people who actually do it. I'm just saying it is VERY hard to feel
confident, and as someone who has a passing interest in this kind of security,
I worry when I read statements like you made.

~~~
loup-vaillant
Exactly as specified, and immune to timing attacks including cache eviction. I
make no claim about other side channels, or the strength of the primitives
themselves, though I did pick strong primitives.

~~~
vessenes
:)

------
tptacek
Two things.

First: it needs to be made clearer why anyone would use this rather than
Nacl/Sodium.

Second: this statement does not inspire confidence:

"The current code is fairly rigorously tested. Every primitive match their
respective test vectors, and most constructions are tested for consistency.
The entire test suite is Valgrind clean."

If your primitives don't match their test vectors, your tool doesn't work at
all, so pretty much every crypto tool, including MiLiTARY-GRADE-CRYPTO v20
GOLD on the Android store or whatever, achieves this level of rigor.

Here's a good Reddit comment on the last crypto thing the author wrote, a few
weeks ago:

[https://www.reddit.com/r/programming/comments/5iv1ti/rolling...](https://www.reddit.com/r/programming/comments/5iv1ti/rolling_your_own_crypto/dbbmwwo/)

~~~
Cyph0n
Instead of digging through OP's Reddit history, it would have been much more
useful for everyone if you lent OP your trained eyes and had a look through
his project's codebase (which is quite small). Screaming "NO!" at every
attempt to implement crypto primitives is not a constructive form of
criticism.

Regarding the Reddit thread, I actually enjoyed a response much more than the
comment you linked:

[https://www.reddit.com/r/programming/comments/5iv1ti/rolling...](https://www.reddit.com/r/programming/comments/5iv1ti/rolling_your_own_crypto/dbg58iy/)

~~~
lvh
Why is it necessary to volunteer expert time every time someone publishes a
new piece of code that by their own admission is not a significant improvement
over TweetNaCl? Time is limited; if something doesn't do a good job arguing
why you need it, pointing people to things that are packaged, updated, have
been consistently maintained and reviewed is _solid engineering advice_.

~~~
loup-vaillant
> _someone publishes a new piece of code that by their own admission is not a
> significant improvement over TweetNaCl?_

That one is a tough call for me. If I brag too much, I can be dangerous. Too
little, and nobody uses my stuff. I'm not sure about the significance of the
improvements, but they are undeniable: except for curve25519, everything is
more readable, faster, and a bit more usable than TweetNaCl (curve25519 is
merely more usable).

There is also one significant contribution I did not brag too much about: my
Argon2i implementation is _much_ simpler than the reference implementation.
I'm considering adding support for multiple lanes (maybe with threads, maybe
not), and Argon2d. The result should be RFC worthy.

Was re-implementing Argon2i from scratch good engineering practice? I don't
know. What I do know is that I ended up uncovering a long standing bug in the
reference implementation; one that complicates implementations _and_ hurts
security a little bit.

I think Monocypher was worth the effort, even if just for Argon2.

~~~
lvh
The Argon2 improvements are interesting and I'd love to hear about them.

Re: usability of other core primitives: I'll bite. Places where you don't get
to solve a hard coordination problem to agree on a nonce matter. I know you
document that you can't use a random nonce (and that's accurate), but I think
that is an important safety regression.

~~~
loup-vaillant
> _I know you document that you can 't use a random nonce_

Looking back at my manual, I understand it was easy to miss. But you _can_.
You just have to use XChacha20, which gives you a 192-bit nonce, where the
chance of collision even after 2⁶⁴ messages is negligible. The high level
constructions all use XChacha20, so random nonces are possible.
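
To put a number on "negligible", here is a minimal sketch of the usual birthday-bound arithmetic. The function name is made up for illustration and is not part of Monocypher; it only evaluates the bound P(collision) < n^2 / 2^(bits+1) for n random nonces.

```c
/* Birthday (union) bound on random-nonce collisions, as a power of two.
 * For n = 2^log2_messages nonces drawn uniformly from 2^nonce_bits values:
 *   P(collision) <= n*(n-1)/2 / 2^nonce_bits < 2^(2*log2_messages - nonce_bits - 1)
 * Returns the exponent e such that P(collision) < 2^e.
 * Hypothetical helper, for illustration only. */
int nonce_collision_log2_bound(int log2_messages, int nonce_bits)
{
    return 2 * log2_messages - nonce_bits - 1;
}

/* nonce_collision_log2_bound(64, 192) returns -65: 2^64 messages with a
 * 192-bit nonce collide with probability below 2^-65.
 * nonce_collision_log2_bound(32, 64) returns -1: with a 64-bit nonce, as in
 * original ChaCha20, the bound is already 1/2 after 2^32 messages. */
```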

I really should re-arrange the order of presentation.

The slightly better usability mainly referred to the fact that functions that
cannot fail return void. It's small, but the user is not left wondering
whether the return value should be checked, or why the manual doesn't say
anything about it. I tried to make the ordering of arguments consistent across
the whole library. Easier interfaces have shorter names (no "easy" suffix).
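
A minimal sketch of that convention, with hypothetical function names rather than Monocypher's actual API: an operation that cannot fail returns void, and only an operation with a real failure case returns a status.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: XOR a keystream into a buffer. Nothing here can
 * fail, so it returns void and the caller has no status to check. */
void xor_stream(uint8_t *out, const uint8_t *in,
                const uint8_t *stream, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        out[i] = (uint8_t)(in[i] ^ stream[i]);
    }
}

/* Hypothetical helper: compare two 16-byte tags. This one has a genuine
 * failure case, so it returns 0 when the tags match and -1 otherwise.
 * The loop always reads all 16 bytes and has no data-dependent branch. */
int verify_tag16(const uint8_t a[16], const uint8_t b[16])
{
    unsigned diff = 0;
    for (size_t i = 0; i < 16; i++) {
        diff |= (unsigned)(a[i] ^ b[i]);
    }
    /* diff == 0: diff - 1 wraps around, bit 8 is set, result is 1 - 1 = 0.
     * diff in 1..255: diff - 1 is below 256, bit 8 is clear, result is -1. */
    return (int)(1 & ((diff - 1) >> 8)) - 1;
}
```

With this split, a return type of int is itself the signal that the result must be checked.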

Then there are opinions. Primitive functions are named after the primitive,
because I don't believe in hiding those from the user. Fixed sizes are hard
coded, because I don't expect them to change without changing the primitive
itself. Constants like CRYPTO_ENCRYPTION_KEY_SIZE obscure the point somewhat.

------
lvh
Saved you a click: it's many of the same ideas as libsodium, but mostly with
ChaCha and allegedly easier to deploy. I'm not sure I believe the latter, and
my recommendation to use libsodium stands. ChaCha20 is also available (but
only in AEAD mode with Poly1305) in libsodium, if you really need it. I think
XSalsa20 is a reasonable default: if you had a protocol where there's an
obvious sequence number to use as a nonce, you probably should already be
using TLS anyway; most of the places where libsodium is your best option are
precisely the cases where you don't have the luxury of solving that sort of
coordination problem, and you want a random nonce (and hence, extended-nonce
versions like XSalsa20 or XChaCha20. I don't know of anyone who has deployed
XChaCha20, although the construction is "obvious".)

I've previously criticized a blog by the same author called Rolling Your Own
Crypto, which had serious flaws.

AFAICT, this is ostensibly cryptographically fine (modulo some safeguards that
probably only matter if you're Doing It Wrong already anyway), even though the
final recommendation is still "avoid". There are many things I disagree with
(don't put the documentation for the internal hash for ChaCha20 before the
authenticated encryption API!), and using TweetNaCl as a base is less
interesting than using libsodium which is both fast and has a lot more
eyeballs on it.

~~~
loup-vaillant
> _allegedly easier to deploy._

One .c file, one .h file. No fancy compiler option. It doesn't get any easier.
I agree with most of the rest, including the "avoid" recommendation, since
Monocypher had only one pair of eyeballs so far.

> _There are many things I disagree with_

Please, I need more! (The documentation order was out of laziness, I'll change
it.)

> _using TweetNaCl as a base is less interesting than using libsodium which is
> both fast and has a lot more eyeballs on it._

TweetNaCl is smaller. _Much_ smaller. It doesn't need nearly as many eyeballs
as libsodium to reach the same level of confidence. And I'm not sure libsodium
had as many audits and static analyses as TweetNaCl, if only because its size
may render some analyses impractical.

I tried to use ref10 as a base. It's too much code.

~~~
lvh
I'd disagree with "easier to deploy". Most users are going to do this via FFI,
and then one .c and one .h pales in comparison to "already has a dylib/so" I
can use.

It may be true that it doesn't need as many eyeballs; alternative
implementations of the same code have the benefit that they're pretty easy to
automatedly check. libsodium is not anywhere near the size where static
analyses become impractical -- do you have any evidence that would support
that?

~~~
loup-vaillant
> _Most users are going to do this via FFI_

The C interface is the first step. I may consider C++ and OCaml bindings in
the future. Still, point taken: calling C from other languages is always a bit
of a pain.

> _libsodium is not anywhere near the size where static analyses become
> impractical -- do you have any evidence that would support that?_

It's not much, but the TweetNaCl papers indicate that its size made it
possible to automatically prove the absence of a couple classes of bugs with
an algorithm prone to combinatorial explosion. Even being twice as big could
have been too much for those.

~~~
lvh
I'm not suggesting so much that it is painful in general (that's neither here
nor there): I'm suggesting that existing libraries, specifically libsodium,
make this _easier_ for anyone who isn't writing like C or _maybe_ Rust and
_maybe_ Go. (So, anything on the JVM, Python, Javascript, Ruby, PHP... You
name it.)

For those use cases, the consumed artifact is a .so. Having a .so is easier
than having to make it. Furthermore, having repeatedly built libsodium (I
built FPM debs when libsodium wasn't yet commonly packaged), for those people
building this is just as easy as libsodium if for some reason they need to
build at all.

~~~
loup-vaillant
> _Having a .so is easier than having to make it._

What's the point, a compilation unit is only a gcc call away from being an
.so, right? Besides, I made sure I don't expose unnecessary symbols to the
linker. What you're telling me amounts to saying binary distributions are
easier to consume than source distributions.

I would be quite bewildered by such a claim. As well as a little upset at my
disadvantage: I do not have the resources to package my library for all major
OSes and distributions. This of course has nothing to do with the library
itself, merely the community around it (with one author and no user, as is the
case with anything new).

I've seen new programming languages being criticised for the immaturity of
their tooling or the lack of community. This feels similar.

\---

Or, your idea of an .so is different from mine. Do .so libraries have
something the C source files don't?

------
jedahan
The manual is absolutely amazing. It shows proper and improper use, and
explains how and _why_ to the lay person.

[http://loup-vaillant.fr/projects/monocypher/manual](http://loup-vaillant.fr/projects/monocypher/manual)

Example from crypto_chacha20_init (grammar aside):

"Warning: don't use the same nonce and key twice. You'd expose the XOR of
subsequent encrypted messages, and destroying confidentiality."

------
FabHK
I'm torn about this.

Brought to you by the person that wrote "Rolling your own crypto" (see earlier
discussion linked below), with the motto "Impossible? Like that would stop
me."

Some fields of knowledge in certain phases benefit from iconoclasm and Sturm &
Drang, but is cryptography one of them? On the other hand, many existing
implementations are so crufty that a fresh wind might help (if only by pushing
more experienced cryptographers to put out better implementations).

At any rate, at least the writer followed through, though I'd not be inclined
to use this in production anytime soon.

[http://loup-vaillant.fr/articles/rolling-your-own-crypto](http://loup-vaillant.fr/articles/rolling-your-own-crypto)

[https://news.ycombinator.com/item?id=13221923](https://news.ycombinator.com/item?id=13221923)

~~~
lvh
Right there with you. A question that helped answer this for me personally:
does this library need to exist given that libsodium already exists?

The answer seems "no" to me.

~~~
loup-vaillant
The very existence of TweetNaCl indicates that DJB would have answered "yes".
And I do have a couple minor advantages over TweetNaCl…

~~~
lvh
TweetNaCl was created approximately simultaneously (2013) with libsodium.

~~~
loup-vaillant
Hmm, that explains a number of things. Still, I've dug through libsodium, and
it's still a bit of a mess —not that I would've done better given the
constraints they had.

------
dchest
I like such tiny projects: you can drop .c and .h into your own project and
have something useful ready-to-use. Thanks for sharing! For those asking, this
is what distinguishes it from libsodium. Compared to NaCl or TweetNaCl, it has
password hashing (Argon2), although, to be fair, I wouldn't recommend using
a non-optimized implementation of it, like in this library, for serious stuff.

Some notes:

\- Let's call Ed25519 with BLAKE2b something like "Ed25519-BLAKE2b" to avoid
confusion.

\- crypto_chacha20_random is a confusing name, I'd prefer
crypto_chacha20_stream

\- Everyone uses "crypto_" for namespacing their libraries nowadays! Let's use
something else to avoid collisions :)

~~~
loup-vaillant
\- Naming Ed25519 is not obvious: I do provide SHA-512 as an option.

\- I'll consider crypto_chacha20_stream

\- I'll stick with "crypto_", for 2 reasons. First, you only need one crypto
library. Second, a brutal search & replace will let you use another namespace
in 30 seconds. Maybe I should provide a script to do that?

~~~
e12e
> I do provide SHA-512 as an option.

Why? If the goal is compact, easier to deploy - why have choice at all?

[ed: I just saw this on the homepage:

Note the presence of SHA-512 in a separate compilation unit. It is meant to
test ed25519 with the test vectors I got from the RFC. In production, most
users will be expected to use Blake2b —the default setting.]

I also wonder a bit on the naming - using algorithm over intent (sha512 vs
"hash") - I can see why transparent naming is good - on the other hand,
wouldn't the _easiest_ approach be: use some sort of framing (monochypher_v1)
and pair that with primitives that "do the right thing"?

I suppose one could use monochypher as a provider and another "interface"
library as a consumer - might make for easier auditing and development.

But now you need two header/c files, and they have to stay in sync in order to
be reviewed in pairs (to avoid subtle bugs in key handling and such?).

Which brings us back to: in five years, when a developer needs to update their
simple utility that uses monochypher - will that developer have to consider
changing cryptographic primitives, moving users from one generation of ciphers
to another - or will monochypher handle that?

~~~
loup-vaillant
_(I spelled it "Monocypher", without the first "h".)_

I need SHA-512 for tests. The only test vectors I could find were for the
official ed25519, with SHA-512. The option is there anyway, I might as well
provide it.

I do transparent naming because the choice of primitive has user visible
effects. If I have the SHA-512 of some file, I can't just use Blake2b to check
that hash. Same for encryption and authentication: changing them is
transparent only for transient messages. Encrypted files won't convert
themselves to the new encryption scheme, and neither will the authentication
tags. Same thing for long term secret/public key pairs.

That said, the high level constructions don't mention the primitives. They're
all called "crypto_lock" or something.

Long term, changes are not expected to be transparent. The only backward
compatible path is to stick to the older version of Monocypher. Any transition
will need to be handled explicitly. I think that's okay, because I don't think
I'll need to make breaking changes often. I expect a stable version to last at
least 5 years, possibly 15. Then the next version will break everything, and
users will just have to cope, most probably by supporting both v1.0 and v2.0
for a time. Hopefully that won't be too hard, given the small size of the
library.

------
ktta
If you don't mind me asking, how much time did it take approximately for you
to write the whole library, excluding the online documentation and other stuff
not directly related to the code? (2 hours each day for a couple months, 18
man hours total, etc.)

~~~
loup-vaillant
I think about 9 weeks, full time. (I'm currently on sick leave.)

Documentation took me relatively little time. Probably a week if I count my
previous blog posts on the subject.

I've had some false starts and gaps in my knowledge. I think that slowed me
down by a factor of 2.

If I lost all data, I think I can re-do it in a week and a half.

~~~
ktta
Thanks for the reply.

I think we all will appreciate a follow up blog post on your experiences when
you were writing it and also post-'Show HN'

------
CiPHPerCoder
I don't see anything in the code that would stop someone from using 0 as a
public key for a x25519 handshake.

~~~
dchest
To add more:

Not only 0, but also some other keys listed here
[https://cr.yp.to/ecdh.html](https://cr.yp.to/ecdh.html) which produce an
all-zero shared key.

DJB argues you shouldn't validate public keys except for "some unusual non-
Diffie-Hellman elliptic-curve protocols that need to ensure ``contributory''
behavior". So NaCl/TweetNaCl also don't do anything about it. Libsodium, on
the other hand, returns an indicator when the shared key is all-zero.

One recent example where it was considered a ("low") vulnerability is in the
recent Wire audit:
[https://www.x41-dsec.de/reports/Kudelski-X41-Wire-Report-pha...](https://www.x41-dsec.de/reports/Kudelski-X41-Wire-Report-phase1-20170208.pdf)

 _Therefore, if Bob sends all-zero ratchet public keys, subsequent message
keys will only depend on the root key and not on Alice’s ephemeral private
keys. A dishonest peer may therefore keep sending degenerate keys in order to
reduce break-in recovery guarantees, or force all sessions initiated by a
third party to use a same message key, while a network attacker may force the
first message keys to a public constant value, for example._

~~~
loup-vaillant
Crap, this is more severe than I anticipated. I'll think about it, and
probably change the code and API.

That said, I don't understand why DJB says this is unnecessary for Diffie-
Hellman? I mean, an all-zero shared secret sounds pretty _bad_ , doesn't it?

Or is it because DJB assumes that if one party decides to publish the
communication, it can do so anyway?

~~~
lvh
Not all AKEs require contributory behavior to guarantee their security
properties, but many do. Your suggestion is pretty close to why; if you're
using plain scalarmult as opposed to an AKE with fancier properties (say, 3DH,
HMQV...) in the context of e.g. box, it's the sender that picks public keys
anyway, so they don't care.

In the context of most transport encryption, you presumably do want to prevent
those keys from being selected by either party. That means either filtering
those keys, or, perhaps better, using a fancier AKE like the 3DH in
Signal/Noise or HMQV or something.
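
The "filtering" option can be sketched as a check on the shared secret after the scalar multiplication. This is an illustrative helper with a made-up name, not code from Monocypher, NaCl, or libsodium; it assumes a 32-byte shared secret and rejects the all-zero value without branching on secret data.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 if the 32-byte shared secret is usable, -1 if it is all-zero
 * (which is what the degenerate public keys discussed above produce).
 * Hypothetical helper, for illustration only. */
int check_shared_secret(const uint8_t shared[32])
{
    unsigned acc = 0;
    /* OR every byte together: acc stays 0 only if all 32 bytes are 0. */
    for (size_t i = 0; i < 32; i++) {
        acc |= shared[i];
    }
    /* acc == 0: acc - 1 wraps around, bit 8 is set, so the result is -1.
     * acc in 1..255: acc - 1 is below 256, bit 8 is clear, result is 0. */
    return -(int)(1 & ((acc - 1) >> 8));
}
```

A caller would abort the handshake when this returns -1 instead of deriving keys from the all-zero value.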

------
jedisct1
In the same vein: libhydrogen [1].

Both the API and the underlying primitives are still moving targets, but one
of the motivations behind this project was to revisit the
NaCl/LibSodium/SUPERCOP API to build something more difficult to misuse.

In particular, users never provide nonces. Contexts are mandatory everywhere.
A broken PRG is not always catastrophic. Operations requiring keys have a
dedicated keygen() function (eventually, keys may have their own type instead
of generic byte buffers). There's no crypto_stream_* API, but a
randombytes_buf_deterministic() function.

libhydrogen is not meant to replace libsodium, but the libsodium2 API may
resemble it more than the current one does.

[1]
[https://github.com/jedisct1/libhydrogen](https://github.com/jedisct1/libhydrogen)

------
VMG
Why this instead of libsodium?

~~~
loup-vaillant
Libsodium is _big_. Smaller than NaCl, but still a little unwieldy. A bigger
contender is TweetNaCl, but it doesn't provide password key derivation, and I
dislike some of its primitives a tiny little bit.

My library has two clear (albeit small) advantages: first, its interfaces are
more consistent with its (non-existent) failure mode. NaCl and descendants
have most of their functions return an error code for no reason: the underlying
primitives are foolproof.

Second, my Argon2i implementation doesn't allocate. Okay, the _user_ must
allocate, but that gives more control, and lets me guarantee I cannot fail. For
some reason, the reference implementation is not like that. Surprising,
considering the Argon2 authors made sure implementers could avoid spurious
allocations.
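
The calling convention can be sketched with a toy stand-in. Everything below is hypothetical: the function is not Argon2i, is not secure, and does not reproduce Monocypher's signature. It only shows how a caller-supplied work area removes the allocation, and with it the failure path.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK_WORDS 128 /* 128 * 8 bytes = 1 KiB per block (illustrative) */

/* Toy stand-in for a memory-hard function. NOT Argon2i and NOT secure.
 * The caller owns work_area, which must hold nb_blocks * TOY_BLOCK_WORDS
 * 64-bit words. The function never allocates, so it cannot fail and
 * returns void. */
void toy_memory_hard(uint64_t *digest, uint64_t *work_area,
                     size_t nb_blocks, uint64_t seed)
{
    size_t   nb_words = nb_blocks * TOY_BLOCK_WORDS;
    uint64_t x        = seed;
    uint64_t acc      = 0;

    /* Fill the caller's buffer with a simple 64-bit LCG sequence. */
    for (size_t i = 0; i < nb_words; i++) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
        work_area[i] = x;
    }
    /* Fold the whole buffer back into a single word. */
    for (size_t i = 0; i < nb_words; i++) {
        acc ^= work_area[i];
    }
    *digest = acc;
}
```

The caller decides where the memory comes from (a static array, the stack, or its own malloc with its own error handling), which is the extra control mentioned above.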

\---

My current recommendation still goes to TweetNaCl and Libsodium. My hope is to
be able to recommend Monocypher instead. Maybe sometime later this year?

~~~
VMG
Thanks! Considering that this seems to be a popular question, maybe it makes
sense to state this more clearly on the site.

------
sporkenfang
I am curious as to why you've posted this as a tarball on a private website
instead of to Github, Bitbucket, etc.

~~~
loup-vaillant
My web site, my domain name, my home, my freedom. I avoid centralised services
whenever I can. Besides, my git history is not very interesting, so I didn't
feel the need to publish it.

~~~
laurent123456
Wouldn't it make sense to share this over https though?

~~~
cclemmons
I agree, https to serve software (libraries included) is a must, even if the
source is open. (That's not to get into the debate on whether or not to use
https for everything, but I am a proponent of that as well).

Since he's running his own server, it should be easy enough to add a
certificate from [https://letsencrypt.org/](https://letsencrypt.org/)

~~~
loup-vaillant
It's on my TODO list…

