
A quick look at the source code shows the generate_key() function [0] to be insecure. It generates 32 random bytes (good, that's what you need for an AES-256 key), but then it uses those random bytes to sample from an alphabet of only 62 characters. This significantly reduces the security of the key, from 256 bits of entropy to ~190 bits (log2(62^32)). And that would be in the best case, if it were sampling uniformly from the distribution - it is not.

I recommend reading Section 9.7 of Cryptography Engineering [1] to understand why choosing random elements from a set is harder than it seems. A good example of a similar mistake is the nasty bug in Cryptocat's PRNG from 2013 [2].
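To illustrate (a sketch, not your actual code): one standard way to pick characters uniformly is rejection sampling, discarding byte values that would introduce bias:

    // Sketch: uniform sampling from a 62-character alphabet via rejection
    // sampling. Illustrative only.
    const ALPHABET =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    function randomString(length) {
      // 62 doesn't divide 256, so `byte % 62` alone is biased. Reject byte
      // values >= 248 (the largest multiple of 62 that fits in a byte) and
      // draw fresh bytes instead.
      const limit = 256 - (256 % ALPHABET.length); // 248
      const out = [];
      while (out.length < length) {
        const bytes = crypto.getRandomValues(new Uint8Array(length - out.length));
        for (const b of bytes) {
          if (b < limit) out.push(ALPHABET[b % ALPHABET.length]);
        }
      }
      return out.join("");
    }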

I assume this step was done so the AES key could be included in the URL fragment, since raw random bytes may not be URL-safe. I recommend feeding the random bytes of the key directly into the underlying cryptographic functions, and applying a URL-safe encoding at a higher level when necessary.
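For example, something along these lines (illustrative names, not your actual functions) keeps the full 256 bits and only encodes the key at the URL layer:

    // Generate the full 256-bit key and only encode it for the URL fragment.
    function generateKeyBytes() {
      return crypto.getRandomValues(new Uint8Array(32)); // 32 bytes = 256 bits
    }

    // base64url is fragment-safe and loses no entropy (43 chars for 32 bytes)
    function toBase64Url(bytes) {
      return btoa(String.fromCharCode(...bytes))
        .replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
    }

    function fromBase64Url(str) {
      const b64 = str.replace(/-/g, "+").replace(/_/g, "/");
      return Uint8Array.from(atob(b64), c => c.charCodeAt(0));
    }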

Also, it appears you are using AES [3], a block cipher, but I cannot figure out what block cipher mode you are using. I'll have to dig into the CryptoJS code a little more to see what it defaults to, but I have a sinking feeling that it's ECB, which is completely insecure. Dan Boneh's Crypto I course on Coursera is a good way to learn the basics of block cipher modes.
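Whatever the default turns out to be, I'd make the mode explicit rather than relying on it. Assuming the standard CryptoJS API, something like this (a sketch, not your actual code):

    // Make the block cipher mode explicit instead of relying on the default.
    // Key/IV handling is simplified for illustration.
    const plaintext = "hello world";
    const key = CryptoJS.lib.WordArray.random(32); // 256-bit key
    const iv  = CryptoJS.lib.WordArray.random(16); // fresh IV per message

    const encrypted = CryptoJS.AES.encrypt(plaintext, key, {
      iv: iv,
      mode: CryptoJS.mode.CBC,     // never CryptoJS.mode.ECB
      padding: CryptoJS.pad.Pkcs7,
    });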

[0]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...
[1]: https://www.amazon.com/Cryptography-Engineering-Principles-P...
[2]: https://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pse...
[3]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...




I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post [0], and in fact I'd be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but I'm interested in being proven wrong.

It's using CBC mode.

[0] http://incoherency.co.uk/blog/stories/hardbin.html

EDIT:

> And that would be in the best case, if it were sampling uniformly from the distribution - it is not.

Can you please point out how it's not? It's intended to sample uniformly. It would be non-uniform if it were "randombytes[i] % alphabet.length".

EDIT2:

I see now how it's non-uniform. The 256 possible values of each byte in randombytes don't map evenly onto the 62 values in alphabet. I will fix this tonight, thanks for pointing it out.
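(To spell out the bias for anyone following along: any fixed map from 256 byte values onto 62 characters has to hit some characters more often than others. Taking modulo as an example, even though that's not what the code does:)

    // Any fixed map from 256 byte values onto 62 characters is uneven;
    // e.g. with modulo, the first 256 % 62 = 8 characters are hit 5 times
    // each and the rest only 4 times.
    const counts = new Array(62).fill(0);
    for (let b = 0; b < 256; b++) counts[b % 62]++;
    console.log(Math.max(...counts), Math.min(...counts)); // 5 4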


> I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post [0], and in fact I'd be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but I'm interested in being proven wrong.

I understand that you're trying to balance the tradeoff between security and usability here, which is tricky. If quantum computers are part of your threat model, remember that Grover's algorithm provides a quadratic speedup for brute-forcing a symmetric key, so 2^190 would become 2^95 against a quantum adversary. Personally I prefer the margin of safety provided by using a full-strength 256-bit key :)

> It's using CBC mode.

Phew! That would've been truly catastrophic.


CBC mode isn't exactly a saving grace here, since it's unauthenticated.


The code and data are shipped together out of IPFS. If you don't trust the data, you don't trust the code anyway, so it makes no difference whether the data is authenticated.


First, read this: https://tonyarcieri.com/all-the-crypto-code-youve-ever-writt...

Second, what is the threat model where you trust IPFS but still need to encrypt client-side? Unauthenticated CBC mode totally defeats the point of encryption, but encryption totally defeats the point of trusting IPFS.

Why not just-- crazy idea!-- use authenticated encryption even if you trust IPFS?
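To make that concrete, here's roughly what it could look like with the browser's built-in Web Crypto API (AES-GCM). A sketch with illustrative names, not hardbin's actual code:

    // Authenticated encryption with Web Crypto (AES-GCM). Decryption throws
    // if the ciphertext or tag has been tampered with.
    async function encrypt(keyBytes, plaintext) {
      const key = await crypto.subtle.importKey(
        "raw", keyBytes, { name: "AES-GCM" }, false, ["encrypt"]);
      const iv = crypto.getRandomValues(new Uint8Array(12)); // 96-bit nonce
      const ciphertext = await crypto.subtle.encrypt(
        { name: "AES-GCM", iv },
        key,
        new TextEncoder().encode(plaintext));
      return { iv, ciphertext: new Uint8Array(ciphertext) };
    }

    async function decrypt(keyBytes, iv, ciphertext) {
      const key = await crypto.subtle.importKey(
        "raw", keyBytes, { name: "AES-GCM" }, false, ["decrypt"]);
      const plaintext = await crypto.subtle.decrypt(
        { name: "AES-GCM", iv }, key, ciphertext);
      return new TextDecoder().decode(plaintext);
    }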


I don't think you understand IPFS.

If you trust your IPFS node, you know that you're retrieving the correct content. You still don't want others to be able to read it.

EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.

Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.


A lot of crypto engineering problems go out the window if you just trust the messages you're receiving. That doesn't mean it's okay to use unauthenticated encryption in 2017.

> EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.

> Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.

http://www.cryptofails.com/post/121201011592/reasoning-by-le...


Your HN profile describes you as a Cryptography Engineer, and I'm sure that's true.

You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?

You're just linking me to things other people have written instead of presenting a persuasive argument. So I'm not going to bother doing what you say. Sorry.


If you call your project "the most secure X" and you're doing something related to X in a very sub-optimal way (i.e. not using AEAD modes), what happens if someone uses your code as a reference point for writing their own crypto for a protocol that doesn't use IPFS? What if, instead of a specific instance of copying code out of context, people learn bad habits from your code and then defend their choices until they're blue in the face because they copied it off of "the most secure" reference implementation?

A lot of problems, many unforeseeable, are easily prevented by using authenticated encryption.

> You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?

Yes, and this has been covered to death to the point that I'm sick of hearing it: https://www.nccgroup.trust/us/about-us/newsroom-and-events/b...

One could easily build a Chrome extension that still uses your IPFS gateway for shipping ciphertext. Now the code is at rest (so its integrity no longer depends on the transport layer), but the data is still reliant on IPFS.

Let's do a thought experiment.

First, move your code to a Chrome extension (which makes the code immutable), then give the attacker an enormous amount of power. For a moment, assume that an attacker can totally defeat IPFS. It doesn't matter how they do it. Is your application-layer cryptography protocol still secure?


> First, [completely change your application and its core security assumptions]. Is your application-layer cryptography protocol still secure?

No.


If your application-layer cryptography protocol is not secure in isolation, what argument do you have against making it secure in isolation?


I have none. I'd merge a sensible pull request.

But characterising it as a huge security flaw is disingenuous. It's neither here nor there.


I'm characterizing it as a protocol/design flaw in something that bills itself as the most secure X, sure, but I haven't done anything to describe it as "huge".

Are you being needlessly defensive?


You mean truly catastrophic if it were codebook mode?


> to understand why choosing random elements from a set is harder than it seems

There is no sampling-from-a-set problem involved here: just convert base-256 to base-62; there is a correct way to do this.
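For example (a sketch that treats the 32 key bytes as one big integer; not anyone's actual code):

    // Treat the 32 key bytes as one 256-bit integer and write it in base 62.
    const ALPHABET =
      "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    function bytesToBase62(bytes) {
      let n = 0n;
      for (const b of bytes) n = (n << 8n) | BigInt(b);
      let out = "";
      while (n > 0n) {
        out = ALPHABET[Number(n % 62n)] + out;
        n /= 62n;
      }
      // pad so leading zero bytes aren't lost: 43 base62 digits cover 256 bits
      return out.padStart(43, ALPHABET[0]);
    }

    function base62ToBytes(str, byteLength = 32) {
      let n = 0n;
      for (const c of str) n = n * 62n + BigInt(ALPHABET.indexOf(c));
      const bytes = new Uint8Array(byteLength);
      for (let i = byteLength - 1; i >= 0; i--) {
        bytes[i] = Number(n & 0xffn);
        n >>= 8n;
      }
      return bytes;
    }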



