It is described as "the world's most secure" because both the code and the data are served from IPFS. IPFS is a content-addressable storage system, so as long as you access it over an IPFS gateway you trust (running one locally is the best way), you know the code and data haven't been meddled with.
It degrades to the same security model as ordinary encrypted pastebins when accessed over a gateway you don't trust (e.g. the hardbin.com public gateway).
This article (was on HN a couple of weeks ago) gives a good overview of IPFS: https://ipfs.io/blog/24-uncensorable-wikipedia/
And you can learn how to set up a local IPFS node here (it's super duper easy - copy binary into /usr/local/bin, ???, profit): https://ipfs.io/docs/getting-started/
So, please, don't knock it before you understand it :)
EDIT: Although I did try to explain it thoroughly in the About section, I do take responsibility for not making it easy enough to understand.
IPFS is pretty new technology, and presents a very different set of assumptions compared to traditional web apps.
If you didn't understand it at first but you do understand it now, please let me know what the key piece of information was that made it "click" so that I can put more emphasis on that next time.
I recommend reading Section 9.7 of Cryptography Engineering to understand why choosing random elements from a set is harder than it seems. A good example of a similar bug is the nasty bug in Cryptocat's PRNG from 2013.
I assume this step was done so the AES key could be included in the URL fragment, since a set of random bytes may not be url safe. I recommend feeding the random bytes of the key directly into the underlying cryptographic functions, and using a urlsafe encoding at a higher level when necessary.
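To make that suggestion concrete, here is a minimal sketch in Python (not Hardbin's JavaScript). The function names are hypothetical, and unpadded URL-safe base64 is just one possible choice of URL-safe encoding. The raw bytes are what the cipher would consume; the encoded string exists only for the URL fragment.

```python
import base64
import secrets


def generate_key(num_bytes: int = 32) -> bytes:
    """Return raw key bytes straight from the OS CSPRNG (32 bytes = 256 bits)."""
    return secrets.token_bytes(num_bytes)


def key_to_fragment(key: bytes) -> str:
    """Encode raw key bytes as unpadded URL-safe base64 for use in a URL fragment."""
    return base64.urlsafe_b64encode(key).rstrip(b"=").decode("ascii")


def fragment_to_key(fragment: str) -> bytes:
    """Reverse key_to_fragment, restoring the stripped '=' padding first."""
    padding = "=" * (-len(fragment) % 4)
    return base64.urlsafe_b64decode(fragment + padding)


key = generate_key()              # feed these bytes to the cipher directly
fragment = key_to_fragment(key)   # only this form ever appears in the URL
assert fragment_to_key(fragment) == key
```

The encoding is a plain bijection on the key bytes, so it neither adds nor removes entropy.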
Also, it appears you are using AES, a block cipher, but I cannot figure out what block cipher mode you are using. I'll have to dig into the CryptoJS code a little more to see what it defaults to, but I have a sinking feeling that it's ECB, which is completely insecure. Dan Boneh's Crypto I course on Coursera is a good way to learn the basics of block cipher modes.
It's using CBC mode.
> And that would be in the best case, if it were sampling uniformly from the distribution - it is not.
Can you please point out how it's not? It's intended to sample uniformly. It would be non-uniform if it were "randombytes[i] % alphabet.length".
I see now how it's non-uniform. 256 values in randombytes doesn't map 1:1 onto 62 values in alphabet. I will fix this tonight, thanks for pointing it out.
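For anyone following along, a small Python sketch of why this happens and one standard fix. The thread doesn't show the actual Hardbin mapping, so the two mappings below (modulo and scaling) are only illustrative assumptions, and the function names are made up. Since 256 = 4 * 62 + 8, any fixed mapping of one byte onto 62 symbols has to favour some symbols.

```python
import secrets
import string
from collections import Counter

ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def modulo_counts() -> Counter:
    """How many of the 256 byte values land on each index under `byte % 62`."""
    return Counter(b % len(ALPHABET) for b in range(256))


def scaling_counts() -> Counter:
    """Same count for the scaling map `floor(byte * 62 / 256)`."""
    return Counter(b * len(ALPHABET) // 256 for b in range(256))


def uniform_char() -> str:
    """Rejection sampling: discard bytes >= 248 so every symbol has exactly 4 preimages."""
    limit = 256 - (256 % len(ALPHABET))  # 248, the largest multiple of 62 below 256
    while True:
        b = secrets.token_bytes(1)[0]
        if b < limit:
            return ALPHABET[b % len(ALPHABET)]


def uniform_key(length: int = 32) -> str:
    """Build a key string from independently, uniformly chosen symbols."""
    return "".join(uniform_char() for _ in range(length))


# Under both mappings, some symbols get 5 preimages and the rest get 4.
print(sorted(set(modulo_counts().values())), sorted(set(scaling_counts().values())))
print(uniform_key())
```

In Python specifically, `secrets.choice(ALPHABET)` already samples uniformly, so the hand-rolled loop is only there to show the technique.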
I understand that you're trying to balance the tradeoff between security and usability here, which is tricky. If quantum computers are part of your threat model, remember that Grover's algorithm provides a quadratic speedup for brute-forcing a symmetric key, so 2^190 would become 2^95 against a quantum adversary. Personally I prefer the margin of safety provided by using a full-strength 256-bit key :)
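The arithmetic behind those numbers, as a rough sketch. The thread doesn't say how long the key string is; a length of 32 symbols from a 62-symbol alphabet is my assumption here because it gives roughly the 2^190 figure quoted above. The function names are hypothetical.

```python
import math


def key_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of `length` symbols drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


def grover_bits(classical_bits: float) -> float:
    """Effective bits against Grover's quadratic speedup: half the exponent."""
    return classical_bits / 2


alnum_bits = key_bits(62, 32)    # assumed: 32 alphanumeric symbols
raw_bits = key_bits(256, 32)     # 32 raw bytes, i.e. a full 256-bit key

print(f"62^32 key:  {alnum_bits:.1f} bits classical, {grover_bits(alnum_bits):.1f} vs Grover")
print(f"256^32 key: {raw_bits:.1f} bits classical, {grover_bits(raw_bits):.1f} vs Grover")
```

This only counts the size of the key space; it assumes the symbols really are chosen uniformly.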
> It's using CBC mode.
Phew! That would've been truly catastrophic.
Second, what is the threat model where you trust IPFS but still need to encrypt client-side? Unauthenticated CBC mode totally defeats the point of encryption, but encryption totally defeats the point of trusting IPFS.
Why not just-- crazy idea!-- use authenticated encryption even if you trust IPFS?
If you trust your IPFS node, you know that you're retrieving the correct content. You still don't want others to be able to read it.
EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.
Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.
> EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.
> Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.
You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?
You're just linking me to things other people have written instead of presenting a persuasive argument. So I'm not going to bother doing what you say. Sorry.
A lot of problems, many unforeseeable, are easily prevented by using authenticated encryption.
> You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?
Yes, and this has been covered to death to the point that I'm sick of hearing it: https://www.nccgroup.trust/us/about-us/newsroom-and-events/b...
One could easily build a Chrome extension that still uses your IPFS gateway for shipping ciphertext. Now you've got the code at rest (so transport-layer integrity doesn't undermine the code being executed), but the data is still reliant on IPFS.
Let's do a thought experiment.
First, move your code to a Chrome extension (which makes the code immutable), then give the attacker an enormous amount of power. For a moment, assume that an attacker can totally defeat IPFS. It doesn't matter how they do it. Is your application-layer cryptography protocol still secure?
But characterising it as a huge security flaw is disingenuous. It's neither here nor there.
Are you being needlessly defensive?
There is no sampling-from-a-set problem involved: just convert base 256 to base 62, and there is a correct way to do that.
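A minimal sketch of what that conversion could look like, in Python rather than Hardbin's JavaScript, with hypothetical function names and an arbitrary symbol order. It treats the whole key as one big integer and rewrites it in base 62, so the result is a bijection on the key bytes instead of a per-byte mapping.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols, order is arbitrary


def b62_encode(data: bytes) -> str:
    """Interpret the bytes as one big-endian integer and write it in base 62."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits)) or ALPHABET[0]


def b62_decode(text: str, length: int) -> bytes:
    """Reverse b62_encode; `length` restores any leading zero bytes."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n.to_bytes(length, "big")


key = secrets.token_bytes(32)
encoded = b62_encode(key)
assert b62_decode(encoded, 32) == key
```

The encoded string is at most 43 symbols for a 32-byte key, and its length varies slightly because leading zeros are dropped.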
For some reason you got lucky, and all of them are in this thread talking to you, attempting to give advice. Advice that some people pay a LOT of money for, to these specific people.
Most of your responses to them have read as very defensive and argumentative. Out of everything said so far, this is the thing that scares me about possibly using this system.
It's frustrating when people act like I haven't thought through those assumptions and are handing out cargo-cult advice that is not applicable in this application.
The non-uniform key generation is a bug, and I'm glad somebody noticed it.
Had you come on with something reasonable and descriptive like "Hardbin: a security-focused IPFS based pastebin", you might have received a more tempered response (though still, I'm sure, skeptical - as with all things security-oriented).
What could possibly go right?
You don't exactly need clairvoyance to predict this outcome.
If you don't trust that IPFS is providing you with the correct content, then you also can't trust that it's providing you with the correct code.
If you don't trust the IPFS node, it could just ship off your decryption key to a remote server. There is no getting around trusting IPFS to work correctly. And if you trust IPFS to work correctly then there is no benefit to authenticated encryption.
But you can run your own IPFS node locally. Then you're only trusting it as much as you trust your local system. If you don't trust your local system then you can't trust anything.
Also, while I have you, I recommend you use SJCL for generating random bytes in response to this: https://github.com/jes/hardbin/issues/1.
On a meta note, can I ask why you didn't go with something like Tahoe-LAFS if you were looking for a secure, decentralized file store system? I don't immediately know that it would have been better per se, but I'm not quite sure what IPFS provides that you can't get otherwise.
EDIT: I'm not trying to grief you here, but there are three people in this thread already (myself included) who know security very well (professionally so), and I want to point out comments like this one and the GitHub issue that was opened are good-faith attempts at that.
The code and data are served out of the same place. Anybody who can modify the data can modify the code. There is no benefit whatsoever from trying to do any more authentication.
I'll have a look at SJCL, thanks.
I've not played with Tahoe-LAFS, I don't know anything about it.
I fully intend to fix the non-uniform key generation. That is a bug.
I'm yet to hear a convincing argument that the unauthenticated encryption is a problem.
I have two suggestions, then:
1. Implement data integrity and authentication directly in Hardbin, instead of offloading it to IPFS, such that it can be used with both confidentiality and integrity even if assumed-hostile nodes are part of the connection path. To do this I'd recommend using HMAC-SHA256 for authentication, and AES-CBC is probably fine for the encryption. You could combine the authentication and encryption with something like AES-GCM or AES-OCB, but I personally wouldn't do that.
2. Explicitly state upfront and center in the readme that Hardbin is currently abstracting the duty of authentication and integrity to IPFS using that sort of terminology. Now it's clear to me why you're saying that an untrusted IPFS path shouldn't be used, but if you used that sort of language it would be more "formally" expressed cryptographically speaking. The fact that you're mentioning that attackers cannot manipulate files but not explicitly describing how authentication works (e.g. as you do with encryption) is an antipattern.
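As a rough illustration of suggestion 1, here is the MAC half of an encrypt-then-MAC construction using Python's standard library. The AES-CBC step is left out (it is not in the standard library), so `ciphertext` below is a placeholder for whatever the cipher produced. The function names, the 16-byte IV, and the use of a separate MAC key are my assumptions, not anything taken from Hardbin.

```python
import hashlib
import hmac
import secrets


def tag_ciphertext(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA256 over IV || ciphertext (the IV has a fixed length, so this is unambiguous)."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def verify_ciphertext(mac_key: bytes, iv: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; only decrypt if this returns True."""
    expected = tag_ciphertext(mac_key, iv, ciphertext)
    return hmac.compare_digest(expected, tag)


mac_key = secrets.token_bytes(32)        # assumed: independent of the encryption key
iv = secrets.token_bytes(16)             # AES block size
ciphertext = b"placeholder AES-CBC output"

tag = tag_ciphertext(mac_key, iv, ciphertext)
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]

print(verify_ciphertext(mac_key, iv, ciphertext, tag))  # genuine message
print(verify_ciphertext(mac_key, iv, tampered, tag))    # one flipped bit
```

The point of the ordering is that the receiver checks the tag before touching the ciphertext, so a modified message is rejected without ever being decrypted.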
There's a bit of a runaround here - Hardbin is designed to be an "encrypted, secure pastebin", but Pastebin is inherently an antagonistic medium for file authentication, which you'll really need for file integrity. It's designed to be fairly anonymous, which you have to trade off in some way if you want real file integrity.
The difficulty in this is that "encryption" on its own only offers confidentiality, and in modern cryptography that level of assurance is relatively rarely used in complete cryptosystems. It's not necessarily helpful to have confidentiality across a connection without also having integrity. So you can add decentralized components and encryption to it via Hardbin, but (in my opinion), you're not significantly adding a ton of value to it because I'm not really clear what the use case is where you want to securely share files in a decentralized manner, but you also don't really mind if the files are not protected against manipulation.
No, I still think you don't understand.
You only need to trust your local node. If another node on the path modifies the data, your local node will reject it because it won't match the hash.
Content is addressed in IPFS by its content hash. The Hardbin "URLs" are content hashes.
EDIT: I see the confusion! When I said you need to trust the "IPFS path", I was talking about a path like "/ipfs/QmXyE...". Not a connection path. You need to trust the IPFS path (i.e. content hash), and your local node, but nothing else.
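A simplified sketch of the check a local node performs, for anyone unfamiliar with content addressing. Real IPFS addresses are multihash-based CIDs over a chunked Merkle DAG, not a bare hex SHA-256 of the whole file; the plain hash, the function names, and the dictionary standing in for the network are all simplifications of mine.

```python
import hashlib
from typing import Callable


def content_address(data: bytes) -> str:
    """Simplified address: the hex SHA-256 of the content itself."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(address: str, fetch: Callable[[str], bytes]) -> bytes:
    """Fetch from an untrusted source and reject anything that doesn't hash to the address."""
    data = fetch(address)
    if content_address(data) != address:
        raise ValueError("content does not match its address; rejecting")
    return data


paste = b"encrypted paste bytes"
address = content_address(paste)

honest_store = {address: paste}
tampering_store = {address: b"something else entirely"}

print(fetch_verified(address, honest_store.__getitem__))
try:
    fetch_verified(address, tampering_store.__getitem__)
except ValueError as err:
    print("rejected:", err)
```

Because the address is derived from the content, an intermediate node cannot substitute different bytes without the requester noticing, provided the requester does this check itself.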
Hash functions can provide integrity against accidental errors or very simple manipulations related to XORing the message or digest. This might not be useful against an active attacker, but technically speaking it is a (weak) form of integrity, which is why we use checksums.
Message authentication codes (MACs) are required to assure integrity against active attackers because you need to elevate to authentication, and hash functions cannot provide authentication (at least not on their own). Authentication is a stronger notion in security than integrity because it simultaneously guarantees and requires integrity - there is no point in having integrity without authentication, because you'd be assuring data integrity without assuring the data origin, which short circuits the integrity problem for an attacker. Conversely, you can't have authentication without data integrity because an attacker could simply forge the data origin.
The "Authentication" in "Message Authentication Code" really is a term of the past that we now use to mean cryptographic integrity.
Another way to see this: integrity, in the cryptographic sense, is not a security property that hash functions provide.
- "Note that the security benefits of hardbin only apply when accessing it over a local (or otherwise trusted) gateway. If you access it over a gateway that you do not control, then the security model degrades to be equivalent to that of traditional encrypted pastebins."
- "The content will need to be pinned to make sure it stays around for long term (the same as any content stored in IPFS)." (though this doesn't impact the merits of its encryption)
- "You need to make very sure to use a known-good version of the code when creating pastes, as it would be trivial to create a malicious version that looks identical. The best thing to do is write down the hash the first time you use it, and always use the same hash. If you want to upgrade to a new version of the software, you'll need to update your hash."
- "I don't recommend using hardbin for highly critical stuff as the code has not been thoroughly audited by anyone but me."
The last is perhaps the most worrying thing.
I think the point is that it's an extraordinary claim with unextraordinary evidence.
Better, actually-trying-to-be-secure pastebin implementations will encrypt the data with JS before sending it to the server, and later decrypt it with a key that's embedded in the link's URL fragment (which the server never sees).
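To illustrate the URL fragment point: the part after `#` is kept by the client and is not part of the request sent to the server. The sketch below uses Python's `urllib.parse` to split a made-up URL; the host, path, key value, and function names are all hypothetical.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """The path (plus query, if any) that an HTTP client puts in the request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def fragment_key(url: str) -> str:
    """The fragment, which stays on the client side and is where the key lives."""
    return urlsplit(url).fragment


url = "https://paste.example/p/abc123#c2VjcmV0LWtleQ"  # hypothetical paste link

print("sent to server:", request_target(url))
print("kept by client:", fragment_key(url))
```

The page's JavaScript can still read the fragment, which is exactly why the integrity of that JavaScript matters.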
The security evolution here is that both the ciphertext and the code for the webpage to decrypt it are stored on IPFS, which is a content-addressed filesystem.
Thus, so long as you access pastes through a trusted (i.e., local) IPFS node, the hardbin server operator can't insert code on the webpage after-the-fact to exfiltrate the key or plaintext back to them.
This is not an extra-ordinary claim. This is the claim of "I made secure software, and I reckon it's pretty good" wrapped up in the extremely ordinary common English-language hyperbole of calling something 'the best' when there is no way to know if it is, or prove it one way or the other.
That is, unless you consider hyperbole to now be standard, in which case we have to likewise transform the original material being quoted.
Otherwise we're comparing apples to super duper hyper bestest-ever oranges.
Your data might be secure when it is encrypted and never leaves your network.
Plus, the "world's most secure" would probably use HSMs at some point, with real ones at the server and smartcards w/ open standards on the user side. I know a guy working to commercialize stuff like that. Stuff running on a vanilla stack w/ unvetted code and non-tamper-resistant hardware isn't most secure anything.
I assume if you used a third-party Tahoe-LAFS gateway instead of running your own, you would have the same security risks.