Hacker News
Hardbin: secure encrypted pastebin (hardbin.com)
133 points by natejackdev on May 23, 2017 | 69 comments



Hi all, I made this.

The reason it is described as "the world's most secure" is because both the code and the data are served from IPFS. IPFS is a content-addressable storage system, so as long as you access it over an IPFS gateway you trust (running one locally is the best way), you know the code and data haven't been meddled with.

It degrades to the same security model as ordinary encrypted pastebins when accessed over a gateway you don't trust (e.g. the hardbin.com public gateway).

This article (was on HN a couple of weeks ago) gives a good overview of IPFS: https://ipfs.io/blog/24-uncensorable-wikipedia/

And you can learn how to set up a local IPFS node here (it's super duper easy - copy binary into /usr/local/bin, ???, profit): https://ipfs.io/docs/getting-started/

So, please, don't knock it before you understand it :)

EDIT: Although I did try to explain it thoroughly in the About section, I do take responsibility for not making it easy enough to understand.

IPFS is pretty new technology, and presents a very different set of assumptions compared to traditional web apps.

If you didn't understand it at first but you do understand it now, please let me know what the key piece of information was that made it "click" so that I can put more emphasis on that next time.


A quick look at the source code shows the generate_key() function [0] to be insecure. It generates 32 random bytes (good, that's what you need for an AES-256 key), but then it uses those random bytes to sample from a distribution which only has 62 characters. This significantly reduces the security of the key, from 256 bits of entropy to ~190 bits (log2(62^32)). And that would be in the best case, if it were sampling uniformly from the distribution - it is not.
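The ~190-bit figure above can be reproduced with a little arithmetic. This is only a back-of-the-envelope sketch: the alphabet size (62) and key length (32) come from the comment, while the function name is made up for illustration and the calculation assumes perfectly uniform sampling.

```python
import math

def keyspace_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of `length` symbols drawn uniformly from an alphabet."""
    return length * math.log2(alphabet_size)

raw_bits = keyspace_bits(256, 32)    # 32 uniformly random bytes
alnum_bits = keyspace_bits(62, 32)   # 32 characters from a 62-symbol alphabet

print(f"32 random bytes:         {raw_bits:.1f} bits")
print(f"32 chars out of 62:      {alnum_bits:.1f} bits")
```

Any bias in how the characters are picked can only lower the second number.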

I recommend reading Section 9.7 of Cryptography Engineering [1] to understand why choosing random elements from a set is harder than it seems. A good example of a similar bug is the nasty bug in Cryptocat's PRNG from 2013 [2].

I assume this step was done so the AES key could be included in the URL fragment, since a set of random bytes may not be url safe. I recommend feeding the random bytes of the key directly into the underlying cryptographic functions, and using a urlsafe encoding at a higher level when necessary.
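A minimal sketch of that separation, in Python rather than Hardbin's JavaScript: the raw key bytes are what a cipher would consume, and a URL-safe encoding is applied only when the key has to go into a link. The helper names are hypothetical and nothing here is taken from the Hardbin source.

```python
import base64
import secrets

def new_key() -> bytes:
    """Return 32 bytes from the OS CSPRNG (the size of an AES-256 key)."""
    return secrets.token_bytes(32)

def key_to_fragment(key: bytes) -> str:
    """Encode raw key bytes as URL-safe base64 with the '=' padding stripped."""
    return base64.urlsafe_b64encode(key).rstrip(b"=").decode("ascii")

def fragment_to_key(fragment: str) -> bytes:
    """Invert key_to_fragment(), restoring the padding before decoding."""
    padding = "=" * (-len(fragment) % 4)
    return base64.urlsafe_b64decode(fragment + padding)

key = new_key()                   # pass these bytes to the cipher unchanged
fragment = key_to_fragment(key)   # only this form appears in the URL
assert fragment_to_key(fragment) == key
```

The encoded form is 43 characters long and still carries the full 256 bits.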

Also, it appears you are using AES [3], a block cipher, but I cannot figure out what block cipher mode you are using. I'll have to dig into the CryptoJS code a little more to see what it defaults to, but I have a sinking feeling that it's ECB, which is completely insecure. Dan Boneh's Crypto I course on Coursera is a good way to learn the basics of block cipher modes.

[0]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...
[1]: https://www.amazon.com/Cryptography-Engineering-Principles-P...
[2]: https://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pse...
[3]: https://github.com/jes/hardbin/blob/c77c2d7eb93586e0e009ea4b...


I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post[0], and would in fact be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but am interested in being proven wrong.

It's using CBC mode.

[0] http://incoherency.co.uk/blog/stories/hardbin.html

EDIT:

> And that would be in the best case, if it were sampling uniformly from the distribution - it is not.

Can you please point out how it's not? It's intended to sample uniformly. It would be non-uniform if it were "randombytes[i] % alphabet.length".

EDIT2:

I see now how it's non-uniform. 256 values in randombytes doesn't map 1:1 onto 62 values in alphabet. I will fix this tonight, thanks for pointing it out.
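To make the 256-versus-62 mismatch concrete, here is a small sketch. The scaling map is only an illustration of a fixed byte-to-character mapping (it is an assumption, not necessarily what Hardbin did), and rejection sampling is one common fix, not necessarily the one that was later applied.

```python
import secrets
import string
from collections import Counter

ALPHABET = string.digits + string.ascii_letters  # 62 characters

# Any fixed byte -> character mapping is uneven, because 256 = 4 * 62 + 8.
# Count how many of the 256 byte values land on each character.
scaled = Counter(ALPHABET[b * len(ALPHABET) // 256] for b in range(256))
print(sorted(set(scaled.values())))  # some characters get 4 byte values, others 5

def uniform_key(length: int = 32) -> str:
    """Pick characters uniformly by rejecting bytes outside a multiple of 62."""
    limit = 256 - (256 % len(ALPHABET))  # 248, the largest multiple of 62 below 256
    out = []
    while len(out) < length:
        b = secrets.token_bytes(1)[0]
        if b < limit:                    # bytes 248..255 are discarded
            out.append(ALPHABET[b % len(ALPHABET)])
    return "".join(out)

print(uniform_key())
```

With the rejection step, every character corresponds to exactly four accepted byte values, so each is equally likely.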


> I'm aware it's only 190 bits of keyspace. I mentioned this in my blog post[0], and would in fact be more likely to decrease it than increase it, in order to make the URLs shorter. I don't think it's a problem, but am interested in being proven wrong.

I understand that you're trying to balance the tradeoff between security and usability here, which is tricky. If quantum computers are part of your threat model, remember that Grover's algorithm provides a quadratic speedup for brute-forcing a symmetric key, so 2^190 would become 2^95 against a quantum adversary. Personally I prefer the margin of safety provided by using a full-strength 256-bit key :)

> It's using CBC mode.

Phew! That would've been truly catastrophic.


CBC mode isn't exactly a saving grace here, since it's unauthenticated.


The code and data are shipped together out of IPFS. If you don't trust the data, you don't trust the code anyway, so it makes no difference whether the data is authenticated.


First, read this: https://tonyarcieri.com/all-the-crypto-code-youve-ever-writt...

Second, what is the threat model where you trust IPFS but still need to encrypt client-side? Unauthenticated CBC mode totally defeats the point of encryption, but encryption totally defeats the point of trusting IPFS.

Why not just-- crazy idea!-- use authenticated encryption even if you trust IPFS?


I don't think you understand IPFS.

If you trust your IPFS node, you know that you're retrieving the correct content. You still don't want others to be able to read it.

EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.

Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.


A lot of crypto engineering problems go out the window if you just trust the messages you're receiving. That doesn't mean it's okay to use unauthenticated encryption in 2017.

> EDIT: (Since HN won't let me reply to you): There is no mode of operation in which it's safe to use Hardbin without trusting the messages you're receiving, authenticated encryption or not.

> Since the code and the data are both served out of IPFS at the same time, checking the message integrity is pointless. If somebody controlled the IPFS node you're using and wanted to do something malicious, they could more easily just add some code to ship the decryption key off to a remote server than perform an attack on the unauthenticated encryption.

http://www.cryptofails.com/post/121201011592/reasoning-by-le...


Your HN profile describes you as a Cryptography Engineer, and I'm sure that's true.

You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?

You're just linking me to things other people have written instead of presenting a persuasive argument. So I'm not going to bother doing what you say. Sorry.


If you call your project "the most secure X" and you're doing something related to X in a very sub-optimal way (i.e. not using AEAD modes), what happens if someone uses your code as a reference point for writing their own crypto for a protocol that doesn't use IPFS? What if, instead of a specific instance of copying code out of context, people learn bad habits from your code and then defend their choices until they're blue in the face because they copied it off of "the most secure" reference implementation?

A lot of problems, many unforeseeable, are easily prevented by using authenticated encryption.

> You're just not presenting a convincing argument for why it's OK to trust the code but not the data when they're both served out of the same place. Do you understand that they're both served from the same place? If somebody can modify the data, they can just as easily modify the code and make it do whatever they want. Do you understand that?

Yes, and this has been covered to death to the point that I'm sick of hearing it: https://www.nccgroup.trust/us/about-us/newsroom-and-events/b...

One could easily build a Chrome extension that still uses your IPFS gateway for shipping ciphertext. Now you've got the code at rest (so transport-layer integrity doesn't undermine the code being executed), but the data is still reliant on IPFS.

Let's do a thought experiment.

First, move your code to a Chrome extension (which makes the code immutable), then give the attacker an enormous amount of power. For a moment, assume that an attacker can totally defeat IPFS. It doesn't matter how they do it. Is your application-layer cryptography protocol still secure?


> First, [completely change your application and its core security assumptions]. Is your application-layer cryptography protocol still secure?

No.


If your application-layer cryptography protocol is not secure in isolation, what argument do you have against making it secure in isolation?


I have none. I'd merge a sensible pull request.

But characterising it as a huge security flaw is disingenuous. It's neither here nor there.


I'm characterizing it as a protocol/design flaw in something that bills itself as the most secure X, sure, but I haven't done anything to describe it as "huge".

Are you being needlessly defensive?


You mean truly catastrophic if it was codebook?


> to understand why choosing random elements from a set is harder than it seems

There is no sample-from-a-set problem involved; just convert base 256 to base 62. There is a correct way to do this.
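One way to read the base-conversion suggestion is to treat the 32 key bytes as a single large integer and write that integer in base 62. The sketch below assumes this interpretation; the alphabet ordering and function names are arbitrary choices for illustration.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols, ordering is arbitrary

def bytes_to_base62(data: bytes) -> str:
    """Write the big-endian integer value of `data` in base 62."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits)) or ALPHABET[0]

def base62_to_bytes(text: str, length: int) -> bytes:
    """Invert bytes_to_base62(); `length` restores any leading zero bytes."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n.to_bytes(length, "big")

key = secrets.token_bytes(32)
encoded = bytes_to_base62(key)
assert base62_to_bytes(encoded, 32) == key
```

Because the conversion is reversible, the full 256 bits survive; the cost is a string of up to 43 characters instead of 32.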


I would like to point out one thing. There are 3 or 4 people in cryptography that are respected for their amazing talents in the field AND actually take the time to comment around here on articles.

For some reason you got lucky, and all of them are in this thread talking to you, attempting to give advice. Advice that some people pay a LOT of money for, to these specific people.

Most of your responses to them have read as very defensive and argumentative. Out of everything said so far, this is the thing that scares me about possibly using this system.


Delivering applications over IPFS presents a different set of assumptions compared to traditional applications.

It's frustrating when people act like I haven't thought through those assumptions and are handing out cargo-cult advice that is not applicable in this application.

The non-uniform key generation is a bug, and I'm glad somebody noticed it.


You've made a very large, arrogant, and bombastic claim: "The world's most secure X.... guaranteed". While this is followed by a "* not guaranteed", implying it is some kind of joke heading, the tone remains, and given that, you must reasonably expect to get a hard time from HN / security experts. You might be able to back up this claim, but you really shouldn't feel frustrated by the response here given its tone.

Had you come on with something reasonable and descriptive like "Hardbin: a security-focused IPFS based pastebin", you might have received a more tempered response (though still, I'm sure, skeptical - as with all things security-oriented).


If you add a check box to make it a "one time secret" this would be very useful for sharing passwords.


If your security can be degraded to the same security model as everyone else then you have the same security model as everyone else. Your pastebin is cool and unique; no need to lie about it.


It can't "be degraded" by others. You decide whether you want to degrade it or not.


The world's most secure encrypted pastebin, guaranteed

What could possibly go right?


finally, i can combine the security of the browser with the convenience of seeding torrents!


Great phrasing lol...


https://github.com/jes/hardbin/issues/1

You don't exactly need clairvoyance to predict this outcome.


There really needs to be an awareness campaign for developers to let them know encryption != authentication != integrity.


Authentication and integrity, in this instance, are provided by IPFS.

If you don't trust that IPFS is providing you with the correct content, then you also can't trust that it's providing you with the correct code.

If you don't trust the IPFS node, it could just ship off your decryption key to a remote server. There is no getting around trusting IPFS to work correctly. And if you trust IPFS to work correctly then there is no benefit to authenticated encryption.

But you can run your own IPFS node locally. Then you're only trusting it as much as you trust your local system. If you don't trust your local system then you can't trust anything.


I'm not deeply familiar with IPFS, but on a cursory look it appears to be accurate that it optionally[1] offers authentication via HMAC for file objects. Can you indicate how this works in hardbin? I don't see any mention of it in the GitHub readme - is it activated by default or is it assumed that a user has it activated in their own copy of IPFS?

Also, while I have you, I recommend you use SJCL for generating random bytes in response to this: https://github.com/jes/hardbin/issues/1 [2]

On a meta note, can I ask why you didn't go with something like Tahoe-LAFS[3] if you were looking for a secure, decentralized file store system? I don't immediately know that it would have been better per se, but I'm not quite sure what IPFS provides that you can't get otherwise.

EDIT: I'm not trying to grief you here, but there are three people in this thread already (myself included) who know security very well (professionally so), and I want to point out comments like this one and the GitHub issue that was opened are good-faith attempts at that.

__________________________________________________________________

1. https://ipfs.io/ipfs/QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7...

2. https://bitwiseshiftleft.github.io/sjcl/doc/symbols/sjcl.prn...

3. https://tahoe-lafs.org/trac/tahoe-lafs


There is no HMAC used in Hardbin. You have to trust that the IPFS path you are accessing is correct, otherwise all bets are off.

The code and data are served out of the same place. Anybody who can modify the data can modify the code. There is no benefit whatsoever from trying to do any more authentication.

I'll have a look at SJCL, thanks.

I've not played with Tahoe-LAFS, I don't know anything about it.

I fully intend to fix the non-uniform key generation. That is a bug.

I'm yet to hear a convincing argument that the unauthenticated encryption is a problem.


Ah, I understand, thanks for clearing that up. So each IPFS node in your connection path must support data authentication.

I have two suggestions, then:

1. Implement data integrity and authentication directly in Hardbin, instead of offloading it to IPFS, such that it can be used with both confidentiality and integrity even if assumed-hostile nodes are part of the connection path. To do this I'd recommend using HMAC-SHA256 for authentication, and AES-CBC is probably fine for the encryption. You could combine the authentication and encryption with something like AES-GCM or AES-OCB, but I personally wouldn't do that.

2. Explicitly state upfront and center in the readme that Hardbin is currently abstracting the duty of authentication and integrity to IPFS using that sort of terminology. Now it's clear to me why you're saying that an untrusted IPFS path shouldn't be used, but if you used that sort of language it would be more "formally" expressed cryptographically speaking. The fact that you're mentioning that attackers cannot manipulate files but not explicitly describing how authentication works (e.g. as you do with encryption) is an antipattern.
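Suggestion 1 above is usually built as encrypt-then-MAC. The sketch below shows only the MAC half using the Python standard library: `ciphertext` stands in for AES-CBC output (the standard library has no AES), and using a MAC key that is separate from the encryption key is an assumption of this sketch. The function names and the layout IV || ciphertext || tag are illustrative, not Hardbin's format.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over IV || ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag

def open_sealed(mac_key: bytes, blob: bytes, iv_len: int = 16) -> Tuple[bytes, bytes]:
    """Verify the tag in constant time, then return (iv, ciphertext)."""
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed; refusing to decrypt")
    return body[:iv_len], body[iv_len:]

mac_key = secrets.token_bytes(32)
iv = secrets.token_bytes(16)
ciphertext = b"opaque bytes produced by the cipher"  # placeholder, not real AES output
blob = seal(mac_key, iv, ciphertext)
assert open_sealed(mac_key, blob) == (iv, ciphertext)
```

The tag covers the IV as well as the ciphertext, and decryption would only be attempted after `open_sealed` succeeds.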

There's a bit of a runaround here - Hardbin is designed to be an "encrypted, secure pastebin", but Pastebin is inherently an antagonistic medium for file authentication, which you'll really need for file integrity. It's designed to be fairly anonymous, which you have to trade off in some way if you want real file integrity.

The difficulty in this is that "encryption" on its own only offers confidentiality, and in modern cryptography that level of assurance is relatively rarely used in complete cryptosystems. It's not necessarily helpful to have confidentiality across a connection without also having integrity. So you can add decentralized components and encryption to it via Hardbin, but (in my opinion), you're not significantly adding a ton of value to it because I'm not really clear what the use case is where you want to securely share files in a decentralized manner, but you also don't really mind if the files are not protected against manipulation.


> Ah, I understand, thanks for clearing that up. So each IPFS node in your connection path must support data authentication.

No, I still think you don't understand.

You only need to trust your local node. If another node on the path modifies the data, your local node will reject it because it won't match the hash.

Content is addressed in IPFS by its content hash. The Hardbin "URLs" are content hashes.

EDIT: I see the confusion! When I said you need to trust the "IPFS path", I was talking about a path like "/ipfs/QmXyE...". Not a connection path. You need to trust the IPFS path (i.e. content hash), and your local node, but nothing else.
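The "address is the hash" idea can be modelled in a few lines. This is a toy sketch, not IPFS: real IPFS identifiers use a multihash encoding and large files are split into linked blocks, none of which is modelled here. The class and method names are invented for illustration.

```python
import hashlib

class ContentStore:
    """Toy content-addressed store: the address of a block is its SHA-256 hash."""

    def __init__(self) -> None:
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # The reader re-hashes what it received and rejects any mismatch.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

store = ContentStore()
address = store.put(b"code and ciphertext")
assert store.get(address) == b"code and ciphertext"
```

Whoever holds the address can detect modified content without trusting whoever delivered it, which is why only the address and the local node need to be trusted in this model.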


Ahhh... so basically, this is a system where you inherently trust the integrity only of your own computer. Yes, sorry about that; I misunderstood what you meant when you used "connection"... in my head I'm thinking of stateful networks, not a filesystem.


That's it, yes.


I was waiting for this kind of comment: http://cryptologie.net/article/389/a-hash-function-does-not-...


That's correct, but to provide some color to the comment since it doesn't really explain why it's correct:

Hash functions can provide integrity against accidental errors or very simple manipulations related to XORing the message or digest. This might not be useful against an active attacker, but technically speaking it is a (weak) form of integrity, which is why we use checksums.

Message authentication codes (MACs) are required to assure integrity against active attackers because you need to elevate to authentication, and hash functions cannot provide authentication (at least not on their own). Authentication is a stronger notion in security than integrity because it simultaneously guarantees and requires integrity - there is no point in having integrity without authentication, because you'd be assuring data integrity without assuring the data origin, which short circuits the integrity problem for an attacker. Conversely, you can't have authentication without data integrity because an attacker could simply forge the data origin.
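A small sketch of the difference, for the case where the digest travels alongside the message (it does not model a digest obtained separately over a trusted channel). The key, messages, and variable names are all made up for illustration.

```python
import hashlib
import hmac

key = b"shared secret the attacker does not know"  # hypothetical key
message = b"pay alice 10"

# The sender attaches both a plain hash and a MAC to the message.
digest = hashlib.sha256(message).digest()
tag = hmac.new(key, message, hashlib.sha256).digest()

# An active attacker replaces the message and recomputes what they can.
forged = b"pay mallory 10"
forged_digest = hashlib.sha256(forged).digest()                    # needs no secret
forged_tag = hmac.new(b"guess", forged, hashlib.sha256).digest()   # wrong key

# The receiver's checks on the forged message:
hash_check_passes = hashlib.sha256(forged).digest() == forged_digest
mac_check_passes = hmac.compare_digest(
    hmac.new(key, forged, hashlib.sha256).digest(), forged_tag
)
print(hash_check_passes, mac_check_passes)
```

The hash check passes because anyone can recompute a hash; the MAC check fails because producing a valid tag requires the key.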


> Authentication is a stronger notion in security than integrity because it simultaneously guarantees and requires integrity - there is no point in having integrity without authentication

The "Authentication" in "Message Authentication Code" really is a term of the past that we now use to mean cryptographic integrity.

Another way to see this: integrity is not a security property provided by hash functions, in the cryptographic sense of the word.


Yep, that's correct. It's still helpful to differentiate between authentication guarantees, however (e.g. data origin vs peer entity auth).


The footnote says: (* this is not a guarantee). OK, I guess...


Someone might make a snarky remark on HN.


*not guaranteed


good for storing passwords. /s


It's very interesting but the title seems a little hyperbolic as there are numerous caveats (from the "About" section):

- "Note that the security benefits of hardbin only apply when accessing it over a local (or otherwise trusted) gateway. If you access it over a gateway that you do not control, then the security model degrades to be equivalent to that of traditional encrypted pastebins."

- "The content will need to be pinned to make sure it stays around for long term (the same as any content stored in IPFS)." (though this doesn't impact the merits of its encryption)

- "You need to make very sure to use a known-good version of the code when creating pastes, as it would be trivial to create a malicious version that looks identical. The best thing to do is write down the hash the first time you use it, and always use the same hash. If you want to upgrade to a new version of the software, you'll need to update your hash."

- "I don't recommend using hardbin for highly critical stuff as the code has not been thoroughly audited by anyone but me."

The last is perhaps the most worrying thing.


These are all standard security concerns. Anyone who brings you something similar and does not have a long list like this is in marketing or sales.


If these are all standard security concerns, doesn't that make this a standard security pastebin?

I think the point is that it's an extraordinary claim with unextraordinary evidence.


The vanilla "secure pastebin" is simply an insecure pastebin served over HTTPS.

Better, actually-trying-to-be-secure pastebin implementations will encrypt the data with JS before sending it to the server, and later decrypt it with a key that's embedded in the link's URL fragment (which the server never sees).
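The reason the server never sees the key is that the fragment (everything after `#`) is not part of the HTTP request a browser sends. The sketch below just splits a URL to show which part that is; the URL and the `k=` convention are invented for illustration and are not any particular pastebin's format.

```python
from urllib.parse import urlsplit

# Hypothetical link; the key material after '#' is a placeholder.
url = "https://pastebin.example/paste/abc123#k=hypothetical-key-material"
parts = urlsplit(url)

# What the browser puts in the request line: path plus query, no fragment.
request_target = parts.path + (("?" + parts.query) if parts.query else "")

print("sent to server:  ", request_target)
print("kept client-side:", parts.fragment)
```

Script running in the page can still read the fragment, which is why the integrity of the served code matters so much in this design.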

The security evolution here is that both the ciphertext and the code for the webpage to decrypt it are stored on IPFS, which is a content-addressed filesystem.

Thus, so long as you access pastes through a trusted (i.e., local) IPFS node, the hardbin server operator can't insert code on the webpage after the fact to exfiltrate the key or plaintext back to them.


Does anyone remember when the phrase "extraordinary claims require extraordinary evidence" was used in regard to truly extraordinary claims? Like room-temperature superconductors. Or cold fusion. Or that aliens were responsible for the birth of human civilization?

This is not an extraordinary claim. This is the claim of "I made secure software, and I reckon it's pretty good" wrapped up in the extremely ordinary common English-language hyperbole of calling something 'the best' when there is no way to know if it is, or prove it one way or the other.


However, the use of hyperbole transforms the claim into an extraordinary one.

That is, unless you consider hyperbole to now be standard, in which case we have to likewise transform the original material being quoted.

Otherwise we're comparing apples to super duper hyper bestest-ever oranges.


Yep, I consider the use of hyperbole to be a standard colloquialism in this case, not a claim that they did an in-depth experimental study of pastebins and have compiled a report for peer review.


Or just knock out risk areas using proven solutions, in ways similar to past work and within the parameters the cryptographers state, instead of new, fancy stuff. Plus one for a memory-safe language that is low-level enough to prevent leaks. Works more often than the new, fancy stuff.


We took "the world's most" out of the title above.


It's still inaccurate with the word "secure" in there. That's a more widespread problem, though. ;)


For how long is publicly available, encrypted data going to resist bruteforce decryption? 30 years perhaps? Usually the "millions of years" estimate ignores Moore's law.

Your data might be secure when it is encrypted and never leaves your network.
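The effect of Moore's law on a brute-force estimate can be worked out directly. This is a rough sketch under loudly stated assumptions: the attacker starts at 10^12 guesses per second, that rate doubles every two years forever, and physical or economic limits are ignored. All three are illustrative choices, not measurements.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_search(key_bits: int, start_rate: float, doubling_years: float) -> float:
    """Years to try half of a 2**key_bits keyspace when the guess rate starts
    at `start_rate` keys/second and doubles every `doubling_years` years."""
    target = 2.0 ** (key_bits - 1)
    # Keys tried by year t: scale * (2**(t / doubling_years) - 1)
    scale = start_rate * SECONDS_PER_YEAR * doubling_years / math.log(2)
    return doubling_years * math.log2(1 + target / scale)

for bits in (128, 190, 256):
    print(bits, "bits:", round(years_to_search(bits, 1e12, 2.0)), "years")
```

Under these assumptions each extra key bit adds roughly one doubling period, so the answers come out in centuries rather than millions of years, and they move a lot if the doubling assumption is changed.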


Why not just gpg-encrypt and ascii-armor your text for the recipient(s), and put it in an ordinary pastebin?


Exactly. Some kind of proven cryptosystem implemented with great UI on untrusted, online storage. Options to do it over various transports with HTTPS a default that most users can tolerate. This one does all this fancy stuff which has so much complexity in its trusted computing base (TCB) that it's likely to get smashed somehow. That's assuming the protocols and interfaces are correct which is often not true.

Plus, the "world's most secure" would probably use HSMs at some point, with real ones at the server and smartcards w/ open standards at the user side. I know a guy working to commercialize stuff like that. Stuff running on a vanilla stack w/ unvetted code and non-tamper-resistant hardware isn't most secure anything.


Ease of use for both encrypters and decrypters?


Why not just use Tahoe-LAFS? Then you don't have to trust the server to get the security guarantee!


If you access hardbin through your own, local, IPFS gateway, then you don't have to trust a third-party server.

I assume if you used a third-party Tahoe-LAFS gateway instead of running your own, you would have the same security risks.


I was always a fan of paste.ee, but totally welcome more encrypted paste web sites.


For a simpler secure pastebin, try https://sptpb.pw.


Also another alternative.

https://privatebin.info


And another one: https://safepaste.org (although it currently suffers from a bug)


Pastebins tend to get caught up fast in very dodgy/shady/even illegal content.. just a heads up


But he is not hosting anything. It's a self-hosted peer-to-peer pastebin storage system


Well, encryption helps with deniability, no?


This is very interesting. Could you elaborate or possibly share any resources you may have / know of on the matter?


Any quick way to minimize nasty uploads?


Time to add IPFS to 0bin.net then?



