Service lets you "certify" a document using the Bitcoin blockchain (proofofexistence.com)
179 points by obiefernandez on Nov 27, 2013 | 83 comments



This idea is generally called "trusted timestamping": http://en.wikipedia.org/wiki/Trusted_timestamping

One example would be to take a photo of a rental car showing damage at the time you rented it and be able to prove it was taken at that time. Then the rental company cannot later claim you caused the damage.

This is a pretty clever use of the blockchain as a publicly-visible and authenticated timestamp. This way, the site's owners do not have to establish themselves as a legal authority on timekeeping in order for this to be a trusted service.


Naval, founder of AngelList, has described[1] this and many other exciting uses of Bitcoin, going beyond basic money transfer:

Everyone has a copy of the Bitcoin block chain, so anyone can verify your transactions. You can write software that will crawl the block chain and generate automatic accounting histories for tax and verification purposes. You can engage in “Trusted Timestamping” – take a cryptographic signature of any document, timestamp it, and put it into the block chain. Anyone can verify that the document existed at a given time. If you sign the document with your private key and another party signs it with theirs, it becomes an undeniable mutually-signed contract. This entirely eliminates notaries and websites like https://www.proofofexistence.com/ are showing the concept. The Namecoin project is building a distributed Domain Name System that allocates and resolves Domain Names without needing ICANN or Verisign, by using the block chain to establish proof-of-ownership.

Really worth a read.

[1] http://startupboy.com/2013/11/07/bitcoin-the-internet-of-mon...


It turns out you can build a surprisingly wide variety of secure applications with just a trusted timestamp (or even simpler, a trusted incrementing counter). This paper, published at NSDI 2009 (where it won the best paper award), is a pretty good read: http://research.microsoft.com/pubs/78369/trinc_nsdi09.pdf


I was developing a similar online notary. My idea is to timestamp open code, docs, etc. to create a prior-art database against patent trolls and goliaths.


Are there any documented cases where that would help?

It seems to me that the real problem is convincing a jury that a prior piece of work is related to a patented piece of work.


Yes, you might be right. NIST standardized timestamping methods are themselves full of patents. And I believe a jury would accept only those plus a certified authority willing to testify in front of the jury.


> NIST standardized timestamping methods are themselves full of patents

Huh?

> And I believe a jury would accept only those plus a certified authority willing to testify in front of the jury.

I think you've missed my point. I'm saying that the problem isn't the TIME of the prior art, it is proving to a civilian jury that prior art relates to a patent.


Now I just have to trust their hashing algorithm..


Hashing is performed locally, probably via crypto.js, so you can verify it, I'd wager.
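For anyone who would rather not rely on the page's JavaScript at all, here is a minimal sketch of computing the same kind of digest on your own machine, assuming the site uses plain SHA-256 over the raw file bytes (as other comments in this thread suggest). The function name and the file name are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # "contract.pdf" is a placeholder name, not something from the site.
    sample = Path("contract.pdf")
    if sample.exists():
        print(sha256_of_file(sample))
```

If the digest printed here matches the one the page shows for the same file, the browser-side hashing did what it claims.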


This is definitely a cool idea, but there could be potential issues if it is to be relied upon for long periods of time. If at any point in the future the hash algorithm used (SHA256 right now it seems) is found to be vulnerable then it could invalidate all past certifications. You don't really even need a full collision attack, a chosen prefix collision attack is enough to completely destroy the system's validity. MD5 is already vulnerable to chosen prefix attacks, maybe in 5 or 10 years SHA256 will be too...

Perhaps using a combination of different hash algorithms that we know are secure today to certify a file together would be a potential solution to the problem. It's not perfect, but at least this way all the algorithms used need to be compromised for the certification to break.
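A rough sketch of that "combine several hashes" idea, assuming SHA-256 plus SHA3-256 as the pair (my choice for illustration; the service itself appears to use SHA-256 only). The function name is hypothetical.

```python
import hashlib


def combined_digest(data: bytes) -> str:
    """Concatenate the SHA-256 and SHA3-256 digests of the same input."""
    first = hashlib.sha256(data).hexdigest()
    second = hashlib.sha3_256(data).hexdigest()
    # A forger needs one pair of inputs that collides under BOTH functions.
    return first + ":" + second


print(combined_digest(b"foo bar"))
```

Note that this is concatenation of two independent digests, not one hash applied on top of the other. With nesting, a collision in the inner hash carries straight through the outer one.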


Collisions are a problem if you want to corrupt data, not if you specifically want a human to validate it in a single document.

E.g.:

I claim that I authored the string "foo bar" with hash "1f2ec52b7743687..".

Now you find a collision and go to court arguing:

>> Your honor, but "3HSSHog*8FF9 z!!!!ady94765&$^#" also has the same hash!

I think the court will still assume that "foo bar" is the correct original, not the garbled collision data you produce. And you still can't deny that the original produces the correct hash.


That's not why it would be a problem.

The problem is that I can author "foo bar" and "moo bar" constructed to hash the same, and then assert retroactively which one I meant. While collisions are likely to have random binary garbage in them, it may not matter - I am on my phone now, but IIRC I have two PDFs on my desktop that have the same MD5, one of which when viewed is an Airbus pamphlet, the other being a Boeing one. (created by Dan Kaminsky, IIRC)


Can we see them?


I guess they are on a backup or misremembered - but examples are really easy to find on the net if you want them:

http://th.informatik.uni-mannheim.de/people/lucks/HashCollis...

http://www.win.tue.nl/hashclash/ChosenPrefixCollisions/ has a multicollision: 12 PDF files with different content and the same MD5 hash.


An application of each once? By the way, my crypto is weak, but do answer this. If I apply a hash algorithm to something, it outputs a relatively small number of bytes. Correct me if I am wrong, but the strength of the hash lies in the fact that the input space is vast, so it's difficult to build an input-output mapping for reversing the hash. Now if we apply one hash over another, aren't we restricting the input space of the second hash? Wouldn't that make it easier to crack the second hash for this particular usage? And if the above is correct, then the strength of such a mechanism lies only in the strength of the first algorithm.


I also developed a similar project:

https://www.btproof.com/ https://news.ycombinator.com/item?id=5790382

Recent changes to Bitcoin made the method I'm using to timestamp the data on the blockchain somewhat problematic [1], but I'll be working on an update once I release another Bitcoin-related project I'm currently working on (hopefully in the coming days).

[1] https://github.com/shesek/btproof/issues/1


The idea is cool but it might be too early for people to use the blockchain like this as right now BitCoin can support 7 transactions per second.

Once that hard limit is lifted, and things like this can scale and support demand, applications like this could be very interesting.

One thing though, it says the BTC involved in the transaction is unspendable, isn't that a bad thing? I imagine an idea like this that didn't render any amount of BTC unspendable would be ideal.

https://en.bitcoin.it/wiki/Scalability#Current_bottlenecks


Wow, didn't realize that! That's bad news, since the current daily transaction rate (~100k) is within an order of magnitude of the daily limit (~600k = 7 x 3600 x 24).

Fortunately, the doubling time is, from eyeballing the chart below, about 6 months, so that allows ~1-2 years for a fix.

https://blockchain.info/charts/n-transactions?timespan=30day...
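The back-of-the-envelope numbers above can be checked in a few lines. All three inputs are the rough figures quoted in this thread, not measurements.

```python
import math

MAX_TX_PER_SECOND = 7          # limit quoted in the thread
CURRENT_TX_PER_DAY = 100_000   # rough daily volume quoted in the thread
DOUBLING_TIME_MONTHS = 6       # eyeballed from the chart, per the comment

# Daily ceiling if every second of the day ran at the maximum rate.
daily_cap = MAX_TX_PER_SECOND * 3600 * 24

# Number of doublings needed to go from current volume to the ceiling.
doublings_needed = math.log2(daily_cap / CURRENT_TX_PER_DAY)
months_until_cap = doublings_needed * DOUBLING_TIME_MONTHS

print(daily_cap)  # 604800
print(round(months_until_cap, 1))
```

That works out to roughly two and a half doublings, so somewhere between 15 and 16 months under these assumptions, which lines up with the "~1-2 years" estimate.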


Well I was expecting there would be a significant jump in transaction volume as news spreads, prices rise, more services are offered in BTC and more countries get local exchanges. You also seem to miss that it's not the total number of transactions per day you need to worry about, it's the transactions per second.

There are surely periods of frenzy in the day where we cannot support 7 transactions per second as more economic activity takes place, even if the total transactions for the day is well under 600000. It's not like the transactions are neatly distributed on a flat line throughout the day.

There must also be some consideration that each time this is done, the BTC in the transaction is rendered unspendable and so it is taken out of the market. So I guess it's not good if something like this could scale on top of BitCoin.


> BitCoin can support 7 transactions per second

It doesn't really matter, it's not like they're at all close to that limit. The limit is there to create a market with transaction fees, so it's not just going to be "removed" any time soon.


> The limit is there to create a market with transaction fees, so it's not just going to be "removed" any time soon.

The limit is actually there to limit the size of a block. It has nothing to do with transaction fees. Click the link I posted: https://en.bitcoin.it/wiki/Scalability#Current_bottlenecks

It used to be 250K and then this year was raised to 1M so it probably will get lifted again. That is the goal of the project, to allow it to scale to at least PayPal numbers (46 transfers per second).


> The limit is actually there to limit the size of a block. It has nothing to do with transaction fees.

Again, the limit is there to create a market with transaction fees. People jostle for space in the block and a market develops around how much people are willing to pay to get into the next block. It's based on the assumption that miners pick the transactions with the highest fees to include in their blocks, which does happen to a certain degree.


No. That is redefining the purpose of this limit. It was meant only as an anti-spam measure and now people like you claim it is necessary. There is a _natural_ limit to the minimum transaction fee which is determined by the orphan cost (including transactions means slower block propagation means higher chance of a found block being orphaned by another, concurrent block). Bitcoin needs to eventually scale to survive and that means removing the block size limit and optimizing the protocol for lowest possible orphan cost.


It's limited so that the blockchain grows at a linear rate instead of an exponential rate.


Huh? If Bitcoin is really successful and eventually used by everyone on the planet and people use it with about constant rate, the blockchain will grow linearly. Until then bitcoin will grow presumably S-shaped which means that there will be a phase of exponential growth that we are in now. If it is crippled by people insisting on 7txn/s for "economic reasons", its transaction rate will probably peak and then decay.


So then isn't this (or satoshi dice) the sort of thing the limit is trying to create a disincentive for?


A couple of great developers from HackTX a couple of weeks back created something similar during the hackathon there in just 24 hours.

https://www.hackerleague.org/hackathons/hacktx/hacks/proveme with a demo copy @ http://162.242.216.46/


I'm one of the two developers.

I got the proof server working again so you can play with it.

Try proving data and typing in "testing" (lowercase). It will tell you when we put it in the blockchain. This file should also work: http://s3.amazonaws.com/rapgenius/filepicker%2FvCleswcKTpuRX...

It's a little patchy, but it does work! Payments are fake, and don't do anything! So don't pay! I don't know how much our Bitcoin wallet has left in it. I'll try to remember what we've uploaded so you can try it out.

Bitcoin/crypto code here: https://github.com/wyager/hacktx-proveme


My friend and I did this as well for a hackathon (HackTX) a few days ago!

162.242.216.46

Try proving data and typing in "testing" (lowercase). It will tell you when we put it in the blockchain. This file should also work: http://s3.amazonaws.com/rapgenius/filepicker%2FvCleswcKTpuRX...

Here is our bitcoin/crypto stuff. https://github.com/wyager/hacktx-proveme

It's not up to my usual quality because we did it in 24 hours with a sleep break! Please don't judge me on this :p


Couple questions.

1) AFAIK, bitcoin has "comments" within transactions. Why not embed the checksum as a comment in a transaction between wallets you control?

2) In the approach described here, no coins are being sent with the transaction (right?). Are blockchain participants really accepting NOP transactions involving 0 BTC transfers? Maybe I missed something.

3) As others pointed out, requiring the whole file to be uploaded is a non-starter for anybody savvy enough to be using this service. Users should be able to directly specify the checksum.


1) AFAIK BitCoin doesn't have comments[1]. When people have stored data in the blockchain in the past, it was by making a series of transactions in order so later those markers could be decoded to represent something else. Some services allow comments to be layered on top of transactions but that doesn't exist in the blockchain.

2) Money is sent and is rendered unspendable[2] afterwards, so I assume the idea is to send the smallest amount possible (although it seems the service takes a fee).

[1] https://en.bitcoin.it/wiki/Block#Block_structure [2] http://www.proofofexistence.com/about


1) Comments are enabled as a service by blockchain.info. For details on how the Genesis Block (and comment by Satoshi) was formed, look at the Bitcoin Wiki[1]

2) Coins are being sent. The Developer page says you must send at least 0.00000001 BTC. Edit: actually 500000 Satoshi.

3) Agree. So does the service. See the Developer Page[2]

[1] https://en.bitcoin.it/wiki/Genesis_block

[2] http://www.proofofexistence.com/developers


5460 satoshis is the current default minimum that Bitcoin itself allows (other transactions are valid, but won't be forwarded, thus you need to hand it over to a miner yourself for inclusion in a block).
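Since the amounts in this subthread switch between BTC and satoshis, here is a small conversion sketch (1 BTC = 100,000,000 satoshi). The helper name is made up; the three amounts are the ones mentioned in the comments.

```python
from decimal import Decimal

SATOSHI_PER_BTC = 100_000_000  # 1 BTC = 10^8 satoshi


def satoshi_to_btc(amount: int) -> Decimal:
    """Convert an integer number of satoshi to BTC without float rounding."""
    return Decimal(amount) / Decimal(SATOSHI_PER_BTC)


for amount in (1, 5460, 500_000):
    print(amount, "satoshi =", format(satoshi_to_btc(amount), "f"), "BTC")
```

So the 500000 satoshi figure is 0.005 BTC, and the 5460 satoshi minimum is 0.0000546 BTC.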


> 3)...

"Your document will not be uploaded. The cryptographic digest is calculated client-side."


This would be better if the site was served over HTTPS. At the moment anyone could MITM the Javascript to steal the file's contents.


1) It might be just doing that actually.

2) 500000 satoshis, as it says on http://www.proofofexistence.com/developers

3) If you are tech savvy, the api lets you send a checksum through a curl to their endpoint. http://www.proofofexistence.com/developers


2) Those 500000 satoshis are sent to the proofofexistence people and they keep them. In the actual proof transaction only 1 satoshi is sent to each of the proof addresses.


You can put "comments" (raw text) in the transactions. I have done so. Try piping the blockchain database into the unix util `strings` and you will see some. It is a non-standard transaction, and I don't know if they're still allowed.
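If you don't have `strings` handy, a rough Python equivalent looks like this. It is only a sketch: it scans a bytes object for runs of printable ASCII, and which file you feed it (for example one of the block files your node writes to disk) is up to you. The sample blob below is invented.

```python
import re


def printable_runs(blob: bytes, min_length: int = 4):
    """Yield runs of printable ASCII at least min_length long, roughly like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(blob):
        yield match.group().decode("ascii")


# Invented sample: binary noise with two readable fragments embedded in it.
sample = b"\x00\x01hello world\xff\x02ab\x00Proof!\x00"
print(list(printable_runs(sample)))  # ['hello world', 'Proof!']
```

Short fragments like "ab" are dropped because they fall under the minimum length, which is also how `strings` keeps random noise out of its output.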


Hi. Thanks for your questions. I'm the developer behind this. 1 and 2 have been correctly addressed by others. Regarding 3, the documents are NOT uploaded! All hashes are performed client-side via JavaScript. I should make that clearer on the site.


Can anyone ELI5 exactly what this accomplishes? What sort of scenario would this work for? I'm assuming some sort of legal purpose.


    I wrote a haiku
    to run through a hash function
    and send to strangers.

Every time this document is run through a particular function, it returns:

> d15396b27a2b176e6315c9fbbec09e2c2e042e595755902e5ff5eccec1ca634b

If I changed a single character of the document, the function would return an entirely different string.

This means it's very, very difficult to come up with another document that returns the same string when run through this same function.

If I sent my string to a bunch of strangers, they wouldn't know what my haiku is. To find it out, they would have to run through every possible document ever written (and that ever could be written) to hope to return the string.

But if someone decided to say they wrote my haiku, I could prove I wrote it first by showing that the document returns the unique string that I sent off to strangers.

What this service provides is a way of making it easy for strangers to store and date these strings for me, because they're doing it anyway when they're using Bitcoins.
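The "change one character and the whole string changes" property is easy to try yourself. This is a sketch assuming SHA-256; the exact bytes of the haiku document (line endings, indentation) are unknown, so it will not necessarily reproduce the digest quoted above.

```python
import hashlib

haiku = "I wrote a haiku\nto run through a hash function\nand send to strangers.\n"
# Change exactly one character: the final "s" of "strangers" becomes "5".
altered = haiku.replace("strangers", "stranger5")


def fingerprint(text: str) -> str:
    """Return the hex SHA-256 digest of a piece of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original_hash = fingerprint(haiku)
altered_hash = fingerprint(altered)
differing = sum(a != b for a, b in zip(original_hash, altered_hash))

print(original_hash)
print(altered_hash)
print(f"{differing} of {len(original_hash)} hex digits differ")
```

Both outputs are 64 hex digits long no matter how long the input is, and nearly every position typically changes.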


This means it's very, very difficult to come up with another document that returns the same string when run through this same function.

This isn't entirely true for some hashes (like MD5). However coming up with another Haiku, in English, that makes sense and has the same hash, is probably close to impossible.

So this probably wouldn't work that well as a service to see who first generated a random string, but works very well if we know something about the structure of the string in question (like that it is in a known human language and makes sense)


Nicely done.


It's like taking a fingerprint of a file and then storing it in a public and distributed database (so it is out of any single party's control). Basically you can prove something existed at a certain time (since if I take a fingerprint of that thing in the future, it will match the record in the distributed database). If the thing changes, so does the fingerprint.

It could be used as a form of "poor man's copyright", where people would send a book manuscript, or whatever, to themselves through registered post and keep the envelope intact so they could use it, if need be, in a court. There could be other uses: you could verify that a picture was taken when you said it was, etc.


Basically to prove that something existed before a specific date. For instance, grab a random piece of paper with writing on it from your desk and try to prove to me that the contents of that paper existed before last year; you can't really do it (easily). Before this, the only (easy) way to achieve this was to send the document to a third party to record in a database that it did in fact exist when it was sent, but how can you prove that database wasn't manipulated? Another common way to prove something existed before a certain time is to make a copy and mail the copy to yourself. Once you have received the copy, leave it unopened and now you have proof (through the mail service's date on the stamp). But this isn't too easy, it's somewhat of a hassle, and I think you could argue that it can be faked (faking the stamp, finding a way to get an opened envelope through the mail and then replacing the content later, etc.).

What this does is store a document's signature in the actual "history" of bitcoins, which is public. ANYONE and EVERYONE can access this history of all transactions, and because the signature of the document is stored in this public "log", you are now able to say that this document DID exist before the specific bitcoin transaction. This removes the hassle of all current methods of proving existence, and you no longer rely on a single third party (which could manipulate the record itself).


Your first paragraph isn't quite right. There have been cryptographic proof-of-timestamp systems for a long time, and they're described in standard crypto texts like Schneier's.

They work basically the same way, except that you have to set up your own network. When you want to timestamp something, you (just like this service) hash it together with the previous entry in the system, sign it, and publish that. The proof of the timestamp lies in how you had to look at the previous entry to make it, and the next, at yours. Compromising it would require a large number of participants to cooperate to re-write the history (and the participants' own records).
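A toy version of that linking scheme, to make the "each entry depends on the previous one" part concrete. This is only a sketch: it leaves out the signing and publishing steps, and the function names and the all-zero starting value are invented for illustration.

```python
import hashlib


def link_entry(previous_link: str, document: bytes) -> str:
    """Hash the document's digest together with the previous link."""
    document_digest = hashlib.sha256(document).hexdigest()
    material = (previous_link + document_digest).encode("ascii")
    return hashlib.sha256(material).hexdigest()


def build_chain(documents, genesis="0" * 64):
    """Return the links produced by timestamping the documents in order."""
    links = []
    previous = genesis
    for document in documents:
        previous = link_entry(previous, document)
        links.append(previous)
    return links


chain = build_chain([b"first", b"second", b"third"])
tampered = build_chain([b"first", b"SECOND", b"third"])

# Changing an earlier entry changes that link and every later one.
print(chain[0] == tampered[0], chain[1] == tampered[1], chain[2] == tampered[2])
# True False False
```

Rewriting one entry after the fact means recomputing everything after it, which is why everyone holding later links would have to cooperate.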

What this service does is avoid the need to integrate with an existing service or pay any of the record-keeping costs yourself, and it does so by piggy-backing off of bitcoin's distributed timestamp system, thereby making your timestamps at least as valid as bitcoin's, and increasing (significantly) the number of conspirators required to falsify it.

tl;dr: This is just like existing crypto timestamp systems except for the more extreme decentralization and using the bitcoin network's resources.


It's a way to use the bitcoin blockchain as a non-centralised way to record the fact that you were in possession of a particular file at a particular time, without exposing what that file is, or who you are.

The problem has been around for a long time, especially when dealing with semi-intangibles like Priority for scientific discoveries, or proof of first invention for patents[1]

One solution is to present proof to a trusted but private notary, who can copy or stamp or otherwise indicate that he has seen your documents and so they must have existed at least since the date indicated.

But counterfeiting, forgery, untrustworthy notaries, etc, all make this a less than ideal solution. So we add computers & crypto.

The document in question is condensed down to a single cryptographic hash, which (should[2]) to all intents and purposes be unique for a given document, despite being only a few tens of characters long, regardless of the size of the original input.

This hash then serves as proof[3] that you have the source document, without anyone being able to turn it back into the original document. This is a very one-way process.

Then, you need to find someone to vouch for your hash and indicate when they first saw it. You can do this with lawyers/notaries again as before[4], which partly solves the forgery problem, but not the trust one.

The solution proposed here is to store that hash in the bitcoin block-chain, which is a distributed log of all transactions on the bitcoin network, which has 3 nice properties:

1. It's append-only. Once your hash is encoded in there, it's staying there as long as bitcoin exists[5].

2. It's peer-to-peer/distributed. There's no single controlling organisation you need to trust for answers.

3. It's updated regularly enough that timestamps can be relatively fine-grained.

So you stuff it in there using this tool or whatever, and then X years hence when you need to prove you'd actually created that file in 2013, you should be able to prove that to most people's satisfaction.

This is a loose take on the matter and glosses over whole swathes of other complexities involved, but is probably close enough for [non]government work :)

[1] That is, who discovered/did something first. You may want to be able to claim you did in future, but without making it public at the time, because you might tip your rivals off to the idea before you've fully developed it. See: https://en.wikipedia.org/wiki/Scientific_priority

[2] Breaking (or "colliding") hashes is a whole subfield of cryptography research, and some pretty impressive things have been done there. It's why you probably shouldn't use MD5 for anything nowadays, for example. But again, we'll handwave "Done Properly = unique identifier".

[3] Well, in the same way that a password proves that you're the/a person who knows that password - if you did it right and never told anyone, it should be exclusively you. But if it leaks somehow, others could represent the hash as belonging to something they own. But if they only have the hash and not the original source document, there are relatively easy tests that could distinguish them. That's not very important here though.

[4] https://en.wikipedia.org/wiki/Trusted_timestamping

[5] well, mostly. But it's really hard, and the same capabilities let you defraud the rest of the bitcoin network with double-spending and whatnot, so unless your timestamp priority is super-important, you're probably ok.


One question. I know that it's nearly impossible to reverse a good hash or to deliberately create a doc that returns the same hash. So PREDICTING a collision (finding a doc that will collide with a given doc) is very, very, very difficult. But due to the size of the output and input spaces, we know for a fact that collisions do exist (right?). So if a very large number of documents are timestamped using this, their hashes WILL collide, right? I mean there is a POSSIBILITY that you may find proof for a doc that wasn't actually stored here, right?

(Just wondering. Not throwing criticism. I actually LOVE this idea!)
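To put a number on that possibility: for an ideal 256-bit hash, the chance that any two of n documents collide by accident is at most n(n-1)/2 divided by 2^256 (the usual birthday-style bound). A quick sketch, where the trillion-document figure is a made-up example and the hash is assumed to behave like a random function:

```python
from fractions import Fraction

HASH_BITS = 256


def collision_probability_bound(num_documents: int, bits: int = HASH_BITS) -> Fraction:
    """Upper bound on the chance of any accidental collision among the documents."""
    pairs = num_documents * (num_documents - 1) // 2
    return Fraction(pairs, 2 ** bits)


# Hypothetical load: one trillion timestamped documents.
bound = collision_probability_bound(10 ** 12)
print(float(bound))
```

Collisions do exist in principle, but under these assumptions the bound comes out far below one in 10^50, so an accidental match is not a practical worry. Deliberate collisions against a broken hash are a separate question.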


Great explanation man, can I borrow it for the site if I credit you? (I'm the developer).


I hope you take a moment to post this to /r/bitcoin.


Something like this:

Your provider stores when you had which IP address. This is used for instance in file sharing processes. At least in Germany it has turned out that this information is unreliable. So you want to log your IP addresses and you want to have a proof of the logs integrity. You could use this service for the purpose.

However, I don't like that I need to upload something. I'd prefer generating and sending the hash and get it signed.


I wonder if these concepts could be used to change the way land records are maintained and eliminate the need for title insurance [1] in the U.S. I am no expert but the current system doesn't seem to have changed much in the past century or so. This results in a lucrative business (title insurance) that effectively insures against bookkeeping mistakes.

[1] http://en.wikipedia.org/wiki/Title_insurance


If some claim to title is omitted from the public record, how exactly would this guard against it?


Two thoughts.

1) Will this have legal weight? I know that the whole "prove paternity of an invention by sending a certified letter containing the design through the mail, and not opening it until you need it" thing doesn't very much work, if it goes to court you tend to be told "OK, what you say is true in the physical universe, but you did not go through our blessed channels, so nyah nyah nyah".

2) Anything that makes bitcoins more legitimate / part of the world's infrastructure decreases the chance of bitcoin going away as a system. I think the idea for the bitcoin community is to make the system as a whole "too big to fail" before governments decide that they want to get rid of it.


> Will this have legal weight?

I doubt it. If I have to fax something for it to be a legal document, such an antiquated system will ignore any advances made around it.


Maybe it's because progress is faster, but I see a growing divergence between the physical world and the legal world. I've seen some post-Occupy hearings in which police brutality that had been proven beyond a doubt by video from multiple camera angles was deemed to "not count".

http://opinionator.blogs.nytimes.com/2010/01/11/the-true-ans...


The law doesn't pretend that signatures and fax machines are magic. Contracts written on bar napkins can be enforced.

Any time you introduce evidence, you have to establish its credibility. The first time a signing system like this is used it's likely to take the court some time to understand, but there's no legal doctrine forbidding it.

Source: I am a (not practicing anymore) lawyer.


I believe you could implement it in such a way that the coins involved are spendable, relying on "Pay to script hash": https://en.bitcoin.it/wiki/BIP_0016


"Your document will not be uploaded. The cryptographic digest is calculated client-side."

That's great, but I have to trust you there. For a document that really matters, I don't think I would.

How about a box that allows me to specify my own SHA256 instead?


Maybe calculate the SHA256 on your own box, then upload a document containing only the SHA256?


There is a developer API that allows you to do this:

    http://www.proofofexistence.com/developers


That's not the same as making it available for users though, is it?


How about miner's fee, is that added to the transaction?

In the About, they say that they make 2 unspendable dust transactions.

Would there be any incentive for miners to confirm these transactions and store them in the blockchain? If there's no incentive to confirm the transactions, then the miners might find a way to filter these out of the blockchain.

See also dust transactions:

http://bitcoin.stackexchange.com/questions/10986/what-is-mea...

http://bitcoinfees.com/


Nice concept! Apparently SHA-1 is vulnerable to collisions (http://eprint.iacr.org/2008/469.pdf). How are they handled?


It uses SHA-256, not SHA-1.


By using SHA-2?


Excuse me?


So, if I could automate this, could I use this to prove in court that logs were created, and not tampered with?

Just trying to think of practical uses of this cool project, and it's the first thing that came to mind.
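One way the automation could look: hash each rotated log file, then hash a manifest of those digests so that a single value needs to be timestamped per batch. This is a sketch of one possible approach, not anything the service provides; the file names, log contents and function name are invented.

```python
import hashlib
import json


def manifest_digest(file_digests: dict) -> str:
    """Hash a canonical JSON manifest mapping file names to their digests."""
    # Sorted keys and fixed separators make the bytes independent of dict order.
    canonical = json.dumps(file_digests, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented example logs; in practice these digests would come from real files.
logs = {
    "app-2013-11-26.log": hashlib.sha256(b"line one\nline two\n").hexdigest(),
    "app-2013-11-27.log": hashlib.sha256(b"line three\n").hexdigest(),
}
print(manifest_digest(logs))
```

You would keep the manifest and the log files yourself and only timestamp the final digest. That shows the logs existed in exactly that form by that time; it says nothing about whether they were accurate when written.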


Yes. Or that a contract was not created before a certain date. Or, as another poster suggested, that a photo of a damaged rental car was taken before you took possession of the car.


Or that a contract was not created before a certain date.

I think you have that one backwards. You can demonstrate a contract did exist at a certain date. There is no proof of non-existence before that date.


You're right, it can be used to prove that a contract with both signatures existed before a date, not after. So if Alice makes an agreement with Bob, Alice can't turn around and say that the contract was only ratified after Alice failed to meet certain obligations.

Then again, digital signatures come with a timestamp, Alice could lie about that timestamp, but it would be obvious to Bob that she did.

Fun to think about.


Cool! I worked on something similar before: https://github.com/Miserlou/CitizenMediaNotary

I've also got an idea related to this which involves a global array of satellites in order to add a geo component to this kind of verification..


This would help to certify that a contract was signed at a particular time by both sides, and then included in the blockchain.

It would also help certify that a particular person created a document, once again because it was signed by their personal key and included in the blockchain.

It basically functions as a certain timestamp.


well... you can save actual files to namecoin blockchain if you try a little.

my script that does exactly that (not updated, maybe won't work now)

https://github.com/runn1ng/namecoin-files


It would be nice if you included some UI for just pasting in a bunch of hashes myself.



Now this is fricken awesome. I can imagine all sorts of instances where this is applicable. What about as a replacement for a notary public in some cases? Very nice work, and a genius idea.


Would this have any legal standing? Legally valid but insecure will trump secure but inadmissible in nearly every situation where having a certified timestamp is useful.


Couldn't I also publish the checksum on Twitter? Although it's not decentralized, it's probably equally trusted in the eyes of the legal system.


Yes you could, however, this isn't only about the trustworthiness of your claim. More importantly it's about the trustworthiness of the longevity of the data.

Twitter can remove a single post or turn off their servers and your proof is gone. That's not so easy for a BitCoin transaction. Once it's added to the blockchain it's there for as long as anyone is running a bitcoin node.


But that's kinda centralizing it. What if twitter shuts down? What if twitter's DB is compromised? Bitcoin kinda makes all this impossible (cryptographically) and makes it super decentralized.


I was thinking along similar lines today, but geared more towards proof of title.


brilliant.



