Google's end-to-end key distribution proposal (code.google.com)
261 points by abdullahkhalids on Aug 28, 2014 | hide | past | favorite | 92 comments

The first comment by Mailpile seems to highlight the biggest problem to me:

>Hello! Bjarni from the Mailpile team here.

>This is an interesting proposal and sounds like a significant improvement over the current centralized key-server model.

>The main quibble I have with it is that it seems there's no concern given to the privacy of users' communications - the proposed gossip mechanism seems designed to indiscriminately tell everyone who I am communicating with. That's a pretty severe privacy breach if you ask me, worse even than PGP's web-of-trust because it's real-time, current metadata about who is interacting with whom.

>Am I misunderstanding anything here?

>- Bjarni

As I just posted as a comment to the proposal:

In the EU, e-mail addresses are personally identifiable information. It's not clear that an append-only log with no expiration or means for individuals to delete the content will even be legal in many EU countries.

What do you mean? When users sign up to a Key Registry, they are explicitly acknowledging that their emails and public keys can be used in the way described by the protocol. AFAIK, that's perfectly legal.

This assumes all requests are made by the owner of the e-mail address, with sufficient understanding to give consent vs. e.g. an e-mail provider that fails to understand the privacy concerns.

Even then, as pointed out by others, this does not preclude the user from withdrawing consent to continued use of the information.

It may or may not become a legal problem. But they really need to consider the privacy implications and have lawyers that actually know the relevant national laws throughout the EU member states to evaluate it.

In the EU, you can say at any point to a company 'delete every piece of personal information about me that you have', and an email address is considered personal information. Even if you gave consent, you can withdraw that consent at any point in time.

So, basically, I can tell my bank/credit agency/hospital "please forget that I got a mortgage/defaulted/tested positive for HIV"? Doesn't that sound rather risky?

It's not like that. You can demand to be informed about when the data is processed and for what purposes, get a copy of it, request corrections and object to processing in certain circumstances.

In practice, no, you can't demand to be wiped from the internal database of your creditor. You probably can prevent them from publishing a debtors' list with your name, though.

It's not that simple (in the UK at least), you only have to stop processing information if "unwarranted and substantial damage or distress" is being caused.

Additionally, all stop processing / deletion regulations require a court order, which will be subject to all the usual requirements of justice.

It only takes a single successful demand before an append-only log becomes untenable.

There are of course exceptions ;)

Yes, but if users later can't remove their information because the list is append only then that may not be legal in the EU.

This is no worse than the key infrastructure right now, which reveals not only all the email addresses but also their relationships.

One more reason to implement the append-only log using blockchain technology. Seems like a good fit. I don't think privacy laws can be applied to a blockchain, but probably that is still unclear. For example, if I post my email address in a Bitcoin transaction, can I then force every user to delete their blockchain?

You would not be able to stop individual users from doing so, but you could potentially make it legally untenable for commercial/corporate operators.

But personally, regardless of legality, if Google implements this with the degree of disclosure of e-mail details, I'll find another operator that isn't implementing this proposal.

The whole point is ease of key exchange. Of course if you don't want any hint of your email details out in the world you shouldn't use it.

Ease of key exchange does not justify publishing what is effectively a public ledger of who I start communicating with when.

I think you're misunderstanding the gossip protocol, but, regardless, that was not the claim I was responding to, it was that the disclosure of email addresses (or a hash of them) with no way to remove them from the Key Directory is somehow beyond the pale for a public (and automated) key exchange system.

Can you elaborate on how the gossip mechanism reveals who you are communicating with?

From my understanding (which obviously could be off) what is sent with the gossip protocol is a verification of the entire key directory, similar to a git commit hash.
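To make that idea concrete, here is a toy sketch (my own simplification, not the proposal's actual STH format): each party hashes a canonical serialization of its view of the whole directory, and gossiping that short digest lets two parties detect divergence without revealing anything about who is talking to whom.

```python
import hashlib

def directory_digest(entries: dict) -> str:
    """Hash a canonical serialization of the key directory.
    A real STH would be a signed Merkle tree head; this flat
    hash is a simplified stand-in for illustration."""
    canonical = "\n".join(f"{email}:{key}" for email, key in sorted(entries.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Alice and Bob each fetch the directory independently.
alice_view = {"alice@example.com": "keyA", "bob@example.com": "keyB"}
bob_view   = {"alice@example.com": "keyA", "bob@example.com": "keyB"}

# Gossiping the digest reveals nothing about Alice's contacts; it only
# lets Bob check that both of them saw the same directory state.
assert directory_digest(alice_view) == directory_digest(bob_view)
```

If a directory operator showed Alice and Bob different versions of the log (a "split view" attack), their digests would disagree, which is the compromise the gossip is meant to surface.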

I was amused to see that on neither of the links in the post does SSL work correctly. convergence.io identifies as whispersystems and www.certificate-transparency.org produces an ssl error on every version of openssl and browser I have to hand (I didn't try anything with non-standard ECC curves).

That said, however, the entire end-to-end project is for me one of the most interesting and exciting practical innovations in security in years.

Interesting. It has bugged me for some time that if the Web of Trust was bigger, it could grow exponentially and become universal: once someone you personally know has entered you into the WoT, you can be trusted and can trust others based on a number of public signatures on their public key. However, currently the WoT is so sparse that you cannot do this.

My idea was to use the existing web TLS platform to bootstrap the WoT to a sufficiently large level. I run my email on my own domain. Why can't I tell the WoT (and have it trust that it is true) that my public key is XYZ by putting it at https://igorpartola.com/pub.pem? GMail could do something similar and at least to start, we could get enough emails validated to start having the WoT spread on its own. Then we could modify the infrastructure to remove the public CA's and central authority entirely, by using the WoT itself. Google's HTTPS cert would then be based on its PGP key and be verified by humans inside the WoT.

I also think that the important part of the WoT is verifying emails/digital identity, not government docs. I don't care if I am talking to "Bob Bobber", I care that I am talking to bob@bobber.com. I may never have met bob@bobber.com, but I see his/her public git repos, blog, etc. and I want to connect to them securely.

Let's not forget that one of the most successful attack vectors is social engineering, that is, tricking people into trusting you and making you part of the WoT.

Solvable by only ever issuing marginal trust (technically, if enough people get socially engineered in, that still causes a problem, but that's a lot harder than fooling a single person).

Trusting above marginal level should be reserved for very few people if any at all. I don't have anyone with full trust.

Sure. But if you are willing to trust some CA to issue me a HTTPS cert for my domain, why are you not willing to trust that I serve my public key for this domain using this HTTPS cert to secure its transport? Oh, sure, some adversary could gain control of the web server used for this and replace my pub.pem, but then I will notice and revoke it. And once enough people download and sign my pub.pem, it no longer matters: I am now in the WoT and can remove pub.pem.

I think the issue is that a lot of people who are informed about CAs don't trust them, but we still don't have anything better that's anywhere even close to wide adoption.

That's what key revocation is about though: its an assumption that the WoT will get things wrong, but that we'll be able to retroactively undo some problems.

Conversely, it is why anonymity is an orthogonal goal for non-realtime communications.

Your server is already publishing a public key - in the certificate. "Humans" at the CA verified it.

What you're really arguing for is some army of volunteer CAs to do it (this is the WoT model summed up). However verifying identities is not fun, takes knowledge and skill to defend your private keys, and in the absence of payment will attract only a tiny number of uber-geeks who think the word "party" is a reasonable word to describe a bulk ID verification ceremony ;) This is why the WoT is a bust and nobody developing new crypto systems cares about it anymore.

>> This is why the WoT is a bust and nobody developing new crypto systems cares about it anymore.

I think the real reason nobody cares about it is because it gives you actual privacy. There is no way to exploit it commercially. All they could get is that data went "from here to there" with no idea what was in it. Not even Google could target ads with that.

I've met plenty of non-profit privacy activists who share Mike's pessimism about web of trust authentication.

Reading through the spec, there is something eerily familiar with the key directory implementation. Quoting:

Alice will obtain a proof from the Key Directory that demonstrates that the data is in the permanent append-only log, and then just encrypt to it.

Within the message to send to Bob, Alice includes a super-compressed version of the Key Directories that fits in 140 characters (called STHs, which stands for Signed Tree Heads). This super-compressed version can be used later on by anyone to confirm that the version of the Key Directory that Alice saw is the same as the one they see.

Append-only log. Global. Hash of the log's tip state at the time of use...

Smells like a mixed blockchain/git type approach - which is a good thing. The "super-compressed" version of the log tip sounds like git revision hash. The append-only, globally distributed log is pretty much like a blockchain.

And it attempts to solve a really hard global problem. I like it.

The blockchain (as most people understand it, at least) is an implementation of a Merkle tree. Which is why Git/Bitcoin are eerily similar - they both utilize Merkle trees for integrity.

The real innovation in the "blockchain" was using proof of work in combination with the Merkle tree in order to enforce a single history. Take that away and yes, it looks a lot like Git. :)
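For the curious, a minimal Merkle root computation looks like this (simplified: real implementations such as Certificate Transparency domain-separate leaf and interior hashes, which this sketch omits):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Pairwise-hash leaves up to a single root. Any change to any
    leaf changes the root, so a 32-byte value commits to the whole
    log. (Simplified: odd nodes are duplicated rather than promoted.)"""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:               # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"a", b"b", b"c", b"d"])
# Flipping a single leaf changes the root, which is what lets a short
# "tip" hash stand in for the whole append-only log.
assert root != merkle_root([b"a", b"b", b"c", b"X"])
```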

And if you consider that a git commit is actually "proof of work" and find a way to quantify the value of that commit (fixes issue x which was worth y points, passes all regression tests), you would have... gitcoin.

For fun: Stripe's CTF3 included an implementation of Gitcoin in which the work wasn't actually semantically valuable contributions to a repo, but rather brute-forcing the commit message until the commit hash meets a certain criterion.
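Roughly, that CTF3-style mining loop looked like the following (a simplified sketch hashing an arbitrary payload with SHA-1, rather than a real git commit object; the difficulty threshold is illustrative):

```python
import hashlib
import itertools

def mine(payload: str, difficulty: int = 4) -> str:
    """Brute-force a nonce until sha1(payload + nonce) starts with
    `difficulty` hex zeros -- the same trick as CTF3's Gitcoin, where
    the payload was a git commit and the nonce lived in the commit
    message. Expected work grows 16x per extra zero."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha1(f"{payload}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return digest

digest = mine("example commit", difficulty=3)
assert digest.startswith("000")
```

Verifying the result takes one hash; producing it takes thousands, which is the asymmetry that makes it proof of work.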

Git commits cannot serve as "proof of work" as the term is normally used, because they are not much easier to verify than they are to generate.

Proof of work is used to limit behavior by adding artificial cost - in hashcash they are used to make spamming more expensive, and in Bitcoin for controlling block creation.

It's not a desirable feature in a protocol if you can avoid it. In the case of coding contributions, it's easier and more reliable to have a central maintainer that accepts patches and pays out bounties.

Passing automated tests?

Would that work? I design and build a set of tests that extend existing OSS business app A. I post them up and ask for contributions ... Only to be accepted if tests pass.

But really the value I want is in "quality", a very hard-to-quantify idea.

But Test first development as a means of proving compliance is a good idea. Not sure the git chain is useful though.

Quickly, buy the .in domain name and throw a client up onto github ;)

Edit: Domain already taken; I guess after bitcoin's success people just bought up (word)co.in for every value of word they could think of...

I'm only half joking :) Gitcoin is much looser than bitcoin when it comes to verifying proof of work. However, there is a (practical) sense in which it exists. Fixes are submitted, accepted, merged, pulled until they are part of all users' "blockchains". gitcoin is the quantization of this karma. The lack of a hard definition of work means that it is perhaps easier to bootstrap in a centralized ecosystem like Github. Like Quora credits for code.

In much the same way as Quora credits are used to power A2As, gitcoin would enable you to ask J Random Coder on github for a fix and pay him in karma. The GPL is effectively a no-freeloaders mechanism, gitcoin could be another.

Ah, so that's why.

The term "Merkle tree" was familiar, but I didn't know what it was used for. (Read: never had reason to look it up.) Now I do.

Thank you. :)

  The special thing about this Key Directory is that
  whatever is written in the directory can never be
  modified: it's impossible to change anything there
  without either bringing the service down or telling
  everyone who's watching what is being modified.
The MIT PGP key server has eleven different keys in my name which I created in 1997, when I was about 12 years old. Of course, I have long since lost the private keys and e-mail addresses.

I guess with this proposal, the fact you used to go by benstillerfaggot69@verizon.net will be part of your permanent record.

Unless I missed a part of it, the key is only permanently tied to the email address - i.e. you could always say that benstillerfaggot69@verizon.net is a different Joe Bloggs and not you (probably only plausible if you have a common to medium rarity name though).

My suggestion is for them to work at least another 12 months on this, before they even begin to publicly test it on Gmail accounts. We need more privacy quickly, but let's not rush into a protocol that could last for another 20 years, if all the major e-mail providers adopt it. We need to get this right.

I wish the DIME (former Dark Mail) protocol was out already, too, so we can compare. There's also Adam Langley's Pond [1], but sounds like it's too complex, and it only works over Tor. And TextSecure/Axolotl could probably be used as a secure e-mail protocol, too, if you add a proper e-mail like interface. I hope the team behind End-to-End is looking at all of them.

[1] - https://pond.imperialviolet.org/

I have been through this thought process before. The conclusion I came to was that the implementations should be transparent, but that the user information should not.

Basically I was not going to put up a list of everyone's email addresses and keys anywhere, and certainly not who they connect with.

The more I looked into the problem, the more I realised that the vast majority of users would rather sacrifice security for usability. Even in my implementation people would rather not see the "Please verify this key with the recipient" page. They just want to get something done. I think this proposal from google would work well so long as their base implementation involves no additional steps beyond that of a normal email client.

My implementation uses a central key authority; however, the application is pure JavaScript, and the entire script is downloaded to the browser before the user enters their email address and password. After that no more code gets sent to the client. You can verify it won't steal your data.

I have the same problem of initial key exchange that everyone else does, but I give the user options to verify the keys themselves. Once they have, they encrypt their own contact list (along with keys) and re-upload it, thereby limiting the attack vector to the initial key exchange.

If anyone wants to have a look check out http://senditonthenet.com/

Do you use Key Escrow for private key storage? How can the other receiver decrypt the file using his browser only? Where do you store the private key?

The user's password acts as a symmetric key. It is never sent to the server, but a hashed copy is sent to the server for authentication, which is then rehashed and stored in the DB.

The user's private key is AES-encrypted with the password as key and sent to the server for storage. A JSON hash of their contacts is also encrypted in the same way and sent to the server for storage.
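A sketch of that split, assuming the important invariant is that the encryption key never leaves the client. Everything here (labels, iteration count, the derivation scheme) is illustrative, not the site's actual implementation, and Python's standard library has no AES, so only the key-derivation step is shown:

```python
import hashlib
import os

def derive_keys(password: str, salt: bytes):
    """Derive two independent keys from one password: an auth token
    that may be sent to the server, and an encryption key that stays
    on the client. Parameter choices are illustrative only."""
    master = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    auth_token = hashlib.sha256(b"auth" + master).hexdigest()  # sent to server
    enc_key    = hashlib.sha256(b"enc" + master).digest()      # never sent
    return auth_token, enc_key

salt = os.urandom(16)
auth, key = derive_keys("correct horse battery staple", salt)

# The server rehashes the auth token before storing it, so even a
# database dump reveals neither the password nor the encryption key.
stored = hashlib.sha256(auth.encode()).hexdigest()
```

The design point is that the server only ever sees material from the "auth" branch; the "enc" branch, which protects the private key, is computable only by someone who knows the password.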

I hope this works out. If enough of the big email vendors (gmail, outlook, yahoo, etc.) get on board the network effect could be enough to push adoption to a very high percentage.

Once that happens my "unencrypted" email folder would be viewed about as often as my junk mail folder.

And then... maybe just maybe... spammers will be faced with a serious challenge.

> And then... maybe just maybe... spammers will be faced with a serious challenge.

Why? Couldn't spammers encrypt the mail they send to you just like everybody else?

It'd probably become like SPF/DKIM - somewhat spammy companies who still have a genuine mailing list and will still opt you out if you ask (or even click 'Spam' in gmail etc via FBLs) will go to the effort to implement it, but random viagra spam being pumped out by botnets likely won't.

Because, I think, they would have to be part of the Key Directory too, listed under "Spammer".

This kind of implies a secondary directory market of ratings and rankings of the prime source (e.g. isASpammer, isWithinThreeHopsOnLinkedIn).

They'd just keep creating new keys, and use a botnet to exchange seemingly valid e-mails with itself and a trickle of seemingly legit e-mails (buy access to a few small legit mailing lists to make sure their botnet isn't a total island). It's not clear to me that this won't make it possible for spammers to manufacture/buy sufficient "trust" to put themselves in a better situation than they're in today.

Darn - I think you are right... Ah well, defeating spam is only a nice to have from this

Yes, they could, but it would at least cost them more resources, which could matter at the scale most spammers work.

AFAIK most spammers use botnets, so they don't really care about resources - it's not their resources to begin with.

How will the Key Directories and third party Monitors verify that I'm the real owner of my self-hosted email address user@myowndomain.com, uploading my real public key to the directory?

Depends on what you mean by "verify the real owner". They can verify that whoever has control of what gets uploaded to myowndomain.com also has control of the user@myowndomain.com email account and can prove they have the private key corresponding to the public key that was posted. Is that what you meant?

Is that verification process described somewhere in Google's proposal? I don't understand how it would work with third party Monitors. Would I have to prove my identity with some email-callback or DNS/HTTP token separately to them all?

Wouldn't proper compartmentalization dictate that email providers be explicitly eliminated from the end-to-end encryption process?

Apart from being able to inspect email and recognize that it contains content that looks like an encrypted form of something, I think we wouldn't want them to be explicitly informed that encryption was used, or know anything about the encryption algorithm, or know anything about how keys were distributed, or see any keys even public ones.

I think this would apply to the general case where someone uses an email service that is run by another party. In the common cases of major email providers with business models that conflict with privacy and security in various ways, the risks would be higher. Even before factoring in their being high-priority targets for hacking, government surveillance of questionable legality, etc.

TLDR is: Distributed (as in DNS) directory with append-only entries keyed by hash-of-email and third party replication/validation/logging.

Discussion of spam implications at https://moderncrypto.org/mail-archive/messaging/2014/000727....

Widespread encryption could be a recipe for Google (as the largest of the proposed directories) sliding in an identity reputation scheme (eg. based on location / communications history) and brokering it as a service.

So it seems they invented PGP keyservers with a monitoring protocol as a bag on the side?

I've had keys on keyservers for years. The monitoring side is interesting though.

It's also unclear how the whole directory will be compressed to 140 bytes - iirc, the best compression algorithms reduce text by ~80-90%, so it might work for a week or so, I guess.

Presumably the 140 characters is more like a git revision number than reversible compression.

Hash then compress. (Or hash then truncate.)
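In other words, the fixed size comes from hashing, not compression. A sketch (the real STH also carries a signature, tree size, and timestamp, which this toy version omits):

```python
import hashlib

def tree_head_140(log_entries: list) -> str:
    """Digest the whole log into a string that fits in 140 characters.
    A fixed-size one-way digest, not reversible compression, is what
    makes 'the whole directory in 140 characters' possible."""
    digest = hashlib.sha512("\n".join(log_entries).encode()).hexdigest()
    return digest[:140]  # a SHA-512 hexdigest is 128 chars, already under 140

# Works no matter how large the directory grows.
sth = tree_head_140(["alice@example.com:keyA"] * 100_000)
assert len(sth) <= 140
```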

"The model of a key server with a transparency backend is based on the premise that a user is willing to trust the security of a centralized service, as long as it is subject to public scrutiny, and that can be easily discovered if it's compromised (so it is still possible to compromise the user's account, but the user will be able to know that as soon as possible)."

So it's a central service where we can - eventually - find out if a key changes or is invalid, though not necessarily if anyone just breaks into the service. An attacker can still break in and monitor authentications to gather intel on users. Or they can automate an attack such that the keys are compromised and the attacker gets the access they want while you're asleep at 3am.

"a Monitor could notify the user whenever a key is rotated, and the user should be given a chance to revoke the key, and rotate it once again."

Now the user needs to see this e-mail about their key getting rotated, realize it's a compromise, issue a new rotation, and hope nothing bad happened in the meantime.

"making it so hard to hide a compromised key directory, it would almost require shutting down all internet access to the targeted victim."

This is completely within the scope of many different types of attacks used today, though often you only have to limit it to specific servers.

"The model envisioned in this document still relies on users being able to keep their account secure (against phishing for example) and their devices secure (against malware for example), and simply provides an easy-to-use key discovery mechanism on top of the user's existing account. For users with special security needs, we simply recommend they verify fingerprints manually, and we might make that easier in the future (with video or chat for example)."

So it really is just a PGP keyserver. For users who care about security, do something else to make yourselves more secure.

Slightly worrying to see the word "checksum" used to refer to a one-way cryptographic hash function...

Huh? The only mention of a checksum also acknowledges it's possible to brute-force reverse it, aka not one-way.

A checksum is a computationally efficient way to detect unintentional changes in data. But as protection against intentional changes it's quite useless. That's what cryptographic hash functions are for.

Now, the proposal described using a "checksum" to tie together two halves of a self-signature, where one half would be an identity and the second would be an email address, essentially. That would make it trivial to forge a second half with another email address (but the same "checksum").

I understand that's an entirely different problem. All I'm saying is that few cryptographers would use the word "checksum" at all, and even fewer would use it in this context. That's what worries me slightly.
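The worry is concrete: a plain checksum can be forged by construction, which is exactly what a cryptographic hash prevents. A toy demonstration, using a deliberately weak additive checksum of my own invention rather than anything from the proposal:

```python
import hashlib

def weak_checksum(data: bytes) -> int:
    """Toy additive checksum: fine for catching accidental corruption,
    useless against a deliberate attacker."""
    return sum(data) % 256

original = b"pay alice 10"
target = weak_checksum(original)

# Forging a colliding message is trivial: append one byte chosen to
# make the byte sums agree.
prefix = b"pay mallory 99 "
forged = prefix + bytes([(target - sum(prefix)) % 256])

assert weak_checksum(forged) == target                        # checksum fooled
assert hashlib.sha256(forged).digest() != hashlib.sha256(original).digest()
```

With SHA-256 in place of the checksum, constructing such a collision is computationally infeasible, which is why the word choice matters.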

The last people in the world you want managing your keys. Why don't you just send them directly to the NSA ?

As much as I generally distrust google with privacy when it comes to actually handing over data, in this case it's an open protocol that has just been drafted by google; anyone can run a server or make their own implementation. It also never touches private keys, only public. If you want a public key to have limited distribution, you don't put it on a keyserver, and instead only exchange it with the people you would like to in person or over some other verified and secure channel.

It does have a somewhat anti-privacy feature in that if I understand it correctly, it keeps a record of messages between participants (in the sense of a record existing that a message was exchanged although not the content), but that level of data is already accessible to NSA/GCHQ anyway.

Google would not give up the ability to see email content... maybe they would analyze it locally on your phone instead of on a remote server (that might even save them from buying a few servers, too).

This is part of End-To-End[1]. There's no way for their code to see the email contents, even locally in the browser. That's the whole point, in fact.

[1] https://code.google.com/p/end-to-end/

If you can see your emails on your Android device and Google is admin on your device, why do you say that it is impossible for them to read your emails? I don't mean that they could read them in the cloud, but they could read them locally on your Android device and, for example, send a message back to Google saying "I think this guy should get ads for a new router"... but of course they could do much worse (they could send whole messages back, for example).

End-to-end is a browser extension that doesn't work on Android.

that is probably because end-to-end is not fully implemented yet...

Android was designed to leak as much information about the user as possible all the time, even to third parties. Spy satellites have less invasive software.

But isn't keyword-based advertising their only revenue source from non-"Apps for Business" accounts?

You think Google can write a proposal like this without the NSA getting involved? The fact that they are preserving metadata is revealing and meaningful. It's another PR stunt.

Google can never never be trusted again. They publicly lied about PRISM, and they got caught. These people have no business making security protocols for us.

What statement did Google make, regarding PRISM, that was a lie? And what evidence caused them to get caught?

What is the actual lie?

More to the point, what is the truth? Google leadership post 9/11 knew what "certifying" their communication systems meant and took the money to do it. Whether they knew that the program was named Prism is irrelevant, in my opinion.

Google responds to legal requests for information. The whole PRISM scandal was indicating that the NSA had some kind of direct link into the databases themselves. This assertion is what Google denied, and still denies.

There's been a bit of a wandering definition for PRISM, from "they have NSA software with root access to all machines!" to "they receive LEO requests for information, which they review, and sometimes fulfill." The former is untrue, and the latter is true.

> The whole PRISM scandal was indicating that the NSA had some kind of direct link into the databases themselves.

Though, as Google and Yahoo later found out, the NSA was directly tapping cables between their data centers [1].

[1] http://www.wired.com/2013/10/nsa-hacked-yahoo-google-cables/

That appears to be several things:

1) True.

2) Consistent with statements made by Google.

3) Quickly mitigated by Google when it began encrypting traffic between data centers.

Go look at the slides. The NSA isn't lying in its own internal documents. Of course Google is in on it. Do you get how huge this infrastructure is? It's a massive engineering effort to manage that kind of information flow.

What has happened here is that Google has got scared. Because without trusting users, all their business models fall apart. So they are lying. It's that simple.

> Do you get how huge this infrastructure is? It's a massive engineering effort to manage that kind of information flow.

Which is why it could never happen without it being well known inside Google.

I wasn't implying any trust in Google on my part (I didn't trust them even before PRISM, but that shut the door on the chance of my ever trusting them again), and chances are this will never progress beyond a draft spec, but there isn't any real way to implement something like this without doing so. The NSA would find the blockchain of this helpful, yes, but it isn't data they don't already have unless it is somehow extended to non-email messaging as well.

That's how Google works. Piecemeal. You think Loon was about internet for the poor and oppressed? That's just how they get their foot in the door. I'm sure Google just wanted to write a draft spec for the fun of it.

Google is a front for US intelligence. We should give no quarter. Shun them.

At the risk of sounding like a fanboy apologist, I must say you're making a lot of serious accusations against one of the most benevolent companies in the history of mankind. Some serious evidence should follow.

Picking on Project Loon... come on, is there anything Google could do that you wouldn't immediately label as evil forefront of US intelligence?

The do no evil line is bullshit. This is not a benevolent company by any standard. The services are not free, you are just paying in a different currency.

Here is more on loon and look further up for links to the PRISM slides and documentation showing the companies involved including google were compensated financially by the NSA. The evidence is damning.


Look the bottom line here is these guys betrayed us. They are traitors, and we need to cut them out of our future.

> The do no evil line is bullshit. This is not a benevolent company by any standard.

From the linked Slashdot article - Google patented Loon-related technology, describing it "as just the ticket for those well-to-do enough to pay a tiered-pricing premium to get faster internet access while attending concerts, conferences, air shows, music festivals, and sporting events where a facility's overtaxed Wi-Fi simply won't do."

Picking on this is like saying that Elon Musk is an evil liar, because if he really cared about the good of humanity and electric transport for the masses, he surely wouldn't start with a superexpensive car for the ultra-rich and then move to an expensive car for the moderately rich. Obviously, the whole argument about "Roadster bankrolling Model S bankrolling $35k Sedan" is just a bunch of lies trying to hide how evil he is.

That's basically what I'm reading from your argument.

You know, altruism involves money, and quite often the best way to do something good is to make it profitable.

> Look the bottom line here is these guys betrayed us. They are traitors, and we need to cut them out of our future.

If so, then long, long before Google you need to get rid of GoDaddy, Amazon, Facebook, Microsoft, Apple, BMW, General Motors, General Electric, Coca Cola, Nestle, Walmart, every other mom and pop store and pretty much 90% of other companies who betrayed us in many more ways, heavily documented and not alleged. Seriously, saying that Google Is Bad is nothing but a signalling game around here.

Sorry, but google are the furthest thing from benevolent. It's all about data collection to spew more adverts at people.

"Don't be evil... to our shareholders."

Yes, the fact that they're doing nothing bad to their customers today is an evil conspiracy to hide the fact that they want to do something bad to their customers. Makes sense.

> It's all about data collection to spew more adverts at people.

The way they do this is the single most ethical way of doing advertisements. Non-intrusive and trying to predict what you actually need. They're pursuing the ultimate goal of good advertising, i.e. connecting your needs to the best way to satisfy them, but hell, they're evil.

We can discuss side effects and externalities of their data collection, but that's a completely different thing than assuming malice.

Data collection isn't for advertising exclusively. It builds profiles that can be sold to third parties who will use that information to redline demographics. For instance health insurance companies and banks.
