

Time for back-end privacy - mkutzwelli
http://mkutzwelli.tumblr.com/post/51635092624/time-for-back-end-privacy

======
icebraining
Have you read the critique of JavaScript Cryptography by Matasano Security[1]?

TL;DR: if an attacker can hack the server, he can just inject some of his code
into the page, that'll copy all the data unencrypted to his own server. And
since you download the code each and every time you connect to the server,
it's impossible to validate it.

The only way to avoid this is to have some kind of trusted, verified code in
your machine, like a binary client.

[1]: <http://www.matasano.com/articles/javascript-cryptography/>

~~~
tekacs
Whilst _more_ passwords (and indeed more attempts to add ad-hoc features to
HTML) are the last thing we need right now, a simpler answer to all of this
(including hosts storing passwords insecurely) if we absolutely _must_ stick
with such authentication for now, has always struck me as adding
authentication which only relies on trusting one's browser/local OS.

I'd be curious to hear criticism (tptacek?) of, for example, an attribute on
<input type='password'/> that caused the browser to hide its plaintext
contents even from the site in question and to only expose a
hashed/<>crypted/MACed form - granted this would require periodic updates to
bring it in line with newer standards, but it would centralise the
implementation of such password securing schemes (we could even have a stab at
proper zero-knowledge!) and subject to the field being implemented sanely
(secret set, non-empty, using MAC) the field could perhaps be indicated secure
by the browser in a way hard to emulate in HTML (changing something in the
window chrome upon entering the field? I know this is difficult.)

(something along the lines of <input type='password' expose='hmac-sha1'
secret='...' /> (taking the example from the Matasano article) - I grant that
in the case of a server compromise this would allow this field to be a chosen
secret, but as before, an _empty_ secret could be noted by the browser as
insecure, for example - and no, in this instance I haven't sat down and gone
over the primitives for this yet to check how they would behave in the face of
a chosen secret).
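
As a rough illustration of what such a field might hand to the page, here is a minimal sketch of the HMAC-SHA1 case. The function name `exposed_value` and the behaviour on an empty secret are assumptions made for illustration; no browser implements an `expose` attribute like this.

```python
import hashlib
import hmac


def exposed_value(password: str, secret: str) -> str:
    """Return what the page would see instead of the plaintext password.

    Hypothetical helper: models <input type='password' expose='hmac-sha1'
    secret='...'> from the comment above. Not a real browser API.
    """
    if not secret:
        # The comment suggests the browser could flag an empty secret as
        # insecure; here we simply refuse to compute anything.
        raise ValueError("empty secret: field should be flagged as insecure")
    # HMAC keyed with the site-supplied secret, over the typed password.
    return hmac.new(secret.encode(), password.encode(), hashlib.sha1).hexdigest()
```

The page would only ever receive the 40-character hex digest. Note that the digest is the same on every login for a given secret, which matters for the replay concern raised in the replies.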

Granted this is all wishful thinking (and certainly thought upon by others
before), but I'd be curious to hear criticism.

~~~
icebraining
(Note: I'm not a cryptography expert by any means)

If the server doesn't know the password, it can only verify the hash by
storing it once and comparing it when logging in, right? Therefore, the server
must always use the same secret, because if the hash changes, it can't verify
it.

So if the hash is always the same, and the server must know it to log the user
in, the hash effectively _becomes the password_ , since any attacker could
just send it as-is to authenticate.

If you want some system that the server can verify without knowing, but that
can change to prevent replay attacks, I think you need some kind of asymmetric
signature.
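
The replay problem described above can be sketched in a few lines. This is only an illustration under assumptions: the class and function names are made up, and the challenge-response variant shown here still has the server store a shared secret (the token), so it stops replay of a captured login but does not give the "server can verify without knowing" property that an asymmetric signature would.

```python
import hashlib
import hmac
import secrets


def client_token(password: str, site_secret: str) -> bytes:
    # The fixed value the browser would send in the static scheme.
    return hmac.new(site_secret.encode(), password.encode(), hashlib.sha256).digest()


class StaticServer:
    """Stores the token once and compares it on every login."""

    def __init__(self, stored_token: bytes):
        self.stored = stored_token

    def login(self, token: bytes) -> bool:
        return hmac.compare_digest(self.stored, token)


class ChallengeServer:
    """Issues a fresh nonce per login; the response is bound to that nonce."""

    def __init__(self, stored_token: bytes):
        self.stored = stored_token
        self.nonce = None

    def challenge(self) -> bytes:
        self.nonce = secrets.token_bytes(16)
        return self.nonce

    def login(self, response: bytes) -> bool:
        if self.nonce is None:
            return False
        expected = hmac.new(self.stored, self.nonce, hashlib.sha256).digest()
        self.nonce = None  # each nonce is single-use
        return hmac.compare_digest(expected, response)


def respond(token: bytes, nonce: bytes) -> bytes:
    # Client side: prove knowledge of the token without sending it.
    return hmac.new(token, nonce, hashlib.sha256).digest()
```

With `StaticServer`, anyone who captures the token once can log in with it forever. With `ChallengeServer`, a captured response is useless against the next nonce.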

~~~
tekacs
Or just some other zero-knowledge proof mechanism, of which there are _many_ ,
including those involving little more than cryptographically secure hashes. :)
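
One hash-only scheme in this general spirit is a Lamport-style hash chain of one-time passwords. It is offered here as an assumption about the kind of mechanism meant, and it is not a zero-knowledge proof in the formal sense; it only shows that a server can verify logins with nothing but a hash function, without storing anything an attacker could replay. The names `chain` and `HashChainServer` are invented for this sketch.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain(seed: bytes, n: int) -> bytes:
    """Apply H to the seed n times."""
    value = seed
    for _ in range(n):
        value = H(value)
    return value


class HashChainServer:
    """Stores only the most recently accepted link of the chain."""

    def __init__(self, anchor: bytes):
        self.current = anchor

    def login(self, candidate: bytes) -> bool:
        # Accept only the preimage of the stored value, then move
        # the stored value one step back along the chain.
        if H(candidate) == self.current:
            self.current = candidate
            return True
        return False
```

The client keeps the seed and, for a chain of length N, sends `chain(seed, N - 1)`, then `chain(seed, N - 2)`, and so on. Each value works once, and what the server stores cannot be used to log in.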

------
waitwhatwhoa
We've built a prototype that does this: <https://cloudsweeper.cs.uic.edu/>. It
works for gmail only, doesn't require full account access, and is completely
server side.

Currently we only search for and redact/encrypt plaintext passwords, but the
same workflow can work for whatever the user might decide to encrypt.

We get around the javascript cryptography issue by performing encryption
server side. Very unintuitive for a privacy primitive, but our goals are (a)
security at rest and (b) providing an improvement over what already exists
(plaintext storage everywhere).

Our other design decision is to only encrypt the (as identified by the user)
important passages. This allows server side search to (usually) still work
which is nice. We're currently exploring other methods for identifying
"lucrative" information in someone's cloud based data store.

I talked to Eric Grosse, VP Security @ Google, about trying to get a definite
time limit on how long it takes for "old" versions of stored messages to no
longer be accessible by Google. While there is a FAQ that states that old
messages cannot be undeleted after 30 days (can't find the link right now), he
wouldn't give me a straight answer for when the old email blobs are no longer
accessible. It's anyone's guess regarding what the absolute security is here;
regardless, for a non-state attacker, it's Probably Good Enough.

(please don't post this link to the front page, it's not currently engineered
to handle HN style loads)

------
zokier
I'm sorry, but for most cloudy webapps using servers as dumb datastores just
doesn't cut it. You can't do everything at clientside. Could you imagine Gmail
requiring you to download your whole multi-gigabyte mailbox to do a simple
full-text search? That is maybe a somewhat extreme example, but the principle
applies quite widely.

~~~
Zigurd
> _Could you imagine Gmail requiring you to download your whole multi-gigabyte
> mailbox to do a simple full-text search?_

I can imagine syncing a client to a backend and running an inverted index
search on as much as, say, 10GB locally, even on a mobile device, when 64GB of
storage on a memory card is very affordable. Once indexed, you don't have to
store the message locally, but you do have to sync it to each endpoint once.
If you need enforceable confidentiality as a lawyer, for example, or if you
just want privacy, the price of this small increment of additional memory is
small.
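
A minimal sketch of the client-side inverted index idea, assuming messages are already decrypted on the endpoint and identified by some message ID. The function names and the simple tokenizer are illustrative choices, not any mail client's actual implementation.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    # Lowercase alphanumeric runs; a real client would do much more.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(messages: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of message IDs containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for msg_id, body in messages.items():
        for term in tokenize(body):
            index[term].add(msg_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return IDs of messages containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Once a message has been indexed, only its ID needs to stay in the index; the body can be dropped locally and fetched again (still encrypted) from the back end when the user opens a search hit.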

On a laptop, the numbers are even more compelling. Your mailbox is probably a
fraction of the storage digital media files occupy on a typical laptop.

The economics of Web mail, cloud storage, and bandwidth for sync have moved on
from the early days. While it may be that a privacy enhanced cloud service
would need to be subscription fee based, it need not be a very large fee.

~~~
kijin
That would solve the indexing problem, but is there any point in encrypting
your mailbox when Google's SMTP servers see the plain text of every email that
you send or receive anyway? With mail, it is currently not possible to prevent
the service provider from seeing the plain text unless you always use PGP or
you run your own mail server.

~~~
Zigurd
Well, yes. What I'm describing is what it would take to have mail payloads
(but not routing) that are opaque to the back-end. In that case, you would have
to index your mail at the endpoint. I don't think that's going to be a
problem.

------
au58
If the service already has your plaintext data, how would encrypting solve the
problem? There is no way to know the service didn't just make a plaintext copy
beforehand. The only way to be sure is to encrypt the data before it ever
reaches the service, but that would be impractical for the user and leaves the
service unable to do any processing of the encrypted data.

------
jiggy2011
Isn't this sort of similar to what Mega upload does?

Problem with this is that it mostly limits you to just using the back end as a
dumb data store for a client program.

You couldn't implement a social network this way because nobody apart from you
would be able to read your profile.

------
Finster
I try to keep MY backend as private as possible...

