
CryptDB - Encrypted MySQL - pelle
http://css.csail.mit.edu/cryptdb/
======
pelle
Many of the techniques can be done at the application level. A great example
that most of us already use in the database is the hashed password.
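
For anyone who hasn't done this at the application level, here is a minimal sketch using only Python's standard library. The function names, salt size and iteration count are my own illustrative assumptions, not anything taken from CryptDB or Wayner's book.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest). Only these two values are stored in the database."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

The database never sees the password itself, yet it can still answer "does this login match?", which is the translucent-database idea in miniature.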

Wayner's translucent databases book is one of those classic DB works that
should be on every dev's bookshelf.

I'm not affiliated with Peter Wayner at all but I've added a separate story to
get the word out about his work:

<http://news.ycombinator.com/item?id=3373947>

But to see what is really unique about CryptDB see this paper:

<http://people.csail.mit.edu/nickolai/papers/raluca-cryptdb.pdf>

We already have a way of checking equality and indexing data safely using
digests.
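
A rough sketch of what I mean by equality checks and indexing on digests. Assumptions: I use a keyed HMAC rather than a bare digest (so nobody without the key can test guesses against the column), SQLite stands in for the real database, and the table, column and key names are made up for the example.

```python
import hashlib
import hmac
import sqlite3

INDEX_KEY = b"demo-only-key"  # hypothetical secret held by the application, never by the DB


def blind_index(value):
    """Keyed digest of the value; equal inputs always give equal digests."""
    return hmac.new(INDEX_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email_digest TEXT)")
conn.execute("CREATE INDEX idx_email_digest ON users (email_digest)")
for email in ("alice@example.com", "bob@example.com"):
    conn.execute("INSERT INTO users (email_digest) VALUES (?)", (blind_index(email),))


def find_user_id(email):
    """Equality lookup on the digest column; the DB never sees the plaintext."""
    row = conn.execute(
        "SELECT id FROM users WHERE email_digest = ?", (blind_index(email),)
    ).fetchone()
    return row[0] if row else None
```

The limitation is that a digest only supports equality; you cannot get the original value back out of it or sort by it.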

They have come up with similar techniques for ordering encrypted data,
performing calculations on encrypted data and doing full text search on
encrypted data.

These are all amazingly useful to me, even though there are definite
drawbacks. E.g. an encrypted value that you want to run calculations on is
stored in a 2048-bit field. But there are definitely great applications
where it would be worth it.
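
To make the "calculations on encrypted data" part concrete: as I read the paper, the summation support rests on the Paillier cryptosystem, where multiplying two ciphertexts adds the underlying plaintexts. Below is a toy sketch of that idea with tiny primes. It is for illustration only and is not CryptDB's code; the function names and the primes are my own choices.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation from two primes (far too small to be secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                # standard simple choice of generator
    mu = pow(lam, -1, n)     # with g = n + 1, mu is the inverse of lam mod n
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts mod n^2 adds the plaintexts mod n."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

Usage would look like `pub, priv = keygen(1009, 1013)`, then summing `encrypt(pub, a)` and `encrypt(pub, b)` with `add_encrypted` without ever decrypting them on the server. Note that ciphertexts live modulo n squared, so they are twice as wide as the modulus, which is presumably where the large field size comes from.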

I am still trying to understand what benefit their DET and JOIN constructs
have over just using, say, a SHA-256 digest. But I have only skimmed the paper
so far.

It would be interesting to see if this can be set up on an EC2 instance
proxying to an RDS instance. Offhand, I don't see why not.

~~~
RobAtticus
The DET construct (and this might apply to the JOIN as well, I don't remember)
is most useful with symmetric key encryption since then you can use it inside
one of their "onions". You couldn't peel off the DET layer to get more
functionality if it was stored as a digest, since those are only one-way.
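
Here is a toy sketch of the layering, to show why reversibility matters. Big caveat: the "ciphers" below are stand-ins built from HMAC-SHA256 so the example runs on the standard library alone. They are not the ciphers the paper uses and should not be used for real data; all names and keys are hypothetical.

```python
import hashlib
import hmac
import os


def _keystream(key, seed, length):
    """Expand key + seed into `length` pseudo-random bytes (toy stream cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, seed + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def det_encrypt(key, plaintext):
    """Deterministic layer: same plaintext -> same ciphertext, but still reversible."""
    tag = hmac.new(key, b"tag" + plaintext, hashlib.sha256).digest()[:16]
    return tag + _xor(plaintext, _keystream(key, b"enc" + tag, len(plaintext)))


def det_decrypt(key, ciphertext):
    tag, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, b"enc" + tag, len(body)))


def rnd_encrypt(key, plaintext):
    """Randomized outer layer: a fresh nonce hides equality between values."""
    nonce = os.urandom(16)
    return nonce + _xor(plaintext, _keystream(key, nonce, len(plaintext)))


def rnd_decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, nonce, len(body)))


K_DET = b"demo-det-key"  # hypothetical per-column keys
K_RND = b"demo-rnd-key"


def onion_wrap(value):
    """Stored form: RND(DET(value)). Reveals nothing about equality."""
    return rnd_encrypt(K_RND, det_encrypt(K_DET, value))


def peel_to_det(stored):
    """Strip the outer layer so the server can compare values for equality."""
    return rnd_decrypt(K_RND, stored)
```

Two stored copies of the same value look unrelated until the outer layer is peeled, at which point they compare equal, and the inner layer can still be decrypted back to the plaintext. A digest in the middle of the onion would give you the equality check but nothing further.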

------
digitalsushi
30 degrees off-topic, but sqlite was extended to support encryption.
<http://sqlcipher.net/>

It has been a fascinating evolution personally to write software that creeps
on the border of sqlite not being quite enough. (Which probably speaks to the
design, but still, it seems like a rite of passage).

------
mike-cardwell
The link is a bit light on information, so I cloned the repo and pasted the
README file from it here:

<https://pastee.org/323xe>

From the looks of it, it runs as a proxy in between your client software and
the MySQL database.

~~~
alexchamberlain
There are 3 publications listed... not that I read them...

~~~
mike-cardwell
Yes. Three detailed papers in PDF format. The front page needs a summary of
what it actually is though, and the README file provides that information
quickly and efficiently, whilst the website doesn't.

------
conformal
i guess it's useful to keep this data from a db admin, but whoever admins the
server that encrypts the data must have access to the encryption key(s).

it does further restrict who can access data which is good and it seems to
come at a serious cost in complexity. if your db admin is bleeding company
data i think your problems are just beginning...

~~~
RobAtticus
Actually, your first statement is not true. They also present a multi-user
mode, where the keys are derived from a user's password when they log in. The
keys only remain active while the user is logged in, so if somebody gets
hold of your proxy, only those users' data is vulnerable. Although, I will
admit, the paper seems to assume an attacker only gets a small attack window
(I believe) and hasn't just installed something that monitors the proxy
indefinitely.
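
A rough sketch of the key-lifetime idea, not the paper's actual key chaining. The class and method names are invented for the example, and real code would also verify the password before deriving anything.

```python
import hashlib
import os


class ProxyKeyStore:
    """Holds password-derived keys in memory only while a user is logged in."""

    def __init__(self):
        self._salts = {}   # username -> salt (not secret, can be persisted)
        self._active = {}  # username -> key (exists only during a session)

    def register(self, username):
        self._salts[username] = os.urandom(16)

    def login(self, username, password):
        # Derive the data key from the password; it is never written to disk.
        key = hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), self._salts[username], 200_000, dklen=32
        )
        self._active[username] = key
        return key

    def logout(self, username):
        # Dropping the key makes that user's data unreadable to the proxy again.
        self._active.pop(username, None)

    def exposed_users(self):
        """Users whose data an attacker could read if the proxy were seized now."""
        return sorted(self._active)
```

Someone who grabs the proxy at a single moment only gets the keys in `_active`. Someone who sits on the proxy indefinitely sees every login, which is the caveat above.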

~~~
Canada
The point you raise about attackers monitoring the proxy for a long time is
important.

My understanding is that CryptDB offers no protection from attackers who sit
between the proxy and the web server. Obviously the web server (or other
client) must deal with plaintext, otherwise application software would require
changes. The authors of CryptDB assert in no uncertain terms that this is a
drop-in solution that requires zero application changes, so the proxy
must do all the work.

The idea is to run the proxy on a different machine than the database, thus
allowing the maintenance of the database server's hardware, OS, and RDBMS
software to be outsourced without providing access to your data. No amount of
monitoring of traffic between the proxy and the RDBMS should matter.

The weakest part of this system is that it appears to store the data in the
database with different types of encryption that allow for various operations
to be performed on the ciphertext. I think that anyone who controls the
database system can obtain some of the weaker ciphertexts of the data and
possibly break them.

I really can't be sure until I test it out... I'm kinda disappointed that it
doesn't come with quick instructions to get it going on postgres.

------
ihaveyourbuns
Nice try MIT, but I don't really see this being useful in the real world.
Typically, data that needs to be encrypted must be accessed by system processes
to make any application useful. For example, you want to encrypt the contact
email, but you want to send automatic alerts to that email, or a system needs
to do an automatic credit card payment. Those are hard to do when you need the
user's password to decrypt the data.

~~~
sirclueless
You're thinking at the wrong level. The password that encrypts a user's credit
card data is almost certainly not the password a customer logs in with. It's
some highly controlled password that only some privileged authentication
server knows.

The goal is to reduce the attack surface, and prevent incidental discovery of
data. For example, I could set up a server that manages high security
passwords and only grant access to a select few trusted people. I have to
carefully audit how that server gets used, and who can use it, but I can let
any old DBA mess around with all my encrypted data. I can throw it on any old
server, I could outsource it to some cloud hosting company, it doesn't matter.
That's a huge win in some industries. The only thing I need to trust now are
the servers with credentials and the CryptDB software itself, I don't need to
care about the data itself.

