Things to Know about Databases that Leverage Partially Homomorphic Encryption (lab41.org)
94 points by jonbaer on Feb 8, 2016 | 18 comments

Quick summary of a great article:

Homomorphic encryption allows a provider to run computations over encrypted data. Thus we do not necessarily have to expose sensitive data to a provider in order to manipulate this data and derive insights from it.

The tradeoff comes in the form of increased compute time and increased complexity in the storage, retrieval, and manipulation of the data - the provider retains a general sense of what type of data is stored where, but cannot simply inspect the plaintext values to decide how to process them.
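To make the idea concrete, here is a minimal sketch of the additive case (the Paillier cryptosystem), with toy parameters chosen purely for illustration - real keys use ~1024-bit primes. Multiplying two ciphertexts yields a ciphertext of the *sum* of the plaintexts, so a provider can total encrypted values without seeing them:

```python
import random
from math import gcd

# Toy Paillier parameters for illustration only -- never use primes this small.
p, q = 1009, 1013
n = p * q
n2 = n * n
g = n + 1                      # standard choice of generator
phi = (p - 1) * (q - 1)
mu = pow(phi, -1, n)           # modular inverse of phi mod n (Python 3.8+)

def encrypt(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n, then undo the phi factor
    return ((pow(c, phi, n2) - 1) // n) * mu % n

c1, c2 = encrypt(42), encrypt(99)
c_sum = (c1 * c2) % n2          # multiplying ciphertexts adds the plaintexts
assert decrypt(c_sum) == 141
```

Note the limitation that defines "partially" homomorphic: there is no corresponding operation on two ciphertexts that multiplies the underlying plaintexts.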

A few companies/research groups are working on bridging the gap between the extremes (unencrypted/fully homomorphic encryption) in order to come up with a good compromise between security and cost.

> Both the CryptDB and Monomi publications cited approximately 20% performance impact and neither offers full, standard SQL functionality.

With increased regulatory standards on the near-horizon - healthcare/insurance in particular - research in this area holds a lot of potential, especially if they can get the overhead closer to 0% and figure out a way of safely tagging data for manipulation in a way that does not expose it to re-identification.

Came here to post a summary but you did a better job than I could.

   > A few companies/research groups are working on bridging 
   > the gap between the extremes (unencrypted/fully 
   > homomorphic encryption) in order to come up with a good  
   > compromise between security and cost.
I just wanted to add that plain text vs homomorphic encryption is indeed a security and cost trade-off.

Fully vs partially homomorphic encryption is a trade-off between supported operations and cost. Fully homomorphic systems allow arbitrary computations while partially homomorphic systems do not, hence the lack of support for standard SQL.
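As a toy illustration of a scheme that is homomorphic with respect to *multiplication* only (the mirror image of Paillier's addition-only property), textbook RSA without padding works - noting that unpadded RSA is insecure and shown here only to make the point:

```python
# Textbook RSA (no padding) is multiplicatively homomorphic.
# Toy primes for illustration; real RSA moduli are ~2048 bits.
p, q = 1009, 1013
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+ modular inverse

def enc(m): return pow(m, e, n)
def dec(c): return pow(c, d, n)

c = (enc(12) * enc(34)) % n     # multiply ciphertexts...
assert dec(c) == 12 * 34        # ...and the plaintexts multiply too
```

A partially homomorphic database is therefore stuck picking, per column, which single operation it will be able to run - one reason full SQL is out of reach.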

Ah thanks for the clarification. I know of IBM and Enigma's efforts in the realm of fully homomorphic encryption - do you know of any others?

Microsoft has a thing for Bioinformatics. The only info I know is "Manual for Using Homomorphic Encryption for Bioinformatics"[1]

[1] http://research.microsoft.com/apps/pubs/?id=258435

This is a really good read. Thanks for the heads up.

Nope, sorry, but I'd be interested as well.

I needed this for a DB a while back - patient names. You need to be able to do searches on partial names, e.g. "Dav" to find "Dave", "David", and "Davide", as well as sort results. That makes things extra tricky.
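One common workaround for the prefix-search part (not homomorphic encryption, just a sketch - INDEX_KEY and the helper names are hypothetical) is to store keyed hashes of each name prefix, a "blind index", alongside the ciphertext. The server can match an HMAC'd query prefix without ever seeing plaintext names:

```python
import hmac, hashlib

# Hypothetical server-side secret for the blind index.
INDEX_KEY = b"demo-key"

def prefix_tags(name, max_len=8):
    """Keyed hashes of every prefix of the (lowercased) name."""
    name = name.lower()
    return {hmac.new(INDEX_KEY, name[:i].encode(), hashlib.sha256).hexdigest()
            for i in range(1, min(len(name), max_len) + 1)}

index = {n: prefix_tags(n) for n in ["Dave", "David", "Davide", "Dan"]}
query = hmac.new(INDEX_KEY, b"dav", hashlib.sha256).hexdigest()
matches = [n for n, tags in index.items() if query in tags]
# matches Dave, David, and Davide, but not Dan
```

Sorting is the harder half: it needs something like order-preserving encryption, which leaks the ordering by design - exactly the kind of leakage the CryptDB line of work wrestles with.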

Hi, We're the team working on Google's experimental encrypted BigQuery client, also mentioned in the article. It's open-sourced and available on github: https://github.com/google/encrypted-bigquery-client

Happy to answer any questions or chat about cool uses of PHE!

I've wondered about another system, and perhaps someone here can point me in the right direction. Fully homomorphic encryption (FHE) supports two operations (addition and multiplication); partially homomorphic encryption (PHE) supports just one. We can do PHE pretty well, and we're currently very, very bad at doing FHE in reasonable amounts of time.

Craig Gentry's FHE scheme (and all current ones that I know of) is based on the idea of bootstrapping: each operation performed introduces some additional noise, and every so often one must do extra work to limit the noise and re-extract the signal. As more operations are performed, more and more effort goes into separating the signal from the noise.

But let's say I had an FHE scheme in which I could do a limited number of operations (relatively small, on the order of 100), but no more. I suspect this has a name, but I'm not familiar with it. If I had to name it something, I would give it the confusing name Limited FHE, or perhaps Finite FHE.

Then my question is: what sorts of things could we do with Limited FHE? Is this a question that people have thought about?

I don't see this idea of bootstrapping at all in the Enigma whitepaper: http://enigma.mit.edu/enigma_full.pdf

Enigma uses secure addition and multiplication protocols to construct a fully secure interpreter. A user throws a problem to a bunch of decentralized nodes running this secure interpreter that are incentivized by fees to perform these computations and then post the results to a public blockchain, where the validity of the computations can be verified by an auditor.
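The "secure addition protocol" referred to here can be illustrated with additive secret sharing - a generic MPC sketch under assumed parameters, not Enigma's actual code. Each node holds one random-looking share, and the nodes add their shares locally; no single node ever sees a plaintext value:

```python
import random

P = 2**61 - 1   # a Mersenne prime; shares live in this prime field

def share(secret, n=3):
    """Split a secret into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

a, b = share(20), share(22)
# Each node adds its own pair of shares locally.
summed = [(x + y) % P for x, y in zip(a, b)]
assert reconstruct(summed) == 42
```

Secure multiplication is where the real protocol complexity (and the inter-node communication) comes in, which is why the whitepaper treats addition and multiplication gates separately.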

It also includes (unlike Bitcoin) a distribution protocol so that the "miners" aren't all solving the same problem in a redundant fashion, although I don't think the "network reduction" protocol is specified in the whitepaper.

This is probably the most relevant part to your question:

Code evaluated in our system is guaranteed not to leak any information unless a dishonest majority colludes (t ≥ n/2). This is true for the inputs, as well as any interim variables computed while the code is evaluated. An observant reader would notice that as a function is evaluated from inputs to outputs, the interim results generally become less descriptive and more aggregative.

For simple functions or functions involving very few inputs, this may not hold true, but since these functions are fast to compute - no additional steps are needed.

However, for computationally expensive functions, involving many lines of code and a large number of inputs, we can dynamically reduce the number of computing nodes as we progress, instead of having a fixed n for the entire function evaluation process. Specifically, we design a feed-forward network (Figure 5) that propagates results from inputs to outputs. The original code is reorganized so that we process addition gates on the inputs first, followed by processing multiplication gates. The interim results are then secret-shared with N/c nodes, and the process is repeated recursively.

This is called Somewhat Homomorphic Encryption (SHE); schemes that support any circuit up to a fixed, pre-chosen depth are also called leveled FHE.

Article does not mention zerodb, FWIW:


Doesn't look like that uses homomorphic encryption.

They do not use homomorphic encryption

I'm up voting this based mostly on the author's reference to Meatloaf songs and PEBKAC in his headings.

I never understood why homomorphic encryption is too slow to be used reliably today. Does some part of the computation need more power than today's computers can offer?

So here's an analogy that might help: Today, most RSA keys are 2048 bits. You can go up to 4096, but things start to get a little bit too slow for comfort. Double again and they get much slower.

Now imagine doing RSA with a 100k-200k bit key. That's the amount of slowdown FHE incurs.

They're not "that slow". They're too slow for big companies to use. But for a small product? Why not.
