
Homomorphic encryption implementation - gaigepr
https://hcrypt.com/
======
guelo
According to [https://hcrypt.com/scarab-library/](https://hcrypt.com/scarab-library/),
the functions that you can do on encrypted texts are

    
    
      Add cyphertexts (= XOR)
      Multiply cyphertexts (= AND)
      Add with carry in and carry out
      Add with carry out
    

I'm not really familiar with homomorphic theory, but it doesn't seem like you
could build that much on top of these primitives.

~~~
dvanduzer
You may want to start here:

[https://en.wikipedia.org/wiki/Group_homomorphism](https://en.wikipedia.org/wiki/Group_homomorphism)

The way I've understood it is that only addition has been possible so far.
With a multiplication operation, you can get a field, and do "everything" with
the real/complex numbers.
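
To make the "everything" part a bit more concrete, here is a minimal sketch on plain (unencrypted) bits. It assumes only the two operations from the list quoted above: add is XOR and multiply is AND. XOR with the constant 1 gives NOT, and AND plus NOT is enough for any Boolean circuit. The function names are made up for illustration and are not the library's API.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built only from XOR (^) and AND (&)."""
    partial = a ^ b
    total = partial ^ carry_in
    # The two AND terms can never both be 1, so XOR acts as OR here.
    carry_out = (a & b) ^ (carry_in & partial)
    return total, carry_out


def add_bits(x_bits, y_bits):
    """Ripple-carry addition of two little-endian bit lists of equal length."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]


def to_bits(n, width):
    """Little-endian bit list of n."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `from_bits(add_bits(to_bits(6, 4), to_bits(7, 4)))` gives 13. With a homomorphic scheme the same circuit would run on encrypted bits, at a much higher cost per gate.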

~~~
darkxanthos
Well a homomorphism applies to a group and a group can only have one operator
by definition. That won't change probably. Rings have two operators, fields
are rings that support cancellation and integral fields I believe support
inverses as a rule and where you'd get "everything" I believe.

~~~
wging
"Well a homomorphism applies to a group"

Not so. The notion's far more general:
[https://en.wikipedia.org/wiki/Ring_homomorphism](https://en.wikipedia.org/wiki/Ring_homomorphism)

This gives us a notion of homomorphisms between fields.

[https://en.wikipedia.org/wiki/Homomorphism](https://en.wikipedia.org/wiki/Homomorphism)
for more general notions of homomorphisms.

~~~
darkxanthos
You're right. My bad. Thanks for correcting me.

------
higherpurpose
If I were Google I would fund the heck out of homomorphic encryption research.

As Google tries to gobble up even more of everyone's data and every waking
habit as they try to improve Google Now and other services, the privacy
concerns are only going to grow bigger. It will be not just a drain on their
public image (just like it is on Facebook's image), but it would also give
competitors a lot more opportunities to take jabs at them. Think Scroogled,
but tenfold.

Homomorphic encryption would pretty much fix all of that, and they wouldn't
even need to give up their data collection (or not as much).

~~~
CHY872
Unfortunately not. HE has some performance issues (that might be solvable
with time) but the primary issue is that Google do actually want to see
exactly what you're doing. It's important for basically everything Google does
that makes it money. Furthermore, the privacy concerns are nowhere near great
enough. All of the mainstream privacy concerns have come from Google releasing
your information to others (Google Buzz, that recent ECHR case etc) and not
Google keeping information themselves. Furthermore, since I assume that what
you're really talking about here is emails, it would be impossible to
practically implement a HE system for emails that would allow for searching
(because one could derive meaning from emails by sending them the right
queries). It'd be no better than client-side encryption, and no one really
wants that.

No, the people for whom this is really useful are large non-IT companies like
Boeing or healthcare companies etc. They have huge data processing or storage
requirements, and frequently have to jump through hoops in order to ensure data
is not transmitted in a way that would breach commercial agreements or
confidentiality requirements. The problem is that frequently these workloads
(especially for companies like Boeing) will involve much data, and HE tends to
come with a large size increase and performance drop - so it's often too
costly, which is a shame.

~~~
anonymousDan
Can you expand a bit more on the type of commercial agreements and
confidentiality requirements of Boeing? In particular, what kind of data would
the agreements pertain to, and who would the other parties be? Would the
'hoops' occur only for internal data transmission, or is it because they need
to share it with 3rd-party organisations? I'm a researcher in the area, and
I'm familiar with stuff like that from the healthcare domain, where you have
e.g. lots of different hospitals that want to share certain data for research
purposes, but I'm curious as to the use cases more commercial organisations
might have.

~~~
CHY872
Yup - Ars can explain it better than me!
[http://arstechnica.com/information-technology/2014/04/how-bo...](http://arstechnica.com/information-technology/2014/04/how-boeing-merges-its-data-centers-with-the-amazon-and-microsoft-clouds/)
Clearly, though, this would be a case where homomorphic
encryption would not be required - because they've solved the problem without
it.

~~~
dvanduzer
Boeing's technique sounds more like steganography than cryptography.

~~~
CHY872
It's neither. They make the assumption that their combined cloud partners
aren't all in cahoots, and distribute their data such that the probability
that a) one can intercept enough chunks and b) one can infer meaning from the
chunks is very low.

The cynical part of me wants to say that it's very similar to a standard
mapreduce, and the security properties came for free/very little.

------
gaigepr
Can anyone else verify the existence of a truly homomorphic encryption scheme?

This would be great for an encrypted file synchronization project I am working
on; being able to diff an encrypted file would be amazing.

~~~
swordswinger12
I can verify their existence, but you don't need to take my word for it - much
of the research on fully homomorphic encryption is publicly available through
the IACR's online preprint database,
[http://eprint.iacr.org](http://eprint.iacr.org) . I doubt FHE is a good
solution for your project, though. The technique is not quite ready for
primetime due to seriously slow performance. It gets better every year but it
will be a while yet before you'll be able to diff an encrypted file using FHE.

EDIT: Though now that I think about it you might be able to code a custom diff
circuit for FHE that borders on feasibility. It depends on the algorithm used,
really. File diff seems spiritually similar to some of the bioinformatics work
being done with privacy-preserving edit distance computation on encrypted DNA
sequences.

EDIT2: Here's a paper that discusses an FHE 'hardware platform':
[http://eprint.iacr.org/2014/106.pdf](http://eprint.iacr.org/2014/106.pdf)

~~~
gaigepr
I was not aware of the lackluster performance. That is really too bad.

The consistent feedback I get about file sync (from techies) is they really
don't want to be uploading a whole file every time they change something. I
agree, but to have client-side encryption such that your data is protected
from the server host as well as people who gain access to said server seems to
preclude efficient sync with current tech. Which is again too bad.

I am no crypto expert so rolling my own / building off this FHE is not really
possible. I do understand usage of things like AES and RSA well enough though
to know that adequate security precludes diffing/efficient sync.

~~~
ef4
> seems to preclude efficient sync with current tech.

I disagree. You can deterministically chunk the large file with one of the
existing rolling checksum schemes, and then use the chunks as your primitive
instead of whole files. The server only sees encrypted chunks. The client
knows to only upload changed chunks.

That still leaves important choices to be made about cipher mode, key
management, etc. But it's not intractable.
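
A rough sketch of the chunking idea, under my own assumptions: a simple polynomial rolling hash over a 16-byte window, with a cut wherever the low 6 bits of the hash are zero. The constants and function names are hypothetical, and the encryption step is left out entirely; the point is only that chunk boundaries depend on content, so an insertion early in the file does not shift every later chunk.

```python
import hashlib

WINDOW = 16                       # bytes covered by the rolling hash, also the minimum chunk size
MASK = (1 << 6) - 1               # cut when the low 6 bits are zero
BASE = 257
MOD = (1 << 31) - 1
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window


def split_chunks(data: bytes) -> list:
    """Split data at content-defined boundaries."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD   # drop the oldest byte
        h = (h * BASE + byte) % MOD                  # add the newest byte
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                  # trailing partial chunk
    return chunks


def chunks_to_upload(old: bytes, new: bytes) -> list:
    """Chunks of the new version that were not present in the old version."""
    known = {hashlib.sha256(c).digest() for c in split_chunks(old)}
    return [c for c in split_chunks(new) if hashlib.sha256(c).digest() not in known]
```

In a real client each chunk returned by `chunks_to_upload` would be encrypted before it leaves the machine, and the index of known chunks would have to be stored in a way that does not leak plaintext hashes to the server.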

~~~
gaigepr
Assuming you are using AES with CBC, best case you will have to resend every
block after and including the first one that changed.

My understanding is that CBC is among the most secure forms of AES encryption
because it is essentially impossible to have patterns in data (unlike ECB).
Practically speaking then one must assume that it is common to upload most of
a file. Any software that boasts this security cannot effectively 'diff' your
files.

EDIT: Formatting

~~~
Eiwatah4
You don't have to apply the CBC mode to complete files. If it is secure for a
1 MB file, I don't see why it would be insecure for 100 parts of a 100 MB
file.

If you manage to merge small files into the same blocks, you even gain some
privacy because the server can't even tell the number of files anymore.

[1] also has a discussion of the trade-offs of the different modes of
operation for whole disk encryption. That seems related here because nobody
wants to rewrite the whole disk after changing the first byte.

1:
[https://en.wikipedia.org/wiki/Disk_encryption_theory](https://en.wikipedia.org/wiki/Disk_encryption_theory)
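
To illustrate the difference, here is a small sketch that compares CBC over a whole file with CBC restarted for every chunk. Big assumption: `toy_block_encrypt` is a made-up 4-round Feistel permutation standing in for AES so that the example needs nothing outside the standard library. It is not secure, and the fixed per-chunk IVs are only there to make the two runs comparable.

```python
import hashlib

BLOCK = 16  # bytes per cipher block


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """4-round Feistel permutation on 16-byte blocks. Stand-in for AES, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, xor_bytes(left, f)
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> list:
    """CBC chaining: c[i] = E(p[i] XOR c[i-1]), starting from the IV."""
    assert len(data) % BLOCK == 0, "sketch skips padding"
    prev, out = iv, []
    for i in range(0, len(data), BLOCK):
        prev = toy_block_encrypt(key, xor_bytes(data[i:i + BLOCK], prev))
        out.append(prev)
    return out


def encrypt_in_chunks(key: bytes, data: bytes, blocks_per_chunk: int) -> list:
    """Run CBC separately on each chunk, so chaining stops at chunk boundaries."""
    size = BLOCK * blocks_per_chunk
    out = []
    for n, start in enumerate(range(0, len(data), size)):
        # Fixed IV per chunk index, for comparison only. A real system would
        # pick a fresh random IV whenever a chunk is rewritten.
        iv = hashlib.sha256(b"iv" + n.to_bytes(8, "big")).digest()[:BLOCK]
        out.extend(cbc_encrypt(key, iv, data[start:start + size]))
    return out


def changed_blocks(before: list, after: list) -> list:
    """Indices of ciphertext blocks that differ between two encryptions."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
```

With a 16-block file and one byte flipped inside block 2, whole-file CBC changes ciphertext blocks 2 through 15, while the chunked version with 4 blocks per chunk changes only blocks 2 and 3. With fresh IVs the whole affected chunk would change, but still nothing outside it.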

~~~
gaigepr
I understand now. That is a clever idea; I like it a lot.

------
sillysaurus3
Would anyone comment on
[http://www.wisdom.weizmann.ac.il/~oded/PS/obf4.pdf](http://www.wisdom.weizmann.ac.il/~oded/PS/obf4.pdf)
?

"On the (Im)possibility of Obfuscating Programs"

"In this work, we initiate a theoretical investigation of obfuscation. Our
main result is that, even under very weak formalizations of the above
intuition, obfuscation is impossible."

I haven't spent too much time meditating on the implications of the paper, so
I was hoping to hear from someone who has. Does this paper strike a mortal
blow to FHE? Or is this paper pointing out a result that can be ignored in
practice?

When I get a few hours I'll spend some time trying to come up with examples of
what the paper is talking about and hopefully discover the truth for myself,
but if someone here has already done that, please chime in!

~~~
swordswinger12
This is talking about a completely different primitive (program obfuscation),
so FHE is fine. In fact, it is about an exceptionally strong notion of
obfuscation (virtual black-box), not the weaker indistinguishability
obfuscation (IO) used in current research on the subject.

~~~
sillysaurus3
Would you expand on this?

From the abstract: "Informally, an obfuscator O is an (efficient, probabilistic)
“compiler” that takes as input a program (or circuit) P and produces a new
program O(P) that has the same functionality as P yet is “unintelligible” in
some sense"

That sounds like an exact description of FHE, and the paper seems to claim
something is impossible. So I'm trying to figure out: If the paper isn't
claiming FHE is impossible, then what is that "something" and how does it
relate to FHE?

~~~
CHY872
So the obvious difference is that if I run an obfuscated program, I'd expect
to get the same output as the original program - the obfuscated program is to
an extent expected to perform as a normal program, but be unreadable.

With HE, you operate on encrypted data and what you get out is still encrypted
data - you never have knowledge of what the data is, you just know you've done
something to it. So I could perhaps add two encrypted numbers together to
obtain something I know to be the sum (but have no idea what the sum is).
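
A toy version of "add two encrypted numbers" can be sketched with the Paillier cryptosystem, which is additively homomorphic. To be clear about the assumptions: Paillier is only partially homomorphic (not FHE), it is swapped in here because it is short, and the tiny hard-coded primes make this completely insecure. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1
    # mu is the inverse of L(g^lam mod n^2) modulo n, where L(u) = (u - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Randomized encryption of an integer 0 <= m < n."""
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Needs only the public key: multiplying ciphertexts adds the plaintexts mod n."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

The party calling `add_encrypted` never sees 20, 22 or 42; only the key holder can run `decrypt` on the result.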

~~~
sillysaurus3
Ok, so if I understand correctly, it's impossible to write a program which
hides its intent? That is, reverse engineering of what a program is doing will
always be possible?

Something doesn't seem quite right, though. Programs are data. If it's
possible to perform operations on data without knowing the data, then
shouldn't it be possible to perform (useful) computation without revealing the
algorithm? If not, why not?

~~~
CHY872
No. It's because they're different things. The idea here is that we perform
the operation, but the output is not known either. So my data can pass through
my system with me doing whatever I want to it, and when it comes out it can be
decrypted again.

The closest analogy (I think you are trying to get to this) is that one might
be able to make a series of operations that would execute an encrypted
program. The problem is that this would result in encrypted output.

~~~
sillysaurus3
_The closest analogy (I think you are trying to get to this) is that one might
be able to make a series of operations that would execute an encrypted
program. The problem is that this would result in encrypted output._

Thanks for your time. That's what I was trying to get to, yes. Would you help
me understand why encrypted output would be a problem for the interpreter? If
the input data is encrypted, and the data defines a program which can be
executed, and the result of that execution is more encrypted data, then (for
example) why can't that encrypted data be fed back into the interpreter as
further input? Or transmitted over the network to a computer with the
decryption key (so that the encrypted output can be used in a meaningful way,
without revealing to the original computer what was computed)? In other words,
why is encrypted output any more of a problem than operating on encrypted data
in the first place?

~~~
akiselev
Obfuscators are pretty different from FHE. Instruction-based obfuscation
takes common instructions like mov eax, 0xff and replaces them with more
complex instructions that do the same thing in a very indirect way. Other
common methods are mostly name mangling, compression, and encryption. To
modify the code or dump it, you need the unpacker to somehow decrypt the code
and load it into memory.

The point of FHE, however, is to encrypt private data to send it to a third
party to perform operations on that data without that third party having the
keys necessary to decrypt the data (and thus see what it is). With FHE, the
third party can modify the data but must then send the encrypted result back
to the client who then uses the original keys to decrypt the result and look
at it. The client can't see exactly which operations were performed and the
third party can't see the original data.

I don't know if an FHE-based interpreter is possible, but since you need to have
the original keys to read from an FHE payload, I don't think so.

~~~
sillysaurus3
Let's try a direct question: why can't the third party perform the operation
"is x equal to 5"?

The point of HE is that you can do add and mul. FHE extends HE to do arbitrary
computation. "Is x equal to 5?" is an arbitrary computation, so it seems like
FHE must support it.

If comparison ops are supported, then you can make an FHE-based interpreter.
Thus far, no one has been able to explain why _specifically_ it wouldn't be
possible. And if it's possible, then programs are naturally obfuscated: the
interpreter is operating on encrypted bytecode input. But the original paper I
cited says this is impossible! So there seems to be an interesting mystery
here.

~~~
nshepperd
> why can't the third party perform the operation "is x equal to 5"?

They can; the output is an encrypted boolean.

> And if it's possible, then programs are naturally obfuscated: the
> interpreter is operating on encrypted bytecode input. But the original paper
> I cited says this is impossible!

No it doesn't. Isn't your paper talking about an obfuscated program Ob(P) that
does the same thing as P? P takes plaintext to plaintext, so Ob(P) takes
plaintext to plaintext.

Homomorphic encryption is a different thing. Hom(P), if it exists, takes
ciphertext to ciphertext. (Or possibly plaintext to ciphertext if it's a
public-key cryptosystem.)
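
Here is a rough sketch of "is x equal to 5" producing an encrypted boolean. It uses a toy symmetric scheme in the style of the integer-based construction by van Dijk, Gentry, Halevi and Vaikuntanathan, where a ciphertext is p*q + 2*r + bit. Adding ciphertexts XORs the bits and multiplying them ANDs the bits. The parameters and names are my own assumptions, the scheme is insecure at this size, and it only works while the accumulated noise stays below the secret p.

```python
import random


def keygen(bits=64):
    """Secret key: a random odd integer of the given bit length."""
    return random.getrandbits(bits) | (1 << (bits - 1)) | 1


def encrypt_bit(p, bit, noise_bits=8):
    """Ciphertext = p*q + 2*r + bit, with random q and small noise r."""
    q = random.getrandbits(128) + 1
    r = random.getrandbits(noise_bits)
    return p * q + 2 * r + bit


def decrypt_bit(p, c):
    """Strip the multiple of p, then the even noise; valid while noise < p."""
    return (c % p) % 2


def encrypted_equals(enc_bits, const_bits):
    """Encrypted 'x == constant' for a public constant, as one encrypted bit.

    Runs without the key: it only adds and multiplies ciphertexts.
    """
    result = 1  # trivial encoding of "true"
    for c, k in zip(enc_bits, const_bits):
        same = c + k + 1          # XOR with k, then XOR with 1, i.e. XNOR
        result = result * same    # multiply = AND
    return result


def to_bits(n, width):
    """Little-endian bit list of n."""
    return [(n >> i) & 1 for i in range(width)]
```

The third party can call `encrypted_equals(enc_x, to_bits(5, 3))` but gets back a large integer that it cannot interpret. Only the key holder learns, via `decrypt_bit`, whether x was 5.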

~~~
sillysaurus3
How is the output to a conditional an encrypted boolean? A conditional is "if
this, do that." That means if the conditional is true, the branch is executed.
The output wouldn't be an encrypted boolean; the output would be whatever the
branch does. Right?

~~~
hrjet
I am sure they meant that the output of a conditional _expression_ would be an
encrypted boolean.

A conditional _branch_ would become an encrypted conditional branch. That
means, you wouldn't be able to infer the branch from the encrypted output.

Let's take a simple example. I have an algorithm X which takes two integers
and returns an integer.

Let's say I want to run this algorithm on a third-party VPS. I obfuscate the
algorithm X, in order to hide the operations that X does. Let's call the
obfuscated version OX.

I host the OX algorithm on the third-party server, and start supplying it
pairs of integers. The third-party is observing the set of inputs and outputs:

    
    
      (4, 4) => 8
      (3, 2) => 5
      (1, 0) => 1
    

From these observations, the third-party would be able to infer what OX does.
(simple addition in this case).

Now, however, if I use HE, I will get an algorithm (HX) which takes encrypted
input and spits out encrypted output. A third-party will see the corresponding
log of inputs and outputs like this:

    
    
      h3830120cjfakj, 102309123clals => 293sdlzlxdf
      94ka3lkc.zdkf, 102kksdllz => 1923939nddd
    

From these observations, it is impossible to know what HX does (as per the
theory).
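
On the "encrypted conditional branch" point: one common way to picture it (my illustration, not something the site spells out) is that the evaluator computes both branches and blends them with the condition bit, using only add and multiply. The sketch below runs on plain integers; under HE, `cond` and the values would be ciphertexts and the evaluator would not know which side was selected.

```python
def mux(cond, if_true, if_false):
    """Branch-free select: if_true when cond == 1, if_false when cond == 0."""
    return cond * if_true + (1 - cond) * if_false


def abs_diff(a, b, a_ge_b):
    """|a - b| without an if statement; a_ge_b is the 0/1 result of a >= b."""
    # Both candidate results are always computed, then one is selected.
    return mux(a_ge_b, a - b, b - a)
```

This is also part of why HE is slow: every possible path through the program has to be evaluated, not only the one that would actually be taken.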

------
toolslive
HE is supposed to be the holy grail of cloud computing, but last time I looked
at it, it was dog slow. I glanced at the site, but saw no performance
figures...

------
X4
Thank you so much for posting this now! I was writing a paper about this topic
and this arrives at just the right time! Awesome.

~~~
gaigepr
Glad to be of help. I would be interested to read your paper when it is
publicly available!

------
sneak
Sometimes I am really glad that Bootstrap exists.

------
emocakes
I was hoping for some homophobic encryption.

------
opnitro
Totally read this as homophobic encryption...

