Does this seem like impossible magic? Yes. Everyone was pretty surprised when the first theoretical fully homomorphic encryption schemes were found. For a while people believed it might never be efficient enough to be practical. But now it is actually feasible to do limited but genuinely useful computations on behalf of a user without the service learning anything about what it is computing for the user. There is ongoing work to expand what kinds of computations can be done efficiently.
It is also possible for the model to be provided by the user and encrypted along with the data rather than being fixed and public. That way the service can apply a model that it doesn't know to data that it also doesn't know. This allows a fairly general service in theory, but it remains a challenge to express more than fairly simple models in a way that can be computed efficiently.
Is there a way of doing the training (without being provided a trained model) all on the encrypted side?
After reading: Yup. It makes sense: so long as your resulting model is run against similarly encrypted data, the same patterns will be there for the ML to identify.
Which is, of course, one of the issues with homomorphic encryption.
The person you're responding to is correct. It's an explicit design goal that a fully homomorphic encryption system would not expose any distinguishable oracle about the underlying data. Otherwise there would be no point to it whatsoever, because you'd just be performing the same computations on the data dramatically less efficiently and without any benefit.
This follows the general imperative of cryptography, which is that the outputs of cryptographically secure primitives (hash functions, pseudorandom generators, pseudorandom permutations, etc) should be computationally indistinguishable from random up to 2^n queries, for some large n (such as 128).
Look up the penguin image encrypted in ECB mode. Encryption without randomness reveals the patterns! That image is highly educational and makes you think.
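If you want to see the effect directly, here is a minimal Python sketch (using the `cryptography` package; the block contents are made up for illustration). Identical plaintext blocks come out as identical ciphertext blocks under ECB, which is exactly why the penguin stays visible:

    # Why ECB leaks patterns: identical 16-byte plaintext blocks
    # encrypt to identical ciphertext blocks, so repetition in the
    # input (the flat regions of the penguin) survives encryption.
    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(16)
    plaintext = b"WHITE_PIXELS_16B" * 4   # four identical "image" blocks

    enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    ciphertext = enc.update(plaintext) + enc.finalize()

    blocks = {ciphertext[i:i + 16] for i in range(0, len(ciphertext), 16)}
    print(len(blocks))   # 1 -- all four ciphertext blocks are identical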
Intuitively it may or may not make sense to people, but a proof would also clarify whether there are any caveats, limitations or other particularities.
What is the domain and what is the codomain? If f and C commute, then f maps X to X, so X is both the domain and the codomain, and x in X is just your "data".
> the top voted comment
Ah, I see that the top comment says that you use a pre-trained model. Intuitively, I think that makes sense. But how about training the model itself?
Also, something can be statistically random and still have patterns (see PRNGs, which are statistically random in that you can't predict the next value from the previous values, but there's still a pattern if you know the algorithm and seed).
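A tiny illustration of that, using Python's standard-library PRNG:

    import random

    random.seed(1234)
    a = [random.random() for _ in range(3)]
    random.seed(1234)                      # same seed, same algorithm...
    b = [random.random() for _ in range(3)]
    print(a == b)   # True: passes statistical tests, yet fully predictable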
I'll admit, the promise of homomorphic encryption is pretty amazing, but this particular combination of data and ML seems like a fairly obvious way to leak data. I believe there's a reason that homomorphic encryption has not been broadly accepted as an allowed standard.
EDIT: So, I think I'm missing a practical example. I have a model which I want to train on data homomorphically encrypted with key X, and that model is a very simple "is this a cat". I'm given a whole set of data encrypted with key X that's tagged with "cat" and "not a cat".
Once the model is trained, I can run it on any data encrypted with key X and find out if the data contains a cat (with some degree of accuracy). I have no way of learning anything beyond the tags provided on the training data, but it still gives me, a person without the encryption key, the ability to identify any feature that's tagged in the training set on any untagged production set.
Having seen a large number of ML training sets, I can say the tag sets are rarely so limited. There are also often "elephant", "ball", and "dog" tags, even if I'm only being asked to train on cats.
I think you're missing the fact that the predictions come back encrypted, so you don't learn anything unless you know the key. Also, in this particular example the model was trained on unencrypted data, but as discussed below, you can do either.
Large quantities of small finite sets are anathema to encryption.
I am not sure whether this is achievable practically though.
Well, yes... Practical homomorphic encryption is cutting-edge research, and standards bodies like NIST aren't going to deal with an area like this until it's much closer to "solved" (by which I mean much more efficient, with more practical applications, widely used and scrutinized schemes, etc.)
I get that a lack of broad acceptance is a hard thing to source, but do you have one?
That process leads to block cipher standards, the AES standard, and so on.
I think it's not unreasonable to say that if it should ever appear as a recommended cipher by NIST, then it can be considered to be broadly accepted.
Now the way you do machine learning here is by translating your model to use the instructions offered by HE. You've effectively recompiled the model to a new architecture.
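As a toy sketch of what that translation looks like (my own illustration, not taken from any particular compiler): a dense layer already uses only additions and multiplications, and a non-arithmetic activation like ReLU gets swapped for a low-degree polynomial such as x^2, which HE can evaluate:

    # Toy "recompiled" dense layer: only + and * are used, and the
    # activation is the HE-friendly polynomial x^2 instead of ReLU.
    def dense_layer(x, weights, bias):
        out = []
        for w_row, b in zip(weights, bias):
            acc = b
            for xi, wi in zip(x, w_row):
                acc = acc + xi * wi    # both operations exist in HE
            out.append(acc * acc)      # x^2 activation: also just a multiply
        return out

    # Runs on plaintext here; under HE, each + and * becomes the
    # scheme's homomorphic add/multiply on ciphertexts.
    print(dense_layer([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.0, 0.1]))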
If you'd like to read more about machine learning with homomorphic encryption, we published a paper on our CHET compiler. I also talk about this space at a high level in this MSR podcast episode.
So, since the arithmetic is secure (otherwise it wouldn't be HE), and the entire runtime pattern is fixed up front and made of nothing but arithmetic, there's no way to leak anything.
If the whole logic can be expressed as a pure function without conditionals then it would fully fit into HE.
But what if we want conditionals?
Comparison operators like (a < b), (a > b), etc. are out of the question immediately, as they would allow an attacker to guess the values by a simple binary search.
The equality operation (a == b) seems plausible from the security standpoint, as it would not reveal the encrypted value. But there is a challenge in performing that operation, because its two arguments may be encrypted with different randomization. To overcome this, some neat trick could probably be performed, but... that is a question for the future.
EDIT: Here is an idea. Some FHE engines have pre-encoded values for some magic numbers like 0, 1 and -1.
What if the equality operation p(a, b) = (a == b) is performed like this:
p(a, b) = (a + (b * -1)) == 0
Does anyone have an intuition regarding the feasibility of the proposed trick in something like Microsoft SEAL? Would both the left and right sides of the comparison have canonicalized randomness, making the equality operation possible?
P.S. On a side note, the sheer existence of an equality operation in an FHE scheme would decimate its security by enabling brute-force attacks with less guesswork. Not a catastrophe by any means, and some systems would prefer that as a small price to pay for having conditionals in a program.
But instead let's say you have some register which gets rotated/xor'd in some way based on the relationship of a to b.
I don't think you can do branching logic though.
> I don't think you can do branching logic though.
You can. Suppose we could introduce a special equality operator, call it SEQ, that works like so:
SEQ(a, b) = 1 when a == b;
SEQ(a, b) = 0 when a != b
Then we would be able to do conditionals by discarding the result of the unmatched branch, multiplying it by 0.
Thanks to the fact that all HE calculations are pure, no observable side effects would be produced.
For example, here is a simple program with a conditional statement:
P(x) = x == 10 ? 42 : sin(x)
This is how it may look in HE domain:
P(x) = SEQ(x, 10) * 42 + (1 - SEQ(x, 10)) * sin(x)
No branching is involved here but the result is equivalent to a program with branching. Eureka!
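Here is the same program written out in Python, with SEQ as a hypothetical plaintext stand-in (a real FHE equality circuit would be far more involved). Note that both branches are always evaluated, which is exactly why no information leaks through control flow:

    import math

    def seq(a, b):                # hypothetical stand-in for the SEQ operator
        return 1 if a == b else 0

    def p(x):                     # P(x) = x == 10 ? 42 : sin(x), branch-free
        return seq(x, 10) * 42 + (1 - seq(x, 10)) * math.sin(x)

    print(p(10))   # 42
    print(p(3))    # sin(3) ~ 0.141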
Once again, thank you for the fruitful suggestions.
EDIT: by the way, this paves the way for implementing > and < operators as well, since the operator's result remains in the encrypted domain. This is a serious wow moment.
Otherwise it would be trivial to analyze and eventually crack the scheme. This is especially true for homomorphic encryption, where an attacker can employ math operations to solve equations in order to reveal as much data as they possibly can.
> It is hard to experiment with front end features like named dimensions, because it is painful to match them to back ends that expect calls to monolithic kernels with fixed layout. On the other hand, there is little incentive to build high quality back ends that support other features, because all the front ends currently work in terms of monolithic operators. An end-to-end tool chain for machine learning requires solutions from many specialist disciplines.
This is a fantastic example of getting around this problem using Flux.jl's interaction with the full Julia language. Here the author does some exceedingly cool stuff (doing machine learning on encrypted data!), and in order to get there he needed to write what many would think of as lower-level kernels that should be "provided by the library" (encrypted matrix multiplication, encrypted convolutions). To make the interface useful, he needed to use a mature interface that people are already using and make it something that can automatically switch over to encrypted implementations. And, because of the nature of fully homeomorphic encryption, the implementation has to be fast, otherwise it's dead in the water since FHE is expensive!
To me, this example showcases one way Flux.jl is helping machine learning get out of that "rut". The author adds dispatches to standard kernels which allow for his encrypted data types, which then allows standard Julia Flux.jl machine learning models to act on encrypted data, and it uses type inference + JIT compilation to make it fast enough to work. Not only that, but it's also not tied to some "sub-language" defined by a machine learning framework. That means the FHE framework not only works nicely with machine learning, it can be used by any other package in the Julia language (differential equation solvers, nonlinear optimization, macroeconomics models?). This allows composability of tools and community: all tools from all fields can now use this same FHE implementation, so authors collaborate and mature it into a very good one. These knock-on effects give people doing research in Julia a lot of competitive advantages over other researchers, and it will be interesting to see how this affects not just the ML research community, but also everyone else!
(Repost from the previous thread on this!)
I think you mean fully homomorphic encryption, that is, f(x·y) = f(x)·f(y).
A homeomorphism is an isomorphism of topological spaces. If you do mean the latter, then it would be something new (and I am curious to hear about it).
It really just boils down to the fact that julia's core value proposition is primarily enticing to domain experts trying to implement new algorithms without wanting to worry about switching languages for performance hotspots or having to reimplement existing functionality from other packages. Julia packages tend to be small, focused and astonishingly composable.
Hence, the proportion of the community that are actively developing and maintaining state-of-the-art packages in julia is way higher than in other languages like Python.
Furthermore, even if you're not some super researcher developer, Julia substantially lowers the bar for mere mortals such as myself to do package development. The composability I mentioned above means that I can stand on the shoulders of giants as I implement my own stuff, plus the fact that the majority of libraries are written in pure julia means that I can easily "peek under the hood" to figure out how they work and learn from them.
It's hard to make broad statements about how much ecosystem support one can expect from julia without first specifying the field. Some fields have fantastic julia support and others haven't received enough attention. It really just depends on if someone from your field has started making good packages yet.
How did you end up with libraries in global directories?
Also, while we are working on reducing the time-to-first-plot latency (which has to do with compiling large amounts of code; the time to second plot is 0.2 sec), it would be good to learn more about the compiler quality issues you mention. We always welcome constructive comments on the Julia discourse.
Note that even before this though, it's just the compilation phase for the first plot that's expensive. Plotting itself is quite fast in julia, and if you start up julia with the command line argument `--compile=min` you'll see very very fast first plots. Obviously, this will come with a runtime performance cost though.
Yep. This has been commented on again and again, but many refuse to show willingness to fix or even acknowledge the problem:
We take an ML model trained on unencrypted data, use a 'homomorphic evaluation' technique (let's just leave that as magic here) to convert the model operation-by-operation to a model that runs on encrypted data, do a little more crypto magic, and we've solved the business problem described at the beginning of the article.
(In particular, if you train a model on encrypted data you get a really bad model, right?)
People have done very effective training on encrypted data using simpler models, like linear or logistic regression. See for example this work from my colleagues at Microsoft Research.
There are certainly a number of factors to consider besides data security when evaluating the practicality of such an approach, but I just wanted to confirm that it was technically possible before getting into any of that. Thanks for your response and the post; I knew almost nothing about HE before today.
I might be wrong though.
There is also Functional Encryption, a different technique addressing a similar (but different) goal. With FE, the owner of the data must know the function f, but the party performing the calculation gets the result f(X) directly.
https://www.cs.utexas.edu/~roshan/CHET.pdf
https://arxiv.org/pdf/1812.10659.pdf
Aren’t hashes etc. done by some nonlinear functions?
Doesn’t ML kind of try to fit your model using linear functions all the way down? Or not only linear?
Point being: can ML techniques be used to reverse hashes or find collisions? Immovable object vs. irresistible force?
Anyone got actual INFO on how this plays out?
That's actually a good philosophical observation. And there is an answer for that.
If you try to use ML on an encrypted data set, it will blow up and won't converge. The encrypted data contains a lot of randomness, while ML assumes a not-insignificant amount of continuity, linearity and correlation. A network trained on encrypted data would not be able to find a convergence point in any observable time; the effort would be equivalent to trying to brute-force the encryption.
But if you have a few million years, a few PB of storage and some extravagant mutation-based network training algorithm, who knows.
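As a rough illustration of why there is nothing for a model to latch onto (using a keyed hash as a stand-in for encryption; real ciphertexts look just as structureless):

    import hashlib
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 100, 1000)
    y = 3 * x + rng.normal(0, 1, 1000)          # strong linear relationship

    def scramble(v):                            # stand-in for encryption
        d = hashlib.sha256(f"secret-key|{v}".encode()).digest()
        return int.from_bytes(d[:8], "big")

    y_enc = np.array([scramble(v) for v in y], dtype=float)

    print(np.corrcoef(x, y)[0, 1])        # ~1.0: plenty of structure to learn
    print(np.corrcoef(x, y_enc)[0, 1])    # ~0.0: nothing for ML to converge on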
Yes, that's what the blog post describes how to do.
That's from the post. Is that what you meant by creating a model whose results only the owner of the data can understand?
That was optimized for latency; throughput is something you can optimize for separately. This landmark paper (also from Microsoft) showed inference on a network with 2 convolutional layers at a throughput of 58k images per hour.
You can't do this effectively without outside knowledge they should not have. They are in fact using outside knowledge, specifically that the encrypted data is in the form of images. Without that knowledge, you wouldn't know which ML techniques to use! Additionally, remember that feature engineering is a big part of what makes ML effective at all, and that certainly cannot be done on encrypted data (the feature engineering you need to do depends on what the data looks like).
Sure, you know that any files sent to you are going to be encrypted audio files, but you don't know what is being said in them which is the point of this technique.
It tends to make some people think that a neural network can somehow "see" the patterns in encrypted data. Plain wrong.
What happens in reality is this: the neural network is encrypted and can only work on encrypted data, producing only encrypted results.
The data can only be decrypted by the data owner, who would not share their private key with anybody. That's it.
In this way, the whole algorithm works in a so-called encrypted domain. When used in this role, a homomorphic encryption (HE) engine resembles a custom CPU. That CPU is not Turing-complete, because it lacks conditional branching, but it is almost there, providing two basic operations: addition and multiplication. This is enough for a lot of tasks.
Does a CPU know the type of the data or anything special about it? Not at all. It executes the primitive instructions of a program one by one, producing the output result from a set of inputs.
The program P that the HE "CPU" executes can be any program suitable for the given instruction set. Not only ML; it can be any other program!
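A toy model of that two-instruction "CPU" (a plaintext stand-in; a real HE engine would execute the same instruction stream on ciphertexts it cannot read):

    # The HE "CPU": a program is a fixed list of ADD/MUL instructions.
    def run(program, regs):
        for op, dst, a, b in program:
            if op == "ADD":
                regs[dst] = regs[a] + regs[b]
            elif op == "MUL":
                regs[dst] = regs[a] * regs[b]
        return regs

    # f(x) = x^2 + x, compiled to the two-instruction set:
    prog = [("MUL", "t", "x", "x"),    # t = x * x
            ("ADD", "y", "t", "x")]    # y = t + x
    print(run(prog, {"x": 5})["y"])    # 30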
That's why homomorphic encryption is a very important technique in the area of cryptographic obfuscation. It makes it possible to hide the data processed by a program P. And if that program P is itself a virtual machine (sometimes called a Universal Circuit, or UC), it would allow hiding the code as well, reaching the state of indistinguishability obfuscation. There are some practical problems along the way, but we are moving there.
Yep, that sounds like magic, but it's based on pure "boring" math with a few bright ideas here and there. What was once thought an impossible feat is gradually becoming a reality.
Isn't that the reason why the Nobel is a no-go for mathematicians? Otherwise, they would get plenty of Nobels (almost all of them, for sure).
Also, you can do feature engineering on a training set that you collect, as long as that is similar to the distribution of end user inputs. That's a pretty standard ML workflow.
What does your statement about feature engineering have to do with encrypted data? Maybe I can clarify with an example: you can't transform a waveform into more usable features with a Fourier transform if the waveform is encrypted.
With homomorphic encryption:
- I encrypt my data locally
- Send the encrypted bits over to the third-party
- The third-party uses their ML algorithm to compute a prediction
- They send back an encrypted prediction
- I decrypt the prediction locally and get my results.
At no point in this process can anyone, aside from me, see either the input data or the output.
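To make that flow concrete, here is a toy round trip using textbook Paillier, a much simpler (additively) homomorphic scheme than the lattice-based ones used for ML. The primes are tiny and there is no padding, so this is insecure and purely illustrative:

    import random
    from math import gcd

    # Toy Paillier keypair (real keys use primes of 1024+ bits).
    p, q = 1789, 2003
    n, n2 = p * q, (p * q) ** 2
    g, phi = n + 1, (p - 1) * (q - 1)
    mu = pow(phi, -1, n)                      # needs Python 3.8+

    def encrypt(m):                           # public key: (n, g)
        r = random.randrange(2, n)
        while gcd(r, n) != 1:
            r = random.randrange(2, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c):                           # private key: (phi, mu)
        return ((pow(c, phi, n2) - 1) // n * mu) % n

    # Steps 1-2: encrypt locally, send ciphertexts to the third party.
    c1, c2 = encrypt(20), encrypt(22)
    # Step 3: the third party computes on data it cannot read
    # (multiplying Paillier ciphertexts adds the plaintexts).
    c_sum = (c1 * c2) % n2
    # Steps 4-5: the encrypted result comes back; only I can decrypt it.
    print(decrypt(c_sum))                     # 42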
A small correction: decryption is performed with a decryption (private) key, which is kept secret and known only to the data owner.
In contrast, the encryption key is public and known to both the data owner and "the person who runs the model".
It depends on how the salt is applied. If it is concatenated, the homomorphic property persists. To really destroy it, you would need to mix it in. But then you would lose the malleability, and all that would remain is an equality operation from the ring, but not + or *. That can still be useful for some tasks, but not for encrypted calculations or ML.
This is more like anonymization (effectively a one-way hash) than encryption. If you encrypt the same value twice with the same algorithm and key, you will get the same ciphertext, which reveals information about the original values (i.e., that they are the same).
>This is more like anonymization (effectively a one-way hash) than encryption. If you encrypt the same value twice with the same algorithm and key, you will get the same ciphertext, which reveals information about the original values (i.e., that they are the same).
This is a misunderstanding. Homomorphic encryption schemes are randomized encryption schemes, which means that even if you encrypt the same value twice with the same encryption key, you get two different ciphertexts (both look like "random data"). They just have the property that they will decrypt to the same value. This is possible because the space of ciphertexts is much larger than the space of plaintexts.
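You can observe the same property with any modern randomized scheme, for instance AES-GCM (not homomorphic, just illustrating that equal plaintexts need not produce equal ciphertexts when fresh randomness is used):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)
    aead = AESGCM(key)

    nonce1, nonce2 = os.urandom(12), os.urandom(12)
    c1 = aead.encrypt(nonce1, b"same value", None)
    c2 = aead.encrypt(nonce2, b"same value", None)

    print(c1 != c2)    # True: same plaintext, different ciphertexts
    print(aead.decrypt(nonce1, c1, None) == aead.decrypt(nonce2, c2, None))  # True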
The big challenge in creating a fully homomorphic encryption scheme was making it cope with that added randomness. The problem remained unsolved from 1978, when Rivest, Adleman and Dertouzos noted the probable possibility of such a scheme, until 2009, when Craig Gentry came up with a solution in his PhD thesis. Gentry's work is a fascinating read, by the way.
In other words, homomorphic encryption has the same security guarantees as conventional crypto.
This analysis depends on the property of the same values producing the same ciphertext, which also means you're leaking information.