Basically, if this works as the headline suggests, and if my poor understanding of the paper is right, take their example of competing financial institutions:
Two banks share data that is intended to be secret, and they use this technique to compute some property over that data. Each bank can then run the same computation over its own data alone and compare that result against the result that includes its competitor’s data. From that it can infer properties of the competitor’s portfolio, which seems to be exactly the kind of data leakage that is not supposed to be possible.
Hence I am clearly missing something, which isn’t unexpected as HE is unintuitive to me.
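To make the worry concrete, here is a minimal sketch. It assumes the jointly computed property is just a mean over the pooled data and that each bank knows how many records the other contributed; the numbers and the function names are made up for illustration and have nothing to do with the paper's actual model.

```python
def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def infer_other_mean(joint_mean, own_values, other_count):
    """Recover the other party's mean from the pooled mean and our own data."""
    total_count = len(own_values) + other_count
    return (joint_mean * total_count - sum(own_values)) / other_count


# Hypothetical default rates held privately by each bank.
bank_a = [0.02, 0.05, 0.03]
bank_b = [0.10, 0.12]

# The only thing both banks are supposed to learn: the pooled result.
joint_mean = mean(bank_a + bank_b)

# Bank A combines the pooled result with what it already knows.
inferred_b = infer_other_mean(joint_mean, bank_a, len(bank_b))
print(inferred_b, mean(bank_b))
```

Note that nothing here depends on how the pooled result was computed. The inference comes from the output itself, so encrypting the inputs during the computation does not by itself prevent it.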
Also, if you had all of the decryption keys, you’d just decrypt the data and work on the raw data. They explicitly state that it is only fast in the context of HE problems, being multiple orders of magnitude slower than techniques you can use on raw data (they actually wrote “fast”, complete with the quotes, which I appreciated).
"In the training phase, it takes as input an encrypted training data and outputs an encrypted model without using the decryption key. In the prediction phase, it uses the encrypted model to predict results on new encrypted data." Figure 1 further implies that the results must be decrypted.
This is the typical operational setting of homomorphic ML.
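For anyone who wants to see the "compute without the decryption key, decrypt at the end" flow from that quote, here is a toy sketch. It uses textbook Paillier, which is only additively homomorphic and is not the scheme used in the paper. The tiny primes, the helper names, and the plaintext integer weights are all my own assumptions for illustration, and none of it is secure. Unlike the paper's setting, only the data is encrypted here, not the model.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Randomized encryption of an integer 0 <= m < n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Decryption; needs the private key."""
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add(n, c1, c2):
    """Ciphertext of m1 + m2, computed without any key material."""
    return (c1 * c2) % (n * n)


def scale(n, c, k):
    """Ciphertext of k * m for a plaintext integer k >= 0."""
    return pow(c, k, n * n)


def encrypted_score(n, weights, enc_features):
    """Linear score sum(w_i * x_i) evaluated directly on ciphertexts."""
    acc = encrypt(n, 0)
    for w, c in zip(weights, enc_features):
        acc = add(n, acc, scale(n, c, w))
    return acc


n, priv = keygen()
features = [3, 5, 2]   # the data owner's private inputs
weights = [2, 1, 4]    # the server's (plaintext) linear model

enc_features = [encrypt(n, x) for x in features]      # data owner
enc_score = encrypted_score(n, weights, enc_features)  # server, no key
score = decrypt(n, priv, enc_score)                    # data owner
print(score)
```

The server only ever sees ciphertexts and the public modulus `n`; the score is meaningless to it until whoever holds the private key decrypts it, which matches what Figure 1 implies.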
(E.g. so that the value 'male' in the 'sex' column has the same ciphertext across subjects/observations).
If this is the case, then couldn't a standard logit model be used? (It needn't know what its categories 'mean', in either x or y.)
These properties would be fulfilled by, e.g. the MNIST example in the paper (and it wouldn't take 17 hours to train!). Does anyone know where the technique in the paper would work but the one described above would fail?
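Here is a minimal sketch of what I mean. It assumes a deterministic keyed token per categorical value (HMAC here, which is my stand-in and not homomorphic encryption), so equal values get equal tokens across rows. The toy data, the key, and the function names are all hypothetical. A plain logistic regression trained on one-hot encoded tokens then gives the same predictions as one trained on the readable values.

```python
import hashlib
import hmac
import math


def tokenize(value, key):
    """Deterministic keyed token: equal plaintexts give equal tokens."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:12]


def one_hot(rows):
    """One-hot encode categorical rows, treating every value as an opaque label."""
    vocab = sorted({(j, v) for row in rows for j, v in enumerate(row)})
    index = {item: i for i, item in enumerate(vocab)}
    encoded = []
    for row in rows:
        vec = [0.0] * len(vocab)
        for j, v in enumerate(row):
            vec[index[(j, v)]] = 1.0
        encoded.append(vec)
    return encoded


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, lr=0.5, steps=500):
    """Logistic regression by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


rows = [("male", "smoker"), ("female", "nonsmoker"), ("male", "nonsmoker"),
        ("female", "smoker"), ("male", "smoker"), ("female", "nonsmoker")]
y = [1, 0, 0, 1, 1, 0]

key = b"hypothetical-shared-secret"
token_rows = [tuple(tokenize(v, key) for v in row) for row in rows]

X_plain, X_token = one_hot(rows), one_hot(token_rows)
p_plain = predict(fit_logit(X_plain, y), X_plain)
p_token = predict(fit_logit(X_token, y), X_token)
print(max(abs(a - b) for a, b in zip(p_plain, p_token)))
```

The catch is that deterministic tokens leak equality and value frequencies, which a randomized encryption scheme would hide.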
Recent work [1] shows that MNIST, for example, can be done in a model-secure/data-secure way with 30ms online inference latency. For ~500,000 samples, evaluated sequentially, that's roughly 4 hours, which beats their time by about 4x. Perhaps I'm missing details in my read of the paper.
[1] Gazelle: A Low Latency Framework for Secure Neural Network Inference https://arxiv.org/abs/1801.05507
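For reference, the back-of-the-envelope arithmetic, assuming strictly sequential evaluation at a flat 30 ms per sample and taking the 17-hour figure at face value:

```python
samples = 500_000
latency_seconds = 0.030   # 30 ms online inference latency per sample
reported_hours = 17       # the figure quoted for the paper

# Total sequential time, converted from seconds to hours.
sequential_hours = samples * latency_seconds / 3600
ratio = reported_hours / sequential_hours

print(f"{sequential_hours:.2f} hours, ratio {ratio:.2f}x")
```

That comes to a bit over 4 hours and a ratio of about 4x.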
The 17 hours in the article is to derive the 'encrypted' model. Training takes significantly longer than prediction.