
Testing Privacy-Preserving Telemetry with Prio - feross
https://hacks.mozilla.org/2018/10/testing-privacy-preserving-telemetry-with-prio/
======
oedmarap
Great strategy here to aggregate results by splitting and distributing the
data to different nodes.

As for them vetting third-parties as mentioned in the article, I'd actually
prefer Mozilla to host different telemetry collection nodes itself and just
route them round-robin anonymously within their own network.

IMO having to extend trust to intermediaries might be one step forward, two
steps back here if who collects the data != who analyzes it.

~~~
xcodevn
You don't need to trust every third party. The _crypto magic_ guarantees
privacy as long as at least ONE server is honest.

If you trust Mozilla, then you have nothing to worry about, except that the
proof or the program may be wrong ;)

Third parties are for people who don't trust Mozilla, or who worry that
Mozilla could be compromised by an unknown adversary.
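
A rough way to see why one honest server is enough, assuming the simplest additive secret sharing over a prime modulus (the modulus, the share values, and the helper name `missing_share_for` below are made up for illustration): whatever shares the colluding servers hold, every possible secret is still consistent with some value of the one share they cannot see.

```python
# Hypothetical modulus chosen for illustration only.
MODULUS = 2**61 - 1


def missing_share_for(candidate_secret, known_shares):
    """Return the share the honest server would hold if the secret
    were candidate_secret, given the shares the colluders know."""
    return (candidate_secret - sum(known_shares)) % MODULUS


# Shares held by two colluding servers (arbitrary example values).
colluding = [123456789, 987654321]

# Every candidate secret can be "explained" by some honest share,
# so the colluders' view alone does not single out the real value.
for candidate in (0, 1, 42):
    honest = missing_share_for(candidate, colluding)
    assert (sum(colluding) + honest) % MODULUS == candidate
```

Since the honest server's share is drawn uniformly at random, each of those candidates is equally likely from the colluders' point of view.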

------
a_imho
It is probably just me, but I don't understand what the advantages are here or
what that picture is supposed to convey. Data is transmitted and aggregated,
albeit in a fancier way. What does it have over Telemetry?

~~~
deno
Here’s a cool example of secret sharing from the paper[1]:

> App store. A mobile application platform (e.g., Apple’s App Store or
> Google’s Play) can run one Prio server, and the developer of a mobile app
> can run the second Prio server. This allows the app developer to collect
> aggregate user data without having to bear the risks of holding these data
> in the clear.

However, I would like to know why Firefox doesn’t want to just adopt RAPPOR
instead. It seems like a better fit for general telemetry, and it’s already
used by Chromium.

I guess you can intersect and union Prio datasets in ways you can’t with
RAPPOR?

[1] https://crypto.stanford.edu/prio/paper.pdf

~~~
xcodevn
Google's RAPPOR technique adds noise to each report on the client (user) side.
The noisy reports are then collected by a central server for analysis.

Doing this guarantees differential privacy, a strong privacy protection.

However, the _major_ disadvantage of RAPPOR is that we get a collection of
very noisy data points. Therefore, we need a very big collection to learn
anything useful at all!

If the number of clients/users/records is small, RAPPOR is practically
useless.
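
To make the noise trade-off concrete, here is a minimal sketch of plain one-bit randomized response. It is a simplification, not RAPPOR's actual encoding (which layers Bloom filters and two rounds of randomization), and `p_truth`, `randomize`, `estimate_rate`, and `simulate` are hypothetical names for illustration.

```python
import random


def randomize(bit, p_truth, rng):
    """Report the true bit with probability p_truth, else a fair coin flip."""
    if rng.random() < p_truth:
        return bit
    return rng.randint(0, 1)


def estimate_rate(reports, p_truth):
    """Invert E[report] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


def simulate(n_clients, true_rate, p_truth, seed):
    """Estimate the rate from n_clients noisy one-bit reports."""
    rng = random.Random(seed)
    bits = [1 if rng.random() < true_rate else 0 for _ in range(n_clients)]
    reports = [randomize(b, p_truth, rng) for b in bits]
    return estimate_rate(reports, p_truth)


# The estimate tends to be rough for small populations and only
# settles near the true rate once there are many reports.
for n in (100, 10_000, 200_000):
    print(n, round(simulate(n, true_rate=0.3, p_truth=0.5, seed=1), 3))
```

No single report can be trusted, so the signal only emerges from the average over many clients.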

Prio, on the other hand, adds NO noise to clients' reports. It uses _crypto
magic_ to have a network of servers compute _some_ functions (mean/sum/...)
over the collection of reports. Only the outputs of these functions are
revealed. No individual report is leaked as long as at least one server is
honest!
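
A minimal sketch of the sum case, assuming plain additive secret sharing modulo a prime. It omits the validity proofs Prio uses to reject malformed reports, and the modulus and the names `split`, `aggregate`, and `publish` are hypothetical.

```python
import secrets

# Hypothetical modulus chosen for illustration only.
MODULUS = 2**61 - 1


def split(value, n_servers):
    """Client side: split value into shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def aggregate(client_values, n_servers):
    """Each server keeps a running total of only the shares it receives."""
    server_totals = [0] * n_servers
    for value in client_values:
        for i, share in enumerate(split(value, n_servers)):
            server_totals[i] = (server_totals[i] + share) % MODULUS
    return server_totals


def publish(server_totals):
    """Combining the per-server totals reveals the sum and nothing else."""
    return sum(server_totals) % MODULUS


totals = aggregate([1, 0, 1, 1], n_servers=2)
assert publish(totals) == 3
```

Each server only ever sees random-looking shares and its own running total; the exact sum appears only when the totals are combined.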

However, Prio offers no guarantee that the computed output doesn't leak
information about clients' reports (in fact, it surely leaks some
information), whereas RAPPOR guarantees this protection statistically!

A current research direction is to add noise to the output of Prio so that it
guarantees differential privacy.
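
As a rough illustration of what adding noise to the output could look like, here is the textbook Laplace mechanism applied to an already-aggregated sum. Which mechanism such research would actually use, and which party would add the noise, are open assumptions here; `laplace_noise`, `noisy_sum`, `epsilon`, and `sensitivity` are illustrative names.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_sum(exact_sum, epsilon, sensitivity, rng):
    """Perturb the aggregate once, instead of perturbing every report.

    sensitivity is the most one client can change the sum
    (1 if each client contributes a single 0/1 value).
    """
    return exact_sum + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
print(noisy_sum(1000, epsilon=0.5, sensitivity=1, rng=rng))
```

The noise is added once to the total, so its size does not grow with the number of clients, unlike per-report noise.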

------
rhelmer
I worked on the Firefox integration (per the article) if anyone has specific
questions.

If you want more detail on Prio itself, I'd suggest
http://blog.ezyang.com/2017/03/prio-private-robust-and-scalable-computation-of-aggregate-statistics/
as a gentler introduction than the research paper.

