
2020 Knuth Prize is awarded to Cynthia Dwork [pdf] - lmkg
https://sigact.org/prizes/knuth/citation2020.pdf
======
singhrac
I took a class senior year with Cynthia which tried to cover some of the
things she has discovered. One of the most surprising things about her work
is that she brings this full algorithmic firepower to problems that I had
always considered outside the realm of mathematics - fairness and privacy.
Some of her greatest achievements, I think, are figuring out what problems to
solve (and of course solving them). She really stood out as one of our best
faculty even in a very crowded room.

~~~
jedberg
On a side note, I notice you call her "Cynthia" instead of Professor or
Doctor. I've noticed other Harvard students do this with other professors too.

Is this a typical Harvard thing?

~~~
downerending
At least in the US, having a Ph.D. and insisting on being referred to as
"Doctor" is considered a real asshole move. Usually one might address the
person once as "Dr. Whatever", but they will invariably reply, "Please, call
me Jane".

Even referring to someone as "Dr. Whatever" in the third person is pretty
unusual, if you've ever met them. If I were to speak of my college professors
right now, I certainly wouldn't use "Doctor". Maybe if they were 70 or had a
Nobel or something.

(Related: In the movie Avatar, one scientist introduces himself to another
scientist as "Doctor Norm Whatever". It absolutely clangs, at least to my
ears.)

~~~
jedberg
If you know them sure. But for a professor who I took a class with? I would
always call them Professor Whatever if my only interaction with them was in a
classroom context.

~~~
foota
Maybe a generational thing? Not sure when you went to school; I went to a
midsized state school 3 years ago and mostly referred to professors by their
first or last name. This is perhaps more common in CS? I don't think my
physics friends did it as often, though.

------
sirgawain33
I really liked Dr. Dwork's Turing lecture:

[https://www.youtube.com/watch?v=vsA4w3itxA0](https://www.youtube.com/watch?v=vsA4w3itxA0)

It's an introduction to differential privacy for an academic audience (i.e. not
necessarily computer scientists). It sweeps across a range of surprising
real-life privacy attacks that are possible against anonymization approaches
that feel good enough. It really gives you a sense for the sort of problem that
privacy protection is in today's world of greatly increased data collection and
computational power.

~~~
rolandog
Thanks for the link to the talk, it was really enjoyable and informative!

------
d33
As an interesting example of what differential privacy is, consider this
excerpt from Wikipedia:

"A simple example, especially developed in the social sciences,[15] is to ask
a person to answer the question "Do you own the attribute A?", according to
the following procedure:

1. Toss a coin.

2. If heads, then toss the coin again (ignoring the outcome), and answer the
question honestly.

3. If tails, then toss the coin again and answer "Yes" if heads, "No" if
tails.

(The seemingly redundant extra toss in the first case is needed in situations
where just the act of tossing a coin may be observed by others, even if the
actual result stays hidden.) The confidentiality then arises from the
refutability of the individual responses.

But, overall, these data with many responses are significant, since a person
without attribute A gives a positive response with probability one quarter,
while a person who actually possesses it does so with probability
three quarters. Thus, if p is the true proportion of people with A, then we
expect to obtain (1/4)(1-p) + (3/4)p = (1/4) + p/2 positive responses. Hence it
is possible to estimate p.

In particular, if the attribute A is synonymous with illegal behavior, then
answering "Yes" is not incriminating, insofar as any person, with or without
the attribute, has some probability of a "Yes" response."

~~~
papeda
Not only is this differentially private, it's locally differentially private,
which is an even stronger privacy definition. It's "local" because the user
adds randomness themselves. Generic differential privacy is a weaker
definition because it lets whoever's running the algorithm collect raw data
and then add randomness somewhere in the computation pipeline to produce
privatized outputs.
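
For intuition about the strength of the guarantee: a respondent with the
attribute answers "Yes" with probability 3/4 and "No" with probability 1/4,
and the probabilities flip without it. Either answer is thus at most 3 times
as likely under one truth as under the other, so the protocol satisfies
ε-local differential privacy with ε = ln 3 ≈ 1.1.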

This kind of example also predates the definition of differential privacy by
about 40 years [1], although the motivation is pretty much the same.

[1]
[https://www.jstor.org/stable/2283137?seq=1](https://www.jstor.org/stable/2283137?seq=1)

------
fraggle222
Basically invented Proof of Work (see Bitcoin):

Cynthia Dwork and Moni Naor. Pricing via processing or combatting junk mail.
In Proceedings of Crypto, 1992. Also available as
[http://www.wisdom.weizmann.ac.il:81/Dienst/UI/2.0/Describe/ncstrl.weizmann_il/CS95-20](http://www.wisdom.weizmann.ac.il:81/Dienst/UI/2.0/Describe/ncstrl.weizmann_il/CS95-20)
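
The scheme's shape, roughly: the sender must compute a moderately expensive
function whose result the receiver can verify cheaply. Here's a minimal
hashcash-style sketch in Python (partial hash preimage, the variant Bitcoin
later adopted, not the exact pricing function from the paper):

    import hashlib
    from itertools import count

    DIFFICULTY_BITS = 20  # illustrative work factor

    def mint(message):
        # Expensive for the sender: search for a nonce whose hash
        # falls below the target.
        target = 1 << (256 - DIFFICULTY_BITS)
        for nonce in count():
            digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce

    def verify(message, nonce):
        # Cheap for the receiver: a single hash.
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - DIFFICULTY_BITS))

    nonce = mint("mail-to:alice@example.com")
    assert verify("mail-to:alice@example.com", nonce)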

------
formalsystem
Is anyone aware of any work on how to apply differential privacy to language
models?

The main question I have: let's say I'm working with sensitive data like
emails or doctors' notes. How can I train an ML model that still learns
something useful without leaking private data?

When I say "leak", an example would be: I train an RNN on some company email
data, and when I feed the RNN "$AMZN", the network says SELL.

How can I quantify how much the model has learned and how much privacy has
been leaked?

~~~
janhenr
Check out the new paper on gradient leakage from Song Han's lab at MIT (he's
best known for neural network compression), plus the follow-up work:
[https://arxiv.org/abs/1906.08935](https://arxiv.org/abs/1906.08935) I
attended a talk by him recently and was very impressed by his lab's work in
this area.
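
For the training side specifically, the standard recipe is DP-SGD (Abadi et
al., 2016): clip each example's gradient to bound any one person's influence,
add Gaussian noise before the update, and track cumulative privacy loss with
an accountant. A minimal sketch of one step in Python (parameter values are
illustrative, not tuned):

    import numpy as np

    def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                    rng=np.random.default_rng(0)):
        # Clip each per-example gradient to norm at most clip_norm.
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        mean = np.mean(clipped, axis=0)
        # Gaussian noise calibrated to the clipping bound makes the step
        # differentially private; an RDP/moments accountant sums the steps.
        sigma = noise_multiplier * clip_norm / len(per_example_grads)
        return mean + rng.normal(0.0, sigma, size=mean.shape)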

------
gok
The original DP paper remains a great read:
[https://www.microsoft.com/en-us/research/publication/differential-privacy/](https://www.microsoft.com/en-us/research/publication/differential-privacy/)

~~~
arjunnarayan
This is not the original differential privacy paper; it's an invited talk
(I agree with you that it remains a great read). However, the original paper
is "Calibrating Noise to Sensitivity in Private Data Analysis" [1], with four
co-authors: Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith.

[1]: [https://people.csail.mit.edu/asmith/PS/sensitivity-tcc-final.pdf](https://people.csail.mit.edu/asmith/PS/sensitivity-tcc-final.pdf)
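
The paper's core idea, loosely: answer a query privately by adding Laplace
noise scaled to the query's sensitivity (the most one person's record can
change the true answer) divided by the privacy budget ε. A minimal sketch in
Python:

    import numpy as np

    def laplace_mechanism(true_answer, sensitivity, epsilon,
                          rng=np.random.default_rng()):
        # Noise with scale sensitivity/epsilon yields epsilon-DP.
        return true_answer + rng.laplace(scale=sensitivity / epsilon)

    # A counting query has sensitivity 1: adding or removing one
    # record changes the count by at most 1.
    noisy_count = laplace_mechanism(true_answer=5, sensitivity=1.0, epsilon=0.5)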

------
tasseff
Anyone else find it funny that the first paragraph of a Knuth Prize
announcement has "treat like alike" instead of ``treat like alike''?

~~~
svat
More accurately, the _typeset_ output has ”treat like alike” instead of “treat
like alike”, which suggests that the TeX _input_ (most likely) had "treat like
alike" instead of ``treat like alike?''. Some historical context that led to
features like this (or gotchas, today) in TeX and other systems of the time is
here:
[https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html](https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html)

~~~
zaroth
I kept reading and re-reading 'treat' - 'like' - 'alike' trying to figure
out where the difference was.

For those left wondering -- it's the first double-quotation mark leaning the
wrong way in TFA.

