2020 Knuth Prize is awarded to Cynthia Dwork [pdf] (sigact.org)
297 points by lmkg 13 days ago | 33 comments

I took a class senior year with Cynthia which tried to cover some of the things that she had discovered. One of the most surprising things about her work is that she brings this full algorithmic firepower to problems that I had always considered outside the realm of mathematics - fairness and privacy. Some of her greatest achievements, I think, are figuring out what problems to solve (and of course solving them). She really stood out as one of our best faculty even in a very crowded room.

On a side note, I notice you call her "Cynthia" instead of Professor or Doctor. I've noticed other Harvard students do this with other professors too.

Is this a typical Harvard thing?

At least in the US, having a Ph.D. and insisting on being referred to as "Doctor" is considered a real asshole move. Usually one might address the person once as "Dr. Whatever", but they will invariably reply, "Please, call me Jane".

Even referring to someone as "Dr. Whatever" in the third person is pretty unusual, if you've ever met them. If I were to speak of my college professors right now, I certainly wouldn't use "Doctor". Maybe if they were 70 or had a Nobel or something.

(Related: In the movie Avatar, one scientist introduces himself to another scientist as "Doctor Norm Whatever". It absolutely clangs, at least to my ears.)

If you know them sure. But for a professor who I took a class with? I would always call them Professor Whatever if my only interaction with them was in a classroom context.

Maybe a generational thing? Not sure when you went to school, I went to a midsized state school 3 years ago and mostly referred to professors by their first or last name. I think this is more common in CS perhaps? I don't think my physics friends did the same thing as often though.

Related: as an anesthesiology associate professor (M.D.), I always told the residents to call me Joe. However, there were always a couple who either would not or could not do so, and addressed me for the entirety of their three-year residencies as Dr. Stirt. Diff'rent strokes

Cf. "Dr." Jill Biden


I went to University of Illinois at Chicago (not UIUC or UofC). If I worked with a professor for a while (not in class, more for research-oriented things), sometimes it would be appropriate to address them by their first name, but I never did.

I think she asked us to call her by her first name at the beginning of the semester and it stuck (but I can't remember, and I might be being rude). Honestly I'm as surprised as you are - mulling it over, I'm sure I called her Prof. Dwork when talking to her, since that's how I addressed most professors I didn't know well personally.

Oh, yeah - the way he phrased it, I assumed she was a classmate, not the one teaching the class.

Common at lots of universities, I think, at least in the UK, and probably in the US too. I knew most of my professors by their first names.

Normal in NZ too. Many undergrads don't even know how to use titles, call us by the wrong one, etc.

There were only two professors I was on a first name basis with in college. One was my advisor, I also worked in his lab for two years. The other was another professor in the department. It was different in grad school. It was first names for professors you worked with. At that point, they don't know more than you do about your research.

A U.S. thing I’ve noticed is calling professors “Dr. Smith” when elsewhere (and certainly in Germany) that would almost be an insult, since it’s “Prof. Smith”.

I know a German professor with two Ph.D.s who calls himself "Professor Doktor Doktor So-and-so". He doesn't introduce himself like that, mind you, but he uses it in his .signature file (is it still called that nowadays? the text that's automatically appended to your email).

I do that with my professors too. US College thing.

Note that applying mathematics to fairness is nothing new, there is a whole field of results, e.g.: https://en.wikipedia.org/wiki/Fair_division

Do you remember any cool examples of an algorithm bazooka?

I really liked Dr. Dwork's Turing lecture:


It's an introduction to differential privacy for an academic audience (i.e. not necessarily computer scientists). It sweeps across a range of surprising real-life privacy attacks that are possible against anonymization approaches that feel good enough. It really gives you a sense for the sort of problem that privacy protection is in today's world of greatly increased data collection and computational power.

Thanks for the link to the talk, it was really enjoyable and informative!

As an interesting example of what differential privacy is, consider this excerpt from Wikipedia:

"A simple example, especially developed in the social sciences,[15] is to ask a person to answer the question "Do you own the attribute A?", according to the following procedure:

1. Toss a coin.

2. If heads, then toss the coin again (ignoring the outcome), and answer the question honestly.

3. If tails, then toss the coin again and answer "Yes" if heads, "No" if tails.

(The seemingly redundant extra toss in the first case is needed in situations where just the act of tossing a coin may be observed by others, even if the actual result stays hidden.) The confidentiality then arises from the refutability of the individual responses.

But, overall, these data with many responses are significant, since positive responses are given to a quarter by people who do not have the attribute A and three-quarters by people who actually possess it. Thus, if p is the true proportion of people with A, then we expect to obtain (1/4)(1-p) + (3/4)p = (1/4) + p/2 positive responses. Hence it is possible to estimate p.

In particular, if the attribute A is synonymous with illegal behavior, then answering "Yes" is not incriminating, insofar as the person has a probability of a "Yes" response, whatever it may be."
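The protocol and the estimator from the excerpt are easy to simulate. Here's a minimal sketch in Python (the names are my own; the "redundant" second toss is collapsed into one random draw, since we aren't worried about onlookers here):

```python
import random

def randomized_response(has_attribute: bool) -> bool:
    """One respondent's answer under the coin-flip protocol above."""
    if random.random() < 0.5:       # first toss heads: answer honestly
        return has_attribute
    return random.random() < 0.5    # first toss tails: second toss decides

def estimate_p(answers) -> float:
    """Invert E[yes rate] = 1/4 + p/2 to recover the true proportion p."""
    yes_rate = sum(answers) / len(answers)
    return 2 * (yes_rate - 0.25)

random.seed(0)
true_p = 0.3
population = [random.random() < true_p for _ in range(100_000)]
answers = [randomized_response(x) for x in population]
print(round(estimate_p(answers), 2))  # close to 0.3
```

Any individual "Yes" is deniable, yet the aggregate estimate converges to the true proportion as the sample grows.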

Not only is this differentially private, it's locally differentially private, which is an even stronger privacy definition. It's "local" because the user adds randomness themselves. Generic differential privacy is a weaker definition because it lets whoever's running the algorithm collect raw data and then add randomness somewhere in the computation pipeline to produce privatized outputs.

This kind of example also predates the definition of differential privacy by about 40 years [1], although the motivation is pretty much the same.

[1] https://www.jstor.org/stable/2283137?seq=1

Basically invented Proof of Work (see Bitcoin):

Cynthia Dwork and Moni Naor. Pricing via processing or combatting junk mail. In Proceedings of Crypto, 1992. Also available as http://www.wisdom.weizmann.ac.il:81/Dienst/UI/2.0/Describe/ncstrl.weizmann_il/CS95-20.
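The core idea is a function that's moderately expensive to compute but trivial to verify. Dwork and Naor proposed several candidate functions; the sketch below uses the later hashcash-style partial hash inversion (as in Bitcoin) just to illustrate the asymmetry, and isn't their original construction:

```python
import hashlib

def mint(challenge: str, difficulty_bits: int = 16) -> int:
    """Search for a nonce whose SHA-256 digest (with the challenge)
    has at least `difficulty_bits` leading zero bits. Expensive."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty_bits: int = 16) -> bool:
    """One hash suffices to check the work was done. Cheap."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = mint("mail-from:alice@example.com")
print(verify("mail-from:alice@example.com", nonce))  # True
```

In the anti-spam setting, a sender pays ~2^16 hashes per message; a receiver pays one hash to verify, so bulk spam becomes expensive while ordinary mail is barely affected.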

Is anyone aware of any work for how to apply differential privacy to language models?

So the main question I have is: let's say I'm working with sensitive data like emails or doctors' notes. How can I train an ML model that still learns something useful without leaking private data?

When I say "leak", an example would be: I train an RNN on some company email data, and when I feed the RNN "$AMZN" the network says SELL.

How can I quantify how much the model has learned and how much privacy has been leaked?
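The standard answer for training is DP-SGD (clip each example's gradient, then add noise calibrated to the clipping bound); Opacus and TensorFlow Privacy implement it with proper privacy accounting. A bare-bones, dependency-free sketch of one noisy step (names and parameters are illustrative, and a real deployment needs an accountant to track the privacy budget epsilon):

```python
import math
import random

def privatize_gradients(per_example_grads, clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD-style aggregation: clip every example's gradient to
    L2 norm `clip_norm`, sum, add Gaussian noise scaled to the clip
    bound, then average. Bounds any single example's influence."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += g[i] * scale
    sigma = noise_multiplier * clip_norm
    noisy = [t + random.gauss(0.0, sigma) for t in total]
    return [x / len(per_example_grads) for x in noisy]

# An outlier example with a huge gradient contributes at most clip_norm:
step = privatize_gradients([[10.0, 0.0], [0.0, 10.0]], noise_multiplier=0.0)
print(step)  # [0.5, 0.5]
```

The clipping is what makes the noise meaningful: since no single record can move the sum by more than `clip_norm`, Gaussian noise of that scale hides whether any one email or note was in the training set.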

Check out Song Han's lab at MIT (he's best known for NN compression) and their new paper on gradient leakage, plus the follow-up work: https://arxiv.org/abs/1906.08935 I attended a talk by him recently and was very impressed by the work of his lab in this area.

Quantifying shareprice movements as a function of data leaks? That's brilliant!

This is not the original Differential Privacy paper, this is an invited talk (I agree with you that it remains a great read). However, the original paper is "Calibrating Noise to Sensitivity in Private Data Analysis"[1] with four co-authors: Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith.

[1]: https://people.csail.mit.edu/asmith/PS/sensitivity-tcc-final...

Anyone else find it funny that the first paragraph of a Knuth Prize announcement writes "treat like alike" instead of ``treat like alike''?

More accurately, the typeset output has ”treat like alike” instead of “treat like alike”, which suggests that the TeX input (most likely) had "treat like alike" instead of ``treat like alike?''. Some historical context that led to features like this (or gotchas, today) in TeX and other systems of the time is here: https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html

I kept reading and re-reading 'treat' - 'like' - 'alike' trying to figure out where the difference was.

For those left wondering -- it's the first double-quotation mark leaning the wrong way in TFA.

That is pretty funny. An easy way to catch a junior TeXnician.
