Hacker News
Apple's emphasis on 'differential privacy' (wired.com)
101 points by rargulati on June 14, 2016 | 39 comments



Google's differential privacy technology (RAPPOR), used in Chrome starting in 2014:

http://www.computerworld.com/article/2841954/googles-rappor-...

https://github.com/google/rappor

https://arxiv.org/abs/1407.6981

(I worked on this)

I don't see any mention in the article of what algorithms Apple is using, or a link to the code. In the area of privacy, the code should really be open source, for obvious reasons.
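
For readers who have not met the idea before, here is a minimal sketch of classic randomized response, the coin-flip technique that RAPPOR builds on. This is not the RAPPOR code and says nothing about what Apple does; the function names and the `p_truth` parameter are made up for illustration.

```python
import math
import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth: float) -> float:
    """Undo the noise in aggregate.

    E[report] = p_truth * rate + (1 - p_truth) * 0.5, solved for rate.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


def epsilon(p_truth: float) -> float:
    """Privacy parameter: log of the worst-case likelihood ratio of a report."""
    p_yes_if_true = p_truth + (1 - p_truth) / 2
    p_yes_if_false = (1 - p_truth) / 2
    return math.log(p_yes_if_true / p_yes_if_false)
```

No single report reveals the user's true bit with certainty, yet the collector can still estimate the population rate from many reports. With `p_truth = 0.5` the scheme is ln(3)-differentially private.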


Differential privacy is an interesting research area. I know there have been several survey talks, including at NIPS. I wanted to point out this:

http://rsrg.cms.caltech.edu/netecon/privacy2015/program.shtm...

which has some nice slides and discussion on the "reusable holdout" ("thresholdout") which is a technique to allow one to use all of the training data to fit a lot of models, but also offers guarantees against overfitting.
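
A rough sketch of the Thresholdout idea, as I understand it from the slides: answer each query from the training set unless it disagrees with the holdout set by more than a noisy threshold, and only then return a noised holdout value. The names, default threshold, and noise scales below are illustrative assumptions, and the sketch leaves out the query budget that the full algorithm tracks.

```python
import random


def laplace(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query given its per-example values on each data set.

    train_vals / holdout_vals hold the query's value on every training /
    holdout example (for instance 1.0 where a model is correct, else 0.0).
    """
    rng = rng or random.Random()
    train_mean = sum(train_vals) / len(train_vals)
    holdout_mean = sum(holdout_vals) / len(holdout_vals)
    # Consult the holdout only when the training estimate looks overfit.
    if abs(train_mean - holdout_mean) > threshold + laplace(2 * sigma, rng):
        return holdout_mean + laplace(sigma, rng)
    return train_mean
```

Because the holdout set only influences an answer when the two estimates diverge, and then only through a noisy value, each query leaks little about the holdout data, which is what lets it be reused across many models.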


Our intro to the use of differential privacy to improve machine learning (admittedly, a different concern than preserving privacy): http://www.win-vector.com/blog/2015/10/a-simpler-explanation...


I would assume this will help Apple comply with the EU General Data Protection Regulation (GDPR). The GDPR defines quasi-identifiers as PII.


Yes exactly!!!!


??? Quasi-identifiers are a Pentium 2?


Personally Identifiable Information, I assume.


Article blocks those using an ad-blocker


Here's the thing about AdBlockers...

Ironic, considering I'm trying to read an article about privacy, and the entire reason I use an adblocker is to try and protect that privacy.


If you're using Safari, you can hit the Reader button and get the full article without crappy styling or ads! (It worked for me on Safari for OS X with AdBlock for Safari installed.)


Block their ad-block blocker with anti-adblock-killer https://reek.github.io/anti-adblock-killer/

Built into uBlock Origin with a checkbox to enable it in the settings.


That actually didn't work. I forced that list to update just in case, but it doesn't make a difference. I also wasn't getting the usual reader mode prompt in the address bar.

Safari Develop -> Disable Javascript works fine, though.


It generally does work, but sometimes new blocker scripts will still get through before an update makes it into the pipeline, and it's possible that some are even randomizing or otherwise altering their scripts to avoid detection.


Because you're entitled to their content.


They're not entitled to execute code on my computer that I don't want executed. If they want to prevent access to their content, that's fine, but they aren't doing that, they're making an absurd attempt to keep their content online while forcing people to disable their adblockers, which protect people from excessive resource downloading, slow websites, behavior tracking, and malware. The correct response is to block the script that tries to force that, not disable the adblocker. Or, better yet, block the entire site.


No, the correct response is to not go to the site in the first place. All I saw from your response is that you should be entitled to go read their content, despite the fact that they don't wish for people with ad blockers to go to their site. So you are saying you are entitled to read their content, rather than simply not going to it in the first place.


Yes, obviously people are entitled to read content served to them. That's how the web works.


Cool. Let's put ad-blockers in the UserAgent string so that both parties are approaching the transaction honestly, and see how that pans out.


You can't invent some new standard for "honest behavior" and then accuse others of dishonesty for not following it. There's nothing dishonest about not informing sites about what addons one is running. We're being just as straightforward as sites are with their information about exactly what PII is captured, by whom, exactly who gets a copy of it, and for what purposes it is used. If new standards of reporting are to be expected, then sites - most of which can't even tell who is serving the ads they have on their site - would be screwed.

I actually only use adblockers on my tablet, since I have misgivings about the effect of the end of web advertising on the availability of sites to poor people (since ads effectively act as a redistribution mechanism), but I don't blame anyone for not being willing to subject themselves to that crap.


This isn't a new standard. "Don't take shit you haven't paid for" has been a standard for quite some time.

Seriously, at the end of the day, why do you feel entitled to their content, instead of simply not going to their site?


This is a sponsored comment, please watch this ad before reading it: http://bit.ly/1UbKM9T

The new standard was Doctor_Fegg's idea that people should announce in their user-agent that they are using an adblocker. In any case, nothing is being "taken". The entitlement comes from where it has always come from - one makes a request, they reply with whatever they want to serve. Payment is only a moral obligation when we have agreed to do so, otherwise we'd have to pay for every musician playing on the street, regardless of whether we enjoy their content or not. People trying to push moral obligations to prop up their flawed business models don't get my sympathy.


But this isn't that. This is people getting content served to them after it was initially denied.

You still haven't explained why you're entitled to their content for free. How do you expect them to actually pay their bills? And if you don't care about that, then why are you reading their content?


The content is being served and then they're trying to examine your browser and tell your browser to hide it if it meets certain conditions. The content is already in the browser.


Legally? Yes.


No, not really.


Yeah, legally you're totally allowed to download stuff from websites.


It's not illegal to block ads and continue consuming content.


How would this work for image and facial recognition, especially with regard to individuals? It seems the features needed to recognize a face are by definition identifying.


From what I understand from the conference, facial recognition is done locally on the device and nothing is transmitted. They said that quite clearly.

There are probably some specifics I'm missing though, and it raises the question of how they identify other people's faces. I've not used the feature so I can't say. I'm guessing it's manual tagging when the device can't find a match, and then on-device processing afterwards once the algorithm has learned.


Google Photos only groups similar faces and requires the user to assign a name to that group. Apple could presumably perform the grouping in the cloud and relegate the association of the group with an identity to the user's device.


In the keynote they specifically called out face & image recognition as something performed entirely on-device, without touching the network.


That was my understanding too, but to handle the vast amount of image recognition required for faces, and also for the objects they demoed, there would need to be a large data set to learn from, and the results of that learning would have to be saved.

The data set to learn from can probably be gathered from stock photos or other open source images, but can the results of that learning be saved to the phone and compared against, or would they be too big?

If they are too big, does that mean characteristics of an image would be computed on the phone, sent to the cloud, and compared there, with the results sent back?


Yes, the results of all the learning can be saved in a compact form which is efficiently run on the device. Apple just yesterday unveiled their neural network API [1] which they are presumably using for these tasks. The description states: "BNNS supports implementation and operation of neural networks for inference, using input data previously derived from training. BNNS does not do training, however. Its purpose is to provide very high performance inference on already trained neural networks."

So they're really saying that no data leaves the device for either object or face recognition at classification time. Now, whether they are using iCloud-stored photos (which the user has already agreed to share) for training, I don't know, and any privacy issues would still be important at that point.

[1] https://developer.apple.com/reference/accelerate/1912851-bnn...
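
To make the training/inference split concrete, here is a toy sketch in plain Python (not BNNS, and not Apple's models): the "trained" weights are just a small table of numbers that ships with the app, and classification is a forward pass over them. The weights and the `classify` helper are invented for illustration.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up weights standing in for the output of an offline training run.
PRETRAINED = {
    "w1": [[1.0, -1.0], [-1.0, 1.0]], "b1": [0.0, 0.0],
    "w2": [[1.0, 0.0], [0.0, 1.0]], "b2": [0.0, 0.0],
}


def classify(features, params=PRETRAINED):
    """Inference only: fixed weights, no gradients, no data leaving the process."""
    hidden = relu(dense(features, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))
```

Calling `classify([2.0, 0.0])` returns two class probabilities that favor the first class. Nothing in this path needs a network connection; the open question in the comment above is only where the numbers in `PRETRAINED` came from.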


If you're referring to the new facial recognition system in iOS 10's Photos app, I believe they mentioned that the entire system would run locally on your own hardware. I took that to mean that they would not be sending anything, identifying or not, to Apple's servers.


I find it incredible that "we will allow users to save their data in the cloud" now automatically comes with "and we are going to exploit their data". Even for paying customers, who shouldn't be the product.


Paying customers are not the product. Programs that use machine learning techniques that require huge amounts of data input into them to be useful are the product. The problem is providing that data without compromising privacy, and this is Apple's solution to that.

You can certainly argue that this is "exploiting" the user's data, but any computer program whose purpose is to summarize or make inferences from a provided data set is, by definition, exploiting provided data, right? Privacy questions mostly revolve around consent and tracking. Is it potentially a privacy concern for Google or Apple to deduce that the email that I have from American Airlines and Hilton relates to travel plans, add that trip information to my calendar, add the airline ticket to my phone's wallet, let me know about transit options to and from the airport (possibly calling an Uber for me), and so on? Sure. But it's also really useful.

That Apple is trying to provide ever more Google-esque assistance without leaving a whole bunch of trackable and user-identifiable information on their servers, instead keeping as much as possible on my personal devices, is something I find laudable, not overly concerning. My biggest question is whether or not they're actually going to be able to make it work as well as Google does -- or at least close enough to Google's quality level that it's worth using.


"We are exploiting your data so that we can provide you a better customer experience", i.e. targeted ads. I know the melody.


iAd is dead at the end of the month. How else does Apple advertise to you?


Apple is really very committed to privacy and security!



