How much does Apple know about me? The answer surprised me (usatoday.com)
150 points by tanu057 on May 4, 2018 | 72 comments



I have a theory that the only reason Apple is today's beacon of user privacy is that they couldn't manage to compete in the ad (iAd), mail (at least not at G-scale), or social network (see: Ping) worlds.

The only path left for them was to say they were all about user privacy. If Apple had succeeded wildly in any of those three spaces, I think they'd be caught up like Google, Facebook et al.

I'm not saying this is a bad thing for the tech ecosystem, but I do think it was lucky positioning on their part.

Interested to hear opposing opinion on this.


iAd did poorly because Apple was unwilling to share data about their customers with advertisers.

>Advertisers Not Thrilled With Apple’s Practice Of Protecting Its Users’ Data... rather than offering a cookie-based ad-tracking and targeting mechanism, it essentially requires partners to tell it what kind of audience it needs to reach, and then trust that Apple will handle the rest

https://techcrunch.com/2014/02/18/advertisers-not-thrilled-w...

Ping was intended to integrate with Facebook, but Jobs canceled that integration at the last second, citing “onerous terms that we could not agree to”. It's uncertain if those terms were privacy related.

http://allthingsd.com/20100902/steve-jobs-on-why-facebook-is...


The only problem with that theory is that Apple is a much older company stemming from the days when the only way to make money was to sell things to consumers. Nearly all of that company's infrastructure and inertia has always been that way. They may be opportunistically touting privacy now like they're doing us a favor (and they kinda are...), but the main favor they did was simply never commit massive resources to a pivot to data-driven services. iAd, for example, was basically a "me too!" way of creating revenue for early iOS app devs (because Google's Android was the only competition in that market).

For all of the other companies mentioned: data was basically their only asset and business model, and they were started 25+ years after Apple.

(Edit for grammar)


Or they realized that hardware and software (App Store) is the better long-term approach. Hardware is what Apple has always been known for, and Jobs too if you count NeXT.

Google and FB sell ads and their products are built around better selling ads. Privacy runs counter to their business model. Apple and even Microsoft can and should focus on privacy because their business models won’t be impacted as much. They can strengthen their relationships with customers who buy into their ecosystems.

Android is the oddball for Google and I don’t know enough to comment about privacy.


Google collects tons of data on Android through their Apps (Chrome, Search, Maps) in connection with Play Services.

That doesn't seem like a less sustainable model, it seems better in the longer term. As the hardware market is saturating Apple has a harder time selling newer phones (turn up the marketing machine), but Google can still sell personalised ads.


Maybe not directly an opposing opinion, but I'll point out that Apple makes more money than any ad, email, or social media company, so I think it's hard to argue that Apple has somehow screwed up as a business.

In other words, to me it seems reasonable to think that they affirmatively chose a strategy that has worked out for them--not that their strategy was forced upon them by their shortcomings.

I would also quibble with your characterization that there is some difference between the ad, email, and social network worlds. Google and Facebook are ad companies--look at their revenue. Apple is not.

I will say that I don't think Steve Jobs set out 18 years ago to turn Apple into a privacy company. I think it's reasonable to say that Apple concentrated on products and services, and discovered privacy as a differentiating strategy along the way.

But it's just as true to say that Google and Facebook discovered data-driven advertising along the way too. Both projects started out without a business model at all.


> ... so I think it's hard to argue that Apple has somehow screwed up as a business.

I don't think anyone was stating this. Merely that their ventures into spaces that typically would be privacy-hostile didn't work out for them anywhere near as well as it did for Google and Facebook.

OP's point seems to be saying something more along the lines that perhaps those particular failures helped them decide to stay (or become) more privacy-focused.


It was becoming plain a few years ago (by 2015 at the latest) that Apple didn't have the aptitude to compete in big data or machine learning. It was very clever of them to bet on privacy - the one strategy that has little need for big data and ML.

The other big bet is that something might happen to convince users that privacy is good. We'll see how that plays out - users aren't particularly bothered by Facebook revelations as their DAUs attest. Kids are growing up with microphones in their bedrooms (Amazon Echo Dot for kids) so privacy will be a deeply eroded concept ~15 years from now (if it isn't already).


If that’s true then it makes me that much more confident in Apple’s privacy practices. I will trust a profit motive to respect privacy much more than a CEO or leadership team that respects (or claims to respect) privacy purely as an ethical principle.


This doesn't make sense. A CEO who touts it as an ethical principle, if they're being honest, is likely to actually try to protect your privacy. For profit reasons, there's no reason to actually protect your privacy as long as they keep up appearances.


There are certainly legal reasons to not directly lie about the privacy practices of your company. I’m sure it has happened before, but are we really worried about, for example, Facebook lying about WhatsApp messages being end-to-end encrypted? Surely researchers would expose that lie pretty quickly.

And CEOs and leadership teams come and go. If the only thing making a company respect your privacy is the principles of the CEO, and especially if that respect for privacy is bad for the company’s bottom line, then you can bet that CEO is going to have a change of heart after a board meeting or is going to get fired as soon as there’s a bad quarter.


Microsoft was there in ~2011-2014. Check these out https://www.youtube.com/watch?v=9x4_dozWkq0

https://www.youtube.com/watch?v=bSIlUXOH2iA

From Scroogled to "telemetry" collecting user keyboard input in 4 years.


Interesting theory. However, if you look back and watch many of Steve Jobs's interviews, you will find that he was a longtime proponent of protecting user data.

Additionally, he stated many times that users should be informed when an action involving privacy is about to take place, "repeatedly and in plain English". Though I am not sure how much of this hardline stance on privacy still exists within Apple today.


I agree, if the success was there they'd follow it.

But to be fair I think Steve Jobs was pretty interested in privacy and put some of that philosophy in the company. So they were already pursuing privacy goals but there would be these conflicts of interest that they would eventually have to resolve.

Luckily they continued to suck at social networking and many online things. It works better for them to double down on privacy.


What if the causal relationship was reversed? Perhaps the success wasn't there in those areas due to a previously established philosophy concerning privacy.

Grasping and purely conjecture obviously, just putting it out there.


This seems reasonable to me. If the leadership decisions and the culture are a constant "drag" towards an organization being competitive in these areas, the organization will do badly at them, and be unable to compete. This can be seen as a feature rather than a bug, of course depending on where you sit.


I think it's important to bear in mind that Apple's goal (generally) isn't to do cross-platform social apps. iMessage is a success because it bridges to non-Apple platforms via SMS. Everything else is largely siloed to Apple devices. This naturally limits them.


Are you familiar with this patent application?

http://ipwatchdog.com/patents/US20090265214.pdf


Oh, the good invention of "Page Has Moved"



They certainly used the data they had from their franchisees to decide where to put their Apple Stores.


That may be true, but I am not complaining.

Note that I do not fully agree with the parent.


<Your Siri requests —"Show me how to get to PF Chang's," or "What year was Steve Jobs born?" go back to Apple — but it uses a random identifier to mask your identity. So a Siri search for the closest Chipotle restaurant will only tell Apple that a user requested the data, but not associate it with me.>

I find that comforting. In fact, if I was a large organization and processed a ton of user data, I'd want to store that data anonymously too, due to the sheer risk of having that personal data.


And then you have users who always go to the same PF Changs but use navigation in case there is traffic and they should use an alternate route. Your competitor app will learn this and adapt and yours won't. The average customer will have no idea you aren't storing their data and likely doesn't even care while they aren't reading an article about privacy. While I personally wish everyone would take your approach, it is also tying one arm behind your back.

Edit: I once worked with an ex-googler who told me to always store all the data you have because you never know what you can use it for, you can't get it back if you change your mind and storage is cheap. Hard to argue with this if it's "just" about competitive advantage and monetization.


> always store all the data you have [...] storage is cheap.

Here's the flip side: You can't lose data to a breach that you don't store. You can't have a rogue employee crow on social media about how they have access to data you don't store. You can't be liable for GDPR violations about PII you don't store.

Data is certainly an asset, but it's also a huge liability - and laws are starting to catch up in order to enforce how big of a liability it really is.


> Here's the flip side: You can't lose data to a breach that you don't store.

That means NO storage, not even "anonymous". (which Apple clearly does)

> You can't have a rogue employee crow on social media about how they have access to data you don't store.

Requires NO storage, which they clearly do.

> You can't be liable for GDPR violations about PII you don't store.

If you store it "anonymously" you still can, since the only requirement is that it be personal data, and there is zero chance it can't lead back to the person. Anyone working with those 'unique' identifiers can tell you they most likely aren't that anonymous and the data can be used to trace a single person.


> That means NO storage [...]

There's a difference between storing only the data required for conducting your business, and storing all the data you possibly can for some imaginary future use. I'm suggesting the former is a better practice.

> If you store it "anonymous" you can

Anonymity via random-but-unique IDs is a tenuous protection at best when storing everything you possibly can. Take, for example, the fact that with a gender, a zip code, and a birthday, you can be uniquely identified with around 85% accuracy [0]. None of those are traditionally considered to be PII, and it's pretty likely that these are the kinds of things stored "anonymously" by the "store it all for later" kinds of companies.

[0] https://www.eff.org/deeplinks/2009/09/what-information-perso...
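
To make the re-identification point concrete, here is a minimal sketch in Python. The rows, field names, and the `unique_fraction` helper are all made up for illustration; it only counts how many rows become unique once a few quasi-identifiers are combined.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical "anonymized" rows: no names, only a random ID plus quasi-identifiers.
records = [
    {"id": "a91f", "gender": "F", "zip": "94110", "dob": "1984-03-02"},
    {"id": "07c2", "gender": "M", "zip": "94110", "dob": "1984-03-02"},
    {"id": "5d3e", "gender": "F", "zip": "94110", "dob": "1990-11-17"},
    {"id": "c440", "gender": "F", "zip": "94110", "dob": "1990-11-17"},
    {"id": "e8b1", "gender": "M", "zip": "10001", "dob": "1975-07-30"},
]

by_zip = unique_fraction(records, ["zip"])                   # 1 of 5 rows is unique
by_all = unique_fraction(records, ["gender", "zip", "dob"])  # 3 of 5 rows are unique
```

On this toy data only one row in five is unique by zip code alone, but three in five are unique once gender and birth date are added. Real datasets are far larger, but the effect is the same in kind.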


> And then you have users who always go to the same PF Changs but use navigation in case there is traffic and they should use an alternate route. Your competitor app will learn this and adapt and yours won't.

Your conclusion doesn't follow from your premise.

Once a user navigates to a specific PF Chang's from a specific location, the navigation app can suggest an alternate route that takes into account traffic, regardless of whether that user has a history or not.

In your scenario, the history of the user data would not affect the ability of the app to suggest routes that are less trafficked.

EDIT: change adverb in last sentence


His point was to store that data because you might want it in the future - not that it's necessarily useful now, in this situation.

(Ex. 2 years from now, when Apple gets into the restaurant business, they might want to analyze which users (user ids) search for which types/distances of restaurants.)


Storing lots of data indefinitely is not cheap: there is a large fixed cost to develop and maintain the storage system, though incremental storage is often cheap.

For a company like Google that may have a reasonable need to store a lot of stuff (multiple versions of the web corpus, Gmail, Drive), it may be cheap to also store search queries forever and who knows what else. For a company without an intrinsic need to store large data for long periods, it's not cheap to add.

Collecting information you don't plan to use and don't know how you will use is likely to mean when you do use it, you didn't collect it in a suitable fashion, so you may not be able to use it anyway. In the meantime it's a privacy liability with no value.


Doesn't this assume you don't use Siri for anything personal?


There might be logs somewhere that could be used to trace the original user.


Clickbait headline - could it get changed to clarify the alleged answer is “very little”?


Agreed, headline is very clickbait-y, should be changed


To be fair, MB of data is kinda a crummy metric because for Facebook that's primarily photos and videos


I always question the effectiveness of obfuscating data using "unique identifiers" against a party determined to de-obfuscate the data. Aside from that, however, this is an encouraging read.


You may enjoy learning more about Differential Privacy. https://machinelearning.apple.com/2017/12/06/learning-with-p...


Some interesting previous discussion regarding the limitations of Apple's differential privacy can be found here: https://news.ycombinator.com/item?id=15224312


Differential Privacy and anonymous-but-unique identifiers are disjoint methods of protecting privacy. Per TFA, they appear to be using the latter.

And WRT differential privacy, if enough data is captured and associated with a single identifier, then it's sufficient to get a pretty good idea what the user actually does. That's the point of differential privacy in the first place, since it indicates that a statistically meaningful amount of the data is valid.

For example, if my user id has five visits out of 1000 DP recorded visits to google.com, it's unlikely that I actually visited google.com. However, if there are 200 recorded visits, it can be safely said that yeah, I intentionally visit google.com.
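
A rough sketch of why per-ID accumulation matters, using randomized response (a textbook local differential privacy mechanism, and an assumption here, not necessarily what Apple ships). The function names and the `p_truth` parameter are hypothetical.

```python
import random


def noisy_report(true_bit, p_truth, rng):
    """Randomized response: report the truth with probability p_truth,
    otherwise report a fair coin flip."""
    if rng.random() < p_truth:
        return true_bit
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth):
    """Invert the noise: observed = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


rng = random.Random(0)  # fixed seed so the sketch is repeatable
p_truth = 0.5

# Two hypothetical user IDs, each sending many noisy reports about one site.
visitor = [noisy_report(True, p_truth, rng) for _ in range(10_000)]
non_visitor = [noisy_report(False, p_truth, rng) for _ in range(10_000)]

visitor_estimate = estimate_true_rate(visitor, p_truth)
non_visitor_estimate = estimate_true_rate(non_visitor, p_truth)
```

Any single report is deniable, but with thousands of reports tied to one identifier the estimate lands near 1 for the real visitor and near 0 for the non-visitor.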


Exactly my thought. If they can connect all Siri queries made by a user, it doesn't matter much if they don't store the name of that user. Deanonymisation happens easily through correlations, and some correlations are really hard to predict.

It'd be better if they just didn't store an ID with the data.


I think the article is badly phrased, and that they mean that _each_ individual Siri request from a device is from a random identifier, so they can't track it back to you.


According to Apple's privacy page[0]:

> Siri and Dictation do not associate this information with your Apple ID, but rather with your device through a random identifier. Apple Watch uses the Siri identifier from your iPhone. You can reset that identifier at any time by turning Siri and Dictation off and back on, effectively restarting your relationship with Siri and Dictation.

So it looks like it keeps the same ID until you turn Siri off; if you turn Siri back on, you'll have a new ID.

[0] https://www.apple.com/privacy/approach-to-privacy/


Per a recent article about Siri and third parties:

> According to Wired, which reported on the story back in 2013, the data is shipped off to Apple's data farm for analysis, where it generates a random number that represents both the user and voice file. Six months later, Apple "disassociates" the user number from the clip, thereby deleting the number. The files are then stored for an additional 18 months, all for the sake of testing and product improvement.


Disassociation is removing the entry in the map table. At least, it is where I work (not an Apple employee).
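
For what it's worth, a minimal sketch of that kind of disassociation, assuming a simple in-memory design. `PseudonymStore` and its methods are hypothetical names, not any vendor's actual system.

```python
import secrets


class PseudonymStore:
    """Keeps data under random IDs, with a separate map table back to users."""

    def __init__(self):
        self._id_by_user = {}  # the "map table": user -> random ID
        self._clips = {}       # random ID -> stored clips

    def record(self, user, clip):
        """Store a clip under the user's random ID, creating the ID if needed."""
        if user not in self._id_by_user:
            self._id_by_user[user] = secrets.token_hex(8)
        rid = self._id_by_user[user]
        self._clips.setdefault(rid, []).append(clip)
        return rid

    def disassociate(self, user):
        """Delete only the map-table entry; the clips themselves stay stored."""
        return self._id_by_user.pop(user, None)

    def clips_for_user(self, user):
        """Clips that can still be looked up starting from the user."""
        rid = self._id_by_user.get(user)
        return list(self._clips.get(rid, [])) if rid is not None else []

    def stored_clip_count(self):
        """Total clips retained, whether or not they are still linked to a user."""
        return sum(len(clips) for clips in self._clips.values())


store = PseudonymStore()
store.record("alice", "clip-1")
store.record("alice", "clip-2")
before = store.clips_for_user("alice")
store.disassociate("alice")
after = store.clips_for_user("alice")
retained = store.stored_clip_count()
```

After `disassociate`, the lookup from the user returns nothing while both clips are still retained. Whether the clip contents themselves could identify someone is a separate question this sketch doesn't address.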


This was my interpretation as well.


Apple needs to create a dashboard exactly like the Google one.

Their agreement allows them to collect tons of data but there is no transparency on what they have and no ability to remove or download.

The Google one also has all your devices in one place with what apps and permissions you have granted. Exactly what I would like for my Apple devices.


Ironically, complying with these asks would require more data to be pushed off the individual devices.


Doesn't that already happen when you upgrade to a new device?


A lot of the data transferred to a new device happens over an encrypted channel - a channel that Apple has no access to, and thus can't populate on a dashboard.


Okay, wait. So a tech company can store all that data under “xyz anon user” and then they don’t have any data about you?


Depends on the data but yes.


It would be nice to have more than "Apple also offers data downloads in the privacy section of its website. It's hard to find..."

I can't find it... I started at icloud.com which sent me to appleid.apple.com which sent me to apple.com/privacy/ and I never found a data download link.


How can it direct you to the nearest Chipotle if it doesn't know where you are? It must know who you are at least at the time of the query


Just pointing out the obvious: Absence of evidence is not evidence of absence.

The only eye-opening part would be the fact that Apple doesn't store much about its users, which isn't verifiable.

Also, unique identifiers are not the only way to link data. And under differential privacy, the guarantee that the data cannot be linked decays over time (see https://news.ycombinator.com/item?id=15224312)


The question is motive. Apple's business model isn't predicated on having a lot of data on you. They have to keep a list of your purchase history for instance so you can redownload purchases.

Also, why would they risk lying? What's in it for them?


I'm by no means implying any malfeasance. I fully agree with your thesis about Apple's business model.

Depending on your threat model in regards to your privacy the assurances made by Apple may be enough.

To me, it doesn't really matter. As long as I have no proof or control over how my data is being processed, it's better to just assume the worst case and practice data minimisation.


> Absence of evidence is not evidence of absence.

That's neither obvious nor true. Absence of evidence is more likely if there really is an absence of whatever; thus it is in fact evidence of absence, if only weak evidence.
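
A small numeric sketch of that argument via Bayes' rule. The probabilities are invented for illustration, and `posterior_absent` is a hypothetical helper.

```python
def posterior_absent(prior_present, p_evidence_if_present, p_evidence_if_absent):
    """P(thing is absent | no evidence observed), by Bayes' rule."""
    prior_absent = 1 - prior_present
    no_ev_if_present = 1 - p_evidence_if_present
    no_ev_if_absent = 1 - p_evidence_if_absent
    p_no_evidence = prior_present * no_ev_if_present + prior_absent * no_ev_if_absent
    return prior_absent * no_ev_if_absent / p_no_evidence


# Assumed numbers: 50/50 prior; evidence turns up 30% of the time if the thing
# is present and 10% of the time (false positives) if it is absent.
posterior = posterior_absent(0.5, 0.3, 0.1)  # about 0.5625, up from the 0.5 prior
```

The shift is small because evidence was unlikely to show up either way, which is the "weak evidence" part. It only disappears if evidence is equally likely whether or not the thing is present.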


This click baity title didn’t come from the article. You’ll never guess what happens next!


The title of this article is click-bait, surely? Apple have made much PR out of their privacy stance. Indeed, the article says: "Apple makes a big deal about its different approach to privacy on the company website".

How is this eye opening?


The eye opening part was how little data there was and the hoops needed to verify the recipient. This is in comparison to Google and Facebook. Both have orders of magnitude more data and sent the data with, apparently, little verification.


It's not really eye opening considering Apple products cost money. Their business does not lie upon selling data to advertisers or other entities.


But they could collect data and presumably profit more. If they did collect more data and profited from this data collection would they really lose enough customers to offset this increase in profit? It seems to me they wouldn’t. I think they are taking a moral stance on the issue at least for the time being. This may change in the future.

It is surprising though for the company not to lie about this given the shenanigans that many large companies engage in.


It's eye opening that the author doesn't use iCloud? Why? Almost nobody uses iCloud. Each of its components is worst in class.


I use iCloud for storage. It’s worked great for me. I’ve never had a problem with it. It wasn’t eye opening to me that the author doesn’t use iCloud.


Perhaps because we've all been conditioned to ignore PR, or to believe the exact opposite of what PR says. Seeing that it's true is an unexpected situation.

It's also eye opening in the context of a comparison against the other FAANG companies in question.


For you maybe it isn’t

But you’re only one of 300 million in the states and 7 billion on the planet

And it’s USA Today, mainstream news. Maybe it’s not targeting someone with nothing better to do than memorize Apple's policies


Did we really need a photo of the author holding a printout of the data?

> It kept a copy of every app and song I'd downloaded, every tune I'd added to my iTunes music library, and every time I needed repair on a multitude of Apple devices going back a decade.

Is this surprising in any way? If you buy something (or "buy" it for free on the App Store) of course the company you buy it from keeps records of that.


> If you buy something of course the company you buy it from keeps records of that.

I don't think it's a problem, but they could not tie it to your identity. After all, when you buy something with cash, the company might keep a record of the purchase, but it's not (or at least not always) tied to you.


> Is this surprising in any way? If you buy something (or "buy" it for free on the App Store) of course the company you buy it from keeps records of that.

It need not; for example it doesn't need historical data that have been superseded; if it no longer has an app available for download it could purge the fact that you bought it; it could give purchasers an anonymized download key that their phone could store (or cache in iCloud someplace).


I'm not complaining per se, but quite puzzled why someone would downvote this factual comment. Plenty of companies sell you things without maintaining a record (say, a brick and mortar shoe store) and there's typically little reason to do it online either.

Yes, I know plenty of online companies do routinely spy on users but I see no reason to consider this "of course". This is admittedly creeping into meatspace (there's absolutely no legitimate reason for things like automatic toll tags or driving licenses to supply privacy-busting data to third parties), but on the other hand plenty of sandwich shops manage to have "frequent diner" programs that are simply a card the customer carries which is punched each time a sandwich is purchased. There is no reason why an online business shouldn't do the same -- and GDPR should drive businesses to do so. It's in the customer's interest.


That’s pretty rough from a financial audit perspective.

For budgeting purposes I’ve tried to find a full list of my iTunes purchases, and couldn’t find it in the store. Didn’t think to look in privacy.


8 days = more time to filter user content and claim they have no data about you



