Improving Siri's Privacy Protection (apple.com)
92 points by epaga 54 days ago | 40 comments

"As a result of our review, we realize we haven’t been fully living up to our high ideals, and for that we apologize. "

This is a good response by Apple. I hope the incident will motivate a higher level of proactive privacy protection and provide an example for others to follow. Apple is in a position to do much good. I think their privacy focus is a great business decision, as it promotes an advantage their competitors will be hard pressed to match as consumer privacy awareness and demand increase. It is still not enough, though. If only Apple would become as aggressive about privacy as entities such as Purism, The Tor Project, and the Electronic Frontier Foundation, individual privacy could improve in a big way very quickly.

> Siri uses a random identifier — a long string of letters and numbers associated with a single device — to keep track of data while it’s being processed, rather than tying it to your identity through your Apple ID or phone number — a process that we believe is unique among the digital assistants in use today. For further protection, after six months, the device’s data is disassociated from the random identifier.

Interesting, I thought I had heard it widely reported that Apple was keeping hold of audio records tagged with your Apple ID for 6 months, before anonymizing. That looks like it wasn't the case, and Apple was only tagging those recordings with a device ID, presumably to associate recordings with other recordings.

Yeah, "widely-reported" was The Verge. As John Gruber points out[0], The Verge wasn't wrong, but I can't say their reporting would give the average reader a good grasp on what was really going on. That would include myself: https://news.ycombinator.com/item?id=20724558

[0] https://daringfireball.net/2019/08/apple_siri_privacy

From 2017, about the recording and tokenization steps:

Siri records your queries too, but she doesn’t catalog them or provide access to the running list of requests. You can’t listen to your history of Siri interactions in Apple’s app universe.

While Apple logs and stores Siri queries, they’re tied to a random string of numbers for each user instead of an Apple ID or email address. Apple deletes the association between those queries and those numerical codes after six months. Your Amazon and Google histories, on the other hand, stay there until you decide to delete them.


From Wired, “Apple finally reveals how long Siri keeps your data”, in 2013, about later disassociation from the tokens:

Once the voice recording is six months old, Apple "disassociates" your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.

"Apple may keep anonymized Siri data for up to two years," Muller says "If a user turns Siri off, both identifiers are deleted immediately along with any associated data."

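To make the two quoted descriptions concrete, here is a minimal sketch of the retention scheme as described: a random per-device token, disassociation after about six months, and deletion after about two years. This is not Apple's implementation. The names `Clip`, `new_device_token`, and `apply_retention`, and the exact day counts, are assumptions for illustration only.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed windows, approximating the figures quoted above.
DISASSOCIATE_AFTER = timedelta(days=183)  # roughly six months
DELETE_AFTER = timedelta(days=730)        # roughly two years in total


@dataclass
class Clip:
    recorded_at: datetime
    device_token: Optional[str]  # random identifier, not an Apple ID


def new_device_token() -> str:
    """Return a random identifier that carries no account information."""
    return secrets.token_hex(16)


def apply_retention(clips: List[Clip], now: datetime) -> List[Clip]:
    """Drop clips past the deletion window; strip the token from older ones."""
    kept = []
    for clip in clips:
        age = now - clip.recorded_at
        if age >= DELETE_AFTER:
            continue  # too old: discard the clip entirely
        if age >= DISASSOCIATE_AFTER:
            clip = Clip(clip.recorded_at, None)  # disassociate from the device
        kept.append(clip)
    return kept
```

Note that in this sketch, stripping the token only removes the link to a device. It does nothing about identifying content inside the recording itself.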

I don’t want audio recordings or transcripts to remain on their servers. I don’t even want “smart” Siri. I want stupid Siri, aka proper speech to text locally + a fixed set of commands I know/learn/discover.

There is no need to send all of this to them. If I want to suggest a command, let me do so through a simple form.
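A rough sketch of what the "fixed set of commands" half could look like, assuming some local speech-to-text step has already produced a transcript string. The patterns, the action names, and `match_command` are all hypothetical. Nothing here reflects how Siri actually works.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical fixed command set: each pattern maps to one action name.
COMMANDS = {
    r"call (?P<name>.+)": "call",
    r"set (?:a )?timer for (?P<minutes>\d+) minutes?": "timer",
    r"play (?P<title>.+)": "play",
}


def match_command(transcript: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Match a locally produced transcript against the fixed command set.

    Returns (action, arguments), or None if nothing matches. No network
    access is involved at any point.
    """
    text = transcript.strip().lower()
    for pattern, action in COMMANDS.items():
        m = re.fullmatch(pattern, text)
        if m:
            return action, m.groupdict()
    return None
```

Anything that does not match simply returns `None`, which is the whole point: a predictable set of commands you can learn, with nothing guessed on a server.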

The concept of data collection being off by default, and explicitly asking users for permission to turn it on is exactly what I would expect from all of FAANG going forward.

It's the default settings that matter most. Especially when the privacy permissions are frequently hidden away and defended by dark UI patterns intended to keep users from finding data access permissions and turning them off.

Now that Apple is adopting this position, will Google, Facebook, Microsoft and Amazon follow suit?

The inadvertent triggers are what worry me. My HomePod is always randomly saying things like “I’m sorry I didn’t get that” or “go ahead”. And creepily one night just said “I’m still here”.

Ideally they’d fix that, but in the meantime I’m glad I can at least rest a little easier knowing Apple isn’t listening unless I opt in (which I might consider if it helps them actually fix the false triggers).

Maybe they need a dashboard that shows all the requests and lets you mark which ones were false or wrong instead of having someone directly listen?

It's interesting how they talk about "doing as much on device as possible"... but Siri's voice recognition still works by sending the Siri request audio to an Apple server and doing the voice recognition there (only the trigger phrase "hey siri" is recognized on-device). Why isn't it all done on-device? I'm pretty sure even older iPhones have more than enough CPU-power for that.

Isn't this something that only recently has become possible, and still has an accuracy cost?

iOS already has a separate feature "offline dictation" which works on-device. I don't understand why that isn't used for Siri.

Also about "recently"... back in the late 90s I had a desktop PC running Windows 98, I think it was a Pentium 166 MHz with 32 MB RAM. On it, I had a voice recognition program called "Dragon NaturallySpeaking". It required a little training with my voice but after that, it worked remarkably well. And that was over 20 years ago on a PC with a - by today's standards - very primitive CPU. Decent voice recognition isn't new technology.

The problem with Dragon was that it was so inaccurate that a moderately skilled typist could produce final corrected text much more quickly than they could dictate text and then make corrections.

Looking online, this is a feature Apple is adding in this year's iOS and Google is adding to Pixel devices only so far, so I would expect to have to give them some time to get on-device speech recognition working at all as a first step.

Exactly. Also, smart Siri is so stupid, it recently started correcting the contact I want to call to someone else, even though it was properly recognized and transcribed.

Funny thing is I call this person daily, and the last time I called the other one was 6-7 years ago.

Frustrating as hell

>>Third, when customers opt in, only Apple employees will be allowed to listen to audio samples of the Siri interactions. Our team will work to delete any recording which is determined to be an inadvertent trigger of Siri.

Unless the government sends a warrant; then we will share anything we have with them and we will do what they want us to do, e.g. "we want all audio from this zipcode".

This is essentially what has happened to Chinese iPhone users except not just one zipcode but the entire country.

> users will be able to opt in

Would you help make Siri better? Yes|No

I'd have preferred that they asked me to grade my interactions but this seems to be the right move.

I feel it would be better if they provided a way to supply a corrected version and/or mark individual transcriptions entirely on device, with the option to forward individual ones.

E.g. wanting to “help Siri get better” shouldn’t have to be an all-or-nothing opt-in. Though obviously individual corrections are also probably less likely to be sent.

Or maybe simply sending the local model updates would be sufficient to improve things globally?

You can actually provide a corrected version to Siri by pressing the little "Tap to edit" under the transcript! I'm not sure if the corrections ever get sent to Apple (maybe it's on-device training), but I've found that after a while Siri's accuracy improved a lot, e.g. it no longer interprets "Log" as "Lock" when I say it.

Is there a way to do that for HomePod?

None that I’m aware of, sorry :(

I feel like I'm in bizarro world. It opens with:

>At Apple, we believe privacy is a fundamental human right. We design our products to protect users’ personal data, and we are constantly working to strengthen those protections.

And ends with Apple saying they will no longer store audio recordings listened to by third-party contractors by default. If Apple cares about privacy, why was that not opt-in to begin with?

> If Apple cares about privacy, why was that not opt-in to begin with?

Apple has to invent a time machine before you'll offer any forgiveness? They made a mistake. If previous behavior was so egregious that no improvements going forward will suffice, then from where I stand the only remaining option is to use other vendors.

GP never said there's anything Apple could do at this point to win them back.

IMO, a company that really took privacy seriously would never, ever have let humans listen to Siri recordings without explicit user consent. There is in fact nothing Apple can say post-mortem to change this.

Now, whether any of the mainstream alternatives are better for privacy is a legitimate question—but at this point, I don't think anyone should choose Apple because of their privacy claims. If there are no more incidents in the next five years, maybe we can reconsider at that point, but that is how long it should take for Apple to regain trust. (That, or Apple opening their systems to public audit, which will never happen.)

Until Google gives us a way to turn off their buying a copy of everyone's brick and mortar store credit/debit card transaction data, they cannot be taken seriously as a privacy option.

>Google has been able to track your location using Google Maps for a long time. Since 2014, it has used that information to provide advertisers with information on how often people visit their stores. But store visits aren’t purchases, so, as Google said in a blog post on its new service for marketers, it has partnered with “third parties” that give them access to 70 percent of all credit and debit card purchases.


Even if the mainstream alternatives are also bad, two wrongs don't make a right.

"We didn't include an option to turn off data analytics for Siri in the privacy section of the control panel where all the other data analytics options are" is a far cry from "We are now buying a copy of everyone's credit card transactions so you should buy all your targeted ads from our fully armed and operational surveillance capitalism Death Star". (to borrow a Star Wars metaphor)

The trouble is that we don't know what Apple is doing. It's mostly a big black box. Maybe they are tracking everyone's location and credit card transactions.

I've been on the exact opposite side of this argument for the past year, both here on the internet and IRL. Sure, we can't look inside Apple's systems, but if they say they're doing X Y and Z, we ought to believe them. Apple knows that if something comes to light after they've billed themselves as a privacy company, it will be a big scandal, so they'll want to keep their word.

Well, here we are, and now I feel stupid. Fool me once...

The fact that Apple is bragging that they take privacy seriously while Google has been bragging to advertising customers that they have been violating the privacy of everyone's banking transactions is everything you need to know to see that the two companies have a very different approach to user privacy.

However, Google can still adopt Apple's position that data collection should be turned off by default and only turned on by the users after you explain what will be done with the data.

They won't since spying on users is how they make the vast majority of their money, but they could.

Well, they can perhaps also stop doing business with the Chinese government and then continue preaching about privacy. They've proven flexible with their morals when it came to money too many times to be trusted. Just like any other corporation.

Maybe they should also stop doing business with the American government? ;-)

If Apple stopped doing business with China they would have to stop selling products altogether.

There just isn't the manufacturing capability anywhere else in the world to compete with China. It's a shame that the world outsourced such an important skill but that's where we are.

Other phone makers manage to manufacture phones in China without handing all Chinese customer data to the Chinese government. It is possible to manufacture in China without selling to Chinese consumers.

Don't forget that they take Google's money and make them the default search provider in Safari.

Apple loves privacy, but they love money more.

> If Apple cares about privacy, why was that not opt-in to begin with?

Partly because Siri wasn’t Apple's to begin with. It was a company they acquired with its own app, with established processes. M&A doesn’t necessarily replace all processes with new processes.

A couple of simple, potentially explanatory thoughts come to mind:

- All this digital privacy stuff is new to humanity

- We've only been trying to figure it out for a couple decades

- We're still learning—sadly, mostly through trial and error—all the ways in which people can come up with privacy-invading techniques and tools

- Motivated actors are clever, and think up ways to invade, violate, and compromise privacy, often in ways software/hardware makers haven't already thought of—but seem obvious in retrospect

- Apple's concern for privacy has emerged alongside its development of devices and software that can be used to invade privacy—they can only take action against a danger/threat after they've thought or become aware of it

- Again, all this stuff is new. We used to run e-commerce and punch in credit cards without thinking about SSL certs—cos shit was new, shiny, and magical.

We're all becoming increasingly aware of just how much privacy needs to be a core development premise of just about any technology. And we're currently stuck in a weird place of individuals and companies trying to increase privacy levels while other individuals, companies, and governments are trying to break it down, get backdoors, invade everyone's privacy and track them everywhere they go, etc.

tl;dr — Respecting, protecting, and increasing digital privacy is a constant process.

Are you going to put the same post under Facebook's, Google's and some other news... or does this "It's all new!" cop-out count only for Apple?

Remember, they plastered "What happens on your iPhone, stays on your iPhone" ads all over CES this year while they kept transcribing Siri audio. And they did that until they got caught. Just like they throttled iPhone performance until they got caught and THEN they finally told users about that.

Why would they get the benefit of the doubt?

Not the OP but for some reason this privacy-first messaging comes across as much more genuine from Apple than it does from Facebook or Google.

Probably due to the comparatively small number of privacy incidents, coupled with the business models of those companies compared to Apple.

Apple may not be perfect but at least it feels like they are trying.

Or perhaps it comes across as more genuine because they have better marketing while they follow the others as soon as it suits them to make a buck? They lied to their customers' faces with building-sized ads.

Follow the money.

Google and Facebook are built on renting our privacy out to the highest bidders. The minutiae of our lives are their raw materials, and to make more money they have to find new and exciting ways of exploiting our information.

Apple is built on making things we want to buy, then selling us stuff to do, watch, play, read or hear on those things. They’re also very good at telling us about those things and why we might want to buy them.

So yes, privacy-first messaging comes across as more genuine when it comes from companies that aren’t designed from the ground up to scrape every bit of information about us that they possibly can.

One can think Apple's decision was wrong while also thinking it wasn't intentionally malicious.

Five or more years ago, I gave FB & Google the benefit of the doubt—when the first few major privacy fiascos came to light, making it clear that it was a big deal and privacy needed to be cared about. Sadly, Facebook and Google have repeatedly shown themselves to be actively hostile to user privacy, and continue to double and triple down on invading privacy and breaching trust. They've been the giants who have proven the need for making privacy a primary concern of all tech hardware/software—because they are dependent on breaching and invading privacy to earn their income.

Apple does not depend on providing third parties access to identifiable data and profiles to better target ads, identify users, and destroy privacy and anonymity to earn their income. This pretty much means a wholly different default outlook from Facebook and Google. Apple continues to not just market, but take active steps toward increasing privacy and data protection for its users across its hardware lineup. I'm not at all suggesting Apple isn't worthy of criticism when they fuck up—they absolutely are. I'm only responding to the question of why a certain feature may not have been opt-in to begin with—when it appears obvious in retrospect that it should have been.

In short, Apple still has my reserved, cautious trust. Facebook and Google have long since lost it, and I can't even imagine what they might be able to do to ever earn it back.

When Apple starts building the privacy-related reputation that Google and Facebook have, I'll be all over the hate train. As it stands, as someone who believes he really cares about digital privacy for himself and others, I never thought about the fact that a Siri recording might be listened to by humans—but now, in retrospect, of course it was (I would have wanted to listen to recordings to grade my own software's responses), and of course it should have been opt-in to begin with.
