
Siri, Privacy, and Trust - baxtr
https://daringfireball.net/2019/08/siri_privacy_trust
======
mightybyte
I still categorically refuse to enable Siri or any other voice-activated
technology. And yep... this confirms that was the right decision.

I wouldn't be surprised if those recordings also include classified
conversations. Some extremely high levels of classification require you to go
into sound isolated rooms and leave your phones outside, but there are still
plenty of levels of classified information where that is not a requirement.

------
danShumway
I think people are also overestimating how obvious this was in hindsight.
Quick experiment: we know that YouTube recommendations are based on your watch
history. And it seems reasonable to assume that humans could see part of that.
But how much can they see?

Watch history seems like something that would be difficult to analyze in tiny
chunks; you'd really want your researchers to see large periods of watch
history. So maybe the anonymization is, "here's a month's worth of videos, and
it's linked to an ID, but not a username."

 _Without looking it up online_ , for anyone who wants to take a guess, how do
people think that Google's YouTube algorithm training works for
recommendations? How much of your full watch history do you think a contractor
can see at a time? What anonymization efforts do you think Google uses? Or do
you think it's mostly automated based on automatic classification?

If the Verge or Motherboard put out an article tomorrow that said that
contractors could see your entire watch history, would you be surprised? If an
article came out tomorrow and said that to help train email classification,
human contractors were reading Gmail messages, would that surprise anyone
here, or is that something we collectively expect already?

------
ghaff
I'm somewhat of two minds about this.

On the one hand, it's hard to really disagree with Gruber (and his quote from
Steve Jobs). Being fully respectful of privacy means being as fully,
transparently, and granularly opt-in as possible--in plain English.

On the other hand, doing so will almost certainly slow progress on a voice
assistant and make it less generally useful relative to other assistants that
don't have such protections in place (all other things being equal). Even if
some opt-in fully, it probably won't be sufficient to make up for the many who
don't.

Is making everything opt-in the right tradeoff? Probably. We do it for program
crash information after all and (I guess?) that's ended up being good enough.
And, honestly, if voice assistants don't progress quite as quickly as they
might in a privacy oblivious world, it's not the end of the world.

~~~
traderjane
In tension with the motivations of their App Store: the degree to which
Apple's own services and apps fail to persuasively satisfy is the degree to
which their users will be tempted to leak data to third parties.

~~~
ghaff
At some level, tensions make markets. I know it's not as black and white but,
to the degree that Apple establishes themselves as the company that
prioritizes trust relative to other companies that don't--but which have
assistants, etc. that are incrementally better--that's for users to choose.
And different users will choose differently, as they should be able to.

One can debate whether one or the other is the better _business_ decision but
that's a separate discussion.

------
mikestew
In light of the quote from Jobs that Gruber has at the bottom, I would have
been fine with it had I known up front. But you pulled a fast one on me,
Apple, and that is explicitly what Jobs was trying to prevent in saying that.
On top of that, you made me feel foolish for the general trust I had
previously placed in Apple to respect my privacy. _That_ , once lost, is
difficult to get back.

I’m not throwing my household full of Apple shit in the dumpster and rage
quitting, but I’m quite disappointed in you, Apple. Most disappointing is that
I now have to keep an eye on you.

------
Terretta
Gruber is acting surprised. This isn’t a surprise. Here, from 2017:

 _Siri records your queries too, but she doesn’t catalog them or provide
access to the running list of requests. You can’t listen to your history of
Siri interactions in Apple’s app universe._

 _While Apple logs and stores Siri queries, they’re tied to a random string of
numbers for each user instead of an Apple ID or email address. Apple deletes
the association between those queries and those numerical codes after six
months. Your Amazon and Google histories, on the other hand, stay there until
you decide to delete them._

[http://themillenniumreport.com/2017/03/not-only-are-alexa-si...](http://themillenniumreport.com/2017/03/not-only-are-alexa-siri-echo-home-listening-to-everything-you-say/)

From Wired, “Apple finally reveals how long Siri keeps your data”, in 2013:

 _Once the voice recording is six months old, Apple "disassociates" your user
number from the clip, deleting the number from the voice file. But it keeps
these disassociated files for up to 18 more months for testing and product
improvement purposes._

 _" Apple may keep anonymized Siri data for up to two years," Muller says "If
a user turns Siri off, both identifiers are deleted immediately along with any
associated data."_

[https://www.wired.com/2013/04/siri-two-years/](https://www.wired.com/2013/04/siri-two-years/)
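
Read literally, the two quotes describe a fairly mechanical retention
pipeline: queries keyed to a random user number rather than an Apple ID, the
number stripped after six months, anonymized clips kept for up to 18 more
months, and everything deleted if Siri is turned off. A minimal sketch of what
such a policy could look like, purely as an illustration of the quoted
description (the names and exact durations are assumptions, not Apple's
implementation):

    from dataclasses import dataclass
    from datetime import datetime, timedelta
    from typing import Optional

    # Durations roughly matching the quoted policy (assumed for illustration).
    DISASSOCIATE_AFTER = timedelta(days=182)   # ~6 months: strip the random user number
    DELETE_AFTER = timedelta(days=182 + 548)   # ~18 more months of anonymized retention

    @dataclass
    class SiriClip:
        audio: bytes
        recorded_at: datetime
        user_number: Optional[str]  # random identifier, never the Apple ID or email

    def apply_retention(clips: list[SiriClip], now: datetime) -> list[SiriClip]:
        """Strip identifiers after six months; drop clips entirely after ~two years."""
        kept = []
        for clip in clips:
            age = now - clip.recorded_at
            if age >= DELETE_AFTER:
                continue                    # past the total retention window: delete
            if age >= DISASSOCIATE_AFTER:
                clip.user_number = None     # keep the audio for testing, "disassociated"
            kept.append(clip)
        return kept

    def siri_turned_off(clips: list[SiriClip], user_number: str) -> list[SiriClip]:
        """Per the quote: turning Siri off deletes the identifiers and associated data."""
        return [c for c in clips if c.user_number != user_number]

Note that the only thing removed at the six-month mark is the random token;
the audio itself, and whatever was said in it, sticks around.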

~~~
danShumway
Neither of those articles mentions that real people listen to the recordings:

The updated version of the millenniumreport article says:

> _Update: But what about the bigger pool of data, the aggregated voice
> queries from each system’s user base? According to this Bloomberg story,
> Amazon, Apple, Google, and Microsoft are using all that variety to hone
> these systems’ understanding of spoken language even further. Several times
> a day, Amazon uses the entire stack of Alexa queries to educate its A.I.
> about dialects and casual speech. Microsoft has mysterious fake
> apartments(!) set up to record and understand natural speech patterns.
> Google slices and dices the audio it’s already captured, then remixes it to
> help train its system. All these methods are meant to make your voice
> assistant smarter in the coming years._

Nothing in that paragraph would suggest to a normal user that actual human
beings listen to their recordings. In fact, it would suggest the opposite:
that in order to get around human review, Microsoft would go to the trouble of
setting up fake apartment buildings to do tests in. Amazon educates its AI
using the entire stack of queries? Well certainly, that's not happening with a
human being; it would be too labor-intensive.

This is in an article designed for non-technical users, one that takes the
time to explain much more genuinely obvious things like wake words:

> _Listening to what you say before a wake word is essential to the entire
> concept of wake words. The process borrows a page from the pre-buffer on
> many cameras’ burst modes, which capture a few frames before you press the
> shutter button. This just does it with your voice._
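
The buffering described there is essentially a rolling window: the device
keeps only the last second or so of audio in memory, continuously overwriting
it, and nothing is kept or sent anywhere until the wake word fires. A minimal
sketch of that idea (hypothetical names and sizes, not any vendor's actual
code):

    from collections import deque

    SAMPLE_RATE = 16_000        # audio samples per second (assumed)
    PRE_ROLL_SECONDS = 1.5      # how much audio to keep from before the wake word

    class WakeWordPreBuffer:
        """Rolling buffer: audio is continuously overwritten until the wake word fires."""

        def __init__(self):
            self.ring = deque(maxlen=int(SAMPLE_RATE * PRE_ROLL_SECONDS))

        def feed(self, samples, wake_word_detected: bool):
            self.ring.extend(samples)       # old samples silently fall off the back
            if wake_word_detected:
                captured = list(self.ring)  # only now is any audio handed onward
                self.ring.clear()
                return captured
            return None                     # otherwise nothing is kept or sent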

Yet when it gets to the "low-paid contractors will occasionally listen to you
having sex" part, suddenly everything is very vague.

Ordinary people did not understand this, and based on Gruber's article, a good
number of technical people did not understand either. And in light of the
quote at the end of this article, why should Gruber need to read Wired to
figure out what's happening on his devices? That's not a good user experience.

Yes, maybe you can argue that he should have understood what "for testing and
product improvement purposes" entailed. But he didn't. A lot of people don't
know how AI works, and they don't know how ML testing works, and they
shouldn't have to.

~~~
Terretta
> _Neither of those articles mention that real people listen to the
> recordings_

Recall that Siri was an acquisition.* If you go back to the media coverage
from then, you'll see many companies were doing this at the time. There was
lots of coverage about humans doing the training. Some companies even got in
trouble because it was _only_ humans, even the supposedly AI parts. (An email
transcription service, for example, had a human transcription center in
Egypt.) Point being, this was known, and seems to have been forgotten.

Even now, when any of them says "for product improvement purposes," it means
people reviewing for quality.

Anyone who has called any big company knows exactly what the phrase “for
quality improvement purposes” means. They tell you they’re going to be
recording your call for quality improvement purposes and while they don’t say
this part, you absolutely know it means there’s a chance someone is going to
listen to it.

Since that’s about the only time people ever hear the phrase not buried in
legalese, it could be assumed that’s what it means in other contexts as well.

* EDIT: Linking to a history of Siri since it’s been over a decade now: [https://www.bestaiassistant.com/siri/old-siri-history-siri-a...](https://www.bestaiassistant.com/siri/old-siri-history-siri-apple/)

~~~
danShumway
I don't disagree with you that it's reasonable to link those phrases to a real
person listening, or even that people _do_ link those phrases in other
contexts.

But they haven't linked them where AI is concerned. There could be a couple of
reasons for that -- it could be that phone recordings are _primarily_ human
analyzed, and because AI has more automatic learning, people just assumed it
was different. Or maybe it used to be obvious and then over time people just
assumed AI matured enough that it wasn't necessary any more. But regardless,
it seems pretty obvious to me that a lot of people (even smart people that I
respect) didn't understand.

So to me, talking about whether they _should_ have understood doesn't change
much -- I still think it would behoove a privacy-centric company to be a lot
more clear now that we know that people don't understand.

------
sod
Seems like at least one top-level executive at Apple doesn't breathe the same
privacy mindset that marketing made me believe the whole company operates by.
And I feel like an idiot for laughing at friends for using Alexa or Google
Assistant. At least they didn't pay a premium to be spied on at home.

~~~
giancarlostoro
I think it was a foolish oversight on Apple's part. I hope they take this as
an opportunity to reevaluate all their services and come forward with any
findings. That would be something. Sadly I fear they may just wait till the
next exposure to fix that one as well.

On another note I never use voice assistants. I just cannot be bothered to
relearn how to speak for them.

------
TazeTSchnitzel
It's clearly a misstep on Apple's part, but I think their intentions were at
least clearly benign. Note that Apple do not link your Siri information to
your Apple ID, they do not use it for any other purpose than improving Siri,
they keep it only for 6 months and you can delete it (and change your
identifier) at any time.

But they definitely should let people choose whether their data is stored to
help improve Siri, and whether it will be managed by humans.

~~~
majewsky
> Apple do not link your Siri information to your Apple ID, [...], they keep
> it only for 6 months

Are you sure about that? A quote in the article reads:

> [Apple] says it keeps recordings for six months before removing identifying
> information from a copy that it could keep for two years or more.

~~~
TazeTSchnitzel
Huh… the identifying information is at least not your Apple ID, but
unfortunately, because Siri needs to be able to recognise the names of your
contacts etc., it wouldn't be hard to reconstruct it.
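
As a toy illustration of that reconstruction risk (made-up transcripts, not
real data): even after the random identifier is removed, clips that mention
the same uncommon contact names cluster back together and point at one person.

    # Identifiers stripped, but the content still points at a specific person.
    anonymized_transcripts = [
        {"user_number": None, "text": "call Margit about the cabin keys"},
        {"user_number": None, "text": "text Margit that I am running late"},
    ]

    contact_names = {"Margit"}  # names Siri has to recognise to be useful

    def share_a_contact(t1: dict, t2: dict) -> bool:
        """Two 'anonymous' clips mentioning the same rare contact likely belong together."""
        return bool(set(t1["text"].split()) & set(t2["text"].split()) & contact_names)

    print(share_a_contact(*anonymized_transcripts))  # True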

------
kgwxd
As a general rule, if trust is required, I consider privacy surrendered,
especially when that other party is not legally required to protect the
information. Even if the people running a business today really aren't looking
at your data, the people running it tomorrow might.

------
jancsika
This looks like the same phenomenon as an investor continuously doubling down
as their investment tumbles, assuming that they'll be even richer at each step
once the investment rises back up.

Except here it is FAANG who just keep breaking the user's trust to get the
data, treating that as a necessary step on the road to some "El Dorado" system
that will have been trained on so much unethically collected data that the
collecting is finally done and it can at last become an ethical system.

Edit: clarification

------
andrerm
Paying or not, you're the product.

------
sc90
I'm curious to know how the contractors have handled any crimes or intent to
commit a crime encountered while reviewing the recordings. What's the policy
in such a case?

~~~
ghaff
One would have to ask Apple for their policy--and I doubt they would tell you.
Obligations would depend on state law [in the US]. My understanding is that
generally speaking you don't need to report a crime although, in some states,
there are exceptions for certain felonies [serious crimes--e.g. you witness a
murder].

------
m3kw9
Way overblown: none of the speech is identifiable back to the user, unlike
with Google, which needs to use it to target ads. He can get triggered quite
easily, judging from his political posts.

