
Amazon appears to be tracking every tap on Kindle - CrankyBear
https://twitter.com/adrjeffries/status/1222277544730337280
======
Kuiper
One of the selling points for Kindle is that if you switch devices partway
through (e.g. switch from reading on your tablet to reading on your phone, or
switch from reading the ebook to listening to the audiobook in your car), it
remembers what page you're on, so you can resume exactly where you left off.

Amazon actively touts this "Whispersync" feature in their marketing. (From the
Kindle product page: " _With Whispersync, switch from Kindle to the Kindle app
without losing your place (requires Wi-Fi)._ ") One would presume that Amazon
achieves this by tracking whenever readers tap the screen to advance to the
next page. (And having a timestamp for that tap matters for resolving merge
conflicts.)
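
A minimal sketch of how a timestamp could settle such a conflict, assuming a plain last-writer-wins rule. The `Position` record and `merge_positions` function are hypothetical names for illustration; this is not a description of how Whispersync actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """A reading position reported by one device (hypothetical record)."""
    device: str
    location: int      # offset into the book's text, not a page number
    timestamp: float   # seconds since epoch when the reader reached it


def merge_positions(reports):
    """Last-writer-wins: resume from the most recently recorded position."""
    if not reports:
        return None
    return max(reports, key=lambda p: p.timestamp)


# The tablet was offline and uploads late, but its reading happened earlier,
# so the phone's newer position wins regardless of upload order.
reports = [
    Position("phone", location=5200, timestamp=1_580_000_600.0),
    Position("tablet", location=4100, timestamp=1_580_000_000.0),
]
resume = merge_positions(reports)
print(resume.device, resume.location)
```

Note that this rule only needs the timestamp of the newest position per device, not of every tap.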

Also worth noting that in the case of Kindle Unlimited (Amazon's "Netflix for
ebooks" program), authors get paid per page read. (If a person reads the first
5 pages of your book and drops it, the author gets paid less than if they read
the whole thing.) One of the things that Amazon has to deal with is fraud
prevention, to detect when authors are finding ways to game metrics:
[https://techcrunch.com/2018/06/11/notorious-kindle-unlimited...](https://techcrunch.com/2018/06/11/notorious-kindle-unlimited-abuser-has-been-booted-from-the-bookstore/)

~~~
rtkwe
There's a less invasive way to do that though by storing the latest location
for each device. You don't need to store each and every page turn for the page
sync feature.
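
To make the distinction concrete, here is a rough sketch of the two storage models, assuming the sync feature only needs the newest location per device and book. `TapLog` and `LatestPositionStore` are made-up names, not anything from the Kindle software.

```python
class TapLog:
    """Append-only model: one row per page turn."""

    def __init__(self):
        self.rows = []

    def record(self, device, book, location, timestamp):
        self.rows.append((device, book, location, timestamp))


class LatestPositionStore:
    """Overwrite model: one entry per (device, book) pair."""

    def __init__(self):
        self.latest = {}

    def record(self, device, book, location, timestamp):
        key = (device, book)
        current = self.latest.get(key)
        # Keep only the newest report; older or out-of-order ones are dropped.
        if current is None or timestamp >= current[1]:
            self.latest[key] = (location, timestamp)


log, store = TapLog(), LatestPositionStore()
for turn in range(300):  # 300 page turns in one book on one device
    log.record("kindle", "book-y", location=turn * 40, timestamp=float(turn))
    store.record("kindle", "book-y", location=turn * 40, timestamp=float(turn))

print(len(log.rows), len(store.latest))  # 300 rows versus 1 entry
```

Both models can answer "where should this reader resume?", but only the first one keeps a history of reading behavior.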

~~~
tqi
How is "tapped on page 11" different from "is on page 12"?

~~~
alpaca128
90k rows of "tapped on page X" is definitely different from "is on page 12 in
book Y" with exactly one entry per book.

~~~
tqi
"Is on page 12 in book Y" at what time? If you send an update every time
position changes, then there is no difference.

~~~
rtkwe
There's a difference between storing every single tap and just storing the
latest position for each book and device pair. And the Kindle doesn't update
Whispersync on every page tap anyway; it syncs periodically, so the Kindle is
storing that tap info and sending it all. This isn't just a side effect of
logging the data it gets sent for updating page position. [0]

[0] I think Kindle on Android is the worst for this. Sometimes I don't get the
position synced to my Kindle even 30min+ after leaving the Kindle app. Seems
like the way to guarantee the server gets updated is to either exit the app or
to return to the library.

~~~
Godel_unicode
There's a "sync" button. It's in the top nav on physical Kindles and the
hamburger menu everywhere(?) else.

------
wiredone
...Just wait until they hear about services like Google Analytics, and
MixPanel.

Jokes aside, I often find the excitement around "tracking" to be frustrating.
Like fears of global conspiracies and mind control, the idea that somewhere a
company's engineers are using this data to track your innermost thoughts is
crazy in practice. Instead they're using it as a way to diagnose a bug when
your ebook crashes or as a way to figure out how to make sure you're not
getting stuck in a poorly designed UI.

~~~
braythwayt
It's not a dichotomy.

It can, for example, be that out of 1,000 companies tracking your every click,
your every character typed, and your every web site visited, 999 are doing it
for better bug tracking or feature development.

But the 1,000th company is Facebook.

So it feels reasonable for people to ask, "What are you tracking, how is it
used, and how can I be confident it won't be abused?"

~~~
noident
>What are you tracking, how is it used, and how can I be confident it won't be
abused?

Because of the nature of computer security, you cannot be certain that any
data that you give to a 3rd party will be secure. Once it leaves your machine,
it's out of your control, forever. Maybe an initially privacy friendly policy
vanishes once the company is bought out, or a data breach occurs and the data
is suddenly indefinitely in the public domain.

This doesn't just apply to Facebook, it applies to every single company
storing data. The only way to prevent it is to have a data retention policy and
invest heavily in security, which pretty much nobody does except for the
really big players.

------
WalterBright
When I read my Kindle, I turn off the wifi/cell. I only turn that on when I
want to look at the Amazon store.

I do this mostly to save battery life, but also so it doesn't track (but of
course it may save up these logs to transmit when I do turn the wifi/cell back
on).

The Kindle at least keeps track of my last read position in all my books.
Foxit only tracks for the last two pdfs. Ditto for every other reader I've
tried. I've had to resort to keeping a separate note of my place in the book.

~~~
delecti
> I do this mostly to save battery life, but also so it doesn't track (but of
> course it may save up these logs to transmit when I do turn the wifi/cell
> back on).

It absolutely does save up those logs. I worked on some of the code which
saves up the advertisement view metrics. Admittedly that was more than 2 years
ago, but knowing the team we handed that off to, I would be astonished if
that's changed.

~~~
WalterBright
Thanks for the confirmation.

------
jgmrequel
While I'm surprised at the granularity, this seems like it would be needed at
some point to generate read times and such. Kindle tells you how much time is
left in a book and this is either a WAG or something based on data.

~~~
sigwinch28
Yes, this information would be needed at some point to generate read times.

But why does it need to be calculated on Amazon's servers? AFAIK Kindles are
running a linux kernel with a lot of busybox, and calculating a running
average doesn't appear (to me) to be a particularly difficult calculation.

Perhaps it can be argued that this calculation uses battery, but so does
sending all of this telemetry to el Amazon.

What I'm saying here is that I think that we shouldn't concede privacy in
return for convenient little UI widgets, especially when the computing power
is available, cheaply, locally.
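
As a rough illustration of how little state a local estimate needs, here is a sketch of an incremental running average. `ReadingSpeedEstimator` and its "text units" (for example characters) are assumptions for the example; nothing here is known to match the Kindle's actual implementation.

```python
class ReadingSpeedEstimator:
    """Keeps a running average of seconds per text unit, entirely on the device."""

    def __init__(self):
        self.samples = 0
        self.avg_seconds_per_unit = 0.0

    def add_page_turn(self, units_read, seconds_spent):
        """Fold one page turn into the running average without storing it."""
        if units_read <= 0:
            return
        rate = seconds_spent / units_read
        self.samples += 1
        # Incremental mean: new_avg = old_avg + (x - old_avg) / n
        self.avg_seconds_per_unit += (rate - self.avg_seconds_per_unit) / self.samples

    def minutes_left(self, units_remaining):
        """Estimated minutes needed to read the remaining text."""
        return units_remaining * self.avg_seconds_per_unit / 60.0


est = ReadingSpeedEstimator()
est.add_page_turn(units_read=300, seconds_spent=60)
est.add_page_turn(units_read=300, seconds_spent=90)
print(round(est.minutes_left(12_000)))  # about 50 minutes
```

Only two numbers persist between page turns, so no per-tap log has to leave the device for this particular feature.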

~~~
freepor
I think your Kindle use case is single device but mine is across 4 devices so
doing anything fully locally doesn’t really work. Unless each device would
need to be trained.

~~~
sigwinch28
I hadn't considered multiple devices, but don't we also have to factor in
possible different text sizes on these devices, too?

Also, slightly OT: has Amazon ever said whether that reading time is
calculated locally or on their servers?

~~~
alpaca128
> don't we also have to factor in possible different text sizes on these
> devices, too?

The kindle tracks progress via the actual amount of text read instead of
pages, so the screen and text size should be irrelevant. It can still be
switched to display the page number, but that is also independent of the
amount of text on the screen (meaning it doesn't necessarily increase with
every swipe to the next "page").
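
A small sketch of that idea, under the simplifying assumption that each device fits a fixed number of characters per screen. The function names and numbers are illustrative only.

```python
def page_for_offset(char_offset, chars_per_screen):
    """Return the 1-based screen ("page") that contains a given text offset."""
    if chars_per_screen <= 0:
        raise ValueError("chars_per_screen must be positive")
    return char_offset // chars_per_screen + 1


def percent_read(char_offset, total_chars):
    """Progress through the book, independent of how the text is paginated."""
    return 100.0 * char_offset / total_chars


offset = 45_000   # the same synced offset on both devices
total = 300_000
print(page_for_offset(offset, chars_per_screen=900))    # small phone screen
print(page_for_offset(offset, chars_per_screen=2_000))  # larger tablet screen
print(percent_read(offset, total))
```

The same offset lands on a different screen number on each device, while the percentage read stays the same.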

~~~
sigwinch28
>The kindle tracks progress via the actual amount of text read instead of
pages

This doesn't seem to me like what the tweet is describing, wherein the kindle
is registering every tap instead of distinct "x words/chars progress made".

If they were capturing and storing "X words in fiction read in n seconds", I
could understand it, but they're not: they're registering every tap. I'd be
interested to see how this matches up with "userChangedTextSizeToBlah" data if
this is how they're calculating reading speed.

------
whateveracct
I've never met a PM who didn't want this stuff if it was feasible.

------
giosalinas
This is very common, everyone is doing it, and it's just for measuring UX and
understanding users' pain points, that's all, nobody is fixated on your
taps.

~~~
freeone3000
I really can't wait until "everyone is doing it" ceases to be an excuse.

~~~
frandroid
We need a word (like "Godwin's Law" or "whataboutism") to shove in people's
faces when they spam the response threads with "everyone is doing it" and "are
you surprised?" when a tweet or news item like "BigCo or BigGov is doing Bad
Thing X." No we're not surprised but we didn't know, I want them to stop, and
fuck you, your comment is not adding to the conversation.

~~~
mlyle
It's a valid argument, though. I'll put it in different words to show.

"This is useful for legitimate reasons and is an industry standard practice
--- even though it has rife potential for abuse, too."

~~~
freeone3000
It being a "standard industry practice" doesn't really affect the utility of
the trade-off. "Kindle tracks every page turn in order to determine how long
you have left in the book" would be a statement of the trade-off, and how much
abuse there is and the worthwhileness would be a potential topic of
discussion, but "industry standard practice" doesn't really have much bearing.
If it's wrong for one person to do it, it's wrong for everyone, even if all
people are doing it.

~~~
mlyle
> It being a "standard industry practice" doesn't really affect the utility of
> the trade-off.

It doesn't affect the utility of the trade-off, but all else being equal we
usually accept what exists now. Fighting what already de facto exists is hard--
either as lone consumers, where we're tilting at windmills to little
effect... or as regulators, where we risk unintended consequences.

> "Kindle tracks every page turn in order to determine how long you have left
> in the book" would be a statement of the trade-off

Nah, Kindle tracks all these actions so that developers can improve UX and
understand how the device is being used. Wonder how often people turn pages by
mistake? Look for quick pairs of page forward with page backward. Then maybe
you can think about touch sensitivity. How much will people be annoyed if
you remove the buttons? See what proportion of users exclusively, mostly, or
don't use buttons.

> If it's wrong for one person to do it, it's wrong for everyone, even if all
> people are doing it.

It's wrong for anyone to abuse the information. If we have an industry of
people using the information ethically, and then one bad actor misuses the
information, we need to consider that background. Do we take measures solely
against misuse, or do we attempt to stop the collection to have more certainty
in stopping the misuse?

------
berdon
Ex-Amazon employee from a while back (to which I only speak on behalf of my
own feelings and imply no other intent...). IIRC this was driven from an
analytics standpoint, wasn't actually really used outside of bug deep dives,
and was horrible to implement.

That's not to say it hasn't started to be abused but at the time it was
completely from a "customer/UX first" stance.

------
nedrocks
I'm certain this type of data is tracked on most devices. In an optimistic
outlook it drives a better customer experience because AMZN can capture trends
in data to discover things like "oh people change the page forward too
frequently accidentally" and track down root causes. In a pessimistic sense it
can help target you based on how long you spend reading specific content and
which content you highlight as a reader.

Generally every product I am aware of tracks interaction based data such as
where someone clicks or taps and what context they are in. Consider things
like `utm` parameters which suffix most links people click to determine the
context they clicked on something and what they clicked on.
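
For readers who haven't looked at these, a short sketch of pulling `utm_*` parameters out of a link with the standard library. The URL and its values are placeholders, and `extract_utm` is a hypothetical helper.

```python
from urllib.parse import urlparse, parse_qs


def extract_utm(url):
    """Return the utm_* query parameters of a URL as a plain dict."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {k: v[0] for k, v in query.items() if k.startswith("utm_")}


# example.com and the parameter values are placeholders.
link = (
    "https://example.com/book?id=42"
    "&utm_source=newsletter&utm_medium=email&utm_campaign=spring"
)
print(extract_utm(link))
```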

I do not see this as sinister. I imagine somewhere in settings one can turn
this feature off but I don't know for sure.

* Disclaimer: I am currently employed at a subsidiary of Amazon. These views are my own.

~~~
luckylion
> "oh people change the page forward too frequently accidentally"

Would you need to log every page turn with every book and time and date for
something like that though? Wouldn't that be a more specific event like "turns
page forward, turns page back within x seconds"? This sounds more like "we
don't know what we might use this data for, but it's better to have it and not
need it than to need it and not have it ... who knows, maybe we can deduce
some profile from knowing how quick the user read through that chapter in that
book" than legitimate use cases.
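
A rough sketch of what such a derived event could look like, assuming the client sees a time-ordered stream of page turns. The event names and the 2-second threshold are illustrative assumptions, not anything Amazon has documented.

```python
def count_accidental_turns(events, max_gap_seconds=2.0):
    """Count forward turns that are undone by a backward turn shortly after.

    `events` is a time-ordered list of (timestamp, direction) tuples, where
    direction is "forward" or "back".
    """
    count = 0
    # Compare each event with the one immediately following it.
    for (t1, d1), (t2, d2) in zip(events, events[1:]):
        if d1 == "forward" and d2 == "back" and (t2 - t1) <= max_gap_seconds:
            count += 1
    return count


events = [
    (0.0, "forward"),
    (41.0, "forward"),
    (41.8, "back"),      # undone within a second: likely a mis-tap
    (80.0, "forward"),
    (130.0, "back"),     # much later: reader probably went back to re-read
]
print(count_accidental_turns(events))  # 1
```

A client could report just this count, with no book title or per-page timestamps attached.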

~~~
nedrocks
This assumes the client can keep state in a more reasonable way than a server
can piece it together. Definitely a stateful event is more powerful but is
likely more lossy.

From what I've seen tracking simple events and then piecing them together en
masse tends to show up significantly more frequently.

~~~
luckylion
Sure, I mean, it's also easier because you don't have to know the questions
you might want answered.

Given the very private nature of the data ("he read Marx and Mao, and read
some sections carefully!"), vacuuming up as much as possible doesn't sound
like a good idea. Add to that the almost chronic inability of large
corporations to protect data, they really should start treating data
collection as a liability rather than an opportunity.

------
whatitdobooboo
Does this really matter? Maybe I'm missing something?

~~~
JohnFen
It matters to some people. Or, at least, it matters to me. The less data that
is being phoned home, the more comfortable and happier I am.

~~~
eclipxe
Why does this matter to you?

~~~
JohnFen
For a number of reasons, but the main one is what I said in my comment -- it's
important to me to minimize data leakage, even for relatively trivial data.

In this age of Big Data, when it's relatively inexpensive to weave together a
large number of small data points to come up with an overall profile that is
truly invasive, I have to consider every byte that is sent to be a risk, and
to be avoided when at all possible.

~~~
eclipxe
Thanks for answering. If you don't mind me asking, why is it important to you
to minimize data leakage? (Not trying to say it's wrong, I'd like to
understand your POV).

~~~
JohnFen
I think I answered that in my comment... my biggest concern is that I want as
little data about me as possible in databases.

There's also the question of autonomy. I actively resent data collection
without my informed consent. Companies that do this are, in my opinion, being
abusive and infringing on my right to autonomy and, to a degree, to have
control over aspects of my existence that matter to me.

~~~
eclipxe
Thank you, I understand.

------
noident
There should be an open e-ink device and ecosystem. Even if you go through the
trouble of flashing your kindle with a custom image, you're still stuck
signing into your Amazon account to borrow a book from the library.

Librarians will never give out your checkout history to anybody without a
national security letter or a warrant. Meanwhile, Amazon is probably passing
your reading records around to 50 different analysts and storing them in
databases where dozens or hundreds of engineers have access. When I go to
amazon.com, I see recommendations for other books similar to those that I've
read, including those that I didn't buy through their store.

~~~
Mediterraneo10
> Even if you go through the trouble of flashing your kindle with a custom
> image, you're still stuck signing into your Amazon account to borrow a book
> from the library.

You could always just get all your reading material from Library Genesis
instead. I have owned several Kindle models over the years, but I have never
had to interact with Amazon. I put the device in airplane mode the moment I
took it out of the box and kept it that way, and I have downloaded all my
reading material from LibGen (or a publisher who provides DRM-free ebook
files) and moved it to the device via USB.

------
jedieaston
Presumptively, they can’t do this with an unregistered Kindle, right?

Which wouldn’t be as convenient, but you could buy books from Amazon, and then
use Calibre to strip DRM and send to kindle over USB.

~~~
zozbot234
We should not be required to strip invasive DRM as a privacy-protecting
measure, that's ridiculous. Buy a physical, paper book, that's not going to
track you.

~~~
zaat
E-books do have some benefits. To list a few: they require much less weight
and space, they are searchable, they offer dictionary and they can be
delivered to you in seconds regardless of your location.

~~~
zozbot234
And I do take advantage of these benefits when I can do that easily, such as
with a plain, DRM-free format. I'm just not going to use invasive DRM-based
platforms that come with these blatant privacy concerns.

------
TheFiend7
This isn't necessarily sinister. I know UI/UX designers and a lot of them talk
about wanting a heatmap of clicks on a given webpage for experience reasons.
They want to know where and what you're clicking and how often so they can
optimize processes and flow.

It gives insight into how things are used and what's used most often etc.
Granted this isn't necessarily directly applicable to this case with the
kindle, but similar in concept.
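
For anyone unfamiliar with the idea, a click heatmap is essentially a 2D histogram. A minimal sketch, assuming clicks arrive as pixel coordinates; `click_heatmap` and the 100-pixel cell size are arbitrary choices for the example.

```python
from collections import Counter


def click_heatmap(clicks, cell_size=100):
    """Bin (x, y) click coordinates into square grid cells and count per cell."""
    counts = Counter()
    for x, y in clicks:
        counts[(x // cell_size, y // cell_size)] += 1
    return counts


clicks = [(12, 30), (95, 80), (130, 40), (510, 720), (505, 790), (520, 700)]
heat = click_heatmap(clicks)
print(heat.most_common(1))  # the busiest cell and its click count
```

The counts carry no user identifier, which is the aggregated form a designer needs for this purpose.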

~~~
stefan_
Facebook receiving purchase events from Dominos with the toppings you ordered
isn't necessarily sinister. They just want to know what you are eating!

Users are not your experimental group. This attitude should have died years
ago, at the latest with the GDPR.

~~~
TheFiend7
Very different concepts actually.

One is optimizing/improving experience based on anonymous data, the other is
building user profiles for targeted ads.

Mapbox is a fantastic example of mass data aggregation of users that has been
anonymized.

EDIT: Why should we do clinical studies on medicines when that could be
invasive to a person's privacy collecting such personal information? Is it
necessarily wrong? Tools can be used for good and evil, that's the problem
here, not the tool itself.

~~~
pseudalopex
The Kindle data is tied to the account. And who says Amazon isn't using it to
target ads?

Clinical trials require informed consent and institutional review. Regulating
software telemetry like other human subject research would make privacy
advocates very happy.

~~~
TheFiend7
We actually agree with each other so I'm struggling to see what your point is
that contradicts mine.

I explicitly said what I'm talking about isn't necessarily directly applicable
to amazon kindle. I also agreed with you regarding software telemetry being
used responsibly and irresponsibly.

So I'll repeat I don't see where we disagree.

------
dragonsngoblins
What I want is an ereader with symmetrical page turn buttons. I can't be the
only person who switches which hand they are holding their book in and doesn't
want to have to rotate the book 180 degrees and wait for the page to flip when
doing so. Surely changing hands is a common use case.

I'd not worry about tracking if the experience is good enough. Please someone
make sane controls for these devices again

------
cryptozeus
Why is this a surprise? They do show advertisements so how do you think they
gather targeting data?

------
blackbrokkoli
This is a great example, one could almost say the epitome, of what is wrong
with modern product design.

Gigawatts of electricity are used to run sophisticated neural nets, advanced
data pipelines built to funnel all that data to the mother ship and thousands
of dev hours went into this elaborate tracking mechanism. You need that of
course, "to improve UX". I do not have a problem with that for such a device,
per se. Fair enough, go real overkill on your "user research" then.

And yet, so many UX aspects of a Kindle are just plain bad. Problems which have
varying degrees of complexity, to be fair. But some of them are so trivial yet
impactful that it is hard to imagine they would escape the attention of _a
single_ UX engineer worth his salt looking at a Kindle for two days. Some
examples:

* I have a cheap Kindle, which means the lockscreen shows ads. They are so hilariously anti-personalized I can't even. Like, you have all this Big Data and you think I will ever buy a run-of-the-mill cliche romance?

* The recommendation system itself. When you "start out" on your Kindle, your recommendations are literally just every single other book the few authors you read have ever written.

* Having airplane mode off seems to drain the battery massively even if I am not connected to WiFi/Bluetooth nor actively attempting to

* One can view interesting usage statistics, but only if you declare yourself as your own child and activate a password based content lock

* I have to manually flip through sometimes dozens of pages of imprint, one-line-per-page copyright notices and so on until the Kindle realizes I have indeed "read" the book so it stops displaying it at the top at "99% progress"

What I want to say: Do all the Edge AI and IOT and Orwellian Surveillance for
all I care. But maybe fix the boring old low hanging fruits first?

------
c0restraint
Wouldn't they need to, in order to provide this functionality? I use several
devices to read Kindle books and it synchronizes my state across devices
(latest page turned across each device). The info in this screenshot looks to
pertain to that.

~~~
luckylion
Given that kindle devices have different sizes (and usually different font
sizes), can be in portrait and landscape mode etc, is "page" even meaningful
for cross-device synchronization? Sounds like syncing the text position would
be closer to it, not "has advanced a page (whatever page might be)".

------
klysm
I’d be somewhat surprised if they weren’t. Deep, invasive telemetry is widely
practiced

------
shkkmo
What I'd love is a small, bug free e-reader, ideally waterproof, that I can
load my own books onto and doesn't track me. Unfortunately such a product
doesn't seem to exist.

~~~
Nition
A Kindle is one of those (barring the waterproof) if you keep the wifi off and
transfer your books via USB from your PC with Calibre.

~~~
shkkmo
Kindles are usually pretty limited in terms of the formats they work with
since they really only want you to buy books through their store in their
formats.

~~~
zozbot234
Calibre takes care of that.

------
tanilama
Is this surprising? Sounds like a very natural approach to logging.

------
scottmcleod
Why is this even surprising? What kind of product team doesn't track user
behavior for improving the product or supporting other features?

------
hef19898
Kind of unrelated, but I did discover some, say, strange behavior of Alexa
lately. See, I don't use Amazon music on my phone but Google Music. My wife and
kids listen to Amazon Music on Alexa. We also have _very_ different music
tastes. I was kind of surprised that when my wife asked Alexa to play music
she likes Alexa started to play stuff like Five Finger Death Punch.

I do have the Alexa app installed on my phone so.

------
peripitea
Putting aside whether this is ok for a second: This should not be a surprise
to anyone. Of course they are doing this. I would almost argue that it would
be foolish of them not to collect this data. The lifetime expected value of
the data is so much higher than the marginal cost of transferring and storing
it.

I can't think how many times on the product side of services I've run that
we've been grateful to have this kind of data. Sometimes it's for obvious
reasons that we always knew, but often it's for suddenly crucial reasons that
we never could have anticipated when we first started collecting the data.

Again, that doesn't mean it's OK and I strongly support GDPR-style data
privacy legislation in the US. But in the meantime I guarantee you that just
about every service you use is gathering data like this and a whole bunch
more.

