
Spotify GDPR data export: user receives 250MB containing every interaction - juliand
https://twitter.com/steipete/status/1025024813889478656
======
tedeh
What grand times we live in, where you can actually get this kind of data from
the services that you use.

Having the law say your personal data is owned by you and not some company
just because it's on their server may turn out to be a landmark in consumer
friendly legislation!

~~~
tensor
Frankly, I think a lot of this data isn't the user's, but rather Spotify's. If
Spotify didn't exist then the interaction data with it wouldn't exist. I don't
see how it can possibly be "owned" only by the user here. Does a user "own"
security footage in a store that they enter? Definitely not.

~~~
sbr464
I’m upvoting and agree with the realistic point you are making, but feel this
isn’t the most popular point of view right now? I personally believe info just
shouldn’t be captured period, beyond reasons for authentication protection
purposes/identifying malicious/off pattern use of my login/auth token. We are
releasing a new business/info mgmt product soon that has no GA/full story/user
tracking whatsoever. It’s not clear why everyone enables a floodgate of
tracking just ‘cause. We are implementing better feedback/reporting channels
instead.

~~~
fipple
One thing that people constantly praise Spotify for is the quality of their
music recommendations. You can find so many testimonials of people saying that
their "Discover Weekly" playlist suggested them songs which they felt they had
been searching for their whole lives.

Needless to say, such accurate recommendations of music, which even close
personal friends have trouble making, are based on building as complete a
psychological profile of the user as possible. There wouldn't be any way to do
it without gathering lots of information about your very personality.

Surveillance isn't always bad -- after all, what is a baby monitor?

~~~
logifail
At some level, I think one _should_ have privacy concerns with baby monitors
... did your baby choose to opt-in?

(Three kids here, and no, we never used a baby monitor...)

~~~
fgonzag
I hope the sarcasm is flying over my head here...

------
menacingly
I enjoy the mental exercise of finding where boundaries lie.

For instance, if you simply observe the actions people take when they talk to
you, that's obviously your observation. If you were to, say, journal it, it's
still yours. It's a weird thing to do, but it's yours.

If you used the journal to optimize yourself, perhaps to make conversation
with you more enjoyable, again, that's weird, but perhaps also merely a paper
version of what already goes on inside your head.

What if talking to you were really enjoyable, so that while people could
technically avoid it, they usually didn't want to?

At what magnitude does the volume of people you're observing reach a scale
where the people you're observing start to believe your observations are
theirs?

~~~
tjoff
It's a huge difference when a company does it rather than an individual (as in
your example).

~~~
menacingly
It's a convenient difference, but I don't think it actually impacts anything.

It's an indirect way to address the level of resources you have at your
disposal, which is itself only important for the scale at which you can
capture the data.

In general, for the things people are OK with citizens doing but not OK with
corporations doing, they mean an individual could not do it at a scale that
bothers them. I'm specifically curious about what that scale is.

Certainly for me, there exists a hypothetical scale at which an individual
gathering and recording detailed observations of other people becomes a little
unsettling. Perhaps not criminal, but it falls into a "wish it didn't happen"
bucket.

------
valgaze
Impressive--
[https://twitter.com/steipete/status/1025029133175336960](https://twitter.com/steipete/status/1025029133175336960)

"They even store the brand of headphone I use. How do you even get that data,
digging deep in CoreBluetooth?"

~~~
m45t3r
If you think about it, now it makes sense why big names in the smartphone
industry like Apple and Samsung are removing P2 plugs from smartphones in favor
of more powerful interfaces like Lightning/USB-C: so you can track more
information about the user.

Just imagine: you can track which kind of phone a user that likes to listen to
Heavy Metal, for example, likes to use, or which phone is more popular at the
moment. Based on this you can develop phones that are more likely to sell or
use specific marketing campaigns depending on the kind of music a person
listens to.

~~~
jdietrich
There's a much more mundane explanation - waterproofing. Lightning and USB-C
connectors can both be made intrinsically waterproof up to IPx7, while the
3.5mm jack can't. Waterproofing is a key point of differentiation for recent
flagship phones. An iPhone 7 will survive a dip in a toilet bowl or a pint of
beer, but an iPhone 6 probably won't.

~~~
detaro
So all the other phone manufacturers selling IP68 phones with headphone jacks
are lying?

~~~
_r_o_y_
I had an Xperia Z5 and every time water touched the headphone jack the phone
would go crazy thinking that I was plugging and unplugging something
repeatedly.

~~~
mhh__
On the assumption that the phone itself is fine, that could be fixed in
software.

------
blaerk
It's kind of weird (and worrying tbh) that the user doesn't get _all_ the data
by default. Shouldn't all the data be sent upon request? Is there a clause
saying 'only after nagging will the TRUE data be sent'?

~~~
amarkov
There's a sense in which summary views are the real data. If I asked Spotify
to share my data, and they just sent me a 250 MB file of every interaction
they've ever recorded, I would conclude they're trying to obfuscate which data
they actually use and how they use it.

~~~
patmcguire
Yeah. If Netflix sends me every byte I've ever viewed, that's pretty useless.

~~~
cwkoss
Exhaustive list of your data (sorted, redundant copies not included):

0x00

0x01

0x02

0x03

...

0xFE

0xFF

~~~
adtac
I think just 0 and 1 would suffice ;)

------
_4xjr
Here is a template for whoever else wants to request their data:

[https://www.dropbox.com/s/fx5yyrru1uvx6no/sarletter.txt?dl=0](https://www.dropbox.com/s/fx5yyrru1uvx6no/sarletter.txt?dl=0)

Source (@mikarv on Twitter):

[https://twitter.com/mikarv/status/1012386696934182912?ref_sr...](https://twitter.com/mikarv/status/1012386696934182912?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1013539192805064704&ref_url=http%3A%2F%2Fdancemusicnw.com%2Fspotify-gdpr-data-exports-user-tracking%2F)

------
wahlis
If Spotify didn't give you all data with your first request I guess that they
are in breach of GDPR?

~~~
mgiannopoulos
Yes, you would then have a basis to file a complaint with your (within the EU)
country’s data privacy authority.

------
rhcom2
Has anyone tried this request in the US? I'm assuming they would just tell you
to politely shove it?

~~~
Symbiote
They are a Swedish company.

The British regulator considers the rules to apply to all people, not just
European citizens, so I think the Swedish would have the same opinion.

In general, Europe has rights that apply to everyone, regardless of
citizenship. This makes a difference when it comes to searches at the border,
drone strikes, refugees and so on under the European Convention on Human
Rights.

This should become a selling point for EU businesses.

------
bhauer
Remember all of the data Winamp2 used to gather and send to third-party
servers?

~~~
starsinspace
It's weird how perception on these things has shifted. So many practices are
"normal" today which used to be clearly labeled "spyware" only 15 years ago.
They successfully rebranded spyware, now it's called "telemetry", or similar.

Anyone remember the huge privacy-related outrage when Windows XP came out,
because it forced users to do challenge-response activation? How times have
changed...

~~~
Avamander
It might just be that all those people have switched to free software and
aren't being vocal because of that?

------
zachruss92
While this is an extreme example, I'm not the least bit concerned about
Spotify collecting this data. This data is likely used by Spotify to
understand user behavior to improve user experience and improve their
recommendation engine, or to simply understand how users interact with the
app. I was delighted to find that Spotify sends me concert notifications for
the bands I listen to the most. Personally, as long as they are not sharing
this data with others without my permission they can collect this info. All of
this is clearly stated in their privacy policy (including the bit about
Bluetooth):
[https://www.spotify.com/is/legal/privacy-policy-update/#s5](https://www.spotify.com/is/legal/privacy-policy-update/#s5).

~~~
KenanSulayman
> "this enables us to access your GPS or Bluetooth to provide location-aware
> functionality"

They only mention Bluetooth in the context of acquiring location data..

------
a-dub
I have a friend that always used to joke that if Spotify ever failed and got
sold off to private equity, they would shift to a business model where they
mine for embarrassing music and behaviors and then extort people to keep quiet
about it.

------
MikeKusold
I'd be interested in knowing if he is a paid subscriber or not.

I understand that Spotify needs data to power Discover Weekly, but I'm not
sure I'm comfortable with this amount of data.

------
maym86
This is great. I wonder if the EU can write a law to allow us to see what
adverts are being served to which audiences on platforms like Facebook and
Google. More transparency please.

------
wiremine
> Having the law say your personal data is owned by you and not some company
> just because it's on their server may turn out to be a landmark in consumer
> friendly legislation!

It's interesting to think where this goes in the future. Can I demand my
purchase history from physical McDonalds restaurants at some point? Why limit
it to internet-related interactions? (Or maybe this is already included in
GDPR and I'm just not aware?)

~~~
Symbiote
You can already demand this data from McDonalds, if they retain it.

You've been able to for about 20 years, in some form with the previous laws.

(More commonly, you could demand the paper records your employer has about
you.)

------
callesgg
The main issue as I see it is that he did not get data when he asked. He had
to complain.

------
swiley
Man I was comfortable running spotify as the only non-free app on my Linux
machines, now I'm not.

It's back to ocp and mods/classical music/occasional purchases for me I
guess... (except on my phone of course which is a lost cause)

~~~
mmanfrin
Why are you uncomfortable with Spotify now?

~~~
Avamander
Because it's a piece of proprietary software?

------
chiefalchemist
Is this data collected by Spotify, or __all__ data Spotify has for a user?
That is, it's certainly possible for any given service to gather / aggregate
data from sources other than itself.

------
Sujan
Has someone started a list of GDPR data export requests and their results? I
wonder what interesting information is out there...

------
idbehold
Seems like CSV would've been a better format than JSON for this type of data
based on the screenshots.

~~~
mirimir
Indeed. Your typical data requester isn't going to know code for working with
JSON. And converting JSON to CSV is a pain.

~~~
oxymoron
To be fair, GDPR stipulates only that it should be available in a common
machine-readable format. It doesn’t require the most convenient format
conceivable.

Also, CSV can’t easily handle nested objects. If the data model is even
slightly more complex than a plain table, it doesn’t make much sense. I’d also
argue that even if the source data is stored in an RDBMS without exotic data
types, a JSON with a nested object representation is probably going to be more
friendly even to non-developers than multiple files with opaque foreign keys
linking back and forth.
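
To make the nesting point concrete, here is a minimal Python sketch. The record and its field names are invented for illustration and are not taken from Spotify's actual export. Nested objects can be collapsed into dotted column names, but a nested list has to be split into a second table with a repeated key, which is exactly the "multiple files with foreign keys" situation.

```python
import csv
import io
import json

# Hypothetical export record; the field names are invented for illustration.
record = json.loads("""
{
  "user": "example",
  "device": {"type": "phone", "headphones": {"brand": "ExampleBrand"}},
  "plays": [
    {"track": "Song A", "ms_played": 1000},
    {"track": "Song B", "ms_played": 2500}
  ]
}
""")


def flatten(obj, prefix=""):
    """Collapse nested dicts into one flat dict with dotted keys.

    Non-dict values (including lists) are copied through unchanged,
    because a list of objects has no natural place in a single CSV row.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def to_csv(rows):
    """Write a list of flat dicts to a CSV string with sorted columns."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# The nested "plays" list becomes its own table, with the parent's
# "user" value repeated on every row as a foreign key.
plays = [{"user": record["user"], **flatten(p)} for p in record["plays"]]
profile = flatten({k: v for k, v in record.items() if k != "plays"})

profile_csv = to_csv([profile])
plays_csv = to_csv(plays)
```

One JSON object already needs two CSV files here; each further level of nested lists would add another.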

~~~
mirimir
Sure, simple JSON you can view in browsers.

But with CSV you can just use spreadsheets. Are there n00b-friendly apps based
on R, Python, etc?

And can't you always convert JSON to multiple CSV files?

~~~
PeterisP
Only if you accept a potentially unlimited number of CSV files/sheets. Many
forms of data aren't really easily normalizable to a limited number of flat
tables without losing information.

~~~
mirimir
OK, then. How would someone who's not technically sophisticated interpret such
JSON?

I suppose that some service could handle it. But then there's another level of
trust and GDPR compliance.

~~~
PeterisP
Open it in any decent plaintext editor/viewer, it'll likely have support for
'prettyprinting' json, and it'll be readable.
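
For anyone whose editor lacks that feature, a rough sketch of the same thing with Python's standard `json` module follows. The sample string and its field names are made up for illustration.

```python
import json

# Hypothetical one-line export fragment; the field names are invented.
raw = '{"user":"example","device":{"type":"phone","headphones":{"brand":"ExampleBrand"}}}'

# Parse the compact text, then re-serialize it with indentation and
# alphabetically sorted keys so the structure is visible at a glance.
pretty = json.dumps(json.loads(raw), indent=2, sort_keys=True)
print(pretty)
```

The standard library also ships a command-line form, `python -m json.tool`, which does the same for a file.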

If you want to 'do stuff' with the data, then JSON is a very (IMHO most, but
your mileage may vary) reasonable format; unless that data really is just a
single flat table, it would be hard to blame them for picking this format.

------
velcrovan
OK, I’ve been wondering if I could get this data. Now I want mine.

------
MYEUHD
If something is free, you are the product being sold...

------
brian_herman
Meh, you are the product; Spotify is free!

~~~
mikeash
Spotify has over 70 million paid subscribers.

~~~
cuckcuckspruce
So they're paying to be tracked. Nice gig if you can get it, Spotify!

------
mikeash
Confusing title. I thought there was some horrible bug that sent a 250MB file
every time you clicked something! After reading the tweet I realize it’s
saying the 250MB file _contains_ every interaction.

------
glitchc
Instead of "with", please use "containing" in the title.

~~~
aprao
Agreed. I couldn't understand why every single interaction would generate
250MB of data!

