Getting my personal data from Amazon was weeks of confusion and tedium (theintercept.com)
362 points by Ansil849 on March 27, 2022 | 182 comments



I'm one of the creators of YourDigitalRights.org, a service which automates the process of sending data requests (it's free, open source, and we're a registered charity). What is described in this article is, unfortunately, a common case with some big tech companies.

I've recently started an experiment to send data deletion requests to 600 data brokers and document what happens. It's dark patterns all the way down.

The solution is to escalate your request to the local data protection agency (the attorney general in the case of California). I believe that if enough of us do this it will make a difference, even in the case of Amazon.

Following this realization we've recently added an optional feature which will follow up with you some time after a request is made and, depending on the outcome, offer to automate the escalation process.


"I've recently started an experiment to send data deletion requests to 600 data brokers and document what happens."

Another idea for an experiment is to send 600 data deletion requests from 600 unique computer users simultaneously to a single data broker and see what happens. If the escalation process is automated when the data broker fails to respond, the most interesting results IMO will be from the data protection agency. It is difficult to ignore 600 cases. It also tests the broker's and agency's systems. In theory these systems should be able to scale. If they cannot, then it is arguable the broker and/or agency is making an assumption that privacy is something that only some people, a relatively small number, care about. At the very least there would be a question of whether these systems are adequate for what they are supposed to do.

This experiment might be thought of like a petition that requires a minimum number of signatures. What is the purpose behind having petitions and minimums for the number of people who sign them? Here, a minimum number of people must sign on to make a data deletion request before the bundle of requests is actually sent.


> I've recently started an experiment to send data deletion requests to 600 data brokers and document what happens. It's dark patterns all the way down.

I would love to read a long form piece on your findings!

It sounds like it would be a great way to advertise YourDigitalRights as well.


We're going to be speaking about this at Good Tech Fest 2022 [1], and will also write it up and post it to HN.

https://www.goodtechfest.com/good-tech-fest-2022


> The solution is to escalate your request to the local data protection agency

In many countries in Europe, you can expect a response within 30-90 days acknowledging receipt, and an actual response that they'll look into it and request a statement from the company after a year or so.

Sometimes you get lucky, but then only hear back that the company has now changed that one specific bad behavior (after profiting from it for a year or two) and thus the case is closed.

Zero meaningful enforcement.

Until NGOs start suing at scale, nothing will change.


In Europe the GDPR states that businesses have a single calendar month to respond to you, either with the data, instructions for how to easily get it, or a damn good reason why they couldn't comply in time.

It happened to me once that a business didn't respond in time, but after I sent them a follow-up pointing out that they were breaking the law by ignoring my request, they responded right away the next day.

I also only sent the requests to the biggest offenders, so the sample is small.

30-90 days is therefore possibly the period one should expect to wait for a response, even though anything beyond roughly 30 days is illegal.


I have been asking a broker to remove my data for weeks and they are giving me the runaround. However, I’m not in CA (another US state). Anything I can do?


Please send me an email with the details (it's on the website).


You're doing privacy a great service; your charity is awesome!

Do your services also work outside of the US, in, say, Canada?


Thank you! We are about to launch support for the Brazilian LGPD, and have 17 other regulations we want to support this year, including Canada.


What is the process of getting all my data collected by GOOGLE and MICROSOFT?



and APPLE?


and WALMART


Can we have a non-profit for this? I think one of the issues for getting tax-exempt status is designating a “charitable class” of people that it would be helping.


Um, they are a charity?


> Conscious Digital MTÜ is a registered Estonian non-profit organization number 80600079.

Ah they are, wonder about the US version

The tax deductibility for US tax residents would be a major incentive.


That's so great. I wish there was something like this for Germany.


A similar service for German users is Datenanfragen.de


We support the GDPR, so it will work in Germany.


When I went to the German site and tried to generate a data request, the generated email was in English. Is that intended? I think in Germany you're much more likely to get a response if you write in German.


Does the GDPR mandate usage of English? I know orgs that will let antispam eat any communication not in the local language.


I really do hope our politicians in Canada pass something like this


Perhaps not quite what you’re looking for but this may be of interest: https://accessmyinfo.ca


I had to click through more than 100 links to download all the data; how can this be acceptable? Especially coming from Amazon. How hard is it for them to create an archive with all the data? This is ridiculous. I can't imagine the meeting where they decided to purposefully produce such garbage UX.


Here's a picture of the UI for the download, with 123 different "Download" buttons. https://twitter.com/nelson/status/1503848290193862658

I did an Amazon download too, after Amazon's subsidiary Goodreads lost all 9 years of my data. I'm grateful for how the GDPR and the CCPA mandate that companies provide data downloads. Amazon is clearly doing the bare minimum to comply. Other companies do more; Twitter's data download comes with a fully working offline Javascript app for reading and searching your tweets!


I'm not sure "an archive with all the data" is what you really want either. At some point you'll hit the limit of how much data does your filesystem allow in one file. Not sure if your browser is that good at resuming failed downloads too.


Google Takeout lets you split into multiple zips based on size. Amazon's take on this is just willfully evil.


Exactly the problem I had.

It would take Amazon almost no effort to make a single archive with all those files in it.

I cannot help but view this as deliberate obstruction.


Pretty sure there are a gazillion browser extensions that can do that for you. Not ideal but hardly the end of the world.


Can't you open up the developer tools, use a CSS query to select all the buttons, and send a click event to them all?


Yes, it's doable in a one-liner, but then you get a warning from your browser, then 100 dialogs asking you to confirm saving each file; it's still terrible UX. And from my experience, most users are not aware of the dev console.
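For those who don't want to fight the browser prompts, the same idea can be scripted from outside the dev console entirely. A minimal sketch using Selenium in Python (the URL and CSS selector below are placeholders, not Amazon's actual markup; inspect the real page to find the right selector):

```python
# Sketch: click every download link on a page via a scripted browser.
# The URL and the selector are hypothetical placeholders.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://example.com/your-data-downloads")  # log in / navigate first

for link in driver.find_elements(By.CSS_SELECTOR, "a[download]"):
    link.click()
    time.sleep(1.0)  # crude pacing so the browser keeps up with each download

driver.quit()
```

A scripted browser saves into its profile's download directory, which (depending on browser settings) sidesteps most of the per-file confirmation dialogs.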

For others that may be interested in the data you get: it is quite detailed. I would recommend VisiData, which makes it really easy to navigate inside the zip files and the CSV files they contain (most often a single file each).


Yes, most commenters on HN could, but for the general public, it's not a great interface.


I run a part of the data request process at our company. This article is an example of people expecting anything technology-related to be magic.

We have to go through EVERY tech stack we own and look for that person's data. It's amazingly manual and tedious and takes about 6 people about an hour per request.

We're working to automate it, but needless to say we try not to broadcast it too broadly.

I hate that everyone jumps on any bad experience as a "dark pattern" when there's plenty of incompetence to share the blame.


The article clearly acknowledges this:

> Given Amazon’s obsession with speed and eliminating friction to foster faster consumerism, the dawdling data solicitation process seems like it just might be intentional, designed to dissuade requests. A far simpler explanation comes through an invocation of Hanlon’s razor, the old adage to “never attribute to malice that which is adequately explained by stupidity.” Amazon whistleblowers cited by Politico have said that the company “has a poor grasp of what data it has, where it is stored and who has access to it.” If that’s the case, then it stands to reason that it can take a month or more for Amazon to process a data request. As former Amazon chief information security officer Gary Gagnon succinctly put it in an interview with Reveal, “we have no fucking idea where our data is.”

However, there are many dark patterns here unrelated to how long it took:

- Repeatedly trying to direct users away from a data request to their "Your Account" page, which has a tiny fraction of your data

- 6+ different pages and many many clicks required to make a request

- Data divided into ~100 different downloads (how many more days would it have taken to make one zip?)

Also, Amazon is an ad company. They have well-designed APIs and tools for managing all this data because it's how they make money. When an advertiser accesses your data, they don't have to manually download 74 different zips.


I understand what you mean and I agree about everyone whining that the "sky is falling" but in my opinion, you shouldn't collect what you can't easily give me.


> I hate that everyone jumps on any bad experience as a "dark pattern" when there's plenty of incompetence to share the blame.

While I understand you, this is Amazon. It's laughable to think, for an organisation with the technology and resources of Amazon, that this is anything but laziness, "malicious compliance" or a deliberate "fuck you".

Being forced to click over a hundred download buttons to get the data I requested is not OK for a company Amazon's size, and it's not because they couldn't spare the resources to have someone write a few lines of code to archive those into a single tar.gz/zip and provide one button to click; it's deliberate.


It's not magic. It's expecting compliance with the law.

When you make a car, you need to add safety belts and lights.

When you control data, you need to have a way to provide the data to the person who the data is about.

You don't get to complain because your company decided that it's easier to only add the lights and safety belts when someone complains in a cumbersome manual process.


It's revealing how hard this stuff is when Google's Data Liberation Front needed 4 years to release Google Takeout – which I consider to be best-in-class for personal data access.


I disagree. Google Takeout is a sham. It doesn't have all the data they collect about you. It's almost adequate for data portability, but not quite. It's useless for data transparency.

Google Docs keeps keystroke-level logs of everything you type, for example. That's not in Takeout. Neither are things needed to conduct a security audit (that's a paid service for Workspace customers). Neither is a lot of advertising profiling data.


It is a hard problem, but the GDPR went into effect 3 years and 10 months ago. That date didn't come as a surprise, but was known 6 years ago. Anything newer than that should have taken data requests into account from the design stage. Anything older than that has had ample time to adjust. More than the 4 years you quote for Takeout!


> We have to go through EVERY tech stack we own and look for that person's data. It's amazingly manual and tedious and takes about 6 people about an hour per request.

The GDPR went into effect nearly 4 years ago. If you still haven't automated this by now...


Seriously, this is a direct result of Amazon failing to prioritize this. First off, blaming the customer for calling out the bad architecture is crazy. The customer has no say in the architecture; it's all on Amazon.


Ironically, automating it would mean that cross-data-silo joins were automated. This would be a net decrease of user privacy.


That is not necessarily true.


People in glass houses shouldn't throw stones. The author may want to read the privacy policy[0] for the site they are publishing their story on. They are collecting all sorts of data that they don't need to. And IANAL but apparently your rights to access the data they hold on you are restricted only to locations where they legally have to allow it.

[0] https://theintercept.com/privacy-policy/


A few highlights from the privacy policy:

> We may automatically collect information such as: Device connectivity and configuration data, including the type of device you’re using (computer, mobile phone, tablet, etc.), the operating system running on that device, the type and version of browser software you’re using, the resolution of your computer monitor or other device display, and the screen colors available on your device;

> We may obtain information about you from other sources, and combine such information with information we have collected about you through the Services. To the extent we combine such third-party sourced information with information we have collected about you on the Services, we will treat the combined information in accordance with this Privacy Policy. These third-party sources vary over time, but have included third-party donor partners and publicly available sources.

> We and our Service Providers may collect and store your information in log files as well as through the use of cookies, web beacons (also known as “tracking pixels”), embedded scripts, and other tracking technologies now and hereafter developed (“Tracking Technologies”).

> Do Not Track. Your browser or device may include “Do Not Track” functionality. At this time, we do not respond to Do Not Track signals.

> Our agents, vendors, consultants, and other service providers (“Service Providers”) may receive or be given access to your information in connection with their work on our behalf. These Service Providers are prohibited from using your information for any purpose other than to provide this assistance, although we may permit them to use aggregate information which does not identify you or de-identified data for other purposes.


The multiple download buttons are not a dark pattern to prevent you from downloading your data; it's just bad UX. It's a feature you add to check a legal box, and it doesn't get priority for usability. Probably someone just shrugged, decided this was good enough, and moved on. They should definitely give it all to you in one zip file, but “never attribute to malice that which is adequately explained by a developer rushing to get something done by making it just barely usable.” (The fact that I worked at AWS as a development manager has nothing to do with the above, which is solely my opinion.)


Is the ridiculously annoying process for ending your Prime subscription also just accidental, bad UX?

It miraculously uses very similar patterns.


True, and consider the ridiculously easy process for accidentally signing up for a subscription with even the smallest of purchases.


It makes sense that Amazon would dedicate resources to making the sign-up process easier, because that actually makes them more money; making cancelling easier would be a waste of time to work on.


Do the excuses ever end?

> It makes sense that Amazon would entirely forget to provide a 'cancel subscription' button and will keep taking your money forever


If the developers felt rushed to make such a feature, that is the fault of their management. But given the history of Amazon, which includes public-facing service status boards that don't update unless senior management approves the outage, it is more likely that Amazon doesn't really want people to know what they know about their users.


> it’s just bad UX

"Bad" I would say means you can still achieve your task.

I was presented with over sixty download links, and not being an idiot or someone to be taken for a ride, I refused to go along with it. That means the UI is not merely bad; it failed.

What's more, it OBVIOUSLY failed.

There's no way a single person at Amazon could have genuinely sat there and thought, "yes, this, THIS is it, THIS is the right way to make this page", not, that is, if their goal was the user actually getting hold of their data.


Ok well, the same thing could be said by any organization in the world regarding any dark pattern ascribed to them.

As far as never ascribing goes: if a company is super big and rich and would find it beneficial for people to give up trying to do something because of bad UX, I think it's a reasonable assumption that the bad UX is an example of a dark pattern. Otherwise this helpful concept, which describes actual things companies do to tire out users and get them to relent in doing things the companies don't want done, would have to disappear.

TLDR: If what Amazon is doing here isn't a dark pattern, what is?


Even if this were the case, it'd still be negligence of customers' time. This is one of the largest companies in the world.


This is unacceptable given how important the user interface is to the rest of the site: one-click shopping.


> It’s a bit like if you have a stalker who’s been shadowing you around, meticulously documenting everywhere you go, everyone you talk to, and everything you do, who’s now handing you a form to fill out if you want to see the boxes of files they’ve been keeping on you.

This has me thinking. I can get an injunction for a human stalker who's going after me at home, my workplace, following me wherever I go, etc.

According to US law, companies are also people. So, why can't I get an injunction against, say, Facebook/Meta ?

Get enough of these injunctions, and these shitty privacy-invading data black holes would dry up pretty quick. If they don't, then they'd be liable for violating court orders. That usually doesn't end well.


Because you clicked Yes on their EULA.


I said facebook for a *very* specific reason: https://medium.com/@SpiderOak/facebook-shadow-profiles-a-pro...

There's absolutely NO agreement with shadow profiles.

And on to your EULA excuse - show the court:

1. That YOU accepted a EULA

2. That the EULA was even presented

3. That the EULA agreed to (if proven) is the same one with the onerous terms

4. That the user didn't revoke permissions (affirmative consent is a thing)


I agree with you, but it might be difficult to prove that someone is keeping a shadow profile.


The difference between a stalker and Amazon is that Amazon does not get any data from you (or at least 99% of what this author could request from Amazon; some ad tracking stuff might be an exception) if you do not willingly give it to them. Don't have an Amazon account, don't use it to order things or search, don't talk to Alexa, etc., and they will have no data.


This totally disregards the concept of shadow profiles.


I’d think that, like FB, they collect data on individuals regardless of accounts. One example of this is their facial recognition services. Given that they force higher pricing on products not on their page, it becomes challenging to simply “go somewhere else.” It’s also been shown that they extract business data from their AWS customers.


I like these detailed walk-throughs; although obviously subjective, this one reflects well the many obstacles and dark patterns that are put in the way.

The "funniest" one certainly is that there are dozens of download buttons to actually download the data in the end.

So, it seems understandable that the author got quite frustrated with this process Amazon built.


I had the exact same experience. I wouldn't mind if they would be sued for this. It's audacious, a dark pattern, user hostile, lazy.


What I found "queer" (besides the tediousness/whining) was the:

> It’s not explained how Amazon acquires this third-party audience data, but according to this dataset I apparently am a homeowner, in possession of a luxury sedan and SUV, and in the 45 to 54 age range. This was all news to me, as I am none of those things.

This kind of data is seemingly what "circulates" about you and on which advertising statistics and targeting are based.

Should we believe that it is only a singular glitch, or that most of this data is simply wrong/made up?


Probably LexisNexis. Their sales offerings will promise datasets with information on the political affiliation and marital status of arbitrary addresses.

Or info shared from partners/affiliates. People talk about you, and most of it is BS, but you as the consumer should just accept it so businesses can monetize their datasets!

Ain't it great?


HN probably isn't required to let me download my data, but it sure would be nice. Does that option exist on this site?


I consider HN to be one of the most user hostile sites there is regarding user content because they don't allow deleting comments. They force people into making a manual request. Which means the feature essentially doesn't exist for casual use.


The right to be forgotten is in tension with the need to preserve public discourse.

Nevertheless I have deleted many comments within the available regret window. I do wonder whether they're actually removed from storage, or merely elided by software.


"need to preserve public discourse."

Never heard of it before?


If you really want it, you can probably get a dump of your data by emailing dang & co (hn@ycombinator.com). They're pretty responsive. Don't overwhelm them with superfluous requests, though.


I'm sure it does, as many third-party clients offer features where you can read back your own posts, etc.


I'm personally using https://github.com/dogsheep/hacker-news-to-sqlite#usage, it's great. You basically just need your username and that's it


Ooh nice thanks for the tip!!

I'm making a "life log" system that stores stuff I do online automatically. This will come in very handy.


Oooh, I'm a big fan of lifelogging! You might want to check out some of my projects like https://github.com/karlicoss/HPI#readme :)


Nice, I had no idea this was already 'a thing' :) Thanks again for this!

What I want to do is indeed capture my emails, social media posts, photos, location data and chats (I run everything through Matrix anyway so that's pretty easy). And then store it in a database (or just a filesystem per day, not sure about that yet - I see your concerns about databases for this and I agree). With the more sensitive stuff GPG encrypted.

I'll see if your projects can help me out with this, thanks! Like you say in your readme, indeed my goal is to regain control of my information. And enable myself to actually do something with it.


You can do it if you know how to use an API. See the 'API' link at the bottom of the page.
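For what it's worth, a minimal sketch of that in Python, using HN's public Firebase API (https://github.com/HackerNews/API). Note this only covers your public items, not anything private like email or votes:

```python
# Fetch your public HN submissions via the official Firebase API.
import json
import urllib.request

BASE = "https://hacker-news.firebaseio.com/v0"

def fetch(path):
    with urllib.request.urlopen(f"{BASE}/{path}.json") as resp:
        return json.load(resp)

user = fetch("user/pg")  # replace "pg" with your own username
for item_id in user.get("submitted", [])[:20]:  # first 20 items, to keep it quick
    item = fetch(f"item/{item_id}")
    if item and not item.get("deleted"):
        print(item.get("type"), item.get("title") or item.get("text", "")[:80])
```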


Will this include all logs & data that aren't publicly visible? The HN software employs various dark patterns such as shadowbanning & rate limiting accounts; all this info would have to be disclosed too, in addition to any internal communication about your account.


Really? I would have expected a bigger outcry if GDPR et al. required disclosure of shadowbanning & rate limits; could you possibly direct me to where I can find the exact requirements? Because that sounds like great fun to go exercise.


Actually, that's a question; the GDPR (Europe) and the CCPA (California) both require data download options. I don't know if Hacker News is a business that qualifies for this regulation though.


The article would be stronger if it didn't overreact and exaggerate, but then again I do appreciate the sarcasm. The 74 zip files are the most egregious part of it though. You can't zip those mofos into one file? It's spiteful somehow, like you asked for water and Amazon said "Here you go" and threw it in your face.


That is something that I like about Google: it only takes a minute to get to what they admit to having collected. It's also easy to dump all data and then download it a few hours later. I mostly just use paid-for services (GCP, Play books and movies, sometimes Colab Pro), but Gmail is my backup email and I like to download that occasionally.

re: Amazon: I like to refresh my VPN IP address, and go to Amazon in a private browser tab to avoid being "gamed" on item pricing. I login once I have the price set.


Google Takeout is "all your data", this Amazon download seems to actually be "all data about you". It seems you can't get the latter from Google.


> It ultimately took about 19 days for Amazon to fulfill my data request, in stark contrast to its reported median time of 1.5 days to process a data request, as per the company’s California Consumer Privacy Act disclosure for 2020.

That's interesting but not particularly surprising. I bet the median request isn't for all data. An all-data request may involve pulling data from cold-storage, which I'm not surprised would take 2+ weeks (it's quite possibly a relatively manual process).


What I’m surprised we aren’t talking about is the encrypted blobs that Facebook provides when you download your data. With no instructions on how to decrypt them to view your actual data.


I did this myself a month ago or so. In addition to the process and the multiple downloads, I was very fascinated to discover that many reports were delivered as PDF. Why would that be, if not to make it more difficult to access?


Wired Magazine recently did a feature on “Amazon’s Dark Secret” of what this mess looks like from the inside:

https://www.wired.com/story/amazon-failed-to-protect-your-da...


Yes, they have multiple download buttons and it takes a bit, but I got the same with Google; it only took a few minutes to download the data once made available.

I was most surprised by the sheer amount of audio data kept: in my case, more than 5 GB of WAV files dating back to when I set up my first Alexa 6 years ago. I believe at least 50% of everything Alexa heard in my house is recorded there. That's when I started looking for an offline alternative, since, after the initial novelty wore off, we're only using it to listen to music, turn smart home lights on/off, and ask the occasional random question (convert C to F, etc.).


Has anyone tried to get their data from Apple? Was the experience any different?


I’m more of an anti-Facebook bias person myself


One of those relatively few circumstances where structuring the company into service teams is nothing but a hurdle, rather than a net advantage, to delivering on customer expectations.


The full, complete set of tedium the author describes is:

* Navigate through a handful of pages.

* Scroll to the bottom of a menu.

* Click an email confirmation link.

* Wait 19 days.

* Click 74 download links.

That last part is pretty dumb! But it's also the only thing that seems remotely tedious, and I'm not sure where at any point he'd be confused. The author implies some sort of issue with the 19 day waiting period, but it seems entirely plausible to me that many of the datasets being requested have "ask an engineer to run through this long manual process" as a dependency.


Distributed systems store information in different databases and warehouses. You don't want your Amazon.com retail data co-mingled with Alexa data, for multiple reasons. Two of the preeminent security concepts are least privilege and data segregation.

Your data exists in different files and databases. That's why you get multiple files containing your data. And let's ignore that the zip archives contain files of different types.

If all the files were of the same type, what would you prefer, that Amazon edit these files and combine them into a single file? How could you prove that Amazon didn't edit out any files maliciously?

The typical way to verify file integrity is by checking hash sums. But here you don't have access to the original hashes (because they're internal Amazon files). Even if you had access to the hashes, we know the hashes wouldn't match because we're presupposing that the files have been modified to combine them.

If they were to combine all the files together, there would be no way for Amazon to document that nothing was changed. Which means the process isn't auditable, and people will come up with conspiracies about how big bad Amazon is sanitizing files before sending them out.


74 zip files could themselves be added unmodified as 74 individual entries within a parent zip file, for download convenience. The hashes of those 74 zip files within the parent zip file would be just as auditable as with the current process.
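For illustration, a minimal Python sketch of exactly that (the paths are hypothetical):

```python
# Bundle the per-category zips, uncompressed (ZIP_STORED), into one parent
# archive. Since each entry is stored byte-for-byte, its hash is unchanged
# and remains verifiable after extraction.
import hashlib
import pathlib
import zipfile

parts = sorted(pathlib.Path("downloads").glob("*.zip"))  # the 74 zips

with zipfile.ZipFile("all-my-data.zip", "w", compression=zipfile.ZIP_STORED) as bundle:
    for part in parts:
        bundle.write(part, arcname=part.name)

# Audit: each entry inside the bundle hashes identically to the original file.
with zipfile.ZipFile("all-my-data.zip") as bundle:
    for part in parts:
        assert (hashlib.sha256(bundle.read(part.name)).hexdigest()
                == hashlib.sha256(part.read_bytes()).hexdigest())
```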


That's true, but then people will complain that they had to unzip 75 files instead of 74. The real issue here is that there are 74 files. Which is an issue without a good solution.


Unzipping 75 files is a one click job on any reasonably current system I know of.


> what would you prefer, that Amazon edit these files and combine them into a single file

This, additionally, adds the complication that they could be accused of making the data onerous to access by providing it as a monolithic zip, too big for some users to download over unreliable connections.


> That last part is pretty dumb!

It's not just dumb, the whole process is at the edges of the law. Art. 12 GDPR mandates "intelligible and easily accessible form", which navigating through a number of pages, wait times and finally a 74-link download is certainly not fulfilling.

The gold standard, for what it's worth, is a direct link from the privacy policy page in the section that details GDPR subject rights to the page that provides the download - basically, three clicks in total.

> but it seems entirely plausible to me that many of the datasets being requested have "ask an engineer to run through this long manual process" as a dependency.

Which is ridiculous for a company at Amazon's scale and again at the edges of legality - Art. 12 GDPR mandates "without undue delay" and the one month is clearly meant as an upper bound here, not as the regular case.

That is the problem with American companies and also the US government: they all default to hoarding data in warehouses to make use of later, completely ignoring that all the data they hoard must also be made accessible to the people it's related to.


Clicking a bunch of links is pretty accessible; perhaps you’re conflating accessible with convenient?


The spirit of the GDPR was to make life easier for people. Putting hoops in front of users that are clearly not needed - Amazon could, for example, offer a single ZIP file like Twitter does - will someday earn them trouble.


This is yet another example of how the GDPR is bad law. "Intelligible and easily accessible" is way too vague.

Are 74 zip files intelligible and easily accessible? Of course not! I don't want to pull 74 links!

Is 1 zip file intelligible and easily accessible? Of course not! Way too big to pull in over my low-bandwidth connection.

Are zip files intelligible and easily accessible? Of course not! Not everyone understands compression.

...etc., etc. I'd have a lot more respect for that law if it spelled out concrete requirements instead of handwaving technical details and leaving it up to regulators to decide what passes and what doesn't.


This is yet another example of a random commenter on HN parroting "GDPR bad" nonsense while being intentionally obtuse.

Laws are often written with "common sense" in mind. HN commentators prefer to eschew common sense to try and excuse bad actors, bad behaviour, bad UX, bad anything.


That is how laws are usually written. Hashing out the details will be done by the courts.


Having to wait 19 days is really unacceptable. The rest is just annoying.


I very much doubt that there is any human interaction on the Amazon end of this workflow.

What seems more likely is that because this doesn't generate revenue it gets the minimum resources necessary to complete the request within some legally mandated time frame. The request probably sits in queues for most of its life.

If a court order requested this same data, I suspect it could be produced in under 24 hours.


And you'd be wrong. There are humans involved at every level of GDPR requests.

Signed,

Someone who has handled such requests for AWS


That feels like an untenable solution; it wouldn't take much to create a denial of service...


Very little about GDPR was designed with technical reality in mind. It's a grand example of using the mallet of law to try and beat the world into the shape someone wants it in, ignorant of why it's in the shape it's currently in.


I can imagine it takes a month so older backups can cycle out, and then they don't have to dredge up data they're about to stop keeping on you anyway.


I've been trying for over two years to get my data from Amazon.

I eventually got to a point where Amazon provided a web page with no fewer than sixty-two download links on it, each of which would have to be operated manually.

It's properly tantamount to obstruction.

After finally reaching this point, Support were arrogant and high-handed - "We will not do any more than we have. We look forward to seeing you on Amazon in the future."

I still do not have my data.

I tried to start the process off a second time, but it went nowhere. I chased it, and then had some very disconnected and confusing responses from Support (an email from some random guy in Support who by the looks of it had been told to email me, but neither had he been told what for, nor I that it would happen).

I've not spent more time on it since then.

I stopped using Amazon about two years ago, because I've come to the view that the stories about how Amazon treats warehouse staff are accurate.

I want to get my personal data, so I can close the account.

Amazon of course refuse point blank (in the usual, slimy, support-talking-past-you way) to delete any personal data, so all you can do is delete the account and hope in the end Amazon expires the data.


> I eventually got to a point where Amazon provided a web-page, which has no less than sixty-two download links on, each of which would have to be manually operated.

> I still do not have my data.

> I want to get my personal data

Is there a good reason why you don't take the three minutes to click the 62 download links?


One of the things people can ask for is who their data is shared with. It's a massive paper trail, but so many entities don't want to comply with data protection laws. It's not just big tech; it's any large entity, because interpretation of the laws is so vague. But that's the beauty of legislation: it's vague.


I tried one time; they wrote me that they were going to send me my data, but never did (!)

I gave up

I'll try again now :)


"more than a month" :)

> Data Request Confirmation

> We’ve received and are processing your request to access your personal data.

> We will provide your information to you as soon as we can.

> Usually, this should not take more than a month.

> In exceptional cases, for example if a request is more complex or if we are processing a high volume of requests, it might take longer, but if so we will notify you that there will be a delay.


There really should be an open standard for data subject requests and delivery methods and formats. Maybe something built on WebFinger?
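To illustrate the idea (and it is only an idea): WebFinger (RFC 7033) already lets you ask a domain for links by relation type, so a standardized "data subject request" link relation could be discovered like this. The rel URI below is made up; no such standard exists today.

```python
# Speculative sketch: discover a hypothetical data-subject-request endpoint
# via WebFinger (RFC 7033). The link relation URI is invented for this example.
import json
import urllib.parse
import urllib.request

REL = "https://example.org/rel/data-subject-request"  # hypothetical relation

def find_dsr_endpoints(domain):
    query = urllib.parse.urlencode({"resource": f"https://{domain}", "rel": REL})
    url = f"https://{domain}/.well-known/webfinger?{query}"
    with urllib.request.urlopen(url) as resp:
        jrd = json.load(resp)
    return [link["href"] for link in jrd.get("links", []) if link.get("rel") == REL]
```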


Does anyone know if companies are obliged to do this in India?


Clicking on the author's byline, it says the author is a 'security researcher focusing on privacy issues revolving around source protection, counter-forensics, and privacy assurance.' I would assume, therefore, the author would have at least passing knowledge of security and web applications.

> Amazon at this point makes some intonations about how this email verification step is necessary because your privacy and security are the company’s top priority, though considering that when your data is available you’ll need to check your email anyway, it’s not clear how checking your email twice adds any security.

People can argue about whether email should be used for authentication purposes. But what is the alternate model suggested? From the formulation of the complaint, the author seems to suggest that it'd be better if Amazon did not decouple authentication and payload delivery.

Sending the payload (in this case, a load of personal data) to an email address without first checking whether the requestor is in control of the address is a horrendously terrible idea. I'm starting to wonder about the author's security chops.
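For anyone wondering what that verification step buys: the emailed link typically carries a time-limited token only the server could have minted, so clicking it proves the requester controls the inbox before any data is assembled or sent. A generic sketch (not Amazon's actual scheme; the secret and lifetime are illustrative):

```python
# Generic email-confirmation token: an HMAC over the address plus a timestamp.
# Possession of a valid, fresh token proves control of the email address.
import hashlib
import hmac
import time

SECRET = b"server-side secret"   # illustrative; keep out of source control
LIFETIME = 24 * 3600             # token valid for one day

def mint_token(email):
    ts = str(int(time.time()))
    mac = hmac.new(SECRET, f"{email}|{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{mac}"         # embedded in the link that gets emailed out

def check_token(email, token):
    ts, mac = token.split(".")
    expected = hmac.new(SECRET, f"{email}|{ts}".encode(), hashlib.sha256).hexdigest()
    fresh = time.time() - int(ts) < LIFETIME
    return fresh and hmac.compare_digest(mac, expected)
```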

> Though Amazon says that it will “provide your information to you as soon as we can,” “soon” is apparently meant to be interpreted on a monthly time scale, as the page further states that “usually, this should not take more than a month.” Though of course, “in exceptional cases, for example if a request is more complex or if we are processing a high volume of requests, it might take longer.” This protracted time frame forms an intriguing juxtaposition to the otherwise universal emphasis on speed that facilitates shopping on Amazon.

It's much easier to put information into various databases than it is to determine what databases contain information about a particular user, and present that data to the user in a secure, auditable manner.

For example, you have to make sure that the user gets all the information they asked for (which means you have to determine whether the information exists, and if it doesn't you have to log the nonexistence of the data, lest you be audited). And you need to make sure the user doesn't get information about someone else (which has happened in the past).

Distributed systems are hard. It takes time to determine where all possible information could live. And you have to make sure you're providing the correct information. And do this flawlessly, every single time lest you open yourself up to bad press and potential fines. This all takes time in systems as large and distributed as Amazon.

---

If the author is as knowledgeable in the security space as their byline suggests, I'm left to think that their incurious write-up is just trying to throw red meat at the 'We hate anything associated with Amazon' crowd.

For what it's worth, my team at AWS processes GDPR requests within 3-or-so business days. But we can only do that because there is a single data warehouse for our product/service.


Perhaps all of this would be a lot easier if you actually built some simple automation to process requests. What could possibly take 3 days to process? The only plausible reason is that you’re wasting developers' time on what really belongs in one of the myriad tools AWS itself provides for such tasks.


There’s nothing that’s simple when you’re dealing with tens of thousands of different datasets across many different internal team and service boundaries, each with its own security setup depending on the data being stored.

The cost of automating and properly securing it is high ("gather all customer data into one place" is generally not great, as it's a single point of failure from a security perspective). All of that effort isn't really worth it if the total number of requests for data is not that high.


My reply was to one person presumably on a two-pizza team at AWS. Surely they would realize some savings from automating their own retrieval requests.

As others have pointed out aggregating all of the different reports into one download is a trivial task itself suited well for automation.


> Distributed systems are hard. It takes time to determine where all possible information could live. And you have to make sure you're providing the correct information. And do this flawlessly, every single time lest you open yourself up to bad press and potential fines. This all takes time in systems as large and distributed as Amazon.

This implies that Amazon is serving GDPR data requests manually, rather than the whole process being automated. Surely that can't be true?

I agree that identifying where data can be stored, and extracting it correctly, is a difficult problem. But that problem is identical for every user, and it should only need to be solved once. You aren't determining from scratch which databases contain user data on every request, right? Nor are you re-defining your export schema for each user, or re-implementing the identity authentication, or deciding which pieces of data don't need to be in the data export for some legal reason, or any of these other systematic difficulties.

And if this is automated, why suggest that the difficulty applies to serving each individual data access request rather than just to defining and implementing a repeatable process?


GDPR requests are handled, at least in part, manually. I have direct knowledge of how GDPR is handled within the product/service I support. And yes, it's manual.


Thanks! That's my mind blown, then :) I can believe that the volume is low enough that manual work is acceptable, but even then I'd have thought that you'd want things entirely automated to eliminate the chance of human error.


I'm not sure how low in volume the requests are. It's really hard to automate because you have to gather data from many internal teams and products, and the data is intentionally siloed to enhance customer privacy and data security.

Manual work isn't the best way to handle it, but the costs of automating (in terms of security, intricacy regarding different storage systems, etc.) is too high to really automate it on a grand scale.

Where I work, which is low traffic generally, we process around 10 requests or so a week (from what I've seen).


To create a script/bot/application/whatever that can access all potential data, you have to give something read privileges to possibly hundreds of backend systems and products. This is a horrendously bad idea security-wise. If that service account gets compromised (either by an external or internal threat), you have a single account that has access to everything Amazon stores. This is bad for the company and bad for its customers.

There necessarily have to be multiple workflows to maintain the data segregation necessary to protect data at the scale we're talking with Amazon.

And assuming you could securely create this automated workflow, you'd still need a person manually verifying the end result to ensure that all the data scraped is in fact owned by the person who made the request. Within the past couple of years, there was a news story where someone got a different person's Alexa data after asking Amazon for their own data. That can't happen again.


The automation would be a bigger risk than granting humans carte blanche access to customer data? That seems like an odd security conclusion.


Sorry, I don't buy any of this.

Automating the process doesn't need to imply that there's a single service with direct access to all of the data. Just from a basic software engineering perspective, it makes a ton of sense for each product's data export to be a separate service owned by the product team, so no disagreements there. But by talking about how hard it is to figure out what data you have stored and export it correctly, you were implying that you have no such per-product service either, and each export is an artisanal custom job.

The question of safeguards is interesting. I don't really see how having a human in the loop is adding any real security: a computer is going to be far better at deciding whether the request is valid or not. As an operator, being assigned a ticket to do an export of account 123456, what are you going to do other than do that export? A computer, on the other hand, can actually verify whether the request is actually authorized. That can be done in a way where a compromise of your central data export service account can't be used to fake the authorization.

(A quick design sketch for one option: each account has a public key encryption keypair, managed by the identity system. When the central data export service requests an email verification, that is done via asking the identity system to sign a ticket. The identity system triggers a flow that asks the user to validate the request, and as part of the flow informs them of just what operation they are validating. User approval of the request signs the ticket with their private key. This ticket is sent to each data export service, which checks that the user id they're exporting has signed the ticket, and that the ticket contents match the request: i.e. same userid, the operation is a data export, the data export covers this service. You will need to trust your identity system to not be compromised, but if it is, you're completely screwed anyway.)
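A toy version of that ticket check, using Ed25519 signatures via the `cryptography` package (every name and field here is illustrative, not anyone's actual scheme):

```python
# The identity system signs an approved export ticket with the user's key;
# each per-product export service verifies the signature and the ticket
# fields before releasing anything.
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

user_key = Ed25519PrivateKey.generate()  # keypair held by the identity system
user_pub = user_key.public_key()         # distributed to the export services

ticket = json.dumps(
    {"user_id": 123456, "operation": "data-export", "services": ["orders", "alexa"]},
    sort_keys=True,
).encode()
signature = user_key.sign(ticket)        # signed only after the user approves

# Inside the hypothetical "orders" export service:
try:
    user_pub.verify(signature, ticket)   # raises InvalidSignature on tampering
    fields = json.loads(ticket)
    assert fields["operation"] == "data-export" and "orders" in fields["services"]
except InvalidSignature:
    raise SystemExit("ticket rejected")
```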

> And assuming you could securely create this automated workflow, you'd still need a person manually verifying the end result to ensure that all the data scraped is in fact owned by the person who made the request. Within the past couple of years, there was a news story where someone got a different person's Alexa data after asking Amazon for their own data. That can't happen again.

The odds of a human doing a good job of this kind of validation are basically zero. Either they are following a checklist that a computer could execute more reliably, or they are just randomly poking at some 1 GB data dump trying to find the needle in the haystack.


It's not very often that each and every point in an article just feels "fabricated" or over the top.

It starts with finding the page: Amazon -> Customer Service -> Search for "personal data" -> Search result #1 is "Request Your Personal Information" which nicely explains what to do and links directly to that page.

The need to verify or activate a data request via clicking a link? Of course required so some third party can not just request your data to your inbox (and process it along the way) without you actually wanting to do that.

All the mentions that most of the data is available in your Amazon account? Well, what many people are looking for (order history etc) is, and it's even nicely formatted, searchable, and cross-linked to make it much more convenient.

Clicking "Remove address" only removes it from the list of addresses? Of course, addresses you ordered to in the past can not be deleted as they have to legally be stored together with other order information.

And the list goes on and on.

I get that it is scary that a big company keeps all the data you gave to them. And it is also unfortunate for you that it is not their business goal to make it instant and pretty for you to look at all the data. But there is no reason for them to do that.

If you don't want Amazon to have your data, don't use Amazon. When you use Amazon, the way you can get a lot of data from them is actually pretty good (also compared to other companies which pretend search history does not exist and so on).

(And bye to some Hacker News points. This will get nicely downvoted I suspect.)


> each and every point in an article just feels "fabricated" or over the top.

What I thought were valid points from the article:

- Unclear data: "cryptic strings of numbers like '26,444,740,832,600,000” for various search queries." This is easily the worst offender IMO.

- A wait time of 19 days

- Separating the download into 74 buttons


True, those are kind of valid.

The unresolved foreign keys are indeed unfortunate; I wondered about these myself when I got my takeouts in the past. I explained them to myself as something that is not actually available in the same datastore to query or join, but rather a constant or something in another system that does not include personal data. Still not nice, of course.

I think the wait time and the many download buttons were discussed extensively in other comments here. With cold storage as the explanation for the duration, and no legal need to make the takeout _convenient_, those also have a pretty good explanation I would say.

So valid, but still no scandal.


Yup. I agree. The wait time doesn't make sense. They should be able to spin up extra servers from the spot market in seconds. Even if they're using Glacier, that should only be a few hours.

I wonder if they execute the 74 data queries in serial to drag it out.

And the multiple downloads is just bogus.

That being said, I agree with the general point that the article is a bit overly dramatic. Amazon does a pretty good job with the request. It just takes too long.


I helped build a system for privacy compliance at a large non-faang tech company. Honestly 19 days seems crazy but this is what we dealt with:

It’s 2018 and you have to bolt this mass export/delete onto every stateful service in your company. Many of these are “critical” services that are not actively worked on and have a very limited maintenance budget. That is, some team with a lot of existing responsibilities absorbed it along the way and they have no bandwidth for it.

So in some cases their mechanisms for retrieval/deletion were pretty egregious, so we agreed on a rate limit, and we would queue these requests up and handle all of the paperwork. You get 30 days to comply, and if you need another 30 all you have to do is send an update within the first 30.

So, quite possibly, they have a rate limit and a queue on at least a handful of backend services and it truly truly does not matter as long as the queue is under 60 days.


I've worked at an organization with a similar timeframe for some types of data requests (B2B, not GDPR-style ones). There were many parts of the organization which were mismanaged, but that wasn't one of them. That type of data request ("get all my data") involved walking through all the data we had. It wasn't indexed in a way which made it easy to grab.

This was an expensive batched job we ran monthly. We spun up a cluster of cloud machines. A map-reduce style operation would organize the data by customer. We'd ship it off to all the customers who requested it that month.

Adding appropriate indexes or similar would have been man-years of engineering work. This involved, for example, walking through server logs line-by-line and seeing which ones were associated with which customer.
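In miniature, the monthly job was shaped something like this (the field layout is hypothetical; the real thing ran as a distributed map-reduce over far more data):

```python
# Toy version: stream log lines, bucket them by customer id, and emit one
# export file per requesting customer.
from collections import defaultdict

requested = {"cust-42", "cust-7"}  # customers who asked for data this month
buckets = defaultdict(list)

with open("server.log") as logs:
    for line in logs:
        fields = line.split()
        if fields and fields[0] in requested:  # assume the id is the first field
            buckets[fields[0]].append(line)

for customer_id, lines in buckets.items():
    with open(f"{customer_id}-export.log", "w") as out:
        out.writelines(lines)
```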

There wasn't a compelling business case to do that. For normal operations, once a month was fine. If a customer had a particular need, we could hypothetically do a one-off request out-of-line, but customers used the data for types of analytics where a one-month delay wasn't an issue.

I know of other pipelines with similar delays, for example, due to lack of automation. A person runs a task once a month, and automation would cost more than a person.

I won't chalk this up to dark patterns, so much as speeding things up having zero business value to Amazon. I just walked through the process, and at least the first two steps seemed very normal. Amazon sometimes does outrageous things, but here, I saw nothing to get outraged about.


I wouldn't be entirely surprised if there was a human involved in gathering some of the data. If requests for data are rare enough, it might be more economical to pay someone in a customer support farm to collect some data than to pay for developing and maintaining an automated process. At least in the short term. Otoh, not automating something like this seems out of character for Amazon.


Wait time could be explained by some data requiring manual work. Maybe there is an offline hard disc out there that is labeled "Users A-D - 2003 April"


> Clicking "Remove address" only removes it from the list of addresses? Of course, addresses you ordered to in the past can not be deleted as they have to legally be stored together with other order information.

I agree. The author set out with an agenda and spun every step of the process in the most negative way they could come up with.

There are some legitimate complaints (wait time, for example) but it’s hard to take these articles seriously when it’s clear that the author started with a conclusion and tried to work backward to build a story around it.

Sadly, these articles get a lot of clicks and shares because “your data” has become a nebulous scare phrase in journalism and Amazon is a popular company to hate right now.

That said, I bet if any one of our own employers was subjected to the same treatment by the same author with the same agenda, we wouldn’t come out much better. If someone wants to smear a company, they will.

Data export can be very confusing for end users, especially when they discover things like their shipping record with old addresses isn’t deleted when they remove the address from their address book. The old shipping records are necessary for everything from customer support to warranty claims to fraud detection to recall notices to regulatory compliance. Trying to shame Amazon for literally just keeping shipping records is bananas.


> And bye to some Hacker News points.

The lowest score you can get on a comment is -4 (https://github.com/minimaxir/hacker-news-undocumented#downvo...).

> This will get nicely downvoted I suspect.

Complaining about downvotes before they happen is more likely to get you downvotes than anything else you wrote in that post.


Oh, I did not know that. Thanks. A bit less "aversion" then for the future.

My last sentence was triggered by having written a comment on another comment first, which instantly went to -3 (but later kinda recovered), so I almost didn't write this one, just to avoid the negative feeling. It's a nice Sunday after all.


Don't sweat it. None of us will lay upon our death-beds wishing we had scored more points in an internet popularity contest.

Sometimes a downvote is because you made a salient and equitable point that threatened someone's cookie jar, an angry conservative enraged that someone expressed a progressive view (and vice versa), some humourless bastard who failed or declined to recognise what you thought to be in obvious jest, or a narcissistic asshole incensed that you dared observe their poor behaviour. These you may consider to be upvotes in disguise.

Notwithstanding all this, I suspect you will also discover there's a strong current of support for those surgically dismantling yellow journalism.


Agreed. I have successfully downloaded my order history from the beginning of my account; very interesting to look through. Though I'm not sure why I was buying Solaris books in 1999 :) Others, like Ender's Game, I still remember.


> I get that it is scary that a big company keeps all the data you gave to them.

Situation in 2022: it is scary that someone has something I willingly gave them.


Define "willing" in this context, though. You, myself and most people on HN have a really good idea of what data we willingly give Amazon, while the average person does not. Is it really an accurate statement that people willingly give them their data when they don't actually know what they're giving?


What data does Amazon have that you haven't given them?


I think you misunderstand the comment the other commenter made - there is a lot of info Amazon has about you that is collected via dark patterns.

Also, don't they buy data from 3rd parties to augment what you give them? Like stats of credit card purchases and stuff? I always assume that all these big players do that.


> Also, don't they buy data from 3rd parties to augment what you give them? Like stats of credit card purchases and stuff? I always assume that all these big players do that.

They do! That's even mentioned in this article.


You're misunderstanding. I will rephrase:

You said that everyone "willingly" gives Amazon their data. The average person does not know what kind of data Amazon collects on them, therefore I am positing that it's not fair to say that they are willingly giving it over.


Do you think that if you asked random people something like...

"Do you think that Amazon stores a list of the items that you have bought from them and the addresses where they sent them"

...the majority would say no?


And if you asked them to tell you every other bit of data Amazon collects on them, do you think they would be able to tell you what all of that is? Because common knowledge within the tech community - as evidenced in the article we are discussing - makes very clear that that's not the only data they gather on you.


The average non-technical person I’ve talked to has posited that Amazon is actively, persistently listening via their Alexa-enabled devices and using that audio to drive recommendations.

This doesn’t seem to deter any of the people who’ve mentioned it from purchasing and plugging in Alexa-enabled devices, or from shopping on Amazon.

I don’t think you’re giving non-technical people enough credit. They may not know the exact mechanisms, but they’re generally aware that companies are monitoring their activity and using it to market to them; it’s just not a big deal to them.


Well shoot, I've never thought of it that way. I guess it's perfectly reasonable that they've extrapolated my behavior out so they know when to raise the price of items I intend to purchase. Yep, not underhanded at all.


There still is no viable sarcasm tag for plain text that everyone will pick up :)


That search you suggested doesn't appear to exist in the app. You mind telling me how to access this data through the app if it's so easy?


For me (amazon.de, EN language setting): Open app -> "More" burger menu bottom right (three horizontal lines stacked on top of each other) -> Scroll down to "Customer service" -> Scroll down to search feature -> "Personal Information" -> #1. I think this is really just a webview to the same part of the website with a different design.

Takes a bit more tapping and scrolling than clicking on desktop, but that is more the fault of the smaller screen and how apps work, I would guess.


Ah, it was only like 5 options deep and then it gave me a chat "assistant" which I used to search the term "my data" which gave me the link and the drop down box mentioned in the article to scroll to the bottom of to request my data. Which sent an email to my husband's email address that I need to open to confirm the request. Super easy. Not hidden at all.


>The need to verify or activate a data request via clicking a link? Of course required so some third party can not just request your data to your inbox (and process it along the way) without you actually wanting to do that.

You mean like some data hoarding company that offers free email that scans all of your messages to provide better "sorting", provide quickly accessible Tracking buttons, or similar features? Would something like that be considered doing evil?

>(And bye to some Hacker News points. This will get nicely downvoted I suspect.)

Meh. The loss of 4 points is nothing when you're making valid points.
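
To make the confirmation-link pattern quoted above concrete, here is a minimal sketch using only the Python standard library. The secret key, URL, and function names are all made up for illustration; real services presumably do something along these lines, but the details are theirs:

    import hashlib, hmac, time

    SECRET_KEY = b"server-side-secret"  # hypothetical; never leaves the server

    def make_confirmation_token(email: str, ttl_seconds: int = 86400) -> str:
        # Sign the email plus an expiry timestamp; only the server can forge this.
        expires = str(int(time.time()) + ttl_seconds)
        payload = f"{email}|{expires}"
        sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
        return f"{payload}|{sig}"

    def verify_confirmation_token(token: str) -> bool:
        # Anyone can *request* an export, but only the inbox owner receives
        # a token that passes this check before it expires.
        try:
            email, expires, sig = token.rsplit("|", 2)
        except ValueError:
            return False
        if not expires.isdigit():
            return False
        expected = hmac.new(SECRET_KEY, f"{email}|{expires}".encode(),
                            hashlib.sha256).hexdigest()
        return hmac.compare_digest(sig, expected) and time.time() < int(expires)

    # The link mailed to the user would carry the token, e.g. (made-up URL):
    # https://shop.example/confirm-data-request?token=<token>
    print(verify_confirmation_token(make_confirmation_token("user@example.com")))

Without that round trip through the inbox, anyone who knows your email address could trigger an export of your data.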


> But there is no reason for them to do that.

If that is not their business goal, perhaps the GDPR needs to be strengthened and strongly enforced until it is.


That is certainly a political decision that can be made, I agree and would actually be happy about that.

If that happens, I am sure Amazon will invest the time and money to comply. At the same time it will put many smaller businesses out of business, as they do not have the resources to do that. Even the current state of having to fulfill data requests is quite a problem for many of them.


Those smaller businesses will just use a standard webshop package that incorporates this feature, because most of their customers will want it - the same way these companies use stuff like Magento or PrestaShop instead of rolling their own.


Exactly. But that will be something additional they have to buy (and install, and maintain) if the GDPR were extended to require making it "instant and pretty for you to look at all the data", because that is what the parent discussion was about.


Hmm I doubt it really. I think most webshops will just include this feature.


Small businesses have fewer customers, so I imagine their workload will scale down to manageable levels. If not, there will be market demand for automation in whatever out-of-the-box system they're using to manage data.
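
For what it's worth, a sketch of the kind of export hook such an out-of-the-box package could ship: each module (orders, newsletter, etc.) registers a provider, and one endpoint aggregates them. All names here are hypothetical, not any real Magento or PrestaShop API:

    import json
    from typing import Callable, Dict, List

    # Each installed module registers a function returning the personal
    # data it holds for a given customer.
    _providers: List[Callable[[int], Dict]] = []

    def export_provider(fn: Callable[[int], Dict]) -> Callable[[int], Dict]:
        _providers.append(fn)
        return fn

    @export_provider
    def orders(customer_id: int) -> Dict:
        # Hypothetical data; a real module would query its own tables.
        return {"orders": [{"id": 17, "item": "book", "shipped_to": "..."}]}

    @export_provider
    def newsletter(customer_id: int) -> Dict:
        return {"newsletter": {"subscribed": True}}

    def export_all(customer_id: int) -> str:
        # One endpoint aggregates whatever every installed module knows.
        merged: Dict = {}
        for provider in _providers:
            merged.update(provider(customer_id))
        return json.dumps(merged, indent=2)

    print(export_all(42))

The shop owner installs modules as usual and gets the export endpoint for free, which is why this cost plausibly lands on the platform vendor rather than on each small business.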


I am getting quite tired of the "small businesses" argument about the GDPR. It's starting to become the "think of the children" equivalent but for data protection.

Would you also be against food safety or physical product regulations (ban of leaded solder or other toxic chemicals)? After all, those can and do affect small restaurants and other businesses as well.


In general, no! But if someone proposed that all restaurants should perform chemical analyses on random samples of their food to check for spoilage and cross-contamination, I would have very similar questions about where the taco shack down the street is supposed to find an affordable chemical lab. Making it "instant and pretty for you to look at all the data" is a large, expensive endeavor and I don't see why it's necessary to achieve the regulatory goals here.


> But if someone proposed that all restaurants should perform chemical analyses on random samples of their food

To be fair, people propose things all the time. It only becomes law when enough people agree that it is needed. That process isn't always perfect but in general it works.

The reason we don't have a "General Food Safety Regulation" is that the current situation is good enough, either because the existing regulations are sufficient or that the industry can self-regulate (as it's usually bad for business to poison your customers). As a result, in most Western countries, you can be confident that any business that sells food will not poison you.

If we suddenly had a food poisoning epidemic because all vendors were unscrupulous and selling spoiled food, I would totally be in favour of stronger regulations even if it means small taco shacks can't compete. Having to go to a farther/more expensive place that can afford such checks is a price I (and I suspect most other people) am willing to pay if it means not getting food poisoning.

The GDPR came to be because it was determined that the existing data protection regulations were inadequate and the industry demonstrated that it can't be trusted to self-regulate.


I don't think comparing food safety or toxic chemicals that hurt your health to the design, usability and accessibility of a data export is very valid. The parent argument was not about whether data has to be exportable at all; it was about how well designed the export is.


The "small businesses" argument is brought up in every discussion of the GDPR including much worse transgressions than merely bad UX in the data export process. I was not exclusively referring to this particular instance.


> they do not have the resources to do that.

Good - the aim is for them to not store personal data in the first place, much less build business models that rely upon it. Rather than allowing the population to take on the negative externality of surveillance capitalism, it is absolutely right that the burden must fall on those creating the problem.

I don't see this as any different from the complaint that small restaurants cannot afford to pay their workers - if they can't afford to comply, they can't afford to be in business at all. It's simply a margin problem.


You're giving the argument too much credit. It's more akin to a large restaurant arguing that small restaurants could be put out of business by health inspections, so maybe we should hold off on the idea. Rather, keeping a clean kitchen is something they all should be doing anyway from the get go.

Any pain for Amazon in Amazon's process is entirely Amazon's fault. If systems are built with the requirement of letting users export their data, then the additional effort to do so is trivial. This argument about the GDPR essentially boils down to technical debt from companies that played fast and loose with personal information, and we shouldn't entertain it.
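
One way to make "built with the requirement" concrete: tag personal-data fields in the data model from the start, so export (and deletion) can be derived mechanically instead of retrofitted. A rough sketch, with hypothetical field names:

    from dataclasses import dataclass, field, fields

    @dataclass
    class Customer:
        id: int
        email: str = field(metadata={"personal": True})
        shipping_address: str = field(metadata={"personal": True})
        internal_risk_score: float = 0.0  # internal only, never exported

    def export_personal_data(record) -> dict:
        # Walk the dataclass and return only fields tagged as personal data;
        # a deletion routine could blank exactly the same fields.
        return {f.name: getattr(record, f.name)
                for f in fields(record)
                if f.metadata.get("personal")}

    c = Customer(id=1, email="jane@example.com", shipping_address="1 Main St")
    print(export_personal_data(c))
    # {'email': 'jane@example.com', 'shipping_address': '1 Main St'}

When the tagging exists from day one, "export everything we hold on this user" is a loop over models rather than a multi-year archaeology project.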


> If systems are built with the requirement of letting users export their data, then the additional effort to do so is trivial.

It’s unreasonable, IMO, to expect companies to have foreseen legislation arriving two decades after they were founded, and to have built, in advance, a system for retrieving user data that has no profit-generating potential.

GDPR is good because prior to it there really wasn’t any economic incentive to provide this information.


You're implying that arbitrary "legislation" just arose out of the blue. Rather, it's based on a long held idea that companies are merely trustees for customers' data. So their position is more akin to having built a shed straddling a property line a decade ago, and now complaining that they couldn't have known that their neighbor might eventually want it moved.


I never said GDPR is arbitrary legislation. In fact, I called it a good thing in my initial post.

My point is that without legislation companies generally are not going to do things that don’t make them profit directly or indirectly. Aggregating user data for users to see is not something that really generates revenue and so companies prior to GDPR didn’t really do this en masse.


Your argument rests on the idea that the GDPR was an unforeseeable (arbitrary) requirement, rather than a straightforward implementation of a predictably-relevant Schelling point. While businesses won't go out of their way to do things that don't generate revenue, it's not unreasonable to think they will do some basic forward-looking due diligence. When storing personal information on a whole bunch of people is a core part of your business, it's reasonable to expect that eventually those people will want some control over the records kept on them.


1. Privacy legislation existed in European countries for years (and often for decades)

2. GDPR was in the works for several years, and when it went into effect, companies were given 2 years to become compliant

3. GDPR was adopted in 2016 and has been enforced since May 2018

So please stop with the "poor companies could not foresee this, and didn't have the time to implement this"


Culturally, Europeans have valued privacy and data protection for quite a while now. The ePrivacy Directive is from 2002 (derisively referred to as the "cookie law"), and the GDPR had a multi-year grace period. It's simply a result of companies putting off building this kind of functionality for far too long.


The parent argument was about making it "instant and pretty for you to look at all the data" - not the GDPR in general, which I fully agree with and like very much. Giving users the power to get their data is a very different thing from forcing companies to present that data in a way laypersons can understand and "like".


From the article:

> Given Amazon’s obsession with speed and eliminating friction to foster faster consumerism, the dawdling data solicitation process seems like it just might be intentional, designed to dissuade requests.

> It ultimately took about 19 days for Amazon to fulfill my data request, in stark contrast to its reported median time of 1.5 days to process a data request, as per the company’s California Consumer Privacy Act disclosure for 2020. There was no option for expedited Amazon Prime data delivery and no button equivalent to an instantaneous Buy Now (née 1-Click) option when selecting my data.

When you use Amazon services, I don't think there is a single, global database of all your data. Amazon has many different offerings (Prime Video, Alexa, Music, Photos, books), often with many individual organizations and sets of teams within them. Each customer-facing feature is supported by some N number of services, which collect and store data in different systems. These can be modern-day systems built from the ground up with "privacy data reporting" as a first-class feature, or they could be legacy systems built any time before GDPR and other compliance laws came online.

Some of these systems are write optimized as opposed to being read optimized. Others aren't even backed by a relational or NoSQL database. Instead, they may contain your data in some format that you cannot quickly query in constant time.

It makes little sense for Amazon or any other company to invest hundreds of millions of dollars, if not more, to stand up entire organizations to migrate off these systems - simply because a median 1.5-day turnaround time is too high, or so that Nikita Mazurov doesn't have to wait 19 days. Presumably that 19 days is closer to their 80th or 90th percentile turnaround time. A quick search suggests the maximum turnaround time under the GDPR is about one month.

Any single system that has to integrate with practically EVERYTHING else in your entire company is going to be complex, no matter how much you try to simplify it. Your data may be stored in some format that's not meaningful if it's given to you as is.

Or that data may contain proprietary information or implementation details. For instance, I've built systems where we stored "magic numbers" in place of strings in a database, mainly to save on storage costs. I probably wouldn't want to return those magic numbers to a customer, because they would be meaningless.

What I'm getting at is that even returning one record from one specific service isn't necessarily just a SELECT query (assuming the data is stored in a relational DB to begin with).
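
A toy illustration of the magic-numbers point (all codes and tables invented): the raw row is compact but useless to a customer, so an export pipeline has to join it back through internal lookup tables for every such system it touches:

    # Invented internal lookup tables: small ints are cheaper to store
    # than repeated strings in a write-optimized event store.
    DEVICE_CODES = {1: "Kindle", 2: "Fire TV", 3: "Echo Dot"}
    EVENT_CODES = {10: "page_turn", 11: "voice_query", 12: "playback_start"}

    # What the backing store actually holds for one event:
    raw_row = {"cust": 42, "dev": 3, "evt": 11, "ts": 1648339200}

    def decode_for_export(row: dict) -> dict:
        # Translate internal magic numbers into something a human can read;
        # a real export pipeline repeats this for every such system.
        return {
            "device": DEVICE_CODES.get(row["dev"], "unknown"),
            "event": EVENT_CODES.get(row["evt"], "unknown"),
            "timestamp": row["ts"],
        }

    print(decode_for_export(raw_row))
    # {'device': 'Echo Dot', 'event': 'voice_query', 'timestamp': 1648339200}

Multiply that decode step by dozens of services with their own storage formats and you get why a full export is an aggregation job, not a single query.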

This article is full of outright negativity, trying to fuel outrage and assuming everything on Amazon's side is malice, incompetence, or some combination of both. I couldn't help but look up the author's page on The Intercept:

> Nikita Mazurov is a security researcher focusing on privacy issues revolving around source protection, counter-forensics, and privacy assurance.

I don't use such strong language on HN, but here's my own thesis: this is an egregiously padded resume. Best case, it describes a university student/researcher who has never actually solved a real-world problem. Combine that with how the piece reads, and I think this article was deliberately written to generate clicks by manufacturing outrage.


Probably easier to buy it off the Dark Web.

And what is the Bitcoin to Bezo Bucks conversion rate right now?


My Amazon account got accessed from within; several Amazon employees/reps confirmed it. But when I asked who did it and what happened to those employees, they wouldn't tell me anything. It's ANNOYING.


When you buy from Amazon, you are supporting their various awful practices. Yes, you


I will never be able to reconcile humans' simultaneous need for everything to be good and pure but, you know, also cheap shit. Accept and embrace the gray area we all live in.



