
GDPR makes it easier to get your data, but that doesn’t mean you understand it - Tomte
https://www.theverge.com/2019/1/27/18195630/gdpr-right-of-access-data-download-facebook-google-amazon-apple
======
xg15
As much as it has been ridiculed, those GDPR data exports seem like a good
use-case for RDFa or microformats:

The exports need to fulfill two use-cases, which often seem at odds with each
other:

1) Provide the data in a well-known, easily machine-readable format, so it can
be imported by an alternative service.

2) Provide the data in an easy to understand human-readable format, so users
- even non-technical ones - can understand what data was collected.

JSON is a good format for machine-readability and is relatively understandable
for technical users - however, as the article explains, without the schema,
it's difficult to be sure what many fields actually mean. Also, for
non-technical users, it's not always clear how to open JSON files and, if
unformatted, they might be hard to inspect without specialized software.

HTML on the other hand is very good at presenting data in a self-explanatory
and understandable way and is even relatively easy to parse. However, the
structure can be completely arbitrary, so for import, you'd effectively have
to write a scraper for each data item and update it each time the formatting
changes.

Seems to me, you could solve both problems by exporting everything as an HTML
file, but reusing already known data models and annotating the actual data
with a standardized notation.
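
To make the idea concrete, here is a minimal sketch of how an importer could
read such an annotated export. It assumes the export marks each value with an
RDFa-style `property` attribute and uses schema.org terms; the sample HTML,
the `PropertyExtractor` class and the `extract_fields` helper are all
hypothetical, and a real RDFa parser would also handle nesting, `content`
attributes and multiple vocabularies.

```python
from html.parser import HTMLParser

# Hypothetical export fragment: readable in a browser, but every value is
# tagged with a "property" attribute (schema.org terms are an assumption).
EXPORT_HTML = """
<div vocab="https://schema.org/" typeof="Person">
  <p>Name: <span property="name">Jane Doe</span></p>
  <p>Email: <span property="email">jane@example.com</span></p>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text of leaf elements that carry a 'property' attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        # Remember which property we are inside, or None for plain markup.
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        # Only text inside an annotated element is kept.
        if self._current is not None:
            self.fields[self._current] = self.fields.get(self._current, "") + data

    def handle_endtag(self, tag):
        self._current = None


def extract_fields(html):
    """Return a dict mapping property names to their text values."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields
```

The labels ("Name:", "Email:") can change freely without breaking the import,
because only the annotated values are read.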

------
sroussey
GDPR exports tell a lot about how a company models their data. Some
companies give you something close to a SQL export, which shows you their
internal schema. Others use obvious export formats.

None use the same format for much of anything.

At privicy.com we are automating GDPR exports for your accounts, and so we
study these export files.

It’s almost like there is a competition for how to make the data different.
Let’s take location data as an example.

Latitude and longitude: are they integers (Google), floats (Facebook), or
combined into one value as a string (Snapchat)?
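
As a rough illustration of what an importer ends up doing, here is a sketch
that normalizes those three shapes into plain degrees. The details are
assumptions, not taken from any vendor's documentation: integers are treated
as degrees scaled by 1e7, and the combined string is treated as a
comma-separated "lat,lon" pair. The `normalize_location` helper is
hypothetical.

```python
def normalize_location(value):
    """Return (lat, lon) as floats in degrees from one of three assumed shapes."""
    if isinstance(value, str):
        # Assumed combined form: "lat,lon" in decimal degrees.
        lat, lon = (float(part) for part in value.split(","))
        return lat, lon
    lat, lon = value
    if isinstance(lat, int) and isinstance(lon, int):
        # Assumed integer form: degrees multiplied by 1e7.
        return lat / 1e7, lon / 1e7
    # Float form: already in decimal degrees.
    return float(lat), float(lon)
```

Every extra variant in the wild means another branch like these, which is the
cost of nobody sharing a format.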

And why does Uber record location every time you touch the screen?

And then there are people formats...

