Data Transfer Project by Apple, Facebook, Google, Microsoft, etc. (datatransferproject.dev)
250 points by aleyan 7 months ago | 69 comments

Would potentially be useful if you could take advantage of this _after_ you've been excommunicated from a service for whatever unknowable violation you committed.

Presumably, compliance with this standard would enable the "here's your shit" part of "here's your shit, now get out." for your "excommunication".

That's what Coinbase does.

Not quite. If you never gave them ID back in the day, you can’t get out any more.

So give them your ID.

That should be in the digital bill of rights.

I'm curious... has anyone drafted a digital bill of rights? If not, maybe someone should in order to get the ball rolling.

I've wanted one for the better part of a decade. It would be great to have something like the first-sale doctrine for digital goods, some method of eliminating phone-home DRM when a business shuts down or service is discontinued, etc.

It's in the GDPR - right to data portability (and associated sections). I guess that's as close to a data bill of rights as we have right now.

GDPR, as a side effect, entitles users to get a copy of their data after a ban, as long as the service has not already deleted the data.

But it could be a pdf?

GDPR says it must be machine readable:

> The data subject shall have the right to receive the personal data concerning him or her, which he or she has provided to a controller, in a structured, commonly used and machine-readable format and have the right to transmit those data to another controller without hindrance from the controller to which the personal data have been provided


DTP tackles Article 20 of GDPR

And just below this article is currently another saying how companies don’t really give a proper damn about GDPR.

Would you be able to trust GDPR to actually return “useful” data?

If curious, past threads:

Data-Transfer-Project - https://news.ycombinator.com/item?id=23887000 - July 2020 (27 comments)

An open source platform promoting universal data portability - https://news.ycombinator.com/item?id=17596146 - July 2018 (10 comments)

The Data Transfer Project - https://news.ycombinator.com/item?id=17580502 - July 2018 (47 comments)

The Data Transfer Project - https://news.ycombinator.com/item?id=17574707 - July 2018 (50 comments)


Not sure if there are others, but thanks for this.

For me, it is difficult to pin down what exactly this type of thing should be.

Is it purely for data migration? ie: I am closing my facebook account and want to extract an archive copy of all my contacts, posts, uploads, etc

Is it better to function as a direct transfer? How could it possibly make sense to transfer my old hackernews comments to my new facebook account?

The more I think about it, the more I just come back to email. Not necessarily the specific implementations, just the high level design: From any domain, I should be able to send a direct message to a contact in any domain. They should be able to view any basic[0] content I post (text, images, calendar) and respond in kind with basic content regardless of the domain either of us use.

I'm not sure that fully-federated-everything is the best answer and I would expect most reasonable implementations to include "Sign in at facebook.com for the best experience" or whatever.

I can't personally imagine the ideal system yet but I assume it must be somewhere in the unmapped middle ground between Facebook/Twitter/Apple silos and thousands of impossible-to-trust sloppily-federated micro-domains hosted by random individuals.

Edit: As an aside, the issue of authentication seems critically important, with no clear designs that would provide a secure and usable solution. Though the issue of account name squatters already exists, it is relatively manageable with so few domains and no inter-operability between domains.

[0] This concept of "basic" data seems to be more-or-less captured by the "verticals" described here https://datatransferproject.dev/documentation

On authentication - or rather, authorization, it's usually not relevant to establish identity as opposed to access rights - I strongly feel this should build upon cryptographic decentralized identifiers - on registration, send the service a signed number of your choice.

You can now sign in on future visits by signing messages with the same key. No e-mail or phone number needed (but can still be requested by the service, of course).

We're kind of seeing this by a second-order effect in the Ethereum dapp space, where you need this functionality to interact with the blockchain etc. Every user has some form of Web3-compatible software, most commonly the MetaMask browser extension. I think it's an interesting ground where this could start spreading - the key infrastructure etc. is already in place!

(And in case anyone gets confused, it can be used perfectly fine without actually transacting to any blockchain or holding any cryptocurrency - it's just normal elliptic curve keys with easy-to-use APIs)
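A minimal sketch of what that sign-in flow could look like, assuming plain ECDSA over secp256k1 (the curve Ethereum keys use) written out in pure Python so it needs no wallet or blockchain. The `Service` class, its method names, and the server-issued nonce (added here so an old signature cannot be replayed) are assumptions for illustration, not how any real wallet or site does it. Toy code only: it is not constant-time and should not guard anything real.

```python
import hashlib
import secrets

# secp256k1 domain parameters: y^2 = x^3 + 7 over the prime field P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def keygen():
    """Return (secret scalar, public point)."""
    secret = secrets.randbelow(N - 1) + 1
    return secret, mul(secret, G)


def sign(secret: int, message: bytes):
    z = digest(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh per-signature nonce
        r = mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * secret) % N
        if r and s:
            return r, s


def verify(public, message: bytes, signature) -> bool:
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(digest(message) * w % N, G), mul(r * w % N, public))
    return point is not None and point[0] % N == r


class Service:
    """Hypothetical service that knows users only by their public key."""

    def __init__(self):
        self.accounts = {}
        self.challenges = {}

    def register(self, name, public, number: int, signature):
        # The user proves key ownership by signing a number of their choice.
        if not verify(public, str(number).encode(), signature):
            raise ValueError("bad registration signature")
        self.accounts[name] = public

    def challenge(self, name) -> bytes:
        # Assumption: the service picks the message to sign on later visits.
        self.challenges[name] = secrets.token_bytes(16)
        return self.challenges[name]

    def login(self, name, signature) -> bool:
        nonce = self.challenges.pop(name, None)  # each challenge is single-use
        public = self.accounts.get(name)
        if nonce is None or public is None:
            return False
        return verify(public, nonce, signature)
```

Usage would be roughly: `secret, public = keygen()`, then `service.register("alice", public, 42, sign(secret, b"42"))`, and on a later visit `service.login("alice", sign(secret, service.challenge("alice")))`.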

You can talk about authorization alone but only because authentication is necessary to handle authorization and so can be assumed.

> send the service a signed number of your choice. You can now sign in on future visits by signing messages with the same key

This part is the authentication.

My extreme opinion is that the post office should run OAuth servers.

Possibly related to Apple’s recently added feature to transfer photos from iCloud to Google Photos: https://news.ycombinator.com/item?id=26344739

Yes, that's exactly what it is. It's Apple's implementation of DTP.

After antitrust action from regulators and lawmakers from EU and the US seems inevitable, the contributors to the Data Transfer Project now, suddenly, believe portability and interoperability are central to innovation.

Well yes, that's how it works.

Laws create incentives and businesses respond rationally.

I'm personally glad it works.

Businesses are supposed to make money and lawmakers are supposed to set the rules of the playing field to benefit consumers. It's a good combination.

Exactly. This is where I disagree with the left camp when they paint corporations as evil. No, big companies are not inherently evil. They played by the rules that you, in Congress, laid out. It is foolish to expect a profit driven entity to do things out of the goodness of its heart when its competitors are utilizing the rules available to them. It's like saying a football team is evil because they are physically tackling their opponents.

I think most people see acting entirely and solely in the pursuit of maximizing profits as evil and they use "corporations are evil" as a shorthand for that.

That evil behaviour from corporations is largely due to the focus on maximizing shareholder value to the exclusion of all else. That's a fairly recent way of running a business that came about around 1970, primarily from Milton Friedman.

I think most people expect a company to work towards healthy profits, while also taking into account all stakeholders, not just shareholders, their business interacts with.

You're right in the sense that the rules are the way they are, so corporations act within those rules. However, those rules were largely put in place to make it easier to pursue the maximization of profits and were pushed by corporate lobbying.

So, if an entity wants to act in an evil way, but is constrained by rules, then gets those rules changed so it can act evilly with impunity, surely that entity should be seen as evil?

I think we are both saying the same thing. The problem is creating a weak set of rules (you can pay employees less, you can fire at will, you can circumvent taxes, you can pollute etc.), and then expecting one company to do more when its competitors don't while both are competing in the same race. I think Congress has failed the people and hides behind "look at that evil corp".

Heck, Zuckerberg is pleading with Congress to pass strict privacy laws and Amazon is pushing for a higher minimum wage. The reason is that it will level the field, so they don't have to play alone by a different set of rules and see users go to competitors.

Zuckerberg wants Congress to pass privacy laws only because that may preempt state level laws. If the states do it, there will be a mix of laws, some of which will inevitably hurt Facebook. This is a common tactic and is made easier by the easy money in DC.

Sure. But it is also the right thing to do. Online privacy law is best made at the federal level (not at state and county levels).

Why? I see no inherent reason why this is so.

The “left” made laws, then the “right” weakened them and stopped enforcing them. Corporations are not even playing by the laws on the books now, especially the antitrust laws. What’s missing is the will to hold companies accountable for breaking the laws that are destroying competition.

Right. What you are saying is that govt is being ineffective in making and enforcing laws. They should maybe focus on fixing that and less on "evil mega corp" narrative to gain cheap political points.

Both left and right populism personalizes things.

Traditional leftist position is that evil is structural, class etc. People are people. Changing structures fixes problems.

Traditional right position is that structures don't matter, less the better. People are mainly poor because they are lazy. Corporations are evil because they have bad people in them. Remove those people and you fix things.

> No, big companies are not inherently evil.

No they aren't. They are mostly amoral. Meaning they aren't inherently moral or immoral. They just act in accordance with their main directive which is to make the largest possible profit while keeping with the letter of the law (mostly).

However, what the left emphasizes and the right often forgets is that they (corporations) aren't just reacting to pressures from the competitors, the public and the law makers. They also exert enormous pressure in all these spheres in the direction that benefits them. Not too rarely to the detriment of the public at large. That's when they can and do sometimes turn evil.

> It is foolish to expect a profit driven entity

Yeah, this is the actual problem the left has: profit-driven entities. Nobody cares about groups of people working towards a common goal (ie, "corporations").

Of course profit-driven entities want to increase their profits at all costs. What's desired is systemic change and reorganization of production around different principles beyond just profit (or rather, eliminating it entirely). No leftists have a problem with companies themselves.

There are also numerous examples of companies blatantly and knowingly ignoring the rules or lobbying to have them bent to their wants. Or both.

The US took a very long pause from this principle.

It all started with Robert Bork and his book The Antitrust Paradox https://en.wikipedia.org/wiki/The_Antitrust_Paradox

Google and Twitter have been offering data exports for ages though, but importing that data into different products often required either purchasing shitty proprietary software or using scripts that were hacked together and abandoned on someone's GitHub. Don't know if there's something similar for Microsoft and Apple though, but in the end this is just a standardized API on top of already existing APIs and no one involved had to reinvent the wheel here.

I'd be surprised if this wasn't a widely requested feature that all involved companies have been ignoring in their backlogs for too long and now they've accelerated this, got management approval and finally managed to get a couple of senior engineers together because of impending legislation that might force their hand.

>or using scripts that were hacked together and abandoned on someone's GitHub.

I think you meant to say 'thankfully provided by their benevolent creators for my benefit'.

Obviously, yeah, that's actually something I should have added. It just isn't a solution for most people who want to switch from X to Z. But of course it's awesome of everyone scripting and reversing these things, which takes a lot of time. They definitely deserve praise and/or at least a coffee.

An alternative would be to self-host with something like sandstorm.io, and granting temporary permission to cloud providers to access some of the data, on a per-grain basis.

I have no idea how the economics would work with this.

FYI, talked about 3 years ago here: https://news.ycombinator.com/item?id=17574707

Nice to see it's finally landing.

Can someone please eli5 how this relates to Solid [0]? Is it an alternative? Completely unrelated? Would they work together -- and if, how?

[0] https://solidproject.org/

In Solid the primary copy of your data lives in a neutral server and multiple apps can access it. In theory, that is, since Solid isn't really deployed and major apps will never be willing to adopt it.

With data portability you can export data from one app and import it into another but there's no ongoing sync.
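To make the "export, then import, no ongoing sync" shape concrete, here is a rough sketch. Everything in it is hypothetical: the `Photo` model, the `Exporter`/`Importer` protocols, and the two stand-in services are invented for illustration and are not taken from the DTP codebase. The point is only that each service translates to and from a shared model once, and nothing connects the two afterwards.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Photo:
    """Hypothetical shared model that both services agree on."""
    title: str
    url: str


class Exporter(Protocol):
    def export(self) -> Iterable[Photo]: ...


class Importer(Protocol):
    def import_items(self, items: Iterable[Photo]) -> int: ...


class ServiceA:
    """Stand-in source service with its own internal field names."""

    def __init__(self):
        self.rows = [{"name": "beach", "href": "https://a.example/1.jpg"}]

    def export(self):
        # Translate native rows into the shared model.
        return [Photo(title=r["name"], url=r["href"]) for r in self.rows]


class ServiceB:
    """Stand-in destination service with a different native format."""

    def __init__(self):
        self.library = {}

    def import_items(self, items):
        count = 0
        for photo in items:
            self.library[photo.url] = photo.title
            count += 1
        return count


def transfer(source: Exporter, destination: Importer) -> int:
    """One-shot copy: later changes at the source are not propagated."""
    return destination.import_items(source.export())
```

After `transfer(ServiceA(), ServiceB())`, anything added to the source later stays there until someone runs another transfer, which is the difference from the Solid model described above.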

The neutral server ends up as a reconciliation engine for eventual consistency if it is unable to gain enough traction to be the source of truth.

solid is dumb. and something that only makes sense for a comp sci from the 60s. everyone else who reads the project in simple english will see how dumb it is today.

in simple english: It is the dream project of whoever came up with cookies. basically cookies as first party data that you can download, upload, share. All while having either the trouble of hosting a lot of infrastructure (just like the creators of email protocol thought everyone would do, ha!) or relaying all that info to a 3rd party like google or facebook. The nightmare scenario to everyone saying 3rd party cookies are bad.

For me, this touches on everything that is wrong with the setup we find we have. How we lose our privacy and are beholden to our corporations and governments.

* We are separated from our data. It should be ours, and we should be able to allow for corporates to access it if we choose to and we are able to understand the usage.

* The options we are given here are to be able to move our data from one corporate entity to another. Hardly a solution for individual ownership of one's data and privacy.

* We are looking to government legislation to make this right for us, but governments like having access to all the data that the corporations share with them. Governments are in the business of managing populations at scale - the more information they have, the better modelling, nudging, manipulations of the population they can do. Basically corporate and governmental interests align.

* Not to forget that corporations lobby governmental entities for the legislation they want. Even if the legislation states one thing, there are ALWAYS backdoors that are understood.

I'm sorry to say that the attack on privacy is a coordinated one with governments AND corporations. If you hope that this time the government will write better legislation or that corporates will do the right thing, you are mistaken. They only care about being perceived to do the right thing - so public relations.

If you are aware of all that, and have a solution, I would be interested to hear. I think any solution would involve individuals acting very defensively about their data. Any solution that begs government or corporations for better action this time is doomed.

I just want to be able to airdrop things from Mac to Windows or Android devices

If you specifically mean files you can try https://www.sharedrop.io to share files in the browser (I'm not affiliated in any way, just an occasional user.)

I would very much like the ability to transfer between services but also completely delete once I've backed up locally.

A more significant development, I think, would be if online services let you keep your various accounts permanently in synch. That way you could write a post on one platform and know that your followers on another platform would be able to see it.

Sadly that still wouldn't fix the problem that you have to visit each platform to see responses from users that don't similarly syndicate their own posts. That might lessen those platforms' concerns about implementing this automatic synching feature, though, and take them a step closer to being properly federated.

That's why the solution is federation. Data portability is a kludge designed to draw attention away from federation.

Ben from Stratechery touched on that recently.

A good point he had is that this kind of thing would, seemingly counter-intuitively (but it makes sense) strengthen the incumbents and stifle both innovation and competition.

Innovation -> Interoperability has a (maintenance) cost and would probably quickly devolve to "lowest common denominator functionality" while raising yet another barrier to entry for new companies

Competition -> Incumbents could pretty much just exploit new companies as "market research" and gobble up all their features and data if they deem the experiment successful, at no cost

I'm not sure how strong those arguments are, actually.

* Interoperability is a feature just like any other, and the difficulty of implementing/maintaining it must be a fraction of the difficulty of competing with the network effects of entrenched online services. Indeed, I would hope that interoperability would pay for itself, in terms of effort, because of the number of users that can migrate to the new service, as well as being a selling point in itself (since people would be reluctant to sign up to yet another incompatible silo).

* I think incumbents don't need to rely on interoperability to do market research and copy features of competitors. It's true that Facebook would be able to see private posts on Mastodon instances that it federated with, but I don't know what useful data Facebook could gain from that which it couldn't gain from A/B testing its own huge user base. If anything, I would expect Mastodon instances to gain more from this exchange, because they are gaining access to the bigger pool of data.

Me too. But the idea that FB could ever delete user data made me snort, and I’m not sure the others are an awful lot better.

Isn't Facebook legally obligated by GDPR (for example) to delete a user's data on request? Noncompliance is risky; it only takes one disgruntled employee to cause you a lot of pain.

There is no way you can actually delete anything in the cloud, because once it is in the cloud, it is not under your control anymore.

The ability to "delete" something is only apparent. You can just tell the customer you have erased her data, but preserve it anyway, not to mention other parties like secret services or competitors, that could be interested in your data too.

If you have valuable data, people (like the Chinese or competitors) will offer your workers millions of dollars (or just threaten them or you, like 3-letter agencies) for access to this data.

What would really help is a STDlib across major languages for the core data models. Think the programmer's equivalent of iLife. You're not going to sell me on a big REST structure until I'm happy with the objects I'm getting.
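As a rough idea of what such shared objects might look like in one language, here is a sketch using Python dataclasses with a JSON round trip. The `Contact` and `Post` types and their fields are made up for illustration; they are not the data models any of the participating companies actually use.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Contact:
    name: str
    emails: List[str] = field(default_factory=list)


@dataclass
class Post:
    author: Contact
    text: str
    created: str  # ISO 8601 timestamp, kept as a string for simplicity


def dumps(post: Post) -> str:
    """Serialize a post, including its nested author, to JSON text."""
    return json.dumps(asdict(post), sort_keys=True)


def loads(raw: str) -> Post:
    """Rebuild the typed objects from the JSON text produced by dumps()."""
    data = json.loads(raw)
    return Post(
        author=Contact(**data["author"]),
        text=data["text"],
        created=data["created"],
    )
```

The same handful of types would then need an equivalent definition in each other language, which is where the real effort of a cross-language "stdlib" would go.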

Is this going to be open to end users, or limited to the existing tech oligopoly?

I wonder how they vet any small companies? Just like the stealing/selling of Chrome/Firefox extensions, how will this work if a small company with Nextcloud offers migration and is then 'acquired' by an 'evil' company?


> Q: Why aren’t there more, smaller companies in the Project?

Notably absent is Amazon.

Yeah...I'd really like to extricate myself from Amazon Photos/Drive, but I haven't even begun to investigate how much of a project that will be.

I’ve done that. You just download everything locally and reupload to your new service. Of course if you have a lot stored there you have bandwidth/storage to think about.

We need this for music streaming services

Even the office suite market's money today is in online collaboration. Microsoft would benefit tremendously from a decent open source reader for Microsoft Office formats.

I wonder if this is in response to (perceived) threats of pending regulation...

Unless you can choose to move data instead of copy data from service to service then all this is doing is making it real easy for every service to get access to all of the big pool of data that every other service has on you.

Isn’t it enough to just copy and then delete origin account?

That's what a move is. But that's not what typically happens even when you delete your account. Facebook, for example, only claims to get rid of some of your data, apparently only removing personal identifiers.
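For what it's worth, the "copy, then delete the origin" idea can be written down in a few lines. This is a toy sketch that uses two in-memory dicts as stand-ins for the two services, and the `move` function is hypothetical. It only models what the user can observe, not whether a real origin service actually erases its copies.

```python
import hashlib


def move(key, source: dict, destination: dict) -> None:
    """Copy one item, verify the copy, and only then delete the original."""
    data = source[key]
    destination[key] = bytes(data)
    copied = hashlib.sha256(destination[key]).digest()
    if copied != hashlib.sha256(data).digest():
        # Roll back so a failed transfer never loses the original.
        del destination[key]
        raise OSError("copy verification failed; source left untouched")
    del source[key]
```

The ordering is the whole point: the delete is the last step, so a failed copy leaves the data where it was.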

> DTP is still in development and is not quite ready for everyone to use yet

Well, ok then.
