What does it mean for users to be in full control over their data [pdf]

dahart · on Dec 26, 2022

This is an interesting idea with a good-hearted vision of privacy, so I’m glad someone is thinking about it and proposing something. The most valuable idea here is that you originate your data, not the company you’re interacting with just because they implemented a form. The laws currently let them frame themselves as owners of the data, so the number one thing we could use is a law that views personal data as owned by each person regardless of who collects it or how they collect it.

I’m a bit skeptical that the shape of this concept as outlined will come to be, though maybe the framework and ideas can be used. It feels like they dove into the implementation, describing how it can be done, but haven’t addressed how to get companies on board nor provided incentives for them. We already have the technology for privacy and data control, what we really lack is the legal environment and proper incentives. In that sense I’d guess what we need is the strategy for how to pass these laws and establish incentives more than a description of the tech pipeline, no?

andsoitis · on Dec 26, 2022

> The most valuable idea here is that you originate your data, not the company you’re interacting with just because they implemented a form.

While this may work for certain use cases, I don’t know that it is a stable definition of “my data”. One starts running into paradoxes when you consider data you generate when having a conversation with others, for example. I haven’t given this deep thought but perhaps our desire to use concepts of “property” is ill-fated. On the other hand, a privacy lens can capture cases that “my data is what I generate” cannot adequately address, such as a friend snapping a picture of the inside of my house and posting it on social media.

dahart · on Dec 26, 2022

Oh yeah I totally agree, I don’t know how to define personal data (and I didn’t mean to presume to state one), especially whenever it involves someone else, and I think you might be right about whether the property concept is useful. I also think it’s really problematic to aim for the ability to revoke information we’ve chosen to share publicly. This concept piece doesn’t quite say that explicitly; it says you should be able to revoke access to your vault, and that there should be information that can only be accessed through the vault (not copied), which effectively adds up to enough control to make private information that was previously shared. Some people are getting the idea that information that has been made public (even by themselves) should be revocable. We’ve never had that ability in history before, and information that’s “public” falls outside the bounds of this paper (because public information can be legally copied). Still, it’s an interesting debate and I think we’re going to see these paradoxes start to resolve with better and better reasoning. I’m glad it’s being debated even if there’s no solution yet.

eternityforest · on Dec 27, 2022

Transferable ownership is a very old idea, I don't see why it shouldn't apply to data.

What if that personal data is an autobiography that I have released under CC0? If the law says I still own it, can ask for it to be deleted, etc, then CC0 or any other creative commons license means nothing.

What we actually need is laws preventing people from getting arrested or losing access to housing, credit, or insurance based on warrantless data collection, rather than making the collection effectively impossible.

I don't want a world without CCTV cameras, where cell carriers don't know our last position if we get murdered, and where journalists can't report on things in public or look through a data breach.

dragontamer · on Dec 26, 2022

Suppose a store notes that Customer#510 has stopped buying tampons and has started to buy baby clothes, and wishes to sell this information to an advertiser. Does Customer#510 have any expectation of control over this data? Who really owns the data?

You are not in control of what others know about your shopping history. Your shopping history _WILL_ change when life events start up, and that's enough for advertisers to target you with new ads.

I'm sure a private individual doesn't like "leaking" this kind of data to others. But even if you didn't have an account, web browsers have cookies, your computer and internet connection have IP addresses, and your credit card numbers and bank accounts also track you. The store (Amazon) will know your change of behavior and use that to bombard you with baby clothes advertisements.

pronlover723 · on Dec 26, 2022

I don't know where to draw the line but I feel like maybe one line would be a business can not share data about me with anyone else. They can use it to help their own business directly but can't share it with another business for any purpose.

As an example, there was (is?) a law that a video rental store can't share your rental history.

https://en.wikipedia.org/wiki/Video_Privacy_Protection_Act

Unfortunately I suspect that law didn't carry over to Netflix/AppleTV+/Hulu/Amazon/PornHub viewing history nor did/does it apply to purchase history like say Patreon/OnlyFans, etc...

I'm sure that limit is problematic as well. Various companies might want to hire a 3rd party to do data analysis. Should that be allowed? What about a service like Office 365 where there are 3rd party apps?

Taywee · on Dec 26, 2022

It's a spectrum. Most people understand that when they interact with a store, that store can and will use that data in pursuit of operating their business better.

Very few people are blanket against any and all use of data about customers, or against any and all forms of advertisement. But most people really don't like companies harvesting their data in order to sell to other advertisers. Most people are sick of every company realizing they can make a little more money by advertising anywhere they possibly can, so now we're in a reality where the $2000 TV you buy in the store has pop up ads built in, records what you watch, and the company (LG in my scenario) will sell that data and use it to advertise to you more effectively. If you want to opt out entirely, you have to completely disconnect the device from the network.

Privacy and personal data aside, advertisement has strongly changed the consumer world for the worse. Nearly everything with a microprocessor tries to gather salable data, and now that can be everything from your toilet to your blender. You can avoid smart devices for the most part, but it's almost impossible to buy a modern TV that doesn't try to do more than display video and audio.

dahart · on Dec 26, 2022

> The store (Amazon) will know your change of behavior and use that to bombard you with baby clothes advertisements.

Targeted advertising is innocuous compared to some of the ways your shopping history could be used against you. Imagine if health insurance companies could decline claims based on too many purchases of pizza and beer over the years. I thought this was supposed to be illegal, and I don’t know which laws apply now or have in the past, but ~20 years ago I did hear from a friend who worked in the credit card industry that he had seen this kind of thing happening. Has left me with nagging fears ever since. (I guess I should find out more so I’m not just spreading FUD :P)

dredmorbius · on Dec 27, 2022

A friend worked for a Big River merchandiser.

In their fraud department.

As a software developer, part of that involved undergoing private investigator training.

It also involved utilising various consumer information databases.

The story I heard involved looking up an ex and seeing what kind of information was available. Not merely card balances and net purchases, but line-item detail.

This was in the early aughts. Which means that practices may have changed, but also that there is now vastly more information available.

Firms are not monoliths. There have been multiple instances of line workers, managers, and/or contractors whose interests have diverged from the firm's.

A number of eBay executives were convicted and sentenced to prison for harassing a couple whose newsletter the execs disliked (2022):

<https://www.cnn.com/2022/09/29/tech/ebay-exec-jail-harassmen...>

External hackers, thought to be from China, hacked Gmail largely seeking information on Chinese dissidents and Tibetan activists (2015):

<https://www.foxnews.com/tech/gmail-accounts-compromised-by-c...>

A former Twitter employee was charged with spying for Saudi Arabia (2019):

<https://www.theguardian.com/technology/2019/nov/06/twitter-s...>

And six Amazon employees were indicted for accepting bribes on behalf of sellers to distort the Amazon marketplace (2020):

<https://arstechnica.com/tech-policy/2020/09/doj-amazon-worke...>

So far as keeping contractors in line, even the NSA and CIA have had problems on that front --- Snowden and Winner, notably. Let alone their own staff (many cases of moles and foreign agents).

blackbear_ · on Dec 26, 2022

I think that the red line is definitely crossed whenever a third party is involved.

It's okay for a shop to optimize what they sell and recommend things based on customer data. What is not okay is for a shop to share customer data with any other third party.

Note that by third party I do not only mean advertisers, but I also mean hosting services such as Amazon, Google, Facebook etc.: if you think of them as long streets with shops on the sides, I do not think many people would appreciate having said street filled with surveillance cameras that track everybody's movement and purchases and sell this information.

ineptech · on Dec 26, 2022

I think part of this is spot on - the loss of control of our data is intimately tied to the fact that many of the use cases we want require a server, and most people don't have one - but the solution described in step 6, non-profit like co-ops to buy and share server resources, seems dubious to me. It seems much more likely that we reach the future they're describing, if at all, through typical families adding a cloud virtual server to the list of things they spend $10/mo on. The amount of data we need to share with the world won't require that much bandwidth, even including social media (for non-celebrities anyway).

Other than that, very intriguing. I don't know if it's all workable but one has to start somewhere.

deafpolygon · on Dec 27, 2022

> The amount of data we need to share with the world won't require that much bandwidth

At some point, bandwidth won't cost anything at all.

I envision a future where every home comes with a server for the people living there. This 'server' could be responsible for smart home features, e-mail, digital television, payment processing, and so on.

vgivanovic · on Dec 26, 2022

"Only original data are stored in the vault; no copies are allowed anywhere" means no backups. As a user, I don't want that; I want my data to be available independently of the physical medium it is stored on.

Even more problematic is: What about data stored on disk, for example, and in memory? That's two copies. Is that allowed? If so, how does that meet the requirement of "no copies allowed"?

dahart · on Dec 26, 2022

You can define the vault to include its backups, so that’s not really an issue, but you’re right that defining access in terms of copies is problematic. Copyright laws already address this by defining who has the right to make and distribute and consume copies rather than trying to define what exactly constitutes a copy in the digital age. Maybe what they need first is to establish a copyright over personal data that cannot be transferred, similar to Moral Rights?

yellow_lead · on Dec 26, 2022

I don't understand how one could prevent companies from making copies of this data. Presumably their service provides an API to access the data (one time or limited time). All it takes is one dev storing it in the database and your privacy model is ruined. That said, I didn't read the whole paper, so if there's a way around this I'm happy to be told.
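A minimal sketch of the failure mode described in the comment above, assuming a hypothetical vault that hands out expiring access grants. The names `Vault`, `Service`, `grant`, `revoke`, `read` and `fetch_and_store` are invented for illustration; none of them come from the Schluss paper, and the stored address is made-up sample data.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Vault:
    """Hypothetical personal data vault with expiring access grants."""
    records: Dict[str, str]
    grants: Dict[str, float] = field(default_factory=dict)  # token -> expiry time

    def grant(self, token: str, ttl_seconds: float, now: Optional[float] = None) -> None:
        # Allow reads with this token until the time-to-live runs out.
        start = time.time() if now is None else now
        self.grants[token] = start + ttl_seconds

    def revoke(self, token: str) -> None:
        # Remove the grant; later reads with this token are refused.
        self.grants.pop(token, None)

    def read(self, token: str, key: str, now: Optional[float] = None) -> str:
        current = time.time() if now is None else now
        expiry = self.grants.get(token)
        if expiry is None or current >= expiry:
            raise PermissionError("no valid grant for this token")
        return self.records[key]


class Service:
    """Hypothetical company backend that keeps whatever it reads."""

    def __init__(self) -> None:
        self.database: Dict[str, str] = {}

    def fetch_and_store(self, vault: Vault, token: str, key: str) -> None:
        # The "one dev storing it in the database" step.
        self.database[key] = vault.read(token, key)


vault = Vault(records={"address": "221B Baker Street"})
service = Service()

vault.grant("shop-token", ttl_seconds=60)
service.fetch_and_store(vault, "shop-token", "address")
vault.revoke("shop-token")

# The vault now refuses access ...
try:
    vault.read("shop-token", "address")
    access_after_revoke = True
except PermissionError:
    access_after_revoke = False

# ... but the copy made while the grant was valid is untouched.
copy_survives = "address" in service.database
```

Revocation closes the API but has no effect on `service.database`. Nothing in the access mechanism itself can reach a copy once it has been made, which is why the replies below turn to law and enforcement.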

flipbrad · on Dec 26, 2022

The answer would be incredibly hardcore privacy laws, backed with incredibly hardcore enforcement, in all jurisdictions.

In other words - a very dark pipe-dream.

kkfx · on Dec 26, 2022

Ehm sorry but talking about "full control of our own data" in a paper written with Microsoft Word and presenting a mobile app is like talking about peace while emptying a magazine at someone else's body.

Besides that, to have our data, which does not mean privacy but only means owning A COPY of our data in a fully locally usable form, we need home storage, desktops, home offline copies etc. We need to state one thing: to be Citizens in the digital world, just as we need a home in the physical one, we need a digital home that's OURS, not rented from someone else. Which today means having a home server for sharing purposes, with a static IP, enough upload, a domain name (or at least a subdomain) and so on.

WE CAN'T OWN anything on a mobile simply because the platform is managed/manageable from remote NOT under our control. We can't own anything on proprietary tools or services.

The dream has a name: classic internet. Which means a network of interconnected hosts, where one of them is ours, down to the hw. A world of desktop computing where we both produce and consume, sharing between peers, the cloud at most reduced to "scalable cache/computing resources in the Plan 9-like model" and so on. The rest is just marketing.

fsflover · on Dec 26, 2022

> WE CAN'T OWN anything on a mobile simply because the platform is managed/manageable from remote NOT under our control.

It depends. If you use Librem 5 or Pinephone, then you do control your data. Apart from that, I agree with your comment.

andsoitis · on Dec 26, 2022

> WE CAN'T OWN

Ownership isn’t predicated on physical custody.

blackbear_ · on Dec 26, 2022

It's not so simple in the digital domain where CTRL+C/ALT+TAB/CTRL+V is all you need to have your very own copy of things.

dredmorbius · on Dec 26, 2022

See earlier discussion (6 months ago) about the project generally:

<https://news.ycombinator.com/item?id=31833026>

wintermutestwin · on Dec 26, 2022

In a capitalist society, we should be able to set the price for something that we own. My data is worth way more than the cost to provide the trivial services that Facebook, Gmail, etc provide.

We should also have the right to not sell it. My ISP with their monopoly, Facebook shadow profiles, etc are blatant examples of theft of my personal property.

dredmorbius · on Dec 26, 2022

This strikes me as a well-meaning project which has utterly misconceived both the source of the problem, and its solution.

Apologies for length. I'm thinking that privacy solutions need their own version of the "why your anti-spam idea won't work" checklist, which would be shorter...

<https://trog.qgl.org/20081217/the-why-your-anti-spam-idea-wo...>

I've written a bit on the hierarchy of failure, or reversing the sign, the requisite success chain, in problem resolution:

<https://old.reddit.com/r/dredmorbius/comments/2fsr0g/hierarc...>

Schluss fails at stages 2 & 3 (diagnosis and etiology), fails to clearly define 4 (objective), and embarks on "garbage can theory" solutionism in 5 (redress), a/k/a "when you have a hammer, every problem is a nail". Specifically Schluss is applying techno-solutionism to a problem which, whilst it has a technical component, is fundamentally grounded in commerce, law, and risk. Given these foundational failings, I'm confident in predicting that Schluss will fail in its (poorly-formed and poorly-communicated) objectives entirely. I say this with absolutely no joy.

Furthermore, Schluss's proposed solution fails because data simply don't work that way. The way information works is through records and transmission, where a record preserves a transmission and a transmission reads from and/or generates a record. The notion that a single canonical data store assures that by technical means alone further stores and transmissions don't happen evidences a grievous failure to grasp the problem.

I've compiled a set of early concerns (largely pre-1980) expressed over computerised data, which are recommended reading. In particular I point to the works of Paul Baran written whilst at RAND in the 1960s. For those unaware, Baran is one of the co-inventors of packet-switched data routing, an essential foundation of today's Internet.

<https://www.rand.org/pubs/authors/b/baran_paul.html>

<https://diaspora.glasswings.com/posts/bf4f5f10f6120138799c00...>

Today's extraordinarily invasive data-surveillance regime, in both its surveillance state and surveillance capitalism instantiations, arises from a set of factors:

- Data storage is cheap and immense. Total data storage has been doubling every few years (2--4 by a quick search), and has been for decades, dating to at least the 1960s.[1] Significant thresholds were crossed in the early 2000s when disk ("spinning rust") storage crossed the 1 GB threshold, with the emergence of SSD storage after about 2010, and with the increasing prevalence of multi-GB RAM storage, with 24 TiB presently among the highest-memory tiers available on Amazon AWS. Your Humble Correspondent recalls working on a shared-compute resource in the early 1990s with a score and a half other analysts which had an aggregate storage of about 2 GB, for a mid-sized data-heavy corporation.

- Legal theories, most especially "Third Party Doctrine" in the US, which holds that "people who voluntarily give information to third parties—such as banks, phone companies, internet service providers (ISPs), and e-mail servers—have 'no reasonable expectation of privacy' in that information." <https://en.wikipedia.org/wiki/Third-party_doctrine>[2]. Given the practical impossibility of conducting a normal life without use of such services, the doctrine effectively establishes a national surveillance apparatus and abrogates Constitutional guarantees of privacy.[3]

- A compelling commercial case. There are critics of surveillance capitalism's actual effectiveness, notably Cory Doctorow.[4] His case boils down to 1) influences are at best marginal and 2) it's a grift, mostly against advertisers. To which I respond that margins matter and advertising is an $800 billion global industry, costing roughly $100 per head, most of which is actually allocated to the 1 billion richest people on Earth, so if you're reading this, figure your "free Internet" (and television, radio, and legacy print media) is costing you $800 per year, for each member of your household. Free ain't cheap. And it's remarkably corrosive in terms of privacy, manipulation, dark UI/UX patterns, and shitty, shitty content.

- Legally-permitted data exchange. Data-based credit risk assessment dates to the dawn of modern banking (a fuzzy line, as are most, but let's posit 12th century Italy), with currently extant firms such as Dun & Bradstreet tracing their origins to the early 19th century.[5] Practices such as ethnic, religious, racial, and "old-boy" network profiling led to outcomes such as redlining (mentioned by Baran in his writings, again, highly recommended). Modern data-based predictions utilising machine learning result in what Cathy O'Neil has colourfully termed "weapons of math destruction" in her book of the same name.[6]

- Effectively no liability for misuse, leakage, or exfiltration of data. I've long since stopped tracking major commercial data breaches. I strongly suspect Wikipedia's list is grossly incomplete.[7] The modern corporation is a liability-externalising engine, and there are few liabilities more effectively externalised than data loss, and its various fellow travelers of fraud, "identity theft" (that is: fraud), phishing (that is, fraud), impersonation (that is: fraud), social engineering (that is: fraud), and ... well, more fraud. Defences and consequences are pushed to the individual, precisely the level at which they are both most damaging and least-effectively addressed.
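Two of the figures in the list above lend themselves to a quick arithmetic check. This is a rough sketch, not a sourced calculation: the 60-year span (1960s to 2020s) and the world population of about 8 billion are assumptions added here, the latter implied by the "$100 per head" figure. Treating the whole $800 billion as spent on the richest billion gives an upper bound, since the comment says "most", not "all".

```python
def growth_factor(years: float, doubling_period: float) -> float:
    """Total growth multiple after `years` if a quantity doubles every `doubling_period` years."""
    return 2 ** (years / doubling_period)


# Storage growth: doubling every 2 to 4 years, over an assumed 60-year span.
SPAN_YEARS = 60  # assumption: roughly the 1960s to the 2020s
fast_growth = growth_factor(SPAN_YEARS, 2)  # doubling every 2 years
slow_growth = growth_factor(SPAN_YEARS, 4)  # doubling every 4 years

# Advertising cost per head.
AD_SPEND_USD = 800e9    # global advertising spend quoted in the comment
WORLD_POPULATION = 8e9  # assumption: about 8 billion people
RICHEST_GROUP = 1e9     # the "1 billion richest people" from the comment

per_head_global = AD_SPEND_USD / WORLD_POPULATION
per_head_richest = AD_SPEND_USD / RICHEST_GROUP  # upper bound

print(f"storage growth over {SPAN_YEARS} years: {slow_growth:,.0f}x to {fast_growth:,.0f}x")
print(f"ad spend per head, worldwide: ${per_head_global:,.0f}")
print(f"ad spend per head, richest billion (upper bound): ${per_head_richest:,.0f}")
```

Even the slow end of the doubling range gives a growth multiple in the tens of thousands, which is the point of the first bullet: the capacity to keep everything arrived long before the rules about what may be kept.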

IF we are going to tackle this problem, THEN these underlying foundations must be attacked.

Data exchange must be greatly limited. Whilst data are cheap individually, large scale data aggregation does remain expensive, and if not justifiable based on commercial value, it will cease.

Liability for unauthorised disclosure and abuse must be entirely revised. "Data are liability" has been my own watchword for much of the past decade. Legally it has far less validity than I'd like.

Practices based on data aggregation must be both closely regulated and highly taxed. The taxes serve both to reduce the profitability of such businesses, and to provide an additional legal tool for prosecution against offenders. The old saw about Al Capone stands.[8]

Individuals must be granted strong legal protections to correct, or remove, data absent a compelling public interest in that data being available. I'm well aware that this sets up a conflict between privacy and free speech rights. Those are inherent and unavoidable, the question is where one finds a balance.[9]

I also hold that privacy is an emergent response to changes in informational landscapes and increased capacities in capture, storage, transmission, and processing. As such, evolution of privacy follows rather than leads such technologies; it is inherently reactive, not preemptive. I'm not aware of an explicit similar statement from others, though Jeffrey Rosen's The Unwanted Gaze and much of the work of Daniel J. Solove and Helen Nissenbaum have struck me as above par. Harvard historian Jill Lepore has an excellent biography: <https://scholar.harvard.edu/files/jlepore/files/lepore_secre...> (PDF).

Above all: Privacy is fundamentally the ability to define and enforce limits on information disclosure. Approaches to improving it must account for both elements. Schluss's proposal by contrast fails both.

walterbell · on Dec 27, 2022

The US is moving towards mandating (~2024) that banks expose customer transaction data via a standardized API that customers can use to grant third parties permission to access their data. In theory, a third-party "personal data store" could be a consumer and custodian of bank transaction/payment data. In aggregate, this could offer visibility comparable to a CBDC.

Proposals like Schluss and Tim Berners-Lee's W3C "Solid" protocol use rhetoric about privacy and customer control, but their focus on data portability is likely to increase aggregation (decreasing privacy) of customer data that is currently siloed across multiple organizations and schemas.

> I'm thinking that privacy solutions need their own version of the "why your anti-spam idea won't work" checklist, which would be shorter...

This would indeed be useful, e.g. in a git repo with a template applied to earlier proposals, for the edification of future proposal writers.

dredmorbius · on Dec 27, 2022

Keep in mind that KYC and financial disclosure laws in the US also make many transactions far less private than was the case in a world relying on cash or direct payments.

That said, thanks for the information.

I (hopefully obviously) agree with you regarding Schluss and TBL's Solid. I suspect those projects are well-intentioned. They're hopelessly naive regarding stated goals.

walterbell · on Dec 27, 2022

> Keep in mind that KYC and financial disclosure laws in the US also make many transactions far less private than was the case in a world relying on cash or direct payments.

With the BIS push for CBDCs in 2023, there is limited time for well-intentioned projects to get up to speed on adversarial and competitive incentives :( There may still be room to negotiate a minimum CBDC transaction value before KYC/identity mandates apply, to retain some fig leaf properties of cash for small transactions. Here are a couple of references on CFPB Section 1033 and "open banking":

https://www.protocol.com/newsletters/protocol-fintech/open-b...

> sources on Capitol Hill tell him that draft rule-making can be expected six months after panel review, and rules 90 days after that, putting the end of what would be a 12-year wait for rules governing the field of open banking sometime near August 2023.

https://workweek.com/2022/10/26/money2020/

> Director of the CFPB, Rohit Chopra, taking the stage to give an update on the bureau’s rulemaking on Section 1033 of the Consumer Financial Protection Act, which will obligate financial institutions to share consumer data upon consumer request ... the role that consumer-permissioned data sharing can play in promoting competition and enabling consumers to “break up” with their banks more easily. This seems to be, in the view of the CFPB, the chief virtue of open banking.

There's also UCC Article 12, which has been ratified but not yet adopted by U.S. states, which links some rights to possession of control/keys, even if a blockchain asset may have been stolen, https://www.clearygottlieb.com//news-and-insights/publicatio...

> Article 12 – dealing directly with the acquisition and disposition of interests (including security interests) in “controllable electronic records,” which would include Bitcoin, Ether, and a variety of other digital assets ... Control under Article 12 is designed to be a technology-neutral functional equivalent of “possession.” It generally encompasses circumstances when a party has the “private key”

New laws for open health data recently took effect, practical results TBD, but one can imagine Apple/Google/Schluss/Solid seeking to aggregate and host health data, https://news.ycombinator.com/item?id=33127810

> Under federal rules taking effect [Oct 2022], health care organizations must give patients unfettered access to their full health records in digital format. No more long delays. No more fax machines. No more exorbitant charges for printed pages.

If 1990s US telecom deregulation had achieved the promised-land vision of symmetric broadband to homes and small business, we might today have an installed base and viable business models for "home servers". This would enable banking/health customers to take both physical and logical custody of their data, with legal rights within the perimeter of their homes.

dredmorbius · on Dec 26, 2022

Notes:

1. Some typical citations: "every ten years" (1968) <https://books.google.com/books?id=ZH3pAAAAMAAJ&q=data+storag...>, "every five to eight years" (1961) <https://books.google.com/books?id=36dRmdlvcE0C&pg=PA255&dq=d...> "every ten to fifteen years" (1972) <https://books.google.com/books?id=3fIWAQAAMAAJ&q=data+storag...> via Google Books search on "data storage doubles every years" bound to 1900--1979 <https://www.google.com/search?q=data+storage+doubles+every+y...>.

2. Wikipedia cites Thompson II, Richard M. "The Fourth Amendment Third-Party Doctrine". Key case law is Katz v. United States (1967), United States v. Miller (1976), Smith v. Maryland (1979), United States v. Graham (2012), amongst others.

3. I'm aware that the US is not the only jurisdiction on Earth. It is however a major jurisdiction, one in which all of the present FAANG surveillance monopolies principally reside, and which has a major influence on others. The rising significance of China amongst Internet service providers ... does little to improve on the situation, in which case the US is among the better present defenders of personal privacy. Yes, that's an awfully low bar.

4. See: How to Destroy Surveillance Capitalism (2022) <https://onezero.medium.com/how-to-destroy-surveillance-capit...> Buy: <https://bookshop.org/p/books/how-to-destroy-surveillance-cap...>

5. <https://en.wikipedia.org/wiki/Dun_%26_Bradstreet> citing <http://www.fundinguniverse.com/company-histories/the-dun-bra...>

6. Pub 2016. Overview: <https://en.wikipedia.org/wiki/Weapons_of_Math_Destruction> Buy: <https://bookshop.org/p/books/weapons-of-math-destruction-how...>

7. Though that doesn't diminish my appreciation of the effort. Results here: <https://en.wikipedia.org/wiki/List_of_data_breaches>

8. The gangster was convicted not on organised crime, smuggling, or murder charges, but for tax evasion in 1931. <https://www.fbi.gov/history/famous-cases/al-capone>

9. I've explored this question somewhat, though still haven't fully developed the notion, as one of informational autonomy, encompassing a set of related and often conflicting concerns. See: <https://diaspora.glasswings.com/posts/622677903778013902fd00...> and some followups on Diaspora* <https://diaspora.glasswings.com/tags/autonomouscommunication> and Mastodon <https://toot.cat/@dredmorbius/tagged/AutonomousCommunication> and <https://toot.cat/@dredmorbius/tagged/InformationAutonomy>.

TheDudeMan · on Dec 26, 2022

> You – and you alone – decide who may know what about you.

Why would someone think this is true or even could be true?

flipbrad · on Dec 26, 2022

In November 2022, Marie-José Hoefmans & Onno Hansen-Staszyński & Bob Hageman wrote a paper called "Schluss"

Whoops, I just shared their personal data. "You can only be fully in control when you are in full control of the originals of your data – and no copies are allowed to exist."

Privacy and other fundamental rights are sometimes at odds. Privacy and realism are sometimes at odds.