Hacker News
What Data Does Facebook Collect When I’m Not Using Facebook, and Why? (fb.com)
461 points by rock_hard 10 months ago | 320 comments

If one takes out all of the defensive posturing (an entire paragraph explaining that Twitter, LinkedIn, Amazon, and Google all do the same thing) and the intentionally simplified explanations (cookies, which are identifiers that websites use to know if you’ve visited before), we are left with this:

Facebook tracks virtually every website you visit, as well as usage data from many apps, even if you aren't logged in to Facebook or don't have an account. We use this data, and tie it to your identity whenever possible, so we can charge advertisers more money.

I mean, I interviewed with a company last week for a software engineer role. I won't name names, but I was basically told a bit more about what they do, and the idea was "we track customers across shop websites to see what items they prefer, what types of items, how often, color, etc. etc., and build relations between all of these attributes to help surface more of what the customer wants to buy."

And here I was thinking that people over the past month or so have been putting FB at the forefront of the conversation about the use of what should be private data, when there are already companies doing this everywhere on the Internet. Any typical website with a storefront (Macy's, GAP, etc.) tracks what you buy, when, why, etc.

It's definitely about more than Facebook. I don't blame FB PR for making sure to explain other companies are using your data and basically manipulating you with it.

I once showed up to one of those MIT Biz-Tech mixers a few years ago and one of the presentations we had to sit through was an almost appalling peek at one of these tracking systems through the eyes of its enthusiastic business owner.

Afterwards, we had time for open questions, and I joined the line. When it became my turn, I asked the owner about whether he had any ethical or moral qualms about what he was doing and he just laughed. He gathers data from hidden interactions with client websites (fashion brands, for this demo) to quite literally 'follow the customer around' and 'influence their purchasing decisions'. He was quite proud of just how deep into his shoppers' lives his product was integrated.

When I looked around the room during the height of this exchange, most everyone seemed indifferent and only a few seemed to acknowledge or validate my arguments. It was a supremely disappointing and depressing experience.

> When I looked around the room during the height of this exchange, most everyone seemed indifferent and only a few seemed to acknowledge or validate my arguments

It was the same thing, pre-crisis, in finance. "Of course I screwed him on pricing! He isn't my customer, he's a counterparty. He should have expected I'd screw him."

When an industry is making money to society's detriment, it will not fix itself.

I work in finance, for a dark pool. Our "value add" is to provide anonymity to both buyers and sellers (both in who they are, the side they're on and their limit) while causing minimal disturbance to the market. We only disseminate the symbol on an IOI, and we don't quote prices, at all. We facilitate moving large block trades with minimal information leakage to the market at large while abiding by NMS rules. Effectively, we run blind auctions for large blocks of US traded equities.

What are you trying to say? You add tremendous value for large customers who have the ability to not let everyone on a lit exchange know what they are doing. Actually, from what I've heard, more than 54% of all stock transactions are on unlit alternative trading venues.

In your position, you are privy to a lot of information about your customers' identities and order sizes before the order is executed. Do you sell this information to other customers? Because if you did, your example would be comparable to what's being discussed here.

Even if they didn't sell this information wholesale via data brokerages, your argument leaves open wiggle-room for claiming innocence while still monetizing the information via selling aggregate data.

This is something that is wholly uncovered in this whole debacle -- aggregate data (aka "anonymized" data) is the cake beneath the icing. While Facebook et al may play nicely with regulators (and the general public) by quickly backing away from wholesale "unfiltered" data access, they can skirt the rules by providing "researchers" with anonymized data.

All of them want this anonymized data for their own analytics reporting to measure success metrics and most will walk away when the spigot is closed.

> they can skirt the rules by providing "researchers" with anonymized data.

Probably, given the many past cases of academics using this kind of data to demonstrate de-anonymization attacks, with some very special vetting of who the researchers are and where their loyalties lie.

Here's a decent writeup of their first big research-related privacy blunder: https://www.chronicle.com/article/Harvards-Privacy-Meltdown/...

I think the point here is either that

1) industries can evolve to no longer be predatory against their own customers/consumers of their services, or

2) just because one sector participant is engaging in immoral behaviour doesn’t mean all are, and labelling the entire industry as corrupt as a result may not be fair.

How is that related to the parent comment? Those trades pre-GFC weren't on dark pools, that was mostly voice (over-the-phone trading).

Thanks for sharing and this in many ways captures the central contradiction of our societies.

People often talk about ethics and values when they expect it from others or wider society. Yet they often fail to act in ethical ways when it comes to them and somehow manage the dissonance.

Our current system incentivizes profit over all else and many will respond to this. If our system rewards unethical behavior we should not be surprised with the results. Regulations around safety, environmental, labour and other ethical issues have to be put in place to temper the worst impulses. If slavery was legal many would be willing to justify it as long as they gained from it.

We will have to regulate and disallow micro targeting that drives this insatiable need to stalk users and hoover data to build incredibly invasive profiles. And unless something is done it will keep getting worse.

It's almost as if the incentives of capitalism are inherently exploitative

when the capital is borrowed then another party needs to earn a return which amps up the exploitation and justification... i.e. I need to pay back my investors and also pay myself.

What do you mean, "when"? This makes it sound like we could fix Capitalism by banning banks and investors or something. They're inherent to this system. Institutionalized loans were a cornerstone when Capitalism emerged a couple of centuries back in Europe. If we want to organize society differently, we should have an honest and broad debate about that rather than focusing on one aspect / one group of supposedly bad actors.

True, I think there are specific organizational structures that allow or encourage exploitation and they are not limited to capitalism. imho, when people need to please another with power over them they more readily justify their actions when they exploit others. Institutionalized loans seem a step removed from this as there is less opportunity for influence from the powers that be.

Under capitalism, man exploits man. Under communism, it's the other way around!

Can we possibly have discussions without them devolving into false dichotomies between two or three ideologies? It may come as a shock to some, but it’s not all about crony capitalism or Marxism, there are other options, hybridized options, and room for new thinking. It’s stultifying to see every critique of what the US laughingly calls the free market devolve into people screaming “Communism then, haha! Mao and Stalin killed 100 million!”

Can we at least pretend to argue in good faith?

> there are other options, hybridized options,

If you put economies on a line between free markets and communism, the economies do better the closer they are to free markets.

> and room for new thinking.

About every mix has been tried already, and the results are clear.

no one is forcing anyone to buy shit. i'm on the internet all day browsing all sorts of sites and all of these companies are completely powerless wrt to me. you know why? because i haven't bought anything except underwear from amazon and groceries from my local supermarket in about 9 months.

I'm glad to hear that you aren't influenced by these tech giants, but the problem is that most people are. When these tech companies can greatly influence elections and popular opinion, we should question their morals and decide how much personal data they should be able to access.

I understand your point, but people are volunteering to give up this information. At some point, grandma has to learn to stop calling the psychic hotline.

Explaining to grandma that she’s wasting money calling a premium rate number is relatively easy, and if she ignores you, the impact to society is minimal.

How do you explain to grandma that publishing photos of her grandkids is to the detriment of society as a whole?

Does grandma brag in front of the tax collectors about how much her children are making? Does grandma rat out the grandkids partying too hard to their parents? Grandma isn't stupid, grandma can keep secrets, but no one explained to her that Facebook isn't a benevolent entity. Her perception of Facebook is a cozy digital dining room where she can talk to the friends and family that don't have time to visit anymore.

I think it's important to remember that Facebook provides a service people really want, which makes it hard to convince them to mistrust it.

Is it OK for a business to collect and sell their customers' personal information if their customers didn't knowingly agree to it? These tech companies are hiding their intentions in the fine print and it's clear that their users don't understand what their personal data is used for. We can't expect the average person to spend 10 minutes [0] reading and trying to understand the legalese on every app or website they sign up for.

Most countries have consumer protection laws to protect the rights of customers and stop deceptive business practices. When a significant portion of a company's customers feel deceived, it's easy to see that the company is not being honest to their customers.

[0] https://www.theatlantic.com/technology/archive/2012/03/readi...

Your point is moot, if advertising had absolutely no effect on customers, then it would not be profitable to use it and the industry would not exist.

In fairness, certain types of advertising have very little ROI but continue to be used because of past association or the network effect.

That's your behavior in commerce, sure. How were your political and social views nudged?

Advertising is also for services, websites, political campaigns, universities, clubs, schools and charities, not just consumer goods. Many of these actors benefit from targeted advertising. Just because you are not buying consumer goods does not mean a profile of your online activity is not being created and sold to the advertisers these groups use.

Oh, yeah? I bet you think you picked the brands you bought, too.

brands of what? milk? eggs? cheesy crackers?

An anecdote is not data.

The particularly scary thing about Facebook and Google is that they can link it to your name/location/other PII easily. Your ISP is in this group, too.

Joe's Ad Tech Shop has a tougher time of that, even though they do build exactly the same profile.

Everyone wants to know what people do on their site, the big technological problem with all of the above is that the same companies do it across everybody's sites.

I've been to presentations of marketing teams that said they know how much is left on your mortgage when you hit their website, and they use that data, and a lot more, to figure out what price tier of items show up on the landing page.

The level of tracking that happens that people aren't aware of is scary. What Facebook has been revealed to do is pretty low on the creepy scale compared to what is happening.

Person X works hard to get that raise, promotion, bonus, whatever to build a better life for themselves and their family. Not so it can be siphoned away by person Y sneakily charging them more. All dynamic pricing of this kind should be forced to be transparent. Display the real price alongside the custom one they want to charge you.

Since before money was invented, salesmen would size you up and steer you towards what products you'd be most likely to buy, and would dynamically set the price based on sizing you up.

Walk into a car dealership, and you'll see this blatantly in action.

Told my sis to put on the most raggedy clothes she could still wear. I put on some threadbare shorts, some worn sandals, and a t-shirt with the collar fraying. Then we went car shopping. 3k off a used car's listed price for her, lol.

You can take that too far - they can think you're a bum and refuse to talk to you. (Happened to me!)

But yeah, it's very plain that how you are treated by a salesman and what they offer and at what price is very strongly connected with how you dress, are groomed, speak, what car you drove up in, etc.

Waste of time to go to dealer to negotiate. Do everything over the phone. Never relent to their insistence that you come in. Finalize price over phone. Done. Success.

I find the cars online, then just go into the area, and always remember I can walk out.

But you can’t negotiate with a website, and it won’t tell you you’re seeing a custom price. It’s not like a store (or actually most car dealerships) where there is a price displayed in the open.

I’m OK with being shown different products. But not with generic items being marked up.

Years ago, I walked onto a car yard. I watched the guy look at my jeans and my t-shirt, and he tried to sell me a $2000 car. Exactly as you said.

Care to name products/companies? It would be fun to play around with a VPN and see what variations in pricing would really happen.

The difference is this: a website tracking you while you're there is just using their own data.

FB, Google, Twitter get data from all the websites which make the error of using their scripts: be it analytics, social buttons or "login with whatever". Every time you use assets hosted by a third party they get free data about your users. Every time you use a third party service you're giving data about your users.
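
To make that concrete, here is a rough sketch (not any particular tool's method) of how one could list the third-party hosts a page pulls assets from. Each of those hosts receives a request from the visitor's browser on page load. The sample page, the `shop.example` host and the helper names are all made up for illustration, and the two embedded URLs are only examples of the kind of asset meant here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyFinder(HTMLParser):
    """Collect hosts of external assets referenced by a page."""

    def __init__(self, first_party_host):
        super().__init__()
        self.first_party_host = first_party_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        # Tags that commonly make the browser fetch an asset.
        if tag not in ("script", "img", "iframe", "link"):
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                # Relative URLs have no hostname, so they are first-party.
                if host and host != self.first_party_host:
                    self.third_party_hosts.add(host)


def third_party_hosts(html, first_party_host):
    """Return the sorted third-party hosts referenced in the HTML."""
    finder = ThirdPartyFinder(first_party_host)
    finder.feed(html)
    return sorted(finder.third_party_hosts)


# Hypothetical page on shop.example embedding two third-party assets.
SAMPLE_PAGE = """
<html><head>
<script src="https://connect.facebook.net/en_US/sdk.js"></script>
<script src="/static/app.js"></script>
<link rel="stylesheet" href="/static/site.css">
</head><body>
<img src="https://www.google-analytics.com/collect?v=1">
</body></html>
"""

print(third_party_hosts(SAMPLE_PAGE, "shop.example"))
```

The locally hosted script and stylesheet don't show up in the result, which is the whole point: only assets served from someone else's host hand that someone a request to log.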

I don't think that's the case.

Facebook's ad/privacy page told me KLM had shared data with Facebook. I don't remember agreeing to that, I'm diligent at unticking the box or whatever, but in any case, I doubt it only goes one way.

KLM's website includes at least 15 trackers.

I'm looking forward to 25 May. I will write to all these companies, and demand that they stop sharing my data.

Does the GDPR give us the tools to do that?

I suspect most big companies will have teams set up whose only job is to respond to GDPR-related requests.

I think the difference that is pointed out most of the time is the scale at which FB operates.

Even if other sites are doing (fundamentally) the same, they probably don't have such a clear and overarching picture as FB does.

I don't disagree with you though. Having most sites track you sucks.

I mostly agree that the scale of a system like FB makes its impact far greater.

But on the other hand, any company that can admit to controlling (at least in part) a person's interests, choices of purchases, etc. controls significantly more of a person than you'd think.

Said person thinks they're making the decision themselves, but in reality they're being prodded in a certain direction. When they make a purchase and use the item, others see, and they might also be influenced as a third party, and so on.

I'd say it's pretty dangerous, but it tends to be overlooked.

All true, but Facebook is unique in its reach. It has tracking bugs on far more websites than almost any of these other ad services. It has specific personal information about billions of people, including their social network, their private messages to friends, their exact relationship status history, photos, videos, voice recordings, location information, including lots of such information for hundreds of millions of other people who have never consented or even been given the opportunity to consent. It's the biggest, most intrusive database of personal information in the world, and it sells access to that data in order to serve poorly targeted ads (which honestly are a scam on their actual customers as well). And their entire business model is premised on this leaky bucket--in order to allow sales of tightly targeted ads, they also are knowingly revealing personal information to malicious parties who can and have and will continue to take full advantage.

So sure, there's an entire industry built for this, but Facebook has some truly unique properties that make it light-years more dangerous than your typical ad network.

I had an interview once with a company that essentially wrote tracking libraries for Starbucks's app. Everything you mentioned in your post is pretty much exactly what the app did. Their nonchalance around their extent of tracking was chilling to say the least.

If these companies instead spent their time on improving the UX of their sites so that users (and not some creepy AI Clippy) could easily find stuff they cared about, in the sizes they needed, then the companies would easily dominate their respective markets. Taking a cue from the physical world: if I walked into my local grocery store, I would be seriously creeped out and annoyed if I was greeted with a cart full of recommended items based on my previous purchases.

I'd like that. Saves me having to find everything.

It is about saying "NO" and not creating this in the first place. That's the point. The sooner devs figure out they are responsible for this, the quicker it will end. Would FB and Palantir be where they are today without people smarter than the founders enabling this?

I suspect the majority of people implementing this stuff are either largely apathetic or are so thick they wouldn't ever see the bigger picture. Most wouldn't have a clue where to start with even the simplest of articles posted here.

Showing me my favorite style clothing in my favorite color is manipulating me?

This seems to be news to so many people, but marketing and advertising have been tracking and advertising to you this way for centuries, including sharing of data.

To your question: yes it is. You may find it convenient, but it is definitely manipulating as well. And it really depends on where you see that clothing. Is it on the "buy" page? Ok maybe not so bad. Is it on an advertisement in a flashlight app on your phone after you last googled for something similar on your laptop? Is that still just a convenience, or have we entered territory that is creepy for you yet?

Marketing and advertising have not been tracking all of us _as individuals_ for more than the last couple of decades. How could they possibly have done this in the previous eras of dumb broadcasting and dumb print media? Newspapers can't follow you around in department stores to see what you like and target future ads to _you_ as an individual. They can advertise in places that make more sense, e.g. kids toys on a kids TV channel.

So yes this is new and yes this is manipulation, and this isn't the only way it has to be. We can still have convenience and better advertising without the mass tracking of individuals. But as long as we have people defending the current status quo as you have here, then nothing will change.

No, this is not new, as ad agencies would gather information on where you shopped and what you bought and sell that information to others. Then you might receive a flyer in the mail advertising something complementary to your purchase. National retail chains paid hundreds of thousands of dollars for such information as far back as the early 1900s.

Playing devil’s advocate but if a company “wants to help surface more of what the customer wants to buy”, why is that a bad thing? Would you want to step into a store filled with stuff you would never buy?

Many reasons:

- It assumes this is all that the data is used for and that the data doesn't get a second life with companies that can use the information to punish you.

- It assumes that the data is never stolen by malicious entities and put to use against you

- It assumes that all the data collected is required to provide the information promised

- It assumes you want to buy things from these stores in the first place (think ideological reasons for not shopping at a place)

- It assumes you knowingly consented to these terms with understanding of the cost

Basically no matter how you shape it, it's the advertisers trying to avoid saying they've entered you into a system and contract without your consent, but it's okay because "you get to see things you want to buy". Never mind that this is rarely the case and the algorithms are hyped up beyond belief, often offering advertisements for items you have already purchased and rarely need to purchase (e.g., you just bought a water heater, here are some other water heaters you might like).

I'm not really convinced by lengthy Terms of Services and FAQs like Facebook's that they honestly care about people being upset as they miss the point entirely. You still don't know what was collected specifically, you cannot opt out of it from their side (must use blocking tools which companies constantly change their code to avoid), there's no way to verify any of the information they've collected or whether they've stopped collecting it if you opt out, there's no way to confirm that your data is deleted if you request it, etc.

All this for slightly targeted ads?

It's 50/50, honestly. In some cases doing that is incredibly useful for the customer.

The problem is when the business decides to start promoting similar, yet different items, knockoffs, etc. that "match" your profile, yet aren't what you would've wanted to originally buy. So you wind up getting steered to something that may be of lesser quality, etc. This might be the result of the company making a larger profit on the alternate item or something else.

>Facebook tracks virtually every website you visit

Is that true though? Many sites have Facebook share buttons that don’t actually load anything from Facebook unless you use them - they use a locally hosted button (I do this on my own sites, and got the idea from seeing it on so many other sites). I don’t see like buttons (which are directly loaded from Facebook) with anywhere near the frequency I once did either. In fact, the like button is only used on 0.5% of the world’s websites [1]. Even the Facebook pixel, a common target of the scorn of privacy advocates, is only in use on ~12% of the top 1 million sites [2].

12% is hardly “virtually every website you visit”. It’s pretty alarmist to say that.

[1] https://trends.builtwith.com/widgets/Facebook-Like-Button

[2] https://trends.builtwith.com/analytics/Facebook-Pixel

> Many sites have Facebook share buttons that don’t actually load anything from Facebook unless you use them - they use a locally hosted button

This is incorrect. "When you load up your site with a host of sharing buttons you're – unwittingly perhaps – enabling those companies to track your visitors, whether they use the buttons and their accompanying social networks or not" [1].

> the Facebook pixel, a common target of the scorn of privacy advocates, is only in use on ~12% of the top 1 million sites

So only tens of millions of people whose browsing history is being siphoned off without their consent?

[1] https://www.wired.com/2013/03/social-sharing-buttons-that-re...

> This is incorrect

And yet this is precisely what I do on my sites, and I have seen it done this way on many, many other sites as well.

> So only tens of millions of people whose browsing history is being siphoned off without their consent?

Tens of millions of people are asking Facebook for the Share button in case they want to share the content they're looking at.

Fixed that for you.

I have not yet met any person who has used, or wanted to ever use the share button of any of the tracking companies. On the other hand, I have met a lot of people who have no idea what that button does to their privacy.

Only the soul-less marketing drones without any sense of morality, ethics or empathy for others are asking for the share buttons, even if the only party profiting from them are the tracking companies.

> "When you load up your site with a host of sharing buttons you're – unwittingly perhaps – enabling those companies to track your visitors, whether they use the buttons and their accompanying social networks or not" [1].

This can be prevented and in fact, may have to be prevented with the new EU privacy law.

Here is a WordPress plug-in that helps to protect the privacy of your users and helps comply with the new law:


It basically requires you to activate the social buttons first with a click.

How can you do it "unwittingly"? If you put a button on your site that says "like us on Facebook" or "analytics by Facebook", how can you not realize you're inviting Facebook into your relationship with your site's visitors, and that you do it completely voluntarily, expecting to gain more exposure, better marketing tools, etc.?

How does Facebook track visitors with a locally hosted button?

I think a lot of people don't realise you can share to Facebook with a standard link and don't need their JavaScript.
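
For anyone who hasn't seen it done, a minimal sketch of generating such a link server-side. The assumption here is that Facebook's sharer endpoint takes the page to share in the `u` query parameter; the function names and the example URL are made up for illustration.

```python
from html import escape
from urllib.parse import urlencode


def facebook_share_link(page_url):
    # Assumption: the sharer endpoint reads the shared page from `u`.
    return "https://www.facebook.com/sharer/sharer.php?" + urlencode({"u": page_url})


def share_anchor(page_url, label="Share on Facebook"):
    # A plain anchor: the visitor's browser contacts Facebook only on click.
    href = escape(facebook_share_link(page_url), quote=True)
    return f'<a href="{href}" rel="noopener">{escape(label)}</a>'


print(share_anchor("https://example.com/post?id=1"))
```

The markup contains no script or image served by Facebook, so nothing is requested from them unless the visitor actually follows the link.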

>So only tens of millions of people whose browsing history is being siphoned off without their consent?

If you have an issue with this, disable JavaScript and third party cookies on your browser - take some personal responsibility. Also, go set up a protest on Google’s front lawn, because Google Analytics (which also tracks you) is in use on nearly 75% of the Quantcast Top 100k sites [1]. That represents billions of people whose “browsing history is being siphoned off without their consent” as you put it and over 6x the number of sites that the Facebook pixel is on.

Since you seem to have a significant issue with Facebook, I’d suggest that you block all connections to them, which should solve your problem. Most firewalls, including the free Windows Firewall, enable you to easily block all connections to any root domain. You can configure your firewall to block *.Facebook.com for example, and then you will no longer have to worry about Zuck monitoring you.
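
As a sketch of what a "block the root domain and everything under it" rule amounts to (this is just the matching logic, not how Windows Firewall or any specific product implements it), assuming an illustrative and certainly incomplete list of Facebook-owned domains:

```python
# Illustrative, not exhaustive: root domains to treat as blocked.
BLOCKED_ROOT_DOMAINS = {"facebook.com", "facebook.net", "fbcdn.net"}


def is_blocked(hostname, blocked=BLOCKED_ROOT_DOMAINS):
    """Return True if the hostname is a blocked root domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    labels = host.split(".")
    # Check the host itself and each parent domain, e.g. for
    # "www.facebook.com": "www.facebook.com", "facebook.com", "com".
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


for name in ("www.facebook.com", "connect.facebook.net", "notfacebook.com", "example.org"):
    print(name, is_blocked(name))
```

Matching on whole labels rather than on a plain string suffix is what keeps an unrelated host like `notfacebook.com` from being caught by the `facebook.com` entry.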

[1] https://trends.builtwith.com/analytics/Google-Analytics

> If you have an issue with this, disable javascript

I have an issue with Facebook doing this to OTHER PEOPLE and thus dramatically influencing the world I live in. I can and do block their Javascript, but that doesn't block it for others.

The idea that our decisions are just independent and everyone can decide for themselves is (A) wrong in terms of how we are affected by decisions made by others (B) anti-social and (C) denies the reality of organized power.

If Facebook had to just be a bunch of unorganized, separate, independent people instead of a company with organized structure, they'd not have the power they have. They have power because of their organized structure. The idea that the other side of things (the regular citizens) should only act as isolated individuals while the companies act as enormous organized power is ridiculous.

If it has so much influence, then it must be controlled by more powerful superiors than just that Zuck guy, mustn't it? He must be thankful that they leave him the possibility of selling t-shirts.

That sounds like a tautology along the lines of Creationist thinking (this is so well-designed, it must have a designer…)

If there are more powerful superiors, then wouldn't the same argument apply? They must be powerful because they have superiors, etc. etc. ad absurdum…

My point wasn't that any ONE person (Zuck or anyone) is all powerful or influential. The point is that SOLIDARITY i.e. ORGANIZATION itself confers power on the organization.

Organized power might be distributed where no one person can dictate the organization's direction. Or it might have an all-powerful-dictator. Either way, organizing confers power to the organized entity.

Corporations are powerful, even though they are constrained in various ways and the power is wielded by a mix of actors within the corporation.

Consumers / citizens who each act unilaterally have less power even though their aggregate decisions can have powerful effects in the market (because they are disorganized, they can only choose from what the supply side offers, we don't get true demand-driven products).

Organized power doesn't necessarily mean top-down either. See the Starfish and the Spider (book about decentralization). Decentralized entities are more persistent (no single head to cut off to kill them), but decentralized ≠ unorganized.

Google is awful. But so is Facebook. Shifting blame to Google in a discussion about Facebook is a distraction from the problem.

When someone pipes up saying how terrible it is that one company is tracking people on 12% of the top websites in the world, I think it's directly relevant to point out that another one is tracking people on 75% of that same group of sites. The tracking is either an issue or it isn't, regardless of who is doing it. If it isn't, then this is a pointless discussion, and if it is, then clearly this particular discussion should primarily be about Google.

Regardless, it is a moot point because third party tracking is here to stay, even under GDPR - you'll just have another message box to dismiss when browsing. My favorite so far is this one - http://prntscr.com/j67usw . I offered a few ways to slow down that tracking above, but the only true way to not be tracked is to not use the Internet, because none of this even broaches the subject of what data ISPs collect about your behavior.

I can appreciate that argument, and I agree with you, but I don't think that's the argument you made in your original comment. It's a good one, though.

> If you have an issue with this, disable javascript and cookies on your browser - take some personal responsibility

This is a problematic attitude. It's the car dealer telling the lemon [1] buyer to "take personal responsibility" for being sold junk.

The point of government is we don't stand alone. Facebook is a menace, and I block them. But I shouldn't have to. And neither should my mother.

> it’s hypocritical to only have a problem with Facebook when Google is doing it on a scale that Facebook can only imagine

Whatabout whatabout [2]? "You can't arrest me because there is another arsonist in town" isn't a valid excuse.

(In any case, third parties calling out some, but not all, bad actors in a category isn't hypocrisy. It's prioritization. Hypocrisy would be Google calling out Facebook's advertising model, or an Enron executive criticising Facebook employees' complicity.)

[1] https://en.wikipedia.org/wiki/Lemon_%28automobile%29

[2] https://en.wikipedia.org/wiki/Whataboutism

>This is a problematic attitude. It's the car dealer telling the lemon [1] buyer to "take personal responsibility" for being sold junk.

That's a flawed analogy. Paying for a car and being defrauded is in no way related to being tracked by websites. A better analogy is walking around naked and then being upset when people look at your bare ass. If you don't want people to see you naked, wear clothes. If you don't want people to track you on the internet, don't expose yourself.

>The point of government is we don't stand alone. Facebook is a menace, and I block them. But I shouldn't have to. And neither should my mother.

That's not the point of the government. Many of us don't want the government to be an all-encompassing nanny state deciding what's okay for private parties to do with public information. It's incredibly disturbing how many people not only tolerate, but openly welcome government regulation of private, interpersonal behavior. You say they should block Facebook. They have already blocked Backpage. What else do you want to give them the power to block? Where do you draw the line? Personally I think Facebook is extremely scummy. I'm not on Facebook, and I block them in every way - but I certainly don't want the government blocking them - or anyone else - on my behalf.

> Paying for a car and being defrauded is in no way related to being tracked by websites.

This is incorrect: they're both examples of informational asymmetry being used to disadvantage a consumer. In both cases, that consumer needs to possess technical knowledge in order to understand the ways that the counterparty entity is exploiting them. In the case of the car dealership, at least the consumer is aware of the stakes when they step onto the lot, i.e. they are planning to buy a car. The problem with Facebook is exactly that people aren't aware of how they are being monetized, and that there is an explicit financial incentive to obscure that from them. They are stepping onto a car lot, or more accurately a surveillance operation, that has been made to look like an amusement park. "Personal responsibility" is a convenient fig leaf for people who want to pretend that the amusement park wasn't the sales pitch. If you don't like the original analogy to a used car salesman, consider the need for similar regulation around financial services, clean water, pharmaceuticals, etc. etc. etc.

Yeah, the situation here is more akin to you go outside wearing clothes, but they've developed x-ray glasses and you're now like, "What? Just wear a lead apron everywhere you go".

There's an old saying "when you owe the bank a million dollars it's your problem, when you owe the bank a billion dollars it's their problem" (like I said it's an old saying ;-)

I think there's an analogy here: if enough people are being successfully abused by private companies it's no longer a matter that you can just pawn off on some sort of concept of personal responsibility. If your beloved private companies are threatening to exploit people hard enough to threaten the very existence of democracy, then it certainly starts to look like something where exploring a government role is worthwhile.

I was apparently in the process of editing out the “hypocritical” statement when you replied - I actually didn’t even mean for that to get posted but accidentally hit post before the final post was ready. Anyway I was mainly advising that you block Facebook since you so vehemently dislike them.

> I was mainly advising that you block Facebook since you so vehemently dislike them

Yours is a technical solution to a human problem. If we care about democracy and the future of the Internet, we will dismantle Facebook. (I expect we will, though in typical democratic fashion, after years of quibbling.)

> a human problem.

That's the very crux of the whole "debate" over everything from Facebook to "Russiagate". The bottom line is, you can't fix stupid. We have a very serious problem with stupidity and ignorance in this country. Far too many people lack basic critical thinking skills. You can't childproof society because far too many people are easy to fool. This "problem" won't be fixed until we have a critical mass of people who can think for themselves, be skeptical and aware.

> If you have an issue with this, disable javascript and third party cookies on your browser - take some personal responsibility.

I tried and doing so breaks pretty much every website. Because it’s not only the tracking stuff that makes use of JavaScript and third-party cookies.

Ignore him. This is one of the classic responses to any challenge of the status quo. If you don’t like the way things are then you should upend your entire life, go live in a cave, subsist on bugs and stop challenging my world view.

Tl;dr He was trying to sound helpful while telling you to go fuck yourself.

I was doing no such thing. I was simply telling him the only realistic way to cut down on third-party tracking.

Using uBlock Origin, uMatrix, or NoScript gets you very far in this regard. You don't need to whitelist that many scripts on each site before they start functioning.

It may, but unfortunately you have limited choices. You can also browse in an incognito window, which will at least disable tracking from session to session. However, even then your IP address is being associated with certain behaviors.

The reality is that you have a choice: use the Internet and have some tracking happen, or don't use it at all. Even the GDPR doesn't stop tracking - it just requires better disclosure of it and mandates certain handling procedures for the data that is collected on the backend.

If you just have a dislike for certain websites tracking you, you can use the Windows (or other) firewall to block specific domains as I stated above.

Firefox has support for multiple profiles! I have a profile that I only use for Facebook, Gmail, and Twitter, and one that I use for everything else. Probably not perfect, but provides more isolation.

I beg to disagree. The GDPR doesn't stop the tracking, but it forces whoever does the tracking to get your opt-in, FREELY given consent for each and every use of private data. And if you require consent as a condition of access, it is not freely given and is therefore invalid (forget about trackwalls).

(I am not talking about criminals here, they won't care.)

Effectively this means that if a user doesn't want to be tracked, you won't track them. And you can't load the Facebook button on your page if you don't get consent (or you are liable to a lawsuit, by the user and, the more interesting part, by Facebook, as you are providing them with illegal data).

Please read this; you are reading the GDPR much too shallowly. It is not shallow, and quite a few companies will have problems because they were too lazy to read the GDPR and relied instead on messed-up opinions on the internet.


Does GA share data outside of GA? Last I looked into it, I came away thinking that data is kept to privacy standards equivalent to e.g. Google Sheets.

It appears that they do. From [1]:

When you visit a website that uses our advertising products (like AdSense), social products (like the +1 button), or analytics tools (Google Analytics), your web browser automatically sends certain information to Google....we may use the information we receive to, for example:

Make ads more effective....

[1] https://policies.google.com/privacy/partners

In this case, the "may" part in there is relevant, and not just an attempt to add in ambiguity to defuse things.

The Google Analytics property owner has to explicitly opt-in to allow Google to do this. And part of that opt in includes the site owner agreeing to an additional ToS certifying that they both disclose they do so and have appropriate consent to do so. You can learn more about that at [1].

In practice, most site owners toggle this feature on without realizing the liability they've agreed to. Because it enables additional reports in GA (by merging and exposing the demographic targeting data their ad system has), as well as pushes GA data into Adwords and DoubleClick if you want to link your accounts. But, Google does keep the GA data siloed off by default.

[1] https://support.google.com/analytics/answer/2444872?hl=en

But the end result is that most GA widgets are reporting data that is being used by Google to improve ad targeting, since most site owners turn the feature on.

[citation required] but even if true, getting site owners to opt-in to exchanging their users' privacy for psychographic and demographic correlations seems materially different than Facebook's "no opt out" on a relatively unrelated feature like a Like button on a 3rd party site. They used to and could again make the button work without using it to gather browsing data.

75% of 100,000 is 75,000

12% of 1,000,000 is 120,000.

Not advocating for one side or the other, just correcting the math.
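For anyone who wants to check the arithmetic, here is a minimal Python sketch. The helper name `sites_covered` is made up for illustration; the percentages and list sizes are the ones quoted in this thread, not independently verified figures.

```python
def sites_covered(share_percent: int, list_size: int) -> int:
    """Number of sites in a ranked list that a given percentage share represents."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return share_percent * list_size // 100


# Figures quoted in the thread (assumed, not re-measured here).
top_100k_sites = sites_covered(75, 100_000)
top_1m_sites = sites_covered(12, 1_000_000)

print(top_100k_sites, top_1m_sites)
```

The two counts are taken from lists of different sizes, so a smaller percentage of the larger list can still be the bigger absolute number.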

One person commented on 12% of top 1m sites. Another is commenting on 75% of top 100k sites.

EDIT: Checked the source material links and GA is on 65% of the top 1m. That is definitely more, and a more useful comparison I think.

2003 called and wants its HTML back. Seriously, the web today with JavaScript disabled?

Grandma configuring her what now?

I'd like to see traffic distribution against the sites which use facebook. I suspect your p90 for all traffic is against a large proportion of sites that use these and thus "virtually" every site in fact is closer to true than your raw numbers.

Well that’s a bit of a self-fulfilling prophecy. Most sites that pay to advertise on Facebook are going to use the pixel for retargeting and conversion tracking. But only 25% of the Quantcast Top 10k sites use the pixel, and the number for the whole Internet is 0.4%.

I wouldn’t say that anywhere near most of my browsing originates from Facebook ads...in fact I’m not entirely sure I have ever clicked on one except to view competitors’ landing pages. I’m guessing that most other people don’t limit the sites they visit to those they are shown in paid Facebook ads either. So the overall numbers are extremely relevant.

You missed my point. I’m suggesting that the 75th percentile of traffic goes to sites that probably have Facebook integrated via like button, ads, auth. (Way that fb can track you.)

Therefore most traffic on internet is traceable. Key being traffic not count of sites in my analysis.

Well, that's what the Quantcast numbers are for...the top 10K sites are the top 10K trafficked sites in the world. And only about 25% of them have the Facebook Pixel.

Pixel isn't the only way to embed Facebook on the page mate. Facebook plugins and facebook auth also.

I went through the stats for those above as well. I used the Pixel because it's by far more prevalent than any other Facebook-hosted stuff (plugins etc).

Here are some numbers [1]:

> Over the last week ... the Like button appeared on 8.4M websites covering 2.6B webpages, the Share button on 931K websites covering 275M webpages, and there were 2.2M Facebook pixels installed on websites

[1] https://twitter.com/facebook/status/985999292862152708

Yes, but that 8.4M websites is out of a universe of ~375 million websites - so it's a very small percentage (I am using the number of sites that BuiltWith tracks, which probably isn't all websites, but it's a minimum number). The pages number is meaningless, because sites that use it will use it on every page they have - if I have a site with 2 million auto-generated pages, then I account for 2 million of those web pages.

You won't see the locally hosted button suggested on Facebook's website, so it's far less common than the tracking version.

12% of websites can still reveal a lot about you - a few news sites and they'll know your political leaning and topics of interest. A recipe site and they could deduce your race. A lyrics site can indicate your gender, race and age. Etc.

If the denominator of that 12% number includes adult websites, which presumably know you don’t want to share to facebook, then this number may mean something a little different than many will assume it does

> Facebook tracks virtually every website you visit

That have placed Facebook buttons or analytics code on their site, which Facebook has no way of forcing anybody to do. So if you don't like sites giving Facebook your info, maybe you should raise this question with those sites?

and if someone sells you poisoned food, maybe you should raise this question with that vendor, instead of advocating for nationwide safety standards?

If they bought from a poison vendor and then sold it to me as food, then definitely they are to blame and "safety standards" have nothing to do with it - they knew what they are buying and selling, it's not some kind of mistake or accident. Facebook buttons and tracking scripts didn't just seep into sites by accident because of poorly maintained seals or rusty firewalls and weren't fraudulently sneaked in by spies sent by Facebook - they were deliberately installed by site admins, because they wanted to install this precise code, with full knowledge of what it is for. Site admins bear responsibility for them.

if someone uses emotionally charged but utterly inapplicable rhetoric to persuade you, should you dismiss them out of hand?

poisoned food indeed

Non-technical people don't necessarily know that adding a Facebook button means tracking people. Because it's the complete opposite of common sense, Facebook buttons to non-technical users are just buttons, and Facebook could have built them that way too.

> Non-technical people don't necessarily know that adding a Facebook button means tracking people.

They should then hire people who understand it, or not put these buttons on their sites.

Docs for FB pixel say: You can use the Facebook pixel to understand the actions people are taking on your website and reach audiences you care about. "Understand actions" means tracking. Who is doing the tracking? Well, it's called Facebook pixel, how many guesses does one need to get that one?

Pretty much all other FB gadgets download content from Facebook. If a site owner doesn't understand that this means tracking by Facebook, maybe they need to educate themselves or ask for advice before blindly copy-pasting third-party content into their site? A Facebook button is far from the worst thing one might copy-paste from the internet...

I think though they realize it and they are completely ok with it, in exchange for the traffic that gives them.

> Because it's the complete opposite of common sense,

No, it's not - if you put a Facebook button on your site, you are bringing Facebook into your relationship with your site visitors. Then you learn Facebook is part of that relationship - what a surprise! Come on. If you don't want FB as part of the deal, don't invite them in.

You are talking about the Facebook pixel, not the Facebook button, they are not the same thing.

Facebook does not warn website owners that adding Facebook button with the official method tracks users (I wonder why...): https://developers.facebook.com/docs/plugins/like-button

From the Facebook docs: "A single click on the Like button will 'like' pieces of content on the web and share them on Facebook. You can also display a Share button next to the Like button to let people add a personal message and customize who they share with."

I see no mention of tracking there. And "people should guess" isn't really a viable excuse to me; it should be stated that their button tracks people and might be unsuitable depending on what they want to do. The documentation should also offer other non-invasive methods to do it.

"Surveillance capitalism" https://www.schneier.com/blog/archives/2018/03/facebook_and_...

Bruce Schneier's article on this is well worth reading - he takes a broad view of issues both with Facebook but also with the wider industry. And he discusses it from a perspective of societal benefit and harm, which I think is useful.


But for every article about Facebook's creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it's Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.

Harvard Business School professor Shoshana Zuboff calls it "surveillance capitalism." And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it's up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and -- if not -- what to do about it.

There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver's license numbers.

You certainly didn't give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you've never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.

Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google's surveillance isn't in the news, but it's startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.

And unlike facebook, Equifax DOES sell/license your personal data to third parties. I worked somewhere that bought it.

At the beginning of the year there was a huge roar about the GDPR. Now it is finally becoming clearer to everyone why the EU created it. And quite frankly, the rest of the world will need to start pushing for the same or even harsher legislation. Companies have pushed the whole thing too far :(

The GDPR will to a large extent take care of 3rd party tracking, based on two things: consent needs to be given for each and every 3rd party tracker before it is loaded, and the website owner will be held liable if a 3rd party provider abuses the data without consent or uses it for a non-stated purpose (selling it to someone else, for instance).

If you are interested in more details, here is an excellent explanation: https://www.youtube.com/watch?v=-stjktAu-7k

I'm arguing with Facebook about getting access to the data they have from tracking me as opposed to what's available when you download your profile data at the moment, having closed my account a while back.

I'm literally waiting for the day that GDPR comes into force to exercise my right to be forgotten.

How’s that going? I was pretty surprised when I’d downloaded my data to see they hadn’t given me so much of what I know they have...

Pretty badly to be honest[1], seems like their local data protection authority is Ireland, who like to sit on their hands. That's partly why I'm waiting.

[1] - http://www.europe-v-facebook.org/EN/Get_your_Data_/get_your_...

Are you simply wanting to see your data, or do you want it deleted? After GDPR comes in I’m going to be looking into the latter.

Both. I want to:

1. See what data they hold.

2. Have it deleted.

3. Formally withdraw consent to the collection of future data.

4. Follow it up in 6-12 months with a subject access request.

I have heard that the facebook "download your data" feature on the website does not give you all the personal data they hold. For that you need to contact them directly (maybe a letter), and quote data protection law to them.

to be fair, I think it is worth educating people that facebook definitely isn't the only actor doing this. It is pervasive.

I'm so tired of every corporation comparing themselves to others to justify everything. "Well GE hid losses for decades with fraudulent accounting so we can too!"

How about just stop exploiting a technically illiterate government that is incapable of regulating you? You know when you're doing it, everyone working on an analytics engine knows exactly what the fuck they are doing, trust me. How about you just stop?

And even if they claimed exactly otherwise, I wouldn't trust them. I'm not sure there's anything that can be done at this point to undo that distrust.

You should ask yourself the question whether it was Facebook or the media that made you distrust Facebook.

Because I can’t recall a single instance of when Facebook or Zuck lied to the public... but I recall plenty of lies the media has spread about Facebook

Someone doesn't have to lie to lose your trust.

It's like having a camera pointed at your neighbor's house, but a company's neighbor is the whole internet.

Thanks for summarising, because I can't see the actual article, because I applied the fb blackhole list to my /etc/hosts, for exactly this reason.
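For readers who haven't done this, a hosts-file blackhole just maps each unwanted hostname to a non-routable address. A minimal Python sketch of generating such entries follows; the three hostnames are only illustrative examples, not the actual blocklist mentioned above, and `hosts_entries` is a made-up helper name.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build hosts-file lines that point each hostname at a sink address."""
    # The hosts file has no wildcard support, so every hostname
    # (including each subdomain) needs its own line.
    return [f"{sink} {domain}" for domain in domains]


# Illustrative hostnames only; a real blocklist is much longer.
sample_domains = ["facebook.com", "www.facebook.com", "connect.facebook.net"]

for line in hosts_entries(sample_domains):
    print(line)
```

The printed lines would then be appended to /etc/hosts by hand (with root privileges); the sketch deliberately does not write to the file itself.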

>Facebook tracks virtually every website you visit

How can I know that I'm infected? How to disinfect myself and feed them false info?

> or don't have an account. We use this data

And should stop that. Because not the police.

Is this possible if you are using something like ublock and/or ghostery?

Good summary!

Yeah when you put it like that it sounds bad, but Facebook really are right that literally every other advertiser on the planet does exactly the same thing.

Honestly I think the bigger issue is that Facebook tracks you on third party sites when you are logged in to Facebook - in those cases they easily know your exact identity. Most advertisers can't do that. Google could, but I'm not sure if they do or not.

This is getting so ridiculous, I really dislike how Facebook is using the issues of late as a way to "reeducate" people on how it's ok to be doing what it's doing. It's not ok, and not acceptable.

It's a shame we've allowed a web to be built which allows for this kind of exploitation.

I'm also acutely aware that it's not just Facebook.

I'm not entirely convinced that if we had a chance to do it over that we wouldn't just repeat the same mistakes. Are there other massively profitable businesses that are free to users without some form of exploitation? Advertising and the industry that surrounds it is absolutely tainted as they chase analytics to make each set of eyeballs most profitable to themselves. How do we get "free" services while also having great privacy control in a way that the company can operate at a scale like Google or Facebook?

> How do we get "free" services while also having great privacy control in a way that the company can operate at a scale like Google or Facebook?

Can you justify why we as a society should want companies to operate at a scale like Google or Facebook?

> Can you justify why we as a society should want companies to operate at a scale like Google or Facebook?

The simple (and inevitable) solution is to break up Facebook. No need to even get that complicated, at least to start: Facebook, Instagram, WhatsApp.

> The simple (and inevitable) solution is to break up Facebook.

There was a time when anti monopoly laws were taken seriously. That you are the number one social network should not be used to also become a big player in advertising, selling items, casual games, etc.

Letting companies leverage a monopoly to gain other markets is dangerous for the economy.

> There was a time when anti monopoly laws were taken seriously

Over the past century, economists came up with a measure of firm concentration called the HH Index [1]. The HH Index is "the sum of the squares of the market shares of the firms within the industry (sometimes limited to the 50 largest firms)". (It's derived from the Simpson ecological diversity index.)

Market share is a measure on customers. The DoJ used the HH Index to measure monopolies because monopoly power was understood to result from excessive market share. The case of non-paying consumers was never legally considered.

TL; DR More than apathy explains the delays in bringing antitrust action against Facebook.

[1] https://en.wikipedia.org/wiki/Herfindahl_index

> The case of non-paying consumers was never legally considered.

Is there an already-available analogous quantity to market share here? Or do you have thoughts about what might be used?

BTW, it's interesting that the HH index uses sum of squares to measure concentration of the distribution, where something like the Shannon entropy seems more natural -- although, "seems more natural" is subjective I suppose.
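To make the comparison concrete, here is a small Python sketch of both measures. The market shares are invented for illustration, and shares are expressed as fractions summing to 1 (the index is also commonly quoted with percentage shares, which scales it by 10,000).

```python
import math


def hhi(shares):
    """HH Index: the sum of the squared market shares."""
    return sum(s * s for s in shares)


def shannon_entropy(shares):
    """Shannon entropy (in bits) of the market-share distribution."""
    return -sum(s * math.log2(s) for s in shares if s > 0)


# Hypothetical markets, not real data.
monopoly = [1.0]
four_equal_firms = [0.25, 0.25, 0.25, 0.25]

for name, shares in [("monopoly", monopoly), ("four equal firms", four_equal_firms)]:
    print(name, hhi(shares), shannon_entropy(shares))
```

The two measures move in opposite directions: the HH Index is highest for a monopoly and falls as share spreads out, while entropy is lowest for a monopoly and rises with dispersion.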

Facebook, Instagram, WhatsApp, Messenger.

I only add the latter because it's separate enough already, and you can now sign up "without" a pre-existing fb account. Is it similar in function to WhatsApp? Yep, but if we split the company along product lines Messenger should be considered a suitable product.

Having worked in ad-tech for a while, and dealt with cookie matchers and retargeting -- I'm actually surprised at the degree to which companies keep pursuing this stuff .. because it wasn't clear to me from the statistics that it was actually effective.

I was involved in the relatively early days of this, so probably techniques have improved. But I worked for a startup that was doing search retargeting, among other things, and when I ran the numbers I just didn't see a huge increase in click-through rates. The data just wasn't there.

Like I said, I'm sure techniques have improved. But I still continually see ads retargeted to me that are both creepy and ineffective. Marketing me things I've already bought. Or lost interest in. I can count on one hand the number of times I've seen an ad based on retargeting that I thought "oh yeah I should pursue that."

> It's a shame we've allowed a web to be built which allows for this kind of exploitation.

Allowed? We volunteered for it.

Eh. We should want innovation - to "let a thousand flowers bloom". But when companies start getting so big they are a detriment to society in some way, we should regulate them.

We let train and oil companies do their thing in the late 1800s - which helped society. And when they got too big and damaging in the early 1900s, we regulated them. We're on the same course again, I think.

The same way I 'volunteered' for Selective Service in the US when I was 18.

No, way different.

You can choose to not use the internet, but that's probably akin to choosing to live in another country to avoid registering for the Selective Service.

I really don't understand how people are being exploited here

“We give you a number of controls over the way this data is used”

“You can remove any of these advertisers to stop seeing their ads.”

“Finally, if you don’t want us to use your Facebook interests to show you ads on other websites and apps, there’s a control for that too”

Reading that critically, there doesn’t seem to be any way for Facebook users to control what data Facebook collects.

I wonder whether that page will change due to the GDPR.

Facebook would legitimately be able to argue that the ad targeting is part of their core product, the ads are critical to the function of the website, and that you agree to have your data processed by signing up.

From my understanding of the GDPR, no I really don't think that they would be able to argue that, actually. Particularly when I'm not logged into their site.

IANAL, but I think it depends on what data they collect.

Part of running any advertising business is accurately billing the advertisers. To do that, they need to measure enough information to track viewed impressions, click-through rate, etc.

If they collect only enough information to perform business functions like these, I believe that would constitute a legal basis for processing under the GDPR.

They may have a harder time using that justification for information that is only needed for ad targeting. In that case, consent may be an easier legal basis for them to establish, for which an account would be useful.

remove ads from facebook and you still have the functionality of facebook that their “data subjects” care about

so that’s likely a no. you could probably not argue that successfully

*edit: with that being said, i’m almost certain that there will be some way people find to offer ads and targeting with the GDPR

“Ad targeting is critical for the functioning of our core product”, _and_ ”if you don’t want us to use your Facebook interests to show you ads on other websites and apps, there’s a control for that too”?

Possibly, but that would mean a narrow interpretation of ”Facebook interests”. Data obtained through tracking pixels on other sites technically could fall outside ”Facebook interests”, for example. I don’t see how that would fall within the GDPR requirement that they must tell users what they do with their data in a language that the users understand, though.

Two questions Facebook needs to answer:

1. If I don't have a Facebook account, how can I delete the data Facebook has collected on me without creating an account?

2. If Facebook doesn't sell data, what is the purpose of shadow profiles?

> If Facebook doesn't sell data, what is the purpose of shadow profiles?

Presumably to sell you ads, which Facebook doesn't consider selling your data--it's just selling access to you, which is only valuable because of the data. Does it really matter if they didn't sell your information if they just used it to show you the same ad they would have showed if they had sold your data? It seems they're relying on the idea that advertisements are somehow an OK form of privacy violation.

> Does it really matter if they didn't sell your information if they just used it to show you the same ad they would have showed if they had sold your data?

On the surface, that's actually a quite meaningful distinction.

But, less so when you consider that businesses go in cycles, and one day Facebook/Google/whoever could get desperate enough to decide they need some extra cash... or they could go bankrupt, and have it auctioned off...

But, the data is ultimately only useful (legally) in order to spur clicking that you get a cut of, or to deliver a prominent message/ad. Does it really matter that it’s facebook that gets the cut rather than some faceless broker? I am still assaulted with ads targeted on my personal data, and with political ads that are obviously manipulative. That (political part) is the part that ultimately unnerved people, and that part isn’t going anywhere while ads form the basis of facebook’s business.

Data collection as a whole is certainly a booming market, but it allows facebook to bypass the very reasons they were called into congress: mass manipulation. Facebook is a platform of mass manipulation. Ads are just the polite name. All of us realize that CA is a small fish compared to the other data warehousing companies out there. Does congress?

What happens when your Facebook/etc data is used to deny you health insurance because you appear to be at risk for cancer? Or to deny you a job because you listen to the same style of music associated with a demographic that has higher-than-average rate of criminal records? Or to give you a higher home mortgage interest rate?

There are a lot more people who would buy all the (normally private) details in Facebook profiles than you are guessing. Imagine if political advertising companies could purchase just one bit of it - say, your email address - from Facebook.

Targeting ads is the tip of the iceberg in terms of what could be done with the data if Facebook/Google/whoever gets more desperate for revenue.

How can they show me an ad if I don't have a Facebook account?

Two ways:

1. Facebook runs its own ad network, so you might see Facebook ads even if you never go to Facebook's website.

2. You might still go to their website even if you don't have a Facebook account. Many local restaurants and community organizations only have a Web presence through their Facebook pages.

Facebook runs an ad network (Facebook Audience), though I don't know how successful or widely used it is.

If you're visiting a site or using a mobile app served from Facebook's ad network, you don't need to be on Facebook or have a Facebook account to be shown an ad.

From the post: "Facebook ads and measurement tools, which enable websites and apps to show ads from Facebook advertisers, to run their own ads on Facebook or elsewhere, and to understand the effectiveness of their ads."

By having data about you facebook can infer stuff about other people who are on Facebook and serve better ads to them.

With #2, they don't sell the data because they don't need to, and because it would threaten their profits.

They use the data themselves to categorize people and sell advertisers access to the categories. As an advertiser, I don't get a list of people - Facebook keeps that hidden. What I get is the knowledge that my ad is shown to the people I want in the category I want, because Facebook does that part for me. Shadow profiles enable them to do this in a much broader way, and also allow them to build growth engines into their system, so if you do decide to join Facebook, you see hundreds of suggestions that make you more likely to get addicted.

Is there rock-solid proof that shadow profiles exist? I've only ever seen anecdotal evidence... and Zuck said he's not aware of them.

The Wikipedia page doesn't have much:


Mark Zuckerberg said yes, but he doesn't call the data a shadow profile [1].

With apologies to Shakespeare:

"What's in a name? That which we call a shadow profile By any other word would smell as repugnant;"

[1] https://www.usatoday.com/story/tech/columnist/baig/2018/04/1...

1. how can I delete

Why should the burden be on the non-user?

Facebook put this burden on non-users, and they're currently arguing that this is "OK" in their probe.

1. You can’t. They didn’t have an incentive to prioritize this on product roadmap, at least not until now.

2. Most likely to jump start new accounts. They’ll be able to target ads at your new account better if they already have years of data about your interests.

> at least not until now.

They have to finish it before the GDPR enforcement date (25 May 2018), at which time organizations in non-compliance may face heavy fines.

So all companies, big and small, need good data governance that allows them to delete data on demand. And, better than that, they should stop collecting any data they haven't been authorized to get.


When asked question 2 before the US Congress, he repeatedly said that it's for security reasons, something about identifying attackers. It sounded to me like a corporate version of crying "national defense!" They really need to give a full accounting of what those are used for.

We know the answers to these, don't we?

1. There is no way to delete it. Facebook owns it forever.

2. Owning more data is better, for selling ads, for studying trends and behavior, for purposes as-yet ill-defined, and just cuz. And they don't sell data, they give it away.

> And they don't sell data, they give it away.

I don't think they sell or give away tracking data. All the data disclosures so far have been about social graph data, which is a very different data set. Detailed traffic logs from a network like FB would be very hard to collect; it wouldn't make sense for them to give such a valuable resource out for free.

Yeah, probably right, I'm just sort of making the point that "it's always worse than you think" and/or that the worst-case scenario is probably always the correct one where Facebook data collection is concerned.

It's not your data because your browser fingerprint hypothetically coincides with somebody else's. Deleting the data would delete somebody else's data too.

No, it's still my data even if FB mixes it up with someone else's. Deleting just part of the data is their problem, not mine.

You pose an interesting question actually, when is data you generate, "your data"?

If the data is only associated with a browser fingerprint then I see the argument for it not being your data; it's just data. It's like saying Google Street view has your data because they took a picture of the front of your business or your house. Does it make a difference if it's your business or your house? If you move out, is it still "your data"?

Next, if they have your friend's contacts, is it "your data"? I sync my contacts with a number of services, but I've tended to think of this as "my data", not my friends' data. Is it because I grew up knowing about the white pages, so I assume phone numbers are public and that a collection of phone numbers mapped to names is of my making and therefore my property? Even if it's a collection of data about my friends?

Lastly, I think we'd all agree that combining your name and your browsing info would constitute "your data". Is it because there's non-public data associated with your name? Names aren't really unique, is it because the combination of your name and your browsing info IS unique?

Also, is facebook capable of doing this? If so how?

Never really thought about this before actually.

Ok, you can delete the data, but you can't request Facebook to send you the data it has collected on you.

#2 is probably to jumpstart profiles, but also to have a more accurate representation of the social graph. If I sync my contacts and you show up there, and my friend syncs her/his contacts and you show up there too, assuming phone numbers are fairly unique identifiers, they can know that we share a mutual friend and that could contribute to their analytics.
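To make the mutual-contact idea concrete, here is a minimal sketch, assuming phone numbers are used as the join key. The function names and the sample data are made up for illustration; nothing here is known to be how Facebook actually does it.

```python
from collections import defaultdict


def people_by_phone(uploads):
    """Invert uploaded address books: phone number -> set of uploaders listing it."""
    index = defaultdict(set)
    for uploader, contacts in uploads.items():
        for phone in contacts:
            index[phone].add(uploader)
    return index


def shared_contacts(uploads):
    """Keep only phone numbers that appear in more than one address book."""
    return {
        phone: uploaders
        for phone, uploaders in people_by_phone(uploads).items()
        if len(uploaders) > 1
    }


# Hypothetical uploads: two users who both have the same number saved.
uploads = {
    "alice": {"+15550001", "+15550002"},
    "bob": {"+15550002", "+15550003"},
}
print(shared_contacts(uploads))
```

The owner of the shared number never has to sign up for anything: the link between the two uploaders falls out of the inverted index.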

Facebook doesn't sell data, they sell the ability for other people to microtarget. If the result of all this is a law saying that Facebook/etc cannot sell your data, then Mark Zuckerberg will be very happy.

If Facebook need to answer this question then so do Google, Twitter, Amazon and basically anyone else with a big enough footprint.


All I know is that the Pi-Hole at our house says Facebook domains are blocked way more than anything Google has running (w/o the numbers in front of me, I wouldn't be surprised if it's an order of magnitude difference). And neither one of us has a FB account, that's just from the craplets littering your average web page.

Kudos to FB for posting this in understandable language. OTOH, it does have the scent of trying to get ahead of $SOMETHING.

"When you visit a site or app that uses our services, we receive information even if you’re logged out or don’t have a Facebook account."

Then at the end under "What controls do I have": "you can opt out of these types of ads entirely — so you never see ads on Facebook based on information we have received from other websites and apps." Except that that's only possible when you're logged in. The link they provide doesn't work for logged out users. It seems they don't provide any control tools for logged out users/non Facebook users.

Keep in mind they don't say they won't collect and store the information. They just won't use it to customize your ads. They are still collecting the information and will find other ways to use it.

Also, opting out of the ads doesn't opt out of the data collection. The problem is the privacy violation; once they do that, I guess targeted ads are better than untargeted ones. (Or do they mean that they won't share my data with advertisers if I opt out?)

You can opt out of FB ads without having an account via http://optout.aboutads.info.

Not sure why their site only sometimes shows this link :/.

That requires me to enable 3rd party cookies.

The type of ads you see on FB is only relevant to FB users.

Facebook also has the audience network, which serves ads on other websites and apps.

Then I seem not to understand the benefit they see from tracking non-users' information - how can this be of benefit to advertisers if these users are never targeted? In particular, what benefits would be generated by linking this tracking information with the real identity of a non-FB user (that couldn't also be leveraged in a semi-anonymous way)?

In other words, I see the benefit to FB of being able to say "Person A likes 1, 2, 3; User B likes 1 and 2 - perhaps they also like 3?" where A is a non-user and B is a FB user, but I don't see the benefit derived from identifying Person A to advertisers if they can only target that person on FB, where they aren't a user?

They don't allow advertisers to target non-users. I'm sure they use non-user data for clustering, just as you described.
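The "A likes 1, 2, 3; B likes 1 and 2" inference above can be sketched as a simple overlap count. This is a toy illustration, not a claim about the real system: `suggest` and the `min_overlap` threshold are hypothetical.

```python
def suggest(target_likes, other_profiles, min_overlap=2):
    """Suggest items liked by profiles that overlap enough with the target.

    target_likes: set of items the target user is known to like.
    other_profiles: iterable of sets (these may belong to non-users).
    """
    scores = {}
    for likes in other_profiles:
        overlap = len(target_likes & likes)
        if overlap >= min_overlap:
            # Every item the target has not liked yet gets credit
            # proportional to how similar this profile is.
            for item in likes - target_likes:
                scores[item] = scores.get(item, 0) + overlap
    # Highest score first, ties broken by item value.
    return sorted(scores, key=lambda item: (-scores[item], item))


person_a = {1, 2, 3}   # non-user, only known from tracking data
user_b = {1, 2}        # Facebook user
print(suggest(user_b, [person_a, {4, 5}]))
```

Note that Person A never needs to be identified or shown an ad for their profile to be useful here.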

They also probably use it when the user finally does create a FB account.

Contrary to most comments here, it is significant that other companies do. Facebook competes in the free market. If they scale back on data collection, that will hurt their offering to advertisers and cost them money that will go to Google and others. That lost revenue will harm their ability to retain talent and build new products, and ultimately cost them their user base. “We don’t track you” is not as compelling a feature as “free video chat across the globe.” Facebook doesn’t operate in a vacuum. They can’t scale back data collection unless others do. Moreover, the United States doesn’t exist in a vacuum either. Scaling back all US companies will give a leg up to competitors in more lenient jurisdictions.

This is the same logic as organized crime using violence. It’s part of their core business, and if they scale down on violence, other organizations will have a leg up on them and they lose revenue.

We don’t accept mafia escalating violence for profit. I’m OK with facebook (the biggest SNS) losing business because they can’t abuse the system enough.

What about the others? The same can be applied to them, and it doesn’t need to be done in order (no need to wait to regulate Google or Twitter before touching Facebook).

This is an important point and I wonder about it often. Especially the last one about countries that are lax on individual privacy. Imagine what China or Chinese companies could do with the genetic information they collect from their population. The lack of regulation is scary, but at the same time, they will be at the forefront of genetic engineering and designer humans. Can't think of any solution other than a World government. EU and AU have shown it is possible.

Is a world government necessarily a bad thing?

A union of federated states globally could be a significant force for good.

Impossible for the foreseeable future of course, but why is it immediately dismissed as a scary outcome?

I meant it as a good thing. Some things will need to be regulated at the global level. We are already seeing global problems - global warming being the prime example. I think we will end up either destroying ourselves or end up with a world government. World government will clearly be the better outcome. How that government works would also be an interesting topic for science fiction. Hint: I don't think it will be a democracy.

This is why laws are a good solution. Force all companies to abide by privacy laws (like the upcoming EU's GDPR). Then FB's competitors will not have an advantage.

Also I don't think most people would be willing to pay for Facebook which would be the alternative to surveillance and advertising.

There is no option to pay them to not see the ads. There is no option to pay them so they don't gather information about You. THEY OWN YOU. Even if Facebook goes out of business, others will do it. Your phone, your TV, watch, fucking juicer, whatever is connected to the internet is collecting information. THERE IS NO WAY TO OPT-OUT. It may sound creepy and disconnected from reality, but at the core that's exactly what it is. You may not like it, You can disagree, but it happens anyway.

This is true in a practical sense for most laypeople, but it's not literally true and doesn't have to be.

(1) Don't buy "smart" devices that collect data. You don't need them.

(2) Don't visit facebook.

(3) Don't allow your browser to run javascript from facebook or anyone you don't trust.

It's that easy, and there are tools to help you do (3) such as adblockers and other browser plugins. For (2), even if you occasionally visit facebook, you can minimize information collected by controlling the environment you visit from (e.g. tor browser, VPN, virtual machine, etc).
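For what it's worth, the core check behind (3) is small. Here is a rough sketch of suffix-based domain blocking, the kind of thing a blocker or DNS filter does; the domain list is a tiny hand-picked sample and `is_blocked` is a made-up helper, not any real tool's API.

```python
# Illustrative sample only; real blocklists are far longer.
BLOCKED = {"facebook.com", "facebook.net", "fbcdn.net"}


def is_blocked(host, blocked=BLOCKED):
    """Return True if host is a blocked domain or a subdomain of one."""
    parts = host.lower().rstrip(".").split(".")
    # Check every suffix: "a.b.c" -> "a.b.c", "b.c", "c".
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))


for host in ("connect.facebook.net", "example.com", "notfacebook.com"):
    print(host, is_blocked(host))
```

Matching on whole labels rather than raw string suffixes is what keeps `notfacebook.com` from being caught by the `facebook.com` entry.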

The more challenging problem is data that your friends give Facebook about you, such as your contact info and tagging your face in photos. I wonder how long under GDPR till someone sues their facebook friends for providing information without permission?

> There is no option to pay them so they don't gather information about You.

I would absolutely 100% not trust that they wouldn't track me anyway. How could they not? Even if I trusted them (which I don't) it isn't possible: they'd at least need to know it was me wherever I went so they'd know to not track me as I'd paid...

Would you pay someone to not follow you down the street? If you did you'd have a queue of people following you to collect their "share".

Yes. They need to identify You to know that You don't want to be tracked. Also, the DNT header is like a laughable IT meme.

Typing in BLOCK CAPS doesn't strengthen your argument.

It's bold replacement.

“How Capital Letters Became Internet Code for Yelling”


There's a difference between emphasizing something and yelling it.

You can use * (asterisk) or _ (underscore) surrounding a word or words for emphasis. Like this (asterisk) or _this_. As you can see, asterisks get interpreted by HN into italics. So maybe you can think of underscores as _bold_?

Ok. Since editing is no longer available, i'll use that notation in the future. Thanks!

You're more than welcome!

It's interesting that the "Privacy Controls" seem to only change how you see the data they've collected about you, but don't seem to change how or when they collect data about you. That doesn't sound like privacy control at all!

> What controls do I have?

> As Mark said last week, we believe everyone deserves good privacy controls. We require websites and apps who use our tools to tell you they’re collecting and sharing your information with us, and to get your permission to do so.


I downloaded Facebook's data on me and found about 40 websites they said they had shared my profile with. I didn't authorize that. FB doesn't give me the ability to "revoke" permission to have my info shared with them, either. Nor does it allow me to request that the information on me given to those third parties be deleted.

Actually, you absolutely can on https://www.facebook.com/ads/preferences/

- Advertisers you've interacted with

- - Who have added their contact list to Facebook

Hover over those you don’t know about; the X lets you “Remove” those, i.e. ask Facebook to exclude you from those advertisers’ targeting.

If you are in Europe, you should be able, starting next month, to leverage that list to ask those advertisers how they got your data, and expose data brokers. Facebook is simply (finally) offering you the ability to track data brokers.

Once the data has been given to a third party, we can no longer be guaranteed that the third party will delete it. That was one of the main problems with Cambridge Analytica -- they claimed to have deleted data but may not have actually done so.

Any company can upload a list of email addresses to Facebook and say they have a relationship with you, and then Facebook will allow that company to market to you. It's obviously against the ToS to go out and purchase a list of emails, upload it to Facebook, and show those people ads. But uh, Facebook can't prevent that without killing the feature. So in this case it's not Facebook sharing data, but companies lying to Facebook and Facebook going along with it to get paid.
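Roughly how a list upload like this could be matched against accounts, as a sketch. The assumption here (not stated anywhere in this thread) is that emails are normalized and hashed before comparison; the function names are hypothetical.

```python
import hashlib


def normalize(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def hash_email(email):
    """SHA-256 of the normalized address, hex encoded."""
    return hashlib.sha256(normalize(email).encode("utf-8")).hexdigest()


def match_audience(uploaded_emails, platform_users):
    """Return sorted user ids whose email hash appears in the uploaded list.

    platform_users: dict of user id -> email the platform already holds.
    """
    uploaded = {hash_email(e) for e in uploaded_emails}
    return sorted(
        uid for uid, email in platform_users.items()
        if hash_email(email) in uploaded
    )


users = {"u1": "bob@example.com", "u2": "carol@example.com"}
print(match_audience([" Bob@Example.com "], users))
```

Nothing in this flow can tell a list the advertiser collected itself from one it bought, which is the point made above.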

FYI I've read that the "account download" doesn't give you all the data. If you're outside the US & Canada, you can make a formal request


> I downloaded Facebook's data on me and found about 40 websites they said they had shared my profile with.

Which section was this in the downloaded archive? I'm not aware of any situations in which this would happen except using "Log in with Facebook".

"other people mug passers by too!"

Can we just avoid this defence? The first one you catch, you prosecute fully and punish appropriately according to justice. Almost nobody knowingly signed up to be tracked everywhere they visit on the internet. Hiding it in fine print is every bit a con. You have to ensure your counterparty understands the full extent of the contract or you have fraudulently obtained consent.

Facebook now. They should be fully prosecuted. Then we catch the next crook, and the one after until the lawlessness is dealt with.

Throwaway for reasons.

My wife and I enjoy porn in our relationship. On my birthday last year, we were settling in for some romance, and we opened up pornhub only to find ads trying to entice me into a discount subscription for my birthday. They knew it was my birthday! Which really freaked me out because:

1) I don't have, and never have had, an account with them. I've never given them my age and birthday. In fact, I never give out my birthday on the internet. I don't even have a Facebook, Twitter, or any other social media account.

2) It was on my wife's tablet. I never ever use her tablet, especially for browsing porn.

Anyone have an idea of the machinery involved in making this determination?

If you have two devices in a house and one never accesses Google and the other does, Google will know it's you using the non-Google device each and every time.

There's more to mining you than cookies and tracking pixels. They use IPs, hardware and software profiling, behavioural patterns, and even the way you type on your keyboard (this is very, very accurate.)
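As a very simplified sketch of the hardware and software profiling part: collect attributes that rarely change, put them in a canonical order, and hash them into an identifier that needs no cookie. The attribute names below are illustrative assumptions, not a list any particular tracker is known to use.

```python
import hashlib


def fingerprint(attrs):
    """Hash a dict of device/browser attributes into a short stable identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attributes a script could read on each visit.
visit = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "UTC+1",
    "fonts": "Arial,Helvetica,Times",
}
print(fingerprint(visit))
```

Clearing cookies changes none of these inputs, so the identifier survives. Two people with identical setups would also collide, which is the flip side of this technique.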

In short: you need a "clean room" (Tails) every time you go online and you need to change where you're coming online from (VPN). You can also use tools to change your keyboard/typing "profile" so as to cripple that means of tracking. Oh and you can't login to virtually any online account at all, ever, because these cookies can be used too.

So if you use a VPN and incognito mode without logging into any online account, Google, Facebook, et al can still track you accurately?

Curious any other tools besides vpn and private mode you would recommend?

Any more links to read about these behavioral, software, and typing-style tracking methods?

https://ssd.eff.org/en -- this is a great start and will lead you down a rabbit hole of links, tools, and intrigue :-)

Just a hunch, if you were accessing from home, most likely your IP address.

Maybe they took a guess it was your birthday and you only noticed it because it actually was.

> What Data Does Facebook Collect When I’m Not Using Facebook

Absolutely everything they possibly can.

> and Why?

Because they want to wring every last fraction of a penny out of your worth as a data point to sell to everyone else.

And because no one is stopping them.

> As Mark said last week, we believe everyone deserves good privacy controls. We require websites and apps who use our tools to tell you they’re collecting and sharing your information with us, and to get your permission to do so.

I have NEVER been told by a website that they are doing this, let alone asking for my permission.

The most important question is about "informed consent".

Yes, users consented to giving data to Facebook. But they did so via a convoluted click-through legal agreement, which almost no one reads, and which denies you access to Facebook if you decline. Then they can use your data for what are human subject experiments.

If any other human subject experiment was proposed to a review board with that kind of consent form, it would be laughed out of the room, at best. No way would that be allowed.

"Informed consent" is what Facebook -- and other data companies -- should be seeking. We already have this concept for data and user protection in research. We should just extend it to the private sector.

Wow, wow, wow.

- All others are doing it, so we're doing it, too and that's okay.

- We collect data on you, even if you don't have an account

- If you want to control your data, you need an account

- Despite collecting even more (like your phone number, name and additional info) from your friend's address book they shared with us, we don't inform you about that and surely, there's no way for you to delete that info.

Fuck you, Facebook.

Not one mention of Adobe here after 12 hours. Y'all ain't seen nothing yet.


> We require websites and apps who use our tools to tell you they’re collecting and sharing your information with us, and to get your permission to do so.

Am I missing something or does this line conflict with the rest of the post?

I think the technical term for this is "Passing the buck"...

Shameless plug, I have a little open-source social sharing icons project that doesn't allow FB or any other services to track your website's users unless they choose to share something:


Whatever we consider “data” right now quite simply needs to be thrown out, replaced by things that are MUCH more temporary (e.g. where even the largest “collection” of data becomes useless within a week). And, access needs to be explicitly revocable at any time by either party. There should actually be a difference between, say, me giving you “phone number” and me giving anyone else “phone number” (directly or indirectly). Heck, even for my street address I’d feel better if I only gave out keys that everyone had to cash in at the post office and let the post office convert it to my real house number.
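A toy sketch of what I mean by handing out keys instead of the real value: each recipient gets its own token, only the registry (the "post office") can turn a token back into the address, and any token can be revoked. `AddressRegistry` is invented for this example; it is not an existing service or API.

```python
import secrets


class AddressRegistry:
    """Maps per-recipient tokens to a real address; tokens are revocable."""

    def __init__(self):
        self._tokens = {}  # token -> (recipient, address)

    def issue(self, recipient, address):
        """Create a fresh random token for one recipient."""
        token = secrets.token_hex(8)
        self._tokens[token] = (recipient, address)
        return token

    def revoke(self, token):
        """Invalidate a token; revoking an unknown token is a no-op."""
        self._tokens.pop(token, None)

    def resolve(self, token):
        """Return the address for a live token, or None if revoked/unknown."""
        entry = self._tokens.get(token)
        return entry[1] if entry else None


registry = AddressRegistry()
shop_token = registry.issue("shop", "1 Example Street")
friend_token = registry.issue("friend", "1 Example Street")
registry.revoke(shop_token)
print(registry.resolve(shop_token), registry.resolve(friend_token))
```

Because every recipient holds a different token, a leaked token also tells you who leaked it.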

There is no technical excuse at this point to have so many different bits of sensitive information floating around. And many of these pieces of information are permanently associated with you, meaning that any slip-up anywhere at any time can create a problem later on.

> In addition, you can opt out of these types of ads entirely — so you never see ads on Facebook based on information we have received from other websites and apps.

They have this listed as a privacy control, but it doesn't provide any way to tell them to not compile a list of the other sites you visit outside of Facebook.

Facebook is in an outright PR war, attempting to deflect from their NSA-scale data collection. If Facebook went out of business tomorrow the world would be a better place.

Question: Will I be able to force Facebook to remove my shadow profile once GDPR goes live?

Up until now, wasn't Facebook collecting large quantities of data about users from data brokers?

Why were they acquiring that data about users?

In the leaked memo from a Facebook senior executive, what did he mean by "questionable contact importing practices"?

Do contacts count as collected data?

Why does Facebook collect contacts data, and why does FB collect it in ways that are "questionable"?

Why are the contact importing practices questionable?

They may not know everything about everyone, but they are working on it.

$500B market cap: the market thinks they will succeed.

Just spent 20 valuable minutes blocking ads from all the random companies, political campaigns, and celebrities who've somehow gotten hold of my contact information and added me to their list. Americans for Prosperity had me on their list for ~30 states, and that's just one of scores of organizations that I've absolutely never dealt with or wanted to provide my information. What I want to know is WHO sold/shared my information to these organizations in the first place. I'd also love for FB to give me the option to hide all their ads without clicking on every.single.one (again, there were hundreds of them...)!

At what point should web browsers start to interfere with this data collection? I'm aware of extensions that do this, but the average user isn't going to install one (or remember to reinstall it after they've got a new computer). The Do-Not-Track header seems to be dead in the water - should browsers make a more aggressive move against this? It's obviously a conflict of interest for Google (since they both have a competing product to FB, and like to do a heap of similar data collection themselves) as well as a possible Anti-trust concern; but Mozilla, Apple and Microsoft could be stepping in here.

Firefox has containers and a special one just for Facebook. Apple will only do so if they have a way to get money out of it.

If you never go to Facebook, Safari will prevent Facebook from building a profile across the sites you visit. For each parent site, Facebook will receive different cookies, which defeats their system.
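A toy model of why per-site cookies break cross-site profiles. This only illustrates the behaviour described above and is not Safari's actual implementation; `CookieJar` and `tracker_id` are made-up names.

```python
import uuid


class CookieJar:
    """Stores one tracker cookie per key; the key depends on partitioning."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self._store = {}

    def tracker_id(self, top_level_site, tracker):
        """Return the cookie the tracker would see when embedded on a site."""
        # Partitioned: one cookie per (parent site, tracker) pair.
        # Shared: one cookie per tracker, whatever the parent site is.
        key = (top_level_site, tracker) if self.partitioned else (None, tracker)
        if key not in self._store:
            self._store[key] = uuid.uuid4().hex
        return self._store[key]


shared = CookieJar(partitioned=False)
split = CookieJar(partitioned=True)
for jar in (shared, split):
    news = jar.tracker_id("news.example", "facebook.com")
    shop = jar.tracker_id("shop.example", "facebook.com")
    print(jar.partitioned, news == shop)
```

With the shared jar the tracker sees the same id on both sites and can join the visits; with the partitioned jar the two visits look like two unrelated browsers.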

> Social plugins, such as our Like and Share buttons, which make other sites more social and help you share content on Facebook;

It's very disingenuous of them not to mention that they collect data on you whether you click the button or not.

Not to mention having the gall to call it 'being social'.

War is peace. Invading privacy is social.

The comments are full of personal anecdotes along the lines of "I've been to those meetups.", and "I worked at a place that bought that kind of data.", and the article: "everyone else does it". What's troubling me is the question, "what incentives do average employees in the tech industry have to operate in a morally admirable manner?". It seems very few, if any. Sure we lack explicit moral guidelines, but what if instead of, or in addition to, focusing on punishment of companies that get ugly, we incentivized employees to be respectful citizens? This isn't absolving capitalism of its flaws by any means, more it's acknowledging them. But punishing evil corp is so reactive. Why is the average employee so willing to sell out their fellow citizens, and ultimately themselves, for a paycheck? The obvious answer is, "it's a paycheck". But have we really lost all ability to hold a moral compass? I'd rather not believe that, so how do we calibrate at least a basic set of guidelines that actually effect change in a global industry? Other industries solve this problem with ethics boards and operating licenses. But other industries are arguably more localized. I also suspect many here would find such things anti-egalitarian and generally be against the idea. What other options exist?

Note: I normally prefer reactive punishment because it doesn't impose worldviews and agendas on people--jury of peers makes punishment human, etc. I'm also somewhat morally relative, but it doesn't take a genius to understand that economically viable and moral are not equivalent. The issues of digital data privacy and data ownership are issues of fundamental human rights, and many of us operate under governments founded on principles defending such (point being I concede not everything is relative, and I think socially our efforts are best served defending things we have very little argument about).

Morally one could extend this to average employees in the tech industry whose work utilises online advertising. Whether buying ads, or putting ads on their services.

I'm sure there are thousands of active and passionate Hacker News users, many reading these very words that support online advertising through their job. Perhaps they are wisely being quiet? Perhaps it's because the whole issue is murky and morally ambiguous.

Now - one can make the case for ethical advertising... so why don't we?

